now with more cloud!

18 Feb 2011. New and improved with 30% more cloud!


Figure 1: Amazon CloudFront “distribution” for our static blog files.

See? Adding the CLOUD to anything is guaranteed to make everything more awesome. Even if it is as silly and pointless as our little experiment here: using a full-on commercial Content Distribution Network (CDN) for this tiny, low-traffic blog.

Think we are playing you? View the source of this web page and you’ll see all sorts of content coming from Amazon’s CloudFront CDN:

Figure 2: Cache/CDN debug output showing replaced URLs…

… although we have a bit to learn about CloudFront: as soon as we switched our CDN settings from a standard S3 bucket to CloudFront, we seem to have stopped caching the *.js and *.css files. To be totally honest, you may see a mix of CloudFront and Amazon S3 URLs, since we are playing around with both and have switched back and forth several times. The pretty debug output embedded as HTML comments at the bottom of the page (seen in the screencap above) will also disappear at some point, as it adds a small performance penalty to page loads.
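For anyone wondering what “replaced URLs” means in practice, here is a minimal Python sketch of the general idea: static asset links that point at the blog host get rewritten to point at a CDN hostname instead. This is not how the W3 Total Cache plugin is implemented; the hostnames, the extension list, and the function names are all assumptions made up for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hostnames, used only for illustration.
SITE_HOST = "blog.example.org"
CDN_HOST = "cdn.example.org"

# File extensions treated as static assets in this sketch (an assumption).
STATIC_EXTENSIONS = (".js", ".css", ".png", ".jpg", ".gif")

# Matches src="..." or href="..." attributes with double-quoted values.
ATTR_PATTERN = re.compile(r'(?P<attr>\b(?:src|href))="(?P<url>[^"]+)"')


def rewrite_url(url):
    """Return url pointed at CDN_HOST if it is a static asset on SITE_HOST."""
    parts = urlsplit(url)
    if parts.netloc != SITE_HOST:
        return url  # someone else's content: leave it alone
    if not parts.path.lower().endswith(STATIC_EXTENSIONS):
        return url  # a page, not a static file: keep serving it from the blog
    return urlunsplit(parts._replace(netloc=CDN_HOST))


def rewrite_html(html):
    """Rewrite every static asset URL found in src/href attributes."""
    def replace(match):
        return '{}="{}"'.format(match.group("attr"), rewrite_url(match.group("url")))
    return ATTR_PATTERN.sub(replace, html)
```

Note that a file type missing from the extension list is simply never rewritten. Whether something like that explains our missing *.js and *.css files is a guess on our part, but it is the first setting we would check.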

And the point is…

Well, the business point is that our blog is becoming our primary web presence, and anything we can do to make it faster and more responsive is a good thing. That said, if that were our only requirement we could have simply deployed the excellent “W3 Total Cache” WordPress plugin, activated its disk cache feature, and maybe primed the pump a bit with a cache primer script that crawls the sitemap to pre-populate things.
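A cache primer of that sort does not need to be fancy. The sketch below is one way it might look in Python, assuming the blog publishes a standard sitemaps.org XML sitemap; the function names are our own invention and the sitemap URL you pass in is whatever your site actually uses.

```python
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return every <loc> URL listed in a sitemap document (str or bytes)."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def prime_cache(sitemap_url, timeout=10.0):
    """Request each page in the sitemap once so the page cache gets filled.

    Returns a dict mapping URL to HTTP status, or -1 if the request failed.
    """
    with urllib.request.urlopen(sitemap_url, timeout=timeout) as response:
        urls = extract_urls(response.read())
    statuses = {}
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as page:
                statuses[url] = page.status
        except OSError:  # URLError and HTTPError are both OSError subclasses
            statuses[url] = -1
    return statuses
```

Calling `prime_cache` with your sitemap URL after clearing the cache would warm every listed page with one request each, which is plenty for a site as small as ours.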

The actual point is that we here at BioTeam are heavy users and promoters of Amazon’s various Infrastructure-as-a-Service (IaaS) cloud offerings, but we primarily approach AWS with scientific and high performance computing in mind, not the nuts and bolts of deploying high-volume, resource-heavy websites and interactive services.

In our cloud training classes we do mention the use of S3 buckets for hosting web content, especially given the recent news that it is now easier for your HTML root directory to be a simple S3 bucket in the cloud. We also talk in general terms about CloudFront and how DNS CNAME tricks can be used with both services to make it appear that your own personal domains are handling caching and remote object storage.

So we talk about this stuff, but we had not used it in any serious production-worthy sense. The most we’ve done in the past is set up CNAME aliases on S3 buckets used by us and our clients. Making some updates and changes to our blog gave us the perfect chance to use this stuff in a real setting, even though it is gross overkill for our tiny little website.

In order to get our feet wet with cloud content caching on Amazon we had to:

  • Find and integrate a caching tool into our blog platform (W3 Total Cache WordPress Plugin)
  • Choose an initial “origin” point for our cached content (An Amazon S3 Bucket called “bt-blog-cache”)
  • Sign up and activate the CloudFront service on our Amazon AWS account
  • Create a brand new CloudFront “distribution”
  • Point the CloudFront distribution at our “origin” S3 bucket
  • Create some DNS CNAME aliases so that we have a handy BioTeam domain name that points to our cloud cache
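To make the relationship between the bucket, the distribution, and the DNS alias a bit more concrete, here is a small Python sketch that only builds the names involved. It does not call any AWS API. The bucket name is ours; the distribution domain and the alias hostname are placeholders, and the function names are hypothetical.

```python
def s3_origin_domain(bucket):
    """DNS name under which an S3 bucket is reachable as a CloudFront origin."""
    return "{}.s3.amazonaws.com".format(bucket)


def cname_record(alias, target, ttl=3600):
    """Format a zone-file CNAME record that maps alias onto target."""
    return "{}. {} IN CNAME {}.".format(alias.rstrip("."), ttl, target.rstrip("."))


def describe_setup(bucket, distribution_domain, alias):
    """Summarize the three pieces: origin, distribution, and DNS alias."""
    return {
        "origin": s3_origin_domain(bucket),
        "distribution": distribution_domain,
        "dns": cname_record(alias, distribution_domain),
    }


# Placeholder values: only the bucket name comes from our actual setup.
setup = describe_setup(
    bucket="bt-blog-cache",
    distribution_domain="d1234example.cloudfront.net",
    alias="cdn.example.org",
)
for name, value in setup.items():
    print("{:>12}: {}".format(name, value))
```

The chain reads right to left: the friendly alias is a CNAME for the CloudFront distribution, and the distribution in turn pulls its content from the S3 bucket acting as the origin.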

At the end of the day, this is just an exercise in eating our own dog food. It’s far easier to talk about and train people on things that we know intuitively from hands-on experience. This new blog setup gives us a bit more experience in a cloud use case that we don’t normally see when working on scientific and research workflows and pipelines.

I’ll end this post with a few more screenshots taken during the setup and deployment process.

Figure 3: Our Amazon S3 bucket that holds the cached content for this site

Figure 4: Creating a new CloudFront “distribution” with an S3 origin bucket

Figure 5: CDN settings in the WordPress plugin
