20 Jul 2010 Boot, ephemeral & EBS storage performance on Amazon cc1.4xlarge instance types
For background and summaries of all our blog posts dealing with the new Amazon EC2 “compute cluster” cc1.4xlarge instance types, please refer to this summary page: http://bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/
We talked about the performance of the boot and ephemeral storage in this post: http://bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/
In this post I’ve finally collected enough data to cover repeated bonnie++ benchmark tests against all the main types of block storage available to the new Amazon cc1.4xlarge “compute cluster” instance types:
- Local boot disk performance
- Performance of a single ephemeral disk
- Performance of the two available ephemeral disks when striped together with software RAID0
- Performance of a single EBS volume attached to the instance
- Performance of 4 EBS volumes striped with RAID0 and attached to the instance
- Performance of 8 EBS volumes striped with RAID0 and attached to the instance
1. Don’t use the boot disk for anything other than booting the operating system. As the results show, the cc1.4xlarge boot disk (which has no PV driver support) is the slowest of all the block storage options available. Really slow. Not worth using for anything other than OS stuff. That also means it is not worth bothering to resize the boot volume.
2. The instance ephemeral storage volumes are fast and should not be ignored. Every cc1.4xlarge EC2 instance comes with a pair of ~840GB ephemeral disk volumes. In living by our own “don’t trust anything to non-persistent storage!” best practice, we are guilty of ignoring these drives in situations where they could have been of significant benefit. This will change. A single ephemeral volume beats the performance we see out of a single EBS volume. A striped pair of ephemeral volumes performs even better and stacks up well even against multiple EBS volumes striped together. The RAID0 pairing of the two ephemeral drives seems to consistently outperform even 8-drive EBS RAID0 volumes when you look at the bonnie++ results for the random and sequential file creation and deletion tests. This has major implications for HPC and scientific pipeline processing on the cloud. In particular, I can easily envision using the ephemeral drives to build a shared parallel scratch filesystem (think PVFS or GlusterFS) in cluster configurations. This would give you a nice shared scratch storage pool. Even in simpler cluster setups, it looks like it would be a win to stage data into the ephemeral storage so it can be used as the target drive for scientific processing (where the input data is not unique and has backup copies elsewhere). We can run the IO-heavy analysis against the fast ephemeral storage and send our result data into S3 buckets or a proper EBS volume for downstream handling.
3. Striping EBS volumes into software RAID0 sets is a valid practice. We clearly see performance gains when using more than one EBS volume, and the gain is significant enough to justify the hassles involved in backing up and protecting EBS-resident software RAID sets. We need to do more work (and really need to test 2-volume EBS stripes), but it is clear that there is a measurable performance gain to be had. Not sure if we’d use 8-disk RAID0 sets for production work, but 2-disk and 4-disk setups are something we will be looking at seriously.
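As a rough illustration of the striping discussed above, here is a minimal Python sketch that only builds (and prints) the `mdadm` command line for a software RAID0 set; it does not run anything. The device names and the `/dev/md0` target are assumptions for illustration, not the devices used in our tests, and `raid0_create_command` is a hypothetical helper name. Check what your instance actually exposes before running anything like this.

```python
import shlex


def raid0_create_command(devices, md_device="/dev/md0"):
    """Build the mdadm argument list that would stripe `devices` into one RAID0 array.

    Nothing is executed here; the caller decides whether to run the command.
    """
    if len(devices) < 2:
        raise ValueError("RAID0 striping needs at least two devices")
    return [
        "mdadm", "--create", md_device,
        "--level=0",                          # RAID0: striping, no redundancy
        f"--raid-devices={len(devices)}",
        *devices,
    ]


# Hypothetical device names for the two ephemeral volumes (assumption).
ephemeral_devices = ["/dev/sdb", "/dev/sdc"]
cmd = raid0_create_command(ephemeral_devices)

# Print the command for review instead of running it.
print(shlex.join(cmd))
```

The same helper works for a 4-volume or 8-volume EBS stripe by passing a longer device list. Remember that RAID0 has no redundancy, so losing any one member loses the whole set.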
Obviously there is much more to be drawn from the data, but benchmarking is hard (and controversial) in regular settings, let alone when trying to get repeatable and consistent results out of a virtualized multi-tenant cloud framework. For now I’d prefer to stick to broad general conclusions and “lessons learned” rather than trying to divine highly specific things out of the raw data.
As usual, you can find all of the raw data in this Google spreadsheet. We did not finesse the data at all. The only data munging we did was to run the tests repeatedly and then average the results to arrive at the numbers used in the graphs.
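The averaging step is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the exact process we used: the metric names and the numbers below are made-up placeholders, not measurements from the spreadsheet, and `average_runs` is a hypothetical helper.

```python
from statistics import mean


def average_runs(runs):
    """Average each metric across repeated benchmark runs.

    `runs` is a list of dicts that all share the same metric names.
    """
    if not runs:
        raise ValueError("need at least one run")
    metrics = runs[0].keys()
    return {metric: mean(run[metric] for run in runs) for metric in metrics}


# Placeholder values only (NOT real results), one dict per repeated run.
runs = [
    {"seq_write_k_per_sec": 100.0, "seq_create_per_sec": 10.0},
    {"seq_write_k_per_sec": 110.0, "seq_create_per_sec": 20.0},
    {"seq_write_k_per_sec": 120.0, "seq_create_per_sec": 30.0},
]

averaged = average_runs(runs)
for metric, value in averaged.items():
    print(f"{metric}: {value}")
```

Keeping the “K/sec” and “/sec” metrics under separate names makes it easy to plot them on separate graphs later.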
Here are the numbers behind the graphs (click on the image for the full-size version):
And here are the graphs. We’ve broken out the graphs to represent the results measured in “K/sec” versus just “/sec”.
Read, Write & Rewrite results for all cc1.4xlarge storage types (click on image for full size):
Sequential create, delete & seek results for all cc1.4xlarge storage types (click on image for full size):