"I/O requests cost $0.10 per million" :)
I wonder how many I/O requests MySQL generates. Any data?
Edit, as I found this (http://blog.rightscale.com/2008/08/20/amazon-ebs-explained/):
"As a point of reference, our main database server is pretty busy and chugs along at an average of 17 transactions per second, which should total to around $4.40 per month. But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7. That would end up costing over $250 per month! As far as I can tell, for most situations the EBS transaction costs will be in the noise, but you can make it expensive if you’re not careful."
> But our monitoring servers, prior to some recent optimizations, hammered the disks as fast as they would go at over 1000 random writes per second sustained 24×7.
I don't know what that monitoring thing does, but if you make it hammer your non-persistent local storage, and then sync logs to S3, you should be fine.
As I understand this service, it's only for transactional stuff that absolutely needs to be there after a reboot (e.g. a transactional DB) if the plug is pulled accidentally.
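For what it's worth, the $4.40 and $250 monthly figures quoted above can be sanity-checked with a quick sketch. It assumes the quoted $0.10 per million I/O requests, a 30-day month, and that each transaction or write counts as exactly one billed request (that last part is my assumption, not something the pricing page says):

```python
# Rough monthly EBS I/O cost at $0.10 per million requests.
PRICE_PER_MILLION = 0.10                # USD, from the quoted pricing
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # assumption: 30-day month


def monthly_io_cost(requests_per_second: float) -> float:
    """Cost in USD of a sustained request rate over one month."""
    # Assumption: one transaction/write == one billed I/O request.
    requests = requests_per_second * SECONDS_PER_MONTH
    return requests / 1_000_000 * PRICE_PER_MILLION


for rate in (17, 1000):
    print(f"{rate} req/s -> ${monthly_io_cost(rate):.2f} per month")
```

That lands close to the numbers in the RightScale post, so the per-request charge really only matters if you hammer the volume around the clock.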
It is useful as a media server, but a CDN is a lot more.
"CDN nodes are deployed in multiple locations, often over multiple backbones. These nodes cooperate with each other to satisfy requests for content by end users, transparently moving content to optimize the delivery process. Optimization can take the form of reducing bandwidth costs, improving end-user performance, or increasing global availability of content.
The number of nodes and servers making up a CDN varies, depending on the architecture, some reaching thousands of nodes with tens of thousands of servers.
Requests for content are algorithmically directed to nodes that are optimal in some way. When optimizing for performance, locations that are best for serving content to the user may be chosen. This may be measured by choosing locations that are the fewest hops or fewest number of network seconds away from the requesting client, so as to optimize delivery across local networks. When optimizing for cost, locations that are least expensive may be chosen instead. Often these two goals tend to align, as servers that are close to the end user sometimes have an advantage in serving costs, perhaps because they are located within the same network as the end user. However the value of a CDN is often demonstrated when these two goals do not align i.e. when the best performing servers and network route is located in the furthest geographic distance."
There might be an opportunity here. The facilities of FlexiScale and Amazon EC2/EBS are now comparable. A service that sat on top of the two and offered geographical redundancy could be a real winner.
"The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% - 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that will typically fail with an AFR of around 4%, making EBS volumes 10 times more reliable than typical commodity disk drives."
I'm really impressed they even acknowledge the possibility of failures instead of just touting all availability and reliability stats.
Also, 10 times more reliable than a hard drive is pretty good. For me, the takeaway is that I should treat this as a hard drive and NOT as a magical solution that allows me to ignore everything I have learned about systems administration.
Babysitting servers at 2AM really drives home lessons for me and the urge to simply declare those problems moot is strong--but would be a mistake.
I don't see how this is different from the risks associated with any other storage service. Just be sure to have a proper backup strategy/maintenance plan.
It's unacceptable to the poster because they don't want to have to deal with the realities of computing infrastructure. They want a turn-key, no-hassle, no-think solution.
Which is fine to want.
The EBS failure rate is 0.1-0.5% annually. That's awesome. At 0.1%, the mean time to failure works out to about 1,000 years (1/0.001). At 0.5%, it is about 200 years. At either point, I'm more than comfortable. I can snapshot the drive daily/weekly/monthly, and when you combine the snapshots as a backup with the likelihood of failure in any given year being so low, I can rest easy.
Compare that with commodity drives, which at a 4% AFR have a mean time to failure of about 25 years... Well, I know where I'd rather have my data.
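A rough check on the failure-rate arithmetic, using the AFR figures from the Amazon quote (0.1%, 0.5%, and 4% for commodity disks). This sketch assumes the same failure probability every year and independence between years, which is a simplification since real drives age:

```python
def mean_years_to_failure(afr: float) -> float:
    """Mean time to failure in years for a constant annual failure rate."""
    # With a fixed per-year failure probability p, the expected wait is 1/p.
    return 1.0 / afr


def prob_loss_within(afr: float, years: int) -> float:
    """Probability of at least one failure within the given number of years."""
    return 1.0 - (1.0 - afr) ** years


for label, afr in (("EBS best case", 0.001),
                   ("EBS worst case", 0.005),
                   ("commodity disk", 0.04)):
    print(f"{label}: mean {mean_years_to_failure(afr):.0f} years, "
          f"P(loss in 10 years) = {prob_loss_within(afr, 10):.1%}")
```

The ten-year loss probability is probably the more useful number, since nobody keeps a volume for centuries.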
Getting back to the original poster, (s)he wants a system that takes care of the snapshots without intervention. So, package up EBS along with auto-snapshotting with an SLA saying that you won't lose more than a day of data and you have a business. You charge a premium because unlike EBS, you're reliable. All the while, you are just EBS with S3, something that the individual could do themselves. EBS is unlikely to ever fail for your clients since I'm guessing no one is going to have a client for 100 years, and even if that happens, you just restore one of the daily snapshots. Nice! You've lived up to your SLA while getting paid!
Think about it, the backups aren't where the cost is. 1000 PUT requests with 4MB chunks means 4GB is backed up for a penny there in terms of the number of requests. Data transfer from EC2 to S3 is free. So, those aren't the costs. As long as you can consolidate the delta backups so you aren't storing a version for every day since inception, I don't see why this service couldn't be offered for 2x the cost of EBS.
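To put a number on the request cost, here is a small sketch. It assumes $0.01 per 1,000 PUT requests (the rate implied by "4GB for a penny"), 4 MB chunks, and 1 GB = 1000 MB as in the comment; it ignores S3 storage charges entirely:

```python
import math

PUT_PRICE_PER_1000 = 0.01  # USD, assumption taken from the figures above
CHUNK_MB = 4               # chunk size used in the example above


def put_request_cost(backup_gb: float, chunk_mb: int = CHUNK_MB) -> float:
    """Request cost in USD of uploading a backup in fixed-size chunks."""
    # Uses 1 GB = 1000 MB to match "1000 PUTs of 4MB = 4GB".
    chunks = math.ceil(backup_gb * 1000 / chunk_mb)
    return chunks / 1000 * PUT_PRICE_PER_1000


for size_gb in (4, 100, 1000):
    print(f"{size_gb} GB -> ${put_request_cost(size_gb):.2f} in PUT requests")
```

Even a full terabyte of chunks costs only a few dollars in requests, so storage of the retained snapshots is where the real money would go.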
I realize there is no 0% failure, 100% guaranteed storage.
However, saying "yep, your storage space is now dead and you have lost all your files stored there" is not something I feel comfortable with. Why not offer something more resilient?
What magic storage device have you been using that has a failure rate better than the industry average for disk drives? If you are really concerned about this issue you can just run a software RAID over these raw block devices.
Stacking would be the wrong way to go about it. With ZFS, you could still get 99.99999...% but with only a slight percentage increase in necessary storage space.
I don't think you read closely enough: they define the failure rate as describing the likelihood of complete loss of the volume, not minor data corruption.
Unless you're mirroring your data across multiple drives, there is no way ZFS can magically recreate a volume that simply ceases to exist. Think of it this way: how would ZFS help you in your own server room if a non-mirrored disk caught fire and melted? Answer: it wouldn't. You'd go to backups, like you would with any other filesystem.
It is completely unacceptable for a normal FS. Something a bit more advanced like ZFS may be able to cope. What I am wondering about is why Amazon didn't implement such a solution on a massive scale themselves and then run EBS on top of it.
Because some users don't need reliability beyond what a normal hard drive offers, and they really shouldn't be compelled to pay for it. If you need reliability on these devices you can run software RAID or ZFS over the raw block devices being offered and tune the cost/reliability equation in whatever manner makes the most sense for your application.
This could be leveraged into a much cheaper (albeit less reliable) version of simple queue service. You can have multiple EC2 instances feeding off of the same disk used as a central repository.
In general, EBS is making me think creatively about how to better architect distributed systems. SQS was both expensive and somewhat limited in scope. But now that I get disks in common, there's no restriction as to how I might partition work among EC2s.
SQS got a lot cheaper, I think around Feb this year. Are you sure you aren't going by the old prices? I find it very affordable now.
$0.01 per 10,000 requests
$0.100 per GB in
$0.170 per GB out (less with higher volume).
I can't remember exactly, but I think it was only a per request cost before of around $0.10 per 1,000 or something.
Also the in/out costs don't apply when they're coming from EC2, which would be the case in your scenario I believe. So from EC2 if you're doing an EC2 based distributed system, you're looking at one dollar per million requests (so for processing a message in a distributed system typically one dollar per 500,000 messages pushed then pulled).
Pretty cheap by my standards. Much cheaper than it was at initial launch.
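The per-message math can be sketched like this. It assumes the $0.01 per 10,000 requests rate listed above, free transfer from EC2, and exactly two requests per message (one send, one receive); any extra delete or retry requests would add to that:

```python
REQUEST_PRICE_PER_10K = 0.01  # USD, from the prices listed above
REQUESTS_PER_MESSAGE = 2      # assumption: one send plus one receive


def sqs_request_cost(messages: int) -> float:
    """Request cost in USD of pushing and then pulling a number of messages."""
    requests = messages * REQUESTS_PER_MESSAGE
    return requests / 10_000 * REQUEST_PRICE_PER_10K


for count in (500_000, 10_000_000):
    print(f"{count:,} messages -> ${sqs_request_cost(count):.2f}")
```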
I've actually been racking my brain trying to come up with excuses to try it out and use it for something. I haven't come up with too much yet, but it seems pretty cool.
I've run http://Dibs.net on EC2 for over a year, with images served up on S3. Several servers have uptimes since I booted them in June '07. I have nothing but good things to say about my experience with AWS.
If you can, I highly recommend going to the AWS Start Up Tour (http://aws.typepad.com/aws/2008/08/2008-aws-start.html). It's a good place to meet people, and last year Jeff Bezos and John Doerr showed up to the one in Palo Alto.
Edit: Forgot to mention - I run PostgreSQL on instance storage with log shipping and backups to S3, and a couple of MySQL DBs for Cacti and geolocation lookups. EBS will help a lot in this area, which is kind of a pain right now. Can't wait to start using it.
This might reflect what I think of the whole cloud-hype, but I read that as "Easy BS". Maybe not the name you want coined for something which is supposed to be good, not shit ;)