
This seems to have only affected one region. Am I missing something?


Yes. It started as a failure in one region and propagated to others as it overloaded the "control plane" (the stuff that runs "the cloud"), and EBS tried to replicate "failed" disks to the point that Amazon ran out of disk space in the cluster. At the time, I was paying for RDS Multi-AZ, which runs your database in multiple availability zones at once, with hot failover if the primary goes offline. It failed to fail over anyway. Many large sites went down for a very long time that day, and people couldn't spawn replacement instances even in AZs other than the one where the failure started.


You're confusing region with AZ. They've never had a multi-region outage (yet).


It was one region, multiple availability zones. You're right that Multi-AZ != multi-region (for things like Sandy and other natural disasters), BUT AZs have mostly separate infrastructure, which does make multi-AZ failures very unlikely. Because Multi-AZ is so simple to set up (with VPC and so on), some people (perhaps wrongly) treat it like multi-region.



