Amazon’s cloud computing Infrastructure as a Service (IaaS) offering, Amazon Elastic Compute Cloud (EC2), and its related services, Elastic Block Store (EBS) and Relational Database Service (RDS), were severely affected today by downtime in Amazon’s Northern Virginia data centers.
This incident has brought down several of Amazon’s high-profile clients, including FourSquare, Reddit, Quora, Heroku and several other startups relying on EC2. The problems started at about 01:00 PDT, and Amazon is still working to recover the affected services. While Amazon hasn’t yet provided full details of the outage, it has posted an update saying that a networking event triggered a large amount of re-mirroring of EBS volumes in the US-EAST-1 region, which is served by the Northern Virginia data center.
This re-mirroring caused significant latency, taking sites hosted in this region offline. At the time of writing, Reddit was in emergency read-only mode, Quora and FourSquare had put up static “We’re having technical difficulties” pages, and Heroku was crawling.
Amazon’s EBS has come under fire of late because of its elevated error rates, so much so that reddit admin jedberg has talked about “figuring out ways how to not use EBS anymore”. The last EBS outage was just about a month ago, and it took reddit down for a good part of the day.
It is, however, important to note that some other services using AWS, such as Netflix and Twilio, have not suffered downtime because they run AWS instances in multiple availability zones, so that even if a data center goes down, the instances in other availability zones can continue serving their websites.
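To make the multi-zone idea concrete, here is a minimal Python sketch of how a request router might skip instances in a failed availability zone. It is only an illustration: the `Instance` class, the `pick_instance` helper, and the instance and zone names are hypothetical, and this is not how Netflix or Twilio actually implement failover.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical record for one server; not an AWS API type.
    name: str
    zone: str
    healthy: bool = True


def pick_instance(instances, failed_zones=frozenset()):
    """Return the first healthy instance outside the failed zones, or None."""
    for inst in instances:
        if inst.healthy and inst.zone not in failed_zones:
            return inst
    return None


# Assumed example fleet spread over three zones of one region.
fleet = [
    Instance("web-1", "us-east-1a"),
    Instance("web-2", "us-east-1b"),
    Instance("web-3", "us-east-1c"),
]

# Simulate losing one zone: traffic moves to an instance in another zone.
survivor = pick_instance(fleet, failed_zones={"us-east-1a"})
print(survivor.name if survivor else "no capacity")
```

If every instance sat in the single failed zone, `pick_instance` would return `None`, which corresponds to the sites that went offline.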