“I always mess up some mundane detail.”
How a typo took down S3, the backbone of the internet:
Earlier this week, much of the internet ground to a halt when the servers that power it suddenly vanished. The servers were part of S3, Amazon’s popular web hosting service, and when they went down they took several big services with them. Quora, Trello, and IFTTT were among the sites affected by the disruption. The servers came back online more than four hours later, but not before totally ruining the UK celebration of AWSome Day.
Now we know how it happened. In a note posted to customers today, Amazon revealed the cause of the problem: a typo.
On Tuesday morning, members of the S3 team were debugging the billing system. As part of that, the team needed to take a small number of servers offline. “Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended,” Amazon said. “The servers that were inadvertently removed supported two other S3 subsystems.”
It sounds like a typical Michael Bolton Error to me.
It’s so great we have a decentralized network of servers that keeps the Internet up and running all the time.
You know, like one of the original goals of ARPANET.