Amazon's huge cloud computing outage tracked to bad keystrokes

By Brian Fung

Washington Post

2 Mar, 2017 08:24 PM

An outage across large parts of the internet has been blamed on a simple employee mistake.

Amazon is back with an apology and an explanation for a high-profile malfunction that caused websites across the Internet to grind to a halt for hours on Wednesday.

The online retail giant, which runs a popular cloud computing platform for sites such as Airbnb, Netflix, Reddit and Quora, is blaming the outage on a simple - and perhaps somewhat amusing - employee mistake.

A team member was doing a bit of maintenance on Amazon Web Services (AWS) on Tuesday, trying to speed up the billing system, when he or she tapped in the wrong codes - and inadvertently took more servers offline than the procedure was supposed to, Amazon said in a statement. With a few mistaken keystrokes, the employee wound up knocking out systems that supported other systems that help AWS work properly.

The cascading failure meant that many websites could no longer make changes to the information stored on Amazon's cloud platform. For everyday users, that meant being unable to load pages, transfer files or take other actions on some of the sites they regularly use.

"In this instance, the tool used allowed too much capacity to be removed too quickly," Amazon said. "We have modified this tool to remove capacity more slowly and added safeguards to prevent capacity from being removed when it will take any subsystem below its minimum required capacity level."

Translation: Employees will no longer be able to unplug whole parts of the Internet by mistake.
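For readers curious what such a safeguard might look like in practice, here is a minimal Python sketch of the two checks Amazon describes: removing capacity slowly, and refusing to take a subsystem below its minimum. It is illustrative only and is not Amazon's tool; the `Subsystem` class, the `remove_capacity` function, the `max_per_step` limit and all the numbers are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Subsystem:
    """Hypothetical model of a pool of servers (not an AWS data structure)."""
    name: str
    active_servers: int
    min_required: int


def remove_capacity(subsystem: Subsystem, requested: int, max_per_step: int = 2) -> int:
    """Remove up to `requested` servers and return how many were actually removed.

    Two assumed safeguards, loosely mirroring Amazon's statement:
    - at most `max_per_step` servers come out per call (slower removal)
    - the pool is never taken below `min_required`
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    # Servers that can still be removed without breaching the minimum.
    headroom = max(subsystem.active_servers - subsystem.min_required, 0)
    allowed = min(requested, max_per_step, headroom)
    subsystem.active_servers -= allowed
    return allowed


# A mistyped, oversized request (50 servers) against a pool of 12 with a floor of 8.
pool = Subsystem(name="example-pool", active_servers=12, min_required=8)
removed = [remove_capacity(pool, requested=50) for _ in range(3)]
print(removed, pool.active_servers)
```

In this sketch the oversized request is trimmed on every call, and once the pool reaches its floor nothing more is removed, so a typo slows the operator down instead of emptying the pool.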

Amazon said it was sorry for the outage's effect on its customers and vowed to learn from the incident. One immediate next step? The company said it will subdivide its servers even more than before "to reduce blast radius and improve recovery," should something like this happen again.
