Magnify.net Blog

Service Advisory: S3 Server Outage

July 20, 2008 - Steve Rosenbaum

No doubt many of you have experienced inconsistent server responses from your Magnify.net pages in the past hour or so. We wanted to let you know the source of this outage, and the steps we're taking to correct it.

Amazon S3 has been suffering a major outage this morning. The outage affects the worldwide servers of Amazon S3 (Simple Storage Service), which powers certain file storage and file delivery elements of the Magnify.net service.

Amazon S3 service advisories over the past two hours alerted us to the problem, and we have been working with our development team to track it and to remove some of the dependencies that have affected our pages during Amazon's outage.

Here is the link to the Amazon S3 service status page:

http://status.aws.amazon.com/

Here are the status updates Amazon posted about the service earlier today:

9:06 AM PDT We are currently experiencing elevated error rates with S3. We are investigating.
9:27 AM PDT We're investigating an issue affecting requests. We'll continue to post updates here.
9:48 AM PDT Just wanted to provide an update that we are currently pursuing several paths of corrective action.
10:13 AM PDT We are continuing to pursue corrective action.
10:33 AM PDT A quick update that we believe this is an issue with the communication between several Amazon S3 internal components. We do not have an ETA at this time but will continue to keep you updated.
11:02 AM PDT We're currently in the process of testing a potential solution.
11:23 AM PDT Testing is still in progress. We're working very hard to restore service to our customers.
11:46 AM PDT We are still in the process of testing a series of configuration changes aimed at bringing the service back online.
12:06 PM PDT We have now restored communication between a small subset of hosts. We are working on restoring internal communication across the rest of the fleet. Once communication is fully restored, then we will work to restore request processing.

Other services are dealing with the outage as well:

http://smugmug.wordpress.com/2008/07/20/amazon-s3-outage-causes-smugmug-outage/
http://blog.slideshare.net/2008/07/20/amazon-web-services-outage-effecting-slidesharenet/
http://donaldkelly.co.uk/20/07/2008/images-offline/

We expect that there will be an investigation into the outage once the source of the problem has been discovered. We will explore with our Amazon reps their plans to address this, and we will keep you posted on our plans regarding the overall reliability of Amazon as a service provider to Magnify.net and our valued partners and customers.