Amazon S3 Redirection Instructions

April 9, 2015

Hosting static web pages on Amazon S3 is easy. Making sure your links don't break isn't.

I've recently moved this blog from a freely hosted wordpress.com blog to this static host on Amazon S3. In hopes of maintaining a steady stream of visitors, I've also purchased the $13/year website forwarding package from WordPress. Now anyone who tries to visit any of my old blog URLs will be automatically forwarded to this new host. Problem is, I've made slight changes to the blog directory structure, and now most of the automatically forwarded traffic is served a 404 page. Not good! Amazon S3 can redirect traffic automatically, with two catches:

  1. You're limited to 50 redirects total
  2. Redirects must be written in Amazon's XML format

So, with these limits in mind, I'm going to step you through the way to create your XML file and put Amazon S3 redirects to work for you!

First things first: Compile a list of old URLs

With an upper limit of 50 redirects, you need to select your highest-traffic pages. I used my WordPress Site Stats to get a list of my 50 most-visited pages of all time. I used my hacker skills to 'view source' on the resulting web page, then ran the raw HTML through this online engine, which extracts the URLs and puts them into a spreadsheet format.

http://tools.buzzstream.com/link-building-extract-urls

I imported the resulting .csv download into Google Spreadsheets. Remove the second and third columns (named Domain and Anchor), as we don't need them. Use the search and replace function to strip the leading server URL, which is everything matching 'http://blogname.wordpress.com' at the beginning of every URL. Remove any links to .php pages, and slim the list down to the top 50 links. Here are my top 5 results:

/2012/07/30/looking-for-clever-funny-best-man-speech-jokes/
/2010/07/21/four-loko-alcopop-taste-test-blue-raspberry-orange-blend-or-loko-uva/
/2010/08/23/four-loko-alcopop-taste-test-part-2-lemonade-watermelon-cranberry-lemonade/
/2012/08/25/phillly-naked-bike-ride-2012/
/2010/04/07/make-your-own-power-over-ethernet-injector/
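
If you'd rather skip the spreadsheet, the same cleanup can be sketched in a few lines of Python. This is only a rough equivalent of the manual steps above, not what I actually ran: the function name clean_paths and the sample URLs are made up for illustration, and it assumes your list is already sorted by traffic.

```python
from urllib.parse import urlparse

def clean_paths(urls, limit=50):
    """Strip the scheme and host, drop .php links, keep the first `limit` unique paths."""
    paths = []
    for url in urls:
        path = urlparse(url.strip()).path
        # Skip bare domain links and anything pointing at a .php page.
        if not path or path.endswith(".php"):
            continue
        if path not in paths:
            paths.append(path)
    return paths[:limit]

# Hypothetical input, in the same shape as the extracted URL column.
sample = [
    "http://blogname.wordpress.com/2010/04/07/make-your-own-power-over-ethernet-injector/",
    "http://blogname.wordpress.com/wp-login.php",
    "http://blogname.wordpress.com/2012/08/25/phillly-naked-bike-ride-2012/",
]

for path in clean_paths(sample):
    print(path)
```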

Second things second: Compile a list of new URLs

WordPress adds the day of publication to the page directory structure, as shown by the /####/##/##/ format in the URLs above. My new site only keeps the year and month, so using a grep-style search and replace, I changed the list to the following:

/2012/07/looking-for-clever-funny-best-man-speech-jokes/
/2010/07/four-loko-alcopop-taste-test-blue-raspberry-orange-blend-or-loko-uva/
/2010/08/four-loko-alcopop-taste-test-part-2-lemonade-watermelon-cranberry-lemonade/
/2012/08/phillly-naked-bike-ride-2012/
/2010/04/make-your-own-power-over-ethernet-injector/

In this case, I searched for (\d{4})\/(\d{2})\/\d{2} and replaced it with \1/\2 using my favorite text editor, BBEdit. It basically searches for this pattern: ####/##/## and replaces it with this pattern: ####/##.

Copy your new streamlined list of URLs, and go back to your spreadsheet. Paste the new URLs into a column to the right of the existing URLs. Check that each row pairs an old URL with its new location, then export the resulting file as a Tab-Delimited file.
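
For anyone without a grep-capable editor, here's a minimal sketch of the same search and replace using Python's re module. The name drop_day is just something I made up for the example. Since forward slashes don't need escaping in Python, the pattern looks slightly cleaner than the one above.

```python
import re

# Same pattern as above: capture the year and month, match (but discard) the day.
DATE_PREFIX = re.compile(r"(\d{4})/(\d{2})/\d{2}")

def drop_day(path):
    """Turn /YYYY/MM/DD/slug/ into /YYYY/MM/slug/."""
    # count=1 so only the leading date is touched, never digits in the slug.
    return DATE_PREFIX.sub(r"\1/\2", path, count=1)

print(drop_day("/2012/07/30/looking-for-clever-funny-best-man-speech-jokes/"))
# prints /2012/07/looking-for-clever-funny-best-man-speech-jokes/
```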
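
The pairing and export step could also be done without the spreadsheet. Below is a small sketch, assuming you already have the old and new paths as two Python lists in the same order; to_tab_delimited is a hypothetical helper name, not part of any tool mentioned here.

```python
import csv
import io

def to_tab_delimited(old_paths, new_paths):
    """Return one 'old<TAB>new' line per redirect."""
    if len(old_paths) != len(new_paths):
        raise ValueError("old and new URL lists must be the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(zip(old_paths, new_paths))
    return buffer.getvalue()

old = ["/2010/04/07/make-your-own-power-over-ethernet-injector/"]
new = ["/2010/04/make-your-own-power-over-ethernet-injector/"]
print(to_tab_delimited(old, new), end="")
```

The length check is there because a single missing row shifts every pair below it, which would quietly send visitors to the wrong posts.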

Turn your spreadsheet into XML

Open your tab-delimited file in your text editor, select all the text, and head over to this free app that converts that input into valid Amazon XML:

http://quiet-cove-8872.herokuapp.com/

The format is shown on the page, but basically, it accepts the original link, then the new link on the same line, with at least a space between them. As we exported our file as tab-delimited, there's a tab between each URL, and two URLs on each line. Perfect format for this tool! Paste it in, hit the 'Give me the XML' button, and the resulting page will be your newly formed XML. Copy that text, and head over to your Amazon S3 Management Console.
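
In case that app ever disappears, here's a rough sketch of how the same conversion could be done locally in Python. I don't know exactly what XML the app emits, so treat this as an assumption: it builds S3 routing rules using the KeyPrefixEquals and ReplaceKeyWith elements, and strips the leading slash because S3 object keys don't start with one. The function name routing_rules_xml is my own invention.

```python
import xml.etree.ElementTree as ET

MAX_RULES = 50  # the redirect limit mentioned at the top of this post

def routing_rules_xml(pairs_text):
    """Build S3 RoutingRules XML from lines of 'old-path<whitespace>new-path'."""
    root = ET.Element("RoutingRules")
    for line in pairs_text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        # S3 keys have no leading slash, so drop it from both paths.
        old_key, new_key = (part.lstrip("/") for part in parts)
        rule = ET.SubElement(root, "RoutingRule")
        condition = ET.SubElement(rule, "Condition")
        ET.SubElement(condition, "KeyPrefixEquals").text = old_key
        redirect = ET.SubElement(rule, "Redirect")
        ET.SubElement(redirect, "ReplaceKeyWith").text = new_key
    if len(root) > MAX_RULES:
        raise ValueError("more than %d redirects" % MAX_RULES)
    return ET.tostring(root, encoding="unicode")

pairs = "/2012/08/25/phillly-naked-bike-ride-2012/\t/2012/08/phillly-naked-bike-ride-2012/"
print(routing_rules_xml(pairs))
```

Using an XML library rather than string concatenation means any stray ampersands in a URL get escaped for you.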

Add to Amazon S3 host, Enjoy!

The Redirection Rules for any bucket are part of the Static Website Hosting section. Click the Magnifying Glass next to the bucket to bring up the Properties. Open the Static Website Hosting section, then the Redirection Rules section. Simply paste your XML into the field, then hit Save. If it validates, you'll see a tiny green checkmark.

You're done!