Jeff Atwood (of Coding Horror fame) was in for a horror when his server crashed, his data was gone and, for some reason, the backup mechanism was not working. All the data for Coding Horror and the Stack Overflow blog disappeared.
Since his blog is very popular, many archiving systems, including the Google cache, have copies of the pages, and I hope that by now they have recovered the complete textual data. The biggest problem in this case is getting back the images. Not many archiving services are likely to have a complete backup of the website’s images.
So what should Jeff do now?
Since Coding Horror is a high-traffic blog, I think there is a way to get back at least some of the images. (The probability of this working depends a lot on the traffic to the website and a bit of luck.)
Here are the steps:
- Configure the web server to return 304 for every image request. The HTTP status code 304 (Not Modified) tells the browser that its cached copy is still valid, so the browser will load the file from its cache if it is present there. (credit: this SuperUser answer)
- In every page in the website, add a small script to capture the image data and send it to the server.
- Save the image data in the server.
- Convert the pixel data to get the original images. Voila!
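The first step can be sketched in JavaScript for a Node.js server. This is only an illustration under assumptions: the post talks about configuring the existing web server, not about Node, and the extension list and the names `isImageRequest` and `handleRequest` are made up for this example.

```javascript
// Assumption: these extensions cover the images on the site.
const IMAGE_EXTENSIONS = ['.png', '.jpg', '.jpeg', '.gif'];

// Hypothetical helper: does this request path point at an image file?
function isImageRequest(urlPath) {
  const path = urlPath.split('?')[0].toLowerCase();
  return IMAGE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

// Request handler: answer every image request with 304 Not Modified
// and an empty body, so the browser falls back to its cached copy.
function handleRequest(req, res) {
  if (isImageRequest(req.url)) {
    res.writeHead(304);
    res.end();
    return;
  }
  // Everything else is out of scope for this sketch.
  res.writeHead(404);
  res.end();
}

// To try it with Node's built-in http module:
// require('http').createServer(handleRequest).listen(8080);
```

A visitor whose browser has no cached copy simply sees a broken image, which is the luck-dependent part of the scheme.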
Capturing the image data
We are going to use the Canvas functionality in HTML 5 to get back the image data.
Here is the code you should insert into the pages of the website. It finds all the images in the current page, loads each one into an HTML canvas, reads the pixel data for the image and sends it to the server through an Ajax post.
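A minimal sketch of such a script, not the original listing, might look like the following. It assumes jQuery is loaded on the page; the endpoint `/save_image.php`, the field names and the helper names `pixelsToHex` and `captureImage` are hypothetical.

```javascript
// Assumption: hypothetical endpoint; edit it to match your own domain.
const POST_URL = '/save_image.php';

// Turn canvas pixel bytes (R, G, B, A; each 0-255) into a compact hex string.
function pixelsToHex(pixels) {
  let hex = '';
  for (let i = 0; i < pixels.length; i += 1) {
    hex += (pixels[i] < 16 ? '0' : '') + pixels[i].toString(16);
  }
  return hex;
}

// Draw one <img> element onto a canvas and read its pixels back.
function captureImage(img) {
  const canvas = document.createElement('canvas');
  canvas.width = img.naturalWidth;
  canvas.height = img.naturalHeight;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(img, 0, 0);
  // getImageData throws a SecurityError for images from another origin.
  const data = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
  return {
    name: img.src,
    width: canvas.width,
    height: canvas.height,
    pixels: pixelsToHex(data),
  };
}

// Browser-only part: run once the page and its images have finished loading.
if (typeof window !== 'undefined' && typeof $ !== 'undefined') {
  $(window).on('load', function () {
    $('img').each(function () {
      // Skip images that did not load (not in this visitor's cache).
      if (!this.complete || this.naturalWidth === 0) return;
      try {
        $.post(POST_URL, captureImage(this));
      } catch (e) {
        // Unreadable (for example cross-origin) image: ignore it.
      }
    });
  });
}
```

Waiting for the window `load` event matters here, because an image that has not finished loading has no pixels to read.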
This PHP script (Can PHP rescue Jeff? To be fair, the server-side code is trivial) saves the data to files on the server. Note that the files themselves will not be images; they will just contain the pixel data of the images. We also save the original file name and the image dimensions, so the original images can be reconstructed from this data. Data from every visitor is saved in a different file, just to make sure that you have enough redundancy. (Watch out for this redundancy filling up your server disks.)
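The reconstruction step can be sketched as follows. The record shape (`name`, `width`, `height`, and `pixels` as a hex string with two characters per byte) is an assumption about how the saved data is laid out, and `hexToPixels` and `rebuildImage` are hypothetical names.

```javascript
// Assumed record shape (hypothetical): { name, width, height, pixels },
// where pixels is a hex string holding two characters per RGBA byte.
function hexToPixels(hex, width, height) {
  const expected = width * height * 4; // four bytes (R, G, B, A) per pixel
  if (hex.length !== expected * 2) {
    throw new Error('pixel data does not match the saved dimensions');
  }
  const pixels = new Uint8ClampedArray(expected);
  for (let i = 0; i < expected; i += 1) {
    pixels[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return pixels;
}

// Browser-only: paint the pixels onto a canvas and export a PNG data URL.
function rebuildImage(record) {
  const canvas = document.createElement('canvas');
  canvas.width = record.width;
  canvas.height = record.height;
  const pixels = hexToPixels(record.pixels, record.width, record.height);
  const imageData = new ImageData(pixels, record.width, record.height);
  canvas.getContext('2d').putImageData(imageData, 0, 0);
  return canvas.toDataURL('image/png');
}
```

The result is a re-encoded PNG with the same pixels, not a byte-for-byte copy of the original file. The dimension check is also a cheap way to discard truncated uploads before choosing among the redundant copies.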
Remember that this is proof-of-concept code. You will have to modify it before using it in a production environment. It has many limitations. It goes without saying that you will get image data back only from users who still have the images cached in their browsers. The script will work only in the latest versions of Chrome, Firefox, Safari, Opera etc. (Don’t ever expect it to work in IE for the next decade). Also remember that the raw pixel data will be many times bigger than the original compressed file, so keep a careful eye on the disk space this script uses. (I guess in an emergency, none of this really matters).
You should edit the post URL in the script to match your domain name.
Finally, I have tested the code and it seems to work (for me, at least). You need to include jQuery in the pages that use this script and, because of the browsers’ same-origin security restrictions, all these files must be served from the same domain name. Please tell me if there are any other flaws in the code.
[Updated: code changes to reduce the file size by 50%. The decimal numbers were converted to hex and the spaces in between the numbers removed. The file sizes can be further reduced by using the full character set.]
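To illustrate the size difference, here is a small sketch that compares encodings of the same bytes. It assumes the earlier format was decimal numbers separated by a single space, which may not match the original exactly; the base64 figure is just one example of a denser character set, computed arithmetically, and `encodedSizes` is a hypothetical name.

```javascript
// Compare how many characters the same pixel bytes need in each encoding.
function encodedSizes(pixels) {
  // Decimal numbers separated by single spaces, e.g. "255 128 0 255".
  const decimal = Array.from(pixels).join(' ');
  // Two hex digits per byte with no separators, e.g. "ff8000ff".
  const hex = Array.from(pixels, (b) => b.toString(16).padStart(2, '0')).join('');
  return {
    decimal: decimal.length,
    hex: hex.length,
    // Base64 packs every 3 bytes into 4 characters (with padding).
    base64: Math.ceil(pixels.length / 3) * 4,
  };
}
```

For bytes that are mostly three-digit values, decimal costs about four characters per byte against two for hex, which is where a saving of roughly 50% comes from.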

Awesome post!
You now have another subscriber and follower, thanks to @spolsky’s RT
Make that two new followers from that RT.
Awesome! very creative solution, kudos
Congrats, you impressed me.
Very clever.
Truly awesome!
[...] Diovo: Get cached images from your visitors You run a website. Through a combination of oversight and error, you lose every image on your website. O noes—now what? This fellow has the solution: Return a 304 for each image request, triggering past visitors' browsers to load the image from cache, and then using a bit of JavaScript to grab each image's data and send it back to the server, which saves it. Very clever! (tags: javascript php http images) [...]
All the best with your eyes; I hope you get it resolved and can be in front of computers again; clearly it is a place where you have some ability
Kudos for this very clever suggestion! But you can take it further and automate the process.
Craft some PHP that checks for, e.g., a query parameter; if found, add a redirect to the page. The redirects on each page would be crafted to form a “chain” so a browser will traverse every page on the site. Jeff’s readers can then click one link and leave their browsers to their task (for instance, overnight).
To speed up the process, put the redirect in the window.onload event rather than using a meta tag.
cadams,
No need for any redirects. Just send the browser a list of missing image files. As you can see, any image can be sent back from any page!
I know that this is POC code, but…
You might want to add some checking to the PHP file, since as it stands now, it paints a huge bulls-eye on your site for remote exploits, uploading code, etc…
Aside from the obvious and manageable security issues, this is absolutely brilliant!
Stunning stunt!
At first sight I couldn’t figure out why it’s “i += 5”, not “i += 4” in line 34.
ende,
That was a mistake. It should be “i += 1”. Corrected now.
Super cool solution.
Fantastic solution. It’s really awesome.
[...] hard disk went off to its final resting place. One of Atwood’s readers came up with a remarkably elegant and clever way of recovering some of those images from the browser caches of Coding Horror readers (complete with code samples). This is the sort of hack I love: clever, [...]
[...] Shared Get cached images from your visitors | Diovo. [...]
Nice idea
I think type of site that is useful in sharing information