Perl saved my vacation!

My wife and I are on vacation. We spent yesterday at Longwood Gardens in eastern Pennsylvania. It’s beautiful: words don’t do it justice – you really do need to see the photos. That’s why I took about 500 photos yesterday. At least, that’s what I thought until I got back to our bed & breakfast and tried to download the photos from my camera to my laptop.

Crap! About a quarter of the photos were missing! I wasn’t sure what had happened, but I suspect that my budget 16-GB SD card had started throwing bad blocks. There go my priceless vacation memories, right down the technological toilet.

But maybe those photos weren’t irretrievably flushed. Maybe the data behind most of them was still on the SD card, if only I could get at it. Hey, I’ve got Perl on my laptop: I can get at that data.

Here’s how it went down.

First, I scanned the raw blocks of the suspected-faulty SD card, looking for the markers that indicate the start of a JPEG file. My laptop runs Linux, so it was easy to access the blocks. I just read the device file corresponding to the SD card’s filesystem. I used a Perl script to walk through the file in block-sized steps, hunting JPEG headers. Here’s the meat of the script:

while (my $bytes_read = sysread($fh, $buffer, $READ_BLOCKS*$BLOCK_SIZE)) {
    for (my $offset = 0; $offset < $bytes_read; $offset += $BLOCK_SIZE) {
        my $tag = substr($buffer, $offset + 6, 4);
        if (grep $tag eq $_, qw(JFIF Exif)) {
            print $pos + $offset/$BLOCK_SIZE, "\n";  # emit "interesting" block
        }
    }
    $pos += $bytes_read/$BLOCK_SIZE;
}
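
Why offset 6? A JPEG file opens with a two-byte start-of-image marker (FF D8), then a two-byte application marker (FF E0 for JFIF, FF E1 for Exif) and a two-byte segment length, so the identifier string lands at bytes 6 through 9. Here is a minimal, self-contained sketch of that test. The `jpeg_tag` helper and the synthetic blocks are hypothetical, made up for illustration, and the FF D8 check is an extra guard that the fragment above does not perform.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script: returns the tag
# ("JFIF" or "Exif") if $block looks like the start of a JPEG file, or
# undef otherwise.
sub jpeg_tag {
    my ($block) = @_;
    return undef if length($block) < 10;

    # Bytes 0-1: start-of-image marker. This check is an added assumption;
    # the original fragment only inspects bytes 6-9.
    return undef unless substr($block, 0, 2) eq "\xFF\xD8";

    # Bytes 2-3: APP0/APP1 marker; bytes 4-5: segment length;
    # bytes 6-9: identifier string.
    my $tag = substr($block, 6, 4);
    return (grep $tag eq $_, qw(JFIF Exif)) ? $tag : undef;
}

# Synthetic 512-byte blocks (made-up data) to exercise the helper.
my $jfif  = "\xFF\xD8\xFF\xE0" . "\x00\x10" . "JFIF" . ("\0" x 502);
my $exif  = "\xFF\xD8\xFF\xE1" . "\x12\x34" . "Exif" . ("\0" x 502);
my $other = "\0" x 512;

for my $block ($jfif, $exif, $other) {
    my $tag = jpeg_tag($block);
    print defined $tag ? $tag : 'no match', "\n";
}
```

A tag-only test like the one in the script can fire on a block that merely happens to contain those four bytes at the right spot. That is harmless here, because the second script lets jpegtran decide whether the data really decodes.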

I handled the second half of the rescue mission with another Perl script. This script grabbed the data starting at each interesting block, as determined by the first script, and tried to decode the data losslessly as a JPEG file, writing the result to a new file on my laptop’s hard drive. Here are the tasty bits of that script:

$SIG{PIPE} = 'IGNORE';  # pipe will break on damaged images

while (my $target = <>) {

    # seek forward until we hit the desired block
    while ($pos != $target) {
        my $diff = $target - $pos;
        $diff = $MAX_SEEK_BLOCKS if $diff > $MAX_SEEK_BLOCKS;
        sysseek($fh, $diff * $BLOCK_SIZE, 1);
        $pos += $diff;
    }

    # read the data starting at that block, attempting to decode as JPEG
    if (my $bytes_read = sysread($fh, $buffer, $BLOCK_SIZE*$DATA_READ_SIZE)) {
        my $outfile = sprintf("%s/%010d.jpg", $outdir, $target);
        open(my $pipe, '|-', "jpegtran -copy all -outfile $outfile")
            or die "can't open pipe: $!";
        print $pipe $buffer;
        close($pipe);
        $pos += $DATA_READ_SIZE;
    }
}
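
The seek loop is the least obvious part of that fragment: it walks to the target block in relative hops of at most $MAX_SEEK_BLOCKS blocks instead of making one big jump. The post doesn't say why the hops are capped, so the sketch below only isolates the stepping logic and runs it against a small scratch file. The constant values, the `seek_to_block` helper, the error check, and the symmetric cap on backward hops are all assumptions added for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR);
use File::Temp qw(tempfile);

my $BLOCK_SIZE      = 512;  # assumed value; the post does not give one
my $MAX_SEEK_BLOCKS = 4;    # deliberately tiny so the demo needs several hops

# Hypothetical helper: move $fh from block $pos to block $target using
# relative seeks of at most $MAX_SEEK_BLOCKS blocks each. Returns the
# number of seeks performed.
sub seek_to_block {
    my ($fh, $pos, $target) = @_;
    my $steps = 0;
    while ($pos != $target) {
        my $diff = $target - $pos;
        $diff =  $MAX_SEEK_BLOCKS if $diff >  $MAX_SEEK_BLOCKS;
        $diff = -$MAX_SEEK_BLOCKS if $diff < -$MAX_SEEK_BLOCKS;
        # sysseek returns "0 but true" for position 0, so a false
        # value here always means failure.
        sysseek($fh, $diff * $BLOCK_SIZE, SEEK_CUR)
            or die "can't seek: $!";
        $pos += $diff;
        $steps++;
    }
    return $steps;
}

# Build a 10-block scratch file in which block N is filled with the digit N.
my ($fh, $filename) = tempfile(UNLINK => 1);
binmode $fh;
syswrite($fh, $_ x $BLOCK_SIZE) for 0 .. 9;
sysseek($fh, 0, SEEK_SET) or die "can't rewind: $!";

my $steps = seek_to_block($fh, 0, 9);   # hops of 4, 4, and 1 blocks
sysread($fh, my $buffer, 1);
print "reached block $buffer in $steps seeks\n";
```

Because the block numbers arrive in ascending order but each read consumes a fixed chunk, the next target can sit behind the current position. That is why the sketch caps backward hops as well as forward ones.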

With these two scripts, I was able to retrieve almost all of the lost photos. That’s 120 more priceless memories that I can post to Flickr to annoy my friends. Yay, Perl!

(BTW, if you want to see some of the rescued photos, see my Longwood Gardens, Spring 2009 photo set on Flickr.)