Solving the Google Code Jam "countPaths" problem in Perl

Posted on August 17, 2006

Tags: google, code-jam, perl, wordpaths, countpaths, puzzles

As promised, here’s a Perl version of a dynamic-programming-based solver for the Google Code Jam “countPaths” problem. It is a straight translation of my improved Ruby implementation. As you might expect, the Perl version was pretty fast. It proved faster than the other scripting-language implementations I tried (in this rather unscientific benchmark, not to be taken seriously):

Haskell	0.9 s
Perl (code below)	1.7 s
Python	2.8 s
Ruby	4.2 s

All timings were taken while solving the maximum-size, all-the-same-letter problem on my 1.8-GHz Opteron box.

Here’s the Perl implementation:

#!/usr/bin/perl

# Tom Moertel <tom@moertel.com>
# 2006-08-16
#
# Perl-based solution to the Google Code Jam problem "countPaths".
# See http://www.cs.uic.edu/~hnagaraj/articles/code-jam/ for more.

use strict;
use warnings;

use List::Util 'sum';
use Math::BigInt;

sub count_paths {

  my ($grid, $word) = @_;

  my $rword  = reverse $word;
  my $rowmax = $#$grid;
  my $colmax = length($grid->[0]);
  my ($slab, $sum);

  for my $i (0 .. length($rword) - 1) {
    my $char = substr $rword, $i, 1;
    ($slab, my $previous_slab) = ([], $slab);
    for my $r (0 .. $rowmax) {
      my ($row, $line) = ($grid->[$r], $slab->[$r] ||= []);
      for my $c (0 .. $colmax) {
        $line->[$c] = $char ne substr($row,$c,1) ? 0 : $i == 0 ? 1 : do {
          $sum = 0;
          my $clo = $c > 0 ? $c - 1 : $c;
          my $chi = $c < $colmax ? $c + 1 : $c;
          for my $nr (($r>0 ? $r-1 : $r) .. ($r<$rowmax ? $r+1 : $r)) {
            for my $nc ($clo .. $chi) {
              $sum += $previous_slab->[$nr][$nc]
                if $nr != $r || $nc != $c;
            }
          }
          $sum;
        }
      }
    }
  }

  sum map @$_, @$slab;
}

print count_paths([("A"x50)x50], "A"x50), $/;
# 3.03835410591851e+47

Update: I simplified the code a whisper by removing an unnecessary variable $counts. Here’s a diff if you’re curious about what’s changed:

--- countpaths.pl.orig  2006-08-18 00:16:56.000000000 -0400
+++ /countpaths.pl      2006-08-18 00:19:30.000000000 -0400
@@ -19,11 +19,11 @@
   my $rword  = reverse $word;
   my $rowmax = $#$grid;
   my $colmax = length($grid->[0]);
-  my ($counts, $slab, $sum);
+  my ($slab, $sum);

   for my $i (0 .. length($rword) - 1) {
     my $char = substr $rword, $i, 1;
-    ($slab, my $previous_slab) = ($counts->[$i] ||= [], $slab);
+    ($slab, my $previous_slab) = ([], $slab);
     for my $r (0 .. $rowmax) {
       my ($row, $line) = ($grid->[$r], $slab->[$r] ||= []);
       for my $c (0 .. $colmax) {

Update 2: Augmented the introductory paragraph with a parenthetical comment that reminds readers that these single-fuzzy-data-point-style timings should not be taken seriously. Also removed the word “bested,” which might suggest that there is an optimization contest in play. Please, no wagering.

Update 3: Stripped another variable ($j), which was completely unused and leftover from previous implementation. (See why you shouldn’t code late at night?)