Solving the Google Code Jam "countPaths" problem in Ruby
Posted by Tom Moertel Wed, 16 Aug 2006 22:54:00 GMT
Here’s a Ruby version of a dynamic-programming-based solver for the Google Code Jam “countPaths” problem. It is essentially the same as my earlier Haskell-based solution (see Update 2), but much slower. Whereas the Haskell version solves the maximum-size, all-the-same-letter problem in about 0.9 second, the Ruby version requires about 71 seconds. Maybe somebody who understands Ruby’s internals better than I do can come up with some optimizations.
Here’s the code:
# Tom Moertel <tom@moertel.com>
# 2006-08-16
#
# Ruby-based solution to the Google Code Jam problem "countPaths"
# See http://www.cs.uic.edu/~hnagaraj/articles/code-jam/ for more.
class WordPath
include Enumerable
def initialize(grid, word)
@grid, @rword, @counts = grid, word.reverse, {}
end
def self.count_paths(grid, word)
new(grid, word).solve
end
def solve
final_index = @rword.length - 1
inject(0) { |sum, rc| sum + count_from(final_index, *rc) }
end
private
def count_from(i, r, c)
@counts[[r, c, i]] ||= begin
match = @rword[i] == @grid[r][c]
case
when i == 0 && match then 1
when match then subsum_of_neighbors(r, c, i - 1)
else 0
end
end
end
def subsum_of_neighbors(r, c, i)
sum = 0
rowlen = @grid[0].size
for nr in [r - 1, r, r + 1]
next if nr < 0 or nr >= @grid.size
for nc in [c - 1, c, c + 1]
next if nc < 0 || nc >= rowlen
next unless r != nr || c != nc
if count = count_from(i, nr, nc)
sum += count
end
end
end
sum
end
def each
@grid.each_index do |r|
@grid[0].size.times { |c| yield([r, c]) }
end
end
end
# TESTS
if ENV["TEST"] || ENV["BIG_TEST"]
require "test/unit"
class TestWordPath < Test::Unit::TestCase
if ENV["BIG_TEST"]
def test_big_problem
assert_equal \
303835410591851117616135618108340196903254429200,
WordPath.count_paths(["A"*50] * 50, "A"*50)
end
end
if ENV["TEST"]
def test_count_paths
w = WordPath
assert_equal 1,
w.count_paths(%w{ABC FED GHI}, "ABCDEFGHI")
assert_equal 2,
w.count_paths(%w{ABC FED GAI}, "ABCDEA")
assert_equal 0,
w.count_paths(%w{ABC DEF GHI}, "ABCD")
assert_equal 108,
w.count_paths(%w{AA AA}, "AAAA")
assert_equal 56448,
w.count_paths(%w{ABABA BABAB ABABA BABAB ABABA}, "ABABABBA")
assert_equal 2745564336,
w.count_paths(%w{AAAAA AAAAA AAAAA AAAAA AAAAA}, "AAAAAAAAAAA")
assert_equal 0,
w.count_paths(%w{AB CD}, "AA" )
assert_equal 1,
w.count_paths(%w{A}, "A")
end
end
end
end
Set the BIG_TEST and/or TEST environment
variables to run the test suites. For example:
$ TEST=1 ./countpaths.rb
Loaded suite countpaths Started . Finished in 0.02062 seconds. 1 tests, 8 assertions, 0 failures, 0 errors
Unless somebody beats me to it, I’ll whip up a Perl version for comparison.
A performance killer in the previous version was using a single big hash for my cache. Now I use a 3D array:
counts[[i,r,c]] # one big hash (slower)
counts[i][r][c] # 3D-array (faster)
An additional advantage of the 3D-array is that I can peel off slabs as I descend the outer layers of nested loops. For instance, instead of writing:
for i in 0 .. 10
for j in 0 .. 10
sum += counts[i][j]
end
end
I can lift the counts[i] slab out of the inner
loop to eliminate j array-indexing operations:
for i in 0 .. 10
slab = counts[i]
for j in 0 .. 10
sum += slab[j]
end
end
Here’s the new code (sans the unit tests, which haven’t changed):
class WordPath
A = Array
def self.count_paths(grid, word)
rword = word.reverse
rowmax = grid.size - 1
colmax = grid.first.size - 1
for i in 0 .. rword.size - 1
letter = rword[i]
previous_slab, slab = slab, A.new(rowmax+1) { A.new(colmax+1) }
for r in 0 .. rowmax
row, line = grid[r], slab[r]
for c in 0 .. colmax
line[c] = unless letter == row[c]
0
else
if i == 0
1
else
sum = 0
clo = c > 0 ? c - 1 : c
chi = c < colmax ? c + 1 : c
for nr in (r > 0 ? r - 1 : r) .. (r < rowmax ? r + 1 : r)
for nc in clo .. chi
sum += previous_slab[nr][nc] if nr != r || nc != c
end
end
sum
end
end
end
end
end
sum = 0
for r in 0 .. rowmax
for c in 0 .. colmax
sum += slab[r][c]
end
end
sum
end
end
Update 2: I tweaked the code snippet above to remove a variable that I just noticed wasn’t actually doing anything.
readers
I posted my solution to the countPaths problem in Ruby on my blog (http://dreynolds.blogspot.com) and the extreme case takes my implementation about 20 seconds.
David, I’m glad to know that Ruby can do much better than 71 seconds on this problem. (BTW, on my 1.8-GHz Opteron system, your code takes about 18 seconds.) I’ll play around with my version to see if I can figure out where it’s spending all that extra time.
Cheers,
Tom
Hopefully someone can beat 18 seconds because even that is pretty damn slow compared to the Haskell and Python equivalents.
Great blog btw. I’m definitely coming back here for your great articles.