Here’s a Ruby version of a dynamic-programming-based solver for the Google Code Jam “countPaths” problem. It is essentially the same as my earlier Haskell-based solution (see Update 2), but much slower. Whereas the Haskell version solves the maximum-size, all-the-same-letter problem in about 0.9 second, the Ruby version requires about 71 seconds. Maybe somebody who understands Ruby’s internals better than I do can come up with some optimizations.
Here’s the code:
# Tom Moertel <tom@moertel.com>
# 2006-08-16
#
# Ruby-based solution to the Google Code Jam problem "countPaths"
# See http://www.cs.uic.edu/~hnagaraj/articles/code-jam/ for more.
class WordPath
include Enumerable
def initialize(grid, word)
@grid, @rword, @counts = grid, word.reverse, {}
end
def self.count_paths(grid, word)
.solve
new(grid, word)end
def solve
= @rword.length - 1
final_index 0) { |sum, rc| sum + count_from(final_index, *rc) }
inject(end
private
def count_from(i, r, c)
@counts[[r, c, i]] ||= begin
= @rword[i] == @grid[r][c]
match case
when i == 0 && match then 1
when match then subsum_of_neighbors(r, c, i - 1)
else 0
end
end
end
def subsum_of_neighbors(r, c, i)
= 0
sum = @grid[0].size
rowlen for nr in [r - 1, r, r + 1]
next if nr < 0 or nr >= @grid.size
for nc in [c - 1, c, c + 1]
next if nc < 0 || nc >= rowlen
next unless r != nr || c != nc
if count = count_from(i, nr, nc)
+= count
sum end
end
end
sumend
def each
@grid.each_index do |r|
@grid[0].size.times { |c| yield([r, c]) }
end
end
end
# TESTS
if ENV["TEST"] || ENV["BIG_TEST"]
require "test/unit"
class TestWordPath < Test::Unit::TestCase
if ENV["BIG_TEST"]
def test_big_problem
assert_equal \303835410591851117616135618108340196903254429200,
WordPath.count_paths(["A"*50] * 50, "A"*50)
end
end
if ENV["TEST"]
def test_count_paths
= WordPath
w 1,
assert_equal .count_paths(%w{ABC FED GHI}, "ABCDEFGHI")
w2,
assert_equal .count_paths(%w{ABC FED GAI}, "ABCDEA")
w0,
assert_equal .count_paths(%w{ABC DEF GHI}, "ABCD")
w108,
assert_equal .count_paths(%w{AA AA}, "AAAA")
w56448,
assert_equal .count_paths(%w{ABABA BABAB ABABA BABAB ABABA}, "ABABABBA")
w2745564336,
assert_equal .count_paths(%w{AAAAA AAAAA AAAAA AAAAA AAAAA}, "AAAAAAAAAAA")
w0,
assert_equal .count_paths(%w{AB CD}, "AA" )
w1,
assert_equal .count_paths(%w{A}, "A")
wend
end
end
end
Set the BIG_TEST
and/or TEST
environment
variables to run the test suites. For example:
$ <code>TEST=1 ./countpaths.rb</code>
Loaded suite countpaths
Started
.
Finished in 0.02062 seconds.
1 tests, 8 assertions, 0 failures, 0 errors
Unless somebody beats me to it, I’ll whip up a Perl version for comparison.
Update: I managed to speed up my code by a factor of 17. Now the execution time for the maximum-size, all-the-same-letter problem is down to 4.2 seconds, which is comparable with implementations in other languages. Ivan Peev’s Python implementation, for example, is only slightly faster at 2.8 seconds.
A performance killer in the previous version was using a single big hash for my cache. Now I use a 3D array:
[[i,r,c]] # one big hash (slower)
counts[i][r][c] # 3D-array (faster) counts
An additional advantage of the 3D-array is that I can peel off slabs as I descend the outer layers of nested loops. For instance, instead of writing:
for i in 0 .. 10
for j in 0 .. 10
+= counts[i][j]
sum end
end
I can lift the counts[i]
slab out of the inner
loop to eliminate j array-indexing operations:
for i in 0 .. 10
= counts[i]
slab for j in 0 .. 10
+= slab[j]
sum end
end
Here’s the new code (sans the unit tests, which haven’t changed):
class WordPath
= Array
A
def self.count_paths(grid, word)
= word.reverse
rword = grid.size - 1
rowmax = grid.first.size - 1
colmax
for i in 0 .. rword.size - 1
= rword[i]
letter = slab, A.new(rowmax+1) { A.new(colmax+1) }
previous_slab, slab for r in 0 .. rowmax
= grid[r], slab[r]
row, line for c in 0 .. colmax
[c] = unless letter == row[c]
line0
else
if i == 0
1
else
= 0
sum = c > 0 ? c - 1 : c
clo = c < colmax ? c + 1 : c
chi for nr in (r > 0 ? r - 1 : r) .. (r < rowmax ? r + 1 : r)
for nc in clo .. chi
+= previous_slab[nr][nc] if nr != r || nc != c
sum end
end
sumend
end
end
end
end
= 0
sum for r in 0 .. rowmax
for c in 0 .. colmax
+= slab[r][c]
sum end
end
sum
end
end
Update 2: I tweaked the code snippet above to remove a variable that I just noticed wasn’t actually doing anything.