Posted by Tom Moertel
Wed, 07 Feb 2007 17:08:00 GMT
Via eigenclass.org I learned that Ruby 1.9 will sport a new Object method called tap, which is something I’ve been hoping for.
What’s tap? It’s a helper for call chaining. It passes its object into the given block and, after the block finishes, returns the object.
The benefit is that tap always returns the object it’s called on, even if the block returns some other result. Thus you can insert a tap block into the middle of an existing method pipeline without breaking the flow. MenTaLguY has some nifty examples of other things you can do with tap.
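To make the behavior concrete, here is a minimal sketch of tap in the middle of a chain. The variable names (log, result, unchanged) are just for illustration, and it assumes a Ruby where Object#tap is available (1.9 or later).

```ruby
log = []

# Peek at the intermediate value without disturbing the chain.
result = [3, 1, 2].sort
                  .tap { |sorted| log << "sorted: #{sorted.inspect}" }
                  .map { |x| x * 10 }

# The block's own return value is ignored; tap hands back its receiver.
unchanged = 5.tap { |n| n + 1 }

puts result.inspect
puts log.inspect
puts unchanged
```

Removing the tap line leaves result exactly the same, which is what makes it handy for temporary debugging output.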
Fans of Ruby on Rails may recognize tap as similar to RoR’s own returning helper.
Looks like Ruby 1.9 is going to be extra cool for a number of reasons.
Posted in ruby, rails
Tags helpers, rails, ruby, tap
1 comment
no trackbacks

Posted by Tom Moertel
Wed, 01 Nov 2006 22:01:00 GMT
Last night on #haskell, Don Stewart asked if I had seen HsColour for rendering syntax-highlighted Haskell in HTML. He had used it recently, he noted in passing, to add syntax highlighting to planet.haskell.org.
Now, I can’t be certain about this, but I suspect that Don’s question
was cleverly designed to instill in me a subtle case of
syntax-highlighting envy. For on my blog, Haskell code snippets
were rendered in dreadfully boring uncolored text.
But on his blog, the
snippets dance in joyous polychromatic splendor.
Thus I was compelled to add Haskell syntax-highlighting to my blog.
Adding Haskell syntax-highlighting to Typo
My blog runs on the Ruby-on-Rails-powered Typo system, which allows for plug-in text filters. One of the included filters, in fact, is a syntax-highlighting filter for snippets of Ruby, XML, and YAML code. This filter is built upon the Ruby Syntax module, which wasn’t exactly designed for Haskell syntax analysis. So I set out to create a new plug-in filter based upon HsColour.
This task turned out to be easy. All I did was duplicate
Typo’s existing syntax-highlighting filter and swap out its filtering
code for the following:
IO.popen("HsColour -css", "r+") do |f|
  pid = fork { f.write text; f.close; exit! 0 }
  f.close_write
  text = f.read
  Process.waitpid pid
end
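For comparison, newer Rubies ship Open3.capture2 in the standard library, which feeds a string to a child's stdin and collects its stdout without an explicit fork. The following is only a sketch under that assumption. The filter_through helper is a hypothetical name, and the child command is a stand-in Ruby one-liner rather than HsColour, so the example runs anywhere Ruby does.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: pipe +text+ through an external command and
# return whatever the command writes to stdout.
def filter_through(text, *command)
  output, status = Open3.capture2(*command, stdin_data: text)
  raise "filter failed: #{command.join(' ')}" unless status.success?
  output
end

# Stand-in filter: a child Ruby process that upcases its input.
upcased = filter_through("main = print 42\n",
                         RbConfig.ruby, "-e", "print STDIN.read.upcase")
puts upcased
```

Swapping the stand-in command for "HsColour", "-css" would give the same shape as the filter code above, provided HsColour is on the PATH.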
I also tweaked the post-processing regular expressions so that they
would whittle away the HTML filler before and after the
syntax-highlighted output of HsColour:
text.gsub!(/.*<p()re>/m, ...)
text.gsub!(/<\/pre>.*/m, ...)
A few more tweaks and I was done.
Now I can wrap my Haskell code in <typo:haskell> tags and it, too, will
dance in joyous polychromatic splendor:
constructTable tspecs = do
    ecolspecs <- during "argument evaluation" $ do
        toNvps . concat =<< mapM splice tspecs
    let names = map fst ecolspecs
    let evecs = map snd ecolspecs
    vecs <- argof nm $ mapM evalVector evecs
    let vlens = map vlen vecs
    if length (group vlens) == 1
        then return . VTable $ mkTable (zip names vecs)
        else throwError $
             "table columns must be non-empty vectors of equal length"
  where
    nm = "table(...) constructor"
    splice (TCol envp) = return [envp]
    splice (TSplice e) = do
        val <- eval e
        case val of
            VTable t ->
                return $ zipWith mkNVP (tcnames t) (elems (tvecs t))
            VList gl ->
                liftM (zipWith mkNVP (map name . elems $ glnames gl)) $
                      mapM asVectorNull (elems $ glvals gl)
            _ -> throwError $
                 "can't construct table columns from (" ++
                 show val ++ ")"
    mkNVP n vec = NVP n (mkNoPosExpr . EVal $ VVector vec)
    name "" = "NA"
    name n = n
If you want the filter code, here it is: haskell_controller.rb. Just drop it into components/plugins/textfilters and restart Typo. The corresponding CSS styles can be found in my user-styles.css.
Posted in haskell, ruby, typo
Tags haskell, hscolour, ruby, typo
no comments
no trackbacks

Posted by Tom Moertel
Thu, 19 Oct 2006 01:40:00 GMT
Even skilled programmers have a hard time keeping their web
applications free of XSS and SQL-injection vulnerabilities. And it
shows: a sobering portion of web sites are open to some scary security threats.
Why are so many sites vulnerable to these well-known holes? Probably
because it’s insanely hard for programmers to solve the fundamental
“strings problem” at the heart of these vulnerabilities. The problem
itself is easy to understand, but we humans aren’t equipped to carry
out the solution. Simply put, we just plain suck at keeping a
bazillion different strings straight in our heads, let alone
consistently and reliably rendering their interactions safe whenever they
cross paths in a modern web application. It’s easy to say, “just
escape the little buggers,” but it’s hard to get it right, every single time.
Computers, on the other hand, are pretty good at keeping track of
details by the bucket-full. Wouldn’t it be nice, then,
if our programming languages gave us the power to delegate this nasty “strings
problem” to our computers, which could then devote their unwavering mechanical precision to grinding the problem out of existence? Isn’t that the kind of thing modern programming languages are supposed to be good at?
I’d like to think the answer to that question is a big, you betcha.
So let’s grab a modern programming language and solve the strings problem.
Let’s solve the strings problem in Haskell
In this article, we will look at one way (among many) to solve the strings
problem: by adding Ruby-style string templates to Haskell. These
templates support “interpolation” via the usual, convenient #{var}
syntax, but here interpolation is type safe. Haskell’s type system
will prevent us from inadvertently mixing incompatible string types,
and it will detect mistakes at compile time, before they can become
live XSS or SQL-injection holes. Further, our solution will offer
us these benefits without making us jump through hoops or pay some
onerous syntax penalty.
To be more specific, the system offers the following benefits:
- It provides a string-management kernel that lets you create “safe strings” by certifying a regular string as representing either text or a fragment of a known language.
- It allows you to conveniently define new language types for any string-based language that you can provide an escaping rule for (e.g., XML, URLs, SQL, untrusted user input).
- It provides compile-time syntactic sugar (via Template Haskell) that makes working with safe strings as convenient as working with string interpolation in languages like Ruby and Perl.
- It catches and reports (at compile time) the following commonly made programming errors:
- failing to escape a plain-old-text string before mixing it into a string that represents a language fragment
- mixing strings that represent fragments of incompatible languages
- mixing strings that represent fragments of compatible languages in an ambiguous way (the system will force you to disambiguate)
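To give a feel for the idea before the Haskell version, here is a rough runtime analogue in Ruby. It is only a sketch: the HtmlFragment class and its methods are hypothetical names invented for this illustration, and the check happens at run time rather than at compile time, which is exactly the gap the Haskell solution closes.

```ruby
# Hypothetical "safe string" wrapper; names are illustrative only.
class HtmlFragment
  ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
              '"' => "&quot;", "'" => "&#39;" }.freeze

  attr_reader :markup

  # Certify a string that is already known to be well-formed markup.
  def initialize(markup)
    @markup = markup.dup.freeze
  end

  # Certify plain text by escaping it into the HTML language.
  def self.from_text(text)
    new(text.gsub(/[&<>"']/, ESCAPES))
  end

  # Fragments combine only with other fragments; a bare String is rejected.
  def +(other)
    unless other.is_a?(HtmlFragment)
      raise TypeError, "plain text must be escaped before mixing with HTML"
    end
    HtmlFragment.new(markup + other.markup)
  end
end

user_input = "<script>alert(1)</script>"
page = HtmlFragment.new("<p>") +
       HtmlFragment.from_text(user_input) +
       HtmlFragment.new("</p>")
puts page.markup
```

Forgetting the from_text call raises a TypeError here, but only when that line actually runs. With the type-checked approach, the same mistake is reported before the program is ever deployed.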
(This is a long one, so grab an espresso, lean back, and read on in
style. Also, if you have a smoking jacket, you might want to get it now.)
Read more...
Posted in programming, programming languages, haskell, ruby, web development, testing, rails
Tags haskell, ruby, strings, testing, types
37 comments
no trackbacks

Posted by Tom Moertel
Thu, 24 Aug 2006 19:41:00 GMT
In an earlier post I wrote about stability
problems that have plagued my blog since upgrading from Typo 4.0.0 to 4.0.3. I have finally traced the problem to its source, and here’s the deal:
If you’re serving Typo up via Mongrel, do not configure ActiveRecord to allow concurrency.
One of the changes between Typo 4.0.0 and 4.0.3 is this
addition to the environment.rb file:
config.active_record.allow_concurrency = true
My first fix was to comment out this line and restart Typo. The better fix is to apply Changeset 1255. (See Update 2, below.)
Discussion
When ActiveRecord::Base.allow_concurrency is set to
true, AR will give each thread its own database
connections and cache them in thread-localized storage. The idea is
that, in a multi-threaded environment, this simple policy prevents
unsafe interactions between threads and the database. (Imagine what
would happen if one thread “borrowed” a connection over which
another thread had opened a transaction. Oops, there goes
transactional isolation.)
This policy, however, does place a burden on the owner of the threads to
make sure that each thread’s local connection cache is cleared when
the thread is joined, a burden that is not, it would seem, being
carried by Typo under Mongrel. As a result, Typo rapidly chews
through the allotment of file descriptors that the operating system
kindly had reserved for Mongrel:

(On my Linux server, the Mongrel process gets an allotment of 1024
file descriptors.)
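The leak pattern described above can be sketched in a few lines of plain Ruby. This is not ActiveRecord's code. PerThreadConnections is a hypothetical stand-in whose "connections" are ordinary objects, but it shows how a per-thread cache grows without bound when nobody clears a thread's entry once the thread is finished.

```ruby
# Hypothetical stand-in for a per-thread connection cache.
class PerThreadConnections
  def initialize
    @by_thread = {}
    @lock = Mutex.new
  end

  # Each thread lazily gets its own "connection" (here just a fresh Object).
  def connection
    @lock.synchronize { @by_thread[Thread.current] ||= Object.new }
  end

  # What the owner of the threads is expected to call when a thread is done.
  def release_current
    @lock.synchronize { @by_thread.delete(Thread.current) }
  end

  def open_count
    @lock.synchronize { @by_thread.size }
  end
end

# Ten short-lived threads that never clean up: ten entries stay behind.
leaky = PerThreadConnections.new
10.times { Thread.new { leaky.connection }.join }

# The same workload, with each thread releasing its entry before it exits.
tidy = PerThreadConnections.new
10.times { Thread.new { tidy.connection; tidy.release_current }.join }

puts "leaky: #{leaky.open_count}, tidy: #{tidy.open_count}"
```

With real database connections, each leftover entry would hold an open file descriptor, which is how a steady stream of request threads can exhaust the process's allotment.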
Lucky for us, this each-thread-gets-its-own-connections policy is unnecessary under
Mongrel because Mongrel, while being multi-threaded itself, serializes
all access to the Rails-based applications it serves up:
Q: Is [Mongrel] multi-threaded or can it handle concurrent requests?
Mongrel is uses a pool of thread workers to do it’s processing. This means that it is able to handle concurrent access and should be thread safe. This also means that you have to be more careful about how you use Mongrel. You can’t just write your application assuming that there are no threads involved. ...
Ruby on Rails is not thread safe so there is a synchronized block around the calls to Dispatcher.dispatch. This means that everything is threaded right before and right after Rails runs. While Rails is running there is only one controller in operation at a time.
(Source: Mongrel FAQ list)
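The arrangement the FAQ describes, many worker threads with a single synchronized block around the dispatch call, can be sketched as follows. This is an illustration only, not Mongrel's source: the guard mutex and the counters are hypothetical names, and the sleep stands in for the time spent inside Rails.

```ruby
guard = Mutex.new   # plays the role of the lock around the dispatch call
inside = 0          # how many threads are currently "in Rails"
max_inside = 0      # the most that were ever in there at once

workers = Array.new(8) do
  Thread.new do
    # Work before this point may run concurrently across threads.
    guard.synchronize do
      inside += 1
      max_inside = [max_inside, inside].max
      sleep 0.001   # stand-in for the request being handled
      inside -= 1
    end
    # Work after this point may run concurrently again.
  end
end
workers.each(&:join)

puts "most threads inside the guarded block at once: #{max_inside}"
```

Because only one thread at a time is ever inside the guarded block, a single shared database connection is never used by two requests at once.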
Thus we can safely turn off (i.e., comment out in Typo’s environment.rb file) ActiveRecord’s allow-concurrency option without having to worry about nasty concurrency or performance issues:
# the following line is commented out
# config.active_record.allow_concurrency = true
For more on this subject, see Rails ticket #2162 and Rails ticket #2742.
Now, here’s my question: Are there any environments in which
Typo can run with the allow-concurrency option enabled and not
leak database connections? Inquiring minds want to know.
Update: Upon further investigation, turning off
concurrency might not be altogether without risk. Some of the Typo
code that handles potentially long tasks, such as making trackbacks
and pings, spawns new threads in which to carry out its work. I’m
looking further into this risk. Updates to come.
Update 2: Piers Cawley added Changeset 1255, which turns AR’s allow-concurrency flag back off and revises the ping code so that it does not attempt concurrent database access. Apply the patch version of 1255 and restart Typo to get the fix. A tip of the hat to Piers for making the quick fix when he was supposed to be on holiday.
Posted in ruby, typo, rails
Tags activerecord, concurrency, rails, sqlite3, typo
4 comments
no trackbacks

Posted by Tom Moertel
Thu, 24 Aug 2006 04:41:00 GMT
Since I upgraded my blog from Typo 4.0.0 to
4.0.3, it has been somewhat unstable. About once a day it starts
responding with “500 Internal Server Error” and stays that way until I
restart it.
The root of the problem seems to be the database
connection, as evidenced by this exception showing up in the
production log:
SQLite3::CantOpenException (could not open database)
Unfortunately, the exception doesn’t provide anything specific
to go on.
A quick look at the sqlite3-ruby code suggested that I was not going to get the specifics, either. The Ruby-based wrapper never calls sqlite3_errmsg after a call to sqlite3_open fails on behalf of SQLite3::Database.new.
A quick patch, however, fixed the problem:
--- sqlite3-ruby-1.1.0.orig/lib/sqlite3/database.rb
+++ sqlite3-ruby-1.1.0/lib/sqlite3/database.rb
@@ -109,7 +109,7 @@
       @statement_factory = options[:statement_factory] || Statement
       result, @handle = @driver.open( file_name, utf16 )
-      Error.check( result, nil, "could not open database" )
+      Error.check( result, self, "could not open database" )
       @closed = false
       @results_as_hash = options.fetch(:results_as_hash,false)
(Submitted as Ticket 5504 on RubyForge.)
Before applying the patch, opening a database at a nonexistent path results in
a generic error message:
$ ruby -r rubygems -e 'require_gem "sqlite3-ruby";
SQLite3::Database.new("/no/such/path/db")'
... could not open database (SQLite3::CantOpenException) ...
After applying the patch, we get additional error information:
... could not open database: unable to open database file
(SQLite3::CantOpenException) ...
With the patch in place, all I have to do is wait for Typo to start
acting up again. Then I’ll have some interesting information in the
log.
Until then, I’m relying on cron
and a short monitoring script to restart Typo when it tips into
foolishness:
#!/bin/bash
url=http://blog.moertel.com/admin
addrs=tom@moertel.com
response=$(GET -sd $url 2>&1)
if [ "$response" != "200 OK" ]; then
    { echo "Response was: $response"; echo; service typo restart; } |
    mail -s "Blog site not responding! (Restarting)" $addrs
fi
We’ll see how it goes.
Update: That was fast. The error popped up
again and this time the log told me something useful: “unable to open
database file.” Now, why couldn’t Typo open the database file,
especially since the file is perfectly fine and had been opened
successfully (many times) by the very same Typo process earlier? Here’s
a hint:
$ ls /proc/28788/fd | wc -l
1023
Seems like there’s a resource leak in Typo 4.0.3 (or Rails 1.1.6).
Under some conditions, instead of reusing existing database
connections, Typo keeps trying to open new ones. Eventually, it uses
up its allotment of file descriptors and the operating system is forced
to say, “That’s enough, pal,” (EMFILE).
I’ll look into it more in the morning.
Update 2: Problem solved.
Posted in ruby, typo, rails, sysadmin
Tags rails, sqlite3, typo
1 comment
no trackbacks

Posted by Tom Moertel
Wed, 16 Aug 2006 22:54:00 GMT
Here’s a Ruby version of a dynamic-programming-based solver
for the Google Code Jam “countPaths” problem. It is essentially
the same as my earlier Haskell-based solution (see Update 2), but much slower. Whereas the Haskell version solves the maximum-size, all-the-same-letter problem in about 0.9 second, the Ruby version requires about 71 seconds. Maybe somebody who understands Ruby’s internals better than I do can come up with some optimizations.
Here’s the code:
# Tom Moertel <tom@moertel.com>
# 2006-08-16
#
# Ruby-based solution to the Google Code Jam problem "countPaths"
# See http://www.cs.uic.edu/~hnagaraj/articles/code-jam/ for more.
class WordPath
  include Enumerable
  def initialize(grid, word)
    @grid, @rword, @counts = grid, word.reverse, {}
  end
  def self.count_paths(grid, word)
    new(grid, word).solve
  end
  def solve
    final_index = @rword.length - 1
    inject(0) { |sum, rc| sum + count_from(final_index, *rc) }
  end
  private
  def count_from(i, r, c)
    @counts[[r, c, i]] ||= begin
      match = @rword[i] == @grid[r][c]
      case
      when i == 0 && match then 1
      when match then subsum_of_neighbors(r, c, i - 1)
      else 0
      end
    end
  end
  def subsum_of_neighbors(r, c, i)
    sum = 0
    rowlen = @grid[0].size
    for nr in [r - 1, r, r + 1]
      next if nr < 0 or nr >= @grid.size
      for nc in [c - 1, c, c + 1]
        next if nc < 0 || nc >= rowlen
        next unless r != nr || c != nc
        if count = count_from(i, nr, nc)
          sum += count
        end
      end
    end
    sum
  end
  def each
    @grid.each_index do |r|
      @grid[0].size.times { |c| yield([r, c]) }
    end
  end
end
# TESTS
if ENV["TEST"] || ENV["BIG_TEST"]
  require "test/unit"
  class TestWordPath < Test::Unit::TestCase
    if ENV["BIG_TEST"]
      def test_big_problem
        assert_equal \
          303835410591851117616135618108340196903254429200,
          WordPath.count_paths(["A"*50] * 50, "A"*50)
      end
    end
    if ENV["TEST"]
      def test_count_paths
        w = WordPath
        assert_equal 1,
          w.count_paths(%w{ABC FED GHI}, "ABCDEFGHI")
        assert_equal 2,
          w.count_paths(%w{ABC FED GAI}, "ABCDEA")
        assert_equal 0,
          w.count_paths(%w{ABC DEF GHI}, "ABCD")
        assert_equal 108,
          w.count_paths(%w{AA AA}, "AAAA")
        assert_equal 56448,
          w.count_paths(%w{ABABA BABAB ABABA BABAB ABABA}, "ABABABBA")
        assert_equal 2745564336,
          w.count_paths(%w{AAAAA AAAAA AAAAA AAAAA AAAAA}, "AAAAAAAAAAA")
        assert_equal 0,
          w.count_paths(%w{AB CD}, "AA" )
        assert_equal 1,
          w.count_paths(%w{A}, "A")
      end
    end
  end
end
Set the BIG_TEST and/or TEST environment
variables to run the test suites. For example:
$ TEST=1 ./countpaths.rb
Loaded suite countpaths
Started
.
Finished in 0.02062 seconds.
1 tests, 8 assertions, 0 failures, 0 errors
Unless somebody beats me to it,
I’ll whip up a Perl version for comparison.
Update: I managed to speed up my code by a
factor of 17. Now the execution time for the maximum-size,
all-the-same-letter problem is down to 4.2 seconds,
which is comparable with implementations in other
languages.
Ivan Peev’s Python implementation, for example, is only slightly faster
at 2.8 seconds.
A performance killer in the previous version was using
a single big hash for my cache. Now I use a 3D array:
counts[[i,r,c]] # one big hash (slower)
counts[i][r][c] # 3D-array (faster)
An additional advantage of the 3D-array is that I can peel off slabs
as I descend the outer layers of nested loops. For instance,
instead of writing:
for i in 0 .. 10
  for j in 0 .. 10
    sum += counts[i][j]
  end
end
I can lift the counts[i] slab out of the inner
loop to eliminate j array-indexing operations:
for i in 0 .. 10
  slab = counts[i]
  for j in 0 .. 10
    sum += slab[j]
  end
end
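As a quick sanity check that lifting the slab changes only the amount of indexing and not the answer, here is a small self-contained sketch. The table contents (i * j) and the names naive and lifted are made up for the illustration.

```ruby
# Build an 11x11 table; the block form gives every row its own Array.
counts = Array.new(11) { |i| Array.new(11) { |j| i * j } }

# Index through both dimensions on every inner iteration.
naive = 0
for i in 0 .. 10
  for j in 0 .. 10
    naive += counts[i][j]
  end
end

# Look up the row once per outer iteration, then index only within it.
lifted = 0
for i in 0 .. 10
  slab = counts[i]
  for j in 0 .. 10
    lifted += slab[j]
  end
end

puts "naive=#{naive} lifted=#{lifted}"
```

Both loops visit the same cells in the same order, so any speed difference comes purely from skipping the repeated outer lookup.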
Here’s the new code (sans the unit tests, which haven’t changed):
class WordPath
  A = Array
  def self.count_paths(grid, word)
    rword = word.reverse
    rowmax = grid.size - 1
    colmax = grid.first.size - 1
    for i in 0 .. rword.size - 1
      letter = rword[i]
      previous_slab, slab = slab, A.new(rowmax+1) { A.new(colmax+1) }
      for r in 0 .. rowmax
        row, line = grid[r], slab[r]
        for c in 0 .. colmax
          line[c] = unless letter == row[c]
            0
          else
            if i == 0
              1
            else
              sum = 0
              clo = c > 0 ? c - 1 : c
              chi = c < colmax ? c + 1 : c
              for nr in (r > 0 ? r - 1 : r) .. (r < rowmax ? r + 1 : r)
                for nc in clo .. chi
                  sum += previous_slab[nr][nc] if nr != r || nc != c
                end
              end
              sum
            end
          end
        end
      end
    end
    sum = 0
    for r in 0 .. rowmax
      for c in 0 .. colmax
        sum += slab[r][c]
      end
    end
    sum
  end
end
Update 2: I tweaked the code snippet above to remove a variable
that I just noticed wasn’t actually doing anything.
Posted in ruby, fun stuff
Tags code, countpaths, google, jam, ruby
3 comments
no trackbacks

Posted by Tom Moertel
Fri, 07 Apr 2006 15:55:00 GMT
One of the things I miss when coding in Ruby is
inexpensive function composition. In Haskell, for example,
I can compose functions using the dot (.) operator:
inc = (+1)
twice = (*2)
twiceOfInc = twice . inc
Because of Ruby’s open classes, however, I can easily
add the feature to the language. In the code below, I introduce Proc.compose and overload the star (*) operator for the purpose:
# func_composition.rb
class Proc
  def self.compose(f, g)
    lambda { |*args| f[g[*args]] }
  end
  def *(g)
    Proc.compose(self, g)
  end
end
And that’s all it takes:
$ irb --simple-prompt -r func_composition.rb
>> inc = lambda { |x| x + 1 }
=> #<Proc:0x00002aaaaaad7810@(irb):1>
>> twice = lambda { |x| x * 2 }
=> #<Proc:0x00002aaaaabd2d18@(irb):2>
>> inc[1]
=> 2
>> twice[2]
=> 4
>> twice_of_inc = twice * inc
=> #<Proc:0x00002aaaaab32458@./func_composition.rb:3>
>> twice_of_inc[1]
=> 4
>> twice_of_inc[2]
=> 6
Now, isn’t that refreshing?
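For readers on a much newer Ruby: since version 2.6 the core Proc class has built-in composition operators, << and >>, so no monkey-patching is needed there. A minimal sketch, assuming Ruby 2.6 or later and reusing the inc and twice lambdas from the session above:

```ruby
inc   = lambda { |x| x + 1 }
twice = lambda { |x| x * 2 }

# f << g calls g first, then f, like Haskell's (f . g).
twice_of_inc = twice << inc

# f >> g calls f first, then g (left-to-right pipeline order).
inc_of_twice = twice >> inc

puts twice_of_inc[1]   # twice(inc(1))
puts inc_of_twice[1]   # inc(twice(1))
```

The << form matches the behavior of the star operator defined above, while >> is the same composition read in the other direction.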
Posted in functional programming, ruby
Tags fp, functional_programming, ruby
2 comments
no trackbacks

Posted by Tom Moertel
Thu, 01 Dec 2005 03:09:00 GMT
From why the lucky stiff comes Try Ruby, an interactive online Ruby tutorial that connects your web browser to a live Ruby (irb) session. As the tutorial leads you to Rubyriffic delights, you follow along via the live command line – complete with a history and support for editing keys.
It’s slick. And because it’s an honest-to-goodness interactive Ruby session, you don’t need to stick to the script. If you want to play with continuations, for example, go for it:
>> i, c = 0; puts callcc { |c| c[] }; i += 1
=> 1
>> c["hello"]
hello
=> 2
>> c["world"]
world
=> 3
>>
Do check it out: Try Ruby.
Posted in ruby
no comments
no trackbacks

Posted by Tom Moertel
Tue, 30 Aug 2005 18:56:00 GMT
I came across Tim Bray’s thoughts on
Ruby via
the ever-delightful Lambda the Ultimate and found the following bit fascinating:
I’ve had access to languages with closures and continuations and
suchlike constructs for years and years, and I’ve never ever written
one. While I’m impressed by how natural this stuff is in Ruby, I’m
still unconvinced that these are a necessary part of the professional
programmer’s arsenal. [Emphasis mine.]
While Tim Bray may be unconvinced, I am a true believer.
Read more...
Posted in programming, functional programming, programming languages, ruby
Tags closures, programming, ruby
12 comments
1 trackback

Posted by Tom Moertel
Mon, 22 Aug 2005 16:00:00 GMT
Many people don’t realize that changing the target of a symbolic link (symlink) is
not an atomic operation. “Changing” a symlink really means deleting it
and creating a new link with the same file name. For example, if I have a
symlink current that points to a directory old, and I want to change
it to point to a directory new, I might use the following command:
$ ln -snf new current
Strace shows what really happens when I run the command:
$ strace ln -snf new current 2>&1 | grep link
unlink("current") = 0
symlink("new", "current") = 0
First, the existing symlink is deleted via the unlink system
call. Then a new, identically named symlink is created via the symlink
system call. It’s a two-step process, and in between the steps, there
is no symlink.
This can be a problem if you expect the symlink to be there always,
such as when using the link to point to the active version of a live
web site. If you change the symlink while deploying a new version of
your site, for example, the web server might try to dereference the
link during the small window of time when it doesn’t exist. Oops.
The solution to this problem is to effect the change by creating a new
symlink and then renaming it over the old symlink. On Unix-like
systems, renaming is an atomic operation, and thus the symlink
“change” will be atomic too. By hand, the process looks like this:
$ ln -s new current_tmp && mv -Tf current_tmp current
In Ruby, I make atomic symlinking available everywhere by extending
the Pathname class with a new method atomic_symlink:
require 'pathname'
class Pathname
  def atomic_symlink(old)
    suffix = [Array.new(6){rand(256).chr}.join].pack("m").strip.tr('/','_')
    tmplink = Pathname.new(self.to_s + "_" + suffix)
    tmplink.make_symlink(old)
    begin
      tmplink.rename(self)
    rescue
      File.unlink(tmplink.to_s)
      raise
    end
  end
end
This code is nothing more than a robustified version of the by-hand
method. It picks better names for temporary links, and it cleans up
after itself, should something go wrong, but otherwise it does the
same thing.
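To try the technique without touching a live site, here is a self-contained sketch that runs entirely inside a throwaway directory. It assumes a Unix-like system with symlink support. The swap_symlink helper is a hypothetical stand-alone variant that uses File.symlink and File.rename directly, with SecureRandom supplying the temporary suffix, rather than the Pathname extension above.

```ruby
require "tmpdir"
require "securerandom"

# Hypothetical helper: repoint +link+ at +target+ by creating a uniquely
# named temporary symlink and renaming it over the existing one.
def swap_symlink(target, link)
  tmp = "#{link}_#{SecureRandom.hex(6)}"
  File.symlink(target, tmp)
  begin
    File.rename(tmp, link)   # rename(2) replaces the old link in one step
  rescue SystemCallError
    File.unlink(tmp)         # don't leave the temporary link behind
    raise
  end
end

before, after = Dir.mktmpdir do |dir|
  Dir.mkdir(File.join(dir, "old"))
  Dir.mkdir(File.join(dir, "new"))
  current = File.join(dir, "current")

  File.symlink("old", current)
  was = File.readlink(current)
  swap_symlink("new", current)
  [was, File.readlink(current)]
end

puts "#{before} -> #{after}"
```

At no point between the two readlink calls is the current link missing: it points at old right up until the rename, and at new immediately afterward.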
Given how easy it is to change symlinks atomically, why do it any
other way? Life is hard enough without having to worry about another
race condition.
Posted in programming, ruby, web development
Tags ruby, safe, symlink
5 comments
no trackbacks
