Ruby 1.9 gets handy new method Object#tap

Posted by Tom Moertel Wed, 07 Feb 2007 17:08:00 GMT

Via eigenclass.org I learned that Ruby 1.9 will sport a new Object method called tap, which is something I’ve been hoping for.

What’s tap? It’s a helper for call chaining. It passes its object into the given block and, after the block finishes, returns the object:

an_object.tap do |o|
  # do stuff with an_object, which is in o
end # ===> an_object

The benefit is that tap always returns the object it’s called on, even if the block returns some other result. Thus you can insert a tap block into the middle of an existing method pipleline without breaking the flow. MenTaLguY has some nifty examples of other things you can do with tap.

Fans of Ruby on Rails may recognize tap as similar to RoR’s own returning helper.

Looks like Ruby 1.9 is going to be extra cool for a number of reasons.

Posted in ,
Tags , , ,
2 comments
no trackbacks
Reddit Delicious

Adding Haskell syntax highlighting to the Typo blogging system

Posted by Tom Moertel Wed, 01 Nov 2006 22:01:00 GMT

Last night on #haskell, Don Stewart asked if I had seen HsColour for rendering syntax-highlighted Haskell in HTML. He had used it recently, he noted in passing, to add syntax highlighting to planet.haskell.org.

Now, I can’t be certain about this, but I suspect that Don’s question was cleverly designed to instill in me a subtle case of syntax-highlighting envy. For on my blog, Haskell code snippets were rendered in dreadfully boring uncolored text. But on his blog, the snippets dance in joyous polychromatic splendor.

Thus I was compelled to add Haskell syntax-highlighting to my blog.

Adding Haskell syntax-highlighting to Typo

My blog runs on the Ruby-on-Rails-powered Typo system, which allows for plug-in text filters. One of the included filters, in fact, is a syntax-highlighting filter for snippets of Ruby, XML, and YAML code. This filter is built upon the Ruby Syntax module, which wasn’t exactly designed for Haskell syntax analysis. So I set out to create a new plug-in filter based upon HsColour.

This task turned out to be easy. All I did was duplicate Typo’s existing syntax-highlighting filter and swap out its filtering code for the following:

IO.popen("HsColour -css", "r+") do |f|
  pid = fork { f.write text; f.close; exit! 0 }
  f.close_write
  text = f.read
  Process.waitpid pid
end

I also tweaked the post-processing regular expressions so that they would whittle away the HTML filler before and after the syntax-highlighted output of HsColour:

text.gsub!(/.*<p()re>/m, ...)
text.gsub!(/<\/pre>.*/m, ...)

A few more tweaks and I was done.

Now I can wrap my Haskell code in <typo:haskell> tags and it, too, will dance in joyous polychromatic splendor:

constructTable tspecs = do
    ecolspecs <- during "argument evaluation" $ do
        toNvps . concat =<< mapM splice tspecs
    let names = map fst ecolspecs
    let evecs = map snd ecolspecs
    vecs <- argof nm $ mapM evalVector evecs
    let vlens = map vlen vecs
    if length (group vlens) == 1
        then return . VTable $ mkTable (zip names vecs)
        else throwError $
             "table columns must be non-empty vectors of equal length"
  where
    nm = "table(...) constructor"
    splice (TCol envp)  = return [envp]
    splice (TSplice e)  = do
        val <- eval e
        case val of
            VTable t ->
                return $ zipWith mkNVP (tcnames t) (elems (tvecs t))
            VList gl ->
                liftM (zipWith mkNVP (map name . elems $ glnames gl)) $
                mapM asVectorNull (elems $ glvals gl)
            _ -> throwError $
                "can't construct table columns from (" ++
                show val ++ ")"
    mkNVP n vec = NVP n (mkNoPosExpr . EVal $ VVector vec)
    name ""     = "NA"
    name n      = n

If you want the filter code, here it is: haskell_controller.rb. Just drop it into components/plugins/textfilters and restart Typo. The corresponding CSS styles can be found in my user-styles.css.

Posted in , ,
Tags , , ,
no comments
no trackbacks
Reddit Delicious

A type-based solution to the "strings problem": a fitting end to XSS and SQL-injection holes?

Posted by Tom Moertel Thu, 19 Oct 2006 01:40:00 GMT

Even skilled programmers have a hard time keeping their web applications free of XSS and SQL-injection vulnerabilities. And it shows: a sobering portion of web sites are open to some scary security threats.

Why are so many sites vulnerable to these well-known holes? Probably because it’s insanely hard for programmers to solve the fundamental “strings problem” at the heart of these vulnerabilities. The problem itself is easy to understand, but we humans aren’t equipped to carry out the solution. Simply put, we just plain suck at keeping a bazillion different strings straight in our heads, let alone consistently and reliably rendering their interactions safe whenever they cross paths in a modern web application. It’s easy to say, “just escape the darn things,” but it’s hard to get it right, every single time.

Computers, on the other hand, are pretty good at keeping track of details by the bucket-full. Wouldn’t it be nice, then, if our programming languages gave us the power to delegate this nasty “strings problem” to our computers, which could then devote their unwavering mechanical precision to grinding the problem out of existence? Isn’t that the kind of thing modern programming languages are supposed to be good at?

I’d like to think the answer to that question is a big, you betcha.

So let’s grab a modern programming language and solve the strings problem.

Let’s solve the strings problem in Haskell

In this article, we will look at one way (among many) to solve the strings problem: by adding Ruby-style string templates to Haskell. These templates support “interpolation” via the usual, convenient #{var} syntax, but here interpolation is type safe. Haskell’s type system will prevent us from inadvertently mixing incompatible string types, and it will detect mistakes at compile time, before they can become live XSS or SQL-injection holes. Further, our solution will offer us these benefits without making us jump through hoops or pay some onerous syntax penalty.

To be more specific, the system offers the following benefits:

  • It provides a string-management kernel that lets you create “safe strings” by certifying a regular string as representing either text or a fragment of a known language.
  • It allows you to conveniently define new language types for any string-based language that you can provide an escaping rule for (e.g., XML, URLs, SQL, untrusted user input).
  • It provides compile-time syntactic sugar (via Template Haskell) that makes working with safe strings as convenient as working with string interpolation in languages like Ruby and Perl.
  • It catches and reports (at compile time) the following commonly made programming errors:
    • failing to escape a plain-old-text string before mixing it into a string that represents a language fragment
    • mixing strings that represent fragments of incompatible languages
    • mixing strings that represent fragments of compatible languages in an ambiguous way (the system will force you to disambiguate)

(This is a long one, so grab an espresso, lean back, and read on in style. Also, if you have a smoking jacket, you might want to get it now.)

Read more...

Posted in , , , , , ,
Tags , , , ,
42 comments
no trackbacks
Reddit Delicious

Solving the Google Code Jam "countPaths" problem in Ruby

Posted by Tom Moertel Wed, 16 Aug 2006 22:54:00 GMT

Here’s a Ruby version of a dynamic-programming-based solver for the Google Code Jam “countPaths” problem. It is essentially the same as my earlier Haskell-based solution (see Update 2), but much slower. Whereas the Haskell version solves the maximum-size, all-the-same-letter problem in about 0.9 second, the Ruby version requires about 71 seconds. Maybe somebody who understands Ruby’s internals better than I do can come up with some optimizations.

Here’s the code:

# Tom Moertel <tom@moertel.com>
# 2006-08-16
#
# Ruby-based solution to the Google Code Jam problem "countPaths" 
# See http://www.cs.uic.edu/~hnagaraj/articles/code-jam/ for more.

class WordPath

  include Enumerable

  def initialize(grid, word)
    @grid, @rword, @counts = grid, word.reverse, {}
  end

  def self.count_paths(grid, word)
    new(grid, word).solve
  end

  def solve
    final_index = @rword.length - 1
    inject(0) { |sum, rc| sum + count_from(final_index, *rc) }
  end

  private

  def count_from(i, r, c)
    @counts[[r, c, i]] ||= begin
      match = @rword[i] == @grid[r][c]
      case
        when i == 0 && match then 1
        when match then subsum_of_neighbors(r, c, i - 1)
        else 0
      end
    end
  end

  def subsum_of_neighbors(r, c, i)
    sum = 0
    rowlen = @grid[0].size
    for nr in [r - 1, r, r + 1]
      next if nr < 0 or nr >= @grid.size
      for nc in [c - 1, c, c + 1]
        next if nc < 0 || nc >= rowlen
        next unless r != nr || c != nc
        if count = count_from(i, nr, nc)
          sum += count
        end
      end
    end
    sum
  end

  def each
    @grid.each_index do |r|
      @grid[0].size.times { |c| yield([r, c]) }
    end
  end

end

# TESTS

if ENV["TEST"] || ENV["BIG_TEST"]

  require "test/unit" 

  class TestWordPath < Test::Unit::TestCase

    if ENV["BIG_TEST"]
      def test_big_problem
        assert_equal \
          303835410591851117616135618108340196903254429200,
          WordPath.count_paths(["A"*50] * 50, "A"*50)
      end
    end

    if ENV["TEST"]
      def test_count_paths
        w = WordPath
        assert_equal 1,
          w.count_paths(%w{ABC FED GHI}, "ABCDEFGHI")
        assert_equal 2,
          w.count_paths(%w{ABC FED GAI}, "ABCDEA")
        assert_equal 0,
          w.count_paths(%w{ABC DEF GHI}, "ABCD")
        assert_equal 108,
          w.count_paths(%w{AA AA}, "AAAA")
        assert_equal 56448,
          w.count_paths(%w{ABABA BABAB ABABA BABAB ABABA}, "ABABABBA")
        assert_equal 2745564336,
          w.count_paths(%w{AAAAA AAAAA AAAAA AAAAA AAAAA}, "AAAAAAAAAAA")
        assert_equal 0,
          w.count_paths(%w{AB CD}, "AA" )
        assert_equal 1,
          w.count_paths(%w{A}, "A")
      end
    end

  end

end

Set the BIG_TEST and/or TEST environment variables to run the test suites. For example:

$ TEST=1 ./countpaths.rb
Loaded suite countpaths
Started
.
Finished in 0.02062 seconds.

1 tests, 8 assertions, 0 failures, 0 errors

Unless somebody beats me to it, I’ll whip up a Perl version for comparison.

Update: I managed to speed up my code by a factor of 17. Now the execution time for the maximum-size, all-the-same-letter problem is down to 4.2 seconds, which is comparable with implementations in other languages. Ivan Peev’s Python implementation, for example, is only slightly faster at 2.8 seconds.

A performance killer in the previous version was using a single big hash for my cache. Now I use a 3D array:


counts[[i,r,c]]   # one big hash (slower)
counts[i][r][c]   # 3D-array (faster)

An additional advantage of the 3D-array is that I can peel off slabs as I descend the outer layers of nested loops. For instance, instead of writing:

for i in 0 .. 10
  for j in 0 .. 10
    sum += counts[i][j]
  end
end

I can lift the counts[i] slab out of the inner loop to eliminate j array-indexing operations:

for i in 0 .. 10
  slab = counts[i]
  for j in 0 .. 10
    sum += slab[j]
  end
end

Here’s the new code (sans the unit tests, which haven’t changed):

class WordPath

  A = Array

  def self.count_paths(grid, word)

    rword  = word.reverse
    rowmax = grid.size - 1
    colmax = grid.first.size - 1

    for i in 0 .. rword.size - 1
      letter = rword[i]
      previous_slab, slab = slab, A.new(rowmax+1) { A.new(colmax+1) }
      for r in 0 .. rowmax
        row, line = grid[r], slab[r]
        for c in 0 .. colmax
          line[c] = unless letter == row[c]
            0
          else
            if i == 0
              1
            else
              sum = 0
              clo = c > 0 ? c - 1 : c
              chi = c < colmax ? c + 1 : c
              for nr in (r > 0 ? r - 1 : r) .. (r < rowmax ? r + 1 : r)
                for nc in clo .. chi
                  sum += previous_slab[nr][nc] if nr != r || nc != c
                end
              end
              sum
            end
          end
        end
      end
    end

    sum = 0
    for r in 0 .. rowmax
      for c in 0 .. colmax
        sum += slab[r][c]
      end
    end

    sum

  end
end

Update 2: I tweaked the code snippet above to remove a variable that I just noticed wasn’t actually doing anything.

Posted in ,
Tags , , , ,
3 comments
no trackbacks
Reddit Delicious

Composing functions in Ruby

Posted by Tom Moertel Fri, 07 Apr 2006 15:55:00 GMT

One of the things I miss when coding in Ruby is inexpensive function composition. In Haskell, for example, I can compose functions using the dot (.) operator:

inc        = (+1)
twice      = (*2)
twiceOfInc = twice . inc
Because of Ruby’s open classes, however, I can easily add the feature to the language. In the code below, I introduce Proc.compose and overload the star (*) operator for the purpose:
# func_composition.rb
class Proc
  def self.compose(f, g)
    lambda { |*args| f[g[*args]] }
  end
  def *(g)
    Proc.compose(self, g)
  end
end

And that’s all it takes:

$ irb --simple-prompt -r func_composition.rb

>> inc = lambda { |x| x + 1 }
=> #<Proc:0x00002aaaaaad7810@(irb):1>

>> twice = lambda { |x| x * 2 }
=> #<Proc:0x00002aaaaabd2d18@(irb):2>

>> inc[1]
=> 2

>> twice[2]
=> 4

>> twice_of_inc = twice * inc
=> #<Proc:0x00002aaaaab32458@./func_composition.rb:3>

>> twice_of_inc[1]
=> 4

>> twice_of_inc[2]
=> 6

Now, isn’t that refreshing?

Update: Vincent Foley pointed out on comp.lang.ruby that Ruby Facets has a nearly identical implementation that also uses the star operator for composition. (Its version of compose, however, is an instance method whereas my version is a class method.)

Posted in ,
Tags , ,
2 comments
no trackbacks
Reddit Delicious

Improving Typo's spam protection

Posted by Tom Moertel Mon, 16 Jan 2006 06:34:00 GMT

I noticed that my site has been picking up more comment spam recently. Typo has built-in spam protection, but for some reason a few spam comments that ought to have been caught slipped through its filters. Curious, I investigated.

Most spam comments contain links to sites favored by the spammers. The sites are almost always of the form x.domain.com, where domain is one of a few higher-level domains and x is drawn from a large set of values from the realms of gambling, pornography, and male enhancement. It seems that the spammers pay for a few real domains and then create a ton of subdomains under them.

One of the ways to detect comment spam is to find URIs in comments and look up the sites they point to in DNS-based SURBLs, such as multi.surbl.org and bsb.empty.us. The thing is, when SURBLs list a spammy site x.domain.com, sometimes they list it under the full hostname x.domain.com and sometimes they list it under the higher-level domain domain.com. To be safe, Typo looks up both forms when it checks for spam.

Here’s the code it uses:

HOST_RBLS.each do |rbl|
  begin
    if [
        IPSocket.getaddress([host, rbl].join('.')),
        IPSocket.getaddress((domain + [rbl]).join('.'))
       ].include?("127.0.0.2")
      throw :hit, "#{rbl} positively resolved #{domain.join('.')}"
    end
  rescue SocketError
  end
end

The code iterates over the list of SURBLs it has and queries each twice – once for the host and once for the domain in question – saving the results of the queries in an array. Then if the array includes a positive response (127.0.0.2), it throws a “hit” notice to the calling code, which will block the associated comment.

Unfortunately, the code doesn’t quite work as intended. Although a positive response for either the host or the domain should register as a hit, the code requires both queries to return positive responses. As a result, the code yields a lot of false negatives because most lists don’t include both host and domain forms of spammy sites; the required double positive is thus hard to obtain.

The cause of the problem is the attempt to query for both forms of the site before checking either response. The queries are performed by calling IPSocket.getaddress, which performs a DNS query for the “A” record associated with its argument. If the record exists, the call returns it; otherwise, the call raises a SocketError exception.

The exception is what causes the logic to break down. When either the host or domain is not in the queried SURBL, which will almost always be the case for reasons I explained earlier, one of the queries will result in a SocketError exception. The exception will be caught by the rescue clause later in the code, but not before the opportunity to test the other query’s response and throw a “hit” has been lost.

My fix was to replace the above code with a call to a new helper method:

query_rbls(HOST_RBLS, host, domain.join('.'))

The helper, defined later, makes the actual queries:

def query_rbls(rbls, *subdomains)
  rbls.each do |rbl|
    subdomains.uniq.each do |d|
      begin
        response = IPSocket.getaddress([d, rbl].join('.'))
        throw :hit, "#{rbl} positively resolved #{d} => #{response}"
      rescue SocketError
        # NXDOMAIN response => negative:  d is not in RBL
      end
    end
  end
  return false
end

Because some SURBLs don’t use 127.0.0.2 but some other “A” record to indicate a positive response, my helper removes the hard-coded address test.

I also made a few more improvements to the spam-protection code. The full set of changes is available as Patch 657 on the Typo Trac site.

Posted in
Tags , ,
no comments
no trackbacks
Reddit Delicious

Closures and the professional programmer

Posted by Tom Moertel Tue, 30 Aug 2005 18:56:00 GMT

I came across Tim Bray’s thoughts on Ruby via the ever-delightful Lambda the Ultimate and found the following bit fascinating:

I’ve had access to languages with closures and continuations and suchlike constructs for years and years, and I’ve never ever written one. While I’m impressed by how natural this stuff is in Ruby, I’m still unconvinced that these are a necessary part of the professional programmer’s arsenal. [Emphasis mine.]

While Tim Bray may be unconvinced, I am a true believer.

Read more...

Posted in , , ,
Tags , ,
12 comments
1 trackback
Reddit Delicious

How to change symlinks atomically

Posted by Tom Moertel Mon, 22 Aug 2005 16:00:00 GMT

Many people don’t realize that changing the target of a symbolic link (symlink) is not an atomic operation. “Changing” a symlink really means deleting it and creating a new link with the same file name. For example, if I have a symlink current that points to a directory old, and I want to change it to point to a directory new, I might use the following command:

$ ln -snf new current

Strace shows what really happens when I run the command:

$ strace ln -snf new current 2>&1 | grep link
unlink("current")         = 0
symlink("new", "current") = 0

First, the existing symlink is deleted via the unlink system call. Then a new, identically named symlink is created via the symlink system call. It’s a two-step process, and in between the steps, there is no symlink.

This can be a problem if you expect the symlink to be there always, such as when using the link to point to the active version of a live web site. If you change the symlink while deploying a new version of your site, for example, the web server might try to dereference the link during the small window of time when it doesn’t exist. Oops.

The solution to this problem is to effect the change by creating a new symlink and then renaming it over the old symlink. On Unix-like systems, renaming is an atomic operation, and thus the symlink “change” will be atomic too. By hand, the process looks like this:

$ ln -s new current_tmp && mv -Tf current_tmp current

In Ruby, I make atomic symlinking available everywhere by extending the Pathname class with a new method atomic_symlink:

require 'pathname'

class Pathname
  def atomic_symlink(old)
    suffix = [Array.new(6){rand(256).chr}.join].pack("m").strip.tr('/','_');
    tmplink = Pathname.new(self.to_s + "_" + suffix)
    tmplink.make_symlink(old)
    begin
      tmplink.rename(self)
    rescue
      # if rename fails, we must remove the temporary link manually
      File.unlink(tmplink.to_s)
      raise
    end
  end
end

This code is nothing more than a robustified version of the by-hand method. It picks better names for temporary links, and it cleans up after itself, should something go wrong, but otherwise it does the same thing.

Given how easy it is to change symlinks atomically, why do it any other way? Life is hard enough without having to worry about another race condition.

Posted in , ,
Tags , ,
8 comments
no trackbacks
Reddit Delicious

Writing a simple Ruby evaluator in Haskell

Posted by Tom Moertel Sat, 26 Mar 2005 00:06:00 GMT

In the last few days I have been learning Ruby, something I have had on my to-do list for a long time. Luckily, I now have a project for which Ruby on Rails is perfect, and so now is the perfect time to get more into Ruby.

Naturally, I am making much use of the second edition of “The Pickaxe,” (pragmatic) Dave Thomas’s book Programming Ruby (the first edition of which is available online). Overall, it is a great book: good organization, lively writing, and superb examples.

But I must say I have one source of frustration. I am a computer-language guy, and I frequently find myself thinking, that’s great, but what exactly does this mean? I’ll give you an example, which I found quite surprising.

Using the following Ruby code, you can create what is in effect your own while-loop construct:

def my_while(cond)
  break unless cond
  yield
  retry
end

i = 0
my_while i < 10 do
  print i
  i += 1
end

# output: 0123456789

At first, this blew my mind. Why? Because while reading the book, I was building an understanding of the semantics of the Ruby language in my head, and my understanding of what it means to call a function (iterator) with a block was wrong. In my understanding, calling a function results in evaluating its arguments in the caller’s evaluation context, entering the evaluation context of the function, binding the argument values, passing in the associated block, and then evaluating the function body. When retry is called, I reasoned mistakenly, the evaluation stops and begins anew at the beginning of the function body, in this case, back at the break expression.

But, clearly, that cannot be what is happening. If it were, the loop would never terminate. The condition i < 10 would be evaluated only once – when the my_while function was called – and thus true would be forever bound to cond within the evaluation context of the function’s body.

At this point, I became curious. What’s really going on with retry? To understand its relationship to function calls, I started looking for the calling semantics of Ruby. (No luck finding them, btw.) Are arguments passed as thunks that get reevaluated upon each access? No, that seemed too wasteful and bizarre.

Pickaxe II said, “retry will reevaluate any arguments to the iterator before restarting it.” Yes, clearly, that is what is happening. But what is happening under the hood? What does that statement really mean?

So, after thinking about it, I concluded that what is going on is that a function call in Ruby works like this. Given a function f, a block b, and arguments xs, the call f(xs){b} means this:

  1. let k be the current continuation (i.e., just before the call)
  2. evaulate xs and bind the resulting values to f’s formal arguments
  3. bind b internally to the current block
  4. evaluate the body of f

Now, if inside of f’s body we encounter a retry, the evaluator basically calls k (with a nil argument, I expect). This jumps back to step 2, from which evaluation continues. Any side effects up to this point are retained (so we could have previously incremented i, for example), which is what eventually allows the code within the function body to choose an execution path which does not contain a retry expression, and thus avoid looping forever.

Just to make sure I really had the semantics down, I wrote an evaluator for a mini-Ruby in Haskell. (I find that I understand something better after I build it from the ground up.)

{-# OPTIONS -fglasgow-exts #-}

-- Tom Moertel <tom@moertel.com>
-- 2005-03-26

module MiniRuby where

import Control.Monad.Cont
import Control.Monad.Reader
import Control.Monad.State
import Data.List
import Data.Maybe

-- ========================================================
-- DATA TYPES

-- To keep things simple, there is only value type: string.

type Identifier = String
type Value = String

-- Evaluation occurs with in the continuation monad (which is used to
-- handle control flow), wrapped around a reader monad (which keeps
-- track of the calling context), wrapped around a state monad (which
-- keeps track of the evaluator's variables.)

type Env = [(Identifier, Value)] -- identifiers => values (strings)
type RubyEvalCxt a = ContT a (ReaderT FcallCxt (State Env)) a
data FcallCxt = FC { retryCall :: Exp
                   , blockCont :: Value -> Exp }

-- An expression is just something that can be evaluated to a value
-- within a ruby evaluation context.

type Exp = RubyEvalCxt Value

-- ========================================================
-- EVALUATOR

-- The following function evaluates an expression within a given
-- evaluation context.   The result is a value.

eval :: Env -> FcallCxt -> Exp -> Value
eval env fc =
    (`evalState` env) . (`runReaderT` fc) . (`runContT` return)

-- This version evaluates an expression at the "top level."

evalTop = eval [] $
    FC { retryCall = return "TOPLEVEL RETRY"
       , blockCont = const $ return "TOPLEVEL BLOCK" }

-- A function call takes a function, a list of variable bindings, and
-- corresponding block.  It then creates a new execution context,
-- binds the variables, and evaluates the function body.

fcall :: Exp -> [(Identifier, Exp)] -> Exp -> Exp
fcall fn args blk = callCC evalFn
  where
    evalFn fnCont = (`local` do { bindArgs; fn }) $ \fc ->
        fc { retryCall = evalFn fnCont >>= fnCont
           , blockCont = const blk }
    bindArgs = mapM_ (uncurry (=:=)) args

-- Yield remembers the current continuation and passes control to the
-- associated block (which we get from the calling context).  For
-- extra spice, this version differs from Ruby's in that upon yielding
-- it makes the yielding function seem like a block to the block to
-- which it yields.  This lets the function and block yield back and
-- forth to each other, passing values along the way.

yield_ = yield "YIELD" -- yield w/o value

yield :: Value -> Exp
yield value = callCC $ \k -> do
    bc <- asks blockCont
    local (\fc -> fc { blockCont = k }) (bc value)

-- Retry restarts the current computation w/in the calling context's
-- continuation.

retry :: Exp
retry = do
    k <- asks retryCall
    k

-- ========================================================
-- VARIABLES

-- Bind associates a value with an identifier

bind :: Identifier -> Value -> Exp
bind i v = do
    modify ((i,v) :)
    return v

-- More convenient syntax (=:=) for binding

infixr 1 =:=
class Bindable v where (=:=) :: Identifier -> v -> Exp
instance Bindable Value where (=:=) = bind
instance Bindable Exp where i =:= e = bind i =<< e

-- Lookup the value associated with an identifier

val :: Identifier -> Exp
val i = gets $ fromMaybe (i ++ "=UNDEFINED") . lookup i

-- ========================================================
-- SAMPLE CODE

-- This code shows how "retry" works.  It is equivalent
-- to the following Ruby code:
--
--   def my_while(cond)
--     if cond
--       yield
--       retry
--     end
--   end
--
--   i = 0
--   my_while i < 10 do
--     i += 1
--   end
--
--   i

test1 = do
    "i" =:= "0"
    my_while [("cond", condExp)] $
        "i" += 1 -- block, passed to my_while
    val "i"
  where
    my_while = fcall $ do
        cond <- val "cond"
        if cond == "true"
           then do { yield_; retry }
           else return cond

    -- Ruby's += operator
    a += b = a =:= (liftM $ show . (b+) . read) (val a)

    -- the expression "i < 10"
    condExp = do
        i <- val "i"
        return $ if (read i) < 10 then "true" else "false"

-- This example tests out yield.  It is equivalent to the following
-- Ruby code:
--
--   def f
--     @i = "I"
--     @k = "K"
--     yield
--   end
--
--   f do
--     @j = "J"
--     @l = "L"
--   end
--
--   [@i,@j,@k,@l].join(" ")

test2 = do
    f [] $ do
        "j" =:= "J"
        "l" =:= "L"
    mapM val (words "i j k l") >>= return . unwords
  where
    f = fcall $ do
        "i" =:= "I"
        "k" =:= "K"
        yield_

-- This sample is somewhat trickier.  It uses the evaluator's
-- extended yield semantics to do what this Ruby code would
-- do if it were legal:
--
-- def f
--   @i = "I"
--   @j = yield
--   @k = "K"
--   @l = yield "Right-back-atcha!"
--   @m = yield
-- end
--
-- f do
--   rba = yield "J-via-yield"  # not Ruby: yields *back* to f's body
--   yield rba
--   yield "M-via-yield"
-- end
--
-- [@i,@j,@k,@l,@m].join(" ")

test3 = do
    f [] $ do
        rba <- yield "J-via-yield"
        yield rba
        yield "M-via-yield"
    mapM val (words "i j k l m") >>= return . unwords
  where
    f = fcall $ do
        "i" =:= "I"
        "j" =:= yield_
        "k" =:= "K"
        "l" =:= yield "Right-back-atcha!"
        "m" =:= yield_

Here’s what the code does when executed:

> evalTop test1
"10"

> evalTop test2
"I J K L"

> evalTop test3
"I J-via-yield K Right-back-atcha! M-via-yield"

I must say that I really like Ruby’s semantics. So far, I find Ruby to be a seriously cool programming language.

Posted in , , ,
Tags , ,
6 comments
no trackbacks
Reddit Delicious