<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="/stylesheets/rss.css" type="text/css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Tom Moertel's Weblog: Tag ruby</title>
    <link>http://blog.moertel.com/articles/tag/ruby?tag=ruby</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>Quality rants on programming theory and stuff geeks like</description>
    <item>
      <title>Ruby 1.9 gets handy new method Object#tap</title>
      <description>&lt;p&gt;Via
&lt;a href="http://eigenclass.org/hiki.rb?Changes-in-Ruby-1.9-update-6"&gt;eigenclass.org&lt;/a&gt;
I learned that Ruby 1.9 will sport a new &lt;code&gt;Object&lt;/code&gt; method
called
&lt;a href="http://eigenclass.org/hiki.rb?Changes+in+Ruby+1.9#l25"&gt;&lt;code&gt;tap&lt;/code&gt;&lt;/a&gt;,
which is something I&amp;#8217;ve been &lt;a href="http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/189541"&gt;hoping
for&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;What&amp;#8217;s &lt;code&gt;tap&lt;/code&gt;?  It&amp;#8217;s a helper for call chaining.  It
passes its object into the given block and, after the block finishes,
returns the object:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;an_object&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;tap&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;o&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
  &lt;span class="comment"&gt;# do stuff with an_object, which is in o&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt; &lt;span class="comment"&gt;# ===&amp;gt; an_object&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;The benefit is that &lt;code&gt;tap&lt;/code&gt; always returns the object it&amp;#8217;s called on, even if the block returns some other result.  Thus you can insert a &lt;code&gt;tap&lt;/code&gt; block into the middle of an existing method pipeline without breaking the flow.  MenTaLguY has some &lt;a href="http://moonbase.rydia.net/mental/blog/programming/eavesdropping-on-expressions"&gt;nifty examples&lt;/a&gt; of other things you can do with &lt;code&gt;tap&lt;/code&gt;.&lt;/p&gt;
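
	&lt;p&gt;Here is a minimal sketch of dropping a &lt;code&gt;tap&lt;/code&gt; into the middle of a chain.  The array and the &lt;code&gt;seen&lt;/code&gt; log are made up for illustration; the point is only that the block&amp;#8217;s own result is thrown away and the chain carries on with the object &lt;code&gt;tap&lt;/code&gt; was called on:&lt;/p&gt;

```ruby
# Illustrative only: the data and the `seen` array are invented for this sketch.
seen = []

result = [3, 1, 2]
  .sort
  .tap { |sorted| seen.push(sorted.first) }  # side effect; the block result is discarded
  .map { |n| n * 10 }

# result is [10, 20, 30]; seen is [1]
p result
p seen
```

	&lt;p&gt;Deleting the &lt;code&gt;tap&lt;/code&gt; line leaves &lt;code&gt;result&lt;/code&gt; unchanged, which is what makes it handy for temporary logging or debugging.&lt;/p&gt;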


	&lt;p&gt;Fans of Ruby on Rails may recognize &lt;code&gt;tap&lt;/code&gt; as similar to RoR&amp;#8217;s own
&lt;a href="http://weblog.jamisbuck.org/2006/10/27/mining-activesupport-object-returning"&gt;&lt;code&gt;returning&lt;/code&gt;&lt;/a&gt; helper.&lt;/p&gt;


	&lt;p&gt;Looks like Ruby 1.9 is going to be extra cool for a number of reasons.&lt;/p&gt;</description>
      <pubDate>Wed, 07 Feb 2007 12:08:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:9a2e5bfe-f2b1-462f-88c5-cd231503292a</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2007/02/07/ruby-1-9-gets-handy-new-method-object-tap</link>
      <category>ruby</category>
      <category>rails</category>
      <category>tap</category>
      <category>helpers</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/362</trackback:ping>
    </item>
    <item>
      <title>Adding Haskell syntax highlighting to the Typo blogging system</title>
      <description>&lt;p&gt;Last night on &lt;a href="irc://irc.freenode.net/%23haskell"&gt;#haskell&lt;/a&gt;, &lt;a href="http://www.cse.unsw.edu.au/~dons/"&gt;Don
Stewart&lt;/a&gt; asked if I had seen
&lt;a href="http://www.cs.york.ac.uk/fp/darcs/hscolour/"&gt;HsColour&lt;/a&gt;
for rendering syntax-highlighted Haskell in &lt;span class="caps"&gt;HTML&lt;/span&gt;.  He had
used it recently, he noted in passing, &lt;a href="http://cgi.cse.unsw.edu.au/~dons/blog/2006/09/10#colours"&gt;to add syntax highlighting to planet.haskell.org&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;Now, I can&amp;#8217;t be certain about this, but I suspect that Don&amp;#8217;s question
was cleverly designed to instill in me a subtle case of
syntax-highlighting envy.  For on &lt;a href="http://blog.moertel.com/"&gt;&lt;em&gt;my&lt;/em&gt; blog&lt;/a&gt;, Haskell code snippets
were rendered in dreadfully boring uncolored text.
But on &lt;a href="http://cgi.cse.unsw.edu.au/~dons/blog"&gt;&lt;em&gt;his&lt;/em&gt; blog&lt;/a&gt;, the
snippets dance in joyous polychromatic splendor.&lt;/p&gt;


	&lt;p&gt;Thus I was compelled to add Haskell syntax-highlighting to my blog.&lt;/p&gt;


	&lt;h3&gt;Adding Haskell syntax-highlighting to Typo&lt;/h3&gt;


	&lt;p&gt;My blog runs on the Ruby-on-Rails-powered &lt;a href="http://typosphere.org/"&gt;Typo&lt;/a&gt;
system, which &lt;a href="http://scottstuff.net/blog/articles/2005/08/23/introduction-to-typo-filters"&gt;allows for plug-in text filters&lt;/a&gt;.  One of the included filters, in fact, is a syntax-highlighting filter for snippets of Ruby, &lt;span class="caps"&gt;XML&lt;/span&gt;, and &lt;span class="caps"&gt;YAML&lt;/span&gt; code.  This filter is built upon the Ruby &lt;a href="http://syntax.rubyforge.org/"&gt;Syntax&lt;/a&gt; module, which wasn&amp;#8217;t exactly designed for Haskell syntax analysis.  So I set out to create a new plug-in filter based upon HsColour.&lt;/p&gt;


	&lt;p&gt;This task turned out to be easy.  All I did was duplicate
Typo&amp;#8217;s existing syntax-highlighting filter and swap out its filtering
code for the following:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="constant"&gt;IO&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;popen&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;HsColour -css&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;,&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;r+&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
  &lt;span class="ident"&gt;pid&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;fork&lt;/span&gt; &lt;span class="punct"&gt;{&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;write&lt;/span&gt; &lt;span class="ident"&gt;text&lt;/span&gt;&lt;span class="punct"&gt;;&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;close&lt;/span&gt;&lt;span class="punct"&gt;;&lt;/span&gt; &lt;span class="ident"&gt;exit!&lt;/span&gt; &lt;span class="number"&gt;0&lt;/span&gt; &lt;span class="punct"&gt;}&lt;/span&gt;
  &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;close_write&lt;/span&gt;
  &lt;span class="ident"&gt;text&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;f&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;read&lt;/span&gt;
  &lt;span class="constant"&gt;Process&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;waitpid&lt;/span&gt; &lt;span class="ident"&gt;pid&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;I also tweaked the post-processing regular expressions so that they
would whittle away the &lt;span class="caps"&gt;HTML&lt;/span&gt; filler before and after the
syntax-highlighted output of HsColour:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;text&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;gsub!&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;.*&amp;lt;p()re&amp;gt;&lt;/span&gt;&lt;span class="punct"&gt;/&lt;/span&gt;&lt;span class="ident"&gt;m&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;...)&lt;/span&gt;
&lt;span class="ident"&gt;text&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;gsub!&lt;/span&gt;&lt;span class="punct"&gt;(/&lt;/span&gt;&lt;span class="regex"&gt;&amp;lt;&lt;span class="escape"&gt;\/&lt;/span&gt;pre&amp;gt;.*&lt;/span&gt;&lt;span class="punct"&gt;/&lt;/span&gt;&lt;span class="ident"&gt;m&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;...)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;A few more tweaks and I was done.&lt;/p&gt;


	&lt;p&gt;Now I can wrap my Haskell code in &amp;lt;typo:haskell&amp;gt; tags and it, too, will
dance in joyous polychromatic splendor:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;constructTable&lt;/span&gt; &lt;span class='varid'&gt;tspecs&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
    &lt;span class='varid'&gt;ecolspecs&lt;/span&gt; &lt;span class='keyglyph'&gt;&amp;lt;-&lt;/span&gt; &lt;span class='varid'&gt;during&lt;/span&gt; &lt;span class='str'&gt;"argument evaluation"&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
        &lt;span class='varid'&gt;toNvps&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;concat&lt;/span&gt; &lt;span class='varop'&gt;=&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class='varid'&gt;mapM&lt;/span&gt; &lt;span class='varid'&gt;splice&lt;/span&gt; &lt;span class='varid'&gt;tspecs&lt;/span&gt;
    &lt;span class='keyword'&gt;let&lt;/span&gt; &lt;span class='varid'&gt;names&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;map&lt;/span&gt; &lt;span class='varid'&gt;fst&lt;/span&gt; &lt;span class='varid'&gt;ecolspecs&lt;/span&gt;
    &lt;span class='keyword'&gt;let&lt;/span&gt; &lt;span class='varid'&gt;evecs&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;map&lt;/span&gt; &lt;span class='varid'&gt;snd&lt;/span&gt; &lt;span class='varid'&gt;ecolspecs&lt;/span&gt;
    &lt;span class='varid'&gt;vecs&lt;/span&gt; &lt;span class='keyglyph'&gt;&amp;lt;-&lt;/span&gt; &lt;span class='varid'&gt;argof&lt;/span&gt; &lt;span class='varid'&gt;nm&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='varid'&gt;mapM&lt;/span&gt; &lt;span class='varid'&gt;evalVector&lt;/span&gt; &lt;span class='varid'&gt;evecs&lt;/span&gt;
    &lt;span class='keyword'&gt;let&lt;/span&gt; &lt;span class='varid'&gt;vlens&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;map&lt;/span&gt; &lt;span class='varid'&gt;vlen&lt;/span&gt; &lt;span class='varid'&gt;vecs&lt;/span&gt;
    &lt;span class='keyword'&gt;if&lt;/span&gt; &lt;span class='varid'&gt;length&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;group&lt;/span&gt; &lt;span class='varid'&gt;vlens&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varop'&gt;==&lt;/span&gt; &lt;span class='num'&gt;1&lt;/span&gt;
        &lt;span class='keyword'&gt;then&lt;/span&gt; &lt;span class='varid'&gt;return&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='conid'&gt;VTable&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='varid'&gt;mkTable&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;zip&lt;/span&gt; &lt;span class='varid'&gt;names&lt;/span&gt; &lt;span class='varid'&gt;vecs&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
        &lt;span class='keyword'&gt;else&lt;/span&gt; &lt;span class='varid'&gt;throwError&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;
             &lt;span class='str'&gt;"table columns must be non-empty vectors of equal length"&lt;/span&gt;
  &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;nm&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"table(...) constructor"&lt;/span&gt;
    &lt;span class='varid'&gt;splice&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;TCol&lt;/span&gt; &lt;span class='varid'&gt;envp&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;return&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='varid'&gt;envp&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
    &lt;span class='varid'&gt;splice&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;TSplice&lt;/span&gt; &lt;span class='varid'&gt;e&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
        &lt;span class='varid'&gt;val&lt;/span&gt; &lt;span class='keyglyph'&gt;&amp;lt;-&lt;/span&gt; &lt;span class='varid'&gt;eval&lt;/span&gt; &lt;span class='varid'&gt;e&lt;/span&gt;
        &lt;span class='keyword'&gt;case&lt;/span&gt; &lt;span class='varid'&gt;val&lt;/span&gt; &lt;span class='keyword'&gt;of&lt;/span&gt;
            &lt;span class='conid'&gt;VTable&lt;/span&gt; &lt;span class='varid'&gt;t&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt;
                &lt;span class='varid'&gt;return&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='varid'&gt;zipWith&lt;/span&gt; &lt;span class='varid'&gt;mkNVP&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;tcnames&lt;/span&gt; &lt;span class='varid'&gt;t&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;elems&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;tvecs&lt;/span&gt; &lt;span class='varid'&gt;t&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
            &lt;span class='conid'&gt;VList&lt;/span&gt; &lt;span class='varid'&gt;gl&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt;
                &lt;span class='varid'&gt;liftM&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;zipWith&lt;/span&gt; &lt;span class='varid'&gt;mkNVP&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;map&lt;/span&gt; &lt;span class='varid'&gt;name&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;elems&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='varid'&gt;glnames&lt;/span&gt; &lt;span class='varid'&gt;gl&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;
                &lt;span class='varid'&gt;mapM&lt;/span&gt; &lt;span class='varid'&gt;asVectorNull&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;elems&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='varid'&gt;glvals&lt;/span&gt; &lt;span class='varid'&gt;gl&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
            &lt;span class='keyword'&gt;_&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='varid'&gt;throwError&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;
                &lt;span class='str'&gt;"can't construct table columns from ("&lt;/span&gt; &lt;span class='varop'&gt;++&lt;/span&gt;
                &lt;span class='varid'&gt;show&lt;/span&gt; &lt;span class='varid'&gt;val&lt;/span&gt; &lt;span class='varop'&gt;++&lt;/span&gt; &lt;span class='str'&gt;")"&lt;/span&gt;
    &lt;span class='varid'&gt;mkNVP&lt;/span&gt; &lt;span class='varid'&gt;n&lt;/span&gt; &lt;span class='varid'&gt;vec&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;NVP&lt;/span&gt; &lt;span class='varid'&gt;n&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;mkNoPosExpr&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='conid'&gt;EVal&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='conid'&gt;VVector&lt;/span&gt; &lt;span class='varid'&gt;vec&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
    &lt;span class='varid'&gt;name&lt;/span&gt; &lt;span class='str'&gt;""&lt;/span&gt;     &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"NA"&lt;/span&gt;
    &lt;span class='varid'&gt;name&lt;/span&gt; &lt;span class='varid'&gt;n&lt;/span&gt;      &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;n&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;If you want the filter code, here it is: &lt;a href="http://community.moertel.com/~thor/blog/haskell_controller.rb.txt"&gt;haskell_controller.rb&lt;/a&gt;.  Just drop it into &lt;code&gt;components/plugins/textfilters&lt;/code&gt; and restart Typo.  The corresponding &lt;span class="caps"&gt;CSS&lt;/span&gt; styles can be found in my &lt;a href="http://blog.moertel.com/stylesheets/user-styles.css"&gt;user-styles.css&lt;/a&gt;.&lt;/p&gt;</description>
      <pubDate>Wed, 01 Nov 2006 17:01:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:62648231-3b46-4d96-a657-69565f7ee784</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2006/11/01/adding-haskell-syntax-highlighting-to-the-typo-blogging-system</link>
      <category>haskell</category>
      <category>ruby</category>
      <category>typo</category>
      <category>hscolour</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/207</trackback:ping>
    </item>
    <item>
      <title>A type-based solution to the &amp;quot;strings problem&amp;quot;: a fitting end to XSS and SQL-injection holes?</title>
      <description>&lt;p&gt;Even skilled programmers have a hard time keeping their web
applications free of &lt;span class="caps"&gt;XSS&lt;/span&gt; and &lt;span class="caps"&gt;SQL&lt;/span&gt;-injection vulnerabilities.  And it
shows:  &lt;a href="http://portal.spidynamics.com/blogs/msutton/archive/2006/09/26/How-Prevalent-Are-SQL-Injection-Vulnerabilities_3F00_.aspx"&gt;a sobering portion of web sites are open to some scary security threats&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;Why are so many sites vulnerable to these well-known holes?  Probably
because it&amp;#8217;s insanely hard for programmers to solve the fundamental
&amp;#8220;strings problem&amp;#8221; at the heart of these vulnerabilities. The problem
itself is easy to understand, but we humans aren&amp;#8217;t equipped to carry
out the solution.  Simply put, we just plain suck at keeping a
bazillion different strings straight in our heads, let alone
consistently and reliably rendering their interactions safe whenever they
cross paths in a modern web application.  It&amp;#8217;s easy to say, &amp;#8220;just
escape the little buggers,&amp;#8221; but it&amp;#8217;s hard to get it right, every single time.&lt;/p&gt;


	&lt;p&gt;Computers, on the other hand, are pretty good at keeping track of
details by the bucketful. Wouldn&amp;#8217;t it be nice, then,
if our programming languages gave us the power to delegate this nasty &amp;#8220;strings
problem&amp;#8221; to our computers, which could then devote their unwavering mechanical precision to grinding the problem out of existence?  &lt;a href="http://weblog.raganwald.com/2006/03/ill-take-static-typing-for-800-alex.html" title="Raganwald: I'll take Static Typing for $800, Alex."&gt;Isn&amp;#8217;t that the kind of thing modern programming languages are supposed to be good at?&lt;/a&gt;&lt;/p&gt;


	&lt;p&gt;I&amp;#8217;d like to think the answer to that question is a big, &lt;em&gt;you betcha&lt;/em&gt;.&lt;/p&gt;


	&lt;p&gt;So let&amp;#8217;s grab a modern programming language and solve the strings problem.&lt;/p&gt;


	&lt;h3&gt;Let&amp;#8217;s solve the strings problem in Haskell&lt;/h3&gt;


	&lt;p&gt;In this article, we will look at one way (among many) to solve the strings
problem: by adding Ruby-style string templates to Haskell.  These
templates support &amp;#8220;interpolation&amp;#8221; via the usual, convenient &lt;code&gt;#{var}&lt;/code&gt;
syntax, but here interpolation is type safe. Haskell&amp;#8217;s type system
will prevent us from inadvertently mixing incompatible string types,
and it will detect mistakes at compile time, before they can become
live &lt;span class="caps"&gt;XSS&lt;/span&gt; or &lt;span class="caps"&gt;SQL&lt;/span&gt;-injection holes.  Further, our solution will offer
us these benefits without making us jump through hoops or pay some
onerous syntax penalty.&lt;/p&gt;


	&lt;p&gt;To be more specific, the system offers the following benefits:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;It provides a string-management kernel that lets you create &amp;#8220;safe strings&amp;#8221; by &lt;em&gt;certifying&lt;/em&gt; a regular string as representing either text or a fragment of a known language.&lt;/li&gt;
		&lt;li&gt;It allows you to conveniently define new language types for any string-based language that you can provide an escaping rule for (e.g., &lt;span class="caps"&gt;XML&lt;/span&gt;, URLs, &lt;span class="caps"&gt;SQL&lt;/span&gt;, untrusted user input).&lt;/li&gt;
		&lt;li&gt;It provides compile-time syntactic sugar (via Template Haskell) that makes working with safe strings as convenient as working with string interpolation in languages like Ruby and Perl.&lt;/li&gt;
		&lt;li&gt;It catches and reports (at compile time) the following commonly made programming errors:
	&lt;ul&gt;
	&lt;li&gt;failing to escape a plain-old-text string before mixing it into a string that represents a language fragment&lt;/li&gt;
		&lt;li&gt;mixing strings that represent fragments of incompatible languages&lt;/li&gt;
		&lt;li&gt;mixing strings that represent fragments of compatible languages in an ambiguous way (the system will force you to disambiguate)&lt;/li&gt;
	&lt;/ul&gt;&lt;/li&gt;
	&lt;/ul&gt;


	&lt;p&gt;(This is a long one, so grab an espresso, lean back, and read on in
style.  Also, if you have a smoking jacket, you might want to get it now.)&lt;/p&gt;


	&lt;p&gt;Before I describe this Haskell-based solution, let&amp;#8217;s take a closer
look at the strings problem and review why a type-based approach makes
sense.  (If you already understand the strings problem and are
convinced that it is both important and tricky to solve, feel free
to skim the first third of this article.)&lt;/p&gt;


	&lt;h3&gt;Examining the &amp;#8220;strings problem&amp;#8221;&lt;/h3&gt;


	&lt;p&gt;Most web applications are just business-logic-driven string processors.  They
take strings from user-submitted forms, database queries, web-service
responses, templates, and myriad other sources, and they combine the
strings to generate yet more strings, which they emit as output and
fling across the Internet, into your web browser.&lt;/p&gt;


	&lt;p&gt;For example, consider this snippet of Ruby (on Rails) code that I used &lt;a href="http://blog.moertel.com/articles/2006/08/09/adding-reddit-and-del-icio-us-buttons-to-articles-in-typo"&gt;to
add submit-to-Reddit and submit-to-del.icio.us
buttons&lt;/a&gt;
to articles on my blog:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;def submit_this_article_links(article)
  site_list(article).map do |submit_title, submit_url, image_tag|
    %(&amp;lt;a href="#{h submit_url}" 
         title="#{h submit_title}: &amp;amp;#x201C;#{h article.title}&amp;amp;#x201D;" 
      &amp;gt;#{image_tag}&amp;lt;/a&amp;gt;)
  end.join("&amp;amp;#160;")
end

def site_list(article)
  u_title = u(article.title)
  u_url = u(url_of(article, false))
  [  # I really belong in a database table
    [ "Submit to Reddit.com",
      "http://reddit.com/submit?url=#{u_url}&amp;#38;title=#{u_title}",
      image_tag("reddit.gif", :size =&amp;gt; "18x18", :border =&amp;gt; 0)
    ],
    [ "Save to del.icio.us",
      "http://del.icio.us/post?v=2&amp;#38;url=#{u_url}&amp;#38;title=#{u_title}",
      image_tag("delicious.gif", :size =&amp;gt; "16x16", :border =&amp;gt; 0)
    ]
  ]
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;When writing this code, I had to keep track of at least three
different kinds of strings:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;Plain-old text&lt;/strong&gt;, e.g., article titles&lt;/li&gt;
		&lt;li&gt;&lt;strong&gt;URLs&lt;/strong&gt;, e.g., article permalinks&lt;/li&gt;
		&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;XHTML&lt;/span&gt; fragments&lt;/strong&gt;, e.g., the hypertext link to Reddit&amp;#8217;s submission form&lt;/li&gt;
	&lt;/ul&gt;


	&lt;p&gt;In code like this, each type of string must conform to the
requirements of its own little language, and it&amp;#8217;s the programmer&amp;#8217;s job &amp;#8211; your job &amp;#8211; to make sure that differences in these requirements are accounted for
when combining strings.  Getting it right is a
difficult trick to pull off, and getting it right consistently is
&lt;a href="http://blog.moertel.com/articles/2006/10/12/if-unit-testing-cant-keep-rails-safe-from-string-escaping-problems-what-makes-you-think-it-will-keep-your-projects-safe"&gt;something even the best developers have difficulty doing&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;In the tiny snippet of code above, for example, I had to remember to
do all of these things:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;&lt;span class="caps"&gt;URL&lt;/span&gt;-escape (using the &lt;code&gt;u&lt;/code&gt; helper method) the article&amp;#8217;s title before inserting it into the submit-URL template&lt;/li&gt;
		&lt;li&gt;&lt;span class="caps"&gt;URL&lt;/span&gt;-escape the &lt;span class="caps"&gt;URL&lt;/span&gt; for the article&amp;#8217;s permalink before inserting it into the submit-URL template&lt;/li&gt;
		&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;-escape (using the &lt;code&gt;h&lt;/code&gt; helper method) the final, expanded submit-URL template before inserting it into the hypertext-link template&lt;/li&gt;
		&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;-escape the submit-title (e.g., &amp;#8220;Submit to Reddit.com&amp;#8221;) before inserting it into the hypertext-link template&lt;/li&gt;
		&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;-escape the article&amp;#8217;s title before inserting it into the hypertext-link template&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;That&amp;#8217;s a lot to keep track of when coding.&lt;/p&gt;


	&lt;p&gt;But that&amp;#8217;s not all.  I also had to know &lt;em&gt;not&lt;/em&gt; to escape the result of
calling &lt;code&gt;image_tag&lt;/code&gt;, because that helper method returns
an &lt;span class="caps"&gt;HTML&lt;/span&gt; fragment, which is already in the language of the
hypertext-link template into which it is inserted.  Escaping it would
have turned the image-element markup into embedded text that happens
to look a lot like &lt;span class="caps"&gt;HTML&lt;/span&gt; markup.&lt;/p&gt;
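
	&lt;p&gt;To make the layering concrete, here is a rough sketch that uses only Ruby&amp;#8217;s standard library.  It is not the helper code shown above: the title and &lt;span class="caps"&gt;URL&lt;/span&gt; are invented, &lt;code&gt;URI.encode_www_form&lt;/code&gt; stands in for the &lt;code&gt;u&lt;/code&gt; calls (note that it encodes spaces as plus signs), and &lt;code&gt;ERB::Util.html_escape&lt;/code&gt; plays the part of &lt;code&gt;h&lt;/code&gt;.  The last step shows why a string that is already in the target language must not be escaped again:&lt;/p&gt;

```ruby
require "erb"
require "uri"

# Hypothetical stand-ins for the article data; these values are not from the post.
article_title = 'Ruby "tap" tips'
article_url   = "http://example.com/articles/tap?page=2"

# Steps 1 and 2: URL-escape the title and the permalink on their way into
# the submit URL. URI.encode_www_form escapes each value and joins the
# name=value pairs with an ampersand.
query      = URI.encode_www_form(url: article_url, title: article_title)
submit_url = "http://reddit.com/submit?" + query

# Steps 3 to 5: HTML-escape each value headed for an attribute of the link.
href_value  = ERB::Util.html_escape(submit_url)
title_value = ERB::Util.html_escape("Submit to Reddit.com: " + article_title)

# The flip side: href_value is already in the HTML language, so escaping it
# a second time changes its meaning instead of protecting it.
double_escaped = ERB::Util.html_escape(href_value)

puts href_value
puts title_value
puts double_escaped == href_value   # prints false
```

	&lt;p&gt;Even in this toy version there are three escaping decisions and one deliberate non-escape, and nothing in the language stops you from getting any of them wrong.&lt;/p&gt;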


	&lt;p&gt;And that&amp;#8217;s not the worst of it.  If you screw up any one of these
steps for the typical web application, you open
the door to a host of nasty problems.  If you&amp;#8217;re lucky, the damage
will be contained to broken links or a rendering problem that
most people won&amp;#8217;t notice, maybe a weird database error now and again.
In the worst case, however, you&amp;#8217;re screwed: Your application&amp;#8217;s
customers become vulnerable to &lt;a href="http://en.wikipedia.org/wiki/Cross_site_scripting"&gt;cross-site-scripting (XSS)
attacks&lt;/a&gt; and your
database is opened to &lt;a href="http://en.wikipedia.org/wiki/SQL_injection"&gt;injected
&lt;span class="caps"&gt;SQL&lt;/span&gt;&lt;/a&gt;, through which
enterprising crackers might steal your customers&amp;#8217; account data
or do even nastier things.&lt;/p&gt;


	&lt;p&gt;Clearly, the strings problem is common enough and nasty enough to merit
our attention.  Many of our favorite problem-stomping practices,
however, have not proved effective on the ever-tricky strings problem.&lt;/p&gt;


	&lt;h3&gt;Unit testing is an inefficient solution to the strings problem&lt;/h3&gt;


	&lt;p&gt;Unit testing is one of the most efficient programming practices for
increasing the quality of software.  If you write unit tests pervasively
as you code, you are likely to nip many kinds of programming problems
in the bud, saving time and effort, which you can then re-invest in
your code.  Further, unit-testing suites make for swell
regression-detection nets and thus free you to refactor crufty code
without fear of introducing breakage elsewhere.  As a result, you&amp;#8217;re
more likely to keep your code lean and mean.&lt;/p&gt;


	&lt;p&gt;Despite its general effectiveness, unit testing is an inefficient way
to defend against the perils of the strings problem.  That&amp;#8217;s because
the strings problem is caused by knowledge deficits, which you can&amp;#8217;t
test for.  If you don&amp;#8217;t realize that you must escape one &lt;span class="caps"&gt;URL&lt;/span&gt;
before you stuff it into another &lt;span class="caps"&gt;URL&lt;/span&gt;, you probably won&amp;#8217;t think to
write tests for that requirement.&lt;/p&gt;


	&lt;p&gt;Moreover, if you do think to write the tests, it&amp;#8217;s expensive to get
them right.  In most unit-testing scenarios, getting the tests right
is easier than, or at least comparable in difficulty to, getting the
code under test right.  That&amp;#8217;s why unit testing is usually
so efficient.  For the strings problem, however, getting
the tests right is often much more expensive than writing typical
string-handling code.  In my code sample
above, for example, there are at least six ways the strings problem
can cause trouble.  How do you test for them all without making
a mistake?  It&amp;#8217;s not easy.&lt;/p&gt;


	&lt;p&gt;In sum, unit testing probably isn&amp;#8217;t the answer to the strings problem.&lt;/p&gt;


	&lt;h3&gt;Other solutions to the strings problem&lt;/h3&gt;


	&lt;p&gt;If unit testing isn&amp;#8217;t the answer, what is?&lt;/p&gt;


	&lt;p&gt;Joel Spolsky wrote about
the strings problem and &lt;a href="http://www.joelonsoftware.com/articles/Wrong.html"&gt;suggested that using Hungarian notation was
an effective
solution&lt;/a&gt;.
It might work, but it&amp;#8217;s clunky.&lt;/p&gt;


	&lt;p&gt;In the database-programming world, many programmers have adopted the
convention of never inserting a string into a &lt;span class="caps"&gt;SQL&lt;/span&gt; template by hand.
Instead, they insert placeholders, typically question marks,
into a template to indicate where they would like strings to be
inserted.  The template and the strings are then given
to a special function that safely inserts the strings, escaping them
as necessary.  In Ruby on Rails, which has a fairly typical
implementation, template expansion looks like this:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Post.find_by_sql \
  [ "SELECT * FROM posts WHERE author = ? AND created &amp;gt; ?",
    author_id, start_date ]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The question-marks-in-the-template solution is effective, but it&amp;#8217;s
also clunky, especially when you&amp;#8217;re trying to insert a lot of strings.
By comparison, Ruby&amp;#8217;s native string-interpolation feature, in which the syntax
&lt;code&gt;#{...}&lt;/code&gt; lets us inject strings into a string template, is
unsafe but much easier to follow:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;chunkiness = "extra chunky" 
"I love #{chunkiness} bacon!" 
# ==&amp;gt; "I love extra chunky bacon!" 
&lt;/code&gt;&lt;/pre&gt;
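
	&lt;p&gt;To see why plain interpolation is unsafe once the template is &lt;span class="caps"&gt;SQL&lt;/span&gt; rather than prose, consider this small sketch.  The query text and the author name are made up, and the hand-rolled apostrophe doubling is only there to show the idea; real code should rely on placeholders or the database driver&amp;#8217;s own quoting:&lt;/p&gt;

```ruby
# Hypothetical query text and input, invented for this sketch.
author = "O'Brien"

# Naive interpolation: the apostrophe in the name ends the SQL string
# literal early, so the rest of the name is read as SQL.
query = "SELECT * FROM posts WHERE author = '#{author}'"

# Escaping the text into the target language first (standard SQL doubles
# an apostrophe inside a string literal) keeps the value inside the literal.
escaped    = author.gsub("'", "''")
safe_query = "SELECT * FROM posts WHERE author = '#{escaped}'"

puts query
puts safe_query
```

	&lt;p&gt;The first query contains an odd number of apostrophes and is malformed at best; with hostile input in place of a surname, the same gap is an injection hole.&lt;/p&gt;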

	&lt;p&gt;In sum, the Hungarian-notation solution and the question-marks
solution are reasonable responses to the strings problem, but both are
clunky, especially when compared to the straightforwardness of
good-old string interpolation.&lt;/p&gt;


	&lt;p&gt;Perhaps we can do better.&lt;/p&gt;


	&lt;h3&gt;Eating and having one&amp;#8217;s cake: a type-based solution&lt;/h3&gt;


	&lt;p&gt;An ideal solution would combine the safety of the question-marks
solution with the straightforward convenience of string interpolation,
and it would work for all kinds of strings, not just &lt;span class="caps"&gt;SQL&lt;/span&gt;.  And, because
I&amp;#8217;m implementing it in Haskell, it would lovingly nestle into
Haskell&amp;#8217;s type system and gain the full benefits of type-inference
goodness.&lt;/p&gt;


	&lt;p&gt;How would it work?  Well, let&amp;#8217;s back up and think about strings for a
moment.  We can divide strings into two classes: (1) those that
represent text, in which every character represents literally itself;
and (2) those that represent fragments of interpreted languages, such
as &lt;span class="caps"&gt;XML&lt;/span&gt; or &lt;span class="caps"&gt;SQL&lt;/span&gt;, where each character&amp;#8217;s interpretation depends on the
rules of the associated language.  In text, for example, an ampersand
(&amp;#8220;&amp;#38;&amp;#8221;) represents an ampersand, but in &lt;span class="caps"&gt;XML&lt;/span&gt; an ampersand represents the
start of a character-entity reference.&lt;/p&gt;


	&lt;p&gt;It doesn&amp;#8217;t make sense, then, to join text strings directly with
language-fragment strings.  If you did join them, text characters
could be misinterpreted as language characters.  For the same reason,
it doesn&amp;#8217;t make sense to join fragments of different languages
together.  (It does make sense, however, to &lt;em&gt;escape&lt;/em&gt; text strings or
language fragments &amp;#8220;into&amp;#8221; a target language and &lt;em&gt;then&lt;/em&gt; join them with
strings in the target language.)&lt;/p&gt;


	&lt;p&gt;A sound solution, therefore, should enforce the following fundamental,
safe-string-handling rule: &lt;em&gt;Do not allow strings that represent
fragments of one language to be directly joined with strings that
represent either plain text or fragments of another language&lt;/em&gt;.&lt;/p&gt;


	&lt;p&gt;The trick is making the computer enforce this rule for us.  As
it turns out, modern type systems absolutely love to do this kind of thing.&lt;/p&gt;


	&lt;h3&gt;A solution to the strings problem in Haskell&lt;/h3&gt;


	&lt;p&gt;Making the computer enforce our safe-string-handling rule in Haskell
is fairly easy.  All it takes is a little code.
(As we go through the following code, remember that
we&amp;#8217;re writing a library.  Normally, as users of the library, we
would never see this code.)&lt;/p&gt;


	&lt;p&gt;To begin, we create a module for our code and export
the essential types and functions that make up our about-to-be-written
safe-string kernel:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;module&lt;/span&gt; &lt;span class='conid'&gt;SafeStrings&lt;/span&gt;
&lt;span class='layout'&gt;(&lt;/span&gt;
  &lt;span class='conid'&gt;Language&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='keyglyph'&gt;..&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='comment'&gt;-- we export the data type but not the constructors&lt;/span&gt;
&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;empty&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;frag&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;text&lt;/span&gt;
&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;cat&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;+++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;render&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;renders&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;lang&lt;/span&gt;
&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;q&lt;/span&gt;
&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;declareSafeString&lt;/span&gt;
&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='keyword'&gt;where&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;In order to create safe strings that correspond to particular
languages, we need to tell the computer what we mean by &lt;em&gt;Language&lt;/em&gt;:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;class&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;litfrag&lt;/span&gt;  &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;   &lt;span class='comment'&gt;-- String is a literal language fragment&lt;/span&gt;
    &lt;span class='varid'&gt;littext&lt;/span&gt;  &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;   &lt;span class='comment'&gt;-- String is literal text&lt;/span&gt;
    &lt;span class='varid'&gt;natrep&lt;/span&gt;   &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt;   &lt;span class='comment'&gt;-- Gets the native-language representation&lt;/span&gt;
    &lt;span class='varid'&gt;language&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt;   &lt;span class='comment'&gt;-- Gets the name of the language&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Here we&amp;#8217;re saying that &lt;em&gt;Language&lt;/em&gt; is the class of languages, i.e., all
data types &lt;em&gt;l&lt;/em&gt; for which we can provide four functions:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;&lt;em&gt;litfrag&lt;/em&gt; &amp;#8211; converts a string that represents a language fragment into a language fragment&lt;/li&gt;
		&lt;li&gt;&lt;em&gt;littext&lt;/em&gt; &amp;#8211; converts a string that represents plain text into a language fragment that represents the text (via escaping)&lt;/li&gt;
		&lt;li&gt;&lt;em&gt;natrep&lt;/em&gt; &amp;#8211;  converts a language fragment, verbatim, into a string that represents the language fragment&lt;/li&gt;
		&lt;li&gt;&lt;em&gt;language&lt;/em&gt; &amp;#8211; returns the name of the language associated with a given fragment&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;Further, we need to declare a few &amp;#8220;language laws&amp;#8221; that conforming
&lt;em&gt;Language&lt;/em&gt; types must obey.  These laws are for us.  They will keep us
honest when teaching the computer about new languages.  Here are the
two laws we will require language types to satisfy:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;&lt;em&gt;natrep&lt;/em&gt; (&lt;em&gt;litfrag&lt;/em&gt; &lt;em&gt;s&lt;/em&gt;) &lt;code&gt;==&lt;/code&gt; &lt;em&gt;s&lt;/em&gt;&lt;/li&gt;
		&lt;li&gt;&lt;em&gt;natrep&lt;/em&gt; (&lt;em&gt;littext&lt;/em&gt; &lt;em&gt;s&lt;/em&gt;) &lt;code&gt;==&lt;/code&gt; (&lt;em&gt;escape&lt;sub&gt;L&lt;/sub&gt;&lt;/em&gt; &lt;em&gt;s&lt;/em&gt;)&lt;/li&gt;
	&lt;/ul&gt;


	&lt;p&gt;The first law requires that (&lt;em&gt;natrep&lt;/em&gt;&amp;#160;.&amp;#160;&lt;em&gt;litfrag&lt;/em&gt;) be
equivalent to the identity function for strings.  The second law
requires that (&lt;em&gt;natrep&lt;/em&gt;&amp;#160;.&amp;#160;&lt;em&gt;littext&lt;/em&gt;) be equivalent to
the text-escaping function for a given language &lt;em&gt;L&lt;/em&gt;.  For example,
for the language &lt;span class="caps"&gt;XML&lt;/span&gt;:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;natrep (litfrag "&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;") ==&amp;gt; "&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;" 
natrep (littext "ham &amp;#38; eggs")    ==&amp;gt; "ham &amp;amp;amp; eggs" 
&lt;/code&gt;&lt;/pre&gt;
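
	&lt;p&gt;One way to keep ourselves honest is to write the two laws down as
executable checks.  The following is a minimal, self-contained sketch: it
repeats a cut-down version of the &lt;em&gt;Language&lt;/em&gt; class and uses a
throwaway type, &lt;em&gt;ToyXml&lt;/em&gt;, with a made-up escaping function,
&lt;em&gt;escapeToy&lt;/em&gt;.  Both names are assumptions for this example only:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;-- Cut-down copy of the Language class, just enough to state the laws.
class Language l where
    litfrag :: String -&amp;gt; l
    littext :: String -&amp;gt; l
    natrep  :: l -&amp;gt; String

-- ToyXml is a stand-in language type defined only for this sketch.
newtype ToyXml = ToyXml String

-- Hypothetical escaping function for the toy language.
escapeToy :: String -&amp;gt; String
escapeToy = concatMap esc
  where
    esc '&amp;lt;' = "&amp;amp;lt;"
    esc '&amp;amp;' = "&amp;amp;amp;"
    esc c   = [c]

instance Language ToyXml where
    litfrag           = ToyXml
    littext           = ToyXml . escapeToy
    natrep (ToyXml s) = s

-- Law 1: natrep . litfrag is the identity on strings.
-- Law 2: natrep . littext is the language's escaping function.
law1, law2 :: String -&amp;gt; Bool
law1 s = natrep (litfrag s :: ToyXml) == s
law2 s = natrep (littext s :: ToyXml) == escapeToy s

main :: IO ()
main = print (all law1 samples, all law2 samples)   -- prints: (True,True)
  where
    samples = ["", "ham &amp;amp; eggs", "&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;"]
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;For an instance this simple the checks hold by construction, but
the same predicates could be fed to a property-testing tool when an
instance does something less obvious.&lt;/p&gt;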

	&lt;p&gt;Next, let&amp;#8217;s construct a type-safe container for strings having
a known language:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;data&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;
    &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SSEmpty&lt;/span&gt;
    &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='conid'&gt;SSFragment&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;
    &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='conid'&gt;SSCat&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;This data-type definition says that if &lt;em&gt;l&lt;/em&gt; is a language, we
can construct &lt;em&gt;SafeString&lt;/em&gt; values for that language.  Each value can
represent an empty fragment of the language (via &lt;em&gt;SSEmpty&lt;/em&gt;), a
single fragment of the language (via &lt;em&gt;SSFragment&lt;/em&gt;), or the
concatenation of two other &lt;em&gt;SafeString&lt;/em&gt; values for the language
(via &lt;em&gt;SSCat&lt;/em&gt;).&lt;/p&gt;


	&lt;p&gt;Now comes the interesting part.  We are going to leverage the type
system to enforce the safe-string-handling rule for us.&lt;/p&gt;


	&lt;p&gt;We will do this using the &lt;em&gt;SafeString&lt;/em&gt; data type we just defined.
We have already placed the data type&amp;#8217;s definition into a module that
does &lt;em&gt;not&lt;/em&gt; export the type&amp;#8217;s data constructors.  That means we will not
be able to create &lt;em&gt;SafeString&lt;/em&gt; values for ourselves.  Instead, we must
ask a small set of kernel functions, which &lt;em&gt;are&lt;/em&gt; exported, to create the
values on our behalf.&lt;/p&gt;


	&lt;p&gt;These kernel functions, which we are about to write,
will create &lt;em&gt;SafeString&lt;/em&gt; values only in accordance with our
safe-string-handling rule.  In particular, they will require us
to &lt;em&gt;certify&lt;/em&gt; that an existing string represents either text or a language
fragment before creating a corresponding &lt;em&gt;SafeString&lt;/em&gt; value
for us.  From then on, the type system will know
which language the string is associated with and prevent us from
joining it to regular strings or to &lt;em&gt;SafeString&lt;/em&gt; values associated
with other languages.&lt;/p&gt;


	&lt;p&gt;Let&amp;#8217;s write these constructor functions now:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;empty&lt;/span&gt;      &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;
&lt;span class='varid'&gt;empty&lt;/span&gt;       &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SSEmpty&lt;/span&gt;

&lt;span class='varid'&gt;frag&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;text&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;
&lt;span class='varid'&gt;frag&lt;/span&gt; &lt;span class='varid'&gt;f&lt;/span&gt;      &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SSFragment&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;litfrag&lt;/span&gt; &lt;span class='varid'&gt;f&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='varid'&gt;text&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;      &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SSFragment&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;littext&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Here&amp;#8217;s what the functions do:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;&lt;em&gt;empty&lt;/em&gt; &amp;#8211; creates an empty &lt;em&gt;SafeString&lt;/em&gt; in the &lt;em&gt;Language l&lt;/em&gt;&lt;/li&gt;
		&lt;li&gt;&lt;em&gt;frag f&lt;/em&gt; &amp;#8211; takes a string that you certify as representing a fragment in the &lt;em&gt;Language l&lt;/em&gt; and returns a corresponding &lt;em&gt;SafeString&lt;/em&gt;&lt;/li&gt;
		&lt;li&gt;&lt;em&gt;text s&lt;/em&gt; &amp;#8211; takes a string that you certify as representing text and returns a corresponding &lt;em&gt;SafeString&lt;/em&gt; in the &lt;em&gt;Language l&lt;/em&gt;&lt;/li&gt;
	&lt;/ul&gt;


	&lt;p&gt;Once the kernel creates &lt;em&gt;SafeString&lt;/em&gt; values for us, we need some way
to combine them safely.  Thus we define the &lt;code&gt;(+++)&lt;/code&gt;
operator and the &lt;em&gt;cat&lt;/em&gt; function:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='comment'&gt;-- join two SafeStrings of the same language&lt;/span&gt;
&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;+++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;
&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;+++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SSCat&lt;/span&gt;

&lt;span class='comment'&gt;-- join a list of same-language SafeStrings&lt;/span&gt;
&lt;span class='varid'&gt;cat&lt;/span&gt;   &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;
&lt;span class='varid'&gt;cat&lt;/span&gt;    &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;foldr&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;+++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varid'&gt;empty&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Finally, we need a way to convert &lt;em&gt;SafeString&lt;/em&gt; values into normal
strings so that we can pass them through the boundaries of our
safe-string-protected code and into the outside world.  For this,
we write the &lt;em&gt;render&lt;/em&gt; function:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;render&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='str'&gt;""&lt;/span&gt;

&lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='conid'&gt;SSEmpty&lt;/span&gt;        &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;id&lt;/span&gt;
&lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;SSFragment&lt;/span&gt; &lt;span class='varid'&gt;a&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;natrep&lt;/span&gt; &lt;span class='varid'&gt;a&lt;/span&gt; &lt;span class='varop'&gt;++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;SSCat&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='varid'&gt;r&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;    &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='varid'&gt;r&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;(Don&amp;#8217;t worry about the &lt;em&gt;renders&lt;/em&gt; stuff.  It implements
a Haskell idiom for fast string concatenation.)&lt;/p&gt;
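

	&lt;p&gt;For the curious, here is a small standalone sketch of that idiom.
Instead of building strings directly, each piece is represented as a
function that prepends its text to whatever comes next (the Prelude calls
this type &lt;em&gt;ShowS&lt;/em&gt;), and pieces are joined with function composition.
The names &lt;em&gt;hello&lt;/em&gt;, &lt;em&gt;world&lt;/em&gt;, and &lt;em&gt;greeting&lt;/em&gt; are just for
illustration:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;-- From the Prelude: type ShowS = String -&amp;gt; String
-- Each value prepends its own text to the string it is given.
hello, world :: ShowS
hello = ("hello, " ++)
world = ("world" ++)

-- Composing the functions joins the pieces; applying the result to ""
-- produces the final string.
greeting :: String
greeting = (hello . world) ""

main :: IO ()
main = putStrLn greeting   -- prints: hello, world
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Because every &lt;code&gt;(++)&lt;/code&gt; ends up with a short piece on its
left, no piece is copied more than once, however the compositions
are nested.&lt;/p&gt;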


	&lt;p&gt;As a convenience, let&amp;#8217;s round out our kernel with a &lt;em&gt;Show&lt;/em&gt; instance
that tells Haskell how to format
&lt;em&gt;SafeString&lt;/em&gt; values for display.&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;instance&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt; &lt;span class='keyglyph'&gt;=&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Show&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='varid'&gt;l&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;showsPrec&lt;/span&gt; &lt;span class='keyword'&gt;_&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
        &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;lang&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='varop'&gt;++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='str'&gt;":\""&lt;/span&gt; &lt;span class='varop'&gt;++&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;renders&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='chr'&gt;'"'&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;

&lt;span class='varid'&gt;lang&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='keyword'&gt;let&lt;/span&gt; &lt;span class='conid'&gt;SSFragment&lt;/span&gt; &lt;span class='varid'&gt;e&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;ss&lt;/span&gt; &lt;span class='keyword'&gt;in&lt;/span&gt; &lt;span class='varid'&gt;language&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;undefined&lt;/span&gt; &lt;span class='varop'&gt;`asTypeOf`&lt;/span&gt; &lt;span class='varid'&gt;e&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
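
	&lt;p&gt;The &lt;em&gt;lang&lt;/em&gt; helper may look odd: it needs a value of type
&lt;em&gt;l&lt;/em&gt; to pass to &lt;em&gt;language&lt;/em&gt;, but a &lt;em&gt;SafeString&lt;/em&gt; might not
contain one.  The following standalone sketch shows the same trick in
miniature, assuming (as in our kernel) that the class method never
inspects its argument.  The names &lt;em&gt;Named&lt;/em&gt;, &lt;em&gt;Box&lt;/em&gt;, and
&lt;em&gt;boxName&lt;/em&gt; are invented for this example:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;class Named a where
    name :: a -&amp;gt; String   -- instances must ignore the argument

instance Named Bool where
    name _ = "bool"

data Box a = Empty | Full a

boxName :: Named a =&amp;gt; Box a -&amp;gt; String
boxName b =
    let Full e = b        -- lazy binding: e is never forced, so no match failure
    in  name (undefined `asTypeOf` e)   -- e only pins down the type

main :: IO ()
main = putStrLn (boxName (Empty :: Box Bool))   -- prints: bool
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The pattern binding exists purely to give the type checker a
variable of the right type; at run time neither &lt;em&gt;e&lt;/em&gt; nor
&lt;em&gt;undefined&lt;/em&gt; is ever evaluated.&lt;/p&gt;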

	&lt;p&gt;And that&amp;#8217;s our SafeStrings kernel.&lt;/p&gt;


	&lt;h3&gt;Another look at the SafeStrings kernel&lt;/h3&gt;


	&lt;p&gt;The following illustration, complete with poorly chosen colors, provides a
visual summary of our system:&lt;/p&gt;


&lt;p style="text-align: center"&gt;
&lt;img src="http://community.moertel.com/~thor/pix/20060908/safe-strings.png" title="Stunning visual interpretation of the SafeStrings kernel and its relationship to the evil outside world" alt="Stunning visual interpretation of the SafeStrings kernel and its relationship to the evil outside world" /&gt;
&lt;/p&gt;

	&lt;p&gt;(Don&amp;#8217;t worry about the &lt;code&gt;$(q ...)&lt;/code&gt; stuff for the
moment; we&amp;#8217;ll talk about it later.)&lt;/p&gt;


	&lt;p&gt;Activating our mad art-interpretation skillz, we can
now decipher the illustration:&lt;/p&gt;


	&lt;p&gt;&lt;em&gt;Regular strings gain &amp;#8220;admittance&amp;#8221; to the SafeStrings kernel only
via the &lt;/em&gt;text&lt;em&gt; and &lt;/em&gt;frag&lt;em&gt; certification functions, which
we use to create corresponding safe strings for a given language.
Once created, the safe strings live their entire lives in the
fleshy-colored, egg-shaped protective sac that is the kernel, whose
safe-string functions and operators use Haskell&amp;#8217;s type system to
prevent us from accidentally mixing the strings in unsafe
ways. Further, because the kernel does not export its underlying data
structures, we can&amp;#8217;t screw around with the innards of our safe strings to
break the kernel&amp;#8217;s promises.  When our safe strings have finally
reached their ultimate, beautiful state, we can &lt;/em&gt;render&lt;em&gt; them
into regular strings and pass them bravely into the cruel outside
world &amp;#8211; where, most likely, somebody else&amp;#8217;s broken code will screw
them up anyway.  But at least we tried.&lt;/em&gt;&lt;/p&gt;


	&lt;h3&gt;Our first SafeString module: SafeXml&lt;/h3&gt;


	&lt;p&gt;Now that we have written our SafeStrings kernel, let&amp;#8217;s use it to
create a SafeXml module that we can use for working with &lt;span class="caps"&gt;XML&lt;/span&gt;.
Again, we will be writing library code that under normal
circumstances would be hidden from view.&lt;/p&gt;


	&lt;p&gt;First, we will create a new module that uses the SafeStrings kernel:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;module&lt;/span&gt; &lt;span class='conid'&gt;SafeXml&lt;/span&gt;
&lt;span class='layout'&gt;(&lt;/span&gt; &lt;span class='conid'&gt;Xml&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;xml&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;renderXml&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='keyword'&gt;module&lt;/span&gt; &lt;span class='conid'&gt;SafeStrings&lt;/span&gt; &lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='keyword'&gt;where&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;SafeStrings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Next, we will create a wrapper type to testify
that a string represents a fragment of &lt;span class="caps"&gt;XML&lt;/span&gt;:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;newtype&lt;/span&gt; &lt;span class='conid'&gt;XmlString&lt;/span&gt;
    &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;XmlString&lt;/span&gt; &lt;span class='layout'&gt;{&lt;/span&gt; &lt;span class='varid'&gt;unXmlString&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='layout'&gt;}&lt;/span&gt;
    &lt;span class='keyword'&gt;deriving&lt;/span&gt; &lt;span class='conid'&gt;Show&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;If you go back and look at the export list for the module, you&amp;#8217;ll see
that the &lt;em&gt;XmlString&lt;/em&gt; data type is not exported.  It is internal to the
module, and thus we, as clients of the module, can&amp;#8217;t create values of
that type.  That means we can&amp;#8217;t &amp;#8220;forge&amp;#8221; &lt;span class="caps"&gt;XML&lt;/span&gt; strings into existence.
We can create them only through the safe-string kernel, and even then
only by certifying a regular string as representing text or a language
fragment.  (The kernel, in turn, will create the needed values through
the &lt;em&gt;Language&lt;/em&gt; interface, which we now discuss.)&lt;/p&gt;


	&lt;p&gt;Like all good language types, &lt;em&gt;XmlString&lt;/em&gt; needs to be a member of the
&lt;em&gt;Language&lt;/em&gt; type class, so we provide the necessary instance functions:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;instance&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt; &lt;span class='conid'&gt;XmlString&lt;/span&gt; &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;litfrag&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;XmlString&lt;/span&gt;
    &lt;span class='varid'&gt;littext&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;XmlString&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;escapeXml&lt;/span&gt;
    &lt;span class='varid'&gt;natrep&lt;/span&gt;   &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;unXmlString&lt;/span&gt;
    &lt;span class='varid'&gt;language&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;const&lt;/span&gt; &lt;span class='str'&gt;"xml"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Note that the functions satisfy the language laws
we defined earlier.  (The proof follows immediately from the definitions
of &lt;em&gt;XmlString&lt;/em&gt;, &lt;em&gt;unXmlString&lt;/em&gt;, and &lt;em&gt;escapeXml&lt;/em&gt;.)&lt;/p&gt;


	&lt;p&gt;Next, we need to write a function to implement the escaping
rule for &lt;span class="caps"&gt;XML&lt;/span&gt;:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;escapeXml&lt;/span&gt; &lt;span class='varid'&gt;xs&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='varid'&gt;concatMap&lt;/span&gt; &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='varid'&gt;xs&lt;/span&gt;
  &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='chr'&gt;'&amp;lt;'&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"&amp;amp;lt;"&lt;/span&gt;
    &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='chr'&gt;'&amp;amp;'&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"&amp;amp;amp;"&lt;/span&gt;
    &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='chr'&gt;'"'&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"&amp;amp;#34;"&lt;/span&gt;
    &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='chr'&gt;'\''&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"&amp;amp;#39;"&lt;/span&gt;
    &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt;    &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='varid'&gt;x&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Next, because we expect to work with &lt;span class="caps"&gt;XML&lt;/span&gt; frequently, we will create a
convenient type synonym, &lt;em&gt;Xml&lt;/em&gt;, for &lt;em&gt;SafeString&lt;/em&gt; values that represent
&lt;span class="caps"&gt;XML&lt;/span&gt;:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;type&lt;/span&gt; &lt;span class='conid'&gt;Xml&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SafeString&lt;/span&gt; &lt;span class='conid'&gt;XmlString&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Finally, we will define
a few convenience functions to create and render &lt;span class="caps"&gt;XML&lt;/span&gt; fragments.  These
functions are the SafeStrings kernel&amp;#8217;s &lt;em&gt;frag&lt;/em&gt; and &lt;em&gt;render&lt;/em&gt;
functions specialized to the &lt;em&gt;Xml&lt;/em&gt; type.  When we use these
functions, we won&amp;#8217;t need to provide additional type annotations; the
computer will know we are dealing with &lt;span class="caps"&gt;XML&lt;/span&gt; strings:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;xml&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Xml&lt;/span&gt;
&lt;span class='varid'&gt;xml&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;frag&lt;/span&gt;

&lt;span class='varid'&gt;renderXml&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Xml&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt;
&lt;span class='varid'&gt;renderXml&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;render&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;And we&amp;#8217;re done.&lt;/p&gt;


	&lt;p&gt;Before going on, let me point out two things:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;If you think the code we have written so far is long or perhaps confusing, please remember that it is &lt;em&gt;library code&lt;/em&gt;.  Typically, you would never see it.  All you would do is &lt;code&gt;import SafeXml&lt;/code&gt; and start using the library.&lt;/li&gt;
		&lt;li&gt;The SafeXml implementation is formulaic, and we can replace all of it except for the escaping function&amp;#8217;s definition with a single line of code, something we will do later.&lt;/li&gt;
	&lt;/ol&gt;


	&lt;h3&gt;A quick test drive of our SafeXml module&lt;/h3&gt;


	&lt;p&gt;Let&amp;#8217;s give our SafeXml module a spin in the &lt;span class="caps"&gt;GHC&lt;/span&gt; interactive shell.&lt;/p&gt;


	&lt;p&gt;We can create an &lt;span class="caps"&gt;XML&lt;/span&gt; fragment by certifying that a regular string
represents a language fragment (via the &lt;em&gt;frag&lt;/em&gt; function) and telling
Haskell that we expect a result of type &lt;em&gt;Xml&lt;/em&gt;.&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;Ok, modules loaded: SafeXml, SafeStrings.
*SafeXml&amp;gt; frag "&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;" :: Xml
xml:"&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;" 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Note how the output is prefixed with the label &amp;#8220;xml:&amp;#8221; 
to tell us that our kernel certifies this value to represent an &lt;span class="caps"&gt;XML&lt;/span&gt; fragment.&lt;/p&gt;


	&lt;p&gt;Because entering type annotations can be inconvenient, we can instead
use the &lt;em&gt;xml&lt;/em&gt; function, which certifies a string not just as a
fragment but as an &lt;span class="caps"&gt;XML&lt;/span&gt; fragment:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;*SafeXml&amp;gt; xml "&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;" 
xml:"&amp;lt;em&amp;gt;wow!&amp;lt;/em&amp;gt;" 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;If we want to represent text in &lt;span class="caps"&gt;XML&lt;/span&gt;, the kernel will automatically
escape it for us:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;*SafeXml&amp;gt; text "ham &amp;#38; eggs" :: Xml
xml:"ham &amp;amp;amp; eggs" 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Now let&amp;#8217;s try to do something naughty.  Will the type system
let us?&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;*SafeXml&amp;gt; let someXml = xml "&amp;lt;em&amp;gt;Hi!&amp;lt;/em&amp;gt;" 
*SafeXml&amp;gt; let plainOldText = "ham &amp;#38; eggs" 
*SafeXml&amp;gt; someXml ++ plainOldText

&amp;lt;interactive&amp;gt;:1:0:
    Couldn't match `[a]' against `Xml'
      Expected type: [a]
      Inferred type: Xml
    In the first argument of `(++)', namely `someXml'
    In the definition of `it': it = someXml ++ plainOldText
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;In Haskell, the &lt;code&gt;(++)&lt;/code&gt; operator is used (among
other things) to join strings.  In the code above, we tried
to use this operator to join an &lt;span class="caps"&gt;XML&lt;/span&gt; fragment to a plain-old
string, which would have violated our safe-string-handling rule.
Fortunately, we were unable to fool the type system into
allowing this ill-conceived union to occur.  Note that our
mistake was caught at compile time, before the code was
ever converted into executable form.&lt;/p&gt;


	&lt;p&gt;Perhaps we can persuade our newly defined &lt;code&gt;(+++)&lt;/code&gt;
operator to make the union:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;*SafeXml&amp;gt; someXml +++ plainOldText

&amp;lt;interactive&amp;gt;:1:12:
    Couldn't match `SafeString XmlString' against `[Char]'
      Expected type: SafeString XmlString
      Inferred type: [Char]
    In the second argument of `(+++)', namely `plainOldText'
    In the definition of `it': it = someXml +++ plainOldText
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Again, the type system has prevented us from doing something
naughty.  If, however, we certify that the plain-old string represents
text, we can make a safe union:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;*SafeXml&amp;gt; someXml +++ text plainOldText
xml:"&amp;lt;em&amp;gt;Hi!&amp;lt;/em&amp;gt;ham &amp;amp;amp; eggs" 
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;Syntactic sugar for safe strings&lt;/h3&gt;


	&lt;p&gt;Not having to worry about the strings problem anymore is fabulous and
all, but having to type in &lt;em&gt;frag&lt;/em&gt;, &lt;em&gt;text&lt;/em&gt;, and &lt;code&gt;+++&lt;/code&gt; is
kind of clunky.  Let&amp;#8217;s get rid of the clunkiness by introducing some
syntactic sugar.&lt;/p&gt;


&lt;p&gt;In web applications, the most common use of strings is in
templates.  For example, here&amp;#8217;s a simplified version of the
&lt;code&gt;link_to&lt;/code&gt; method from the deservedly popular &lt;a href="http://www.rubyonrails.com/"&gt;Ruby on
Rails&lt;/a&gt;.  The method wraps a hypertext link
around some content by &amp;#8220;interpolating&amp;#8221; the content and a &lt;span class="caps"&gt;URL&lt;/span&gt;
into a link template:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# NOTE: this example is in Ruby

def link_to(content_xhtml, url)
  "&amp;lt;a href=\"#{h url}\"&amp;gt;#{content_xhtml}&amp;lt;/a&amp;gt;" 
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;In this code, we need to &lt;span class="caps"&gt;HTML&lt;/span&gt;-escape the &lt;span class="caps"&gt;URL&lt;/span&gt; (via the &lt;code&gt;h&lt;/code&gt;
helper) before interpolating it
into the template.  We do not need to escape the content, however,
because it is already in the template&amp;#8217;s language, &lt;span class="caps"&gt;XHTML&lt;/span&gt;.&lt;/p&gt;


	&lt;p&gt;Now, to introduce our syntactic sugar, here&amp;#8217;s &lt;code&gt;link_to&lt;/code&gt;
rewritten in Haskell and using safe strings:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='comment'&gt;-- Haskell code&lt;/span&gt;

&lt;span class='varid'&gt;link_to&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Url&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt;
&lt;span class='varid'&gt;link_to&lt;/span&gt; &lt;span class='varid'&gt;content&lt;/span&gt; &lt;span class='varid'&gt;url&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"&amp;lt;a href=\"#{r url}\"&amp;gt;#{=content}&amp;lt;/a&amp;gt;"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;The type signature makes clear to everybody that the &lt;em&gt;content&lt;/em&gt;
parameter is &lt;span class="caps"&gt;XHTML&lt;/span&gt;, the &lt;em&gt;url&lt;/em&gt; parameter is a &lt;span class="caps"&gt;URL&lt;/span&gt;, and the result is
&lt;span class="caps"&gt;XHTML&lt;/span&gt;.  The signature isn&amp;#8217;t needed, but &lt;code&gt;link_to&lt;/code&gt; is the
stuff of libraries, and so annotations are good form.&lt;/p&gt;


	&lt;p&gt;The interpolation syntax is like Ruby&amp;#8217;s, but with
slightly different modifiers:&lt;/p&gt;


	&lt;ul&gt;
	&lt;li&gt;The template-quoting syntax is &lt;code&gt;$(q "this is a template")&lt;/code&gt;.  (Mnemonic: &lt;code&gt;q&lt;/code&gt; for quote).&lt;/li&gt;
		&lt;li&gt;Within a template, we can interpolate variables using the familiar &lt;code&gt;#{var}&lt;/code&gt; syntax.&lt;/li&gt;
		&lt;li&gt;If an interpolated variable holds a plain string, it will be escaped into the template automatically.&lt;/li&gt;
		&lt;li&gt;If an interpolated variable holds a safe string, we must use an &lt;em&gt;interpolation modifier&lt;/em&gt; to specify how it should be interpolated (to avoid ambiguity):
	&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;#{r var}&lt;/code&gt; renders the safe string in &lt;em&gt;var&lt;/em&gt; into text, and then interpolates the text into the template, escaping as necessary (mnemonic: &lt;code&gt;r&lt;/code&gt; for &lt;em&gt;render&lt;/em&gt;).&lt;/li&gt;
		&lt;li&gt;&lt;code&gt;#{= var}&lt;/code&gt; inserts the safe string in &lt;em&gt;var&lt;/em&gt; directly into the template, which must be of the same language (mnemonic: &lt;code&gt;=&lt;/code&gt; for &lt;em&gt;equal language types&lt;/em&gt;).&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
		&lt;li&gt;As a bonus, &lt;code&gt;#{s var}&lt;/code&gt; interpolates any &lt;em&gt;Show&lt;/em&gt;-able value in &lt;em&gt;var&lt;/em&gt; into the template as text, escaping as necessary.&lt;/li&gt;
	&lt;/ul&gt;


	&lt;p&gt;It&amp;#8217;s pretty easy to tell which interpolation option is right for any
situation, but late-night coding sessions make fools of us all.
That&amp;#8217;s why the type system is there to catch us when we make a dumb mistake.&lt;/p&gt;


	&lt;p&gt;Let&amp;#8217;s try out the sugary &lt;code&gt;link_to&lt;/code&gt; function:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;&amp;gt; link_to (text "Tom's Weblog") (url "http://blog.moertel.com/")
xml:"&amp;lt;a href="http://blog.moertel.com/"&amp;gt;Tom's Weblog&amp;lt;/a&amp;gt;" 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Let&amp;#8217;s take advantage of type inference in the next example:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;&amp;gt; link_to $(q "&amp;lt;em&amp;gt;Espresso!&amp;lt;/em&amp;gt;")
          $(q "http://google.com/search?q=espresso&amp;#38;oe=utf-8")

xml:"&amp;lt;a href="http://google.com/search?q=espresso&amp;amp;amp;oe=utf-8"&amp;gt;
     &amp;lt;em&amp;gt;Espresso!&amp;lt;/em&amp;gt;&amp;lt;/a&amp;gt;" 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;In the above example, we supplied templates as input parameters.
Haskell figured out their types and took care of the escaping (or not
escaping) for us.&lt;/p&gt;


	&lt;p&gt;Now that we know what the syntactic sugar looks like, let&amp;#8217;s
see how to implement it.&lt;/p&gt;


	&lt;h3&gt;Implementing the syntactic sugar using Template Haskell&lt;/h3&gt;


	&lt;p&gt;We implement the SafeString library&amp;#8217;s syntactic sugar using Template
Haskell.  A small function &lt;code&gt;q&lt;/code&gt; (for &amp;#8220;quote&amp;#8221;) parses the
sugared syntax at compile time and emits equivalent code using our
safe-string functions &lt;code&gt;frag&lt;/code&gt;, &lt;code&gt;text&lt;/code&gt;, and so on.
For example, the following sugar:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"&amp;lt;em&amp;gt;#{mystr}&amp;lt;/em&amp;gt;"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;becomes the following code:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;cat&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='varid'&gt;frag&lt;/span&gt; &lt;span class='str'&gt;"&amp;lt;em&amp;gt;"&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;text&lt;/span&gt; &lt;span class='varid'&gt;mystr&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;frag&lt;/span&gt; &lt;span class='str'&gt;"&amp;lt;/em&amp;gt;"&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
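
	&lt;p&gt;To make the translation concrete, here is a hand-expanded version of
the earlier &lt;code&gt;link_to&lt;/code&gt; template, written without Template Haskell.
So that it loads on its own, it comes with a tiny stand-in kernel.  The
&lt;em&gt;Safe&lt;/em&gt; type and its phantom tags &lt;em&gt;XmlLang&lt;/em&gt; and &lt;em&gt;UrlLang&lt;/em&gt;
are assumptions made up for this sketch, not the real SafeStrings definitions,
and only &lt;span class="caps"&gt;XML&lt;/span&gt; gets a &lt;code&gt;text&lt;/code&gt; function here:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;-- Haskell sketch.  The kernel below is a hypothetical stand-in for
-- SafeStrings: just enough to run the hand-expanded template.

newtype Safe lang = Safe String

data XmlLang  -- phantom tag: the string is XML
data UrlLang  -- phantom tag: the string is a URL

type Xml = Safe XmlLang
type Url = Safe UrlLang

-- Certify a string as a fragment of the target language.
frag :: String -&amp;gt; Safe lang
frag = Safe

-- Recover the underlying string.
render :: Safe lang -&amp;gt; String
render (Safe s) = s

-- Join fragments of a single language.
cat :: [Safe lang] -&amp;gt; Safe lang
cat = Safe . concatMap render

-- Escape plain text into XML.
text :: String -&amp;gt; Xml
text = Safe . concatMap esc
  where
    esc '&amp;amp;' = "&amp;amp;amp;"
    esc '&amp;lt;' = "&amp;amp;lt;"
    esc '&amp;gt;' = "&amp;amp;gt;"
    esc '"' = "&amp;amp;quot;"
    esc c   = [c]

-- By hand: $(q "&amp;lt;a href=\"#{r url}\"&amp;gt;#{=content}&amp;lt;/a&amp;gt;")
link_to :: Xml -&amp;gt; Url -&amp;gt; Xml
link_to content url =
    cat [ frag "&amp;lt;a href=\""
        , text (render url)  -- #{r url}: render, then escape as text
        , frag "\"&amp;gt;"
        , content            -- #{=content}: same language, inserted as is
        , frag "&amp;lt;/a&amp;gt;"
        ]

main :: IO ()
main = putStrLn . render $
    link_to (frag "&amp;lt;em&amp;gt;Espresso!&amp;lt;/em&amp;gt;")
            (frag "http://google.com/search?q=espresso&amp;amp;oe=utf-8")
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Running it prints the Espresso link on one line, with the ampersand in
the &lt;span class="caps"&gt;URL&lt;/span&gt; escaped as &lt;code&gt;&amp;amp;amp;&lt;/code&gt; and the content left untouched.
This is the kind of translation that &lt;code&gt;q&lt;/code&gt; automates.&lt;/p&gt;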

	&lt;p&gt;The code that makes it happen is fairly straightforward if you know
Template Haskell, so I&amp;#8217;ll skip the explanation because this article
is already way too long.  As usual, it&amp;#8217;s library code, so normally we
wouldn&amp;#8217;t see or care about it.  All we care about is the &lt;code&gt;$(q
"...")&lt;/code&gt; sugar that the code makes available to us.&lt;/p&gt;


	&lt;p&gt;Here it is:&lt;/p&gt;


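&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;Control&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;Monad&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;liftM&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;Language&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;Haskell&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;TH&lt;/span&gt;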
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='varid'&gt;qualified&lt;/span&gt; &lt;span class='conid'&gt;Text&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;ParserCombinators&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;ReadP&lt;/span&gt; &lt;span class='keyword'&gt;as&lt;/span&gt; &lt;span class='conid'&gt;P&lt;/span&gt;

&lt;span class='comment'&gt;-- Convert template sugar into calls to frag, text, cat, etc.&lt;/span&gt;
&lt;span class='comment'&gt;-- This function is exported by the SafeStrings module.&lt;/span&gt;

&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='varid'&gt;spec&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;cat&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;parts&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
  &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;parts&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyword'&gt;case&lt;/span&gt; &lt;span class='varid'&gt;xparse&lt;/span&gt; &lt;span class='varid'&gt;spec&lt;/span&gt; &lt;span class='keyword'&gt;of&lt;/span&gt;
        &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;   &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='varid'&gt;error&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='str'&gt;"bad template: "&lt;/span&gt; &lt;span class='varop'&gt;++&lt;/span&gt; &lt;span class='varid'&gt;show&lt;/span&gt; &lt;span class='varid'&gt;spec&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
        &lt;span class='varid'&gt;ps&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='keyword'&gt;_&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='varid'&gt;foldr&lt;/span&gt; &lt;span class='varid'&gt;gen&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt; &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt; &lt;span class='varid'&gt;ps&lt;/span&gt;
    &lt;span class='varid'&gt;gen&lt;/span&gt; &lt;span class='varid'&gt;p&lt;/span&gt; &lt;span class='varid'&gt;ps'&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='keyglyph'&gt;\&lt;/span&gt;&lt;span class='varid'&gt;p'&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='varid'&gt;p'&lt;/span&gt; &lt;span class='conop'&gt;:&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='varid'&gt;ps'&lt;/span&gt; &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='keyword'&gt;case&lt;/span&gt; &lt;span class='varid'&gt;p&lt;/span&gt; &lt;span class='keyword'&gt;of&lt;/span&gt;
        &lt;span class='conid'&gt;SFrag&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;  &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;frag&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;litE&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;stringL&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;         &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
        &lt;span class='conid'&gt;SIFrag&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;varE&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;mkName&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;               &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
        &lt;span class='conid'&gt;SIShow&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;text&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;show&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;varE&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;mkName&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;   &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
        &lt;span class='conid'&gt;SITxt&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;  &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;text&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;varE&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;mkName&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;          &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;
        &lt;span class='conid'&gt;SIRTxt&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;text&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;render&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;varE&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;mkName&lt;/span&gt; &lt;span class='varid'&gt;s&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;

&lt;span class='comment'&gt;-- AST for template-specification parts&lt;/span&gt;

&lt;span class='keyword'&gt;data&lt;/span&gt; &lt;span class='conid'&gt;SpecPart&lt;/span&gt;
    &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;SFrag&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt;  &lt;span class='comment'&gt;-- ^ language fragment&lt;/span&gt;
    &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='conid'&gt;SIFrag&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='comment'&gt;-- ^ insert fragment by variable reference&lt;/span&gt;
    &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='conid'&gt;SIShow&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='comment'&gt;-- ^ insert rendered variable via show&lt;/span&gt;
    &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='conid'&gt;SITxt&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt;  &lt;span class='comment'&gt;-- ^ insert literal text variable&lt;/span&gt;
    &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='conid'&gt;SIRTxt&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='comment'&gt;-- ^ insert rendered safe string var as text&lt;/span&gt;
  &lt;span class='keyword'&gt;deriving&lt;/span&gt; &lt;span class='conid'&gt;Show&lt;/span&gt;

&lt;span class='comment'&gt;-- Parse a template specification&lt;/span&gt;

&lt;span class='varid'&gt;xparse&lt;/span&gt; &lt;span class='varid'&gt;spec&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
    &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;result&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='str'&gt;""&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;&amp;lt;-&lt;/span&gt; &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;readP_to_S&lt;/span&gt; &lt;span class='varid'&gt;templateP&lt;/span&gt; &lt;span class='varid'&gt;spec&lt;/span&gt;
    &lt;span class='varid'&gt;return&lt;/span&gt; &lt;span class='varid'&gt;result&lt;/span&gt;
 &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;templateP&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
        &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;many&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;liftM&lt;/span&gt; &lt;span class='conid'&gt;SFrag&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;munch1&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;/=&lt;/span&gt; &lt;span class='chr'&gt;'#'&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&amp;lt;++&lt;/span&gt;
                &lt;span class='varid'&gt;interpolationP&lt;/span&gt; &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&amp;lt;++&lt;/span&gt;
                &lt;span class='varid'&gt;liftM&lt;/span&gt; &lt;span class='conid'&gt;SFrag&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;string&lt;/span&gt; &lt;span class='str'&gt;"#"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;

    &lt;span class='varid'&gt;interpolationP&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
        &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;string&lt;/span&gt; &lt;span class='str'&gt;"#{"&lt;/span&gt;
        &lt;span class='varid'&gt;spec&lt;/span&gt; &lt;span class='keyglyph'&gt;&amp;lt;-&lt;/span&gt; &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;manyTill&lt;/span&gt; &lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;get&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;P&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='varid'&gt;char&lt;/span&gt; &lt;span class='chr'&gt;'}'&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
        &lt;span class='varid'&gt;return&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='keyword'&gt;case&lt;/span&gt; &lt;span class='varid'&gt;spec&lt;/span&gt; &lt;span class='keyword'&gt;of&lt;/span&gt;
          &lt;span class='chr'&gt;'r'&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='chr'&gt;' '&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='varid'&gt;var&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SIRTxt&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;strip&lt;/span&gt; &lt;span class='varid'&gt;var&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
          &lt;span class='chr'&gt;'s'&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='chr'&gt;' '&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='varid'&gt;var&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SIShow&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;strip&lt;/span&gt; &lt;span class='varid'&gt;var&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
          &lt;span class='chr'&gt;'='&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='varid'&gt;var&lt;/span&gt;     &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SIFrag&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;strip&lt;/span&gt; &lt;span class='varid'&gt;var&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
          &lt;span class='varid'&gt;var&lt;/span&gt;         &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;SITxt&lt;/span&gt;  &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;strip&lt;/span&gt; &lt;span class='varid'&gt;var&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;

&lt;span class='varid'&gt;strip&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;frontAndBack&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;dropWhile&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;==&lt;/span&gt; &lt;span class='chr'&gt;' '&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='varid'&gt;frontAndBack&lt;/span&gt; &lt;span class='varid'&gt;f&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;reverse&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;f&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;reverse&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;f&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
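
	&lt;p&gt;A quick way to see what the parser hands to &lt;code&gt;gen&lt;/code&gt; is to run it
outside Template Haskell.  The sketch below is a condensed copy of
&lt;code&gt;xparse&lt;/code&gt; plus two helpers, &lt;code&gt;describe&lt;/code&gt; and
&lt;code&gt;expansion&lt;/code&gt;, which are made up for this illustration.  Instead of
splicing code, they print the text of the expression that each template part
would turn into:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;import Data.List (intercalate)
import qualified Text.ParserCombinators.ReadP as P

-- Same constructors as SpecPart above (Eq added for comparisons).
data SpecPart
    = SFrag String | SIFrag String | SIShow String
    | SITxt String | SIRTxt String
  deriving (Show, Eq)

-- Condensed copy of xparse, so this file loads on its own.
xparse :: String -&amp;gt; [[SpecPart]]
xparse spec = [ps | (ps, "") &amp;lt;- P.readP_to_S templateP spec]
  where
    templateP = P.many (fmap SFrag (P.munch1 (/= '#'))
                        P.&amp;lt;++ interpolationP
                        P.&amp;lt;++ fmap SFrag (P.string "#"))
    interpolationP = do
        _ &amp;lt;- P.string "#{"
        body &amp;lt;- P.manyTill P.get (P.char '}')
        return $ case body of
            'r':' ':var -&amp;gt; SIRTxt (strip var)
            's':' ':var -&amp;gt; SIShow (strip var)
            '=':var     -&amp;gt; SIFrag (strip var)
            var         -&amp;gt; SITxt  (strip var)
    strip = reverse . dropSpaces . reverse . dropSpaces
    dropSpaces = dropWhile (== ' ')

-- Hypothetical helper: the text of the expression gen builds for one part.
describe :: SpecPart -&amp;gt; String
describe (SFrag s)  = "frag " ++ show s
describe (SIFrag v) = v
describe (SIShow v) = "text (show " ++ v ++ ")"
describe (SITxt v)  = "text " ++ v
describe (SIRTxt v) = "text (render " ++ v ++ ")"

-- Hypothetical helper: the text of the whole expansion of a template.
expansion :: String -&amp;gt; String
expansion spec = case xparse spec of
    []     -&amp;gt; error ("bad template: " ++ show spec)
    ps : _ -&amp;gt; "cat [" ++ intercalate ", " (map describe ps) ++ "]"

main :: IO ()
main = do
    print (xparse "&amp;lt;em&amp;gt;#{mystr}&amp;lt;/em&amp;gt;")
    putStrLn (expansion "Total: #{s count} items for #{name}")
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Running it prints the parsed parts of the first template and the expansion of the second:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;[[SFrag "&amp;lt;em&amp;gt;",SITxt "mystr",SFrag "&amp;lt;/em&amp;gt;"]]
cat [frag "Total: ", text (show count), frag " items for ", text name]
&lt;/code&gt;&lt;/pre&gt;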

	&lt;h3&gt;More sugar: defining additional safe-string types&lt;/h3&gt;


	&lt;p&gt;One additional bit of Template Haskell code, which I won&amp;#8217;t reprint
here, defines &lt;em&gt;declareSafeString&lt;/em&gt;.  This function lets us eliminate
the boilerplate code when defining new safe-string types.  For
example, compare our earlier definition of the SafeXml module with the
following implementation of a module for safe &lt;span class="caps"&gt;URL&lt;/span&gt; strings:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;module&lt;/span&gt; &lt;span class='conid'&gt;SafeUrl&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;Url&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;url&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;renderUrl&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='keyword'&gt;module&lt;/span&gt; &lt;span class='conid'&gt;SafeStrings&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyword'&gt;where&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;SafeStrings&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;Text&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;Printf&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;Data&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;Char&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;ord&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;

&lt;span class='varid'&gt;escapeUrl&lt;/span&gt; &lt;span class='varid'&gt;xs&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='varid'&gt;concatMap&lt;/span&gt; &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='varid'&gt;xs&lt;/span&gt;
  &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;esc&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt; &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;isReserved&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt; &lt;span class='varop'&gt;||&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt; &lt;span class='varop'&gt;&amp;gt;&lt;/span&gt; &lt;span class='chr'&gt;'~'&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;urlEncode&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt;
          &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt; &lt;span class='varop'&gt;==&lt;/span&gt; &lt;span class='chr'&gt;' '&lt;/span&gt;                &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='str'&gt;"+"&lt;/span&gt;
          &lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;otherwise&lt;/span&gt;               &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='varid'&gt;x&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;

&lt;span class='varid'&gt;urlEncode&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt;  &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='chr'&gt;'%'&lt;/span&gt; &lt;span class='conop'&gt;:&lt;/span&gt; &lt;span class='varid'&gt;printf&lt;/span&gt; &lt;span class='str'&gt;"%02x"&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;ord&lt;/span&gt; &lt;span class='varid'&gt;x&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='varid'&gt;isReserved&lt;/span&gt;   &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;`elem`&lt;/span&gt; &lt;span class='str'&gt;"!#$%&amp;amp;'()*+,/:;=?@[]"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;

&lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;declareSafeString&lt;/span&gt; &lt;span class='str'&gt;"url"&lt;/span&gt; &lt;span class='str'&gt;"Url"&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;&lt;span class='keyglyph'&gt;|&lt;/span&gt; &lt;span class='varid'&gt;escapeUrl&lt;/span&gt; &lt;span class='keyglyph'&gt;|&lt;/span&gt;&lt;span class='keyglyph'&gt;]&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;The final line generates the boilerplate code for the wrapper type,
the language definition, the &lt;em&gt;Url&lt;/em&gt; type synonym, and the &lt;em&gt;url&lt;/em&gt; and
&lt;em&gt;renderUrl&lt;/em&gt; language-specific convenience functions.&lt;/p&gt;
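
	&lt;p&gt;To see what the escaping function does to a plain string, here is a
small standalone sketch of the same rules.  The name
&lt;code&gt;percentEscape&lt;/code&gt; is made up for this example, and it treats
&lt;code&gt;%&lt;/code&gt; as reserved (an assumption) so that a literal percent sign is
itself encoded:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;import Data.Char (ord)
import Text.Printf (printf)

-- Sketch of the escapeUrl rules under a hypothetical name.
-- Assumption: '%' counts as reserved, so it cannot be mistaken
-- for the start of an escape sequence.
percentEscape :: String -&amp;gt; String
percentEscape = concatMap esc
  where
    esc x | x `elem` reserved || x &amp;gt; '~' = '%' : printf "%02x" (ord x)
          | x == ' '                     = "+"
          | otherwise                    = [x]
    reserved = "!#$%&amp;amp;'()*+,/:;=?@[]"

main :: IO ()
main = do
    putStrLn (percentEscape "ham &amp;amp; eggs")
    putStrLn (percentEscape "100% espresso")
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Spaces become plus signs and reserved characters become two-digit hex escapes:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;ham+%26+eggs
100%25+espresso
&lt;/code&gt;&lt;/pre&gt;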


	&lt;h3&gt;One big example to wrap things up&lt;/h3&gt;


	&lt;p&gt;Because we have been discussing mainly library code, let&amp;#8217;s take a step
back and see some typical user-level code that uses safe strings.
After all, that&amp;#8217;s what counts.&lt;/p&gt;


	&lt;p&gt;Here is a Haskellized, safe-strings version of the Ruby (on Rails)
code that I presented at the beginning of the article to add
submit-to-Reddit and submit-to-del.icio.us buttons to my blog:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='keyword'&gt;module&lt;/span&gt; &lt;span class='conid'&gt;Example&lt;/span&gt; &lt;span class='keyword'&gt;where&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;Data&lt;/span&gt;&lt;span class='varop'&gt;.&lt;/span&gt;&lt;span class='conid'&gt;List&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;intersperse&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;break&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;SafeXml&lt;/span&gt;
&lt;span class='keyword'&gt;import&lt;/span&gt; &lt;span class='conid'&gt;SafeUrl&lt;/span&gt;

&lt;span class='keyword'&gt;type&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;Xml&lt;/span&gt;

&lt;span class='varid'&gt;submit_this_article_links&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Article&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt;
&lt;span class='varid'&gt;submit_this_article_links&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='conid'&gt;Article&lt;/span&gt; &lt;span class='varid'&gt;title&lt;/span&gt; &lt;span class='varid'&gt;url&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='varid'&gt;cat&lt;/span&gt; &lt;span class='varop'&gt;.&lt;/span&gt; &lt;span class='varid'&gt;intersperse&lt;/span&gt; &lt;span class='varid'&gt;nbsp&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt; &lt;span class='keyword'&gt;do&lt;/span&gt;
    &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;submit_title&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;submit_url&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Url&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;image_tag&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;&amp;lt;-&lt;/span&gt; &lt;span class='varid'&gt;site_list&lt;/span&gt;
    &lt;span class='varid'&gt;return&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt;
      &lt;span class='str'&gt;"&amp;lt;a href=\"#{r submit_url}\" \
         \title=\"#{submit_title}: &amp;amp;#x201C;#{title}&amp;amp;#x201D;\" \
        \&amp;gt;#{=image_tag}&amp;lt;/a&amp;gt;"&lt;/span&gt; &lt;span class='layout'&gt;)&lt;/span&gt;

  &lt;span class='keyword'&gt;where&lt;/span&gt;

    &lt;span class='varid'&gt;nbsp&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;xml&lt;/span&gt; &lt;span class='str'&gt;"&amp;amp;#160;"&lt;/span&gt;

    &lt;span class='varid'&gt;site_list&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='keyglyph'&gt;[&lt;/span&gt;  &lt;span class='comment'&gt;-- move me into a database table&lt;/span&gt;
      &lt;span class='layout'&gt;(&lt;/span&gt; &lt;span class='str'&gt;"Submit to Reddit.com"&lt;/span&gt;
      &lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"http://reddit.com/submit?url=#{r url}&amp;amp;title=#{title}"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
      &lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;image_tag&lt;/span&gt; &lt;span class='str'&gt;"reddit.gif"&lt;/span&gt; &lt;span class='str'&gt;"18x18"&lt;/span&gt; &lt;span class='num'&gt;0&lt;/span&gt;
      &lt;span class='layout'&gt;)&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt;
      &lt;span class='layout'&gt;(&lt;/span&gt; &lt;span class='str'&gt;"Save to del.icio.us"&lt;/span&gt;
      &lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"http://del.icio.us/post?v=2&amp;amp;url=#{r url}&amp;amp;title=#{title}"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
      &lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;image_tag&lt;/span&gt; &lt;span class='str'&gt;"delicious.gif"&lt;/span&gt; &lt;span class='str'&gt;"16x16"&lt;/span&gt; &lt;span class='num'&gt;0&lt;/span&gt;
      &lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The code looks fairly similar to the original Ruby code, with the exception
of some extra backslashes, courtesy of Haskell&amp;#8217;s rather unfortunate
syntax for multi-line string constants. (Perl and Ruby&amp;#8217;s
&lt;code&gt;&amp;lt;&amp;lt;HERE&lt;/code&gt; syntax would be a welcome addition.)&lt;/p&gt;

	&lt;p&gt;The other big difference is that, in this version, the type system has
automatically checked the code for strings-problem errors.&lt;/p&gt;


	&lt;p&gt;For completeness, here is the example&amp;#8217;s supporting code (again modeled
on Ruby on Rails).  This code also makes
extensive use of safe-string templates:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_haskell "&gt;&lt;span class='varid'&gt;image_tag&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Int&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt;
&lt;span class='varid'&gt;image_tag&lt;/span&gt; &lt;span class='varid'&gt;file_name&lt;/span&gt; &lt;span class='varid'&gt;size&lt;/span&gt; &lt;span class='varid'&gt;border&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"&amp;lt;img src=\"#{r image_url}\" height=\"#{height}\" \
         \width=\"#{width}\" border=\"#{s border}\"/&amp;gt;"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
  &lt;span class='keyword'&gt;where&lt;/span&gt;
    &lt;span class='varid'&gt;image_url&lt;/span&gt;         &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"#{=site_root}images/#{file_name}"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;
    &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;width&lt;/span&gt;&lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='keyword'&gt;_&lt;/span&gt;&lt;span class='conop'&gt;:&lt;/span&gt;&lt;span class='varid'&gt;height&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='varid'&gt;break&lt;/span&gt; &lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varop'&gt;==&lt;/span&gt;&lt;span class='chr'&gt;'x'&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt; &lt;span class='varid'&gt;size&lt;/span&gt;

&lt;span class='varid'&gt;link_to&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Url&lt;/span&gt; &lt;span class='keyglyph'&gt;-&amp;gt;&lt;/span&gt; &lt;span class='conid'&gt;Xhtml&lt;/span&gt;
&lt;span class='varid'&gt;link_to&lt;/span&gt; &lt;span class='varid'&gt;content&lt;/span&gt; &lt;span class='varid'&gt;url&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='varop'&gt;$&lt;/span&gt;&lt;span class='layout'&gt;(&lt;/span&gt;&lt;span class='varid'&gt;q&lt;/span&gt; &lt;span class='str'&gt;"&amp;lt;a href=\"#{r url}\"&amp;gt;#{=content}&amp;lt;/a&amp;gt;"&lt;/span&gt;&lt;span class='layout'&gt;)&lt;/span&gt;

&lt;span class='keyword'&gt;data&lt;/span&gt; &lt;span class='conid'&gt;Article&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt; &lt;span class='conid'&gt;Article&lt;/span&gt;
  &lt;span class='layout'&gt;{&lt;/span&gt; &lt;span class='varid'&gt;article_title&lt;/span&gt;  &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;String&lt;/span&gt;
  &lt;span class='layout'&gt;,&lt;/span&gt; &lt;span class='varid'&gt;article_url&lt;/span&gt;    &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Url&lt;/span&gt;
    &lt;span class='comment'&gt;-- more fields here&lt;/span&gt;
  &lt;span class='layout'&gt;}&lt;/span&gt;

&lt;span class='varid'&gt;sample_article&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;
    &lt;span class='conid'&gt;Article&lt;/span&gt; &lt;span class='str'&gt;"I love chunky bacon!"&lt;/span&gt; &lt;span class='varop'&gt;$&lt;/span&gt;
    &lt;span class='varid'&gt;url&lt;/span&gt; &lt;span class='str'&gt;"http://blog.moertel.com/permalink/to/article"&lt;/span&gt;

&lt;span class='varid'&gt;site_root&lt;/span&gt; &lt;span class='keyglyph'&gt;::&lt;/span&gt; &lt;span class='conid'&gt;Url&lt;/span&gt;
&lt;span class='varid'&gt;site_root&lt;/span&gt; &lt;span class='keyglyph'&gt;=&lt;/span&gt;  &lt;span class='varid'&gt;url&lt;/span&gt; &lt;span class='str'&gt;"http://blog.moertel.com/"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;h3&gt;Have we done it?  Have we learned anything?&lt;/h3&gt;


	&lt;p&gt;Have we rid ourselves of the strings problem?  If we use a programming
language like Haskell and a library like SafeStrings, I think we can
answer yes.&lt;/p&gt;


	&lt;p&gt;To be clear, the fundamental problem of having to manage different
kinds of strings is still with us.  As programmers, we still must
understand the differences between URLs, &lt;span class="caps"&gt;XML&lt;/span&gt;, SQL, untrusted user
input, and so on.  But now, we don&amp;#8217;t have to be perfect.  As long as
we can reliably slap the right type on a string when it first appears,
we can let the computer worry about it from then on.  If we forget to
escape the string later, as it winds its way through the twisty code
of a large web application and interacts with other strings in
potentially dangerous ways, the computer will catch our mistake &amp;#8211; at
compile time, before it can possibly become a live security hole.&lt;/p&gt;
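
	&lt;p&gt;For readers following along in Ruby, here is a rough run-time analogue of the idea. It is not the SafeStrings library: the &lt;code&gt;UrlText&lt;/code&gt; class and its methods are hypothetical names invented for this sketch, and because Ruby checks at run time, a forgotten escape surfaces as a &lt;code&gt;TypeError&lt;/code&gt; when the code runs rather than at compile time:&lt;/p&gt;

```ruby
require "uri"

# Hypothetical wrapper (not part of SafeStrings): a string certified as
# safe to splice into a URL query string.
class UrlText
  attr_reader :to_s

  def initialize(certified)
    @to_s = certified
  end

  # Certify a literal fragment that the programmer wrote by hand.
  def self.literal(fragment)
    new(fragment)
  end

  # Certify untrusted text by percent-encoding it first.
  def self.escape(text)
    new(URI.encode_www_form_component(text))
  end

  # Only certified pieces may be joined; a raw String is rejected.
  def +(other)
    raise TypeError, "uncertified string: #{other.inspect}" unless other.is_a?(UrlText)
    UrlText.new(to_s + other.to_s)
  end
end

base  = UrlText.literal("http://reddit.com/submit?title=")
title = "I love chunky bacon!"

# The title is certified by escaping it, so the join is allowed.
safe_url = base + UrlText.escape(title)
puts safe_url.to_s

begin
  base + title   # forgot to escape: the raw String is refused
rescue TypeError
  puts "caught the forgotten escape"
end
```

	&lt;p&gt;The wrapper is applied once, where the string first appears; every later join is then checked without further effort from the programmer.&lt;/p&gt;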


	&lt;p&gt;But if slapping the right types on strings &amp;#8211; certifying them &amp;#8211; is a
pain in the neck, we won&amp;#8217;t do it.  We will happily go back to our days
of winging it, where every string interaction becomes an opportunity
for a perfectly human mistake to give birth to a nasty security
vulnerability.&lt;/p&gt;


	&lt;p&gt;That&amp;#8217;s why syntax matters.  That&amp;#8217;s why Template Haskell, Lisp macros,
and other meta-programming tools are important: they let us craft
friendly syntaxes that encourage the use of programming aids like
SafeStrings.  That&amp;#8217;s why type inference is important: it lets us do
away with redundant annotations and makes working with types
convenient, so we can reap the benefits of strong guarantees without
having to pay prohibitive costs.&lt;/p&gt;


	&lt;p&gt;If there is a moral to this story, it&amp;#8217;s that modern type systems and
macro systems are powerful tools.  They let us do things that
otherwise would be impractically inconvenient.  They extend our reach
as programmers and let us solve problems that we couldn&amp;#8217;t solve
before.  Why, then, do so many programmers dismiss these tools
as mere academic curiosities?  Why do so many programmers turn away to
fight unaided against, and frequently lose to, the very problems that
these tools could so easily solve?&lt;/p&gt;


&lt;div class="update"&gt;
&lt;strong&gt;Update:&lt;/strong&gt; minor edits for clarity.
&lt;/div&gt;</description>
      <pubDate>Wed, 18 Oct 2006 21:40:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:4a7fb02b-a1ba-4c4a-a63b-938a19f3076c</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem</link>
      <category>programming</category>
      <category>programming languages</category>
      <category>haskell</category>
      <category>ruby</category>
      <category>web development</category>
      <category>testing</category>
      <category>rails</category>
      <category>strings</category>
      <category>types</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/186</trackback:ping>
    </item>
    <item>
      <title>Solving the Google Code Jam &amp;quot;countPaths&amp;quot; problem in Ruby</title>
      <description>&lt;p&gt;Here&amp;#8217;s a Ruby version of a dynamic-programming-based solver
for the Google Code Jam &amp;#8220;countPaths&amp;#8221; problem.  It is essentially
the same as my &lt;a href="http://blog.moertel.com/articles/2006/08/15/solving-the-google-code-jam-countpaths-problem-in-haskell"&gt;earlier Haskell-based solution&lt;/a&gt; (see Update 2), but much slower.  Whereas the Haskell version solves the maximum-size, all-the-same-letter problem in about 0.9 second, the Ruby version requires about 71 seconds.  Maybe somebody who understands Ruby&amp;#8217;s internals better than I do can come up with some optimizations.&lt;/p&gt;


	&lt;p&gt;Here&amp;#8217;s the code:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;# Tom Moertel &amp;lt;tom@moertel.com&amp;gt;
# 2006-08-16
#
# Ruby-based solution to the Google Code Jam problem "countPaths" 
# See http://www.cs.uic.edu/~hnagaraj/articles/code-jam/ for more.

class WordPath

  include Enumerable

  def initialize(grid, word)
    @grid, @rword, @counts = grid, word.reverse, {}
  end

  def self.count_paths(grid, word)
    new(grid, word).solve
  end

  def solve
    final_index = @rword.length - 1
    inject(0) { |sum, rc| sum + count_from(final_index, *rc) }
  end

  private

  def count_from(i, r, c)
    @counts[[r, c, i]] ||= begin
      match = @rword[i] == @grid[r][c]
      case
        when i == 0 &amp;#38;&amp;#38; match then 1
        when match then subsum_of_neighbors(r, c, i - 1)
        else 0
      end
    end
  end

  def subsum_of_neighbors(r, c, i)
    sum = 0
    rowlen = @grid[0].size
    for nr in [r - 1, r, r + 1]
      next if nr &amp;lt; 0 || nr &amp;gt;= @grid.size
      for nc in [c - 1, c, c + 1]
        next if nc &amp;lt; 0 || nc &amp;gt;= rowlen
        next unless r != nr || c != nc   # skip the cell itself
        sum += count_from(i, nr, nc)     # count_from always returns an integer
      end
    end
    sum
  end

  def each
    @grid.each_index do |r|
      @grid[0].size.times { |c| yield([r, c]) }
    end
  end

end

# TESTS

if ENV["TEST"] || ENV["BIG_TEST"]

  require "test/unit" 

  class TestWordPath &amp;lt; Test::Unit::TestCase

    if ENV["BIG_TEST"]
      def test_big_problem
        assert_equal \
          303835410591851117616135618108340196903254429200,
          WordPath.count_paths(["A"*50] * 50, "A"*50)
      end
    end

    if ENV["TEST"]
      def test_count_paths
        w = WordPath
        assert_equal 1,
          w.count_paths(%w{ABC FED GHI}, "ABCDEFGHI")
        assert_equal 2,
          w.count_paths(%w{ABC FED GAI}, "ABCDEA")
        assert_equal 0,
          w.count_paths(%w{ABC DEF GHI}, "ABCD")
        assert_equal 108,
          w.count_paths(%w{AA AA}, "AAAA")
        assert_equal 56448,
          w.count_paths(%w{ABABA BABAB ABABA BABAB ABABA}, "ABABABBA")
        assert_equal 2745564336,
          w.count_paths(%w{AAAAA AAAAA AAAAA AAAAA AAAAA}, "AAAAAAAAAAA")
        assert_equal 0,
          w.count_paths(%w{AB CD}, "AA" )
        assert_equal 1,
          w.count_paths(%w{A}, "A")
      end
    end

  end

end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Set the &lt;code&gt;BIG_TEST&lt;/code&gt; and/or &lt;code&gt;TEST&lt;/code&gt; environment
variables to run the test suites.  For example:&lt;/p&gt;


&lt;pre&gt;$ &lt;code&gt;TEST=1 ./countpaths.rb&lt;/code&gt;
&lt;/pre&gt;

&lt;pre&gt;Loaded suite countpaths
Started
.
Finished in 0.02062 seconds.

1 tests, 8 assertions, 0 failures, 0 errors
&lt;/pre&gt;

	&lt;p&gt;Unless somebody beats me to it,
I&amp;#8217;ll whip up a Perl version for comparison.&lt;/p&gt;


&lt;div class="update"&gt;
&lt;strong&gt;Update:&lt;/strong&gt; I managed to speed up my code by a
factor of 17.  Now the execution time for the maximum-size,
all-the-same-letter problem is down to 4.2 seconds,
which is comparable with implementations in other
languages.  &lt;a href="http://my.opera.com/ipeev/blog/show.dml/409336"&gt;Ivan Peev&amp;#8217;s Python implementation&lt;/a&gt;, for example, is only slightly faster
at 2.8 seconds.

	&lt;p&gt;A performance killer in the previous version was using
a single big hash for my cache.  Now I use a 3D array:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
counts[[i,r,c]]   # one big hash (slower)
counts[i][r][c]   # 3D-array (faster)
&lt;/code&gt;&lt;/pre&gt;
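
	&lt;p&gt;To make the two cache layouts concrete, here is a small sketch that fills both with the same values.  The dimensions and the stored values are made up for illustration, and the sketch makes no timing claims; it only shows that the hash lookup must first build a three-element key array, while the nested array is indexed directly:&lt;/p&gt;

```ruby
# Hypothetical sizes, chosen only for illustration.
depth, rows, cols = 2, 3, 4

# One big hash keyed by [i, r, c]: every lookup builds and hashes a key array.
hash_counts = {}

# 3D array: three direct index operations, no key array needed.
array_counts = Array.new(depth) { Array.new(rows) { Array.new(cols, 0) } }

depth.times do |i|
  rows.times do |r|
    cols.times do |c|
      value = i * 100 + r * 10 + c
      hash_counts[[i, r, c]] = value
      array_counts[i][r][c]  = value
    end
  end
end

# Both layouts hold the same data; only the access path differs.
puts hash_counts[[1, 2, 3]]
puts array_counts[1][2][3]
```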

	&lt;p&gt;An additional advantage of the 3D array is that I can peel off slabs
as I descend the outer layers of nested loops.  For instance,
instead of writing:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;for i in 0 .. 10
  for j in 0 .. 10
    sum += counts[i][j]
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;I can lift the &lt;code&gt;counts[i]&lt;/code&gt; slab out of the inner
loop, so that each inner iteration performs one array-indexing operation instead of two:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;for i in 0 .. 10
  slab = counts[i]
  for j in 0 .. 10
    sum += slab[j]
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Here&amp;#8217;s the new code (sans the unit tests, which haven&amp;#8217;t changed):&lt;/p&gt;


&lt;pre&gt;&lt;code style="font-size: smaller"&gt;class WordPath

  A = Array

  def self.count_paths(grid, word)

    rword  = word.reverse
    rowmax = grid.size - 1
    colmax = grid.first.size - 1

    for i in 0 .. rword.size - 1
      letter = rword[i]
      previous_slab, slab = slab, A.new(rowmax+1) { A.new(colmax+1) }
      for r in 0 .. rowmax
        row, line = grid[r], slab[r]
        for c in 0 .. colmax
          line[c] = unless letter == row[c]
            0
          else
            if i == 0
              1
            else
              sum = 0
              clo = c &amp;gt; 0 ? c - 1 : c
              chi = c &amp;lt; colmax ? c + 1 : c
              for nr in (r &amp;gt; 0 ? r - 1 : r) .. (r &amp;lt; rowmax ? r + 1 : r)
                for nc in clo .. chi
                  sum += previous_slab[nr][nc] if nr != r || nc != c
                end
              end
              sum
            end
          end
        end
      end
    end

    sum = 0
    for r in 0 .. rowmax
      for c in 0 .. colmax
        sum += slab[r][c]
      end
    end

    sum

  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;&lt;strong&gt;Update 2:&lt;/strong&gt; I tweaked the code snippet above to remove a variable
that I just noticed wasn&amp;#8217;t actually doing anything.&lt;/p&gt;


&lt;/div&gt;</description>
      <pubDate>Wed, 16 Aug 2006 18:54:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:1bf9c8cf-5b20-4039-a64d-020e9ab52830</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2006/08/16/solving-the-google-code-jam-countpaths-problem-in-ruby</link>
      <category>ruby</category>
      <category>fun stuff</category>
      <category>google</category>
      <category>code</category>
      <category>jam</category>
      <category>countpaths</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/155</trackback:ping>
    </item>
    <item>
      <title>Composing functions in Ruby</title>
      <description>&lt;p&gt;One of the things I miss when coding in Ruby is
inexpensive function composition.  In Haskell, for example,
I can compose functions using the dot (.) operator:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;inc        = (+1)
twice      = (*2)
twiceOfInc = twice . inc
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Because of Ruby&amp;#8217;s open classes, however, I can easily
add the feature to the language.  In the code below, I introduce
&lt;code&gt;Proc.compose&lt;/code&gt; and overload the
star (&lt;code&gt;*&lt;/code&gt;) operator for the purpose:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# func_composition.rb
class Proc
  def self.compose(f, g)
    lambda { |*args| f[g[*args]] }
  end
  def *(g)
    Proc.compose(self, g)
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;And that&amp;#8217;s all it takes:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;$ irb --simple-prompt -r func_composition.rb

&amp;gt;&amp;gt; inc = lambda { |x| x + 1 }
=&amp;gt; #&amp;lt;Proc:0x00002aaaaaad7810@(irb):1&amp;gt;

&amp;gt;&amp;gt; twice = lambda { |x| x * 2 }
=&amp;gt; #&amp;lt;Proc:0x00002aaaaabd2d18@(irb):2&amp;gt;

&amp;gt;&amp;gt; inc[1]
=&amp;gt; 2

&amp;gt;&amp;gt; twice[2]
=&amp;gt; 4

&amp;gt;&amp;gt; twice_of_inc = twice * inc
=&amp;gt; #&amp;lt;Proc:0x00002aaaaab32458@./func_composition.rb:3&amp;gt;

&amp;gt;&amp;gt; twice_of_inc[1]
=&amp;gt; 4

&amp;gt;&amp;gt; twice_of_inc[2]
=&amp;gt; 6
&lt;/code&gt;&lt;/pre&gt;
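
	&lt;p&gt;Because composition is associative, chains of more than two functions need no parentheses.  Here is a quick sketch that repeats the definitions above so that it runs on its own; the &lt;code&gt;square&lt;/code&gt; lambda is a name added only for this illustration:&lt;/p&gt;

```ruby
# Same definitions as in func_composition.rb above.
class Proc
  def self.compose(f, g)
    lambda { |*args| f[g[*args]] }
  end
  def *(g)
    Proc.compose(self, g)
  end
end

inc    = lambda { |x| x + 1 }
twice  = lambda { |x| x * 2 }
square = lambda { |x| x * x }   # extra function, added for this example

# Reads right to left, like Haskell's dot: square first, then inc, then twice.
pipeline  = twice * inc * square
regrouped = twice * (inc * square)

puts pipeline[3]    # computes twice[inc[square[3]]]
puts regrouped[3]   # grouping does not change the result
```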

	&lt;p&gt;Now, isn&amp;#8217;t that refreshing?&lt;/p&gt;


&lt;div class="update"&gt;
&lt;strong&gt;Update:&lt;/strong&gt; Vincent Foley &lt;a href="http://groups.google.com/group/comp.lang.ruby/browse_thread/thread/6102f784210bdb32/05dfa12e07513a2c#05dfa12e07513a2c"&gt;pointed out on comp.lang.ruby&lt;/a&gt; that &lt;a href="http://facets.rubyforge.org/"&gt;Ruby Facets&lt;/a&gt; has a &lt;a href="http://facets.rubyforge.org/api/core/classes/Proc.html"&gt;nearly identical implementation&lt;/a&gt; that also uses the star operator for composition.  (Its version of &lt;em&gt;compose&lt;/em&gt;, however, is an instance method whereas my version is a class method.)
&lt;/div&gt;</description>
      <pubDate>Fri, 07 Apr 2006 11:55:00 -0400</pubDate>
      <guid isPermaLink="false">urn:uuid:0d98557f896ddc1d75ff1f85eb283602</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2006/04/07/composing-functions-in-ruby</link>
      <category>functional programming</category>
      <category>ruby</category>
      <category>functional_programming</category>
      <category>fp</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/62</trackback:ping>
    </item>
    <item>
      <title>Improving Typo's spam protection</title>
      <description>&lt;p&gt;I noticed that my site has been picking up more comment spam recently.
&lt;a href="http://www.typosphere.org/"&gt;Typo&lt;/a&gt; has built-in spam protection, but for
some reason a few spam comments that ought to have been caught slipped
through its filters.  Curious, I investigated.&lt;/p&gt;


	&lt;p&gt;Most spam comments contain links to sites favored by the spammers.
The sites are almost always of the form &lt;em&gt;x.domain&lt;/em&gt;.com,
where &lt;em&gt;domain&lt;/em&gt; is one of a few higher-level domains and &lt;em&gt;x&lt;/em&gt; is drawn
from a large set of values from the realms of gambling, pornography,
and male enhancement.  It seems that the spammers pay for a few real
domains and then create a ton of subdomains under them.&lt;/p&gt;


	&lt;p&gt;One of the ways to detect comment spam is to find URIs in comments and
look up the sites they point to in &lt;span class="caps"&gt;DNS&lt;/span&gt;-based
&lt;acronym title="spam-URI realtime blocklists"&gt;SURBL&lt;/acronym&gt;s,
such as &lt;a href="http://www.surbl.org/"&gt;multi.surbl.org&lt;/a&gt; and
&lt;a href="http://bsb.empty.us/"&gt;bsb.empty.us&lt;/a&gt;.  The thing is, when SURBLs list a
spammy site &lt;em&gt;x.domain&lt;/em&gt;.com, sometimes they list it under the full
hostname &lt;em&gt;x.domain&lt;/em&gt;.com and sometimes they list it
under the higher-level domain
&lt;em&gt;domain&lt;/em&gt;.com.  To be safe, Typo looks up both forms when it checks
for spam.&lt;/p&gt;


	&lt;p&gt;Here&amp;#8217;s the code it uses:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="constant"&gt;HOST_RBLS&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;rbl&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
  &lt;span class="keyword"&gt;begin&lt;/span&gt;
    &lt;span class="keyword"&gt;if&lt;/span&gt; &lt;span class="punct"&gt;[&lt;/span&gt;
        &lt;span class="constant"&gt;IPSocket&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getaddress&lt;/span&gt;&lt;span class="punct"&gt;([&lt;/span&gt;&lt;span class="ident"&gt;host&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;rbl&lt;/span&gt;&lt;span class="punct"&gt;].&lt;/span&gt;&lt;span class="ident"&gt;join&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;.&lt;/span&gt;&lt;span class="punct"&gt;')),&lt;/span&gt;
        &lt;span class="constant"&gt;IPSocket&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getaddress&lt;/span&gt;&lt;span class="punct"&gt;((&lt;/span&gt;&lt;span class="ident"&gt;domain&lt;/span&gt; &lt;span class="punct"&gt;+&lt;/span&gt; &lt;span class="punct"&gt;[&lt;/span&gt;&lt;span class="ident"&gt;rbl&lt;/span&gt;&lt;span class="punct"&gt;]).&lt;/span&gt;&lt;span class="ident"&gt;join&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;.&lt;/span&gt;&lt;span class="punct"&gt;'))&lt;/span&gt;
       &lt;span class="punct"&gt;].&lt;/span&gt;&lt;span class="ident"&gt;include?&lt;/span&gt;&lt;span class="punct"&gt;(&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;127.0.0.2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;)&lt;/span&gt;
      &lt;span class="ident"&gt;throw&lt;/span&gt; &lt;span class="symbol"&gt;:hit&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{rbl}&lt;/span&gt; positively resolved &lt;span class="expr"&gt;#{domain.join('.')}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;rescue&lt;/span&gt; &lt;span class="constant"&gt;SocketError&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;The code iterates over the list of SURBLs it has and queries each
twice &amp;#8211; once for the host and once for the domain in question &amp;#8211; saving
the results of the queries in an array.  Then if the array includes a
positive response (127.0.0.2), it throws a &amp;#8220;hit&amp;#8221; notice to the
calling code, which will block the associated comment.&lt;/p&gt;


	&lt;p&gt;Unfortunately, the code doesn&amp;#8217;t quite work as intended.  Although a
positive response for &lt;em&gt;either&lt;/em&gt; the host or the domain should register
as a hit, the code requires &lt;em&gt;both&lt;/em&gt; queries to return positive
responses.  As a result, the code yields a lot of false negatives
because most lists don&amp;#8217;t include both host and domain forms of spammy
sites; the required double positive is thus hard to obtain.&lt;/p&gt;


&lt;p&gt;The cause of the problem is the attempt to query for both forms of the
site before checking either response.  The queries are performed by
calling &lt;code&gt;IPSocket.getaddress&lt;/code&gt;, which performs a &lt;span class="caps"&gt;DNS&lt;/span&gt; query
for the &amp;#8220;A&amp;#8221; record associated with its argument.  If the record
exists, the call returns it; otherwise, the call raises a
&lt;code&gt;SocketError&lt;/code&gt; exception.&lt;/p&gt;

	&lt;p&gt;The exception is what causes the logic to break down.  When either the
host or domain is &lt;em&gt;not&lt;/em&gt; in the queried &lt;span class="caps"&gt;SURBL&lt;/span&gt;, which will almost always
be the case for reasons I explained earlier, one of the queries will
result in a &lt;code&gt;SocketError&lt;/code&gt; exception.  The exception will be
caught by the &lt;code&gt;rescue&lt;/code&gt; clause later in the code, but not
before the opportunity to test the other query&amp;#8217;s response and throw a
&amp;#8220;hit&amp;#8221; has been lost.&lt;/p&gt;
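
	&lt;p&gt;The failure mode is easy to reproduce without touching the network.  In this sketch, &lt;code&gt;fake_getaddress&lt;/code&gt; and the &lt;code&gt;listed&lt;/code&gt; table are stand-ins invented for illustration (they are not part of Typo or of Ruby&amp;#8217;s socket library); the stand-in raises &lt;code&gt;SocketError&lt;/code&gt; for unlisted names, as described above for &lt;code&gt;IPSocket.getaddress&lt;/code&gt;:&lt;/p&gt;

```ruby
require "socket"   # only for the SocketError class

# Hypothetical blocklist contents: just the domain form is listed.
listed = ["domain.com.rbl.example"]

# Stand-in for IPSocket.getaddress: returns an address for listed names
# and raises SocketError for everything else.
fake_getaddress = lambda do |name|
  raise SocketError, "no A record for #{name}" unless listed.include?(name)
  "127.0.0.2"
end

host, domain, rbl = "x.domain.com", "domain.com", "rbl.example"

# Original shape: both lookups happen while the array is being built, so
# the first SocketError aborts the expression before include? can run.
both_at_once =
  begin
    [fake_getaddress.call("#{host}.#{rbl}"),
     fake_getaddress.call("#{domain}.#{rbl}")].include?("127.0.0.2")
  rescue SocketError
    false
  end

# Alternative shape: each lookup gets its own rescue, so one miss cannot
# hide a hit on the other name.
one_at_a_time = [host, domain].any? do |d|
  begin
    fake_getaddress.call("#{d}.#{rbl}")
    true
  rescue SocketError
    false
  end
end

puts "both at once:  #{both_at_once}"
puts "one at a time: #{one_at_a_time}"
```

	&lt;p&gt;With only the domain form listed, the first shape reports no hit even though the domain lookup would have succeeded, while the second shape finds it.&lt;/p&gt;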


	&lt;p&gt;My fix was to replace the above code with a call to a new helper
method:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="ident"&gt;query_rbls&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="constant"&gt;HOST_RBLS&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;host&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;domain&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;join&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;.&lt;/span&gt;&lt;span class="punct"&gt;'))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;The helper, defined later, makes the actual queries:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;query_rbls&lt;/span&gt;&lt;span class="punct"&gt;(&lt;/span&gt;&lt;span class="ident"&gt;rbls&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;*&lt;/span&gt;&lt;span class="ident"&gt;subdomains&lt;/span&gt;&lt;span class="punct"&gt;)&lt;/span&gt;
  &lt;span class="ident"&gt;rbls&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;rbl&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
    &lt;span class="ident"&gt;subdomains&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;uniq&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;each&lt;/span&gt; &lt;span class="keyword"&gt;do&lt;/span&gt; &lt;span class="punct"&gt;|&lt;/span&gt;&lt;span class="ident"&gt;d&lt;/span&gt;&lt;span class="punct"&gt;|&lt;/span&gt;
      &lt;span class="keyword"&gt;begin&lt;/span&gt;
        &lt;span class="ident"&gt;response&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="constant"&gt;IPSocket&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;getaddress&lt;/span&gt;&lt;span class="punct"&gt;([&lt;/span&gt;&lt;span class="ident"&gt;d&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="ident"&gt;rbl&lt;/span&gt;&lt;span class="punct"&gt;].&lt;/span&gt;&lt;span class="ident"&gt;join&lt;/span&gt;&lt;span class="punct"&gt;('&lt;/span&gt;&lt;span class="string"&gt;.&lt;/span&gt;&lt;span class="punct"&gt;'))&lt;/span&gt;
        &lt;span class="ident"&gt;throw&lt;/span&gt; &lt;span class="symbol"&gt;:hit&lt;/span&gt;&lt;span class="punct"&gt;,&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;&lt;span class="expr"&gt;#{rbl}&lt;/span&gt; positively resolved &lt;span class="expr"&gt;#{d}&lt;/span&gt; =&amp;gt; &lt;span class="expr"&gt;#{response}&lt;/span&gt;&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
      &lt;span class="keyword"&gt;rescue&lt;/span&gt; &lt;span class="constant"&gt;SocketError&lt;/span&gt;
        &lt;span class="comment"&gt;# NXDOMAIN response =&amp;gt; negative:  d is not in RBL&lt;/span&gt;
      &lt;span class="keyword"&gt;end&lt;/span&gt;
    &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;end&lt;/span&gt;
  &lt;span class="keyword"&gt;return&lt;/span&gt; &lt;span class="constant"&gt;false&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;Because some SURBLs don&amp;#8217;t use 127.0.0.2 but some other &amp;#8220;A&amp;#8221; record to
indicate a positive response, my helper removes the hard-coded address
test.&lt;/p&gt;


	&lt;p&gt;I also made a few more improvements to the spam-protection
code.  The full set of changes is available as &lt;a href="http://www.typosphere.org/trac/ticket/657"&gt;Patch
657&lt;/a&gt; on the Typo Trac site.&lt;/p&gt;</description>
      <pubDate>Mon, 16 Jan 2006 01:34:00 -0500</pubDate>
      <guid isPermaLink="false">urn:uuid:a154474c903e93da6922f1a53a563f0a</guid>
      <author>Tom Moertel</author>
      <link>http://blog.moertel.com/articles/2006/01/16/improving-typos-spam-protection</link>
      <category>typo</category>
      <category>ruby</category>
      <category>spam</category>
      <trackback:ping>http://blog.moertel.com/articles/trackback/23</trackback:ping>
    </item>
    <item>
      <title>Closures and the professional programmer</title>
      <description>&lt;p&gt;I came across &lt;a href="http://www.tbray.org/ongoing/When/200x/2005/08/27/Ruby"&gt;Tim Bray&amp;#8217;s thoughts on
Ruby&lt;/a&gt; via
the ever-delightful &lt;a href="http://lambda-the-ultimate.org/node/view/934"&gt;Lambda the Ultimate&lt;/a&gt; and found the following bit fascinating:&lt;/p&gt;


	&lt;blockquote&gt;
		&lt;p&gt;I&amp;#8217;ve had access to languages with closures and continuations and
suchlike constructs for years and years, and I&amp;#8217;ve never ever written
one. While I&amp;#8217;m impressed by how natural this stuff is in Ruby, &lt;em&gt;I&amp;#8217;m
still unconvinced that these are a necessary part of the professional
programmer&amp;#8217;s arsenal.&lt;/em&gt; [Emphasis mine.]&lt;/p&gt;
	&lt;/blockquote&gt;


	&lt;p&gt;While Tim Bray may be unconvinced, I am a true believer.&lt;/p&gt;


	&lt;p&gt;I use closures so much that I feel cheated into doing busy work by
languages that do not support them.  I use continuations less often
but frequently enough to appreciate how much time they save me.
Neither is strictly required for professional work, but they are
potent tools, and a professional who knows how to use them has an
advantage over those who do not.&lt;/p&gt;


	&lt;p&gt;Closures, in particular, are something every professional ought to
master.  Besides their more celebrated uses, closures make refactoring practical on a small scale.  For
example, consider the following Ruby method, which we will assume is
one of several similar methods belonging to a class that implements some kind of Internet server:&lt;/p&gt;


&lt;div class="typocode"&gt;&lt;pre&gt;&lt;code class="typocode_ruby "&gt;&lt;span class="keyword"&gt;def &lt;/span&gt;&lt;span class="method"&gt;process&lt;/span&gt;
  &lt;span class="ident"&gt;sn&lt;/span&gt; &lt;span class="punct"&gt;=&lt;/span&gt; &lt;span class="ident"&gt;next_serial&lt;/span&gt;&lt;span class="punct"&gt;()&lt;/span&gt;
  &lt;span class="ident"&gt;log&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;info&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;process/&lt;span class="expr"&gt;#{sn}&lt;/span&gt;: stage 1&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
  &lt;span class="comment"&gt;# ... do some work&lt;/span&gt;
  &lt;span class="ident"&gt;log&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;info&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;process/&lt;span class="expr"&gt;#{sn}&lt;/span&gt;: stage 2&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
  &lt;span class="comment"&gt;# ... do some more work&lt;/span&gt;
  &lt;span class="ident"&gt;log&lt;/span&gt;&lt;span class="punct"&gt;.&lt;/span&gt;&lt;span class="ident"&gt;info&lt;/span&gt; &lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;&lt;span class="string"&gt;process/&lt;span class="expr"&gt;#{sn}&lt;/span&gt;: finished&lt;/span&gt;&lt;span class="punct"&gt;&amp;quot;&lt;/span&gt;
&lt;span class="keyword"&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

	&lt;p&gt;The method first gets a unique serial number, which is used during the processing of requests and also to relate log entries generated by the same processing call.  Then the method does its work, loggin