Posted by Tom Moertel
Wed, 01 Nov 2006 22:01:00 GMT
Last night on #haskell, Don Stewart asked if I had seen HsColour
for rendering syntax-highlighted Haskell in HTML. He had
used it recently, he noted in passing, to add syntax highlighting to planet.haskell.org.
Now, I can’t be certain about this, but I suspect that Don’s question
was cleverly designed to instill in me a subtle case of
syntax-highlighting envy. For on my blog, Haskell code snippets
were rendered in dreadfully boring uncolored text.
But on his blog, the
snippets dance in joyous polychromatic splendor.
Thus I was compelled to add Haskell syntax-highlighting to my blog.
Adding Haskell syntax-highlighting to Typo
My blog runs on the Ruby-on-Rails-powered Typo system, which allows for plug-in text filters. One of the included filters, in fact, is a syntax-highlighting filter for snippets of Ruby, XML, and YAML code. This filter is built upon the Ruby Syntax module, which wasn’t exactly designed for Haskell syntax analysis. So I set out to create a new plug-in filter based upon HsColour.
This task turned out to be easy. All I did was duplicate
Typo’s existing syntax-highlighting filter and swap out its filtering
code for the following:
IO.popen("HsColour -css", "r+") do |f|
  pid = fork { f.write text; f.close; exit! 0 }
  f.close_write
  text = f.read
  Process.waitpid pid
end
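The fork in the snippet above matters: if one process writes a large input to a filter while nobody is reading the filter's output, both sides can stall once the pipe buffers fill. Here is a minimal sketch of the same pipe-through-a-command idea using Ruby's standard Open3 library instead of a manual fork (my substitution, not what the filter itself uses). The name `filter_through` is hypothetical, and a child Ruby one-liner stands in for HsColour so the sketch runs anywhere.

```ruby
require "open3"
require "rbconfig"

# Pipe `text` through an external command and return what it prints.
# Open3.capture2 feeds stdin and drains stdout concurrently, so large
# inputs do not deadlock on full pipe buffers.
def filter_through(text, *command)
  output, status = Open3.capture2(*command, stdin_data: text)
  raise "filter failed: #{command.join(' ')}" unless status.success?
  output
end

# Stand-in for `HsColour -css`: a child Ruby process that upcases stdin.
UPCASE_FILTER = [RbConfig.ruby, "-e", "print STDIN.read.upcase"].freeze

puts filter_through("main = putStrLn \"hi\"\n", *UPCASE_FILTER)
```

Swapping the stand-in for `"HsColour", "-css"` should give the same shape of call, assuming HsColour is on the PATH.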
I also tweaked the post-processing regular expressions so that they
would whittle away the HTML filler before and after the
syntax-highlighted output of HsColour:
text.gsub!(/.*<p()re>/m, ...)
text.gsub!(/<\/pre>.*/m, ...)
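As a rough sketch of what that trimming amounts to, assume the highlighter emits a full HTML page with the code inside a single pre element. The replacement strings are elided above; here I simply use empty strings, which is my assumption. The function name and the sample markup are likewise made up for illustration.

```ruby
# Keep only what lies between <pre> and </pre> in a full HTML page.
# Assumes exactly one <pre> element; the /m flag lets "." span newlines.
def extract_pre_body(html)
  body = html.dup
  body.sub!(/.*<pre>/m, "")    # drop everything up to and including <pre>
  body.sub!(/<\/pre>.*/m, "")  # drop </pre> and everything after it
  body
end

page = <<~HTML
  <html><head><title>demo</title></head>
  <body><pre><span class="keyword">let</span> x = 1
  </pre></body></html>
HTML

puts extract_pre_body(page)
```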
A few more tweaks and I was done.
Now I can wrap my Haskell code in <typo:haskell> tags and it, too, will
dance in joyous polychromatic splendor:
constructTable tspecs = do
    ecolspecs <- during "argument evaluation" $ do
        toNvps . concat =<< mapM splice tspecs
    let names = map fst ecolspecs
    let evecs = map snd ecolspecs
    vecs <- argof nm $ mapM evalVector evecs
    let vlens = map vlen vecs
    if length (group vlens) == 1
        then return . VTable $ mkTable (zip names vecs)
        else throwError $
             "table columns must be non-empty vectors of equal length"
  where
    nm = "table(...) constructor"
    splice (TCol envp) = return [envp]
    splice (TSplice e) = do
        val <- eval e
        case val of
            VTable t ->
                return $ zipWith mkNVP (tcnames t) (elems (tvecs t))
            VList gl ->
                liftM (zipWith mkNVP (map name . elems $ glnames gl)) $
                    mapM asVectorNull (elems $ glvals gl)
            _ -> throwError $
                "can't construct table columns from (" ++
                show val ++ ")"
    mkNVP n vec = NVP n (mkNoPosExpr . EVal $ VVector vec)
    name "" = "NA"
    name n = n
If you want the filter code, here it is: haskell_controller.rb. Just drop it into components/plugins/textfilters and restart Typo. The corresponding CSS styles can be found in my user-styles.css.
Posted in haskell, ruby, typo
Tags haskell, hscolour, ruby, typo

Posted by Tom Moertel
Thu, 24 Aug 2006 19:41:00 GMT
In an earlier post I wrote about stability
problems that have plagued my blog since upgrading from Typo 4.0.0 to 4.0.3. I have finally traced the problem to its source, and here’s the deal:
If you’re serving Typo up via Mongrel, do not configure ActiveRecord to allow concurrency.
One of the changes between Typo 4.0.0 and 4.0.3 is this
addition to the environment.rb file:
config.active_record.allow_concurrency = true
Comment out this line, restart Typo, and the problem is solved.
Update: rather than commenting out the line by hand, apply Changeset 1255,
and the problem is solved. (See Update 2, below.)
Discussion
When ActiveRecord::Base.allow_concurrency is set to
true, AR will give each thread its own database
connections and cache them in thread-local storage. The idea is
that, in a multi-threaded environment, this simple policy prevents
unsafe interactions between threads and the database. (Imagine what
would happen if one thread “borrowed” a connection over which
another thread had opened a transaction. Oops, there goes
transactional isolation.)
This policy, however, does place a burden on the owner of the threads to
make sure that each thread’s local connection cache is cleared when
the thread is joined, a burden that is not, it would seem, being
carried by Typo under Mongrel. As a result, Typo rapidly chews
through the allotment of file descriptors that the operating system
had kindly reserved for Mongrel.

(On my Linux server, the Mongrel process gets an allotment of 1024
file descriptors.)
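To make the leak concrete, here is a toy model of per-thread connection caching. `FakeConnection` and `ConnectionCache` are hypothetical stand-ins that I made up for this sketch; they are not ActiveRecord classes, and the counter merely pretends to be a file-descriptor count.

```ruby
# A toy model of per-thread connection caching (not ActiveRecord code).
class FakeConnection
  @open_count = 0
  class << self
    attr_accessor :open_count
  end

  def initialize
    self.class.open_count += 1   # pretend we consumed a file descriptor
  end

  def close
    self.class.open_count -= 1
  end
end

class ConnectionCache
  def initialize
    @lock = Mutex.new
    @by_thread = {}
  end

  # Each thread gets, and keeps, its own connection.
  def connection
    @lock.synchronize { @by_thread[Thread.current] ||= FakeConnection.new }
  end

  # The cleanup step that somebody has to remember to run.
  def release_dead_threads
    @lock.synchronize do
      dead = @by_thread.keys.reject(&:alive?)
      dead.each { |t| @by_thread.delete(t).close }
    end
  end
end

cache = ConnectionCache.new
20.times { Thread.new { cache.connection }.join }
puts "open after 20 short-lived threads: #{FakeConnection.open_count}"
cache.release_dead_threads
puts "open after cleanup: #{FakeConnection.open_count}"
```

If nobody calls the cleanup step, every short-lived thread leaves a connection behind, which is the pattern that eventually exhausts the descriptor allotment.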
Lucky for us, this each-thread-gets-its-own-connections policy is unnecessary under
Mongrel because Mongrel, while being multi-threaded itself, serializes
all access to the Rails-based applications it serves up:
Q: Is [Mongrel] multi-threaded or can it handle concurrent requests?
Mongrel uses a pool of thread workers to do its processing. This means that it is able to handle concurrent access and should be thread safe. This also means that you have to be more careful about how you use Mongrel. You can’t just write your application assuming that there are no threads involved. ...
Ruby on Rails is not thread safe so there is a synchronized block around the calls to Dispatcher.dispatch. This means that everything is threaded right before and right after Rails runs. While Rails is running there is only one controller in operation at a time.
(Source: Mongrel FAQ list)
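The arrangement the FAQ describes can be sketched in a few lines: many worker threads, with a mutex around the one call into the application. This is only an illustration of the idea, not Mongrel's code; `dispatch` and `handle` are hypothetical stand-ins, and the sleep simulates work.

```ruby
# Many worker threads, but a mutex lets only one at a time run the
# "application" code. `dispatch` is a hypothetical stand-in for the
# Rails dispatcher call mentioned in the FAQ.
DISPATCH_GUARD = Mutex.new
STATS = { active: 0, max_active: 0, handled: 0 }

def dispatch(request)
  STATS[:active] += 1
  STATS[:max_active] = [STATS[:max_active], STATS[:active]].max
  sleep 0.001                       # simulate work inside the application
  STATS[:handled] += 1
ensure
  STATS[:active] -= 1
end

def handle(request)
  # Threaded before and after; serialized while the application runs.
  DISPATCH_GUARD.synchronize { dispatch(request) }
end

workers = (1..8).map { |i| Thread.new { handle("request #{i}") } }
workers.each(&:join)
puts "handled=#{STATS[:handled]} max concurrently active=#{STATS[:max_active]}"
```

Because the application code never runs in two threads at once, per-thread database connections buy nothing in this arrangement.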
Thus we can safely turn off (i.e., comment out in Typo’s
environment.rb file) ActiveRecord’s allow-concurrency option
without having to worry about nasty concurrency or performance issues:
# the following line is commented out
# config.active_record.allow_concurrency = true
For more on this subject, see Rails ticket #2162 and Rails ticket #2742.
Now, here’s my question: Are there any environments in which
Typo can run with the allow-concurrency option enabled and not
leak database connections? Inquiring minds want to know.
Update: Upon further investigation, turning off
concurrency might not be altogether without risk. Some of the Typo
code that handles potentially long tasks, such as making trackbacks
and pings, spawns new threads in which to carry out its work. I’m
looking further into this risk. Updates to come.
Update 2: Piers Cawley added Changeset 1255, which turns AR’s
allow-concurrency flag back off and revises the ping code so that
it does not attempt concurrent database access. Apply the patch version of
1255 and restart Typo to get the fix. A tip of the hat to Piers for making
the quick fix when he was supposed to be on holiday.
Posted in ruby, typo, rails
Tags activerecord, concurrency, rails, sqlite3, typo

Posted by Tom Moertel
Thu, 24 Aug 2006 04:41:00 GMT
Since I upgraded my blog from Typo 4.0.0 to
4.0.3, it has been somewhat unstable. About once a day it starts
responding with “500 Internal Server Error” and stays that way until I
restart it.
The root of the problem seems to be the database
connection, as evidenced by this exception showing up in the
production log:
SQLite3::CantOpenException (could not open database)
Unfortunately, the exception doesn’t provide anything specific
to go on.
A quick look at the
sqlite3-ruby code
suggested that I was not going to get the specifics, either. The Ruby-based wrapper
never calls sqlite3_errmsg after a call to sqlite3_open fails on behalf of SQLite3::Database.new.
A quick patch, however, fixed the problem:
--- sqlite3-ruby-1.1.0.orig/lib/sqlite3/database.rb
+++ sqlite3-ruby-1.1.0/lib/sqlite3/database.rb
@@ -109,7 +109,7 @@
       @statement_factory = options[:statement_factory] || Statement
 
       result, @handle = @driver.open( file_name, utf16 )
-      Error.check( result, nil, "could not open database" )
+      Error.check( result, self, "could not open database" )
 
       @closed = false
       @results_as_hash = options.fetch(:results_as_hash,false)
(Submitted as Ticket 5504 on RubyForge.)
Before applying the patch, opening a database at a nonexistent path results in
a generic error message:
$ ruby -r rubygems -e 'require_gem "sqlite3-ruby";
SQLite3::Database.new("/no/such/path/db")'
... could not open database (SQLite3::CantOpenException) ...
After applying the patch, we get additional error information:
... could not open database: unable to open database file
(SQLite3::CantOpenException) ...
With the patch in place, all I have to do is wait for Typo to start
acting up again. Then I’ll have some interesting information in the
log.
Until then, I’m relying on cron
and a short monitoring script to restart Typo when it tips into
foolishness:
#!/bin/bash
url=http://blog.moertel.com/admin
addrs=tom@moertel.com
response=$(GET -sd $url 2>&1)
if [ "$response" != "200 OK" ]; then
    { echo "Response was: $response"; echo; service typo restart; } |
        mail -s "Blog site not responding! (Restarting)" $addrs
fi
We’ll see how it goes.
Update: That was fast. The error popped up
again and this time the log told me something useful: “unable to open
database file.” Now, why couldn’t Typo open the database file,
especially since the file is perfectly fine and had been opened
successfully (many times) by the very same Typo process earlier? Here’s
a hint:
$ ls /proc/28788/fd | wc -l
1023
Seems like there’s a resource leak in Typo 4.0.3 (or Rails 1.1.6).
Under some conditions, instead of reusing existing database
connections, Typo keeps trying to open new ones. Eventually, it uses
up its allotment of file descriptors and the operating system is forced
to say, “That’s enough, pal,” (EMFILE).
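The same check can be made from inside a Ruby process. This is a small sketch, assuming a Linux-style /proc filesystem; `open_fd_count` is a name I made up, and the "leak" is just ten files opened and deliberately held.

```ruby
# Count the file descriptors a process has open by listing /proc/<pid>/fd
# (Linux-specific), and compare with the per-process limit.
def open_fd_count(pid = Process.pid)
  dir = "/proc/#{pid}/fd"
  return nil unless File.directory?(dir)   # no procfs on this platform
  Dir.entries(dir).count { |name| name !~ /\A\.\.?\z/ }
end

soft_limit, _hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

before = open_fd_count
leaked = Array.new(10) { File.open(File::NULL) }   # "forget" to close these
after = open_fd_count

if before
  puts "fds before: #{before}, after leaking 10: #{after}, soft limit: #{soft_limit}"
end
leaked.each(&:close)
```

Once the count reaches the soft limit, further attempts to open files fail with EMFILE, which is the condition the log message above was hiding.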
I’ll look into it more in the morning.
Update 2: Problem solved.
Posted in ruby, typo, rails, sysadmin
Tags rails, sqlite3, typo

Posted by Tom Moertel
Wed, 09 Aug 2006 22:25:00 GMT
Here’s a quick patch I made to my Typo 4.0
installation to add Reddit and
del.icio.us buttons to articles. Now one click
is all it takes to submit an article to either site. (These buttons
appear on my blog at the end of each article.)
If you want to apply the patch, be sure to also place copies of the
button images into public/images. You can snag the
images from my site or from the Reddit and del.icio.us sites.
Here’s the patch:
--- typo.orig/app/helpers/articles_helper.rb 2006-07-24 11:04:27.000000000 -0400
+++ typo/app/helpers/articles_helper.rb 2006-08-09 17:06:51.000000000 -0400
@@ -73,7 +74,26 @@
       code << tag_links(article) unless article.tags.empty?
       code << comments_link(article) if article.allow_comments?
       code << trackbacks_link(article) if article.allow_pings?
-    end.join(" <strong>|</strong> ")
+      code << submit_this_article_links(article)
+    end.join(" | ")
+  end
+
+  def submit_this_article_links(article)
+    u_url = u(url_of(article, false))
+    u_title = u(article.title)
+    [ # move me into a database table
+      [ "Submit to Reddit.com",
+        "http://reddit.com/submit?url=<URL>&title=<TITLE>",
+        image_tag("reddit.gif", :size => "18x18", :border => 0)
+      ],
+      [ "Save to del.icio.us",
+        "http://del.icio.us/post?v=2&url=<URL>&title=<TITLE>",
+        image_tag("delicious.gif", :size => "16x16", :border => 0)
+      ]
+    ].map do |submit_title, submit_url, image_tag|
+      submit_url = submit_url.gsub(/<URL>/, u_url).gsub(/<TITLE>/, u_title)
+      %(<a href="#{h submit_url}" title="#{h submit_title}: “#{h article.title}”">#{image_tag}</a>)
+    end.join(" ")
   end
 
   def category_links(article)
The code is begging for a little refactoring love, but I’m off for vacation
in about twenty minutes, so it will have to wait.
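For what it is worth, here is one shape that refactoring might take, written as a standalone sketch outside of Rails. The names `SUBMIT_TARGETS` and `submit_links` are hypothetical, plain img tags stand in for the `image_tag` helper, the /images/ path is assumed from the public/images note above, and ERB::Util from the standard library plays the part of the `u` and `h` helpers.

```ruby
require "erb"

# Data-driven version of the link builder. The table could later move to
# a database, as the comment in the patch suggests.
SUBMIT_TARGETS = [
  { label: "Submit to Reddit.com",
    url:   "http://reddit.com/submit?url=<URL>&title=<TITLE>",
    image: "reddit.gif" },
  { label: "Save to del.icio.us",
    url:   "http://del.icio.us/post?v=2&url=<URL>&title=<TITLE>",
    image: "delicious.gif" }
].freeze

def submit_links(article_url, article_title)
  u_url   = ERB::Util.url_encode(article_url)
  u_title = ERB::Util.url_encode(article_title)
  SUBMIT_TARGETS.map do |target|
    # Block form of gsub inserts the replacement literally.
    href = target[:url].gsub("<URL>") { u_url }.gsub("<TITLE>") { u_title }
    %(<a href="#{ERB::Util.html_escape(href)}" ) +
      %(title="#{ERB::Util.html_escape(target[:label])}">) +
      %(<img src="/images/#{target[:image]}" alt="" /></a>)
  end.join(" ")
end

puts submit_links("http://example.com/articles/1", "Tom & Jerry")
```

Keeping the targets in one table means that adding a third site is a data change rather than a code change.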
Posted in site news, typo, hacks
Tags delicous, reddit, typo

Posted by Tom Moertel
Mon, 24 Jul 2006 17:34:00 GMT
If my blog looks a little weird right now, please bear with me. I am
in the process of upgrading from Typo 2.6.0 to Typo 4.0, and so far
the process has been somewhat painful.
The new Typo installer did not have much luck upgrading my blog to the
new version. After fighting and solving a succession of errors and
confidence-sapping problems, I decided to abandon the upgrade
process. Instead, I changed to the course most likely to result in a
stable configuration: to install a new blog and then move my content
over to it.
The content-moving process was easier than it might sound. I manually
migrated the old blog database to the new database format; dumped it
to a SQL file; edited the file to remove all but the INSERT statements
for articles, comments, pages, and so on; and then I loaded the
statements into the new database.
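The editing step could be approximated with a small filter like the one below. This is a sketch under stated assumptions, not the procedure I followed by hand: it assumes one SQL statement per line (true of a dump only when values contain no embedded newlines), and the table list and function name are illustrative.

```ruby
# Keep only the INSERT statements for chosen tables from a SQL dump.
# Assumes one statement per line; the table names are illustrative.
KEEP_TABLES = %w[articles comments pages].freeze

def content_inserts(dump, tables = KEEP_TABLES)
  names = tables.map { |t| Regexp.escape(t) }.join("|")
  pattern = /\AINSERT INTO "?(#{names})"?\s/i
  dump.each_line.select { |line| line =~ pattern }.join
end

dump = <<~SQL
  CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
  INSERT INTO "articles" VALUES(1,'Hello');
  INSERT INTO "sidebars" VALUES(1,'config');
  CREATE INDEX idx ON comments (article_id);
  INSERT INTO comments VALUES(7,1,'Nice post');
SQL

puts content_inserts(dump)
```

Leaving out tables such as the sidebar configuration is what lets the new installation keep its own defaults.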
I did not copy over my configuration and sidebar information, however,
because I figured it would be safer to use the Typo-4.0 defaults,
those being the most tested. I also recreated my user account from
scratch.
So far the blog seems stable enough, at least, for me to restore
public access. But I still have more
restoration ahead. Next I will work on restoring my espresso theme.
Update 2006-07-26: I have now restored my espresso theme. For a while I was considering using Scribbish, which is delightfully clean by comparison, but it has not yet been updated to support much of Typo 4.0’s goodness. Maybe later.
Posted in site news, typo
Tags typo

Posted by Tom Moertel
Mon, 16 Jan 2006 06:34:00 GMT
I noticed that my site has been picking up more comment spam recently.
Typo has built-in spam protection, but for
some reason a few spam comments that ought to have been caught slipped
through its filters. Curious, I investigated.
Most spam comments contain links to sites favored by the spammers.
The sites are almost always of the form x.domain.com,
where domain is one of a few higher-level domains and x is drawn
from a large set of values from the realms of gambling, pornography,
and male enhancement. It seems that the spammers pay for a few real
domains and then create a ton of subdomains under them.
One of the ways to detect comment spam is to find URIs in comments and
look up the sites they point to in DNS-based SURBLs, such as
multi.surbl.org and bsb.empty.us. The thing is, when SURBLs list a
spammy site x.domain.com, sometimes they list it under the full
hostname x.domain.com and sometimes they list it under the
higher-level domain domain.com. To be safe, Typo looks up both forms
when it checks for spam.
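Concretely, a lookup means appending the list's zone to the name in question and asking DNS for the result. A minimal sketch of building both query names follows; the function name is made up, and taking the last two labels as "the domain" is a simplifying assumption that is wrong for suffixes such as co.uk.

```ruby
# Build the DNS names to look up for a host in a SURBL-style list.
# Taking the last two labels as "the domain" is a simplification.
def rbl_query_names(host, rbl)
  labels = host.downcase.split(".")
  domain = labels.last(2).join(".")
  [host.downcase, domain].uniq.map { |name| "#{name}.#{rbl}" }
end

p rbl_query_names("x.domain.com", "multi.surbl.org")
p rbl_query_names("domain.com", "multi.surbl.org")
```

When the host already is the bare domain, the two forms coincide and only one query is needed.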
Here’s the code it uses:
HOST_RBLS.each do |rbl|
  begin
    if [
      IPSocket.getaddress([host, rbl].join('.')),
      IPSocket.getaddress((domain + [rbl]).join('.'))
    ].include?("127.0.0.2")
      throw :hit, "#{rbl} positively resolved #{domain.join('.')}"
    end
  rescue SocketError
  end
end
The code iterates over the list of SURBLs it has and queries each
twice – once for the host and once for the domain in question – saving
the results of the queries in an array. Then if the array includes a
positive response (127.0.0.2), it throws a “hit” notice to the
calling code, which will block the associated comment.
Unfortunately, the code doesn’t quite work as intended. Although a
positive response for either the host or the domain should register
as a hit, the code requires both queries to return positive
responses. As a result, the code yields a lot of false negatives
because most lists don’t include both host and domain forms of spammy
sites; the required double positive is thus hard to obtain.
The cause of the problem is the attempt to query for both forms of the
site before checking either response. The queries are performed by
calling IPSocket.getaddress, which performs a DNS query
for the “A” record associated with its argument. If the record
exists, the call returns it; otherwise, the call raises a
SocketError exception.
The exception is what causes the logic to break down. When either the
host or domain is not in the queried SURBL, which will almost always
be the case for reasons I explained earlier, one of the queries will
result in a SocketError exception. The exception will be
caught by the rescue clause later in the code, but not
before the opportunity to test the other query’s response and throw a
“hit” has been lost.
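The failure is easy to reproduce without touching DNS. In this sketch a fake resolver stands in for IPSocket.getaddress: names in a small table resolve, and everything else raises SocketError. The list name rbl.example and both function names are made up for the demonstration.

```ruby
require "socket"

# Fake resolver: only names present in LISTED "resolve"; anything else
# raises SocketError, as a failed address lookup would.
LISTED = { "domain.com.rbl.example" => "127.0.0.2" }.freeze

def fake_getaddress(name)
  LISTED.fetch(name) { raise SocketError, "no address for #{name}" }
end

# Same shape as the original check: both lookups happen inside the array
# literal, so one failure aborts the whole expression.
def both_at_once(host, domain, rbl)
  [
    fake_getaddress("#{host}.#{rbl}"),
    fake_getaddress("#{domain}.#{rbl}")
  ].include?("127.0.0.2")
rescue SocketError
  false
end

# Each lookup gets its own rescue, so a miss on one form cannot hide a hit
# on the other.
def one_at_a_time(host, domain, rbl)
  [host, domain].any? do |name|
    begin
      fake_getaddress("#{name}.#{rbl}")
      true
    rescue SocketError
      false
    end
  end
end

puts both_at_once("x.domain.com", "domain.com", "rbl.example")   # false
puts one_at_a_time("x.domain.com", "domain.com", "rbl.example")  # true
```

Only the domain form is listed here, which is the common case, and the all-at-once version misses it.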
My fix was to replace the above code with a call to a new helper
method:
query_rbls(HOST_RBLS, host, domain.join('.'))
The helper, defined later, makes the actual queries:
def query_rbls(rbls, *subdomains)
  rbls.each do |rbl|
    subdomains.uniq.each do |d|
      begin
        response = IPSocket.getaddress([d, rbl].join('.'))
        throw :hit, "#{rbl} positively resolved #{d} => #{response}"
      rescue SocketError
      end
    end
  end
  return false
end
Because some SURBLs don’t use 127.0.0.2 but some other “A” record to
indicate a positive response, my helper removes the hard-coded address
test.
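To show how the helper meets its caller, here is a variant with the resolver passed in so that it can be exercised without real DNS. The `resolver` parameter, the list name rbl.example, and the 127.0.0.4 answer are assumptions made for this sketch; Typo's calling code is not shown above, so the `catch` block is only my guess at its general shape.

```ruby
require "socket"

# The helper from above, with an injectable resolver for testing.
def query_rbls(rbls, *subdomains, resolver: IPSocket.method(:getaddress))
  rbls.each do |rbl|
    subdomains.uniq.each do |d|
      begin
        response = resolver.call([d, rbl].join("."))
        throw :hit, "#{rbl} positively resolved #{d} => #{response}"
      rescue SocketError
        # not listed under this name; try the next one
      end
    end
  end
  false
end

listed = { "domain.com.rbl.example" => "127.0.0.4" }
fake = ->(name) { listed.fetch(name) { raise SocketError, "not found" } }

# `catch` returns the thrown message on a hit, or the block's value (false).
verdict = catch(:hit) do
  query_rbls(["rbl.example"], "x.domain.com", "domain.com", resolver: fake)
end
puts verdict
```

Any successful resolution counts as a hit here, whatever address comes back, which matches the decision to drop the hard-coded 127.0.0.2 test.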
I also made a few more improvements to the spam-protection
code. The full set of changes is available as Patch
657 on the Typo Trac site.
Posted in typo
Tags ruby, spam, typo
