Posted by Tom Moertel
Tue, 25 Sep 2007 23:04:00 GMT
I am on the planning committee for the Pittsburgh Perl
Workshop. So far, it’s been an interesting
ride. Last year was the first PPW, and it went surprisingly well. In
the post-conference surveys, 94 percent of respondents said they
wanted to come back for another PPW, so we committed ourselves to
repeating the grueling conference-planning process for
2007. (Fact: you’re much more likely to make big commitments like
this if you’re drinking beer at the time.)
Now a year has gone by, and PPW 2007 is
only three weeks away. This year’s conference is 100% larger – two
full days – and offers a new, much-asked-for option: a
one-day introductory course
to give programmers new to Perl a quick dose of the language so they
can dive into the rest of the conference. This year’s conference also
offers a full-length Hackathon for those who feel the urge to code
at the conference.
The main attraction, however, is the conference’s wide array of
technical talks. We have retained
the same mix of industry and academic speakers that attendees said
they liked so much last year. Indeed, our speaker list includes some
of last year’s most fascinating speakers, as well as many new speakers
drawn from the world of Perl. No matter what your interests are,
you’ll find talks for you at PPW 2007. (I’m particularly interested
in the talks on continuation-based web applications, the cool new
stuff in Perl 5.10, and the Moose object system.)
All of this is to say: Do not miss PPW 2007! Where
else are you going to find so many interesting people, so many
fascinating talks, and so many opportunities to have fun and make
friends while learning useful stuff, all for so little expense?
(Regular admission is only $70, and students get a big discount.)
Get your ticket now because
over half of the seats are already gone.
I hope to see you there.
Posted in perl
Tags conferences, perl, pittsburgh, ppw, ppw2007

Posted by Tom Moertel
Tue, 11 Sep 2007 16:51:00 GMT
If you’re reading my blog via Bloglines, you
may have noticed that some of my posts look terrible, especially when
they contain code snippets. I am sorry for that, but it’s not my
fault. Bloglines doesn’t handle white space properly.
Here’s the more detailed explanation. When you request one of my
feeds in, say, Atom format, you get back a bunch of XML that contains
the most-recent posts from my blog. Each post is represented as
lovingly crafted HTML, escaped per the Atom specs. When Bloglines
gets its hands on this very same HTML, it attempts to scrub it nice
and clean – get rid of any naughty bits, you know. And there’s
nothing wrong with that. Except when the scrubbing goes horribly,
horribly wrong. Which is exactly what happens when Bloglines
encounters perfectly legitimate markup that represents
syntax-highlighted code snippets.
What does Bloglines do then? It strips out all of the significant white
space, turning each block of code into a single, mile-long,
unbreakable line of NoSpaceText that forces your web browser to expand
the page until it is wide enough to enshroud a small solar system. Then
you are forced to scroll forever to read each line of the text
column. Ugh.
More specifically, each syntax-highlighted code block is
represented in HTML as a preformatted (PRE) text block.
Each word in that block is wrapped in a SPAN element
whose class attribute indicates the word’s role in the
original source code. Keywords get one
class, identifiers another, and so on. For example,
the code “import List” might be represented
as follows:
<span class="kwd">import</span> <span class="name">List</span>
But when Bloglines gets its hands on that markup, it strips
out the whitespace between the SPAN elements:
<span class="kwd">import</span><span class="name">List</span>
Thus the markup renders as “importList” when it hits your web
browser. Now imagine the same space-denuding bad behavior applied to
all of the inter-element white space in a full-length block of code.
That’s right, what you end up with is a single, insanely long
LineOfUnbreakableText that your web browser chokes on. Again:
Ugh.
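To make the effect concrete, here is a minimal Haskell sketch. It is not how Bloglines or any browser actually processes markup: visibleText is a hypothetical helper that simply drops tags, which is enough to approximate the text that ends up on screen inside a PRE block for the two snippets above.

```haskell
-- Hypothetical helper (an assumption for illustration): drop everything
-- from '<' through the next '>' and keep the remaining characters,
-- whitespace included.
visibleText :: String -> String
visibleText []         = []
visibleText ('<':rest) = visibleText (drop 1 (dropWhile (/= '>') rest))
visibleText (c:rest)   = c : visibleText rest

-- The markup as my blog emits it, and as it looks after the scrubbing.
original, scrubbed :: String
original = "<span class=\"kwd\">import</span> <span class=\"name\">List</span>"
scrubbed = "<span class=\"kwd\">import</span><span class=\"name\">List</span>"
```

Loaded into GHCi, the two strings differ only by the one space that mattered:
*Main> visibleText original
"import List"
*Main> visibleText scrubbed
"importList"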
The folks at Bloglines have had similar problems in the past, most of which have been fixed. I hope they fix this particular problem soon, too.
Until that time, however, you might want to consider other feed readers.
Posted in rants
Tags atom, bloglines, html, markup, rants, xml

Posted by Tom Moertel
Sat, 08 Sep 2007 00:19:00 GMT
I just released an updated version of cabal2rpm, a small program (written in Perl) that creates RPM spec files from Cabal package descriptions. RPM is the software-packaging format used by several popular Linux distributions, including Red Hat and Fedora. Cabal is the packaging format used by the Haskell community to distribute software written in Haskell.
Bryan O’Sullivan’s cabal-rpm also creates spec files from Cabal packages. Unlike cabal2rpm, it is written in Haskell and directly interfaces with the Cabal libraries. Long term, it is the way to go. For now, however, cabal2rpm may be more convenient because it works out of the box. (To use cabal-rpm, you’ll first need to install the just-tagged Cabal 1.2.0 library, not yet in wide distribution.)
Posted in haskell
Tags cabal, cabal2rpm, haskell

Posted by Tom Moertel
Sat, 01 Sep 2007 19:39:00 GMT
Via Reddit I found Mark Nelson’s post about a recent word puzzle from NPR’s
Weekend Edition:
Take the names of two U.S. States, mix them all together, then rearrange the letters to form the names of two other U.S. States. What states are these?
The puzzle is fairly straightforward to solve by hand (think about
it), but let’s write a program to solve it. That will give us a convenient
excuse to discuss a super-handy function I use all the time:
clusterBy. In Haskell, it looks like this:
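import Control.Arrow ((&&&))
import Data.Char (isAlpha)        -- used by signature, below
import Data.List (sort)           -- used as a signature function, below
import qualified Data.Map as M

-- Group values by signature; groups come back in ascending signature order.
clusterBy :: Ord b => (a -> b) -> [a] -> [[a]]
clusterBy f = M.elems . M.map reverse . M.fromListWith (++)
            . map (f &&& return)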
What clusterBy does is group a list of values by their signatures,
as computed by a given signature function f, and return
the groups in order of ascending signature. For example, we
can cluster the words “the tan ant gets some fat” by length, by
first letter, or by last letter just by changing the
signature function we give to clusterBy:
*Main> let antwords = words "the tan ant gets some fat"
*Main> clusterBy length antwords
[["the","tan","ant","fat"],["gets","some"]]
*Main> clusterBy head antwords
[["ant"],["fat"],["gets"],["some"],["the","tan"]]
*Main> clusterBy last antwords
[["the","some"],["tan"],["gets"],["ant","fat"]]
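The one-liner packs three steps together, and it may help to see them separately. The sketch below replays clusterBy length on the same words; the names tagged, merged, and restored are mine, for illustration only. The point to notice is that M.fromListWith calls its combining function with the new value first, so (++) builds each group back to front, which is why the final reverse is needed.

```haskell
import Control.Arrow ((&&&))
import qualified Data.Map as M

-- Step 1: pair each word with its signature and wrap the word in a
-- singleton list (that is what return does in the list monad).
tagged :: [(Int, [String])]
tagged = map (length &&& return) (words "the tan ant gets some fat")

-- Step 2: merge the singleton lists that share a signature. Each new
-- list is prepended to what is already there, so groups end up reversed.
merged :: M.Map Int [String]
merged = M.fromListWith (++) tagged

-- Step 3: restore the input order within each group and drop the keys;
-- M.elems yields the groups in ascending key order.
restored :: [[String]]
restored = M.elems (M.map reverse merged)
```

In GHCi, the intermediate map shows the reversed groups, and the last step matches the clusterBy length result above:
*Main> M.toList merged
[(3,["fat","ant","tan","the"]),(4,["some","gets"])]
*Main> restored
[["the","tan","ant","fat"],["gets","some"]]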
If we use sort as the signature function, we can find anagrams:
*Main> clusterBy sort antwords
[["fat"],["tan","ant"],["gets"],["the"],["some"]]
And that brings us back to the original puzzle. To find the solution,
we treat each unique pair of state names as a single “word” and
find the anagrams among a list of such “words.”
Assuming we are given
a list of state names on standard input, one state per line, we can
write the shell of our solution as follows:
main = mapM_ print . solve . lines =<< getContents
The shell delegates the real work to solve. Its job is to
compute the unique, 2-state combinations from the original
list of states, and then find the anagrams among these combinations.
As before, finding the anagrams is simply a matter of calling
clusterBy with the right signature function. We also filter
out the trivial results, which are not valid solutions:
solve = filter ((>1) . length) . clusterBy signature . ucombos
ucombos xs = [[x,y] | x <- xs, y <- xs, x < y]
signature = sort . filter isAlpha . concat
That’s it. Now we can solve the puzzle by feeding our program a list of states:
$ runhaskell anagrams2.hs < states.txt
[["NORTH CAROLINA","SOUTH DAKOTA"],
["NORTH DAKOTA","SOUTH CAROLINA"]]
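If you don’t have a states.txt handy, the pieces can also be checked in GHCi. The sketch below gathers the definitions from this post into one module; the type signatures and the five-name sample list are my additions for illustration, not part of the original program.

```haskell
import Control.Arrow ((&&&))
import Data.Char (isAlpha)
import Data.List (sort)
import qualified Data.Map as M

-- clusterBy, solve, ucombos, and signature as given in the post,
-- with type signatures added.
clusterBy :: Ord b => (a -> b) -> [a] -> [[a]]
clusterBy f = M.elems . M.map reverse . M.fromListWith (++)
            . map (f &&& return)

-- Keep only the clusters that contain more than one pair.
solve :: [String] -> [[[String]]]
solve = filter ((> 1) . length) . clusterBy signature . ucombos

-- Every two-element combination, each listed once with x before y.
ucombos :: Ord a => [a] -> [[a]]
ucombos xs = [[x, y] | x <- xs, y <- xs, x < y]

-- The sorted letters of all names in a combination, spaces dropped.
signature :: [String] -> String
signature = sort . filter isAlpha . concat

-- Hypothetical five-name input standing in for states.txt.
sample :: [String]
sample = [ "NORTH CAROLINA", "NORTH DAKOTA", "OHIO"
         , "SOUTH CAROLINA", "SOUTH DAKOTA" ]
```

A short session shows each helper doing its part:
*Main> ucombos ["A","B","C"]
[["A","B"],["A","C"],["B","C"]]
*Main> signature ["NORTH CAROLINA","SOUTH DAKOTA"] == signature ["NORTH DAKOTA","SOUTH CAROLINA"]
True
*Main> solve sample
[[["NORTH CAROLINA","SOUTH DAKOTA"],["NORTH DAKOTA","SOUTH CAROLINA"]]]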
What a handy little function, that clusterBy.
Posted in programming
Tags clusterby, functions, haskell, hof, puzzles
