I have written a short novel's worth of content for my blog

Posted by Tom Moertel Fri, 30 Mar 2007 04:34:00 GMT

How much content have I written for my blog? Let’s find out.

My blog runs on Typo, which is built upon Ruby on Rails. Let’s fire up the Rails console and gather a quick word count:

$ cd ~/blog
$ ruby script/console 
Loading development environment.
>> require 'article'
=> true
>> Article.find(:all).inject(0) { |sum,a| sum +=
       (a.body + a.extended.to_s).split(/\s+/).length }
=> 66665

So I have written about 66 kilo-words, which is entering novel territory. Paperback-wise, it’s about 190 pages.

All I need now is a villain and some cool cover art.

;-)

Posted in , ,
Tags , , , ,
2 comments
no trackbacks
Reddit Delicious

Directory-tree printing in Haskell, part three: lazy I/O

Posted by Tom Moertel Wed, 28 Mar 2007 19:40:00 GMT

This article is part three in a series on introductory Haskell programming. In the first article, we wrote a small program to recursively scan file-system directories and print their contents as ASCII-art trees. In the second article, we refactored the program to make its logic more reusable by separating the directory-scanning logic from the tree-printing logic. In this article, we will address a shortcoming of the refactored version: It must scan directory hierarchies completely before printing their trees, i.e., it must scan and then print, when doing both simultaneously is both more efficient and more user friendly.

Recall from the previous article that our directory-printing program is factored into three pieces of logic:

  1. fsTraverse, which traverses a file-system hierarchy and returns a tree data structure;
  2. showTree, which converts a tree into lovingly crafted ASCII art; and
  3. traverseAndPrint, which prints the tree for a file-system hierarchy by using the first two pieces of logic.

The types of the functions are as follows:

fsTraverse       :: Path -> DentName -> IO DirTree
showTree         :: Tree String -> String
traverseAndPrint :: Path -> IO ()

Note that showTree is a pure function, but the other two return IO actions that may have side effects.

Within traverseAndPrint, fsTraverse and showTree are combined into a composite IO action by the =<< combinator:

putStr . showTree =<< fsTraverse root path

The sequencing semantics of Haskell’s IO monad forces all of the effects of fsTraverse to complete before any following effects can begin. To better understand these sequencing semantics, let’s consider a simple example.

The IO-monad code,

a >> b

can loosely be interpreted as running the action a, which forces its side effects to occur, and then running the action b, which forces its side effects to occur.

In reality, a and b are not actions. They are functions. Like all Haskell functions, they are pure and have no side effects. It’s just that a and b return values that represent actions, and those actions may have side effects, and the semantics of the IO monad guarantee the ordering of those effects (should the actions end up being connected to the runtime’s top-level IO action and executed). If you think that’s weird, hold that thought. For now, all that’s important is that, if the composite action represented by the expression (a >> b) is executed, the effects of a, regardless of how complex, will be executed before the effects of b.

Thus if a represents building a tree by recursively scanning a file-system hierarchy, the entire tree must be built before b ever gets a chance to do its thing. For our particular application, however, that particular sequencing is suboptimal. We know from our earlier, monolithic implementation that the file-system hierarchy can be scanned and printed simultaneously, which is more efficient. Ideally, then, our refactored implementation should be just as efficient.

In this article, we will look at one way to maintain the clean, logical separation of the a part from the b part while allowing the parts’ effects to be interleaved for efficiency. We will use an extension to the Haskell language to make the directory-scanning action lazy so that it builds the tree as the tree is consumed.

Ready? Let’s dive in.

Read more...

Posted in
Tags , , , , ,
9 comments
no trackbacks
Reddit Delicious

Debate to learn. Learn to debate.

Posted by Tom Moertel Tue, 27 Mar 2007 03:23:00 GMT

Recently, Reg Braithwaite wrote about the ad hominem fallacy. His article reminded me that debating – the art and science of constructing sound arguments in the face of opposition – is a valuable skill.

Though many online debates devolve into name calling and other foolishness, most are rich opportunities to learn – if learning is your goal. So make learning your goal.

Look at each debate you enter as a chance to discover something new. When you participate, assume the other participants are good people, who deserve an honest argument from you. If you learn the fundamentals of logic and clear thinking, it’s easy to stay in the debate, contribute, and increase your (and their) chances to learn.

Many people, however, overlook the opportunity to learn in order to pursue the opportunity to win. What a mistake. If the price of winning is ignorance, can you afford the purchase?

Therefore, when I debate, I make considerable efforts to be rational and reasonable. Even so, it’s hard not to say the wrong thing when a debate gets heated. To help keep me in the right frame of mind, I use a simple, idealized debating model.

I wrote about this model six years ago on Kuro5hin, but it’s worth revisiting. The model is not magic, and I doubt it’s novel, but it has helped me. Maybe it can help you, too.

Here it is:

  • The motivation for debating is to arrive at a better understanding of reality (i.e., the truth).
  • All participants share this motivation.
  • All participants are intelligent, rational human beings, each fully capable of drawing logical conclusions from facts.
  • The reason for disagreements is not because participants want to disagree but rather because their understandings of the facts differ.
  • Therefore, the objective of debating is to share information until the participants can bring their understandings of the facts into alignment, which will allow for agreement or at least consensus.

I know that the model and reality part ways at the outset. When I debate, however, I pretend the model is reality. I do this is because it allows me to participate earnestly. Forcing myself to make meaningful contributions increases the chance that debates will end in somebody learning something useful.

Nevertheless, debates often go wrong. That’s the second reason I use the model. It gives me something to compare real debates with so that problems are easy to spot and classify. If a key participant in a debate makes personal attacks or refuses to accept demonstrated facts, for example, the problem is easy to see and classify: It is a debate killer. Time to move on.

Another tool that has helped me stay on track is D. Q. McInerny’s wonderful introduction to logic, Being Logical: A Guide to Good Thinking. This short book, inspired by Strunk and White’s classic, tiny text on writing, The Elements of Style, introduces the foundations of logic, explains how to construct sound arguments, and prepares you to recognize and avoid illogical thinking (fallacies). The book is a pleasure to read and makes a handy reference (I keep mine within arm’s reach). If you need a quick refresher on clear thinking, add this delightful book to your toolbox.

Don’t forget: Every debate is an opportunity to learn. So when debating, make learning your goal. And if you learn to debate, you will have an easier time debating and learning.

Debate to learn. Learn to debate.

Posted in
Tags , , , , , ,
4 comments
no trackbacks
Reddit Delicious

The IMDB Movie Rating Decoder Ring: updated w/ 2 March 2007 data

Posted by Tom Moertel Fri, 09 Mar 2007 22:40:00 GMT

If you want to get more out of IMDB movie ratings, check out my IMDB Movie Rating Decoder Ring, now updated with fresher data (as of 2 March 2007).

Posted in
Tags , , , , , ,
1 comment
no trackbacks
Reddit Delicious

Directory-tree printing in Haskell, part two: refactoring

Posted by Tom Moertel Wed, 07 Mar 2007 21:04:00 GMT

In my previous article on writing a simple directory-tree printer in Haskell, I wrote a small program to recursively scan file-system directories and print their contents as ASCII-art trees. The program made for an approachable example of how to use Haskell for “imperative” tasks, but it has a few problems.

First, the directory-scanning logic and tree-printing logic are intertwined. Neither is reusable. Second, both bits of logic are rigid, specialized for this particular task. Even if you could reuse them, you wouldn’t want to.

In this article, the second in a series, we will explore ways to make our original code more reusable. We will separate the directory scanning from the tree printing, harness the power of some old friends from Haskell’s libraries, and think about the costs and benefits of our changes.

The plan

Recall our original directory tree–listing solution, the core of which I will reprint below:

tlist path =
    visit (if "/" `isPrefixOf` path then "" else ".") "" "" "" path

visit path leader tie arm node = do
    putStrLn (leader ++ arm ++ tie ++ node)
    visitChildren (path ++ "/" ++ node) (leader ++ extension)
  where
    extension = case arm of ""  -> ""; "`" -> "    "; _   -> "|   "

visitChildren path leader =
    whenM (doesDirectoryExist path) $ do
        contents <- getDirectoryContents path
            `catch` (\e -> return [show e])
        let visibles = sort . filter (`notElem` [".", ".."]) $ contents
            arms = replicate (length visibles - 1) "|" ++ ["`"]
        zipWithM_ (visit path leader "-- ") arms visibles

The tlist function kicks off the process for a particular file-system path, handing off to visit which recursively descends the directory tree from the root node. The visit function calls visitChildren to expand the subtree, if any, for each node visited. The visitChildren function, in turn, calls back to visit to repeat the process for each child in the subtree. In effect, we are traversing the tree rooted at path, printing each node in passing.

To separate the traversal part from the printing part, we will introduce a tree data structure. The file system–traversal code will emit a tree, and the tree-showing code will consume a tree. We will rewrite our old tlist function, which we might as well rename to the more descriptive traverseAndPrint, to glue the two pieces together with the tree serving as glue:

traverseAndPrint :: Path -> IO ()
traverseAndPrint path = do
    tree <- fsTraverse root path
    putStrLn (showTree tree)
  where
    root = if "/" `isPrefixOf` path then "" else "."

That’s the plan. Now let’s carry it out.

Read more...

Posted in
Tags , , , ,
8 comments
no trackbacks
Reddit Delicious