Directory-tree printing in Haskell, part three: lazy I/O

Tags: directory-tree-series, haskell, io, trees, laziness, lazy

This article is part three in a series on introductory Haskell programming. In the first article, we wrote a small program to recursively scan file-system directories and print their contents as ASCII-art trees. In the second article, we refactored the program to make its logic more reusable by separating the directory-scanning logic from the tree-printing logic. In this article, we will address a shortcoming of the refactored version: it must scan a directory hierarchy completely before printing its tree, i.e., it must scan and then print, even though scanning and printing simultaneously is both more efficient and more user-friendly.

Recall from the previous article that our directory-printing program is factored into three pieces of logic:

  1. fsTraverse, which traverses a file-system hierarchy and returns a tree data structure;
  2. showTree, which converts a tree into lovingly crafted ASCII art; and
  3. traverseAndPrint, which prints the tree for a file-system hierarchy by using the first two pieces of logic.

The types of the functions are as follows:

fsTraverse       :: Path -> DentName -> IO DirTree
showTree         :: Tree String -> String
traverseAndPrint :: Path -> IO ()

Note that showTree is a pure function, but the other two return IO actions that may have side effects.

Within traverseAndPrint, fsTraverse and showTree are combined into a composite IO action by the =<< combinator:

putStr . showTree =<< fsTraverse root path

The sequencing semantics of Haskell’s IO monad forces all of the effects of fsTraverse to complete before any following effects can begin. To better understand these sequencing semantics, let’s consider a simple example.

The IO-monad code,

a >> b

can loosely be interpreted as running the action a, which forces its side effects to occur, and then running the action b, which forces its side effects to occur.

In reality, a and b are not actions in the imperative sense. They are ordinary Haskell expressions and, like all Haskell expressions, they are pure and have no side effects. It’s just that a and b evaluate to values that represent actions, those actions may have side effects, and the semantics of the IO monad guarantee the ordering of those effects. If you think that’s weird, hold that thought. For now, all that’s important is that, if the composite action represented by the expression is executed, the effects of a, regardless of how complex, will be executed before the effects of b.
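For example, substituting two concrete actions for a and b gives a composite action whose output order is guaranteed:

putStrLn "first" >> putStrLn "second"   -- always prints "first" before "second"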

Thus if a represents building a tree by recursively scanning a file-system hierarchy, the entire tree must be built before b ever gets a chance to do its thing. For our particular application, however, that particular sequencing is suboptimal. We know from our earlier, monolithic implementation that the file-system hierarchy can be scanned and printed simultaneously, which is more efficient. Ideally, then, our refactored implementation should be just as efficient.

In this article, we will look at one way to maintain the clean, logical separation of the a part from the b part while allowing the parts’ effects to be interleaved for efficiency. We will use an extension to the Haskell language to make the directory-scanning action lazy so that it builds the tree as the tree is consumed.

Ready? Let’s dive in.

Lazy I/O

I’ll be up front: we are going to be naughty. We are going to break the effect-ordering guarantees of the IO monad, but, if we are careful, we can get away with it. In fact, if you have used Haskell’s getContents or readFile functions, you have already used the fruits of this very naughtiness.

Before we go down the dark path, however, let’s spend a moment understanding why we might want to be naughty. Currently, our tree-building code builds the entire tree for a file-system hierarchy as one large IO action. If the hierarchy turns out to be large, the tree could gobble up lots of memory before the printing process has the chance to consume it. Further, if the underlying file-system operations are expensive, we could wait a long time before seeing any output from our program.

If we could produce the tree lazily, both of these problems would go away. As we built the tree, our tree-printing process would consume it, keeping the overall memory footprint low and keeping us informed of progress with timely output.

To better visualize our tree-building code’s laziness or lack thereof, let’s artificially raise the cost of visiting a file-system node. We will introduce fsVisit to simulate a costly file-system operation that takes 0.1 second to complete:

fsVisit :: DirNode -> IO DentName
fsVisit (_, name) = do
    usleep 100000  -- sleep for 100,000 microseconds (0.1 second)
    return name
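A side note: usleep comes from the unix package’s System.Posix.Unistd module and so is available only on POSIX systems. If it is unavailable in your environment, threadDelay from Control.Concurrent (which also takes microseconds) makes a portable stand-in; here is a sketch of the same simulated delay:

import Control.Concurrent (threadDelay)  -- add to imports

fsVisit :: DirNode -> IO DentName
fsVisit (_, name) = do
    threadDelay 100000  -- 100,000 microseconds = 0.1 second
    return name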

Now let’s work fsVisit into each step of our file-system-traversal logic by revising fsTraverseStep:

fsTraverseStep :: DirNode -> IO (DentName, [DirNode])
fsTraverseStep dnode@(path, node) = do
    name     <- fsVisit dnode
    children <- fsGetChildren (path ++ "/" ++ node)
    return (name, children)

Now we have a slow version of our original tlist program. I recommend that you download the source code and compile it so that you can run the program locally and experience the timing of its output.

If you run the program on a directory that contains a handful of files, it will pause for a second or two while it performs our artificially expensive directory scan. Only when the scan has completed will it print the directory tree, all at once.

Now let’s make the tree-building process lazy. For this, we need the unsafeInterleaveIO combinator, an extension to Haskell 98. Here is its type signature:

unsafeInterleaveIO :: IO a -> IO a

What the combinator does is convert an action into a promise to perform the action later, should its result be needed. It takes an IO action, which might take a long time to carry out, and returns a substitute action that produces a value immediately. That value, however, is a placeholder: when you try to consume it, you will be made to wait while the real value is computed by performing the original action.
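Here is a quick sketch of that behavior in isolation (this snippet is not part of the tlist program; it assumes only that System.IO.Unsafe is imported):

import System.IO.Unsafe (unsafeInterleaveIO)

main :: IO ()
main = do
    s <- unsafeInterleaveIO (putStrLn "performing deferred action" >> return "result")
    putStrLn "before the result is needed"
    putStrLn s  -- forcing s performs the deferred action only now

Running it prints “before the result is needed” first, then “performing deferred action”, and finally “result”.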

This ability to defer work via promises is exactly what we need to make our directory scans lazy. We can return each node of the tree as a pair of promises, one to compute the node’s root, and one to compute the node’s children. Let’s modify fsTraverseStep to do just that:

import System.IO.Unsafe -- add to imports at top of code

fsTraverseStep dnode@(path, node) = do
    name     <- unsafeInterleaveIO $ fsVisit dnode
    children <- unsafeInterleaveIO $
                fsGetChildren (path ++ "/" ++ node)
    return (name, children)

Now run this modified version of the program. Notice how each line of the directory tree is printed as it is visited in the file-system scan. We have made our tree-building process lazy!

Or have we? Let’s modify fsGetChildren to print the list of visible children as they are discovered:

fsGetChildren path = do
    contents <- getDirectoryContents path `catch` const (return [])
    let visibles = sort . filter (`notElem` [".", ".."]) $ contents
    print visibles  -- <--- add this line
    return (map ((,) path) visibles)

Recompiling and re-running our “lazy” program shows that all of the children are found before any of the printing occurs. In other words, even though we are visiting each node lazily, we are still finding all of the nodes in the hierarchy before the first node of the resulting tree is emitted from fsTraverse.

To see why, let’s examine the source code for unfoldTreeM (shown here specialized to the IO monad), which is the library function we used to build the tree using our fsTraverseStep function:

unfoldTreeM :: (b -> IO (a, [b])) -> b -> IO (Tree a)
unfoldTreeM step seed = do
    (root, seeds) <- step seed                -- (1)
    children <- mapM (unfoldTreeM step) seeds -- (2)
    return (Node root children)               -- (3)

When unfoldTreeM calls our step function in line (1), it gets back a pair of promises, one for the root of the tree and another for the seeds from which to build the root’s children. In line (2), we see the problem: In order to convert seeds into a list of child nodes, unfoldTreeM calls mapM to recursively apply itself to seeds. The mapM action is sequenced before the return in line (3), and thus the value of seeds is consumed before the new tree Node can be constructed. In other words, because the mapM action isn’t deferred and because it depends on the value of seeds, it forces us to make good on our seeds promise earlier than we would like.

To get around this problem, we can create a lazy variant of unfoldTreeM. This version defers the mapM action until the children value is consumed:

lazyUnfoldTreeM :: (b -> IO (a, [b])) -> b -> IO (Tree a)
lazyUnfoldTreeM step seed = do
    (root, seeds) <- step seed
    children <- unsafeInterleaveIO $
                mapM (lazyUnfoldTreeM step) seeds
    return (Node root children)

Then we can use this version in our high-level fsTraverse function:

fsTraverse = curry (lazyUnfoldTreeM fsTraverseStep)
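Written point-wise, and recalling that a DirNode is just a (Path, DentName) pair, this definition is equivalent to the following; the curry merely converts the single-pair argument of lazyUnfoldTreeM fsTraverseStep into the two-argument form that fsTraverse’s type requires:

fsTraverse :: Path -> DentName -> IO DirTree
fsTraverse path name = lazyUnfoldTreeM fsTraverseStep (path, name)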

These changes made, we have a fully lazy version of the tree-building code. To see it in action, download this version of the whole program’s source code, compile it, and give it a try. If you add the print line to fsGetChildren like we did earlier, you’ll see that this version expands each directory only when its contents are needed. Now our lazy directory-tree-building code works as we would expect.

Mission accomplished.

Optional: Hiding the plumbing

It’s unfortunate that we had to create our own version of the library function unfoldTreeM. With its introduction, we lost some of our code’s tidiness. Can we get our clean code back?

Maybe we can. Recall that unfoldTreeM actually works for any monad, not just IO; its general type is Monad m => (b -> m (a, [b])) -> b -> m (Tree a). What if we created a LazyIO monad that was just like IO except that each action within it was performed only when its value was consumed? With this monad, we could eliminate the need for a special, lazy version of unfoldTreeM: the normal version would “do the right thing” when used with our monad.

Let’s create our LazyIO monad. First, we’ll need a new module,

module LazyIO
 ( LazyIO(), runLazy, deferIO )
where

and, of course, we’ll need to import unsafeInterleaveIO:

import System.IO.Unsafe (unsafeInterleaveIO)

We will define LazyIO to be a type wrapper around the regular IO type:

newtype LazyIO a = LazyIO { unLazyIO :: IO a }

Note that our module definition, above, does not export the LazyIO type’s data constructor. The only way, then, to construct and destruct values will be through our interface.

To construct a value, we will provide deferIO, which converts a normal IO action into a LazyIO action:

deferIO :: IO a -> LazyIO a
deferIO = LazyIO

To get a regular IO action “out” of our monad, we will provide runLazy, which converts a LazyIO action into an IO action whose work is deferred until its result is consumed:

runLazy :: LazyIO a -> IO a
runLazy = unsafeInterleaveIO . unLazyIO

The key to our monad is that the only way to get an IO action out of LazyIO is through runLazy and thus through unsafeInterleaveIO. In fact, injecting an IO action into our monad and then immediately pulling it out again is identical to deferring it with unsafeInterleaveIO:

runLazy . deferIO = unsafeInterleaveIO
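This identity falls directly out of the definitions:

  runLazy (deferIO m)
= unsafeInterleaveIO (unLazyIO (LazyIO m))  -- expand runLazy and deferIO
= unsafeInterleaveIO m                      -- unLazyIO . LazyIO = id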

But what makes a monad interesting are its return and bind definitions. Here are ours:

instance Monad LazyIO where
    return     = LazyIO . return
    ma >>= fmb = LazyIO $ runLazy ma >>= unLazyIO . fmb

Our return simply returns the given value in the underlying IO monad and wraps the resulting action with the LazyIO constructor. The implication is that the only way to unwrap the action is via runLazy, which will defer it.

Likewise, our bind creates a composite LazyIO action from another LazyIO action and a function that produces a LazyIO action given the result of the first action. The composite action and the first action are deferred.
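One practical note: this instance was written for a Haskell 98-era compiler. If you are building with a modern GHC (7.10 or later), a Monad instance must be accompanied by Functor and Applicative instances; here is one possible pair that is consistent with the return and bind above:

instance Functor LazyIO where
    fmap f = LazyIO . fmap f . runLazy

instance Applicative LazyIO where
    pure      = LazyIO . return
    mf <*> ma = LazyIO $ runLazy mf <*> runLazy ma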

Exercise: Is LazyIO really a monad? Prove that it satisfies the monad laws.

To see the effect of our monad, let’s apply its definitions to the following simple lazy IO action.

runLazy $ do
    x <- return 5
    deferIO (print x)

With a little expansion, application, and simplification, we arrive at the following:

unsafeInterleaveIO $ do
    x <- unsafeInterleaveIO (return 5)
    print x

Note that the entire, composite IO action is deferred, as are the actions upon which it depends, which is exactly what we want.
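If you want to check the intermediate steps yourself, here is one way the expansion can go, after first desugaring the do block:

  runLazy (return 5 >>= \x -> deferIO (print x))
= runLazy (LazyIO (return 5) >>= \x -> LazyIO (print x))        -- return, deferIO
= runLazy (LazyIO (runLazy (LazyIO (return 5)) >>= print))      -- our (>>=)
= unsafeInterleaveIO (unsafeInterleaveIO (return 5) >>= print)  -- runLazy, twice

In the step that applies our (>>=), the composition unLazyIO . (\x -> LazyIO (print x)) simplifies to print.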

Now we can return to housecleaning. With our new monad, we can banish both unsafeInterleaveIO and our custom, lazy version of unfoldTreeM from our otherwise uncluttered code.

Step one of the housecleaning is to modify fsTraverseStep to use the LazyIO monad instead of explicit calls to unsafeInterleaveIO:

import LazyIO   -- add to module imports at top of code

fsTraverseStep :: DirNode -> LazyIO (DentName, [DirNode])
fsTraverseStep dnode@(path, node) = do
    name     <- deferIO (fsVisit dnode)
    children <- deferIO (fsGetChildren (path ++ "/" ++ node))
    return (name, children)

Step two is to rewrite fsTraverse in terms of runLazy and the plain-old unfoldTreeM provided by the Data.Tree library:

fsTraverse :: Path -> DentName -> IO DirTree
fsTraverse p n = runLazy (unfoldTreeM fsTraverseStep (p,n))

Step three is to remove lazyUnfoldTreeM from our code: Snip!

And we’re done. The complete program is in tlist-slow-lazy-final.hs.

To compile, here’s the command to use:

ghc -O2 -o tlist-slow-lazy-final --make tlist-slow-lazy-final.hs

Give the program a test run and convince yourself that it works as advertised.

Why unsafeInterleaveIO is “unsafe”

Now let’s get back to the “unsafe” part of unsafeInterleaveIO. What’s that about?

Haskell programmers label as “unsafe” those functions that allow you to circumvent Haskell’s safety guarantees. It’s a blunt way of letting you know that, if you use one of these functions, you can shoot yourself in the foot if your aim is bad. Thus, by using an “unsafe” function, you place the burden of proving that your code is safe upon yourself.

As we have already seen, we can use unsafeInterleaveIO to get around the side-effect ordering guarantees that Haskell’s IO monad normally provides. If we’re not careful, this could come back to bite us.

Using unsafeInterleaveIO on an action effectively removes the action from the normal world of IO actions and places it into its own parallel world, where it can run on its own schedule. Creating these parallel worlds is safe as long as we don’t attempt to do anything that depends upon the ordering of effects across worlds.

For example, take a look at this small Haskell program that will let us compare the output of a simple test under normal and unsafe-interleaved I/O sequencing:

{- file: unsafe-demo.hs -}

module Main where

import System.Environment
import System.IO.Unsafe

main = do
    args <- getArgs
    if args == ["lazy"]
        then doTest (unsafeInterleaveIO getLine)
        else doTest getLine

doTest myGetLine = do
    line1 <- myGetLine    -- (1)
    line2 <- myGetLine    -- (2)
    putStrLn line2
    putStrLn line1

The main function checks the command line to see whether we have passed the “lazy” argument. If so, it passes the deferred action unsafeInterleaveIO getLine to doTest; otherwise, it passes the straight getLine function.

The doTest function, in turn, constructs an IO action that reads two lines from the standard input using whichever version of getLine we have provided. First it reads line1, then line2. Finally, it prints the lines in the opposite order: first line2, then line1.

Let’s test this program on the following two-line file:

$ cat onetwo.txt
one
two

Ready? First, let’s run the straight-IO version:

$ runhaskell unsafe-demo.hs < onetwo.txt
two
one

As we would expect, the program read the lines from the file and printed them in reverse order.

Now, let’s pass the “lazy” flag, which wraps the getLine calls with unsafeInterleaveIO:

$ runhaskell unsafe-demo.hs lazy < onetwo.txt
one
two

For some reason, when we printed line2 we got the first line of the file, not the second. Likewise, when we printed line1, we got the second line. What happened?

What happened is that in lines (1) and (2) of the code, myGetLine evaluated to (unsafeInterleaveIO getLine). Thus line1 and line2 were bound to placeholder values that were connected to deferred getLine actions, each existing in its own parallel world and utterly unconnected to each other or to the main, normal IO world. The only thing, then, that could determine the order in which the two actions were performed was the order in which their values were consumed. And, because we printed line2 first, its deferred getLine action was the first to be performed, and so it read the first line of the input from the file. By the time we printed line1, causing its deferred getLine action to be performed, the first line of the file had already been consumed, and so the second line was returned.

Oops.

After this demonstration, we might wonder whether our lazy tree-building code is safe. After all, we’re using unsafeInterleaveIO like crazy behind the scenes. Let’s think about it.

Recall that, once detached from the main IO world, the only thing that determines when a deferred action is performed is when its value is consumed. That knowledge gives us the key to unlock the safety argument for our tree-building code.

Recall the type of fsTraverse:

fsTraverse :: Path -> DentName -> IO DirTree

The action that fsTraverse produces returns a DirTree value, which is a type synonym for a Tree of DentName values (file names). The Tree data type has only one constructor:

data Tree a = Node { rootLabel :: a,
                     subForest :: [Tree a] }

Each Node represents a tree: a root label and a sub-forest of children trees. Because of this representation, we cannot visit a node in a tree without first having visited all of its ancestors. In order to get to a child node in the tree, for example, we must go through the subForest value of its parent. Attempting to consume that value will force the attached, deferred action to be performed. That action, in turn, will fetch the contents of the parent’s directory and package each item, including the child we’re trying to get, into its own Node, the root and subForest of which will contain deferred actions for scanning further into the file-system hierarchy. Thus a top-down sequencing will be imposed on the deferred IO actions attached to those nodes. Fortunately, we want a top-down traversal because that’s the way we must descend directory hierarchies in the file system. Therefore, we have good reason to be assured of the safety of using unsafeInterleaveIO for this particular application.
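To make the consumption order concrete, here is a pre-order flattening of a Tree (essentially what Data.Tree’s flatten function computes, assuming Data.Tree is in scope as it already is in our program). Consuming the resulting list element by element forces each parent’s deferred label and sub-forest before those of its children:

preorder :: Tree a -> [a]
preorder (Node label kids) = label : concatMap preorder kids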

For other applications, however, we must repeat this safety analysis anew, lest we allow the parallel worlds of our deferred actions to collide in unpleasant ways. Caveat coder.

Review and next time

In this installment of our ever-lengthening series of articles on directory printing, we revised our refactored directory-printing code to restore the efficiency of our original, monolithic implementation. In doing so, we experimented with lazy I/O and deferring IO actions by using the unsafeInterleaveIO function.

At first we inserted a customized version of the library function unfoldTreeM into our code to effect the desired laziness, but a little monad hacking allowed us to get rid of this intrusion. When we were done, we not only had a lean, lazy directory printer but also a new (and potentially unsafe) LazyIO library, which might prove useful in the future.

We also looked at the kind of problems that unsafeInterleaveIO can cause if we are not careful. Nevertheless, we were able to argue that our particular use of this function is safe: the Tree data structure ensures that we must consume tree nodes in the top-down order in which our directory-scanning must occur.

Still, our directory-printing code has problems. We haven’t yet improved its lackluster error-handling policy. Nor have we made the directory-scanning code amenable to uses beyond our immediate needs for directory printing. Next time, we will solve both of these problems.