Bloglines doesn't handle inter-element white space properly

Tags: rants, bloglines, markup, atom, xml, html

If you’re reading my blog via Bloglines, you may have noticed that some of my posts look terrible, especially when they contain code snippets. I am sorry for that, but it’s not my fault. Bloglines doesn’t handle white space properly.

Here’s the more detailed explanation. When you request one of my feeds in, say, Atom format, you get back a bunch of XML that contains the most-recent posts from my blog. Each post is represented as lovingly crafted HTML, escaped per the Atom specs. When Bloglines gets its hands on this very same HTML, it attempts to scrub it nice and clean – get rid of any naughty bits, you know. And there’s nothing wrong with that. Except when the scrubbing goes horribly, horribly wrong. Which is exactly what happens when Bloglines encounters perfectly legitimate markup that represents syntax-highlighted code snippets.
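To make the "escaped per the Atom specs" part concrete, here is a minimal Python sketch of the round trip. The post body, the helper names, and the single bare `content` element are all assumptions made for illustration; a real feed wraps this in `feed` and `entry` elements. The point is only that escaping and unescaping leave the markup, white space included, exactly as it was written.

```python
import html
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical post body: two highlighted words separated by a single space.
post_html = '<span class="kwd">import</span> <span class="name">List</span>'


def wrap_in_atom_content(markup: str) -> str:
    """Return an Atom content element that carries the markup as escaped HTML."""
    escaped = html.escape(markup, quote=False)  # escapes &, < and >
    return f'<content xmlns="{ATOM_NS}" type="html">{escaped}</content>'


def unwrap_atom_content(xml_text: str) -> str:
    """Parse the element and return its text; the XML parser undoes the escaping."""
    element = ET.fromstring(xml_text)
    return element.text or ""


feed_fragment = wrap_in_atom_content(post_html)
recovered = unwrap_atom_content(feed_fragment)

print(feed_fragment)
print(recovered == post_html)
```

Whatever a reader does to the markup after this step is the reader's own doing.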

What does Bloglines do then? It strips out all of the significant white space, turning each block of code into a single, mile-long, unbreakable line of NoSpaceText that forces your web browser to expand the page until it is wide enough to enshroud a small solar system. Then you are forced to scroll forever to read each line of the text column. Ugh.

More specifically, each syntax-highlighted code block is represented in HTML as a preformatted (PRE) text block. Each word in that block is wrapped in a SPAN element whose class attribute indicates the word’s role in the original source code. Keywords get one class, identifiers another, and so on. For example, the code “import List” might be represented as follows:

<span class="kwd">import</span> <span class="name">List</span>

But when Bloglines gets its hands on that markup, it strips out the white space between the SPAN elements:

<span class="kwd">import</span><span class="name">List</span>

Thus the markup renders as “importList” when it hits your web browser. Now imagine the same space-denuding bad behavior applied to all of the inter-element white space in a full-length block of code. That’s right, what you end up with is a single, insanely long LineOfUnbreakableText that your web browser chokes on. Again: Ugh.
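The difference is easy to reproduce. The Python sketch below is not Bloglines's code, which I have never seen; it is a toy scrubber built on the standard library's `html.parser`, with a hypothetical `drop_blank_text` switch that mimics the behavior I'm complaining about. The class and function names are made up for the example.

```python
from html.parser import HTMLParser


class Scrubber(HTMLParser):
    """Re-emit markup, optionally dropping whitespace-only text between tags."""

    def __init__(self, drop_blank_text: bool):
        super().__init__(convert_charrefs=False)
        self.drop_blank_text = drop_blank_text
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # The hypothetical bug: treat whitespace-only text as disposable.
        if self.drop_blank_text and data.strip() == "":
            return
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


class TextOnly(HTMLParser):
    """Collect only the text a browser would display."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def scrub(markup: str, drop_blank_text: bool) -> str:
    parser = Scrubber(drop_blank_text)
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


def visible_text(markup: str) -> str:
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


original = '<span class="kwd">import</span> <span class="name">List</span>'
kept = scrub(original, drop_blank_text=False)
broken = scrub(original, drop_blank_text=True)

print(visible_text(kept))
print(visible_text(broken))
```

A scrubber that passes whitespace-only text through untouched, at least inside PRE blocks, leaves the code readable; one that throws it away glues every word to its neighbor.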

The folks at Bloglines have had similar problems in the past, most of which have been fixed. I hope they fix this particular problem soon, too.

Until that time, however, you might want to consider other feed readers.