Bloglines doesn't handle inter-element white space properly
Posted by Tom Moertel Tue, 11 Sep 2007 16:51:00 GMT
If you’re reading my blog via Bloglines, you may have noticed that some of my posts look terrible, especially when they contain code snippets. I am sorry for that, but it’s not my fault. Bloglines doesn’t handle white space properly.
Here’s the more detailed explanation. When you request one of my feeds in, say, Atom format, you get back a bunch of XML that contains the most-recent posts from my blog. Each post is represented as lovingly crafted HTML, escaped per the Atom specs. When Bloglines gets its hands on this very same HTML, it attempts to scrub it nice and clean – get rid of any naughty bits, you know. And there’s nothing wrong with that. Except when the scrubbing goes horribly, horribly wrong. Which is exactly what happens when Bloglines encounters perfectly legitimate markup that represents syntax-highlighted code snippets.
What does Bloglines do then? It strips out all of the significant white space, turning each block of code into a single, mile-long, unbreakable line of NoSpaceText that forces your web browser to expand the page until it is wide enough to enshroud a small solar system. Then you are forced to scroll forever to read each line of the text column. Ugg.
More specifically, each syntax-highlighted code block is represented in HTML as a preformatted (PRE) text block. Each word in that block is wrapped in a SPAN element whose class attribute indicates the word’s role in the original source code. Keywords get one class, identifiers another, and so on. For example, the code “import List” might be represented as follows:
<span class="kwd">import</span> <span class="name">List</span>
But when Bloglines gets its hands on that markup, it strips out the white space between the SPAN elements:
<span class="kwd">import</span><span class="name">List</span>
Thus the markup renders as “importList” when it hits your web browser. Now imagine the same space-denuding bad behavior applied to all of the inter-element white space in a full-length block of code. That’s right, what you end up with is a single, insanely long LineOfUnbreakableText that your web browser chokes on. Again: Ugg.
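To make the failure concrete, here is a minimal Python sketch of a scrubber that behaves the way Bloglines appears to. It is only an illustration: I have no knowledge of how Bloglines actually implements its sanitizer, and the names `NaiveScrubber` and `scrub` are made up for this example. The assumption is simply that text nodes consisting of nothing but white space get thrown away.

```python
from html import escape
from html.parser import HTMLParser


class NaiveScrubber(HTMLParser):
    """Hypothetical sanitizer: re-emits markup, optionally dropping
    text nodes that contain only white space (the assumed bug)."""

    def __init__(self, keep_whitespace):
        super().__init__(convert_charrefs=True)
        self.keep_whitespace = keep_whitespace
        self.markup = []  # the re-serialized HTML
        self.text = []    # what a browser would display inside PRE

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value)}"'
            for name, value in attrs
        )
        self.markup.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.markup.append(f"</{tag}>")

    def handle_data(self, data):
        # The assumed bug: white-space-only text between elements is discarded.
        if not self.keep_whitespace and data.strip() == "":
            return
        self.markup.append(escape(data, quote=False))
        self.text.append(data)


def scrub(html, keep_whitespace):
    """Return (re-serialized markup, displayed text) for a fragment."""
    scrubber = NaiveScrubber(keep_whitespace)
    scrubber.feed(html)
    scrubber.close()
    return "".join(scrubber.markup), "".join(scrubber.text)


snippet = '<span class="kwd">import</span> <span class="name">List</span>'
print(scrub(snippet, keep_whitespace=True)[1])   # import List
print(scrub(snippet, keep_whitespace=False)[1])  # importList
```

Feed the same function a multi-line code block and the newlines between SPAN elements vanish too, which is how an entire snippet collapses into one line.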
The folks at Bloglines have had similar problems in the past, most of which have been fixed. I hope they fix this particular problem soon, too.
Until that time, however, you might want to consider other feed readers.


Try replacing those inter-element blanks with non-breaking spaces. (Using a numeric character reference is probably the safest option.)
Aristotle Pagaltzis,
Thanks for your comment.
Bloglines doesn’t just eat the horizontal white space but the vertical as well, as long as it’s between elements. So I can’t just drop non-breaking spaces into my markup to work around the problem. It was a good thought, though.
(Also, I don’t feel like munging perfectly sensible markup to work around somebody else’s bugs.)
Cheers,
Tom
Well, just like all other tags, <br> still works inside <pre> tags. You just have to drop the literal linebreaks you’re replacing, or else you’ll get double linebreaks on display.
As for working around someone else’s bugs, I agree, but that’s often your lot on the ’net if that buggy someone is much bigger than you – cf. all the interaction designer effort wasted on making CSS work in IE6. Sigh.
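For readers who want to see what the workaround suggested in this thread would amount to, here is a rough Python sketch. It assumes the input is the inner markup of a single PRE block, and the function name `harden_pre_whitespace` is invented for illustration. It swaps each space outside a tag for the numeric character reference `&#160;` and each literal line break for `<br>`, dropping the original newline so lines are not doubled. Whether Bloglines would leave the result alone is untested.

```python
import re

# Splitting on this pattern with a capture group leaves tags at the odd
# indices and the text between them at the even indices. It is a naive
# pattern: a ">" inside an attribute value would confuse it.
TAG = re.compile(r"(<[^>]*>)")


def harden_pre_whitespace(pre_inner_html):
    """Replace spaces and newlines outside of tags with explicit markup."""
    parts = TAG.split(pre_inner_html)
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            out.append(part)  # a tag: leave it untouched
        else:
            # Text between tags: newline first, then spaces.
            out.append(part.replace("\n", "<br>").replace(" ", "&#160;"))
    return "".join(out)


code = (
    '<span class="kwd">import</span> <span class="name">List</span>\n'
    '<span class="name">x</span>'
)
print(harden_pre_whitespace(code))
```

Tabs and carriage returns are left as they are in this sketch, so a real filter would need to decide what to do with them.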
Indeed. In working around one vendor’s bugs, however, it’s all too easy to bump into another vendor’s bugs. It’s not just a matter of changing my output until Bloglines can finally interpret it correctly. It’s a matter of emitting output that reduces breakage across all of the buggy consumers in the wild. For that reason, I’m not inclined to change the format of output that is already simple, correct, and correctly interpreted by just about everybody – except Bloglines.
Moreover, history suggests that Bloglines needs the stick now and again to motivate it to do the right thing. Thus I am inclined to put Bloglines’s breakage on display: maybe if enough users complain (or switch to Google Reader), the company will get the message.
Cheers,
Tom