Recently I wrote about unit testing being a tool, not a goal in itself. I argued that unit testing was not a reliable way to fight certain kinds of common coding errors and, therefore, that unit testing ought to be supplemented with other tools.
To support my argument, I gave an example of a common, important coding error that unit testing does a bad job of helping programmers control. That error is failing to manage and escape strings properly: the “strings problem.” It is the mother of XSS and SQL-injection security vulnerabilities, not to mention the cause of legions of broken links and bad HTML on the web.
If you think I’m overstating the problem, or if you think that unit testing is a good way of solving it, let me show you how easy it is for even smart developers to get it wrong.
Consider Ruby on Rails, a great framework for developing web applications. Rails has an extensive suite of unit tests, and the Rails development guidelines require that changes to Rails be accompanied by unit tests that “prove [the] change works.”
Now consider that one of Rails’s most-used and
most-scrutinized
methods – the venerable link_to
helper – contains a
fundamental string-escaping error:
require 'rubygems'
'rails'
require_gem include ActionView::Helpers::UrlHelper
= "http://example.com?ohms_law?volt=1&=3"
url puts link_to("TEST", url)
The code, when executed, prints the following HTML snippet:
a href="http://example.com?ohms_law?volt=1&=3">TEST</a> <
The HTML snippet represents a hypertext link. The link should point
to the URL given in the code, but because the URL was not properly
escaped when it was converted into HTML by the link_to
helper, the
link is broken:
CORRECT: http://example.com?ohms_law?volt=1&=3
LINK_TO: http://example.com?ohms_law?volt=1&=3
^ oops
Here’s what’s going on. Because the URL was not escaped, web browsers
misinterpret its “amp” parameter as a character-entity reference,
which gets gobbled up when the link’s href
attribute is
parsed.
(To see this for yourself, save the output of the Ruby code into an
HTML file, open the file with your favorite web browser, and see where
the link points.)
Now, how come the unit tests didn’t catch this problem? It turns out, the tests got it wrong, too, by expecting broken output:
# in url_helper_test.rb
def test_link_tag_with_query
assert_dom_equal \"<a href=\"http://www.example.com?q1=v1&q2=v2\">Hello</a>",
"Hello", "http://www.example.com?q1=v1&q2=v2")
link_to(end
The point isn’t that the Rails developers are dumb. The point is that the Rails developers are smart. If they can’t get the strings problem right, even with all their brains and all their unit testing, what reason does any programmer have to think that unit testing is going to solve this problem reliably?
If, then, you want to solve the strings problem – and you really, seriously ought to want to solve the strings problem – you should consider options beyond unit testing.
Update 2007-09-04: I just noticed that the documentation for link_to has been revised to state that if you pass a string as its options parameter, the string will be interpreted not as a URL but as an HTML href attribute value, that is, an HTML-encoded URL. The old documentation:
def link_to(name, options = {}, html_options = nil, *parms)
Creates a link tag of the given name using an URL created by the set of options…. It’s also possible to pass a string instead of an options hash to get a link tag that just points without consideration.
The relevant part of the revised documentation:
It’s also possible to pass a string instead of an options hash to get a link tag that uses the value of the string as the href for the link.
So, according to the updated documentation, the test I described in my article is actually correct. Does this mean that string-handling code is Rails is worry free? The existence of helper methods like fix_double_escape suggests the answer is no.