A type-based solution to the "strings problem": a fitting end to XSS and SQL-injection holes?

Posted by Tom Moertel Thu, 19 Oct 2006 01:40:00 GMT

Even skilled programmers have a hard time keeping their web applications free of XSS and SQL-injection vulnerabilities. And it shows: a sobering portion of web sites are open to some scary security threats.

Why are so many sites vulnerable to these well-known holes? Probably because it’s insanely hard for programmers to solve the fundamental “strings problem” at the heart of these vulnerabilities. The problem itself is easy to understand, but we humans aren’t equipped to carry out the solution. Simply put, we just plain suck at keeping a bazillion different strings straight in our heads, let alone consistently and reliably rendering their interactions safe whenever they cross paths in a modern web application. It’s easy to say, “just escape the darn things,” but it’s hard to get it right, every single time.

Computers, on the other hand, are pretty good at keeping track of details by the bucket-full. Wouldn’t it be nice, then, if our programming languages gave us the power to delegate this nasty “strings problem” to our computers, which could then devote their unwavering mechanical precision to grinding the problem out of existence? Isn’t that the kind of thing modern programming languages are supposed to be good at?

I’d like to think the answer to that question is a big “you betcha.”

So let’s grab a modern programming language and solve the strings problem.

Let’s solve the strings problem in Haskell

In this article, we will look at one way (among many) to solve the strings problem: by adding Ruby-style string templates to Haskell. These templates support “interpolation” via the usual, convenient #{var} syntax, but here interpolation is type safe. Haskell’s type system will prevent us from inadvertently mixing incompatible string types, and it will detect mistakes at compile time, before they can become live XSS or SQL-injection holes. Further, our solution will offer us these benefits without making us jump through hoops or pay some onerous syntax penalty.

To be more specific, the system offers the following benefits:

  • It provides a string-management kernel that lets you create “safe strings” by certifying a regular string as representing either text or a fragment of a known language.
  • It allows you to conveniently define new language types for any string-based language that you can provide an escaping rule for (e.g., XML, URLs, SQL, untrusted user input).
  • It provides compile-time syntactic sugar (via Template Haskell) that makes working with safe strings as convenient as working with string interpolation in languages like Ruby and Perl.
  • It catches and reports (at compile time) the following commonly made programming errors:
    • failing to escape a plain-old-text string before mixing it into a string that represents a language fragment
    • mixing strings that represent fragments of incompatible languages
    • mixing strings that represent fragments of compatible languages in an ambiguous way (the system will force you to disambiguate)
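
To make the idea of "certifying" strings concrete, here is a rough sketch in Ruby. It is only a runtime analogue: the system described in this article does its checking with Haskell types at compile time, which Ruby cannot do. Every name below (SafeString, from_text, certify, escape_html) is hypothetical and is not part of the article's library.

```ruby
# Hypothetical helper: escape the characters that are special in HTML.
def escape_html(text)
  escapes = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
              '"' => '&quot;', "'" => '&#39;' }
  text.gsub(/[&<>"']/, escapes)
end

# Hypothetical "safe string": a string tagged with the language it belongs to.
class SafeString
  attr_reader :lang, :value

  def initialize(lang, value)
    @lang = lang
    @value = value.dup.freeze
    freeze
  end

  # Certify plain text as an HTML fragment by escaping it first.
  def self.from_text(text)
    new(:html, escape_html(text))
  end

  # Certify a string that the caller vouches is already a fragment of lang.
  def self.certify(lang, fragment)
    new(lang, fragment)
  end

  # Only fragments of the same language may be joined.
  def +(other)
    unless other.is_a?(SafeString) && other.lang == lang
      raise TypeError, "cannot mix #{lang} fragment with #{other.class}"
    end
    SafeString.new(lang, value + other.value)
  end

  def to_s
    value
  end
end

name = "<script>alert(1)</script>"   # untrusted plain text

greeting = SafeString.certify(:html, "<p>Hello, ") +
           SafeString.from_text(name) +
           SafeString.certify(:html, "</p>")
puts greeting

# Forgetting to escape is rejected instead of silently producing a hole.
begin
  SafeString.certify(:html, "<p>Hello, ") + name
rescue TypeError => e
  puts "rejected: #{e.message}"
end
```

In this sketch the mistake is caught only when the offending line runs. The point of the type-based approach is that the same mistake would be reported before the program ever ran.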

(This is a long one, so grab an espresso, lean back, and read on in style. Also, if you have a smoking jacket, you might want to get it now.)


If unit testing can't keep Rails safe from string-escaping problems, what makes you think it will keep your projects safe?

Posted by Tom Moertel Thu, 12 Oct 2006 20:06:00 GMT

Recently I wrote about unit testing being a tool, not a goal in itself. I argued that unit testing was not a reliable way to fight certain kinds of common coding errors and, therefore, that unit testing ought to be supplemented with other tools.

To support my argument, I gave an example of a common, important coding error that unit testing does a bad job of helping programmers control. That error is failing to manage and escape strings properly: the “strings problem.” It is the mother of XSS and SQL-injection security vulnerabilities, not to mention the cause of legions of broken links and bad HTML on the web.

If you think I’m overstating the problem, or if you think that unit testing is a good way of solving it, let me show you how easy it is for even smart developers to get it wrong.

Consider Ruby on Rails, a great framework for developing web applications. Rails has an extensive suite of unit tests, and the Rails development guidelines require that changes to Rails be accompanied by unit tests that “prove [the] change works.”

Now consider that one of Rails’s most-used and most-scrutinized methods – the venerable link_to helper – contains a fundamental string-escaping error:

require 'rubygems'
require_gem 'rails'
include ActionView::Helpers::UrlHelper

url = "http://example.com?ohms_law?volt=1&amp=3" 
puts link_to("TEST", url)

The code, when executed, prints the following HTML snippet:

<a href="http://example.com?ohms_law?volt=1&amp=3">TEST</a>

The HTML snippet represents a hypertext link. The link should point to the URL given in the code, but because the URL was not properly escaped when it was converted into HTML by the link_to helper, the link is broken:

CORRECT:  http://example.com?ohms_law?volt=1&amp=3
LINK_TO:  http://example.com?ohms_law?volt=1&=3
                                             ^ oops

Here’s what’s going on. Because the URL was not escaped, web browsers misinterpret its “amp” parameter as a character-entity reference, which gets gobbled up when the link’s href attribute is parsed. (To see this for yourself, save the output of the Ruby code into an HTML file, open the file with your favorite web browser, and see where the link points.)
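
For comparison, here is a minimal sketch of what escaping the URL on its way into the href attribute would look like. The escape_html helper is hypothetical and hand-rolled for illustration; it is not Rails's implementation.

```ruby
# Hypothetical helper: escape the characters that are special in HTML.
def escape_html(text)
  escapes = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
              '"' => '&quot;', "'" => '&#39;' }
  text.gsub(/[&<>"']/, escapes)
end

url  = "http://example.com?ohms_law?volt=1&amp=3"
href = escape_html(url)   # the URL, encoded as an HTML attribute value

puts %(<a href="#{href}">TEST</a>)
# prints: <a href="http://example.com?ohms_law?volt=1&amp;amp=3">TEST</a>
```

When a browser parses this attribute, it decodes &amp; back into a literal ampersand, so the link points to the original URL with its "amp" parameter intact.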

Now, how come the unit tests didn’t catch this problem? It turns out, the tests got it wrong, too, by expecting broken output:

# in url_helper_test.rb

def test_link_tag_with_query
  assert_dom_equal \
    "<a href=\"http://www.example.com?q1=v1&amp;q2=v2\">Hello</a>",
    link_to("Hello", "http://www.example.com?q1=v1&amp;q2=v2")
end

The point isn’t that the Rails developers are dumb. The point is that the Rails developers are smart. If they can’t get the strings problem right, even with all their brains and all their unit testing, what reason does any programmer have to think that unit testing is going to solve this problem reliably?

If, then, you want to solve the strings problem – and you really, seriously ought to want to solve the strings problem – you should consider options beyond unit testing.

Update 2007-09-04: I just noticed that the documentation for link_to has been revised to state that if you pass a string as its options parameter, the string will be interpreted not as a URL but as an HTML href attribute value, that is, an HTML-encoded URL. The old documentation:

def link_to(name, options = {}, html_options = nil, *parms)
Creates a link tag of the given name using an URL created by the set of options.... It’s also possible to pass a string instead of an options hash to get a link tag that just points without consideration.

The relevant part of the revised documentation:

It’s also possible to pass a string instead of an options hash to get a link tag that uses the value of the string as the href for the link.

So, according to the updated documentation, the test I described in my article is actually correct. Does this mean that string-handling code in Rails is worry free? The existence of helper methods like fix_double_escape suggests the answer is no.
