A bright future: security and modern type systems
Posted by Tom Moertel Wed, 15 Aug 2007 20:07:00 GMT
The recent defacement of the United Nations web site is a prime example of why we programmers shouldn’t trust ourselves to write secure code – at least not without our computers’ help. The U.N. web site, according to Slashdot’s coverage of the incident, was defaced by way of a common, well-known attack: SQL injection. What’s interesting is that programmers can render this attack harmless by employing simple, readily available programming tools such as placeholders and prepared statements. Why, then, are so many web sites, including the UN site apparently, still vulnerable?
Some say it’s because the programmers of these sites are incompetent, but that argument ignores that programmers are human, while the security tools we give them offer meaningful protection only if wielded with inhuman perfection. Having the tools to plug security holes, even if the tools are simple to use and readily available, is not enough to ensure that every single security hole will be identified, let alone plugged. Even the most experienced programmer can be expected to overlook a hole now and then. Unfortunately, one hole is all it takes.
That’s because security is not like other software-quality challenges: its costs are fundamentally asymmetric. For the attacker, the bad guy, the challenge is to find just a single exploitable hole. For us, the good guys, the challenge is to achieve perfection: to plug all of the holes in our code, every single one. That’s because attackers, unlike regular users, can be expected to probe our code until they find a hole to exploit.
How then do we ensure that we have plugged every single hole in our code? Testing isn’t sufficient: we can easily overlook holes when writing tests – a perfectly human error. We could supplement testing with code reviews, painstakingly searching for remaining holes while enforcing the use of hole-preventing best practices, but reviews are expensive and, again, subject to human error. A better approach, both less costly and more reliable, is to delegate this burden to our computers, which can do the job correctly, every single time.
This kind of delegation is possible today with modern static type systems. For example, in A Type-Based Solution to the Strings Problem, I offered a tiny “safe strings” library for the Haskell programming language. The library takes advantage of Haskell’s powerful type system to detect unsafe string interactions at compile time. If we faithfully build our code on top of the library, and our code compiles without error, we can be assured that our code is free – completely free – of SQL-injection (and XSS) holes.
While this result is indeed quite beautiful, it certainly isn’t novel. Researchers have been proving interesting properties via type systems for a long time. As Oleg Kiselyov and Chung-chieh Shan pointed out in a comment on my earlier article, the foundational idea is over three decades old.
More recently, Kiselyov and Shan have extended the idea to guarantee more-interesting properties using a trusted kernel and types that represent lightweight static capabilities. The kernel, which is small enough to be reasoned about and formally verified, carefully hands out capabilities to untrusted application code. The untrusted code, in turn, presents the capabilities back to the kernel to invoke operations, which, thanks to the kernel’s trustworthiness, are guaranteed to be safe. (My safe-string library can be seen as a trivial implementation of this programming style.)
When static type systems are used in this way, they don’t merely catch typos and bugs that good testing would have caught as a matter of course, but offer programmers guarantees that would have been impractical to obtain any other way.1 If you consider security important, you might bear this fact in mind when choosing languages for your next project.
Going further, the security benefits of rich static type systems are only now starting to trickle into mainstream industry. As libraries like “safe strings” and idioms like static capabilities become more familiar and get woven into future generations of development frameworks, we can expect marked improvements in the security and robustness of our applications.
In the not-too-distant future, perhaps, we might look back in amazement at the days when important security properties were neither free nor guaranteed but expensive and uncertain, underwritten only by the heroic efforts of individual programmers, struggling against impossible odds to achieve inhuman perfection.
Then again, it sure took garbage collection a long time to catch on.
1. How, for example, could you eliminate the possibility of SQL-injection and XSS holes via testing?
I suppose you could do it if you worked at it hard enough. You could augment your string data structures with run-time information about what they represent: this string represents SQL, this string represents plain-old text, and so on. Then you could redefine your string operations and template interpolation systems to assert that their string inputs were compatible. Of course, if these assertions ever failed, they would do so only at run time, when it would be too late to do anything but die rather ungracefully. So you would be forced to augment your code-coverage tools to ensure that every string-path was covered during testing. That way you could catch all potential run-time string failures – indicating holes – during testing and eliminate the holes (and the subsequent need to fail at run time) before you deployed your application for real.
So, yes, you could do it. But to do so would require you, in effect, to write a crude, single-purpose type system that checks types at test time. That says something, doesn’t it?

http://www.cs.cmu.edu/~fp/papers/manifest07.pdf
another recent proposal.
You could use randomized testing like quickcheck. Of course, this wouldn’t ensure that all holes were found.
After reading this article, I’m not sure you know what an SQL Injection vulnerability is.
You can construe a myriad of coding errors as “type system failures”, for sufficiently broad definitions of “type system”. But in conventional (read: non-Haskell) web applications, SQL injection isn’t caused by inadequate string types.
SQL Injection happens because web application developers cons up SQL queries on the fly, inside of strings, into which user input is interpolated, and then present them to a database hoping for the best. The solution to this problem isn’t a smarter string type that can ensure safer interpolation; the solution, almost universally adopted by the industry, is to seperate code from data using prepared/parameterized statements or stored procedures, either of which parse statements seperately from user-controlled data.
Your solution to this problem is evil. Instead of segregating user inputs from parseable code, you try to employ transparent escaping to ensure safe interpolation. “A-ha!”, you say, “there is no way in my type system to evade my escaping, ergo my strings are safe”. Until you realize you forgot about internationalization, or whatever else Jeremiah Grossman or Alex Stamos thinks up next year.
This is simply mod_security, provided at a library level. As a general approach, it has a poor track record.
@Thomas H. Ptacek: First, thanks for your comments. I can see that you’re passionate about this topic, so I appreciate your taking the time to share your views.
Second, I would ask that you please give my article a closer read because, for example, you seemed not to notice the first paragraph, which acknowledges the “almost universal” solutions to SQL injection that you mentioned and points out that they aren’t getting the job done: lots of sites are still vulnerable. It’s this state of affairs that the article attempts to address.
Third, I don’t think you understand what my “evil” safe-strings solution actually does. You wrote that the right solution is to “segregate user inputs from parseable code,” but that’s what my solution does. My solution performs the segregation at the type level, however, and if you’re not expecting it, I guess it’s easy to overlook. But that’s how it works. (And if you think it’s anything like mod_security, you’re really off the mark. For starters, mod_security doesn’t provide its assurances at compile time.)
Thanks again for your comments. If you take a closer look at my safe-strings library, I would be interested in your feedback.
Cheers,
Tom
Your footnote about a “crude single-purpose type system” accurately describes the Perl tainting approach.
If people get nothing out of this except the part about the game being asymmetric, that’s enough. My awakening came during a testing of a CMS we built. The client was a major IT Consultancy and one thing they offer are security audits. Seeing the results, I finally got the point that thinking like a developer will never plug all the holes. Meta-answers (like your type system) that eliminate a whole class of mistakes from the realm of possibility are what is needed.
Rather than being terribly clever like I (developer) imagined, the security audit was brute force: they spidered thousands of pages to disk, ran scripts looking for form inputs and crafted fake posts to all those urls with noise for data. And blew up old legacy things no one had thought about or used in years. I don’t know if a human ever had to be involved beyond pointing the script at the initial url.
The best analogy I can come up with is the last time I got electrocuted*: the instant previous, my hand was controlled by a series of tiny, intentional, carefully crafted electrical impulses. Then it was taken over by a flood of huge impulses coming through every nerve ending I have (had). There’s no way to control that, you just have to avoid it.
Tom Clancy:
I don’t think you understood Thomas’ point.
Your solution does not separate code from data at a database level. A SQL prepared statement expects the SQL code and the data to be fed to the database separately. Only a bug in the database will allow you to exploit a prepared statement.
Your solution generates regular “code and data together” SQL and will fail when someone manages to escape something or play encoding tricks in a way which your type system didn’t foresee. You also need to remember that different databases are subtly different and moving targets as it comes to escaping and encoding rules.
The fact that most web sites are exploitable doesn’t mean that the solutions Thomas talks about are inefficient, it just means that web sites are not using them. It’s not the solutions that are not doing their job, it’s the managers and programmers. Surely, it requires discipline and has a price, but it works.
I personally dislike static typing and even if we could agree that using a language with advanced static typing and inference would solve security issues, that’s not a realistic solution for web sites that have already been built nor for many that will be built by many organizations in the near future for a myraid of reasons.
Mike: I’ll respond to your comment because I think you have confused me (Tom Moertel, the guy who wrote this article) for Tom Clancy, the guy who posted a comment earlier.
Just how is somebody going to “escape something” or “play encoding tricks” in a way that evades the type system? Can you show me an example (that doesn’t assume the programmer is pathological)?
Assume we have the typical newbie web programmer: well-meaning but possibly forgetful and lacking knowledge of security best practices (or even fundamentals). Now, drop him into a programming environment where something like the safe-strings system has been baked into the web-app framework. Every string this programmer gets or gives to the framework will have the right type already associated with it at compile time. Now, tell me, just how is this programmer, through anything but an outright attempt at subversion, going to evade the type system? How is he going to manage to write code that stuffs a string that the type system thinks is “plain old text” into a template that represents SQL or XHTML or a URL or anything else without the string being properly escaped?
True. But if we assume that advanced static typing will someday work its way into the mainstream – a reasonable assumption – then we might at least want to start thinking about that future today.
Thanks for your comment.
Cheers,
Tom
I got to agree with Poul Johnson, he`s right. Btw. Great blog!