Thursday, March 15, 2007

Web security: avoiding HTML injection

A shockingly high percentage of web applications have security holes (or just bugs) of various kinds, and one of the biggest causes is failing to quote (that is, escape) strings before putting them into an HTML page. See for example LWN: Cross-site scripting attacks.

Most people now understand this issue well enough to realize that an application needs to quote text as it outputs it, and to provide a function for escaping HTML, such as PHP's htmlspecialchars or any number of locally written versions, such as the MIFOS xmlEscape in MifosTagUtils.
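
As a rough illustration of what calling such an escaping function by hand looks like, here is a minimal sketch in Python, using the standard library's html.escape as a stand-in for htmlspecialchars or xmlEscape. The render_comment function and its arguments are hypothetical, made up for this example.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build an HTML fragment, escaping each untrusted string by hand."""
    # Every untrusted value has to be wrapped in html.escape explicitly;
    # forgetting a single call leaves an injection hole.
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))


if __name__ == "__main__":
    print(render_comment("Mallory", "<script>alert(1)</script>"))
```

The markup characters in the comment body come out as entities (&lt;, &gt;, and so on), so the browser displays them as text rather than running them as a script.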

However, requiring people to remember to call the escaping function every time is error-prone. A better approach is to treat HTML as one thing and strings as another, so that inserting a string into an HTML document quotes it automatically. A few template systems do this (tinytemplate, Amrita, probably a few others), and so do most HTML-generating libraries (builder, DOM, XmlStreamWriter, etc.). If you are using an older template system, such as JSP, ASP, or PHP, start thinking about how to migrate to something that is secure by default. Here's an article about these issues in Rails (recommending builder instead of rhtml).
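
To make the "HTML is one thing, strings are another" idea concrete, here is a small sketch in Python. It is not how any of the libraries named above is implemented; the Html class and the to_html and element helpers are hypothetical names invented for this illustration. The point is only that plain strings get escaped whenever they cross into markup, without the caller asking.

```python
import html


class Html:
    """Markup that is already safe to emit verbatim (hypothetical type)."""

    def __init__(self, markup: str):
        self.markup = markup

    def __str__(self) -> str:
        return self.markup


def to_html(value) -> Html:
    """Html values pass through unchanged; anything else is escaped."""
    if isinstance(value, Html):
        return value
    return Html(html.escape(str(value)))


def element(tag: str, *children) -> Html:
    """Wrap children in <tag>...</tag>.

    The tag name is assumed to come from the programmer, not from user
    input, so it is not escaped. Children that are plain strings are
    escaped automatically by to_html.
    """
    inner = "".join(to_html(child).markup for child in children)
    return Html("<{0}>{1}</{0}>".format(tag, inner))


if __name__ == "__main__":
    user_text = "<script>alert(1)</script>"  # untrusted input
    print(element("p", element("b", "Mallory"), ": ", user_text))
```

Nothing in the calling code mentions escaping, yet the untrusted text still comes out as entities. The only way to emit raw markup is to construct an Html value on purpose, which is much easier to spot in review than a missing escape call.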