Prevent Cross-Site Scripting in ASP.NET Web Apps


Cross-Site Scripting (XSS) is the most pervasive vulnerability present in Web applications today. That being said, it is possible to build Web apps that are impervious to XSS by arming yourself with an understanding of the threat and a basic toolbox of encoding functions.

XSS Review

The attack occurs in a variety of scenarios where data is taken in by your website and then replayed to the user as an executable script. For example, imagine navigating to the following URL:

http://www.contoso.com/shopping?name=<script>eval(name)</script>

If the website were to replay the query string parameter into its HTML markup verbatim, malicious script would execute on the page. Given the same-origin policy security model of the browser, this script could perform actions or access data on behalf of the user behind the keyboard.
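To make the difference concrete, here is a minimal C# sketch, assuming a reference to System.Web so that HttpUtility is available. The GreetingSketch class and its two methods are hypothetical; they stand in for whatever code writes the name parameter into the page.

```csharp
using System.Web;

public static class GreetingSketch
{
    // Vulnerable: the query string value is replayed into the markup verbatim.
    public static string RenderGreetingUnsafe(string name)
    {
        return "<p>Welcome back, " + name + "</p>";
    }

    // Safer: the value is HTML-encoded before it reaches the markup,
    // so the browser treats it as text rather than as a SCRIPT element.
    public static string RenderGreetingEncoded(string name)
    {
        return "<p>Welcome back, " + HttpUtility.HtmlEncode(name) + "</p>";
    }
}
```

For the input <script>alert(1)</script>, the first method emits a live SCRIPT element, while the second emits &lt;script&gt;alert(1)&lt;/script&gt;, which the browser displays as plain text.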

There are numerous other ways that an XSS vulnerability might arise. For example, imagine your Web application presents a page with a list of users. If one of the users managed to set their visible name to a SCRIPT element, we then have XSS, though this scenario does not involve query string parameters per se.

Alternatively, consider a situation where an onerror attribute results in malicious script execution (as opposed to a SCRIPT element). How many mechanisms like this exist within HTML/JavaScript that enable script execution? It turns out that there are many. Fortunately, you don't need to be an XSS expert to prevent XSS vulnerabilities from being introduced in application logic.

Best Practice

Generally speaking, the strategy to pursue in building application code is to encode potentially untrusted content appropriately for the context in which it's being output on the page.

It's worthwhile to define these terms. Potentially untrusted content could be input from the user to the website, or even information stored in a database. If your Web application takes input from the URL, that data is potentially untrusted content because the data could have been supplied by an attacker. Information from cookies is not generally directly suspect because of restrictions enforced by the same-origin policy; however, if the data originally came from the URL or an HTTP POST, you should consider it suspect. Perhaps the easiest way to define potentially untrusted content is to say that it's any content that the application did not itself define statically in its own business logic.

The context on the page into which the output is placed is also very important to consider as it dictates how you must encode output. Consider the following example in ASP.NET:

<a href=
"http://contoso.com/app.aspx?var=<%:Server.UrlEncode(UntrustedVar)%>">
<%: UntrustedTitle %>
</a>

Notice that the UrlEncode function is used to encode query string data. Server.UrlEncode is a method of the ASP.NET HttpServerUtility class that converts spaces to + signs and most other non-alphanumeric characters to their percent-encoded hex equivalents. The default HtmlEncode-based encoding is used in the context of HTML. To understand why different encoding mechanisms are necessary, consider what malicious input might look like if encoding were not in place. In the HREF case, the output might close off the attribute and append a new attribute that would run script, for example:

  " onload=[Malicious script]

Whereas in the second case, an effective attack would be:

  <script>[Malicious script]</script>

So the various encoding mechanisms must encode different sets of characters to offer effective protection. (In addition, URLs are percent-encoded so that they may be properly parsed by browsers and Web servers, whereas HTML markup is encoded into HTML entities. Using the wrong encoding would create URLs or markup that can't be properly parsed.)
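The link from the earlier example can be sketched outside of page markup as follows. This is an illustration under stated assumptions, not the only way to do it: the ContextSketch class and BuildLink method are hypothetical, and the code assumes a reference to System.Web. The query string value is percent-encoded first, and the finished URL is then HTML-encoded for the attribute that contains it.

```csharp
using System.Web;

public static class ContextSketch
{
    public static string BuildLink(string untrustedVar, string untrustedTitle)
    {
        // URL context: percent-encode the single query string value.
        string url = "http://contoso.com/app.aspx?var="
                   + HttpUtility.UrlEncode(untrustedVar);

        // HTML attribute context for the URL, HTML text context for the title.
        return "<a href=\"" + HttpUtility.HtmlAttributeEncode(url) + "\">"
             + HttpUtility.HtmlEncode(untrustedTitle)
             + "</a>";
    }
}
```

With this in place, the quote in an input such as " onload=[Malicious script] is percent-encoded and can no longer close the href attribute, and a SCRIPT element supplied as the title is rendered as text.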

It is important to understand that each distinct output context requires a different encoding method. Other notable contexts include XML (attributes and markup), CSS, and JavaScript strings.
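As one example of a non-HTML context, the following sketch places an untrusted value inside a JavaScript string literal. It assumes a reference to System.Web; the ScriptContextSketch class and RenderScript method are hypothetical names used only for illustration.

```csharp
using System.Web;

public static class ScriptContextSketch
{
    // Emits a script block that assigns an untrusted value to a JavaScript
    // string variable. JavaScriptStringEncode escapes quotes, backslashes,
    // and angle brackets, so the value cannot end the string literal or
    // the surrounding SCRIPT element.
    public static string RenderScript(string untrustedValue)
    {
        return "<script>var x = \""
             + HttpUtility.JavaScriptStringEncode(untrustedValue)
             + "\";</script>";
    }
}
```

A value such as a"b is emitted as a\"b, which JavaScript reads back as the original three characters.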

There is one issue worthy of note: your application might hand off data to an external control or API to render on the page. In such a case, what encoding should be applied? To find out, you may need to evaluate the security guarantee provided by the external code. It seems reasonable to assume that an API encodes its input appropriately for the output context, although any particular API might, in fact, push that responsibility to its caller. The documentation for any good API should specify what encoding, if any, the caller must apply to ensure that output is rendered securely on the page. The <%: %> syntax in ASP.NET 4 and later provides a clever solution to this problem: it HTML-encodes its output by default but leaves values of the new HtmlString type untouched, so an API can mark markup that it has already encoded.
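The HtmlString mechanism can be sketched as follows, assuming the .NET Framework 4 or later with a reference to System.Web. The HtmlStringSketch class and its Bold and Render methods are hypothetical. The sketch also assumes that <%: %> behaves like a call to HttpUtility.HtmlEncode(object), which encodes ordinary values and passes IHtmlString values through unchanged.

```csharp
using System.Web;

public static class HtmlStringSketch
{
    // Hypothetical helper standing in for an API that returns markup it has
    // already encoded. Wrapping the result in HtmlString records that fact.
    public static IHtmlString Bold(string untrustedText)
    {
        return new HtmlString("<b>" + HttpUtility.HtmlEncode(untrustedText) + "</b>");
    }

    // Assumed equivalent of <%: value %>: plain strings are HTML-encoded,
    // IHtmlString values are emitted as-is, so nothing is encoded twice.
    public static string Render(object value)
    {
        return HttpUtility.HtmlEncode(value);
    }
}
```

Under these assumptions, Render("<i>") yields &lt;i&gt;, while Render(Bold("<i>")) yields <b>&lt;i&gt;</b>: the helper's own markup survives and the untrusted text is encoded exactly once.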

The Microsoft AntiXSS Library

All major Web platforms provide some sort of API for output encoding. Microsoft's implementation for ASP.NET is a library of encoding functions referred to as the Microsoft AntiXSS Library. This library has been available as a separate download for several years, and its encoders are included in the framework as of ASP.NET 4.5.

The first thing you'll want to do to leverage the AntiXSS Library on ASP.NET 4.5 is to enable it as the default encoder by adding the encoderType attribute to your Web.config file:

<httpRuntime ... 
    encoderType=
       "System.Web.Security.AntiXss.AntiXssEncoder, System.Web,
       Version=4.0.0.0, Culture=neutral,PublicKeyToken=b03f5f7f11d50a3a"
 />

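This entry will cause default output encoding functionality in ASP.NET to use the conservative AntiXSS Library encoding. In addition, you may then begin to utilize the APIs in the AntiXssEncoder class, including HtmlEncode, UrlEncode, HtmlFormUrlEncode, CssEncode, XmlEncode, and XmlAttributeEncode.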
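A brief sketch of calling the encoder directly might look like the following. It assumes ASP.NET 4.5 with a reference to System.Web; the AntiXssSketch wrapper class and its method names are hypothetical and exist only to label the context each call is meant for.

```csharp
using System.Web.Security.AntiXss;

public static class AntiXssSketch
{
    // Text placed between HTML tags. The second argument controls
    // whether named entities are used in the output.
    public static string ForHtml(string value)
    {
        return AntiXssEncoder.HtmlEncode(value, false);
    }

    // A single query string value (not an entire URL).
    public static string ForUrlValue(string value)
    {
        return AntiXssEncoder.UrlEncode(value);
    }

    // A value placed inside a style sheet or style attribute.
    public static string ForCss(string value)
    {
        return AntiXssEncoder.CssEncode(value);
    }

    // Text placed inside an XML attribute.
    public static string ForXmlAttribute(string value)
    {
        return AntiXssEncoder.XmlAttributeEncode(value);
    }
}
```

The exact escape sequences produced are more conservative than those of the default encoder, so it is safer to test for the absence of dangerous characters than to depend on a particular output string.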

Each of these APIs encodes data for a different context. As described previously, it is very important to make use of the proper function for a given context. Examine the surrounding markup on the page to determine context appropriately and choose the right function or combination of functions. In cases where it's necessary to encode more than once, be aware that order is important. For example:

<a href=
"http://contoso.com/app.aspx?var=<%:Server.UrlEncode(UntrustedVar)%>">
<%: UntrustedTitle %>
</a>
<script>
var x = "<%=HttpUtility.JavaScriptStringEncode(UntrustedVariable)%>";
. . . 
</script>

In this example, only a single query string variable is encoded, using the UrlEncode function. UrlEncode and UrlPathEncode are not appropriate for encoding entire URLs. If you need to output a full URL that comes from untrusted input, it is necessary to, at minimum, validate its URL scheme so that URLs with the javascript: or vbscript: schemes are rejected. To do this, construct a new Uri object and then check that its scheme is one you accept, such as http or https.
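The scheme check can be sketched in a few lines of C#. The UrlSchemeSketch class and IsAcceptableUrl method are hypothetical, and limiting the accepted schemes to http and https is an assumption; adjust the list to whatever your application legitimately needs.

```csharp
using System;

public static class UrlSchemeSketch
{
    // Returns true only for absolute URLs whose scheme is http or https.
    public static bool IsAcceptableUrl(string untrustedUrl)
    {
        Uri uri;
        if (!Uri.TryCreate(untrustedUrl, UriKind.Absolute, out uri))
        {
            return false; // not a well-formed absolute URL
        }

        // Uri normalizes the scheme to lowercase, so "JavaScript:" and
        // "javascript:" are treated the same way here.
        return uri.Scheme == Uri.UriSchemeHttp
            || uri.Scheme == Uri.UriSchemeHttps;
    }
}
```

A URL that passes this check still has to be encoded for the context in which it is output, for example with HTML attribute encoding when it is written into an href.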

That's really all there is to it. While there are other XSS-related security techniques to evaluate when building your Web technology (such as sandboxing, HTML sanitization, and the like), you will find that proper encoding is what's necessary to prevent the most prevalent XSS bugs.

Conclusion

While XSS remains a pervasive Web threat, a good understanding of Web encoding techniques and their supporting APIs enables you to secure your Web applications. While all modern application platforms provide the necessary APIs to enable output encoding, it's up to individual developers to effectively apply the proper functions in the appropriate context.

The AntiXSS Library is available for download at no cost. Special thanks to Levi Broderick and Barry Dorrans for contributing to this article.


David Ross is a Principal Security Software Engineer on the MSRC Engineering team at Microsoft. Prior to joining MSRC Engineering in 2002, Ross spent his formative years on the Internet Explorer Security Team.

