File Uploads

Published in PHP Architect on 18 Oct 2004

Welcome to another edition of Security Corner. This month's topic is file uploads, and I focus on the mechanism you create to allow users to upload files to your application. Unlike typical form data, files are handled uniquely, and PHP uses the $_FILES array to provide you with all of the information you need. However, because it isn't very clear what information is provided by the client and what information is provided by PHP, a security-conscious developer can have a difficult time determining what data can be trusted.

This article takes a detailed look at file uploads, beginning with a brief discussion that walks you through the process and some example code that offers this feature. This is followed by a close examination of the mechanics of file uploads, and then I discuss the security risks inherent in this activity as well as some safeguards and best practices that you can implement in your own applications.

File Uploads

In order to let users upload files, you must present them with a typical HTML form. However, because files are not sent in the same way that regular form data is, you must specify a particular encoding:

    <form action="upload.php" method="POST" enctype="multipart/form-data">

The enctype attribute is often left out, so you might not be familiar with it. An HTTP request that includes both regular form data and files has a special format, and this attribute tells the browser to use it. Without it, the contents of the file are not sent.

The form field for a file is actually very simple:

    <input type="file" name="attachment" />

This is rendered in various ways by the different browsers. Traditionally, the interface includes a standard text field as well as a browse button, so that the user can either enter the path to the file manually or browse for it. In Safari, only the browse option is available. Regardless, the behavior from a developer's perspective is the same, but you might want to be mindful of the differences in presentation, in case you have very specific instructions for the user.

To better illustrate a file upload, I present you with an example HTML form that can be used to allow users to upload attachments to a web-based email application:

    <p>Please choose a file to upload:</p>
    <form action="upload.php" method="POST" enctype="multipart/form-data">
    <input type="file" name="attachment" />
    <input type="submit" value="Upload Attachment" />
    </form>

The PHP directive upload_max_filesize can be used to limit the size of uploaded files, and post_max_size can potentially restrict this as well, because the file is part of the POST data.
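
As a rough illustration of how these two directives interact, the following sketch reads both settings and reports the smaller one as the practical ceiling for a single upload. The helper names shorthand_to_bytes() and effective_upload_limit() are hypothetical, and treating a post_max_size of 0 as "no limit" is an assumption worth confirming against the documentation for your PHP version. Because post_max_size covers the entire request body, the largest file that actually fits is slightly smaller than the figure reported.

```php
<?php

/* Hypothetical helper: convert php.ini shorthand such as "2M" to bytes. */
function shorthand_to_bytes(string $value): int
{
    $value = trim($value);
    $number = (int) $value; /* "2M" casts to 2 */

    switch (strtoupper(substr($value, -1))) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number;
    }
}

/* Hypothetical helper: the smaller of the two limits, in bytes. */
function effective_upload_limit(string $uploadMax, string $postMax): int
{
    $upload = shorthand_to_bytes($uploadMax);
    $post = shorthand_to_bytes($postMax);

    /* Assumption: a post_max_size of 0 means the POST limit is disabled. */
    if ($post === 0) {
        return $upload;
    }

    return min($upload, $post);
}

$limit = effective_upload_limit(
    (string) ini_get('upload_max_filesize'),
    (string) ini_get('post_max_size')
);

echo "Largest upload accepted (approximately): $limit bytes\n";
```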

Figure 1 shows how the upload form appears when rendered in a browser.

When this form is submitted, the HTTP request is sent to upload.php. In order to demonstrate what information is made available to you, upload.php does the following:

    <?php

    header('Content-Type: text/plain');
    print_r($_FILES);

    ?>

To experiment with my form, I choose to upload the September 2004 issue of php|architect, which is a file on my computer named phpa_09-2004.pdf. Upon selecting this file and submitting the form, I get the following response from upload.php:

    Array
    (
        [attachment] => Array
            (
                [name] => phpa_09-2004.pdf
                [type] => application/pdf
                [tmp_name] => /tmp/phpz1A0zr
                [error] => 0
                [size] => 2704632
            )

    )

This shows exactly what information PHP provides in the $_FILES superglobal array. However, what it doesn't show is what information can be trusted. With a cursory glance, you probably suspect that the name is provided by the client, but what about the other information?

Multipart HTTP Request

In order to clarify things, it is necessary to examine the HTTP request, because this shows you exactly what is sent by the client. Because the September 2004 issue of php|architect is a rather large file, I use something much smaller in this next example. Using the same form, I can upload a file named attachment.txt that contains the following:

    Security Corner: File Uploads
    Chris Shiflett
    php|architect
    Oct 2004

When I upload this file, I see the following:

    Array
    (
        [attachment] => Array
            (
                [name] => attachment.txt
                [type] => text/plain
                [tmp_name] => /tmp/phpnXMOeW
                [error] => 0
                [size] => 68
            )

    )

The HTTP request sent by my browser is as follows (some optional headers are removed for brevity):

    POST /upload.php HTTP/1.1
    Host: example.org
    Referer: http://example.org/attach.php
    Content-Type: multipart/form-data; boundary=---------------------------11401160922046879112964562566
    Content-Length: 298

    -----------------------------11401160922046879112964562566
    Content-Disposition: form-data; name="attachment"; filename="attachment.txt"
    Content-Type: text/plain

    Security Corner: File Uploads
    Chris Shiflett
    php|architect
    Oct 2004

    -----------------------------11401160922046879112964562566--

It is not necessary to understand the format of this request, but it should be easy to spot the contents of the file and its associated metadata. The name attribute is the name of the form field given in attach.php, and the filename attribute is the name of the local file on the user's computer. Based on this example request, it seems that only the name and type entries in $_FILES, which PHP takes from the filename attribute and the Content-Type header that accompany the file, are potentially dangerous. The name of the form field is also supplied by the client, but it is difficult for an attacker to do much damage by changing it, primarily because your code references the field by name: changing the name usually results in the code accessing a variable that does not exist. The only caveat is code that loops through all form data for some reason.

In order to better appreciate what an attacker can accomplish, use the code in Listing 1 to perform your own tests. You will need to change the domain name referenced in the Host header as well as make sure /upload.php exists on your server. If you can figure out a way to alter the tmp_name or size, you can discover a dangerous opening for an attacker.
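
A minimal sketch of this kind of test follows. It is an illustration of the idea and not the Listing 1 referred to above. The helper names build_upload_request() and send_raw_request() are hypothetical, as is the added Connection: close header, which is there only so that the server closes the connection after responding. The builder assembles the same multipart request shown earlier, so every value an attacker can choose (the form field name, the filename, the Content-Type, and the file contents) is an ordinary function argument.

```php
<?php

/*
 * Hypothetical helper: build a raw multipart/form-data request for one file.
 * Every argument is chosen by the client, which is the point of the test.
 */
function build_upload_request(
    string $host,
    string $path,
    string $field,
    string $filename,
    string $type,
    string $contents,
    string $boundary
): string {
    $body = "--$boundary\r\n"
          . "Content-Disposition: form-data; name=\"$field\"; filename=\"$filename\"\r\n"
          . "Content-Type: $type\r\n"
          . "\r\n"
          . $contents . "\r\n"
          . "--$boundary--\r\n";

    return "POST $path HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n"
         . "Content-Type: multipart/form-data; boundary=$boundary\r\n"
         . 'Content-Length: ' . strlen($body) . "\r\n"
         . "\r\n"
         . $body;
}

/* Hypothetical helper: send the raw request and return the raw response. */
function send_raw_request(string $host, string $request, int $port = 80): string
{
    $socket = fsockopen($host, $port, $errno, $errstr, 10);
    if ($socket === false) {
        throw new RuntimeException("Connection failed: $errstr ($errno)");
    }

    fwrite($socket, $request);
    $response = stream_get_contents($socket);
    fclose($socket);

    return (string) $response;
}

/* Rebuild the example request from this article. */
$request = build_upload_request(
    'example.org',
    '/upload.php',
    'attachment',
    'attachment.txt',
    'text/plain',
    "Security Corner: File Uploads\nChris Shiflett\nphp|architect\nOct 2004\n",
    str_repeat('-', 27) . '11401160922046879112964562566'
);

/* To send it to your own server: echo send_raw_request('example.org', $request); */
```

Changing the filename or type argument and comparing the output of upload.php shows how directly those two values flow from the request into $_FILES.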

Practical Risks

Surprisingly, there are very few additional risks associated with file uploads, although it is important to remember the risks associated with any form processing - you have no assurance as to the format or size of anything sent in each request.

Validating the format of a file depends entirely on your specific application, and binary files present additional challenges. Although I do not discuss these approaches here, anti-virus software, file signatures, and the like can be used to help prevent certain malicious file types. These are blacklist approaches and therefore fundamentally flawed, but they may be your only option.

The filename (attachment.txt in my example) is provided by the client, and it should be filtered before being used in any capacity. If your requirements allow it, you can ignore this information completely and choose your own name. This eliminates this particular risk entirely.
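
Both options can be sketched in a few lines. The function names filter_client_filename() and generate_storage_name(), the set of allowed characters, and the fallback name "upload" are all assumptions made for illustration; your own requirements may call for a stricter or looser filter.

```php
<?php

/*
 * Hypothetical helper: reduce a client-supplied filename to a safe subset.
 * Directory components are dropped and anything outside a small whitelist
 * of characters is replaced with an underscore.
 */
function filter_client_filename(string $clientName): string
{
    /* Treat backslashes as separators too, so Windows paths are stripped. */
    $name = basename(str_replace('\\', '/', $clientName));

    /* Whitelist: letters, digits, dot, underscore, and hyphen. */
    $name = (string) preg_replace('/[^A-Za-z0-9._-]/', '_', $name);

    /* Refuse leading dots, which rules out "..", ".htaccess", and so on. */
    $name = ltrim($name, '.');

    return $name === '' ? 'upload' : $name;
}

/* Hypothetical helper: ignore the client's name and choose our own. */
function generate_storage_name(): string
{
    return bin2hex(random_bytes(16));
}

echo filter_client_filename('../../etc/passwd'), "\n";
echo generate_storage_name(), "\n";
```

Choosing your own name is the simpler of the two, because there is nothing left to filter.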

Theoretical Risks

There are two things in $_FILES for which you want to implement additional safeguards: tmp_name and size. While I have been unable to uncover a specific exploit in my research (I am limited to a small number of platforms), there are best practices available that prevent the theoretical attacks that involve the client being capable of manipulating this information.

In order to be assured that the filename given in tmp_name is actually the file that was uploaded with the form (and not an arbitrary file given by the user, such as /etc/passwd), you can use is_uploaded_file() as follows:

    <?php

    $filename = $_FILES['attachment']['tmp_name'];

    if (is_uploaded_file($filename)) {
        /* $filename is an uploaded file. */
    }

    ?>

This is particularly important in situations where some or all of the contents of the uploaded file are displayed to the user (perhaps for verification).
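
For instance, a verification preview might be guarded as in the following sketch. The function name upload_preview() and the 200-byte default are hypothetical; the point is only that nothing is read from the path until is_uploaded_file() has confirmed it, and that whatever is read is escaped before being displayed.

```php
<?php

/*
 * Hypothetical helper: return an HTML-safe preview of an uploaded file,
 * or null when the path is not a genuine upload from the current request.
 */
function upload_preview(string $tmpName, int $maxBytes = 200): ?string
{
    if (!is_uploaded_file($tmpName)) {
        return null;
    }

    /* Read at most $maxBytes from the start of the file. */
    $text = file_get_contents($tmpName, false, null, 0, $maxBytes);
    if ($text === false) {
        return null;
    }

    return htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

if (isset($_FILES['attachment']) && is_string($_FILES['attachment']['tmp_name'])) {
    $preview = upload_preview($_FILES['attachment']['tmp_name']);
    echo $preview === null ? 'No valid upload.' : "<pre>$preview</pre>";
}
```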

If you plan to simply move this file to another location in the filesystem, PHP provides you with a function that first checks whether the file given is an uploaded file and only moves it if it is:

    <?php

    $old_filename = $_FILES['attachment']['tmp_name'];
    $new_filename = '/path/to/attachment.txt';

    if (move_uploaded_file($old_filename, $new_filename)) {
        /* $old_filename is an uploaded file, and the move was successful. */
    }

    ?>

If you want to be sure of the file's size, you can use a standard PHP filesystem function after verifying that the file is a valid uploaded file:

    <?php

    $filename = $_FILES['attachment']['tmp_name'];

    if (is_uploaded_file($filename)) {
        $size = filesize($filename);
    }

    ?>
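
Taken together, these safeguards might be combined as in the sketch below. The function name store_attachment(), the target directory, and the one-megabyte limit are hypothetical, and the check of the error entry against UPLOAD_ERR_OK is an extra step that the examples above leave out. Treat it as one possible arrangement under those assumptions.

```php
<?php

/*
 * Hypothetical helper: validate one $_FILES entry and move it into
 * $targetDir under a server-chosen name. Returns the new path, or
 * null if any check fails.
 */
function store_attachment(array $file, string $targetDir, int $maxBytes): ?string
{
    /* The error entry is UPLOAD_ERR_OK (0) when the upload succeeded. */
    if (!isset($file['error'], $file['tmp_name'])
        || $file['error'] !== UPLOAD_ERR_OK
        || !is_string($file['tmp_name'])) {
        return null;
    }

    $tmpName = $file['tmp_name'];
    if (!is_uploaded_file($tmpName)) {
        return null;
    }

    /* Measure the file itself instead of relying on $file['size']. */
    $size = filesize($tmpName);
    if ($size === false || $size > $maxBytes) {
        return null;
    }

    /* Ignore the client's filename and choose our own. */
    $target = rtrim($targetDir, '/') . '/' . bin2hex(random_bytes(16));

    return move_uploaded_file($tmpName, $target) ? $target : null;
}

if (isset($_FILES['attachment'])) {
    $stored = store_attachment($_FILES['attachment'], '/path/to/attachments', 1048576);
    echo $stored === null ? "Upload rejected.\n" : "Upload stored.\n";
}
```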

Until Next Time...

It might seem unnecessary to validate data when you are not aware of an exploit that lets the client modify it. This approach of adding redundant safeguards is known as defense in depth, and I highly recommend abiding by it. Theoretical attacks have been known to materialize into real attacks, and you'll be glad that you're prepared.

If you happen to find a way to modify the tmp_name or size attribute of a file, with register_globals, please let me know.

You can now eliminate file uploads from your list of worries, and I hope that I've also been able to provide some clarity regarding the underlying mechanism. Until next month, be safe.