Understanding Superglobals

Published in PHP Architect on 25 Jul 2006

A superglobal is a PHP array that is available in every scope. Examples include $_GET and $_POST. Global variables, which exist only in the global scope, are therefore different, despite some claims to the contrary. Consider the following example:

    <?php

    function greeting() {
      echo "Hello, {$name}.";
    }

    $name = $_GET['name'];

    greeting();

    ?>

This example uses two variables, $name (a global) and $_GET['name'] (an element of the $_GET superglobal). It also demonstrates a couple of problems:

  1. $name does not exist in the function’s local scope, so the reference to $name within greeting() is invalid. This generates a notice (a warning as of PHP 8.0).
  2. The intended use of the function demonstrates a cross-site scripting vulnerability, because $_GET['name'] is never inspected.

There are a few ways to provide greeting() with the value of $_GET['name']. One approach is to declare $name a global:

    <?php

    function greeting() {
      global $name;
      echo "Hello, {$name}.";
    }

    ?>

This has the disadvantage of specifying a particular variable name, so anyone who uses this function must be sure that $name is defined in the global scope. This approach also has the disadvantage of obscuring the function’s utility. Because it outputs the value of $name when it is called, someone can easily create a cross-site scripting vulnerability by tainting $name:

    <?php

    $name = $_GET['name'];

    greeting();

    ?>

Without observing the function’s source, it is not obvious that the value of $name is used therein. Contrast this with accepting the name as an argument to the function:

    <?php

    function greeting($name) {
      echo "Hello, {$name}.";
    }

    ?>

With the function defined in this way, a developer must explicitly pass the value to be used by the function:

    <?php greeting($_GET['name']); ?>

You must still observe the function’s source to state with any certainty that this does (or does not) create a cross-site scripting vulnerability, but it is at least clear to the developer that the tainted value of $_GET['name'] is provided to greeting().
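One way to make the function safe regardless of what it is passed is to escape the value for HTML inside the function. The following is a minimal sketch, assuming the output is HTML encoded as UTF-8. The name safe_greeting() is hypothetical, and it returns the string instead of echoing it so that the result is easy to inspect.

```php
<?php

// Hypothetical variant of greeting() that escapes its argument for HTML.
function safe_greeting($name) {
  // ENT_QUOTES escapes both single and double quotes.
  $escaped = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

  return "Hello, {$escaped}.";
}

// Markup in the name is output as text instead of being interpreted.
echo safe_greeting('<b>chris</b>'), "\n";
```

With a function like this, passing $_GET['name'] directly no longer lets the client inject markup, although you may still want to inspect the value for other reasons.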

Superglobals

Superglobals are more than just ubiquitous arrays. Each superglobal provides essential information.

$_POST is an array of values provided in a form that’s defined with a method of POST:

    <form method="POST" action="/process.php">

More specifically, it is the parsed representation of the content of a POST request. PHP populates it when the Content-Type is application/x-www-form-urlencoded or multipart/form-data. A request of the first kind looks like this:

    POST /process.php HTTP/1.1
    Host: example.org
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 36

    name=chris&email=chris%40example.org

For such a request, $_POST['name'] would have a value of chris, and $_POST['email'] would have a value of chris@example.org.
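PHP performs this parsing itself before your script runs, but you can approximate it with parse_str(), which applies similar decoding rules. This is only a sketch of the idea, not a description of PHP’s internals; the variable names are chosen for illustration.

```php
<?php

// The request body from the example above.
$body = 'name=chris&email=chris%40example.org';

// parse_str() decodes a URL-encoded string into an array.
parse_str($body, $post);

echo $post['name'], "\n";  // chris
echo $post['email'], "\n"; // chris@example.org

// The length of the body matches the Content-Length header.
echo strlen($body), "\n";  // 36
```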

The $_GET superglobal is very similar, but it provides a parsed representation of the query string. Thus, despite the name, it is not limited to GET requests. The reason for the name is that form data is provided in the query string of a URL when the form is defined with a method of GET:

    <form method="GET" action="/process.php">
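Because $_GET reflects the query string and not the request method, a POST request to a URL that carries a query string populates $_GET as well. The following rough sketch imitates that parsing with parse_url() and parse_str(); the URL and its parameters are hypothetical.

```php
<?php

// A URL that might appear in the action of a form with method="POST".
$url = '/process.php?step=2&lang=en';

// Extract the query string, then decode it much as PHP does for $_GET.
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query, $get);

echo $get['step'], "\n"; // 2
echo $get['lang'], "\n"; // en
```

Note that the decoded values are strings, so $get['step'] is '2' and not the integer 2.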

The $_COOKIE superglobal is a parsed representation of the Cookie header in the current request. For example, consider a request that provides two cookies:

    GET /cookie.php HTTP/1.1
    Host: example.org
    Cookie: name=chris; email=chris%40example.org

In the requested PHP script (cookie.php), $_COOKIE['name'] has a value of chris, and $_COOKIE['email'] has a value of chris@example.org.
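To make the parsing concrete, here is a simplified sketch of how a Cookie header value can be split into name and value pairs. The function parse_cookie_header() is hypothetical and ignores edge cases that PHP’s own parser handles; it is meant only to show where the values in $_COOKIE come from.

```php
<?php

// Hypothetical helper: split a Cookie header value into name/value pairs.
function parse_cookie_header($header) {
  $cookies = array();

  foreach (explode(';', $header) as $pair) {
    // Split on the first "=" only, since values may contain "=".
    $parts = explode('=', trim($pair), 2);

    if (count($parts) === 2) {
      $cookies[$parts[0]] = urldecode($parts[1]);
    }
  }

  return $cookies;
}

$cookies = parse_cookie_header('name=chris; email=chris%40example.org');

echo $cookies['name'], "\n";  // chris
echo $cookies['email'], "\n"; // chris@example.org
```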

Adding a Set-Cookie header to the response with setcookie() or header() does not modify the contents of $_COOKIE.

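$_REQUEST is a superglobal that combines $_GET, $_POST, and $_COOKIE (exactly which sources are included, and which takes precedence, depends on the request_order and variables_order directives). Because it obscures the source of data, it can increase the risk of certain security vulnerabilities, such as cross-site request forgery (CSRF).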
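The following sketch shows why a combined array obscures the source of data. The function combined_request() is a hypothetical stand-in for how $_REQUEST is assembled, and the merge order it uses (GET, then POST, then cookies) is an assumption made for illustration.

```php
<?php

// Hypothetical stand-in for $_REQUEST. The order of the sources is assumed.
function combined_request(array $get, array $post, array $cookie) {
  // With string keys, later arrays overwrite earlier ones.
  return array_merge($get, $post, $cookie);
}

// The same value, supplied once in the query string and once in a form.
$from_query = combined_request(array('action' => 'delete'), array(), array());
$from_form  = combined_request(array(), array('action' => 'delete'), array());

// The results are identical, so the origin of 'action' is lost.
var_dump($from_query === $from_form); // bool(true)
```

Code that reads only the combined array cannot tell a submitted form from a link that someone was tricked into following.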

$_FILES contains information about uploaded files. Refer to my article on file uploads for more information.

The values in $_SERVER are provided by the web server. This does not mean that these values are trustworthy. In many cases, the HTTP request can affect values in $_SERVER. Notable examples include $_SERVER['PHP_SELF'], $_SERVER['SERVER_NAME'], and all elements with a key that begins with HTTP_, which are derived from request headers.
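The HTTP_ elements follow the CGI naming convention: the header name is uppercased, hyphens become underscores, and HTTP_ is prepended. The helper below is a hypothetical sketch of that mapping. It does not cover the exceptions (Content-Type and Content-Length appear without the prefix), and it is not how PHP implements the conversion.

```php
<?php

// Hypothetical helper: the $_SERVER key under which a request header
// typically appears. Content-Type and Content-Length are exceptions.
function server_key_for_header($header) {
  return 'HTTP_' . strtoupper(str_replace('-', '_', $header));
}

echo server_key_for_header('User-Agent'), "\n";      // HTTP_USER_AGENT
echo server_key_for_header('X-Forwarded-For'), "\n"; // HTTP_X_FORWARDED_FOR
```

Because the client chooses the headers it sends, it also chooses the values of these elements.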

$_ENV is a collection of values provided by the environment.

Values set with Apache’s SetEnv directive are available in $_SERVER, not $_ENV.

The $_SESSION superglobal provides a convenient mechanism for storing and retrieving session data. When you add an element to this array, PHP persists it in the session data store. When you call session_start(), $_SESSION is populated with all session data associated with the current client’s session.

Finally, $GLOBALS is a superglobal containing all global variables. Thus, using it within a function is similar to declaring variables as global. It’s worth noting that it does not follow the same naming convention as the other superglobals, as its name is missing the leading underscore.
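As a brief illustration of the similarity, the two functions below read the same global variable, one with the global keyword and one through $GLOBALS. The function names are hypothetical, and the functions return their greeting instead of echoing it so that the results can be compared.

```php
<?php

// Assign through $GLOBALS so that $name is global wherever this code runs.
$GLOBALS['name'] = 'chris';

// Imports the global $name into the local scope.
function greeting_with_keyword() {
  global $name;

  return "Hello, {$name}.";
}

// Reads the same variable through the $GLOBALS superglobal.
function greeting_with_globals() {
  return "Hello, {$GLOBALS['name']}.";
}

echo greeting_with_keyword(), "\n"; // Hello, chris.
echo greeting_with_globals(), "\n"; // Hello, chris.
```

Both approaches share the drawback described earlier: the dependency on $name is hidden inside the function.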

Is $_POST More Secure than $_REQUEST?

One of the questions raised in the mailing list discussion that inspired this article was whether using $_POST is somehow more secure than using $_REQUEST. The answer is not simple and straightforward. Neither is trustworthy, so it’s difficult to state that one untrustworthy source of information is more or less trustworthy than another. In fact, the difference between the two has more to do with attack vectors than the trustworthiness of the source.

More specifically, using $_REQUEST lowers the bar for cross-site request forgeries, because it’s easier to forge GET requests than POST requests. Since it’s not impossible to forge POST requests, however, it’s not valid to claim that using $_POST prevents cross-site request forgeries. So, why bother distinguishing?

There are two reasons to use $_POST instead of $_REQUEST for forms that perform an action. One is that GET requests are easier to forge, so accepting them increases your security risk. The other is that performing an action in response to a GET request goes against the HTTP specification. RFC 2616 (section 9.1.1) states the following:

In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered “safe”. This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.

Thus, if a GET request can perform an action, your application violates this convention.
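One way to respect this convention is to check the request method before performing an action and to read the data from $_POST only. The sketch below uses a hypothetical handle_delete() function that takes the method and the form data as arguments, so it can be exercised outside a web request. It does not prevent cross-site request forgeries on its own.

```php
<?php

// Hypothetical handler: only perform the action for POST requests.
function handle_delete($method, array $post) {
  if ($method !== 'POST') {
    return 'rejected';
  }

  if (!isset($post['id'])) {
    return 'missing id';
  }

  // The actual action (and any anti-CSRF token check) would go here.
  return 'deleted ' . (int) $post['id'];
}

// In a web request: handle_delete($_SERVER['REQUEST_METHOD'], $_POST);
echo handle_delete('GET', array()), "\n";              // rejected
echo handle_delete('POST', array('id' => '42')), "\n"; // deleted 42
```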

Injection and Initialization

Another statement from the mailing list discussion that inspired this article suggests that enabling the register_globals directive can lead to your global variables being overwritten and that your only protection is the unpredictability of your variable names. This is not true.

When register_globals is enabled, it’s true that tainted data from various remote sources is registered as global variables. However, this takes place before your script executes. (This is why attempting to disable register_globals from within your code, with ini_set() for example, makes no sense.) Thus, by simply initializing a variable, you can be sure that its value cannot be manipulated by a third party. You can try for yourself using a simple example:

    <?php

    $name = 'chris';

    echo $name;

    ?>

This script will output chris, regardless of how it is requested.
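If you don’t have a server with register_globals enabled, you can imitate the timing with extract(), which creates local variables from an array. This is only a simulation under that assumption; the function run_script() is hypothetical, and extract() stands in for the registration that would take place before the script runs.

```php
<?php

// Hypothetical simulation: extract() stands in for register_globals.
function run_script(array $request) {
  // Injection happens first, before the body of the "script".
  extract($request);

  // Initialization overwrites any injected value.
  $name = 'chris';

  return $name;
}

echo run_script(array('name' => 'mallory')), "\n"; // chris
echo run_script(array()), "\n";                    // chris
```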

The point is that it’s very helpful to understand timing. Once control is passed to your script, the request has already been received, and the user’s opportunity to inject malicious data has passed. Thus, by initializing your variables, you put yourself in control of their value.

You should be careful with arrays and realize that simply adding an element to an array is not the same as initializing it. For example, consider the following:

    <?php $names[] = 'chris'; ?>

This example assumes an existing array called $names, so it’s just as bad as appending characters to an uninitialized string and thinking this initializes the string:

    <?php $string .= 'foo'; ?>

In both cases, if malicious data has been injected (in $names or $string), the malicious data persists. Therefore, you should always initialize arrays with the array() construct:

    <?php $names = array(); ?>
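The difference is easy to see in a small simulation. In the sketch below, extract() imitates register_globals by creating variables from request data before the rest of the function runs. Both function names are hypothetical.

```php
<?php

// Appends to $names without initializing it first.
function append_without_init(array $request) {
  extract($request); // simulated injection
  $names[] = 'chris';

  return $names;
}

// Initializes $names before appending to it.
function append_with_init(array $request) {
  extract($request); // simulated injection
  $names = array();
  $names[] = 'chris';

  return $names;
}

$request = array('names' => array('mallory'));

print_r(append_without_init($request)); // contains mallory and chris
print_r(append_with_init($request));    // contains only chris
```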

Until Next Time…

I hope you have a deeper understanding of PHP’s superglobals and how to use them safely and effectively. In many cases, the security of your application has as much to do with your understanding of it as anything else. By understanding your application and the environment in which it operates, you’re in a better position to predict and prevent security vulnerabilities before they are exploited.

Until next month, be safe!