About the Author

Chris Shiflett

Hi, I’m Chris: web craftsman, community leader, husband, father, and partner at Fictive Kin.


More on Filtering Input and Escaping Output

In my previous blog entry, I summarized the two most important steps (in my opinion) that all PHP developers should take to help secure their applications:

  • Filter input
  • Escape output

These are essentially "the least you can do" in terms of security. I consider anything less to be negligent (we all make mistakes, but these mistakes should be the exception and not the norm).

To my surprise, this simple statement has already been misinterpreted, and this is what prompted me to try to clarify things. Robert Peake writes:

Chris Shiflett has an interesting post on his blog wherein he declares that all PHP security vulnerabilities come from either a lack of filtering input or escaping output.

I hope that's not what I said, especially since it is wrong. :-) Filtering input and escaping output certainly aren't going to protect you from everything, but these two steps can improve the security of your applications substantially with very little effort.

Of course, my simple list leaves out many details, and that's fine. As I mentioned before, this list provides a broad perspective that helps to keep you on track while you focus on the details. I'm trying to help you focus on what's most important, because it's not always practical to implement every safeguard that you know.
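As a minimal sketch of the two steps (not a complete defense), the example below filters one form field against a whitelist and escapes it before output. The field name username, the helper functions filter_username() and escape_html(), and the $html array are assumptions made for illustration; $clean simply holds data that has passed a filter.

```php
<?php
// Hypothetical whitelist filter: accept the field only if it is a
// non-empty alphanumeric string; otherwise report failure with null.
function filter_username(array $input): ?string
{
    if (isset($input['username'])
        && is_string($input['username'])
        && ctype_alnum($input['username'])) {
        return $input['username'];
    }
    return null;
}

// Escape a value for the HTML context.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$clean = []; // filtered data only
$html  = []; // data escaped for HTML output (assumed naming)

$username = filter_username($_POST);
if ($username !== null) {
    $clean['username'] = $username;
    $html['username']  = escape_html($clean['username']);
    echo "<p>Welcome back, {$html['username']}.</p>";
}
```

Escaping is still applied after filtering: the filter decides whether the data is valid, and the escape step keeps it from being misread in the context it is sent to.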

The challenge is identifying data that comes from some external source - what is input? Robert mentions something else that I want to correct:

What this really points out once again is that web applications written in PHP do not really need to focus on much more than absolutely everything that a malicious attacker could throw at you through GET, POST or COOKIES (unless they have access to your server ENVIRONMENT ... *shudder*). Once again this means that if register_globals is turned off, these variables can only make their way in neatly packaged into corresponding $_GET, $_POST, and $_COOKIE arrays (as well as $_SESSION).

It is true that all data in $_GET, $_POST, and $_COOKIE is sent from the client and therefore tainted. However, data within $_SESSION is not. This data is persisted on the server and never even exposed over the Internet (unless you have a custom session handler that specifically does this). If you filter data on input, then you will never store tainted data in a session variable. Therefore, you can trust $_SESSION.
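A short sketch of that idea, assuming a hypothetical color field and helper named remember_color(): the value reaches the session store only after it passes a whitelist, so anything read back from the session later has already been filtered.

```php
<?php
// Hypothetical helper: copy a value into the session array only if it
// matches one of the allowed values. Returns true when the value was stored.
function remember_color(array $input, array &$session): bool
{
    $allowed = ['red', 'green', 'blue'];

    if (isset($input['color']) && in_array($input['color'], $allowed, true)) {
        $session['color'] = $input['color'];
        return true;
    }
    return false; // tainted or missing data never reaches the session
}

// In a real request this would be called roughly as:
//   session_start();
//   remember_color($_POST, $_SESSION);
```

Passing the session array by reference keeps the helper easy to exercise outside of a web request; the trust placed in the stored value still depends on the security of the session data store itself.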

$_SERVER contains a mixture. Some of this data is provided by the web server, and some is provided by the client. Try this simple quiz.

Where does the data in each of the following PHP variables originate?

1. $_SERVER['DOCUMENT_ROOT']
2. $_SERVER['HTTP_HOST']
3. $_SERVER['REQUEST_URI']
4. $_SERVER['SCRIPT_NAME']
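
Because $_SERVER mixes values set by the server with values taken from the request, one cautious approach is to filter any entry whose origin you are unsure of, just like other input. A minimal sketch, using the Host header as the example; filter_host() and the list of known hosts are hypothetical:

```php
<?php
// Hypothetical helper: accept the Host value only if it appears on a
// list of host names the application is known to serve.
function filter_host(array $server, array $knownHosts): ?string
{
    $host = $server['HTTP_HOST'] ?? null;

    if (is_string($host) && in_array(strtolower($host), $knownHosts, true)) {
        return strtolower($host);
    }
    return null; // unknown or missing host: treat as invalid
}

// In a real request, roughly:
//   $clean['host'] = filter_host($_SERVER, ['example.org', 'www.example.org']);
```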

About this post

More on Filtering Input and Escaping Output was posted on Tue, 08 Feb 2005.

10 comments

1.ryan king said:

I think the key phrase here would be that these two steps are "necessary but not sufficient."

Wed, 09 Feb 2005 at 06:16:42 GMT Link


2.Robert said:

Glad to see you're fleshing this out. If we're really getting down to brass tacks, I wouldn't go so far as to say, "you can trust $_SESSION." The reference comes in via $_GET, $_POST, or $_COOKIE -- and data dependent on user-defined data is equally as dangerous.

Wed, 09 Feb 2005 at 16:49:18 GMT Link


3.Robert said:

p.s. -- all apologies if I misquoted you. Not trying to misrepresent what you say -- I just think you may be saying something that is more revolutionary than you realize.

Wed, 09 Feb 2005 at 16:59:32 GMT Link


4.Chris Shiflett said:

No worries. I knew you didn't intend to misquote me, but I wanted to make sure that no one relied on these two steps to protect them from everything. :-)

As for $_SESSION, I do consider it to be safe. It's not as safe as $clean (the array that most of my data filtering examples use), only because $clean does not persist. With $_SESSION, you are relying on the security of the session data store. So, if you're on a shared host, this is something to keep in mind. If you're storing session data in a database, the security of that database is important.

Strictly focusing on the application, however, $_SESSION is safe.

Wed, 09 Feb 2005 at 17:24:16 GMT Link


5.Chris Shiflett said:

By the way, is no one going to take the quiz? :-)

Wed, 09 Feb 2005 at 17:26:28 GMT Link


6.Dan said:

I'll play your game, you rogue.

1. Server

2. Client

3. Client

4. Server (though $_SERVER['SCRIPT_FILENAME'] could conceivably come from the client, I believe, if they specify a relative path)

Man, I hate tests! All the pressure...

Wed, 09 Feb 2005 at 18:28:47 GMT Link


7.Geoffrey Young said:

well, because they are in $_SERVER don't they come from the server by definition?

;)

so long as we're talking about apache and not some other php variant, these are each created via calls to apache's special APIs, ap_add_cgi_vars() and ap_add_common_vars(), which actually put these values into the execution environment. so anyone who can intercept the root of these values before those APIs are called can cause your data to be tainted...

1. $_SERVER['DOCUMENT_ROOT'] - this is taken straight from the private DocumentRoot httpd.conf configuration variable (so long as someone like mod_perl hasn't altered it before PHP took a peek at it).

2. $_SERVER['HTTP_HOST'] - ok this, like many others, is taken straight from the incoming request headers. in the case of host, it doesn't even need to be correct - you can use an absolute URI for the request and mungle the Host header and things will probably still work. not the best thing to trust, in any case.

3. $_SERVER['REQUEST_URI'] - this is taken directly from the GET request line, so I guess that makes it originate from the client. however, apache will validate this somewhat (for instance apache 1.3 will immediately fail if the URI doesn't start with a slash). of course, mod_perl can muck with this value too, as can any number of input filters in httpd-2.0.

4. $_SERVER['SCRIPT_NAME'] - this is calculated by the server based on the requested URI, so again it's _kinda_ coming from the client, but in the same respect as the request line.

still feel safe? :)

Wed, 09 Feb 2005 at 20:36:33 GMT Link


8.Chris Shiflett said:

Hey, no ASF members! That's cheating! :-)

Wed, 09 Feb 2005 at 22:13:22 GMT Link


9.Geoffrey Young said:

:)

Wed, 09 Feb 2005 at 23:01:07 GMT Link


10.pavel said:

what about PHP_SELF? so many newbies like to use it as a target for their forms, but they forget that this one is as easy to manipulate as the address bar of the browser.

just add something like

foo.php/<a%20href="asd"><hr>

to the url in order to provoke an xss vulnerability.

just take a look at one of such "masterpieces": http://design.definitelymaybe.org/guestbook/example.php

Mon, 26 Sep 2005 at 21:43:31 GMT Link

