About the Author

Chris Shiflett

Hi, I'm Chris, a web developer and a founding member of Analog. I live and work in Brooklyn, NY.


More on Filtering Input and Escaping Output

In my previous blog entry, I summarized the two most important steps (in my opinion) that all PHP developers should take to help secure their applications:

  • Filter input
  • Escape output

These are essentially "the least you can do" in terms of security. I consider anything less to be negligent (we all make mistakes, but these mistakes should be the exception and not the norm).
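
As a minimal sketch of these two steps, assuming a form with hypothetical `color` and `username` fields: the function names below are illustrative, and `$clean` follows the naming convention mentioned later in the comments. Only values that pass a filter are copied into `$clean`, and everything is escaped just before it is sent to the browser.

```php
<?php
// Filter input: copy a value into $clean only after it passes a check.
// The field names and rules here are assumptions for illustration.
function filterProfileInput(array $input): array
{
    $clean = [];

    // 'color' must be one of a fixed set of known values.
    $allowedColors = ['red', 'green', 'blue'];
    if (isset($input['color']) && in_array($input['color'], $allowedColors, true)) {
        $clean['color'] = $input['color'];
    }

    // 'username' must be a non-empty, purely alphanumeric string.
    if (isset($input['username'])
        && is_string($input['username'])
        && ctype_alnum($input['username'])) {
        $clean['username'] = $input['username'];
    }

    return $clean;
}

// Escape output: encode every value for the HTML context before printing it.
function escapeForHtml(array $clean): array
{
    $html = [];
    foreach ($clean as $key => $value) {
        $html[$key] = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }
    return $html;
}

// In a real page, $_POST would be passed in place of this sample array.
$clean = filterProfileInput(['color' => 'red', 'username' => 'chris']);
$html  = escapeForHtml($clean);
echo "<p>{$html['username']} likes {$html['color']}</p>\n";
```

Escaping depends on where the data is going: `htmlspecialchars()` fits HTML output, while data sent to a database or a shell needs the escaping appropriate to that destination.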

To my surprise, this simple statement has already been misinterpreted, and this is what prompted me to try to clarify things. Robert Peake writes:

Chris Shiflett has an interesting post on his blog wherein he declares that all PHP security vulnerabilities come from either a lack of filtering input or escaping output.

I hope that's not what I said, especially since it is wrong. :-) Filtering input and escaping output certainly aren't going to protect you from everything, but these two steps can improve the security of your applications substantially with very little effort.

Of course, my simple list leaves out many details, and that's fine. As I mentioned before, this list provides a broad perspective that helps to keep you on track while you focus on the details. I'm trying to help you focus on what's most important, because it's not always practical to implement every safeguard that you know.

The challenge is identifying data that comes from some external source - what is input? Robert mentions something else that I want to correct:

What this really points out once again is that web applications written in PHP do not really need to focus on much more than absolutely everything that a malicious attacker could throw at you through GET, POST or COOKIES (unless they have access to your server ENVIRONMENT ... *shudder*). Once again this means that if register_globals is turned off, these variables can only make their way in neatly packaged into corresponding $_GET, $_POST, and $_COOKIE arrays (as well as $_SESSION).

It is true that all data in $_GET, $_POST, and $_COOKIE is sent from the client and therefore tainted. However, data within $_SESSION is not. This data is persisted on the server and never even exposed over the Internet (unless you have a custom session handler that specifically does this). If you filter data on input, then you will never store tainted data in a session variable. Therefore, you can trust $_SESSION.
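
A short sketch of that idea, assuming a hypothetical `username` field and helper name: the value reaches the session array only after it has been filtered, so nothing tainted is ever persisted. The session array is passed by reference so the example runs without an active session.

```php
<?php
// Hypothetical helper: store a username in the session only if it passes
// the filter. Returns true when the value was accepted and stored.
function rememberUsername(array &$session, array $input): bool
{
    if (isset($input['username'])
        && is_string($input['username'])
        && ctype_alnum($input['username'])) {
        $session['username'] = $input['username'];
        return true;
    }

    // Rejected input never touches the session.
    return false;
}

// In a real page this would be:
//   session_start();
//   rememberUsername($_SESSION, $_POST);
$session = [];
rememberUsername($session, ['username' => 'chris']);           // accepted
rememberUsername($session, ['username' => '<script>alert(1)']); // rejected
```

With this discipline, trusting `$_SESSION` comes down to trusting the session data store itself.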

$_SERVER contains a mixture. Some of this data is provided by the web server, and some is provided by the client. Try this simple quiz.

Where does the data in each of the following PHP variables originate?

1. $_SERVER['DOCUMENT_ROOT']
2. $_SERVER['HTTP_HOST']
3. $_SERVER['REQUEST_URI']
4. $_SERVER['SCRIPT_NAME']

About This Post

More on Filtering Input and Escaping Output was posted on Wed, 09 Feb 2005 at 01:37:15 GMT.

10 Comments

1. ryan king said:

I think the key phrase here would be that these two steps are "necessary but not sufficient."

Wed, 09 Feb 2005 at 06:16:42 GMT


2. Robert said:

Glad to see you're fleshing this out. If we're really getting down to brass tacks, I wouldn't go so far as to say, "you can trust $_SESSION." The reference comes in via $_GET, $_POST, or $_COOKIE -- and data dependent on user-defined data is equally as dangerous.

Wed, 09 Feb 2005 at 16:49:18 GMT


3. Robert said:

p.s. -- all apologies if I misquoted you. Not trying to misrepresent what you say -- I just think you may be saying something that is more revolutionary than you realize.

Wed, 09 Feb 2005 at 16:59:32 GMT


4. Chris Shiflett said:

No worries. I knew you didn't intend to misquote me, but I wanted to make sure that no one relied on these two steps to protect them from everything. :-)

As for $_SESSION, I do consider it to be safe. It's not as safe as $clean (the array that most of my data filtering examples use), only because $clean does not persist. With $_SESSION, you are relying on the security of the session data store. So, if you're on a shared host, this is something to keep in mind. If you're storing session data in a database, the security of that database is important.

Strictly focusing on the application, however, $_SESSION is safe.

Wed, 09 Feb 2005 at 17:24:16 GMT


5. Chris Shiflett said:

By the way, is no one going to take the quiz? :-)

Wed, 09 Feb 2005 at 17:26:28 GMT


6. Dan said:

I'll play your game, you rogue.

1. Server

2. Client

3. Client

4. Server (though $_SERVER['SCRIPT_FILENAME'] could conceivably come from the client, I believe, if they specify a relative path)

Man, I hate tests! All the pressure...

Wed, 09 Feb 2005 at 18:28:47 GMT


7. Geoffrey Young said:

well, because they are in $_SERVER don't they come from the server by definition?

;)

so long as we're talking about apache and not some other php variant, these are each created via calls to apache's special APIs, ap_add_cgi_vars() and ap_add_common_vars(), which actually put these values into the execution environment. so anyone who can intercept the root of these values before those APIs are called can cause your data to be tainted...

1. $_SERVER['DOCUMENT_ROOT'] - this is taken straight from the private DocumentRoot httpd.conf configuration variable (so long as someone like mod_perl hasn't altered it before PHP took a peek at it).

2. $_SERVER['HTTP_HOST'] - ok this, like many others, is taken straight from the incoming request headers. in the case of host, it doesn't even need to be correct - you can use an absolute URI for the request and mungle the Host header and things will probably still work. not the best thing to trust, in any case.

3. $_SERVER['REQUEST_URI'] - this is taken directly from the GET request line, so I guess that makes it originate from the client. however, apache will validate this somewhat (for instance apache 1.3 will immediately fail if the URI doesn't start with a slash). of course, mod_perl can muck with this value too, as can any number of input filters in httpd-2.0.

4. $_SERVER['SCRIPT_NAME'] - this is calculated by the server based on the requested URI, so again it's _kinda_ coming from the client, but in the same respect as the request line.

still feel safe? :)

Wed, 09 Feb 2005 at 20:36:33 GMT


8. Chris Shiflett said:

Hey, no ASF members! That's cheating! :-)

Wed, 09 Feb 2005 at 22:13:22 GMT


9. Geoffrey Young said:

:)

Wed, 09 Feb 2005 at 23:01:07 GMT


10. pavel said:

what about PHP_SELF? so many newbies like to use it as a target for their forms, but they forget that this one is as easy to manipulate as the address bar of the browser.

just add something like

foo.php/<a%20href="asd"><hr>

to the url in order to provoke an xss vulnerability.

just take a look at one of such "masterpieces": http://design.definitelymaybe.org/guestbook/example.php

Mon, 26 Sep 2005 at 21:43:31 GMT
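
The PHP_SELF problem raised in comment 10 is a case of unescaped output. A minimal sketch of one way to handle it, assuming a hypothetical helper named `formAction()`: the value is treated like any other client-influenced data and escaped before it is placed in an HTML attribute.

```php
<?php
// Hypothetical helper: escape a PHP_SELF-style value for use in an
// HTML attribute such as a form's action.
function formAction(string $phpSelf): string
{
    return htmlspecialchars($phpSelf, ENT_QUOTES, 'UTF-8');
}

// Sample value like the one in the comment; PHP has already URL-decoded it,
// so %20 appears as a space.
$tainted = '/foo.php/<a href="asd"><hr>';

// In a real page, $_SERVER['PHP_SELF'] would be passed instead of $tainted.
echo '<form action="' . formAction($tainted) . '" method="post">' . "\n";
```

The injected markup is rendered as inert text inside the attribute instead of being interpreted by the browser.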

