About the Author

Chris Shiflett

Hi, I’m Chris: entrepreneur, community leader, husband, and father. I live and work in Boulder, CO.


All posts for Oct 2006

Formatting and Highlighting PHP Code Listings

For the impatient, here's a direct link to the example that highlights itself:

http://shiflett.org/code/highlight.php

As I mentioned in the previous post, shiflett.org is being redesigned and redeveloped from the ground up. (Nope, it's not finished yet; you'll know it when you see it.) One of the things I want to improve is commenting. This blog has been getting a lot of comments, and I really appreciate that. (Thanks!) Since the topics I talk about (PHP, MySQL, etc.) are technical, I want to let you add formatted code listings to your comments.

I've been playing with this tonight. Feel free to follow along as I go. The first thing you want to do is create an ordered list from the code you want to format ($code in these examples). This provides line numbers, among other things:

<?php

/* HTML Output */
$html = array();

/* Normalize Newlines (replace \r\n before a lone \r, so Windows line endings aren't doubled) */
$code = str_replace(array("\r\n", "\r"), "\n", $code);
$code = preg_replace("!\n\n\n+!", "\n\n", $code);

$lines = explode("\n", $code);

/* Output Listing */
echo "<ol class=\"code\">\n";
foreach ($lines as $line) {
    /* Compare with '' because empty() would also treat a line containing only "0" as blank */
    if ($line === '') {
        $html['line'] = '&#160;';
    } else {
        $html['line'] = htmlentities($line, ENT_QUOTES, 'UTF-8');
    }

    echo "  <li><code>{$html['line']}</code></li>\n";
}
echo "</ol>\n";

?>

In order to make <code> tags preserve whitespace, you can add this to your CSS:

code {
    white-space: pre;
}

Pretty easy, right? Now that you have a good foundation, you can start to improve it. First, add class="even" to every other list item:

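<?php

foreach ($lines as $key => $line) {
    if ($line === '') {
        /* Set the entity directly; htmlentities() would turn it into &amp;#160; */
        $html['line'] = '&#160;';
    } else {
        $html['line'] = htmlentities($line, ENT_QUOTES, 'UTF-8');
    }

    /* Keys start at 0, so odd keys are the even-numbered lines */
    if ($key % 2) {
        echo "    <li class=\"even\"><code>{$html['line']}</code></li>\n";
    } else {
        echo "    <li><code>{$html['line']}</code></li>\n";
    }
}

?>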

This lets you add a subtle background color to the even rows, making the code easier to read:

ol.code li.even {
    background:#f3f3f0;
}

The next step is to add syntax highlighting. This is a bit more involved, but only if you're picky. (I am.) You can use token_get_all() and loop through the tokens yourself, or you can use highlight_string() and try to clean up its output. I have chosen the latter.
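For comparison, the token_get_all() route might look roughly like this. It's only a sketch: the function name tokens_to_html() and the idea of deriving class names from token_name() are assumptions made for illustration, not something taken from the finished code.

```php
<?php

/* Hypothetical helper (not from the post): wrap each PHP token in a <span>. */
function tokens_to_html($code)
{
    $html = '';

    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            /* Array tokens carry a token ID and the matching source text. */
            list($id, $text) = $token;

            /* T_COMMENT becomes "comment", T_OPEN_TAG becomes "open_tag", etc. */
            $class = strtolower(substr(token_name($id), 2));

            $html .= '<span class="' . $class . '">'
                   . htmlentities($text, ENT_QUOTES, 'UTF-8')
                   . '</span>';
        } else {
            /* Single characters such as ; or { arrive as plain strings. */
            $html .= htmlentities($token, ENT_QUOTES, 'UTF-8');
        }
    }

    return $html;
}

echo tokens_to_html('<?php echo "hi"; // greet'), "\n";
```

This gives you one class per token type, which is far more granular than the five categories highlight_string() uses, so you'd still need to map tokens to a smaller set of classes. The rest of this post sticks with highlight_string().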

You can avoid some of the cleanup by using this idea I got from Wez:

<?php
 
ini_set('highlight.comment', 'comment');
ini_set('highlight.default', 'default');
ini_set('highlight.keyword', 'keyword');
ini_set('highlight.string', 'string');
ini_set('highlight.html', 'html');
 
$code = highlight_string($code, TRUE);
 
?>

This gets rid of colors and uses meaningful names instead, but it leaves behind plenty of ugliness. If you're like me, the first thing you want to do is get rid of the extra crap that highlight_string() adds to the front and end of the string:

<?php
 
$code = substr($code, 33, -15);
 
?>

If you're using PHP 4, this is going to be different. You can do something more clever to accommodate both. I didn't.

A simple replacement can turn inline styles into classes:

<?php
 
$code = str_replace('<span style="color: ', '<span class="', $code);
 
?>

If you're using PHP 4, you're going to need to do this for <font> tags instead, but it's the same basic idea.

Might as well turn &nbsp; back into a space, &amp; into &#38;, and <br /> back into a newline while you're at it:

<?php
 
$code = str_replace('&nbsp;', ' ', $code);
$code = str_replace('&amp;', '&#38;', $code);
$code = str_replace('<br />', "\n", $code);
 
?>

Now you can put the pieces together, but there's one more obstacle to overcome. The highlight_string() function closes a <span> tag just before opening the next one, sometimes several lines later. This can yield output that looks like this:

<li><code><span class="comment">...</code></li>
<li><code>...</code></li>
<li><code>...</span></code></li>

You want it to look more balanced, like this:

<li><code><span class="comment">...</span></code></li>
<li><code><span class="comment">...</span></code></li>
<li><code><span class="comment">...</span></code></li>

Feel free to solve this one on your own. (Solving this almost made me wish I had used token_get_all() instead of highlight_string().) If you're interested in seeing my solution, I've got an example that highlights itself, complete with a document type, styles, and everything else needed to make it validate as XHTML 1.0 Strict. (View source if you want to really appreciate the XHTML goodness.)
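If you'd like a starting point, here is one possible approach. It's a sketch, not necessarily what the linked example does. It assumes the earlier replacements have already been applied, so $code contains only <span class="..."> tags, closing </span> tags, and "\n" line breaks. The function name balance_spans() is made up for illustration.

```php
<?php

/* Hypothetical helper (not from the post): make every line's <span> tags balance. */
function balance_spans($code)
{
    $open = array();      /* Stack of <span ...> tags that are currently open */
    $balanced = array();

    foreach (explode("\n", $code) as $line) {
        /* Reopen whatever was still open at the end of the previous line. */
        $prefix = implode('', $open);

        /* Walk the tags on this line, in order, to update the stack. */
        preg_match_all('!<span class="[^"]*">|</span>!', $line, $matches);
        foreach ($matches[0] as $tag) {
            if ($tag === '</span>') {
                array_pop($open);
            } else {
                $open[] = $tag;
            }
        }

        /* Close whatever is still open at the end of this line. */
        $balanced[] = $prefix . $line . str_repeat('</span>', count($open));
    }

    return $balanced;
}

$code = "<span class=\"comment\">/*\n * Example\n */</span>";
foreach (balance_spans($code) as $line) {
    echo "<li><code>{$line}</code></li>\n";
}
```

A side effect is that a line starting with </span> ends up with an empty span at the front. That's harmless and still valid, but you can strip those if they bother you.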

Thanks to Jon Tan for the styles and colors. He's the accessibility, usability, standards, and design expert who's helping with the new site.

I'll probably be making some minor improvements to this code before using it in production on the new site. If you notice any bugs or can think of any improvements, please leave a comment. Thanks!

PHP Tidbits

I'm developing a new web site for shiflett.org from the ground up, focusing on a clean, accessible design. As a result, I've been noticing all of the things I dislike about blogs, mine included. Navigation, commenting, and community are some aspects that I especially hope to improve.

I must admit, though, that instead of diving right in, I've been goofing off. Just for fun, I'd like to share a couple of quick PHP tidbits with you that I wrote instead of starting on the real project at hand. :-)

The first is an example that really shows off how useful a simple REST API can be in combination with SimpleXML. I've been using FeedBurner for a while for my feed, and it's cool to see how many people are subscribed. As part of the new design, I'd like to be able to grab that number without having to use their image. Enter the FeedBurner Awareness API. With two lines of PHP, I'm good to go:

<?php

$info = simplexml_load_file('http://api.feedburner.com/awareness/1.0/GetFeedData?uri=shiflett');
$subscribers = $info->feed->entry['circulation'];

?>

Formatting can be difficult when you have really long URLs, and one of the best solutions I've seen is to shorten URLs to just the first x characters and the last y characters. Something like this:

<?php

function shorten_url($url, $separator = '...', $first_chunk_length = 35, $last_chunk_length = 15)
{
    $url_length = strlen($url);
    $max_length = $first_chunk_length + strlen($separator) + $last_chunk_length;

    if ($url_length > $max_length) {
        return substr_replace($url, $separator, $first_chunk_length, -$last_chunk_length);
    }

    return $url;
}

$url = 'http://averylongdomainname.org/a/very/long/path/to/averylongfilename.pdf';
$short_url = shorten_url($url);

?>

With this, you can link to $url and display $short_url, and it's still pretty clear where the link takes you. Of course, you can also easily adapt it to fit any particular length, and you can even use a real ellipsis instead of ... for the separator.
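As a rough usage sketch, here's how the two values might end up in a link. The link_to() helper is hypothetical (it isn't part of the original code), and shorten_url() is repeated only so the example runs on its own.

```php
<?php

/* Same function as above, repeated so this sketch is self-contained. */
function shorten_url($url, $separator = '...', $first_chunk_length = 35, $last_chunk_length = 15)
{
    $url_length = strlen($url);
    $max_length = $first_chunk_length + strlen($separator) + $last_chunk_length;

    if ($url_length > $max_length) {
        return substr_replace($url, $separator, $first_chunk_length, -$last_chunk_length);
    }

    return $url;
}

/* Hypothetical helper: link to the full URL, display the shortened one. */
function link_to($url, $separator = '...')
{
    $html = array();
    $html['url'] = htmlentities($url, ENT_QUOTES, 'UTF-8');
    $html['short_url'] = htmlentities(shorten_url($url, $separator), ENT_QUOTES, 'UTF-8');

    return "<a href=\"{$html['url']}\">{$html['short_url']}</a>";
}

echo link_to('http://averylongdomainname.org/a/very/long/path/to/averylongfilename.pdf'), "\n";
```

Note that the shortening happens before the escaping. If you pass an HTML entity as the separator, strlen() counts every character of the entity and htmlentities() escapes its ampersand, so a literal UTF-8 ellipsis character is the safer choice here.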

I'm currently writing a test suite for comments, since I want to allow more formatting in the comments as well as maintain strict XHTML. I'm hoping to find an existing solution, but I haven't found anything yet. Once I have my criteria better defined and a decent test suite written, I'll blog more about it.

Firefox 2.0 First Impressions

I've been using Firefox 2.0 for most of the day, and so far, I like it. The biggest disappointment is that it doesn't support HttpOnly cookies. Also, a few of my favorite extensions (del.icio.us, Foxylicious, and LiveHTTPHeaders) aren't compatible, but that's a temporary problem.

The button to close a tab is now on the tab itself, and to help address the increased likelihood that you'll accidentally close a tab (because the close buttons move as the tabs resize), there's an "Undo Close Tab" option.

Microsummaries are pretty cool, and I might implement them on my blog. They're basically little summaries that are small enough to fit in a bookmark label.

If you spend a lot of time commenting on blogs and other sites, you might find the new spell checker useful. It's both intuitive and unobtrusive.

There's also some phishing prevention (which I haven't experimented with yet) and a few other security features, such as support for RFC 3546. Among other things, this extends TLS so that the client can identify the host it wants during the handshake. That alleviates the shared host problem, where the server needs the Host header to figure out which SSL certificate to present, but the header isn't available until the SSL handshake has completed.

For PHP developers, there is a section for web site and application developers with some useful information.

What are your first impressions? Have you tried IE 7.0?

Note: For the cynics among us, Jeremiah Grossman is soliciting guesses for the first Firefox 2.0 vulnerability.

DC PHP Conference Recap

This past Thursday, I attended the DC PHP Conference. Since I was only there for a day, I'm sure I missed a lot, but I did manage to do some of the things on my list.

I attended more talks than usual, including:

Although I didn't see his talk at the conference, Adam Trachtenberg visited OmniTI on Wednesday to give a talk on ext/soap at our weekly developer session.

My talk about PHP Security Testing was just after lunch, and I received a lot of positive feedback. My other talk, The Truth about XSS, was the last talk of the day, and I went over by about 15 minutes. I think this is currently my most interesting talk, and as a testament to this, the room remained packed despite the fact that free beer was available elsewhere. :-) Thanks to everyone who gave up free beer to hear my talk.

I also briefly met David Recordon, one of the guys involved with OpenID. He works at VeriSign, which offers a Personal Identity Provider. This is something Wez has been playing with recently. Hopefully he'll blog about his experiences.

Damien Seguy, who has been tracking PHP 5 adoption statistics for us, mentioned to me that he is gathering statistics from open phpinfo() pages. His statistics reveal that register_globals is enabled on about half of these. (Adam suggested that there is probably a relationship between those who enable register_globals and those who have open phpinfo() pages.) I'm eager to see these statistics published.

Laura, Damien, Adam, and I finished the day at a Chinese restaurant, where I managed to find some spicy food. Damien and Adam both speak Chinese, so I think they appreciated the chance to practice.

All in all, the conference turned out pretty well, and I'm happy to have been a part of it.

Using CSRF for Browser Hijacking

Something the MySpace worm taught us is that traditional safeguards against CSRF (cross-site request forgeries) are rendered ineffective when XSS (cross-site scripting) vulnerabilities exist in a web application. This is because malicious content injected into a web site can do a number of things, such as send HTTP requests and receive HTTP responses with Ajax, or even attack servers on your local network.

The result is that an attacker can perfectly mimic your actions using your own browser. It's something I've labeled browser hijacking, and it's one of the most dangerous examples of CSRF to date. It's also basically the same attack that I've been discussing in recent posts such as Cross-Domain Ajax Insecurity and The Dangers of Cross-Domain Ajax with Flash, except that XSS is an easy way around the same-domain restriction (without requiring an open crossdomain.xml policy).
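For context, the traditional safeguard in question is usually a secret token that's stored in the session and embedded in every form. Here's a minimal sketch, assuming PHP 7 or later for random_bytes() and hash_equals(). The function names and the 'token' key are made up for illustration, and plain arrays stand in for $_SESSION and $_POST so the example runs on its own.

```php
<?php

/* Hypothetical helper: create a token and remember it in the session. */
function csrf_token_create(array &$session)
{
    $session['token'] = bin2hex(random_bytes(32));

    return $session['token'];
}

/* Hypothetical helper: accept a request only if it carries the session's token. */
function csrf_token_valid(array $session, array $post)
{
    return isset($session['token'], $post['token'])
        && is_string($post['token'])
        && hash_equals($session['token'], $post['token']);
}

$session = array();   /* Stands in for $_SESSION */
$token = csrf_token_create($session);

/* The token goes into a hidden form field when the page is generated. */
$html = array();
$html['token'] = htmlentities($token, ENT_QUOTES, 'UTF-8');
echo "<input type=\"hidden\" name=\"token\" value=\"{$html['token']}\" />\n";

/* A forged request from another site doesn't know the token. */
var_dump(csrf_token_valid($session, array('token' => 'guess')));
var_dump(csrf_token_valid($session, array('token' => $token)));
```

Script injected through XSS runs on the same origin as the form, so it can request the page, read the token out of the markup, and send it along with a forged request. That's why a check like this offers no protection once an XSS vulnerability exists.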

Think XSS doesn't matter? Think again. As Johann from ThinkPHP puts it:

Buy one XSS, get a CSRF for free.

I've been speaking at conferences about CSRF for years, and one of the most alarming things I've noted during that time is that very few developers are aware of it, even conceptually. Jeremiah Grossman has noticed this, too, and he likens CSRF to a sleeping giant. RSnake calls it "the attack of the future." SANS is doing their part by trying to raise awareness.

Hopefully good guys will learn about it before too many bad guys do.

DC PHP Conference Is Next Week

I keep forgetting to mention this, but I'll be speaking at the DC PHP Conference next week about PHP Security Testing and The Truth about XSS. For some reason, they haven't posted the schedule yet, but both of my talks are supposed to be on Thursday (19 Oct), and that's probably the only day I'll be able to attend.

Here are a few of the things I'm looking forward to:

There are plenty of good talks to choose from. Rasmus is speaking about Getting Rich with PHP, his excuse to show off the cool stuff he has been playing with recently. You can also attend Laura's talk about Writing Maintainable Code and Ben's talk about XML and Web Services with PHP. The conference schedule lists all of the talks under the speaker biographies.

I hope to see you there next Thursday!

Google Code Search for Security Vulnerabilities

Stephen de Vries sent an email to SecurityFocus's web application security mailing list earlier today to comment on the new Google Code Search:

Google's code search provides an easy way to find obvious software flaws in open source and example applications.

He provided a few example queries to illustrate his point:

There is certainly some potential for abuse. Here are a few queries for PHP and MySQL vulnerabilities off the top of my head:

There are a few false positives in these results, but hopefully it's clear that with a little bit of effort, it's easy to create a collection of queries to search for common web application security vulnerabilities.

Maybe I'm being naive, but I see a silver lining. With this tool that Google has created, it seems possible to develop a useful static analysis tool for the source code that's in the index. As easily as vulnerabilities can be discovered by the bad guys, they can also be discovered by the good guys.

Can you think of some good queries to add to this list? Please share!

The Best City in America for PHP Developers

The Zend Developer Zone recently analyzed the best places to live in America and how the salaries compare for PHP developers. If you were to use this analysis as a metric to decide where to work, apparently you should choose Columbia, MD. Although I don't think that's a good way to make such a decision, I agree with the result.

Why? Well, Columbia happens to be where OmniTI is based. If you're an excellent PHP developer, "these are the days." The PHP job market is red hot, and there are lots of innovative companies looking for talented people. We're one of them.

Why should you choose OmniTI? I think OmniTI is a great place to work, especially for PHP developers. We've got a great learning environment that includes weekly developer meetings. (And, our people are frequent speakers at industry conferences, so the technical presentations can be quite good.) We're smart, friendly people who write popular books and develop open source software. Our client list is beginning to look like a who's who of Web 2.0, which means we get to work on some pretty interesting and challenging projects.

Sound interesting? I hope so. We've got a jobs page that lists the details of our open positions (and contact information), but if you're passionate about PHP, there's probably a place for you here at OmniTI. And, let's face it, if you're reading my blog (which happens to be mostly about PHP and web application security), you're probably interested in the right stuff. :-)

The crossdomain.xml Witch Hunt

Since I disclosed the security vulnerability in Flickr (a result of its crossdomain.xml policy), a number of other major web sites have been identified as being vulnerable to the same exploit: using cross-domain Ajax requests for CSRF. Among these new discoveries are YouTube and Adobe.

This is an inherent risk that exists whenever you disclose a new exploit. Because this exploit is the first of its kind, there are numerous web sites that are potentially vulnerable. I've made a sincere attempt to notify those who I know are vulnerable, but there's only so much a bit of Google searching can reveal.

Roderick Divilbiss wondered why more people aren't paying attention to this discovery:

Such a simple, yet potentially damaging vector. I am dismayed that so few people have bothered to Digg this.

Someone else mentioned that disclosing this vulnerability before Flickr had a chance to fix it would have been a better tactic for spreading the word, but added that he was glad I waited. I'm well aware of the merits of full disclosure, but I prefer to give people time. Flickr certainly didn't abuse my trust and patience; in just 12 days, a fix was in place. If everyone were this responsible, the Web would be a safer place.

For more information about the exploit, see Cross-Domain Ajax Insecurity and The Dangers of Cross-Domain Ajax with Flash.

In a Flash Player TechNote, Adobe warns about an open policy that permits all sites to send cross-domain requests:

This practice is suitable for public servers, but should not be used for sites located behind a firewall because it could permit access to protected areas. It should not be used for sites that require authentication in the form of passwords or cookies.

As written, the warning is a bit unclear. Does Adobe already know about this exploit? Here's their current crossdomain.xml policy:

<cross-domain-policy> 
    <allow-access-from domain="*"/> 
    <allow-access-from domain="*.macromedia.com" secure="false"/> 
    <allow-access-from domain="*.adobe.com" secure="false"/> 
</cross-domain-policy>

If they demonstrate the vulnerability themselves, it doesn't seem likely they're aware of it.
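If you want to check a policy yourself, something like this rough sketch can flag the wide-open case. The function name policy_is_open() is made up for illustration, and it only looks for a literal domain="*" rule, so treat it as a starting point rather than a complete audit.

```php
<?php

/* Hypothetical helper: does a crossdomain.xml policy allow access from any domain? */
function policy_is_open($xml)
{
    /* Collect parse errors quietly instead of emitting warnings. */
    $previous = libxml_use_internal_errors(true);
    $policy = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);

    if ($policy === false) {
        return false;
    }

    foreach ($policy->{'allow-access-from'} as $rule) {
        if ((string) $rule['domain'] === '*') {
            return true;
        }
    }

    return false;
}

$xml = '<cross-domain-policy>
    <allow-access-from domain="*"/>
    <allow-access-from domain="*.macromedia.com" secure="false"/>
    <allow-access-from domain="*.adobe.com" secure="false"/>
</cross-domain-policy>';

var_dump(policy_is_open($xml));
```

In practice you would fetch /crossdomain.xml from the site in question and pass its contents to the function.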

A relatively new site has popped up at crossdomainxml.org that lists sites with open policies. It already lists the new location of Flickr's policy, so it's pretty current.