About the Author

Chris Shiflett

Chris Shiflett is an author and speaker who leads the web application security practice at OmniTI.


Google XSS Example

In the comments to my previous blog post, Ivo Jansch asks:

To be able to comprehend how this may affect my website, could you explain how this could be exploited, even though you cannot demonstrate it?

Rather than offer another vague answer, I decided to provide a very simple proof of concept that demonstrates how character encoding inconsistencies can bite you. Google's vulnerability has of course been fixed, but with a simple PHP script, we can reproduce the situation:

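<?php

// Tell the browser to interpret the page as UTF-7.
header('Content-Type: text/html; charset=UTF-7');

// The attack payload, converted to UTF-7 for this demonstration.
$string = "<script>alert('XSS');</script>";
$string = mb_convert_encoding($string, 'UTF-7');

// htmlentities() is called without a charset and does not recognize the UTF-7 markup.
echo htmlentities($string);

?>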

If you run this PHP script, you should see a popup window.

Although the output is escaped with htmlentities(), the JavaScript is still executed by the browser.

The example attack is a UTF-7 string (I just use mb_convert_encoding() for this demonstration), and the browser interprets the page as UTF-7 due to the Content-Type header. Internet Explorer makes this assumption automatically (thus, you can remove the explicit header() call), but this example should work in any browser that supports UTF-7.
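To see why the escaping has no effect, it helps to look at the UTF-7 form of the payload, which is the same string Josh Dechant quotes from Safari in the comments. The following is only a minimal sketch; the explicit ENT_COMPAT flag and ISO-8859-1 charset are assumptions added here so that the result does not depend on the defaults of a particular PHP version.

```php
<?php
// The UTF-7 form of the payload, as quoted in the comments on this post.
$utf7 = "+ADw-script+AD4-alert('XSS')+ADsAPA-/script+AD4-";

// None of the characters that HTML escaping targets appear in it.
$hasMarkup = (strpbrk($utf7, '<>&"') !== false);

// ENT_COMPAT escapes <, >, & and double quotes, leaving single quotes alone.
$escaped = htmlentities($utf7, ENT_COMPAT, 'ISO-8859-1');

var_dump($hasMarkup);          // bool(false)
var_dump($escaped === $utf7);  // bool(true)
```

Because the string is plain ASCII with no angle brackets, the escaping function has nothing to escape, and the markup only reappears once the browser decodes the page as UTF-7.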

Hopefully developers will begin to appreciate the necessity of character encoding consistency. If anyone ever tries to claim that it doesn't matter, you can point them here. :-)
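One way to get that consistency is to name the same character encoding in both the Content-Type header and the htmlentities() call. This is a rough sketch rather than a definitive implementation: it assumes the application uses UTF-8 throughout, and the APP_CHARSET constant and escape_html() helper are hypothetical names introduced only for illustration.

```php
<?php
// Assumption: the application uses UTF-8 everywhere.
const APP_CHARSET = 'UTF-8';

// Tell the browser which encoding to use, so it never has to guess.
header('Content-Type: text/html; charset=' . APP_CHARSET);

// Hypothetical helper: escape with the same encoding the header declares.
function escape_html(string $value): string
{
    return htmlentities($value, ENT_QUOTES, APP_CHARSET);
}

// Prints: &lt;script&gt;alert(&#039;XSS&#039;);&lt;/script&gt;
echo escape_html("<script>alert('XSS');</script>");
```

With the charset declared, the browser no longer guesses, and the escaping function and the browser agree on how the bytes are interpreted.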

About This Post

Google XSS Example was posted on Wed, 21 Dec 2005 at 20:15:17 GMT.

32 Comments

1. Mike (SpikeZ - Sitepoint) said:

Hi Chris,

Good article and very helpful.

Shows a glaring weakness in many 'secure' sites.

Cheers

Mike

Wed, 21 Dec 2005 at 20:47:24 GMT Link


2. Alex said:

Once upon a time there was a security expert who-must-not-be-named.

Day after day he consulted others and guided them in preventing XSS. He likes to criticize all others and hates it when someone does the same to him. Then he starts to cry and delete comments, so the next time he criticizes a guy for insecurity he still has a clean record.

Some (let's call them death eaters) try to protect him.

But like in each fairy story, some day there will be a happy end - and he-who-must-not-be-named will fall.....

Wed, 21 Dec 2005 at 21:19:55 GMT Link


3. Ivo Jansch said:

Thanks Chris, the example is very clear now :)

Wed, 21 Dec 2005 at 21:56:14 GMT Link


4. Ilia Alshanetsky said:

Actually, the problem is not limited to Internet Explorer; Mozilla Firefox 1.5 exhibits the exact same behaviour.

If you enable automatic character set detection either browser will trigger the XSS without the call to the header() function. The difference is that in Firefox to trigger the header-less problem the auto-detection needs to be configured to detect utf-7. If it is not, then the exploit does not happen.

Wed, 21 Dec 2005 at 23:35:53 GMT Link


5. Josh Dechant said:

Try as I might, this exploit does not work in Safari. It does in Firefox Mac, but Safari won't have it and just prints out "+ADw-script+AD4-alert('XSS')+ADsAPA-/script+AD4-"

Thu, 22 Dec 2005 at 14:18:45 GMT Link


6. Chris Shiflett said:

Hi Josh,

I just tried in Safari and got the same results.

If you go to View > Text Encoding, you'll see that UTF-7 is not an encoding that Safari supports, so that's probably why.

Thu, 22 Dec 2005 at 14:33:14 GMT Link


7. Josh Dechant said:

Chris, yes, after I posted I noticed that Safari does not support UTF-7. But if you think about it, it really does make sense for a browser to not support UTF-7, since it is largely meant for email. Though Mail.app doesn't have UTF-7 either, and nor does TextEdit or any other Apple app that I can see, so it seems that Apple has made a blanket decision to not support UTF-7 for one reason or another.

Ironically enough, though, with the explicit header removed, the dead IE 5 Mac does not automatically set the encoding to UTF-7... And unless you really dig into Firefox, it won't autodetect UTF-7 either, so while I agree we should be consistent in our encoding, this seems to be equally an XSS exploit and an IE PC exploit.

Thu, 22 Dec 2005 at 14:49:20 GMT Link


8. DewChugr said:

As someone fairly new to PHP and web programming I have read several of your articles and I have to say they are filled with great information. It's nice that you share this stuff with everyone. I don't really get all this UTF stuff yet so I'll google around and try to learn some more.

Thanks

Thu, 22 Dec 2005 at 22:26:17 GMT Link


9. Chris Shiflett said:

Thanks for the kind words. I really appreciate it. :-)

Andrei Zmievski is working on PHP's Unicode support and has given some good talks on the topic:

http://www.gravitonic.com/talks/

Wikipedia has a pretty good description of UTF-7:

http://en.wikipedia.org/wiki/UTF-7

Skip down to the description, as I think it's the most informative.

Hope that helps!

Thu, 22 Dec 2005 at 23:29:51 GMT Link


10. DewChugr said:

Thanks for the links, very helpful.

btw, this blog/commenting system. Did you code it or is it a package? It's very slick...

Fri, 23 Dec 2005 at 14:54:53 GMT Link


11. Harry Fuecks said:

Interesting.

UTF-7 would be considered "valid" by common techniques used to validate encodings e.g. this regex: http://www.w3.org/International/questions/qa-forms-utf-8 - UTF-7 would pass, being in the ASCII range.

So we're saying the only time this is a risk is if it's left up to the browser to guess the encoding? I.e. make sure you declare the charset with a header or a meta tag?

Sun, 25 Dec 2005 at 01:00:42 GMT Link


12. joh said:

if you want to learn about Unicode, you should start here:

"The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"

http://joelonsoftware.com/articles/Unicode.html

Sun, 25 Dec 2005 at 08:51:55 GMT Link


13. Miggy said:

That was shown on http://ha.ckers.org/xss.html a while back too.

Mon, 02 Jan 2006 at 00:10:45 GMT Link


14. Paul Davey said:

Is it because the htmlentities function is not compatible with UTF-7 and hence does not alter the HTML? I did this for example (after your code):

if (htmlentities($string) == $string) {
    echo "NO CHANGE!";
}

And it displayed NO CHANGE

Sun, 15 Jan 2006 at 16:36:52 GMT Link


15. Chris Shiflett said:

Harry, I would add one more thing - we should indicate the charset in both our Content-Type headers and in our htmlentities() calls.

The previous post might make this clearer and provides an example:

http://shiflett.org/archive/177

Paul, this example works because htmlentities() assumes ISO-8859-1 by default, but the content is UTF-7. This mismatch causes htmlentities() to misinterpret characters.

We want to make sure our escaping functions and the remote systems to which we're sending data interpret data consistently, otherwise vulnerabilities like this are possible.

Sun, 15 Jan 2006 at 16:54:05 GMT Link


16. Luis said:

The example is sui generis because UTF-7 is not supported by htmlentities(). Passing UTF-8 to it avoids the popup, but you see no text either (though looking at the source you see something like the Safari output above). On the other hand, if the encoding had been UTF-8, not passing it to htmlentities() would not cause a popup either, probably because for this example UTF-8 and ISO-8859-1 are the same. So the example is good but not perfect.

Thu, 02 Feb 2006 at 16:52:59 GMT Link


17. Chris Shiflett said:

Hi Luis,

The example is only meant to reproduce Google's XSS vulnerability and highlight the importance of character encoding consistency. It's not a contrived example.

The idea isn't to use UTF-7 but to show that it's worth being explicit about which character encoding you're using. By specifying this in the Content-Type header and the htmlentities() function call, you're protected from these types of vulnerabilities.

Thu, 02 Feb 2006 at 17:20:14 GMT Link


18. Steph said:

In regards to the comment from "Alex": I have no idea who you are, but wtf are you doing posting your trivial life problems with someone in the comments? I think it would be best if the blogger deleted your comment and you stuck to something on topic regarding character encoding and XSS issues, not some useless "I want to sound smart with my metaphors" comments.

If you do have a valid point regarding something, well, then post a link to somewhere where you might have a sensible discussion of the issues, not some fairy tale garbage.

I'd like to point out that I am all for your useless comments being removed; they serve no purpose on this blog.

Also note that I will not be offended if the blogger does not wish this comment section to become a war of words and deletes my comment. It's his blog and he should make the decisions based on what he thinks is best for his readers.

Wed, 01 Mar 2006 at 05:04:53 GMT Link


19. Nate Klaiber said:

Steph,

I wouldn't get too worked up over 'Alex'. People like this, who hide behind the Internet, lose all respect and credibility anyway. So the whiny attitude and comment approach he has taken holds no water, and deserves to be removed.

C'mon, if you are going to act tough, at least post some contact information. Some people need to grow up, period. I say delete the comments; they are useless and provide nothing of worth here. He just needs to go back to hardened-php...er...um....

Back on topic: I have tried this (as others have) in Safari and have been unable to duplicate any results (but I understand some may have been fixed, etc). I still want to see more of this with working examples, so I will test it later on a local machine. I understand what is being said, but sometimes it's easier to see HOW it could maliciously affect you if ignored.

Thanks for the great information....

Wed, 01 Mar 2006 at 15:25:49 GMT Link


20. Chris Shiflett said:

I don't like to delete comments, so I try to only do so when they're spam, off-topic, or flagrant.

Nate, you won't be able to try this in Safari, because it doesn't support UTF-7. In browsers that automatically detect the encoding, you can remove the header() call, which more closely resembles the problem Google was having.

Wed, 01 Mar 2006 at 15:53:54 GMT Link


21. Steven Roddis said:

Please note table.2:

http://www.php.net/htmlentities

It shows the character sets that are supported in PHP 4.3.0 and later. UTF-7 is not one, therefore htmlentities() is not going to escape it.

So the problem is that the developer did not understand which character sets are supported and which aren't.

Steven

Wed, 19 Apr 2006 at 06:31:36 GMT Link


22. its not important to know my name said:

What exactly was the worst-case possibility of Google's vulnerability? How could it have been used to bypass security?

Sat, 22 Apr 2006 at 17:43:51 GMT Link


23. Michael said:

Editorial translation/audio

Fri, 04 Aug 2006 at 00:17:16 GMT Link


24. Mikispag said:

Great tip! Thanks!

Mon, 27 Nov 2006 at 15:35:27 GMT Link


25. SEO Blog said:

Very good information for securing a website. I will check this. Thanks!

Sat, 16 Dec 2006 at 18:02:48 GMT Link


26. Bourse said:

I never imagined that character encoding could be that important! I am not an expert in web programming and I am new to the subject of cross-site scripting (actually I am currently developing a website and this is the first time I am doing the coding entirely on my own), so a big thanks from me for sharing such useful information on your site. I have already read several articles here that really came in handy.

Sun, 11 Mar 2007 at 15:43:25 GMT Link


27. Tereska said:

Replace last line with this one:

echo htmlentities($string, ENT_QUOTES);

and this hack will not work....

learn PHP guys ;))

Tue, 29 May 2007 at 01:06:05 GMT Link


28. Chris Shiflett said:

Hi Tereska,

If you think this problem has to do with whether quotes are escaped, then you're the one with some learning to do.

Because you failed to indicate the character encoding, your example is vulnerable to XSS. I'm surprised you made this particular mistake, because it's the focal point of this post.

Tue, 29 May 2007 at 01:39:39 GMT Link


29. Tereska said:

Sorry for my E ;)

Chris, I didn't want to offend anyone, so I'm sorry for my "learn PHP" sentence :) It's just a misunderstanding... :)

I'm really concerned about this XSS example and I've tried to do something to make this hack useless...

I think the KEY in this example is the htmlentities() 3rd parameter -> [, string $charset]. If I'm wrong, just correct me.

Thanks! Seeyaa!

Tue, 29 May 2007 at 23:03:05 GMT Link


30. Daniel said:

What if you convert the submitted variables to UTF-8 (or your application encoding) before processing?

if(!is_myEncode($var)) Encode($var);

Thus, you will have consistent values.

Wed, 30 May 2007 at 06:23:40 GMT Link
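A rough sketch of the kind of check Daniel describes might look like the following. It assumes UTF-8 as the application encoding, and is_valid_input() is a hypothetical helper name. It validates rather than converts, since the source encoding of submitted data is generally not known. As Harry Fuecks notes above, a UTF-7 payload is plain ASCII and passes this kind of validation, so the check complements declaring the charset rather than replacing it.

```php
<?php
// Hypothetical helper: accept input only if it is valid in the
// application encoding (assumed here to be UTF-8).
function is_valid_input(string $value, string $appEncoding = 'UTF-8'): bool
{
    return mb_check_encoding($value, $appEncoding);
}

var_dump(is_valid_input("caf\xC3\xA9"));       // valid UTF-8: bool(true)
var_dump(is_valid_input("caf\xE9"));           // lone ISO-8859-1 byte: bool(false)
var_dump(is_valid_input("+ADw-script+AD4-"));  // UTF-7 is plain ASCII: bool(true)
```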


31. Thijs Wijnmaalen said:

Does anybody know if the Smarty modifier 'escape' is vulnerable to this attack?

Wed, 20 Jun 2007 at 15:23:53 GMT Link


32. Jim said:

Brilliant article! Thanks!

Thu, 22 May 2008 at 14:22:10 GMT Link

