Google Web Accelerator Debate

19 Dec 2006

I was browsing Ajaxian and stumbled upon a rant from late last year about Google Web Accelerator (GWA):

Google has reintroduced their Google Web Accelerator with a vengeance. It was evil enough the first time around, but this time it's downright scary.

This is sensationalism at its worst. There doesn't appear to be anything new since the first time this issue was discussed.

In case you missed it, the controversy is whether GWA should pre-fetch links on a page. Some web applications use links for important actions like deleting content, and when such "pages" are pre-fetched, GWA is actually performing these actions rather than just fetching content. There is apparently an ongoing debate about where to place the blame.

In 2002, I wrote HTTP Developer's Handbook. As a result, whether right or wrong, I have some pretty strong opinions about how the HTTP specification is being interpreted. However, before discussing SHOULD versus MUST, there is an important point being missed by those who are quick to blame Google for their own mistakes. Every example I have seen of a link that performs an action either will not be pre-fetched by GWA because it has a query string, or is vulnerable to CSRF (cross-site request forgery). Sometimes it's both.
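To make the query string point concrete, here is a minimal sketch in Python. It models only the rule as described in this post (a link with a query string is skipped) and is not GWA's actual logic; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def is_prefetch_candidate(href: str) -> bool:
    """Return True if the link has no query string.

    Assumption: this mirrors only the rule described in this post,
    not the real behavior of any particular accelerator.
    """
    return urlsplit(href).query == ""


if __name__ == "__main__":
    for href in ("/delete.php?item=socks", "/delete.php"):
        print(href, is_prefetch_candidate(href))
```

Under this rule, a link is only at risk of being followed automatically when everything it needs is in the path.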

Here are some examples:

  1. <a href="/delete.php?item=socks">Remove Socks from Cart</a>
  2. <a href="/delete.php">Remove All from Cart</a>

Although only the second link is eligible for pre-fetching by GWA, both of these links are vulnerable to CSRF. (Read my article about CSRF if you're not sure what it is.) Anyone who is logged in and sends a request for one of these URLs will perform the indicated action, whether or not they intended to. You can add an anti-CSRF token to these links to help protect against this vulnerability:

  1. <a href="/delete.php?item=socks&token=abcd">Remove Socks from Cart</a>
  2. <a href="/delete.php?token=abcd">Remove All from Cart</a>

If you don't use an anti-CSRF token, you need some other safeguard against CSRF, such as two-factor authentication.

The string abcd is a placeholder intended to represent a random string - a shared secret between the server and a single client. Because these links each include a query string, neither is eligible for pre-fetching by GWA. If you're like me, you hate query strings anyway, because they're ugly. :-) Let's try another example:

  1. <a href="/delete/socks/abcd">Remove Socks from Cart</a>
  2. <a href="/delete/all/abcd">Remove All from Cart</a>

Whether you prefer subject-verb or verb-subject, I think you'll agree that the presence of the anti-CSRF token in the URL is ugly. However, both of these links are eligible for pre-fetching by GWA, and neither is vulnerable to CSRF. Those who want to blame Google can only point to examples like this. As far as I know, it's the only type of link that presents a valid case against GWA's behavior, yet it's not one that I've seen cited anywhere.
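For readers who want to see what the abcd placeholder stands for, here is a rough sketch in Python of issuing a token and checking it on a path-style link. The session dictionary, the csrf_token key, and all three function names are assumptions made for illustration; a real application would use its framework's session handling.

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Create a random token and remember it in the user's session."""
    token = secrets.token_urlsafe(16)
    session["csrf_token"] = token  # hypothetical session key
    return token


def delete_link(item: str, token: str) -> str:
    """Build a path-style link such as /delete/socks/<token>."""
    return f"/delete/{item}/{token}"


def is_valid_request(path: str, session: dict) -> bool:
    """Accept the request only if the last path segment matches the session token."""
    expected = session.get("csrf_token")
    if not expected:
        return False
    supplied = path.rstrip("/").rsplit("/", 1)[-1]
    # Compare as bytes, in constant time, to avoid leaking the token via timing.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```

The token stops a forged request from another site, because the attacker cannot guess it. It does nothing to stop software that follows the genuine link on the user's behalf, which is why this style of link is the one case worth arguing about.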

Regarding the HTTP specification, there is the oft-quoted section on safe methods:

In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.

Those who insist on violating this recommendation focus on the semantic distinction between SHOULD NOT and MUST NOT. Luckily, SHOULD NOT is well defined in another specification:

This phrase, or the phrase "NOT RECOMMENDED" mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.

What implications? :-)

Rules aside, using GET to perform actions violates a standard idiom that has become commonplace - clicking a link only fetches content. POST is represented differently for a reason. Browsers can warn before sending a POST request again (an issue I discuss in detail in one of my articles), whereas a GET request might be re-sent whenever the user navigates through the browser history. If such a request initiates an action, it can result in undesired behavior. Imagine losing your entire shopping cart contents, just because you went back a few pages. Would you bother continuing, or would you lose interest and leave? Food for thought.
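One way to picture the distinction is a handler that refuses to perform the action unless the request is a POST. This is only a sketch in Python; the /cart and /cart/clear paths, the function name, and the choice of status codes are assumptions, not taken from any particular application.

```python
def handle_cart_request(method: str, path: str, cart: list) -> int:
    """Return an HTTP status code; only a POST may empty the cart."""
    if path == "/cart":
        # Viewing the cart is a plain retrieval, so GET and HEAD are fine.
        return 200 if method in ("GET", "HEAD") else 405
    if path == "/cart/clear":
        if method != "POST":
            # Re-sent GETs (history navigation, pre-fetching) change nothing.
            return 405  # Method Not Allowed
        cart.clear()
        return 303  # See Other: send the browser back to a safe GET page
    return 404
```

With this arrangement, going back a few pages can re-send a GET as often as it likes without emptying the cart.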

Focusing on aesthetics alone, I can see where an interface designer might prefer a link for simple actions like logging out. Therefore, I asked my good friend Jon Tan to demonstrate how to use CSS to make a form submission button look exactly like a link. (This approach also happens to make it easy to add an anti-CSRF token.) He came through in style with an example that does exactly that.

I'll leave you with a funny statement by Mark Pilgrim that someone pointed out. It seems appropriate in this situation:

Besides the run-of-the-mill morons, there are two factions of morons that are worth special mention. The first work from examples, and ship code, and get yelled at, just like all the other morons. But then when they finally bother to read the spec, they magically turn into assholes and argue that the spec is ambiguous, misleading in some way, ignorable because nobody else implements it, or simply wrong. These people are called sociopaths. They will never write conformant code regardless of how good the spec is, so they can safely be ignored.

He continues with a description of the other faction:

The second faction of morons work from examples, ship code, and get yelled at. But when they get around to reading the spec, they magically turn into advocates and write up tutorials on what they learned from their mistakes. These people are called experts. Virtually every useful tutorial in the world was written by a moron-turned-expert.

To which faction do you belong?