Why REST?

Here’s part 7 of the ongoing serialization of Refactoring HTML, also available from Amazon and Safari.

Representational State Transfer (REST) is the oldest and yet least familiar of the three refactoring goals I present here. Although I’ll mostly focus on HTML in this book, one can’t ignore the protocol by which HTML travels. That protocol is HTTP, and REST is the architecture of HTTP. (To be pedantic, REST is actually the architectural style by which HTTP is designed.)

Understanding HTTP and REST has important consequences for how you design web applications. Anytime you place a form in a page, or use AJAX to send data back and forth to a JavaScript program, you’re using HTTP. Use HTTP correctly and you’ll develop robust, secure, scalable applications. Use it incorrectly and the best you can hope for is a marginally functional system. The worst that can happen, however, is pretty bad: a web spider that deletes your entire site, a shopping site that melts down under heavy traffic during the Christmas shopping season, or a site that search engines can’t index and users can’t find.

Although basic static HTML pages are inherently RESTful, most more complex web applications are not. In particular, you must consider REST anytime your application involves any of the following:

  • Forms
  • User authentication
  • Cookies
  • Sessions
  • State

These are very easy to get wrong, and more applications to this day get them wrong than right. The Web is not a LAN. The techniques that worked for limited client/server systems of a few dozen to a few hundred users do not scale to web systems that must accommodate thousands to millions of users. Client/server architectures based on sessions and persistent connections are simply not possible on the Web. Attempts to re-create them fail at scale, often with disastrous consequences.

REST, as implemented in HTTP, has several key ideas. In brief:

All Resources Are Identified by URLs

Tagging distinct resources with distinct URLs enables bookmarking, linking, search engine storage, and painting on billboards. It is much easier to find a resource when you can say, “Go to http://www.example.com/foo/bar” than when you have to say, “Go to http://www.example.com/. Type ‘bar’ into the form field. Then press the foo button.”

Do not be afraid of URLs. Most resources should be identified only by URLs. For example, a customer record should have a URL such as http://example.com/patroninfo/username rather than http://example.com/patroninfo. That is, each customer should have a separate URL that links directly to their record (protected by a password, of course), rather than all your customers sharing a single URL whose content changes depending on the value of some login cookie.
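
As a rough illustration of one URL per customer record, here is a minimal Python sketch. The PATRONS dictionary and the patron_for_url function are hypothetical names invented for this example, and the in-memory dictionary merely stands in for a real customer database. Password protection is deliberately left out.

```python
from urllib.parse import urlsplit

# Hypothetical in-memory store standing in for a real customer database.
PATRONS = {
    "alice": {"name": "Alice", "books_out": 2},
    "bob": {"name": "Bob", "books_out": 0},
}


def patron_for_url(url):
    """Return the patron record named by the URL path, or None if it names none."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    # Expect exactly /patroninfo/<username>. The bare /patroninfo path
    # identifies no particular customer, so it maps to no record.
    if len(parts) == 2 and parts[0] == "patroninfo":
        return PATRONS.get(parts[1])
    return None
```

The record is chosen by the URL alone. No cookie or other hidden state decides which customer the page shows.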

Safe, Side-Effect-Free Operations Such As Querying or Browsing Operate via GET

Google can only index pages that are accessed via GET. Users can only bookmark pages that are accessed via GET. Other sites can only link to pages with GET. If you care about raising your site traffic at all, you need to make as much of it as possible accessible via GET.
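
One way to picture this: a search whose whole description fits in the URL can be bookmarked, linked, and indexed. The sketch below assumes a hypothetical /search path with made-up q and page parameters. Only the urllib.parse calls are real.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(terms, page=1):
    """Build a GET URL that fully describes a search request."""
    # Hypothetical endpoint and parameter names, chosen for illustration.
    return "http://www.example.com/search?" + urlencode({"q": terms, "page": page})


def read_search(url):
    """Recover the search terms and page number from such a URL."""
    query = parse_qs(urlsplit(url).query)
    return query["q"][0], int(query["page"][0])
```

Because nothing lives in a form post or a session, anyone who has the URL can repeat the same query.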

Nonsafe Operations Such As Purchasing an Item or Adding a Comment to a Page Operate via POST

Web spiders routinely follow links on a page that are accessible via GET, sometimes even when they are told not to. Users type URLs into browser location bars, and then edit them to see what happens. Browsers prefetch linked pages. If an operation such as deleting content, agreeing to a contract, or placing an order is performed via GET, some program somewhere is going to do it without asking or consulting an actual user, sometimes with disastrous consequences. Entire sites have disappeared when Google discovered them and began to follow “delete this page” links, all because GET was used instead of POST.
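
The difference can be sketched as a tiny request dispatcher. This is an assumption-laden illustration rather than a real server: handle, the pages dictionary, and the /delete path suffix are all hypothetical. The point is only that a GET to the delete URL changes nothing.

```python
def handle(method, path, pages):
    """Return (status, body) for one request against the dict `pages`."""
    if path.endswith("/delete"):
        if method != "POST":
            # A spider or prefetcher following a link ends up here, harmlessly.
            return 405, "Method Not Allowed"
        target = path[: -len("/delete")]
        if target not in pages:
            return 404, "Not Found"
        del pages[target]
        return 200, "Deleted"
    # Everything else is read-only and answers only to GET.
    if method != "GET":
        return 405, "Method Not Allowed"
    if path not in pages:
        return 404, "Not Found"
    return 200, pages[path]
```

With pages = {"/about": "About us"}, the call handle("GET", "/about/delete", pages) returns status 405 and leaves the page in place. Only a POST to the same path removes it.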

Each Request Is Independent of All Others

The client and server may each have state, but neither relies on the other side remembering what its state is. All necessary information is transferred in each communication. Statelessness enables scalability through caching and proxy servers. It also enables a server to be easily replaced by a server farm as necessary. There’s no requirement that the same server respond to the same client two times in a row.
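
One possible way to let each request carry everything the server needs is a signed token that the client sends every time. The sketch below is an illustration under stated assumptions, not a technique prescribed here: SECRET, sign, and verify are hypothetical names, and the scheme assumes every server in the farm shares the same key. A production system would also want expiry and secure transport.

```python
import hashlib
import hmac

# Assumption: every server in the farm is configured with this same key.
SECRET = b"replace-with-a-real-secret"


def _mac(username):
    """Compute the hex HMAC-SHA256 of the username under the shared key."""
    return hmac.new(SECRET, username.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(username):
    """Return a token the client includes with every request."""
    return username + ":" + _mac(username)


def verify(token):
    """Return the username if the token is authentic, else None."""
    username, sep, mac = token.rpartition(":")
    if not sep:
        return None
    # Compare as bytes in constant time. No session store is consulted.
    if hmac.compare_digest(mac.encode("utf-8"), _mac(username).encode("utf-8")):
        return username
    return None
```

Any server holding the key can check the token without knowing which server issued it, so consecutive requests from one client can land on different machines.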

Robust, scalable web applications work with HTTP rather than against it. RESTful applications can do everything that more familiar client/server applications do, and they can do it at scale. Making an existing application RESTful may require some of the largest changes to your systems. Nonetheless, if you’re experiencing scalability problems, these can be among the most critical refactorings to make.

Continued tomorrow…

6 Responses to “Why REST?”

  1. website design Says:

    So the lesson here is that you should design your application to conform to the architecture of the underlying protocol, rather than shoe-horning it into a form it is not? A shocking and revolutionary development, indeed! 🙂

  2. Tom Scott Says:

    Good article. It drives me nuts that folk fail to think of personalization in terms of a RESTful architecture – and instead expect users to personalize at a URL.

  3. links for 2008-06-15 « Derivadow.com Says:

    […] Why REST? Robust, scalable web applications work with HTTP rather than against it. [The Cafes] Resources should be identified by URLs. For example, each user should have a separate URL that links directly to their record, rather than all your customers sharing a single URL whose content changes depending on the value of some login cookie. (tags: architecture http programming rest) […]

  4. Links for 2008-06-18 - tonyscott.org.uk Says:

    […] Why REST? [The Cafes] […]

  5. What do people have against URLs? « Derivadow.com Says:

    […] Elliotte Rusty Harold puts it, all resources are identified by URLs: Tagging distinct resources with distinct URLs enables bookmarking, linking, search engine storage, […]

  6. nickelcode » ASP.NET on Rails Says:

    […] Routes! I loves me some Routes! Making your web apps RESTful has many advantages (see here, and here for some examples) and in any case the flexibility that custom routes provides is […]