Why Cookies Are Bad For You #3

We’ve known for a long time that cookies are deeply antithetical to the design of HTTP and the Web (#1). We’ve known they are used to track users and violate privacy (#2). However, someone recently pointed out to me yet another reason why cookies, specifically user authentication cookies, are bad for you.

Amazon has recently launched a Plog service. Plog stands for “personalized weblog”. Your plog is a combination of blogs from authors you like, shipment tracking, new items Amazon thinks you might like, changes to your friends’ wishlists, and other personalized information. It’s a cool and useful feature. However, to use it, you have to log into Amazon’s web site and read your Plog there.

The login requirement is quite reasonable. After all, the Plog includes a lot of personalized information. You probably don’t want your coworkers to see that you really enjoy the vocal stylings of Ashlee Simpson. However, Amazon does logins with cookies and URL rewriting instead of with HTTP authentication, and that’s where the problem arises.

You see, feed readers like NetNewsWire and Vienna don’t support these non-standard logins. In fact, no software does. Because every login page that uses cookies requires a different form, there’s no standard way to ask the browser or any other client for the user name and password. A human has to read the site and figure out where to put the username and password. Yes, there are a few standard form element names that can help autofill tools figure this out, but a person still has to navigate to the right page.

This works as long as browsing is the metaphor. However, it falls apart as soon as we move to non-browser tools such as feed readers. Standard HTTP authentication does work in most major feed readers. Cookie-based authentication doesn’t. Even if the feed reader knows how to handle cookies, there’s no way for it to log in to the website in the first place to get the necessary cookie. It can ask the user for the username and password, but it can’t figure out where (i.e. on which other page) to submit them to the server.
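To make that concrete, here’s a rough Python sketch of what a cookie-aware client has to know before it can log in. The URL and the form field names (`email`, `passwd`) are made up for illustration; the point is that every site picks its own, and nothing in the feed URL tells the client what they are.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_cookie_client():
    """Return an opener that stores and replays cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, user_field, password_field, username, password):
    """Build the POST request a form-based login expects.

    login_url, user_field and password_field are site-specific. A generic
    client can only learn them if a person reads the login page first.
    """
    body = urllib.parse.urlencode({user_field: username, password_field: password})
    return urllib.request.Request(login_url, data=body.encode("ascii"), method="POST")


# Hypothetical site and field names, assumed purely for this example.
opener, jar = make_cookie_client()
request = build_login_request(
    "https://example.com/signin", "email", "passwd", "alice", "s3cret"
)
# opener.open(request) would submit the form and capture any session cookie.
```

Handling the cookie is the easy part. The three site-specific arguments are what a general-purpose feed reader has no standard way to discover.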

The only user authentication that works for feed readers and any other non-browser, automated client is HTTP authentication. HTTP authentication is completely standard. The challenge comes from the page being authenticated, not from some other page somewhere else on the web site. The feed readers know exactly how to recognize and respond to a request for credentials because every site does it exactly the same way. (OK, that’s not quite true. There are actually three or four different ways; but they’re all related, all standardized, and all easily recognized and handled by the HTTP libraries the feed readers use.)
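As a minimal sketch of why this is so easy for clients, here is roughly what answering a Basic challenge involves in Python. The function names are my own, the realm value is invented, and a real feed reader would normally leave all of this to its HTTP library; the sketch covers only the Basic scheme.

```python
import base64
import re


def parse_challenge(header_value):
    """Split a WWW-Authenticate header value into (scheme, params)."""
    scheme, _, rest = header_value.partition(" ")
    params = dict(re.findall(r'(\w+)="([^"]*)"', rest))
    return scheme.lower(), params


def basic_authorization(username, password):
    """Build the Authorization header value for the Basic scheme."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def respond_to_401(header_value, username, password):
    """Return (realm, Authorization value) for a Basic challenge."""
    scheme, params = parse_challenge(header_value)
    if scheme != "basic":
        # Digest and other schemes are deliberately left out of this sketch.
        raise NotImplementedError(f"scheme {scheme!r} not covered by this sketch")
    return params.get("realm"), basic_authorization(username, password)


# The challenge arrives on the 401 response for the feed URL itself.
realm, header = respond_to_401('Basic realm="Plog"', "Aladdin", "open sesame")
```

The client retries the same URL with that header attached. Note that Basic merely encodes the password rather than encrypting it, so it should be sent over SSL.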

Right now feed readers are a distinct minority of the audience, but that’s shifting. I only noticed this because I was already receiving requests for feeds from my Amazon blog within a few days of launching it. As feed readers grow in popularity and new uses continue to be found for them, more and more sites are going to need to provide not just feeds but password-protected feeds; and the only way to do this reliably is by using HTTP authentication as it was designed: sessionless and no cookies.

7 Responses to “Why Cookies Are Bad For You #3”

  1. moe Says:

    Nice call, unfortunately there are a few problems with HTTP authentication.

    1. Browsers tend to offer “save login/password” checkboxes and users tend to “forget”
    to uncheck them, which results in user passwords being permanently saved on
    machines where they don’t belong.

    2. Browsers remember the HTTP auth credentials at least until all browser windows
    (not only the current one) are closed. Therefore…

    3. It’s quite tricky to implement a “logout-button” and there is no clean way to do it.
    The common hack is to assign each user a personal realm which is fugly and fragile,
    to say the least.

    4. HTTP Simple Auth sends your login/password in the clear *on every request*.
    DIGEST auth is tricky to implement and is not supported by all browsers and especially
    not by all feed readers etc.

    The only way to preserve your sanity is to offer both. Have the regular cookie/rewrite sessions for your users. Additionally do support digest auth (with fallback to simple auth) for those
    feed readers.

  2. Berend de Boer Says:

    moe, all not true: http://www.pobox.com/~berend/rest/authentication.html

  3. Joshua Says:

    I really do not trust the solutions the site offers, especially when it describes them itself like this:

    “The solutions presented here use only voodoo, so sacrifice your chickens and let’s go!”

    I believe that your RSS Reader is the limiting factor. I have personally written several web crawlers that were able to login and accept a cookie for authentication.

    “We’ve known they are used to track users and violate privacy.” This is not an issue with the technology. There is nothing in Cookies that allows people to “gather” information that they did not already have about you. This kind of FUD really chaps my hide. Until you are able to demonstrate, and I mean a working world-class application like Amazon, the same kind of personalization with at least the same level of security, don’t go whining about cookies.

    Anyone can complain, few can offer solutions.

  4. ssl Says:

    This is most likely the least intelligent thing I have read in a long time. The reason sites use form based login and not HTTP login is to prevent automated services from logging in, i.e. it is a Turing test. Login based on “non standard” HTTP authentication and cookies is for this reason much safer. Session cookies are also needed by digest authentication for nonce counting.

  5. Masklinn Says:

    > Login based on “non standard” HTTP authentication and cookies are for this reason much safer.

    Except that this one doesn’t hold true: it’s trivial to create a site-specific cookie-aware bot. For example in Python all you need to do is

    import cookielib, urllib2
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookielib.CookieJar()))

    You’re done, any URL you’ll open with `opener` will be able to store cookies in the cookiejar and retrieve them from it.

    Anyone who’d want to automate data retrieval from your website can trivially do it, even if you’re using a cookie-based authentication scheme.

    The only issue is with general purpose tools (such as feed readers) because they can’t automate finding the login/password names/inputs.

  6. Aleph Says:

    Masklinn, you’re right that cookies are trivially handled by any little bot out there, but your bot cannot easily pass a “Turing test” challenge (Captcha). HTTP Basic (or digest) Auth does not offer any hook for this (unless you create a session and redirect to the HTTP Basic login page). Cookies/URL rewriting has nothing to do with authentication but with session linkage (if you use an authentication ticket and use a cookie to transfer it from client to server, and you do not add any protection mechanism to the cookie+ticket, e.g. SSL + schemes to detect replayed/modified cookies, your ‘authenticated session’ is at risk). Unfortunately, such protection mechanisms are VERY difficult to design and implement properly.

    My feeling is that no authentication mechanism is the best choice: SSL w/ client certificates is robust if done properly with keypair management (a big problem for most users and web sites, and nobody checks the little padlock); HTTP Simple Auth may be intuitive for end-users but it is not very practical due to the limitations shown by moe; FORM-based authentication is ‘do-it-yourself’ (risks could be reasonably low, but the devil is in the implementation). HTTP Digest Auth is not widely supported in browsers and web servers (and only adds a bit of security to HTTP Basic Auth). Probably one-time passwords (or variations, like the authentication matrix that some home banking sites provide), and biometrics, are much more ‘secure’, but I am dreaming here…

    If you have a keylogger sniffing your keystrokes, no authentication scheme can help you, but this is a different problem.
    In the end, on the web nobody knows I’m a dog!
