Programming with Perl | Cookie Monsters (Web Techniques, May 2001)

Ahh, cookies. One of my pet peeves is the amount of bad cookie code I see out there—like the code responsible for the reaction I get from a Web site when I choose not to permit cookies (usually because I'm feeling rebellious). Cookies are one of many ways to turn stateless HTTP into a series of session-based transactions with states like "logged in" and "logged out." Other ways to achieve a similar transformation of HTTP include using authentication, mangling the URLs, and hiding data.


January 01, 2002
URL:http://www.drdobbs.com/programming-with-perl-cookie-monsters-w/184413446

1    #!/usr/bin/perl -Tw
2    use strict;
3    $|++;
4    
5    use CGI qw(:all);
6    
7    ## cookie check
8    my $browser = cookie("browser");
9    if (defined $browser) {         # got a good browser
10      Delete("_cookiecheck");       # don't let this leak further
11    } else {                        # no cookie? set one
12      require MD5;
13      my $cookie = cookie
14        (-name => 'browser',
15         -value => MD5->hexhash(MD5->hexhash(time.{}.rand().$$)));
16    
17      if (defined param("_cookiecheck")) { # already tried!
18        print +(header(-cookie => $cookie),
19                start_html("Missing cookies"),
20                h1("Missing cookies"),
21                p("This site requires a cookie to be set. Please permit this."),
22                startform, submit("OK"), endform,
23                end_html);
24      } else {
25        param("_cookiecheck", 1);   # prevent infinite loop
26        print redirect (-cookie => $cookie, -uri => self_url());
27      }
28      exit 0;
29    }
30    
31    ## At this point, $browser is now the unique ID of the browser
32    
33    require File::Cache;
34    my $cache = File::Cache->new({namespace => 'cookiemaker',
35                                  username => 'nobody',
36                                  filemode => 0666,
37                                  expires_in => 3600, # one hour
38                                 });
39    
40    ## first, some housekeeping
41    unless ($cache->get(" _purge_ ")) {
42      $cache->purge;                # remove expired objects
43      $cache->set(" _purge_ ", 1, 3600 * 4); # purge every four hours
44    }
45    
46    my $user = $cache->get($browser); ## either the logged-in user, or undef
47    
48    print header,start_html('session demonstration'),h1('session demonstration');
49    
50    ## handle requested transitions (login or logout)
51    if (defined $user and defined param("_logout")) {
52      Delete("_logout");
53      $cache->remove($browser);
54      print p("You are no longer logged in as $user.");
55      undef $user;
56    } elsif (not defined $user and defined (my $try_user = param("_user"))) {
57      Delete("_user");
58      my $try_password = param("_password");
59      Delete("_password");
60      if ($try_user =~ /\A\w+\z/ and verify($try_user, $try_password)) {
61        $user = $try_user;
62        print p("Welcome back, $user.");
63      } else {
64        print p("I'm sorry, that's not right.");
65      }
66    }
67    
68    ## handle current state (possibly after transition)
69    if (defined $user) {
70      $cache->set($browser,$user);  # update cache on each hit
71      print p("You are logged in as $user.");
72      print startform, hidden("_logout", 1), submit("Log out"), endform;
73    } else {
74      print p("You are not logged in.");
75      print
76        startform,
77          table({-border => 1, -cellspacing => 0, -cellpadding => 2},
78                Tr(th("username:"),
79                   td(textfield("_user")),
80                   td({-rowspan => 2}, submit("login"))),
81                Tr(th("password:"), td(password_field("_password")))),
82                  endform;
83    }
84    
85    ## rest of page would go here, paying attention to $user
86    
87    for ([Cookies => \&cookie], [Params => \&param]) {
88      my ($title, $f) = @$_;
89    
90      print h2($title), table 
91        ({-border => 0, -cellspacing => 0, -cellpadding => 2},
92         map (Tr(th(escapeHTML($_)), td(escapeHTML(join ", ", $f->($_)))),
93            $f->()));
94    }
95    
96    ## sample verification
97    
98    sub verify {
99      my($user, $password) = @_;
100      return index($password, $user) > -1; # require password to contain user
101    }


Cookies are the object of my ire because so many Internet programmers assume that "one user equals one Web browser." After all, that's the basic model of the cookie itself. But that assumption is also demonstrably untrue. For example, right now I have three browsers open. And, I've been known to enter an Internet café from time to time and use the browsers it supplies. The problem is that when I move from browser to browser, my cookies don't follow me.

Bad Batches

What are the wrong ways to use cookies? Let me count them: One is to use cookies with a login form that, upon successful login, sends out a cookie that lasts for several years. That cookie works only for the particular browser I used when I filled out the login form. Because of this, I can't log in using a different browser unless I fill out the login form again. Worse yet, if I'm using an Internet café's browser and I forget to log out, the next user who stumbles across that Web site is already logged in as me.

Then there are Web sites that send out a bunch of data inside a cookie, such as the entire contents of my shopping cart. Those sites usually trust that data when my next visit to the site returns it inside the cookie. But, if all my shopping cart data comes from that cookie, nothing (except my honesty) stops me from changing the price of a $300 item in my shopping cart to $1 instead.

There are also sites that inundate users with dozens of cookies—for example, one for each graphic. I've been to such sites and had to accept a baker's dozen in cookies before the entire page would display. Some sites even let the cookies' expiration time serve as the security policy for timing out an active user. But don't count on that, because browsers don't have to respect expiration times. I've even seen servers go into infinite loops: checking to see whether cookies are set, redirecting browsers if they're not, and never telling users why things went wrong.

Can you tell I've seen more than my share of bad cookie code? Do you understand why the hairs on the back of my neck stand up when I hear "I need cookies for this application"? Well, read on.

The good news is that there's a reasonably safe way to use cookies. You can use them to brand or mark a particular browser for the duration of a specific browser session. To do this, use a single, small cookie with a short, but impossible-to-guess value (like the MD5 hash of some cryptographically strong material). The branded browser sends the cookie back to the server during the current session only.
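Here is one way such a value might be produced. This is only a sketch: it assumes a Unix-like system that provides /dev/urandom, it uses the Digest::MD5 module that ships with current Perls, and the helper name new_browser_id is made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: returns a 32-character hex string derived from
# 16 bytes of operating-system randomness. Assumes /dev/urandom exists.
sub new_browser_id {
    open my $fh, '<:raw', '/dev/urandom'
        or die "cannot open /dev/urandom: $!";
    my $got = read($fh, my $bytes, 16);
    close $fh;
    die "short read from /dev/urandom" unless defined $got and $got == 16;
    return md5_hex($bytes);
}

print new_browser_id(), "\n";
```

The value is short enough to fit comfortably in a single cookie, and it carries no information about the user.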

Next, use the brand/mark as the key to a database so you can identify and validate a particular user for that browser. The database should include a time stamp of recent activity, and shouldn't be trusted after a time-out period has expired. Don't use the brand/mark value for anything other than this one-step user validation. If you do, that user can't migrate the session over to a new browser without restarting some of the transaction, and that's annoying. In fact, you should probably let the same user log in using multiple browsers simultaneously. After you validate a user, use the database's value for that person as a key into another database that contains session information, like shopping carts or personal preferences.
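The two-step lookup described above can be sketched with plain hashes standing in for the two databases. Everything named here (the hashes, the two functions, the one-hour constant) is an assumption for illustration, not part of Listing 1.

```perl
use strict;
use warnings;

my $TIMEOUT = 3600;    # seconds of inactivity allowed (one hour)

# Hypothetical in-memory stand-ins for the two databases.
my %login_by_brand;    # brand => { user => ..., last_seen => epoch }
my %session_by_user;   # user  => { cart => [...], prefs => {...} }

# Step 1: brand -> user, trusted only within the time-out window.
sub user_for_brand {
    my ($brand, $now) = @_;
    my $entry = $login_by_brand{$brand} or return;
    if ($now - $entry->{last_seen} > $TIMEOUT) {
        delete $login_by_brand{$brand};    # stale: forget it
        return;
    }
    $entry->{last_seen} = $now;            # record recent activity
    return $entry->{user};
}

# Step 2: user -> session data, independent of which browser asked.
sub session_for_user {
    my ($user) = @_;
    return $session_by_user{$user} ||= { cart => [], prefs => {} };
}
```

Because the cart hangs off the user rather than the brand, two browsers logged in as the same user see the same cart, and prices never travel through the cookie.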

Easy Bake

Sounds hard? Naah. It's just a few dozen lines of Perl code. I hacked it out recently. Listing 1 shows a sample implementation of this strategy. Please keep in mind that this isn't a complete application; it's just the code that handles the "which user is logged into this browser?" part.

Lines 1 through 3 start nearly every program I write: they turn on taint mode (good for CGI programs), warnings (good for catching stupid mistakes), and compiler restrictions (for catching more stupid mistakes), and they disable buffering on STDOUT (good for CGI programs). Then, line 5 pulls in the venerable CGI.pm module, including all function shortcuts.

Lines 7 to 29 use a unique cookie to brand a particular browser. We have to do this before we send anything to standard output because we may need to issue a new Set-Cookie header. Or, we may want to redirect to ourselves first, as a cookie test.

Line 8 fetches the browser cookie if there is one. If there's a cookie, the $browser variable in line 8 becomes a unique string (actually an MD5 signature of some unique data). If there isn't a cookie, we have a bit of work to do to make this browser our own.

After this program has been invoked once (and we've fetched the browser cookie), lines 9 and 10 recognize that we have a valid browser ID. (I'll explain the _cookiecheck parameter a bit later.) If line 9 finds no cookie in $browser, there are two possibilities: the cookie was never sent to the browser, or the browser didn't send it back. In either case, lines 12 through 15 prepare a potential new cookie. The MD5 module (found in the CPAN) lets us create a 32-character hexadecimal string from an arbitrary set of data. We'll use the pieces concatenated in line 15: the current time (time), the stringified reference to a freshly created anonymous hash ({}), a random number (rand()), and the process ID ($$).

This method of generating the new cookie isn't as secure as using cryptographically strong items, and the CPAN contains other modules that would make the value harder to guess. Still, because I lifted this piece of the code directly from Apache::Session, a well-known chunk of code that handles session management, I'm confident knowing I can blame someone else (if someone breaks in with it).
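On a current Perl, the old MD5 module is usually replaced by Digest::MD5, which ships with Perl. A rough equivalent of line 15, assuming Digest::MD5's md5_hex function in place of MD5->hexhash, might look like the sketch below. It is still only as unpredictable as its inputs.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# The same four ingredients as line 15 of the listing:
#   time    - current epoch time in seconds
#   {}      - a fresh anonymous hash, stringified as "HASH(0x...)"
#   rand()  - a pseudo-random number
#   $$      - the process ID
my $value = md5_hex(md5_hex(time . {} . rand() . $$));

print "$value\n";    # 32 hexadecimal characters
```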

Line 17 detects whether this is the first invocation of the program, as opposed to a subsequent invocation, where we've had at least one chance to set a cookie and it was refused. If the _cookiecheck parameter in line 17 contains a value, it means we've already tried to set a cookie and it was rejected (thus, it's not the first invocation). So, dump out an HTML page that asks the user to let us set a cookie (lines 18 through 23), and try to set it again. Hey, maybe the user will get tired of saying "Reject this cookie," or maybe he or she just didn't like the previous hex string. Who knows?

The form submission in line 22 brings us back to this same program. If the request still carries a value for the _cookiecheck parameter, we were rejected again and land on the same HTML page. If _cookiecheck isn't set, it takes two hits (the redirect and the request that follows it) to return here, just as when we started. If this is the user's first visit (and the first invocation), the _cookiecheck parameter won't be set. We set it in line 25 and do an external redirect to ourselves to verify that the cookie is present. By line 30, we've branded the browser with a unique cookie identification, which is in the $browser variable.
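Stripped of the CGI calls, the cookie check boils down to a three-way decision. The function below is a hypothetical restatement of lines 9 through 27 for illustration; the name cookie_check_action and its return strings are not part of the listing.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what the cookie check should do.
#   $browser     - value of the "browser" cookie, or undef
#   $cookiecheck - value of the _cookiecheck parameter, or undef
sub cookie_check_action {
    my ($browser, $cookiecheck) = @_;
    return 'proceed'  if defined $browser;      # lines 9-10: already branded
    return 'explain'  if defined $cookiecheck;  # lines 17-23: cookie refused
    return 'redirect';                          # lines 25-26: first visit
}

print cookie_check_action(undef, undef), "\n";     # redirect
print cookie_check_action(undef, 1), "\n";         # explain
print cookie_check_action('abc123', 1), "\n";      # proceed
```

Because the "explain" branch ends in a page rather than another redirect, a browser that refuses cookies never gets caught in a loop.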

The next step is to determine whether this browser is logged in or not. We'll keep track of this with a lightweight database, made possible with the CPAN's File::Cache module. (The author of this module has started to generalize the caching structure into a separate Cache::Cache module, so by the time you read this, things might work differently.)

Line 34 "opens" the cache by creating a cache object in the $cache variable. We'll set the cache items to expire in an hour, which means that no user can be logged in for longer than one hour of inactivity. You might choose to make the time-out longer for low-risk items or shorter for high-risk items, but one hour is a good starting point.

Sweep Up the Crumbs

Lines 41 to 44 handle a small housekeeping chore for the cache. If a user doesn't return and hasn't logged out, his or her cached user ID still exists as a file in the database directory until the next time it's fetched. However, that ID probably won't be fetched, because the cookie expires when the user closes the browser or after an hour of inactivity. That means we have a dead file sitting around. So, every four hours, the _purge_ entry expires and invokes a cleanup process that deletes any dead files. The purge process should be very lightweight. If you're concerned about doing this at CGI time, you could instead pull this out to a separate cron job—but be sure the job runs as the Web user, not as you.

Line 46 identifies the user associated with $browser, if any. If there's a valid entry in cache for a user, that user is current and is logged in as $user. If there's no entry in the cache for the user, or if there's an entry but it's older than one hour (so it's expired), we receive undef (for undefined user). That means there's no user associated with the browser who is uniquely identified by $browser.
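To make the expiry and purge behavior concrete, here is a tiny in-memory cache with the same get, set, and purge vocabulary. It is a made-up stand-in (the TinyCache name and the injectable clock are assumptions), not how File::Cache is implemented.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache with per-entry expiry; NOT File::Cache itself.
package TinyCache;

sub new {
    my ($class, %args) = @_;
    my $self = {
        expires_in => $args{expires_in},             # default lifetime, seconds
        clock      => $args{clock} || sub { time },  # injectable for testing
        data       => {},                            # key => [value, expiry]
    };
    return bless $self, $class;
}

# Store a value; an optional third argument overrides the default lifetime.
sub set {
    my ($self, $key, $value, $expires_in) = @_;
    $expires_in = $self->{expires_in} unless defined $expires_in;
    $self->{data}{$key} = [ $value, $self->{clock}->() + $expires_in ];
    return;
}

# Fetch a value, or undef if it is missing or past its expiry time.
sub get {
    my ($self, $key) = @_;
    my $entry = $self->{data}{$key} or return undef;
    return undef if $self->{clock}->() >= $entry->[1];
    return $entry->[0];
}

# Physically remove every expired entry (the "dead files").
sub purge {
    my ($self) = @_;
    my $now  = $self->{clock}->();
    my $data = $self->{data};
    delete @{$data}{ grep { $now >= $data->{$_}[1] } keys %$data };
    return;
}

package main;

my $now   = 0;
my $cache = TinyCache->new(expires_in => 3600, clock => sub { $now });

$cache->set('abc123', 'fred');
$now = 3599;
print defined $cache->get('abc123') ? "logged in\n" : "expired\n";  # logged in
$cache->set('abc123', 'fred');   # a hit resets the hour, as in line 70
$now = 7000;
print defined $cache->get('abc123') ? "logged in\n" : "expired\n";  # logged in
$now = 11000;
print defined $cache->get('abc123') ? "logged in\n" : "expired\n";  # expired
```

Note that an expired entry is merely ignored by get; it stays in storage until purge runs, which is why the listing schedules a purge of its own.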

Lines 50 to 66 manage the transitions between logged in and logged out. If the user is logged in and has requested a logout, lines 52 to 55 handle that. They delete the parameter that requests logout (for sticky forms), and remove the user's entry from the cache database. Afterward, the $user variable returns to being undefined, indicating that there's no user logged in.

Lines 57 to 65 handle logging in. First, the requested username and password are read. Next, the username is checked for the correct form (which I've arbitrarily defined here as "looks like a Perl identifier"). Then, we verify the correct password for this user by calling the verify function. I've defined a simple version of the verify function at the bottom of the program in lines 98 to 101. It returns true if the username is a substring of the password. Please don't use this in real life; this is just a demo. If the password is good, the $user variable is set. Otherwise, we reject the attempt.
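A slightly more realistic verify might compare a salted digest against a stored one. The sketch below is an assumption-laden illustration: the %credentials table, the sample user, and the choice of Digest::SHA are all made up here, and a production system would more likely use a purpose-built password-hashing scheme.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical credential store: username => [salt, sha256_hex(salt . password)].
my %credentials = (
    fred => [ 'x9$k', sha256_hex('x9$k' . 'flintstone') ],
);

sub verify {
    my ($user, $password) = @_;
    return 0 unless defined $user and defined $password;
    my $entry = $credentials{$user} or return 0;   # unknown user
    my ($salt, $digest) = @$entry;
    return sha256_hex($salt . $password) eq $digest ? 1 : 0;
}

print verify('fred', 'flintstone') ? "ok\n" : "rejected\n";
print verify('fred', 'fred')       ? "ok\n" : "rejected\n";
```

It takes the same two arguments as the demo version, so it can replace lines 98 to 101 without touching the caller in line 60.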

Lines 68 to 83 deal with the useful actions within the current state. For example, each time a logged-in user returns to the Web page, we update the cache time in line 70 to reset the hour-long expiration (and let him or her stay for another hour from now). Line 72 displays a simple logout form button, which reinvokes the same program including a _logout parameter. (Recall that we tested this parameter up in line 51.)

For logged-out users, lines 74 through 82 indicate this status and present a simple login form with a submit button, using a table for the layout. Please don't fault my lack of HTML design skills; I'm illustrating structure here, not my graphics aptitude, which I admit is sorely lacking.

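The code from line 85 down would be where your real application goes, using the previous code as a framework. Our current code ensures that the rest of the application can count on two facts: $browser holds the unique ID of this browser (line 31), and $user holds either the name of the logged-in user or undef (line 46).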

While developing this program, I wrote some testing code to find out what current cookies and parameters contained. I thought I'd leave that code in as an example of a "do-nothing" program. So, lines 87 to 94 execute a loop twice: Once with the $title variable set to Cookies and a $f variable set to the coderef for the cookie function (provided by CGI.pm), and another time with $title set to Params and $f set to the coderef for the param function. I originally wrote this as two separate displays, but then disliked the similarity of the code, which I factored out and parameterized. Thank goodness for coderefs.

Line 90 prints the second-level header for the title, then follows it with a table containing the cookie or parameter keys in the first column and their values in the second column. Because cookies and parameters can have more than one value, I added code to join multiple values with commas (line 92). Also, since both the keys and values can contain HTML-significant markup (less-thans, greater-thans, ampersands, and so on), I pass the data through the escapeHTML function (provided by CGI.pm) before display.

Note that line 93 invokes one of my test functions (cookie or param) with no arguments (so it gets a list of all things of that type). The end of line 92 invokes that same function and passes it one item of that type to get its value. It's nice to have that same interface. Finally, I described lines 98 to 101 earlier. They're also a part of the program that you should rewrite for use in a real application.
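The same coderef trick works outside CGI.pm. In the sketch below, my_cookie and my_param are hypothetical stand-ins that mimic the "no arguments returns the names, one argument returns the values" interface, and escape_html is a minimal substitute for escapeHTML.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for CGI.pm's cookie() and param(): with no
# arguments they return the names; with a name they return its value(s).
my %cookies = (browser => ['abc123']);
my %params  = (color => ['red', 'blue'], note => ['<b>hi</b>']);

sub my_cookie { @_ ? @{ $cookies{$_[0]} || [] } : sort keys %cookies }
sub my_param  { @_ ? @{ $params{$_[0]}  || [] } : sort keys %params }

# Minimal HTML escaping for the characters that matter most.
sub escape_html {
    my ($text) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# One loop body serves both data sources, selected by coderef.
sub report {
    my @lines;
    for ([Cookies => \&my_cookie], [Params => \&my_param]) {
        my ($title, $f) = @$_;
        push @lines, "== $title ==";
        for my $name ($f->()) {
            push @lines,
                escape_html($name) . ': ' . escape_html(join ', ', $f->($name));
        }
    }
    return @lines;
}

print "$_\n" for report();
```

Adding a third source is a matter of adding one more pair to the list, which is the payoff of factoring the display code this way.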

In summary, cookies can be reasonable for session management, as long as the state is clear (logged on or logged out), a logout button is clearly visible, the cookie expires when the browser is closed, and the session expires after an inactivity time-out value (typically an hour) is reached. Have fun handing out cookies, and don't forget the milk.


Randal has coauthored the must-have standards Programming Perl, Learning Perl, and Effective Perl Programming.
