The Client Statistics Gatherer

What's the trick?

So how did I get around all this? It actually turned out to be fairly simple. (Then why all this bantering? Okay, it wasn't THAT simple.) The client statistics gatherer works by making users think they are following a link from my home page, when they are actually submitting a form that records the screen and color data. The CGI script on the other end of that form logs the data, then sends the user to where they thought they were going in the first place.

The "simple" part comes from the fact that this is all achieved with the tiniest bit of code: about 2 KB of JavaScript (a lot of which just parses the UserAgent string to pick out the useful morsels), a few hundred bytes on the home page for the form, and less than 1 KB of Perl on the server to process it. To put that in perspective, what you've read about this thing so far is almost three times that long, which probably makes you wish I could write the way I code. (Believe me, you don't.)

More specifically, here is how all the pieces work together to gather the data without bugging the user:

  1. When the home page is loaded, a JavaScript routine walks the document object model (DOM) and rewrites every link on the page so that it invokes a JavaScript routine, passing the original destination as an argument. As an added bonus, the routine also adds an "onMouseOver" event to each link, so that the status bar in the major browsers still makes it appear that the user is going where they expect. (I don't know about you, but hovering over a link and seeing some weird JavaScript link in my status bar often makes me a bit nervous.)

  2. When the user follows a link, the JavaScript routine fills in the fields of the "statistics" form with the requested screen, window, and color data, then submits the form. One of those form fields is the argument passed in: the original destination to which the user intended to go.

  3. On the server end, a CGI script throws the form data into a log file, then sends the user to the original destination. Note that this is done using HTTP headers, not JavaScript, so the back button works as expected, and unless the CGI script takes more than the expected microsecond, the user won't even know they went to a different URL along the way. The CGI script only records what was sent by JavaScript, although it could add more information like the IP address, the full UserAgent string, etc. I decided to keep it to a svelte 80-character line, though, so it'd be easier to scan and process. Of course, a real hacker would store this all in a database, but I'm a purist who believes in having the fewest possible points of failure, especially if the whole point is to be unobtrusive.
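
For the curious, the three steps above might be sketched roughly as follows. This is not the actual code from my site: every function name and form field name here (rewriteLinks, submitStats, handleStats, dest, screen, win, colors) is made up for illustration. It also hooks each link with an onclick handler rather than rewriting the link itself, and the server step is written in JavaScript instead of Perl purely so the whole flow sits in one language. Reading window.innerWidth is an assumption too; browsers of this era disagree on where the window size lives.

```javascript
// All identifiers and form field names below are hypothetical.

// Step 1: hook every link so a click is routed through `onFollow`.
function rewriteLinks(doc, onFollow) {
  for (const link of Array.from(doc.links)) {
    const dest = link.href; // remember the real destination
    link.onclick = function () {
      onFollow(dest);
      return false; // cancel the browser's own navigation
    };
  }
}

// Step 2: copy the display data and the destination into the form, then submit.
function submitStats(form, dest, win) {
  const f = form.elements;
  f.dest.value = dest;
  f.screen.value = win.screen.width + "x" + win.screen.height;
  f.win.value = win.innerWidth + "x" + win.innerHeight;
  f.colors.value = String(win.screen.colorDepth);
  form.submit();
}

// Step 3 (server side): build one log line and a redirect response.
function handleStats(fields) {
  // Collapse tabs and line breaks so a field can't split the log line
  // or smuggle an extra header into the response.
  const clean = (v) => String(v == null ? "" : v).replace(/[\r\n\t]+/g, " ");
  const dest = clean(fields.dest);
  const logLine = [fields.screen, fields.win, fields.colors]
    .map(clean)
    .concat(dest)
    .join("\t");
  return { status: 302, headers: { Location: dest }, logLine: logLine };
}
```

In a page, the first two pieces would be wired together with something like rewriteLinks(document, function (dest) { submitStats(document.forms.stats, dest, window); }), assuming the form is named "stats". Because this variant leaves each link's href alone, the status bar keeps showing the real destination without any onMouseOver help.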

The first thing I tried, incidentally, was having an onUnload handler invoke the JavaScript routine to submit the form, then redirect the user once the form was submitted. However, this creates a race condition, since there are two calls trying to load different URLs in the browser window; IE ignored the form submit, and Navigator went schizo. I also tried having each link submit the form using an onClick event, then had the onUnload event send the user to the right location, but that had about the same result. Which brings up a good question: if an onClick event and an onUnload event both call functions that load different URLs, which should win? (The answer is that most browsers will choke, so don't bother.)

What about users who don't have JavaScript? That's one nice thing about this solution: nothing happens at all. Without JavaScript, the links are never altered in the first place, so the page operates just as it normally would. Unfortunately, I don't get any statistics, but just querying for that information requires JavaScript, so the fact that I can't retrieve it is somewhat irrelevant. What matters is that those users aren't clicking on links that don't do anything. So now, thank the Gods, you can continue to browse my site from your web-enabled Palm...you dork.

(continued)

Copyright © 2002
Last updated: 28 Oct 2002 10:06:06