Tuesday, January 8, 2008

Understanding Unique Visitors

If you are into web analytics, you have certainly had to explain page views, visits and visitors. Unique visitors are a bit trickier, and making sense of the hourly, weekly, monthly, quarterly and yearly unique visitors reports is difficult even for experienced analysts.

A definition of Unique Visitors

Here's the definition of Unique Visitors according to Web Analytics Definitions v4.0 published by the Web Analytics Association:
The number of inferred individual people (filtered for spiders and robots), within a designated reporting time frame, with activity consisting of one or more visits to a site. Each individual is counted only once in the unique visitor measure for the reporting period.
Unique Visitors are always calculated for a time period and a time reference. Here, the time period relates to "daily", "weekly", "monthly" and so on, while the time reference is the date range to which the calculation will apply.

Since a picture is worth a thousand words, let's explain Unique Visitors with a few examples.
  1. Bob accesses the site for the first time on the 1st of the month, at 10:00AM, and views a couple of pages (the last page request being precisely at 10:16AM)
  2. Bob comes back the same day, at 10:45AM, and views a single page
  3. Bob comes again on the 1st at 11:25AM and spends about 20 minutes on the site
  4. Then, the same day at 11:55PM, Bob makes a last visit to the site and stays late, until 12:25AM on the night of the 2nd
  5. On the 10th of the month, Bob pays another visit to the site
  6. And finally, on the 1st of the next month, Bob comes back at 11:00AM
A bit complex, so let's look at this same information presented in the calendar view below.

Time Reference

Let's say you want to get Daily Unique Visitors. The web analytics tool needs to work within a specific time frame, a slice of time, a time reference (however you want to name it). Within that time frame, it creates one bucket for each day and puts each of Bob's visits in the matching bucket. At the end of the tally, we take each bucket and check if Bob is in there. As soon as we see him once, we know Bob is a unique visitor for that time period. We don't count how many times Bob is in a specific time bucket; seeing him once is enough, hence the "Unique" aspect of the calculation.
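The bucket principle can be sketched in a few lines of Python. This is only an illustration, not how any particular analytics tool works: the `HITS` log, the `unique_visitors` function and the choice of January 2024 as "the month" (picked only because its 1st is a Monday and it has 31 days) are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical page-view log: (visitor_id, timestamp), following Bob's six visits.
HITS = [
    ("bob", datetime(2024, 1, 1, 10, 0)),
    ("bob", datetime(2024, 1, 1, 10, 16)),
    ("bob", datetime(2024, 1, 1, 10, 45)),
    ("bob", datetime(2024, 1, 1, 11, 25)),
    ("bob", datetime(2024, 1, 1, 11, 45)),
    ("bob", datetime(2024, 1, 1, 23, 55)),
    ("bob", datetime(2024, 1, 2, 0, 25)),
    ("bob", datetime(2024, 1, 10, 9, 0)),  # time of day is made up; the text gives none
    ("bob", datetime(2024, 2, 1, 11, 0)),
]


def unique_visitors(hits, bucket_of, start, end):
    """Count distinct visitors per bucket, for hits with start <= timestamp < end."""
    buckets = defaultdict(set)
    for visitor_id, ts in hits:
        if start <= ts < end:
            # A set holds Bob only once, however many times he shows up.
            buckets[bucket_of(ts)].add(visitor_id)
    return {bucket: len(visitors) for bucket, visitors in sorted(buckets.items())}


# Time period: daily. Time reference: the whole first month.
daily = unique_visitors(
    HITS, lambda ts: ts.date(), datetime(2024, 1, 1), datetime(2024, 2, 1)
)
for day, count in daily.items():
    print(day, count)
```

The time period is the `bucket_of` function and the time reference is the `start`/`end` pair, which is why both have to be stated whenever a Unique Visitors number is quoted. Days with no visitors simply have no bucket in this sketch.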

To stress it even more, Unique Visitors should always be put in context of the time period and time reference being used.

Hourly Unique Visitors

Now we can go back to our calendar and see what the Hourly Unique Visitors would be if we picked only the 1st day of the month as the time frame (in these examples, the reference time frame is always shown within brackets).

Let's check our buckets and ask ourselves: was Bob here between midnight and 1:00AM? ... Between 10:00AM and 10:59AM? And so on for each hour bucket. Once Bob is in a bucket, we can't add another Bob in there.

What are the Hourly Unique Visitors for the 1st of the month? Pretty easy: 3. Bob is in three buckets (10:00AM, 11:00AM and 11:00PM), and each of them counts him once.

What if Jane was here between 10:00AM and 10:15AM? We would put Jane in the right bucket (10:00-11:00AM) and the count for this bucket would be 2, for an Hourly Unique Visitors total of 4.

What about Bob's visit that spans from 11:55PM until the next day? Since our time frame stops at midnight at the end of Monday the 1st, it doesn't make a difference. If we were to extend our time reference to include the 2nd day, our tally would add up to 4 buckets where Bob was here.

A visit that spans two hours (or 2 days, 2 weeks, 2 months...) is a bit tricky. There was at least one Page View on Day 1, and at least one Page View on Day 2. If we look at Hourly Uniques, there's going to be 1 Visitor for 11:00PM, and 1 Visitor for 12:00AM on the next day. This example explains why you don't get the same numbers when you change your time reference. It also explains why you can't simply add up the Hourly Unique Visitors and expect a good representation of the Daily Unique Visitors count.
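To make the arithmetic concrete, here is a small Python sketch of the hourly buckets for Bob and Jane. The data layout, the `hourly_uniques` function and the January 2024 dates are assumptions made for the illustration; Jane's exact page-view time is made up within her 10:00-10:15AM window.

```python
from datetime import datetime

# Hypothetical page views on the 1st and early on the 2nd.
hits = [
    ("bob", datetime(2024, 1, 1, 10, 0)),
    ("bob", datetime(2024, 1, 1, 10, 16)),
    ("bob", datetime(2024, 1, 1, 10, 45)),
    ("bob", datetime(2024, 1, 1, 11, 25)),
    ("bob", datetime(2024, 1, 1, 11, 45)),
    ("bob", datetime(2024, 1, 1, 23, 55)),
    ("bob", datetime(2024, 1, 2, 0, 25)),
    ("jane", datetime(2024, 1, 1, 10, 5)),
]


def hourly_uniques(hits, start, end):
    """Map each non-empty hour bucket in [start, end) to its distinct visitor count."""
    buckets = {}
    for visitor, ts in hits:
        if start <= ts < end:
            hour = ts.replace(minute=0, second=0, microsecond=0)
            buckets.setdefault(hour, set()).add(visitor)
    return {hour: len(visitors) for hour, visitors in buckets.items()}


# Time reference: the 1st only, then the 1st and the 2nd.
day1 = hourly_uniques(hits, datetime(2024, 1, 1), datetime(2024, 1, 2))
two_days = hourly_uniques(hits, datetime(2024, 1, 1), datetime(2024, 1, 3))

# Summing the hourly buckets is not the same as counting daily uniques.
sum_of_hourly = sum(day1.values())
first_day = datetime(2024, 1, 1).date()
daily_uniques = len({visitor for visitor, ts in hits if ts.date() == first_day})

print(sum_of_hourly, daily_uniques)  # 4 2
```

The 10:00AM bucket holds both Bob and Jane, so the hourly tally for the 1st adds up to 4, while only 2 distinct people visited that day. Extending the time reference to the 2nd adds the 12:00AM bucket, giving 4 buckets in which Bob appears.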

Weekly Unique Visitors

Now let's look at Weekly Unique Visitors for the time period between the 1st of the month and the 31st of the month.
If we apply the same bucket principle, one bucket per week, the calendar shows Bob in Week 1, Week 2 and Week 5. However, his Week 5 visit falls on the 1st of the next month, outside our time frame, so it is not counted. Thus, the Weekly Unique Visitors count is as follows: Week 1 = 1, Week 2 = 1, Week 3 = 0, Week 4 = 0, Week 5 = 0.

If we had chosen a time frame encompassing the whole year, Bob's visit on the 1st of the second month would have been taken into account. The end result would have been different! Week 1 = 1, Week 2 = 1, Week 3 = 0, Week 4 = 0 but Week 5 = 1!
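The effect of the time frame on the Week 5 bucket can be reproduced with a short Python sketch. Everything here is an assumption for illustration: the `weekly_uniques` function, the week numbering (weeks counted from the Monday that starts the month) and the use of January 2024 as the month.

```python
from datetime import date

WEEK1_START = date(2024, 1, 1)  # assumed Monday that opens Week 1

# Hypothetical (visitor, day) pairs for the days on which Bob had activity.
visits = [
    ("bob", date(2024, 1, 1)),
    ("bob", date(2024, 1, 2)),   # tail end of the late-night visit
    ("bob", date(2024, 1, 10)),
    ("bob", date(2024, 2, 1)),   # same calendar week as the 29th-31st
]


def weekly_uniques(visits, frame_start, frame_end, weeks=5):
    """Distinct visitors per week, counting only days inside the time frame."""
    seen = {week: set() for week in range(1, weeks + 1)}
    for visitor, day in visits:
        week = (day - WEEK1_START).days // 7 + 1
        if frame_start <= day <= frame_end and week in seen:
            seen[week].add(visitor)
    return {week: len(visitors) for week, visitors in seen.items()}


month_only = weekly_uniques(visits, date(2024, 1, 1), date(2024, 1, 31))
whole_year = weekly_uniques(visits, date(2024, 1, 1), date(2024, 12, 31))

print(month_only)  # {1: 1, 2: 1, 3: 0, 4: 0, 5: 0}
print(whole_year)  # {1: 1, 2: 1, 3: 0, 4: 0, 5: 1}
```

The same visits produce a different Week 5 number depending only on where the time frame ends.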

Monthly Unique Visitors

This one is easy if we keep only the month as the time reference.
Did Bob visit us at least once during our time reference? Yes. Thus, the Monthly Unique Visitors count for our month is 1.

Again, if we had chosen the whole year for the time reference, Month 1 would have a Unique Visitor Count of 1, and Month 2 would also have 1.

Identifying Unique Visitors

The predominant method of identifying unique visitors is a persistent cookie that stores and returns a unique id value, so Bob is always the right Bob whenever he comes back. Other methods infer unique visitors from a combination of signals (for example, browser version, host OS and IP address) to try to come up with a fair count of Unique Visitors. In theory, the most accurate way of counting Unique Visitors is to use an account id (such as on sites where you have to log in), but even then you will still have non-authenticated users. My recommendation in this case would be to use the default methodology and use the authenticated method as a segment.
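As a rough sketch of the identification order described above, the hypothetical `visitor_key` function below prefers the cookie id and falls back to a hash of browser and network details. The function name, its parameters and the choice of SHA-256 are assumptions for illustration, not the method of any specific analytics product.

```python
import hashlib


def visitor_key(cookie_id=None, user_agent="", ip_address=""):
    """Return (method, key) identifying a visitor for counting purposes."""
    if cookie_id:
        # Persistent cookie: the same id comes back on every return visit.
        return ("cookie", cookie_id)
    # Fallback: infer the visitor from the user agent (browser version and
    # host OS) combined with the IP address. This is a guess, not an identity.
    raw = f"{user_agent}|{ip_address}".encode("utf-8")
    return ("fingerprint", hashlib.sha256(raw).hexdigest()[:16])


print(visitor_key(cookie_id="a1b2c3"))
print(visitor_key(user_agent="Firefox/2.0 (Windows)", ip_address="192.0.2.1"))
```

Keeping the method next to the key makes it possible to report cookie-based and inferred visitors separately, in the same spirit as treating authenticated users as a segment.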

Caveats

There's a catch to calculating Unique Visitors. Some studies say 97% of users accept cookies, and Omniture says that if a person accepts cookies, the accuracy of the Unique Visitor count will be as high as 99.5%. However, other research revealed that a very large percentage of people delete their cookies (automatically or manually) about every month. What does it mean? Basically, the longer the time reference, the less accurate the Unique Visitor count will be. Think about it: if I delete my cookies every month and I want the Unique Visitors count for a full year, I will be counted 12 times instead of once!
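The inflation caused by cookie deletion is easy to simulate. In this minimal Python sketch, one person visits once a month and clears cookies in between, so each visit arrives with a fresh cookie id; the ids and the data layout are made up for the example.

```python
# One human, twelve monthly visits, a new (hypothetical) cookie id each time.
hits = [(f"cookie-{month:02d}", month) for month in range(1, 13)]

# Short period: each month sees exactly one cookie id, which is correct.
monthly = {
    m: len({cookie for cookie, month in hits if month == m}) for m in range(1, 13)
}

# Long period: the year sees twelve distinct cookie ids for a single person.
yearly = len({cookie for cookie, _ in hits})

print(monthly[1], yearly)  # 1 12
```

Each monthly bucket is still right, but the yearly count is off by a factor of 12, which is the sense in which longer periods are less accurate.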

Similar problems arise for people using home computers and work computers, or multiple browsers. So now I have Bob at Home with Firefox and Bob at Work with Internet Explorer... two Bobs for the same human being, thus counting as two Unique Visitors.

Does it mean Unique Visitors is not a good metric? Not necessarily. All things being equal, you can still use Unique Visitors if you put them in the right context: you can still look at trends and compare period to period.

Conclusion

  • A Unique Visitor refers to an individual who has visited a site at least once within a certain time period, and who is counted only once for that period.
  • Unique Visitors are counted for time periods (hourly, daily, weekly, monthly, etc.) within a time reference (time frame).
  • If you change your time frame, it is likely to affect the tally of the time periods.
  • Unique Visitors might be largely inaccurate because of cookie deletion and other issues in attempting to identify human beings through imprecise technical means.
I hope it's clearer now. If not, let me know. If I made a mistake in my explanation, let me know!