Thursday, January 29, 2009

Sapient seeking consultant for Toronto

We might be in a recession, but jobs in our field are still available. This time I'm relaying a job from a contact I have at Sapient. They have 2-3 contract positions of up to 6 months at their client's location.

Who is Sapient?

Sapient is both an interactive agency and a consulting firm working with a number of prestigious clients: Coca-Cola, Times Online, Food.com, etc. They have thousands of employees and 24 offices around the world.

The opportunity

Sapient is seeking Web Analytics specialists for the definition and preparation of regular interval reporting, ad hoc requests, and business & marketing insight through analysis. They are seeking candidates with experience monitoring Web site performance and marketing campaign activity. Candidates should have a firm grasp of marketing strategy, performance analysis and implementation of analytics tools. Candidates should be familiar with MS Excel, MS PPT and ideally Omniture SiteCatalyst or HitBox. Experience with online surveys, A/B and/or multivariate testing is a plus!

This contract will require the Analyst to commute daily to the client site in Southwestern Ontario. Travel expenses will be reimbursed.

If you are interested, email me at shamel(at)immeria(dot)net and I will put you in contact.

Wednesday, January 28, 2009

WASP v1.08 released

WASP v1.08 is now available through the official Mozilla extensions site at addons.mozilla.org

This version fixes an annoying bug that was introduced in v1.07... In order to get approval on the Mozilla site I had to tighten the security around the core detection module. In doing so I broke the detection for some tools...

As always, you can get the latest version from WebAnalyticsSolutionProfiler.com or, in Firefox, simply go to Tools/Add-ons and click on the "Find Updates" button.

Interesting posts about WASP on immeria

The very first post about WASP was on September 12th 2006: Web Analytics Solution Profiler - WASP!

Tag auditing, using WASP for Analyst and WASP Pro sidebar and crawler:
Market research:
Other articles where WASP was mentioned:

Rants and praises

I'm receiving tons of feedback about WASP. Some of it comes through email, bloggers and well-known analysts reference it, and vendors include WASP in their training and use it as a support and implementation tool. Here are a couple of interesting posts related to or talking about WASP:

I also want to reiterate my invitation to use User Voice to suggest and vote for new features!

WASP case study: WebTrends tag audit

While doing extensive tests with WASP on a site using WebTrends, I identified a tagging issue. The site will remain unnamed to protect the innocent, but I thought it would make a pretty good case for WASP. Note that although WebTrends is shown in this example, the type of issue described here applies to almost all vendors.

In this (long) post, I will walk you through a real case of tag auditing with WASP. Some aspects are quite technical, but tag implementation, and furthermore the quality assurance of those tags, is often neglected. Yet tag quality has a direct correlation with your ability to provide insight and business recommendations!

Even if you are not a technical person, read on, I'll hold your hand along the way :)

The WebTrends tag

Most vendors rely on the concept of page tags, a couple of lines of JavaScript embedded in each page of your site. A typical page tag for WebTrends might look like the code below, taken straight from their Tag Builder tool:
<!-- START OF SmartSource Data Collector TAG -->
<!-- Copyright (c) 1996-2009 WebTrends Inc.  All rights reserved. -->
<!-- Version: 8.6.0 -->
<!-- Tag Builder Version: 2.1.0  -->
<!-- Created: 1/28/2009 20:43:18 -->
<script src="webtrends.js" type="text/javascript"></script>
<!-- ----------------------------------------------------------------------------------- -->
<!-- Warning: The two script blocks below must remain inline. Moving them to an external -->
<!-- JavaScript include file can cause serious problems with cross-domain tracking.      -->
<!-- ----------------------------------------------------------------------------------- -->
<script type="text/javascript">
//<![CDATA[
var _tag=new WebTrends();
_tag.dcsGetId();
//]]>
</script>
<script type="text/javascript">
//<![CDATA[
// Add custom parameters here.
//_tag.DCSext.param_name=param_value;
_tag.dcsCollect();
//]]>
</script>
<noscript>
<div><img alt="DCSIMG" id="DCSIMG" width="1" height="1" src="http://statse.webtrendslive.com/dscMySmartSourceId/njs.gif?dcsuri=/nojavascript&WT.js=No&WT.tv=8.6.0"/></div>
</noscript>
<!-- END OF SmartSource Data Collector TAG -->

Debugging: the traditional way

Although I receive tons of positive feedback, I also get some skeptics arguing that using "debuggers" (for example, Firebug) and "proxy debuggers" (for example, Charles Proxy), or even simple "web bug" checkers or parsers that look at the page source code to see if the tag strings are there, is sufficient.

They simply can't see the value WASP is providing!

Note: I'm mentioning Firebug and Charles Proxy because they are good tools and they have their place in the web analyst's and implementation specialist's arsenal.

The WebTrends data collection URL

With other tools you would get an indication that the WebTrends JavaScript include file is there (typically webtrends.js, as shown in the page tag above), or a debugger would show you something like this:
http://www.webtrendslive.com/dcsabcdefghijklmnop1234dcs.gif?&dcsdat=1232652422330
&dcssip=www.somesite.com&dcsuri=/dir/subdir/subsubdir
&dcsqry=%3FOpenDocument&LID=RONav0008&WT.cg_n=NewsMedia;
&WT.cg_s=PAC;&WT.tz=-5&WT.bh=14&WT.ul=en-US&WT.cd=32
&WT.sr=1680x1050&WT.jo=Yes&WT.ti=ABC:%20Public%20Awareness%20Campaign:%20ABC%20ABC%20Case%20Studies%20&%20Marketing%20Statistics:%20ABC%20ABC
&WT.js=Yes&WT.jv=1.5&WT.bs=1019x770&WT.fi=Yes&WT.fv=10.0
&WT.vt_f_tlh=1232684821&WT.vt_sid=96.21.212.87-3063823040.29981510.1232684716744
&WT.co_f=96.21.212.87-3063823040.29981510

The challenge

Without looking at the solution below, if I told you that:
  1. The JavaScript tags are fine
  2. The tags are firing since I can see the URL flying by in a proxy debugger
  3. Data is being collected since I can view my reports online
Would you be able to identify the problem?

All elements of the solution are shown above in the page tag and the collecting URL, and although I can't show you the WebTrends report, I can assure you I'm getting data in my reports.

The solution, in short

Some pages use titles containing the ampersand (&) character (yes, as shown in the obscure tag URL above!). The page title is automatically passed to WebTrends via the "WT.ti" parameter. The problem is that this character should be encoded/escaped. Otherwise it breaks the URL syntax WebTrends is expecting, likely resulting in shortened page titles in your reports or even plain rejection of the data. You would likely miss some data, but not all, since some page titles do not use the & character! Thus, you still get reports with some data...
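
To see why, here is a minimal sketch (my own illustration, not WebTrends code) of how a collection server might split a beacon query string into name-value pairs; note how the unescaped "&" cuts the title short:
// Minimal sketch (not actual WebTrends code) of how a collector
// splits a beacon query string into name/value pairs.
function parseBeacon(query) {
  var params = {};
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var idx = pairs[i].indexOf("=");
    if (idx > -1) { // fragments without "=" are simply dropped
      params[pairs[i].slice(0, idx)] = decodeURIComponent(pairs[i].slice(idx + 1));
    }
  }
  return params;
}
// Unescaped "&" in the title: WT.ti is truncated and the rest is lost.
parseBeacon("WT.ti=Case Studies & Marketing Stats&WT.js=Yes");
// -> { "WT.ti": "Case Studies ", "WT.js": "Yes" }
// Properly escaped "&" (%26): the full title comes through.
parseBeacon("WT.ti=Case%20Studies%20%26%20Marketing%20Stats&WT.js=Yes");
// -> { "WT.ti": "Case Studies & Marketing Stats", "WT.js": "Yes" }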

WASP for Analyst sidebar view

In the WASP sidebar view shown at right (click for larger view), we see the Title (WT.ti) tag, and an odd tag further down. Although it may not be obvious to the untrained eye, this is a clear indication that something is wrong, since each row should be a name-value pair with the tag name on the left and its value on the right.

WASP Pro crawler

The sidebar view is excellent for seeing the tags in the context of your browsing session. But for deeper analysis of site tags, the WASP Pro crawler feature is a must. It will start from a given location, typically your home page, visit each link of your site and gather detailed information about the tags.

A snapshot of the built-in Data Browser is shown below. I have removed most of the information and kept only a few columns, but you can see some HTML titles and how they have been populated into the WT.ti variable. While a couple of them are fine, since the "&" character was correctly escaped in the title, some others are cut off.

The data browser is specifically built to make it easy to view all tags and sort or filter their values. This makes it very easy to quickly peruse the crawl results. Of course, you can also export the results to a CSV file to play with the data in Excel, or even to XML if you ever need to integrate the crawl results into another system.

Abstract from WebTrends documentation

Future releases of WASP will include data validation rules so each value sent will be checked against acceptable character and length rules. To do so, I need the vendors to provide detailed information about the acceptable values for their tags, as shown in the WebTrends installation guide abstract below:
URL Encoding
Certain characters can cause problems when used in query parameter values. For example, for a WebTrends query parameter assignment of WT.ti="The Gettysburg Address"; SDC writes the following value to the log file:

&WT.ti=The Gettysburg Address

The space characters in this value cause problems because the space character is used to separate fields within a log file. The solution is to URL encode all query parameter values. URL encoding means replacing certain characters with their hexadecimal equivalents of the form %XX where % is the escaping character and XX is the character’s numeric ASCII value. URL encoded characters are properly rendered in WebTrends reports.

Continuing with this example, the URL-encoded form is as follows:

&WT.ti=The%20Gettysburg%20Address

Note that space characters have been replaced by %20.

The tag URL encodes the following characters: tab, space, #, &, +, ?, ", \, and non-breaking spaces. These characters are defined in the regular expression list. The regular expression list contains regular expressions to search for, and the corresponding %XX replacement strings. Regular expression properties are used as arguments to the string.replace method. The tag URL encodes parameter values by passing them as arguments into the dcsEscape function.
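
For illustration, a rough approximation of that encoding rule (my own sketch, not WebTrends' actual dcsEscape implementation) could look like this:
// Rough sketch of the encoding rule described above (not the real dcsEscape):
// replace each problematic character with its %XX hexadecimal equivalent.
function escapeParam(value) {
  return String(value).replace(/[\t #&+?"\\\u00A0]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return "%" + (hex.length < 2 ? "0" + hex : hex);
  });
}
escapeParam("The Gettysburg Address");    // "The%20Gettysburg%20Address"
escapeParam("Case Studies & Statistics"); // "Case%20Studies%20%26%20Statistics"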

Conclusion

No other method would have made spotting this kind of issue as easy. I don't have to convince anyone that good analytics starts with good data. You could spend countless hours, even days, trying to find a problem like this one (and many others of the same type!), while you could be spending that time doing useful analysis and providing insight & recommendations to improve your business.

Go ahead, give the free version of WASP a try, or get a WASP for Analyst or WASP Pro license. Visit WebAnalyticsSolutionProfiler.com now!

Monday, January 26, 2009

Switching Web Analytics solution: a case study

While doing quality assurance of a migration from WebSideStory HBX to Omniture SiteCatalyst we noticed some variations in the atomic metrics (or what the Web Analytics Association calls "building blocks"): page views, visits and visitors. While expected, I think it is important to understand why there are such discrepancies between various web analytics tools.

In this case study I'm comparing HBX and SiteCatalyst, but similar differences could be observed with any tools. As you will see, the trends are generally identical, but the scales are different. Let's take an example from daily life: oven and meat thermometers both measure temperature, and a good cook can use both to make a great dinner; but don't expect both thermometers to show the same reading, or you're courting disaster!

In this article, I will focus only on page views, visits and visitors since almost all other reports are derived from those base metrics. I will consider the following points:
  • Hypothesis behind this analysis
  • The Page View metric
  • The Visits metric
  • The Visitors and Unique Visitors metrics
  • Factors that can influence those metrics

Methodology

The same site was tagged with both solutions, received a significant volume of traffic, and we collected data for over a month. The site only had a couple of different templates so we could easily make sure all pages were tagged and working correctly. WASP was used to audit the site and make sure the data being collected could be trusted. The exclusion filters and other configurations were double-checked to make sure they were similar.

There are few documented examples of companies switching from one product to another. Most observations are either anecdotal, the two products were not used in parallel, or the web analytics switch coincided with a site redesign. For the same tools, some people have reported similar results while others have seen very different ones. The word of caution here is: "results may vary". Depending on the nature of your site, the type of traffic you get, and which tools you are switching from and to, the impact will likely be very different.

Page Views

Page Views are the atomic elements of web analytics (now, along with events). As such, page views should be the metric reported most consistently across web analytics vendors.

Both HBX and SiteCatalyst define a page view as (ref. Omniture KB article #7824) "a request for a full-page document containing tracking code on a website. Each time the page is loaded, an image request is generated. Refresh, back and forward button traffic as well as pages loaded from cache will be counted as a page view." The WAA definition is rather short, stating simply that it's "the number of times a page was viewed."

Looking at the graph (click for larger view), we see the discrepancy between the two solutions is acceptable, at an average of 2.2% (control limits between 1.5% and 3%, shown on the 2nd vertical axis). In this case, HBX is reporting slightly higher numbers than SiteCatalyst.
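
For those curious about the arithmetic, the daily discrepancy is simply the relative difference between the two tools' counts; here is a small sketch with made-up numbers (not the actual study data):
// Daily discrepancy between two tools, plus the mean (illustrative numbers only).
var hbx = [10520, 11230, 9980];           // daily page views reported by HBX
var siteCatalyst = [10290, 10990, 9760];  // same days, reported by SiteCatalyst
var discrepancies = [];
for (var i = 0; i < hbx.length; i++) {
  discrepancies.push((hbx[i] - siteCatalyst[i]) / siteCatalyst[i] * 100);
}
var mean = discrepancies.reduce(function (a, b) { return a + b; }, 0) / discrepancies.length;
// discrepancies -> roughly [2.2, 2.2, 2.3]; mean -> ~2.2%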

Visits

In both products, "a visit begins when a visitor first views a tracked page on a website. The visit will continue until there have been no additional tracking requests from the site for 30 minutes (default) or until the maximum visit length occurs (12 hours)."

If a visitor stays on one page for more than 30 minutes without additional tracking requests being sent and then resumes viewing additional pages, causing additional tracking requests to be sent, both products will register a new visit.

Both products require persistent cookies to count a visit. If a browser does not accept persistent cookies, both products will not calculate a visit for that visitor (but will track the page view).

This is compliant with the WAA standard definition of a visit: "A visit is an interaction, by an individual, with a web site consisting of one or more requests for a page. If an individual has not taken another action (typically additional page views) on the site within a specified time period, the visit will terminate by timing out."
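
The timeout rule can be sketched as follows (a simplified illustration of sessionization, not either vendor's actual code; the 12-hour maximum visit length is omitted for brevity):
// Simplified sessionization: a hit starts a new visit when more than
// 30 minutes (the default timeout) have elapsed since the previous hit.
var SESSION_TIMEOUT_MS = 30 * 60 * 1000;
function countVisits(hitTimestamps) { // timestamps in ms, sorted ascending
  var visits = 0;
  var lastHit = -Infinity;
  for (var i = 0; i < hitTimestamps.length; i++) {
    if (hitTimestamps[i] - lastHit > SESSION_TIMEOUT_MS) visits++; // new visit
    lastHit = hitTimestamps[i];
  }
  return visits;
}
// Three hits: the 31-minute gap after the first one starts a second visit.
countVisits([0, 31 * 60 * 1000, 33 * 60 * 1000]); // -> 2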

The results for visits show more variation, at 12.6% (control limits between 11.4% and 13.8%). That is, SiteCatalyst reports more visits than HBX does.

Visitors and Daily Unique Visitors

Since this metric is highly influenced by Visits, the discrepancy is also high, at 10.5% (limits between 9.1% and 11.9%). But here, HBX and SiteCatalyst handle the identification of unique visitors very differently. Although they both use persistent cookies, SiteCatalyst can fall back to IP + User Agent to track unique visitors. The impact is significant:
  • HBX will still record page views for visitors rejecting cookies, "but will not be counted as a Visitor or Visit and will not generate visit-level data." (KB #7824)
  • SiteCatalyst, however, will still count visitors rejecting cookies as a Visitor (via fall-back) and will generate page-level data, but will not be counted as a Visit or generate visit-level data.
This explains why the discrepancy in Visitors might be lower than the one for Visits.

Factors to consider

Why such differences? Here is a list of various factors to consider:

1st party vs. 3rd party cookies
We can't insist enough on the importance of using 1st party cookies. In our case, HBX was historically tagged using 3rd party cookies (served from hitbox.com), while SiteCatalyst was tagged using 1st party cookies (served from the site's own domain). 3rd party cookies are much more likely to be blocked and deleted; some estimates state that 40%, even up to 50%, of 3rd party cookies are blocked or deleted on a monthly basis.

Tag code location
Some internal studies conducted by Omniture revealed that, for the same tool, putting the tagging code at the top vs. bottom of the page could lead to 3% to 8% differences in page views being collected. Depending on the site, visitors quickly sifting through page navigation might click on a link before the page is fully loaded, thus, before the tags are fired.

In our example, the SiteCatalyst code was implemented near the top of the BODY tag, while the HBX code was just before the closing BODY tag. Good practice generally recommends putting tags at the end of the page to alleviate any negative impact on the page loading time. However, there are cases where you will consciously decide to put it at the top.

Session timeout
SiteCatalyst uses a 30 minute default timeout period before considering a visit complete.

In SiteCatalyst, upon request, this time period can be customized per report suite. However, both the WAA and the IAB recommend a 30-minute session timeout.

Visit reporting time period
Suppose I’m in the GMT-5 timezone (Quebec City, Eastern Time), your company is somewhere in GMT-6 (Central Time), and Omniture is at GMT-7 (Utah, Mountain Time)... If my visit starts at 11:55pm (local time) and goes on for 10 minutes past midnight, on which day should my traffic be recorded?

The Omniture KB article #7824 states for both solutions that "a visit is reported only during the time period in which the visit is initiated". Furthermore, article #633 indicates that unique visitors are based strictly on the time zone specified in your report suite. Thus, if the reporting time zone is Mountain Time and my visit starts at 11:55pm (Eastern) and lasts until 12:05am the next day, it would show up as starting at 9:55pm from the report suite's perspective. Information regarding HBX wasn't available. However, taken globally, this would certainly not explain a difference of over 12%.
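
The attribution rule can be illustrated with a small sketch (my own simplification, with hard-coded standard-time offsets): shift the visit's start time into the report suite's timezone and credit that calendar day.
// Sketch: report a visit on the calendar day of its start, expressed in the
// report suite's timezone (offset in hours from GMT; DST ignored for brevity).
function reportDay(startUtcMs, suiteOffsetHours) {
  var shifted = new Date(startUtcMs + suiteOffsetHours * 3600 * 1000);
  return shifted.toISOString().slice(0, 10); // YYYY-MM-DD in suite time
}
// A visit starting 2009-01-27 at 12:30am Eastern (GMT-5) is 05:30 UTC.
var start = Date.UTC(2009, 0, 27, 5, 30);
reportDay(start, -5); // "2009-01-27" from the visitor's (Eastern) perspective
reportDay(start, -7); // "2009-01-26" in a Mountain Time report suite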

Conclusion

In this case, we know the atomic metric (Page Views) is accurate. We now have a fair explanation for the reported number of Visits & Visitors: different calculation methods, 1st party vs. 3rd party cookies and code location play an important role in the accuracy of your metrics. Even though the variance in Visits is a bit high, the two solutions still move in concert with one another. As David Kirschner, Principal Best Practices Consultant at Omniture, rightly pointed out, "the causes for these discrepancies may be difficult to explain, but they should not prevent you from taking action on the data".

As a web analyst in such situations, your role is to explain in non-technical terms why there are differences in metrics. Always stress the significance of trends and, if you can, run the two solutions in parallel and show the difference in scales. Use the thermometer analogy, or this one: CAD vs. USD both report monetary amounts and use the same terminology (dollars), but their values are different. For most people, understanding the intricacies of such a difference is not necessary in order to make good business decisions.

Please contact me if you need professional auditing of your web analytics implementation.

Acknowledgments: I want to thank Penny Tietjen from Quickbooks Support at Intuit, as well as David Kirschner, Paul Kraus, Martin Liljenback, Alex Hill and James Kemp from Omniture for their great support.

Sunday, January 25, 2009

Whitehouse.gov: snafu around a web bug

Even the best of intentions are often incorrectly brought to life. Here are some random quotes & opinions about the InformationWeek article entitled "White House Web Site Revisits Privacy Policy".

Note: in this post, emphasis in quotes is mine...

Privacy Policy

1st mistake: forgetting to update the privacy policy. The whole thing seems to have started because 3rd party cookies from WebTrends were found while the privacy policy didn't mention them.

Of course, stating "attorney" and "web bug" in the same sentence is nothing to reassure casual users and those not educated about web analytics (emphasis mine): "an attorney, warned [...] the WhiteHouse.gov site contains a Web bug."

"Certain details"

The long article continues and mentions that "in the process of receiving the remote request from WhiteHouse.gov [...] WebTrends also receives certain details". "Auerbach observed [...] that while he recognized some of the data requested [...] the other data gathered by WebTrends was unclear." Then the article goes on to (try to) explain "what" is being collected and fails miserably. Why? Because most web analytics vendors do not provide a detailed breakdown or disclose the usage of each parameter they collect.

Using WASP would easily reveal most of the name-value pair information sent to WebTrends. While most tags are documented in vendors' implementation guides, some of them are not, and getting information about their specific use is very difficult. This is not specific to WebTrends: as I found out myself, other vendors are also very reluctant to let end users know exactly what each tag is used for.

Is it legal?

I don't want to pour oil on the "is it ethical" fire. To me, that's a waste of time. However, suggesting web analytics (in general) might be illegal is a whole other story: "I would suggest that since the collection, aggregation, and conveyance of the data to WebTrends is from the user's computer and not from WhiteHouse.gov that a very strong argument can be made that the data belongs to the user, not WhiteHouse.gov".

However, "Auerbach concedes that the data sent to WebTrends may not be clearly categorizable as personally identifiable information."

Case closed.

JavaScript Security?

"[JavaScript] is not a particularly safe practice or good for privacy, although most major sites still do it anyway, using WebTrends, Omniture, or Google Analytics". We've been arguing about that for years, yet, I still talk to a lot of people who think JavaScript and cookies are evil. As with any other technology, they can be used to make our lives better... or turn them into nightmares. Such bold statements, like any radical opinion, is never good to foster discussion and advancement.

My take

My takeaways from this story:
  1. The privacy policy of Whitehouse.gov should be reviewed.
  2. WebTrends should also be configured to use 1st party cookies (although, in the end, the data would still be sent to WebTrends anyway).
  3. I will continue to talk with vendors about providing full disclosure on the exact use of each parameter they receive, and include that information in WASP.
  4. Privacy advocates and casual users certainly need to be better educated about web analytics.

Friday, January 23, 2009

Web Analytics Conversations: Winter 2009

I've been following dozens of web analytics blogs over the past couple of years and one of my goals has always been to make it easy to share and leverage this information with other web analytics practitioners.
  • 2007: I created a master Feedburner feed I called "Web Analytics Conversations" and manually maintained a list I published on my blog. Maintaining the list was long and tedious.
  • 2008: In the fall of 2008, I developed a mashup of Feedburner and Google Blogs Search which provided the latest posts and automated ranking of blogs. However, Feedburner has so much trouble and its API is so unreliable that my solution didn't really work well.
  • 2008: In late 2008, AllTop created a topic dedicated to web analytics. However, I don't find the interface that intuitive either.
  • 2009: I have created a Wikio page dedicated to web analytics. This looks very promising: all feeds are readily available if you want to subscribe to specific ones and you can vote for the best articles. The approach is very interesting, so I'll give it a try!
    However, there are a couple of things I'm not sure about: If you add this page to your own Wikio account and I add/remove feeds later on, will you get those changes too?
Check out this very cool Google News visualization from marumushi.com:

Now, it would be amazingly cool to have this feature for the Web Analytics Conversations!

Job: Senior Web Analytics Specialist at Mediagrif

I'm relaying a job offer I received from Mediagrif, a leading operator of e-business networks and provider of complete e-business solutions located in Montréal.

I have rarely seen such a thorough and complete description for a web analytics role, tip of the hat to Mediagrif for having a very clear description for this great job!

Position overview

The successful candidate will be responsible for driving the web analytics strategy of Mediagrif’s B2B websites, managing business analysis, reporting, and forecasting, and playing a key role in the business planning process. They will integrate data from different sources, track online performance as it relates to the total multi-channel business and develop insights with particular emphasis on understanding the effectiveness of user flows, new features, marketing programs and business intelligence. Their analysis will facilitate data driven decisions, highlighting strengths and weaknesses of the business units, presenting concise conclusions and proposing test paths for improvements. The candidate will be passionate about web analytics and remain updated on the latest web developments and best practices.

Responsibilities

General
  • Optimize implementation of Omniture web analytics toolset and utilize to analyze traffic and behavioral segments on our sites.
  • Develop and maintain consistent reporting framework; create holistic dashboards pulling data from different data sources.
  • Work with business owners to develop benchmark criteria for measuring online metrics; create and manage KPIs; provide analytical support, especially in the product definition phase; present cross-functional analysis.
  • Work with IT department, marketing, product management and business partners on data flow requirements and monitoring; perform audits and verify process integrity throughout campaign cycles.
  • Identify how various programs and Web site flows are performing; solicit and/or recommend improvements that target conversion rates and user satisfaction; implement and manage the testing of improvements and effectively communicate findings.
  • Deliver web analytics presentations in a variety of forums to employees at all levels of the company, including the senior management team.
  • Assist the online and emarketing team by providing web analytics and insights to make data-driven decisions; analyze web analytics information and provide recommendations in the context of campaigns, web site redesigns and new product development.
  • Proactively seek to improve day-to-day efficiency and automate activities wherever possible.
Client management:
  • Possess understanding of clients’ market dynamics and act as a consultant to clients for the planning and execution of their online marketing needs
  • Provide recommendations on how to improve ROI of online marketing campaigns and product development
Project management:
  • Identify, create, prioritize, delegate tasks
  • Monitor project/task progress and escalate project related issues to appropriate teams (client, integration team, design team, etc.)

Education

BA or BS in Statistics, Business, Economics, Math or Computer Science (Masters preferred)

Experience

Minimum of three years hands on experience analyzing Web data for a medium to large ecommerce Web site or group of Web sites (5 years preferred)

Skills & competencies

Personal skills
  • Ability to speak and write fluently in French and English
  • Is goal and results oriented
  • Strong leadership, organizational, statistical, analytical and problem-solving skills
  • Self-motivated with excellent interpersonal and written communication skills
  • Ability to establish, nurture, and support good working relationships with internal and external clients
  • Client focused, creating an environment in which concern for client satisfaction is a key priority
  • Highly adaptable to change with the ability to identify, evaluate, and mitigate risk
  • Has strong initiative and is a creative thinker
  • Able to make complex decisions and solve problems involving varied levels of complexity, ambiguity and risk
  • Ability to cope with stressful situations
  • Curious nature is an asset
  • Highly effective working in teams with tight deadlines.
  • Ability to manage multiple projects simultaneously
Technical skills
  • Designing, constructing, and running reports using various online web analytics solutions.
  • Constructing reporting dashboards in Excel with external queries.
  • Assisting in various web analytics projects, using data-mining techniques.
  • Mid-level expertise in SEM (Search Engine Marketing) / PPC (Pay Per Click) and SEO (Search Engine Optimization) strategies, and a minimum of one year of experience measuring the success of SEM/PPC and SEO campaigns/efforts.
  • Mastery of Excel, PowerPoint and other software tools for reporting and presenting data.
  • Thorough understanding of Web technologies such as HTML, CSS, XML.
  • Expertise in Access, SQL Server or Oracle a plus.
  • Demonstrated ability to collect data, analyze trends, find stories, draw conclusions and act on recommendations.
  • Experience with web analytics programs (e.g. Coremetrics, WebTrends, Visual Sciences, Unica, Google Analytics).
  • Experience with statistical programs such as SAS and SPSS, and with statistical testing solutions such as Offermatica and Optimost.
  • Experience with tracking solutions for technologies such as Flash, AJAX, video, etc.
  • Experience writing and de-bugging sophisticated JavaScript tags is desirable

If you want confidential/anonymous info about this job or Mediagrif, feel free to contact me privately, or get in touch with Human Resources at Mediagrif directly.

If you are an employer seeking talent, or you are looking for a job, I'll be glad to relay the information.

Wednesday, January 21, 2009

WASP v1.05 released

Another round of bug fixing. This time I solved a couple of very obscure and hard to replicate issues.

As always, you can get the latest version from WebAnalyticsSolutionProfiler.com

I have also enabled the purchase of WASP Pro + Market Research. This version includes the capability to scan thousands of websites and get vendor market shares, such as my study of the Top 500 online retail sites. This is especially useful for vendors & agencies doing competitive analysis and business development, as well as for market/financial analysts.

I also want to reiterate my invitation to use User Voice to suggest and vote for new features!

Google Analytics: script to track outbound links and downloads

Update 2011-10-10! Check out gaAddons free script example! 
Important! Please see "gaAddons open source project: enhancing Google Analytics" instead.
In my opinion, tracking outbound links and downloads should be part of any good web analytics implementation.

In the past, the Google Analytics _trackPageview() call was often used to track outbound links and downloads. However, with the introduction of event tracking, I find it much more appropriate to use that technique. Counting downloads and outbound links as additional page views impacted several reports (Pageviews, Pages/Visit, Bounce Rate, Top Content, etc.) and people often had to set Goals in order to track those specific Pageviews as success events. No more!
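
The difference in a nutshell, using the classic pageTracker syntax (a sketch; the pageTracker variable comes from the standard snippet shown further below):
// Old approach: the download is counted as an extra (virtual) page view,
// inflating Pageviews and distorting Bounce Rate and Pages/Visit.
pageTracker._trackPageview("/download/whitepaper.pdf");
// Event approach: the click is recorded under Event Tracking instead,
// leaving the page-view-based metrics untouched.
pageTracker._trackEvent("download", "click", "/whitepaper.pdf");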

Introducing gaAddons.js

The goal of this script is to automate a couple of common tagging requirements: tracking outbound links, download links and email links. It makes the job very easy and reduces the risk of errors. Basically, just add a reference to gaAddons.js as explained below and voilà!

Outbound links:
  • Event: "outbound"
  • Action: "click"
  • Label: target URL
Email links:
  • Event: "mailto"
  • Action: "click"
  • Label: email
Download links:
  • Event: "download"
  • Action: "click"
  • Label: URL of downloadable document. The default looks for the following regular expression:
    /\.(docx*|xlsx*|pptx*|exe|zip|pdf|xpi)$/
    It basically says: "Look for any link ending with a dot followed by one of these popular file extensions".
    Those can easily be changed in the gaAddons.js script.
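
Conceptually, the script does something along these lines (a simplified sketch of the approach, not the actual gaAddons.js source):
// Simplified sketch of the gaAddons.js idea (not its actual source):
// inspect every link on the page and attach the matching event on click.
var gaA_downloadPattern = /\.(docx*|xlsx*|pptx*|exe|zip|pdf|xpi)$/;
function gaA_decorateLinks() {
  var links = document.getElementsByTagName("a");
  for (var i = 0; i < links.length; i++) {
    var href = links[i].href;
    if (href.indexOf("mailto:") === 0) {
      gaA_track(links[i], "mailto", href.slice(7));  // email links
    } else if (gaA_downloadPattern.test(href)) {
      gaA_track(links[i], "download", href);         // download links
    } else if (href.indexOf("http") === 0 && href.indexOf(document.domain) === -1) {
      gaA_track(links[i], "outbound", href);         // external links
    }
  }
}
function gaA_track(link, category, label) {
  link.onclick = function () { pageTracker._trackEvent(category, "click", label); };
}
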
If you have other automations you would like to see, let me know. You are free to modify this script and redistribute it as long as you keep the references to the authors at the top of the script.

Get going with gaAddons.js

To get going, do the following:
  1. Get the gaAddons.js script and save it somewhere on your own server
  2. Add this line near the closing BODY tag of all your pages, just after the standard Google Analytics tag:
    <script src="/js/gaAddons.js" type="text/javascript"></script>

Integrating the code

Your code should look like this:
<script src='http://www.google-analytics.com/ga.js' type='text/javascript'></script>
<script type='text/javascript'>
var pageTracker = _gat._getTracker("UA-999999-1");
pageTracker._trackPageview();
</script>
<script src='gaAddons.js' type='text/javascript'></script>

Your tracker variable should be named "pageTracker", otherwise you will need to modify the gaA_pageTracker variable in gaAddons.js code.

Looking at the stats

Once your change is live, give Google Analytics some time to gather data and look under Content/Event Tracking for outbound, mailto and downloads stats (if using events, the default), or under Content/Content Drilldown for /outbound, /mailto or /download statistics if using the page view method.

---
Credit where credit is due! A while back, Justin Cutroni at EpikOne published "Google Analytics Short Cuts". This script is an optimized/improved version of his script. It is also inspired by the work of Brian Clifton, author of Advanced Web Metrics with Google Analytics. Contributors: Damon Gudaitis, Andy Edmonds.

Monday, January 19, 2009

WASP v1.04 now available

Just released WASP v1.04 at http://WebAnalyticsSolutionProfiler.com

Thanks to your bug reports and suggestions I have completed another round of improvements! Keep them coming!

Tips & Tricks

  • When the sidebar is open, right-click an item to copy its value
  • You can now resize the sidebar to the width you want!
  • The built-in data browser shown after a crawl includes pre-defined views for Google Analytics, Omniture SiteCatalyst, WebTrends and Search Engine Optimization
  • In the data browser you just have to click on a cell to filter those records, or click on a column header to toggle sorting in ascending or descending order
  • Use the "Export to CSV" or built-in data browser to easily spot missing or legacy tags, wrong data being sent or duplicate titles

Automated updates

In order to receive automated updates, the WASP extension must be distributed through addons.mozilla.org (aka AMO), the official Firefox location. The AMO team is overwhelmed by the number of addons being submitted and the approval process is taking months. As soon as WASP gets back on the AMO site you will start to receive automated update notices whenever I publish a new release. This will make it much easier to stay up to date.

Since WASP is a very complex and sophisticated extension, I'm currently discussing with the AMO Editors the possibility of helping them with the workload and becoming an AMO Editor myself.

The future of WASP

WASP is the result of 2 years of thinking, testing ideas & developing in parallel with my other activities. During all that time you have witnessed the evolution of WASP and benefited from it for free. For sure, it was buggy at times and not always working well. It was a beta!

I'm convinced WASP v1.0 is a much easier tool than debuggers, provides much more accurate results than other alternatives and, at the same time, costs a fraction of the high-end alternatives. There is definitely value in that! Give it a try if you haven't already. Then go ahead and purchase your license! Vendors and agencies: you can get volume discounts!

I will continue to improve the product, solve issues and bring new features. It’s not a perfect product and I will never pretend it is: I’ve been in the business for long enough to know better than that.

I need your financial support to make YOUR job easier!

Wednesday, January 14, 2009

Predictive analytics

If you think web analytics is interesting, you should also look at "predictive analytics". I'm not talking about fellow bloggers who did predictions for the web analytics industry in 2009! Those are amusing at best and most of the time they sound like crystal ball gazing.

What is predictive analytics anyway?

I'm talking about the science of predictive analytics, what Wikipedia defines in this way:
Predictive analytics encompasses a variety of techniques from statistics and data mining that analyze current and historical data to make predictions about future events.
Eric Siegel, Ph.D., an expert in data mining, text mining and predictive analytics, defines it in his own terms:
Predictive analytics is business intelligence technology that produces a predictive score for each customer or prospect. Assigning these predictive scores is the job of a predictive model which has, in turn, been trained over your data, learning from the experience of your organization.

Predictive analytics optimizes marketing campaigns and website behavior to increase customer responses, conversions and clicks, and to decrease churn. Each customer’s predictive score informs actions to be taken with that customer - business intelligence just doesn’t get more actionable than that.
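
In concrete terms, an already-trained model boils down to a scoring function applied to each customer record. A toy illustration (made-up coefficients, not a real model):
// Toy example: scoring a customer's likelihood to convert with a trained
// logistic model. The coefficients are made up for illustration.
function predictiveScore(customer) {
  var z = -2.0                                      // intercept
        + 0.8 * customer.visitsLast30d              // recent engagement
        + 1.5 * (customer.reachedCart ? 1 : 0)      // strong buying signal
        - 0.5 * (customer.daysSinceLastVisit / 30); // recency decay
  return 1 / (1 + Math.exp(-z));                    // probability between 0 and 1
}
predictiveScore({ visitsLast30d: 4, reachedCart: true, daysSinceLastVisit: 3 });
// -> ~0.93: a hot prospect, worth an action such as a targeted offer
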
Additional info on predictive analytics

Predictive analytics training

At one of the previous eMetrics conferences I had the chance to attend Eric Siegel's predictive analytics class. Great guy; I don't know if he is still doing his "analytics rap" song, but that was hilarious! Aside from being fun, the class was really enlightening, with a good balance of concepts, real world examples and exercises. There will be a two-day training session alongside eMetrics Toronto on April 2nd and 3rd, and probably another one at eMetrics in San Jose.

Predictive analytics conference

If a two day training isn't enough, you should attend the first ever Predictive Analytics World conference, going on in San Francisco, February 18-19. With speakers from Wells Fargo, the San Diego Super Computer Center, Google, Yahoo!, 3M and a slew of others, the conference will be organized in two tracks and offer two workshops: modeling and decision. If my predictions are right, this is going to be a hell of a conference! :)

Get a 15% discount! Enter code immeriapaw09 when registering.

Take the Predictive Analytics World Survey

As predictive analytics quickly expands across verticals and applications, Eric and his team need your help to understand what the evolving landscape looks like. He kindly asked me to relay the information: take a few minutes, answer a handful of questions, and help him and all of us!

Survey results will be made available before Predictive Analytics World.

The PAW survey focuses on business applications of predictive analytics, complementing the Rexer Analytics 2008 Data Miner Survey, which covers the most popular software tools, which verticals have embraced modeling and more.

Take the survey now!

Tuesday, January 13, 2009

Cool features on the new WASP site

I revamped the WebAnalyticsSolutionProfiler.com website when I launched WASP v1.0 last week. I'm now leveraging a couple of pretty cool services. The number of innovative services out there is staggering, but here are a couple I find most interesting.

Kampyle

Kampyle feedback analytics: I've been using it for a while and like it very much. I talked about it many times already. For the site revamp, I have modified the feedback categories and sub-categories to better fit my needs. Each page includes the Give Feedback button so visitors can easily get in touch with me, either anonymously or providing their email. It's a great way to engage the conversation privately!

Get Satisfaction

Get Satisfaction is the "people powered customer support". A mix of discussion forum and trouble ticket system. Again, a great way to be "open" about feedback and issues. I'm also using it as a knowledge base, so whenever I get asked a question through other channels and find it valuable, I simply enter it in Get Satisfaction so it becomes available for my WASP users to see.

User Voice

User Voice might be the "voice of the customer" at its best! WASP aficionados can ask for new features and enhancements and vote for the ones they find most interesting. Right now, anonymous voting is enabled, but I could turn it off if I find there is abuse...

Thursday, January 8, 2009

Do you hear a buzz? WASP v1.0 launched!


After about 18 months of nursing, the WASP is finally leaving its nest and taking flight!

You still don't know what WASP is? It's the Web Analytics Solution Profiler, a specialized Firefox extension aimed at web analytics professionals who want to do quality assurance and understand how their web analytics solution is implemented.

There are three types of WASPs:
  • Free: for part-time analysts and occasional testing.
  • Analyst: for full-time analysts and marketing consultants.
  • Pro: for implementation specialists and consultants.
Check out WebAnalyticsSolutionProfiler.com
You can now purchase your own license!

Main features

WASP is the only solution of its kind! Think of it as a friendly debugger for web analytics (and more)!
  • The sidebar view provides in-context information as you browse
  • The powerful crawler and built-in data explorer make it easy to do quality assurance of your tags
  • Frequent updates and enhancements
  • Works with over 120 web analytics, ad network, voice of customer, multivariate testing and behavioral targeting solutions
  • Enhanced for Google Analytics and Omniture SiteCatalyst
  • Praised by web analysts around the world and cited as "Web Analytics Top 5 for 2008"

Get involved!

Do you have an idea for WASP? Check out the WASP User Voice and cast your vote for the features you would like to see in WASP, or propose new ones!

Thanks!

Developing such a tool is a long and tedious task. The web analytics community (that's you!) helped me out by providing invaluable feedback and suggestions. In return, WASP was free.

A free version will always be available, but now is the time for WASP to fly on its own and bring in enough revenue to justify the countless hours spent nurturing it and, even more important, to fund its continued improvement.

WAA standards definitions and Dainow

I read Brandt Dainow's argument about "The web analytics standard that failed us".

For some reason, it seems to be another rant against the Web Analytics Association, but that's another story and I don't want to know about it.

In this post, I'll give my opinion about:
  • Marketing and IT parallel universe
  • The value of the WAA
  • Consensus vs. authority
  • Algorithms and patents
I conclude with what I see in the field.

Marketing schmoozing vs engineering complexity?

In some respects, Dainow is right. The current WAA definitions are the result of "an amalgamation of various perspectives arrived at through a consensus" (from the WAA document), i.e. they avoid alienating any vendor. Just as the WAA site didn't list any web analytics tools for a long while in order to avoid a political snafu with vendors. This is my opinion as a WAA member, based on informal discussions, and maybe I'm totally wrong here!

But beyond that, coming myself from a technological background and having witnessed the parallel universes separating marketing and IT, it's clear to me those definitions are written in marketing terms rather than engineering terminology. Obviously, Dainow would like to see the definitions written in a more engineering-oriented lingua franca. I think that's the whole point of his argument.

Rather than throwing stones at it, Dainow should realize this document was published with the mention "For public comments". In a way, that's exactly what he is doing with his provocative style.

The value brought by the WAA

"The end goal is to have true metrics standards and uniform adoption of these standards throughout our industry". To get there, "the WAA established the Standards Committee to rationalize variations within the analytics community, and to create a standard terminology for the analytics community."

The point is right there: standard terminology. While Dainow aims for a technological standard, which would need to be developed, supported and imposed by a standards organization, the WAA approach is to work with the involved parties to develop a common terminology which, it hopes, will become a "de facto standard".

Consensus vs authority

We shouldn't forget the WAA is a volunteer organization. If I had to write "standards" as a job (i.e. be paid for it), I would certainly spend countless hours doing it: reaching out to vendors, signing strict NDAs, making friends with the engineers, etc. As volunteers, there's only so much we can do. Just writing this post took valuable time I could have spent invoicing my clients. In a troubled economy, we look closely at the time we spend on "ancillary" activities and how much they bring back into our own pockets. That's the sad reality!

The ISO is a standards body, and in order to claim ISO certification you need to abide by strict rules, undergo audits and pay to retain certification. On the other hand, the IAB is a more open structure, much more similar to the WAA, that was able to define common terminology regarding online advertising and bring "standards, guidelines and best practices". Which one makes the most sense? In my mind, the second option is the way to go!

Algorithms and patents

As a tutor for the UBC Award of Achievement in Web Analytics, I have had the chance to tutor hundreds of students. Do they really care about the engineering definition of a visit? Not really. They care about the way they will be able to communicate that information to managers and make them act on it.

My eBusiness MBA thesis will be a review of the literature available on the topic of web analytics. Simply put, literature about the concepts or the use of specific products (especially Google Analytics) abounds in the form of books and blogs. The rationale behind those is often a desire to stand out from the crowd and serve as bait for high-income consulting services. However, scholarly literature with an engineering slant is almost non-existent. The reason might be simple: we're in the realm of privately owned algorithms and patents. For vendors, there is simply no interest in seeing what is perceived as a secret sauce revealed to the world.

Conclusion

Do we need an engineering definition of a page view, a visit, a visitor or a bounce (maybe especially bounce!)? Eventually, absolutely! But what we first need to achieve is a common communication language.

I think that's the true objective of the WAA Web Analytics Definitions document. And we're closer now than we were last year!

Monday, January 5, 2009

Quality assurance of web analytics tags implementation

This article covers the following elements:
  • The facts about web analytics quality assurance
  • The methodology usually employed to ensure quality
  • The solution: WASP v1.0 coming up!
  • Additional references: posts by fellow bloggers
  • eMetrics Toronto: round table discussion on this topic and speaking about web analytics maturity

The facts

John Lovett, from Jupiter Research, said that "42% of web analytics clients surveyed reported data accuracy as an important factor when selecting a vendor". Yet a thorough quality assurance of web analytics tag implementation is rarely part of the initial implementation, and data quality suffers as the site evolves and tags are left out of the loop.

I have personally conducted tests on dozens, if not hundreds of sites, including vendor sites. So far, I have not found a single site that didn't have at least one of these common issues:
  • Missing tags
  • JavaScript errors
  • Missing data elements
  • Using the wrong account id
  • Not following a semantic approach to page titles, links, site sections, etc.
  • Passing the wrong data: invalid numeric values, strings with invalid characters, overly long strings, etc.
  • Not following the "domain of values" of some tags: for example, if you always send 1, 2 or 3 to represent membership levels, sending a value of "X" or "4" is an error (see the sketch below).
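
A check for that last kind of rule could be as simple as this hypothetical sketch (the rule names are mine, not an actual WASP API):
// Hypothetical "domain of values" validation: a tag value is valid
// only if it belongs to the declared set of acceptable values.
var validationRules = {
  "membership_level": { allowed: ["1", "2", "3"] }
};
function validateTag(name, value) {
  var rule = validationRules[name];
  if (!rule) return true; // no rule declared: accept the value
  return rule.allowed.indexOf(String(value)) > -1;
}
validateTag("membership_level", "2"); // true
validateTag("membership_level", "X"); // false: outside the domain of values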

Methodology

Web analytics implementation is much more complex than most people think. Quality assurance remains a cumbersome and complex task. You basically have 4 alternatives:
  1. Blind faith: Trust and wish everything is right... or, a slight variation on the same theme, check only the pages that are part of your KPIs and funnels and hope for the best
  2. Debuggers & proxies: Charles Proxy, ieWatch, Fiddler, Tamper Data, HTTP Watch or some vendor-provided simple JavaScript debuggers. Visit all pages of your site to make sure the tags are working as expected. Then wait... and check in your web analytics tool of choice to see if everything was recorded correctly
  3. Crawlers: Use a high-end tool such as Maxamine (now Accenture Digital Diagnostics), which is very powerful and targeted at larger companies or something like Web Link Validator, which would still require a fair amount of technical know-how. For Google Analytics there's also SiteScanGA but it also has some limitations (incomplete crawls if the site isn't fully indexed by Google, basic checking, etc.)
  4. WASP: The Web Analytics Solution Profiler was specifically built for quality assurance of web analytics implementations. The sidebar offers page-by-page view of the tags as you surf and the crawler allows you to check a full site or section of a site.

Solution: WASP v1.0 coming up!

WASP v1.0 is coming up this month and it will be very affordable. This version will include specific enhancements for Omniture SiteCatalyst and Google Analytics (such as a friendlier sidebar view of tags - see screen capture) and future releases will continue to bring further enhancements.

The "Crawl from..." wizard will guide you through the steps to ease the process of crawling a whole site or a specific site section. Since the crawler runs "in context" of a real browser session, there is no better way to test. Options such as timeouts, filters, excluding your own traffic, pause/resume and data management policies makes it a unique approach to web analytics quality assurance of tagging implementations. The built-in data explorer makes it easy to spot untagged pages or wrong values being sent (see screen capture) and you always have the option the data to export to Excel and other formats.

Additional references

Some references about the challenges of tagging:

Speaking at eMetrics Toronto

I will be at the next eMetrics Marketing Optimization Summit in Toronto (March 29-April 1) where there will be round table discussions on various topics. I will be a facilitator for the discussion on "How Trustworthy is your Data?"

I will also be presenting about "Establishing your Online Analytics Maturity" in the "Jump Starting your Online Analytics Implementation: A Four Part Series". The session will be very interactive and you will get out of it with your own maturity assessment map! Get more information on the topic of web analytics maturity assessment here and here.