Thursday, December 8, 2011

A definition of web analytics & what really matters

A few days ago I published a blog post entitled "The ultimate definition of web analytics" at Online-Behavior.com. It raised a lot more interest than I initially expected and I found it quite refreshing to dig deeper into a subject. Many people contributed to the conversation: top industry figures as well as practitioners and consultants, people who took some of their precious time to contribute. Go read the post & the comments - they are worth it! And if you feel like it, it's never too late to contribute!

Last week, while at eMetrics London to speak about the Online Analytics Maturity Model, several people came to me and shared their views about the article. Other speakers also talked about it and more broadly about the need for "web analytics" to evolve into something else - although there is no clear consensus, there is clearly a trend!

While there, Nicolas Malo had the brilliant idea to ask a couple of us "what really matters in web analytics today?" - you can see this short interview below, but I also recommend watching the answers from Matthias Bettag, Steve Jackson, Neil Mason and Jim Sterne.


What really matters in Web Analytics today? #4 - Stéphane Hamel - eMetrics London - Dec 1st, 2011 from Nicolas Malo on Vimeo.

What do you think? What really matters in web analytics? Is it even "web" analytics - or should it be?

Thursday, October 6, 2011

We can always learn & improve: coaching from Joseph Carrabis

Joseph Carrabis, kite flying at eMetrics, 2007
I met Joseph Carrabis at my first speaking appearance at eMetrics San Diego in 2007. Joseph had organized an informal kite flying session - a great way to break the ice. Anyone who meets Joseph is immediately fascinated by his charisma and his ability to be attentive to one’s thoughts, emotions and non-verbal communication. Anthropologist, neuro-scientist, historian, researcher, teacher, author… or just a friendly fellow with whom it’s always interesting to chat.

Joseph made his mark in our little web analytics industry when he wrote a series of blog posts on the unfulfilled promise of online analytics in 2009. The subject is still relevant to this day; see part 1 – the challenge, 2 – some solutions and 3 – the human cost and my own little contribution to the conversation here.

Time has passed; Joseph isn’t as visible as before in the traditional web analytics field – which doesn’t mean he’s not doing analytics, to the contrary! He’s just at another level. We occasionally bounce ideas and validate some assumptions related to the evolution of the market and the online analytics discipline.

As I continue to spend a lot of my energy looking into ways to make online analytics easier, and in my role at Cardinal Path, I do more speaking appearances and often meet with executives. I asked Joseph to look at a recording of one of my presentations. I spent a full day with Joseph and his lovely wife and colleague Susan. The experience was revealing – it allowed me to uncover little things that will help me improve my verbal and non-verbal communication skills and even become more aware of others. Subtle changes in the presentation content, gesture, tone and choice of words and even breathing all contribute to an overall improvement.

As with analytics, a continuous improvement approach and attention to detail are the way to success. I’ve applied what I’ve learned at my recent speaking appearances in Stockholm and Vancouver and it did make a difference!

If you were there, tell me how I did, or come see me at the upcoming GAUGE, eMetrics NY and other conferences!

Thursday, September 15, 2011

Phasing out gaAddons: it's not all bad!

The bad news: After much thinking, I have decided to stop developing and supporting gaAddons.
The good news: You can get a free, easy to customize source code sample to accomplish most of what gaAddons was doing! Check it out on jsFiddle!
The greatest: Cardinal Path provides professional audits of your web analytics implementation, complete implementation services from needs assessment to providing the exact tags required, and training.


Why are you stopping the development and support of gaAddons?
There are a number of reasons:
  • gaAddons' goal was to make it easy to answer the most common implementation needs, such as tracking downloads and outbound links. Over time, it evolved to become a be-all and end-all for enhanced tagging of Google Analytics, leading to a more complex and heavy library.
  • Google Analytics v5 promises to address some of the items which were handled by gaAddons (for example, _trackLoadTime).
  • Continuing to work on gaAddons is a distraction from my role as Director of Strategic Services for Cardinal Path.
  • I have come to the conclusion that advanced implementations are better addressed when specifically tailored to the client's needs.
  • I want to put more energy into the Online Analytics Maturity Model.
What does it mean for those who paid a license of gaAddons?
I will contact those who paid their license fee and share the original source code, as it stands today, so you can adapt it for your needs if you want to. No further support will be provided and you are not allowed to repackage or resell the code - basically, you are allowed to use it for your own needs (and your clients if you paid the agency license). I encourage everyone to look at the free sample code and start from it to make any enhancements for your specific needs.
What if I need help?
Cardinal Path offers professional audits of your web analytics implementation, can provide full implementation services, training and annual support contracts.
What if I paid a license fee?
All recurring subscriptions have been cancelled so you won't be charged again.

Wednesday, August 3, 2011

Stephane Hamel web analytics speaker schedule (fall 2011)

Just before taking some time off this summer, I wanted to share my fall conference schedule - as I like to say, meeting face to face is certainly one of the best social media outcomes! Come to one of my sessions, workshops or just grab me at a break and say hello!

Wednesday, May 18, 2011

Now blogging at Cardinal Path

After nearly 10 years and over 500 posts you will notice some silence at blog.immeria.net as I move things over to cardinalpath.com/blog.


The older posts will remain here, but please head over to Cardinal Path to read my latest posts.

In your opinion, what's the right strategy in such a case? Please vote & comment.

What should I do about this blog?

Monday, May 2, 2011

Bob - here's Web Analytics. Web Analytics - please welcome Bob.

The following email from Bob (not really Bob, but let's call him Bob anyway) is representative of many emails I receive from people who want to make a career shift toward digital measurement and optimization, or web analytics if you prefer:
Hi Stephane,
The UBC Web Analytics Award of Achievement has been highly recommended to me to learn more of the technical side of SEO. I saw a comment you made on Lawrence Dyson's website and also recognized your name from the UBC website. I have always been interested in marketing and while I have experience in digital marketing, I want to get more technical experience in this area. I am 36 years old and taking these courses is part of a plan to shift careers. I've been an entrepreneur for the last 12 years, but am interested in going to work for an interesting company and being able to provide value in the marketing and management of their website. What are the job opportunities for someone graduating with the Web Analytics Award? I currently reside in the UK. Thank you in advance for your response!

Best regards,
Bob
Here's what I replied:
Hi Bob,
I strongly encourage you to do it – not that it is really "technical" about SEO (in fact, not at all), but mainly because it provides a very strong foundation on web analytics. With your entrepreneurial and marketing background, coupled with experience, I’m confident you can easily shift toward analytics – or in fact, maybe not as much a pure "web analyst" (spending your days looking at data & making recommendations to others) – you could be that manager who understands the value of data and make the decisions!

But that’s just my 1st thought based on what you’ve provided – you could ask the same question on the Yahoo! Web Analytics forum to get more input.

Sincerely,
Stéphane
Over the years tutoring the UBC program I have witnessed dozens of people with very diversified backgrounds entering our field. I often get asked about "a good background to be an analyst". There is no such thing as a good or bad background - there are only people who want to leverage their own unique skills and experience. I've been told so often that I didn't have a degree, or that I was just a techie and therefore too stupid to understand the business, that I would never judge anyone who has the guts to take the plunge.

If anything, tutoring has given me the ability to easily spot who has the potential of becoming a cream of the crop analyst... and who needs a little more help and guidance to be a decent one.

Other offerings

We hear a lot about UBC, but there are other offerings: USF, UToronto and of course Market Motive. Other universities are also integrating elements of web analytics in their marketing curriculum - a little survey of universities in my local market revealed USherbrooke, McGill and HEC cover web analytics - although they typically do not offer full semester classes. For my part, I'm also teaching a full semester, graduate level class about "online analytics from a managerial perspective" at ULaval. Those students, many of whom have real work experience, are the ones who will soon define online marketing strategies and manage businesses with a strong online component. The course is available online in French and I'm still working out the details to offer it in English. See Web Analytics Business Education at ULaval for further details.

What about more hands-on, practical/technical training? Cardinal Path, which I joined recently, offers the popular Seminar for Success for Google AdWords and Google Analytics and there are many other offerings from vendors and agencies.

A note about the WAA Certification

If you read my comment from almost a year ago on Lawrence's blog, you'll notice I was talking about the WAA Certification. Since then, nearly 50 people received the Certified Web Analyst title and I have myself proctored the test twice. More web analytics professionals are enrolling for the test and although it took a little longer than expected, awareness of Certification has significantly increased - not only for practitioners and consultants, but also for employers and clients.

My take

The industry is shaping itself and is answering a diverse array of educational and training needs - and that's excellent! If I had one complaint about the current WAA Education approach it would be its inability and seeming unwillingness to list more academic and training resources (other than UBC and Irvine - both of which have close ties with the WAA). Personally I would prefer to see a more complete list of resources - even if not officially "endorsed" by the WAA - to serve its members' diverse needs and geographic considerations.

I have created a list of Web Analytics Academic Resources - please add, vote and share! Are there any other University programs you know of?

Any thoughts and comments about web analytics education are welcome!

Thursday, April 28, 2011

Cardinal Path announces key addition: me!

The recent announcement of a merger between WebShare, PublicInsite and VKI Studios was received very positively – Cardinal Path is destined to become the leading authority in the online digital measurement space. Over the past couple of years I had the pleasure to discuss and work with both VKI Studios and PublicInsite - John Hossack and Alex Langshur - two organizations and two people I have the utmost respect for. I valued the opportunities to work with them while enjoying the “independent thought leader” status; but now is the time for me to reconsider my role.

As of April 20th, I'm officially taking on the job of Director, Strategic Services for Cardinal Path. My role will be to push the envelope - tackle complex issues and challenges while bringing the most optimal and realistic solutions to make online digital measurement easier and more beneficial to our clients.

Alex Langshur, Cardinal Path co-founder and senior partner said "Adding Stéphane to our team solidifies Cardinal Path’s market position as a leading pure play web analytics and digital marketing firm. We’re assembling a group of thought leaders with deep experience across a variety of marketing verticals and we’re excited about what they can accomplish together for our clients."

Everyone I talked to was thrilled by the news. I'm very excited to join a team of unprecedented expertise and professionalism with a unique presence across North America.


Tuesday, April 19, 2011

eMetrics Toronto 2011: just a week away!

The annual Canadian web analytics pilgrimage is just a week away! eMetrics Toronto will be at the Sheraton Center, April 26-29th.

Once again, I will be very active and I hope to see you again, or meet you for the first time!
Are you a former UBC Award of Achievement in Web Analytics or Fundamentals of Business Analysis student? Maybe you were enrolled in my ULaval graduate-level online analytics course? Have you used WASP or gaAddons? Maybe you are following @immeria on Twitter or read my blog? Partners and clients please come say hello! Let's chime in on my former role as WAA Director and Treasurer, the industry, career development or the future of online analytics!

That's a lot for a couple of days! Oh! Did I mention the lobby bar tradition? :)

Tuesday, April 5, 2011

Analytics Canvas v1.1: serious ETL for web analytics

Analytics Canvas from Toronto-based nModal isn't merely a Google Analytics API tool, it's closer to a powerful ETL (Extract, Transform, Load) tool. In fact, it reminds me of SAS Enterprise Miner.

Powerful ETL

I've been in the beta group and v1.1 should be available by the time you read this [1] - James Standen and the team at Analytics Canvas know their stuff! Basically, the powerful visual query interface of Analytics Canvas allows you, among other things, to:
  • intuitively build complex queries and apply advanced ETL rules on the data (transformation, filtering, join, sort, segment, summarize, etc.)
  • extract more data than the 10k per call imposed by GA API
  • combine multiple GA profiles, even from different accounts
  • easily extract more than the 10 metrics constraint imposed by GA API
  • work with sampled or full data sets and visualize the results 
  • combine with other data sources, be it Excel, MS SQL, Oracle, mySQL, etc.
  • export to flat files, Excel or databases
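To give an idea of what getting past a per-call row cap involves, here is a minimal sketch of paging through results until a short page comes back. It is not how Analytics Canvas works internally (I don't know that) and it doesn't call the real GA API: `fetch_page`, `fake_fetch_page` and the 1-based start index are assumptions made purely for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]

MAX_ROWS_PER_CALL = 10_000  # the per-call cap mentioned in the list above


def fetch_all(fetch_page: Callable[[int, int], List[Row]],
              page_size: int = MAX_ROWS_PER_CALL) -> List[Row]:
    """Call fetch_page(start_index, max_results) until a short page is returned."""
    rows: List[Row] = []
    start = 1  # 1-based start index (an assumption of this sketch)
    while True:
        page = fetch_page(start, page_size)
        rows.extend(page)
        if len(page) < page_size:  # last page reached
            return rows
        start += page_size


# Stand-in data source with 25,000 rows, used instead of a real API client.
_FAKE_ROWS = [{"row": i} for i in range(25_000)]


def fake_fetch_page(start_index: int, max_results: int) -> List[Row]:
    """Hypothetical page fetcher: returns a slice of the fake rows."""
    begin = start_index - 1
    return _FAKE_ROWS[begin:begin + max_results]


all_rows = fetch_all(fake_fetch_page)
```

In this sketch three calls are made (10,000 + 10,000 + 5,000 rows) and the pages are simply appended in order.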

Why is Analytics Canvas important?

The future of web analytics is dim... if we want to bring business value, we need to be business focused, not merely looking at website visits. That's why I think "web analytics" as we know it today will give way to "business analysis" and "business intelligence" - and to get there we need to achieve a higher level of online analytics maturity - and that's where tools like Analytics Canvas become essential.

A real-world scenario

I want to merge online behavior data with back-office data. I want to look at visits grouped by date and hour, days since last visit (recency), visit count (frequency), pageviews, time on site for each visit, as well as 5 different goals for each of the authenticated users to my site [2]. This data will then be merged with a SQL data source containing demographics info and monetary values of past customers (without any PII, so we comply with GA TOS - see my previous post on this topic). The end result is a nice RFM analysis model I can use to analyze customer behavior and make much better recommendations that speak to business stakeholders.
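For those who prefer to see the idea as code, the scenario above boils down to something like this rough sketch. The data, the field names and the `build_rfm` helper are all made up for illustration; the real thing is built visually in Analytics Canvas, not in Python.

```python
from collections import defaultdict
from datetime import date

# Hypothetical web analytics export: one row per visit, keyed by an anonymous user id.
visits = [
    {"user_id": "u1", "date": date(2011, 3, 1), "pageviews": 5},
    {"user_id": "u1", "date": date(2011, 3, 20), "pageviews": 3},
    {"user_id": "u2", "date": date(2011, 2, 10), "pageviews": 8},
    {"user_id": "u3", "date": date(2011, 3, 25), "pageviews": 1},
]

# Hypothetical back-office extract: monetary value per user id, no PII.
back_office = {"u1": 420.0, "u2": 75.5}


def build_rfm(visits, back_office, as_of):
    """Return {user_id: (recency_days, frequency, monetary)} for users found in both sources."""
    last_visit = {}
    frequency = defaultdict(int)
    for v in visits:
        uid = v["user_id"]
        frequency[uid] += 1
        if uid not in last_visit or v["date"] > last_visit[uid]:
            last_visit[uid] = v["date"]
    rfm = {}
    for uid, monetary in back_office.items():
        if uid in last_visit:  # keep only users present on both sides
            rfm[uid] = ((as_of - last_visit[uid]).days, frequency[uid], monetary)
    return rfm


rfm = build_rfm(visits, back_office, as_of=date(2011, 4, 1))
```

Here `u3` drops out because there is no matching back-office record, which is the behavior of an inner join on the user id.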

The challenge
  • the volume of data can be substantial and break the 10k rows limit per GA call,
  • I have more than 10 metrics,
  • I want to group by date/time and get only the rows where there is a known user,
  • data needs to be joined with back-office data stored in a SQL database,
  • and of course, I want to export the result to an Excel file, or back into SQL, for further analysis.
The solution
The final canvas is quite easy to understand; let's look at each component.
The final canvas
We use elements from the Block Library to create a canvas.
Blocks library to build complex ETL
The Data sources are easy to define (here, I'm showing a GA query)
Simple yet powerful query builder
We can use the Join block to create any type of join (those familiar with SQL will relate to inner, outer, exclusive and full outer joins).
The Join block
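If you are not familiar with SQL joins, this small sketch shows the difference between three of them (inner, left outer and full outer) on two tiny made-up tables. The `join` helper and the sample rows are hypothetical and only illustrate the concept; they say nothing about how the Join block is implemented.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`. how: 'inner', 'left' or 'full'."""
    right_by_key = {}
    for r in right:
        right_by_key.setdefault(r[key], []).append(r)
    out, matched = [], set()
    for lrow in left:
        matches = right_by_key.get(lrow[key], [])
        if matches:
            matched.add(lrow[key])
            out.extend({**lrow, **r} for r in matches)
        elif how in ("left", "full"):
            out.append(dict(lrow))  # keep unmatched left rows
    if how == "full":
        for k, rs in right_by_key.items():
            if k not in matched:
                out.extend(dict(r) for r in rs)  # keep unmatched right rows
    return out


ga_rows = [{"user_id": "u1", "visits": 4}, {"user_id": "u3", "visits": 1}]
crm_rows = [{"user_id": "u1", "segment": "gold"}, {"user_id": "u2", "segment": "new"}]

inner = join(ga_rows, crm_rows, "user_id", how="inner")
left = join(ga_rows, crm_rows, "user_id", how="left")
full = join(ga_rows, crm_rows, "user_id", how="full")
```

The inner join keeps only `u1`, the left join also keeps `u3` (without a segment), and the full join additionally keeps `u2` (without visits).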

The Filter block
The SQL data source isn't shown here, I used the Excel data source for the sake of this example. The picture below shows the Filter block with a simple rule. The output then contains a new column with a True/False result.
Sample output
And finally, when we run our query, we can preview the results that will get exported to Excel.

My take

This simple example took only a few minutes to build. Making adjustments is a snap and doesn't require a single line of programming or tedious testing rounds. The Analytics Canvas approach is different from Excel plugins like Tatvic, Shufflepoint or Next Analytics (which I'm using for a super-cool dashboard to be presented here in the coming days as part of the "The math behind web analytics" series). Those coming from a database or even stats background will probably be more comfortable with Analytics Canvas than, say, purely marketing people. Analytics Canvas definitely offers very strong capabilities, unprecedented in any other GA API tool I've seen.

If you want to extract massive amounts of GA data and merge them with back-office sources, Analytics Canvas is definitely worth a look!

Analytics Canvas will be exhibiting at eMetrics Toronto, April 25-29 - I will be there too!

---
[1] other than being on the beta list, I have received no compensation for this article.
[2] as per GA TOS, you are not allowed to store any PII info [wikipedia] in GA - but you can use a unique user id (not an email!) or in a similar scenario, you could use the e-commerce transaction id to reconcile with back-office data as long as it's not merged with any PII.

Tuesday, March 29, 2011

Web analytics ethic: from theory to practice

A week ago I published three short cases where people were invited to comment on whether they were legal, ethical and abiding by their web analytics vendor Terms Of Service (TOS). The cases were inspired by my own experience. After much talk about the WAA Code of Ethic, sessions at the recent eMetrics and discussions I had with some vendors, I thought participation would be much higher.

Here’s my point of view and some info from the brave souls who were up for the task! You should really read the previous post before continuing!

Photo: stock.xchng
Disclaimer: I'm not a lawyer nor a specialist of ethics - this information is provided as is... do your homework!

The majority of the 14 respondents were from the US and UK with some participants from Canada and other European countries. Unsurprisingly, most respondents said they were using Google Analytics.

Case #1: matching transaction id against back-end.

                                   Unsure   No     Yes
It is legal                        20%      0%     80%
It is acceptable based on my TOS   27%      20%    53%
It is ethical                      13%      13%    73%

In my opinion, this is perfectly legal – the data was collected with user consent in the context of a commercial relationship. It is also ethical – it is common and accepted to send a “thank you” email, along with the purchase details and some offers. The fact it is sent through traditional snail mail doesn’t matter – or does it? Since the transaction was done online, there is usually an expectation communications will also be conducted online. As one of the respondents put it, “At the end of the day, 'ethical' depends more on your relationship with your customer than anything else”. The TOS of all serious tool vendors specifically prohibit sending Personally Identifiable Information (PII) to their systems.

A transaction id, which is clearly not PII, is typically set by your back-end system and stored in your web analytics service of choice. Since this is a piece of data coming from your own system and merged back against it, there is generally no TOS issue – except with the Google Analytics TOS! (emphasis mine)
7. PRIVACY . You will not (and will not allow any third party to) use the Service to track or collect personally identifiable information of Internet users, nor will You (or will You allow any third party to) associate any data gathered from Your website(s) (or such third parties' website(s)) with any personally identifying information from any source as part of Your use (or such third parties' use) of the Service. You will have and abide by an appropriate privacy policy and will comply with all applicable laws relating to the collection of information from visitors to Your websites. You must post a privacy policy and that policy must provide notice of your use of a cookie that collects anonymous traffic data.
Repeat: "You will not associate any data gathered from your website(s) with any personally identifiable information from any source as part of your use of Google Analytics". Essentially, if you use Google Analytics, you should not extract transaction ids to merge them back against your own system. This is nonsense to me and I know of several organizations that are actually doing it – probably without realizing they are breaking their GA TOS. Let’s hope this will be revised.

Case #2: matching product id (SKU) against back-end

                                   Unsure   No     Yes
It is legal                        7%       0%     93%
It is acceptable based on my TOS   27%      0%     73%
It is ethical                      7%       0%     93%

Legal, ethical and no TOS issue. The key element here is that no PII is involved. From a business standpoint, what’s interesting is the ability to use behavioural data to correlate with sales in order to build a predictive model where we “know” which online behaviours are early indicators of upcoming sales and therefore, adjust inventories accordingly.

Case #3: key created from (potential) PII without user consent

                                   Unsure   No     Yes
It is legal                        33%      20%    47%
It is acceptable based on my TOS   53%      40%    7%
It is ethical                      33%      33%    33%

If I got it right, in the US: last name alone, 5 digits zip code or last digits of phone number are not considered PII.

However, in California, OPPA specifies that what is typically non-PII becomes PII when combined with other data (such as having gender associated with a specific person). In Canada, the PIPEDA law stipulates data must be collected with user consent and used for the purpose it was collected for. In Europe, and especially Germany, a last name is PII (so are IP addresses and a whole bunch of things!).

Is it ethical? In this specific case, the data is stored even if the transaction isn’t fully completed. Therefore, this practice is against the 3rd WAA Code of Ethic guideline: User Control. It is also against PIPEDA in Canada.

What about the TOS? In general, this wouldn’t be an issue and it doesn’t really matter if this string is further encoded to obfuscate it. However, Google Analytics TOS still doesn’t allow us to use this key to merge with any other data that could contain PII.
In airports, the stand by list typically shows first three letters of last name and first letter of first name

My take

While there are passionate arguments on "free vs paid" in the #measure tweet universe, I was sincerely disappointed that topics like ethics and legality didn’t raise much interest. Is it because of a lack of interest? Fear of being wrong?

Either way, it makes me wonder if web analysts happily embrace the WAA Code of Ethic because it feels good and it's a worthy cause... or are just full of it! I guess what’s most important for now isn’t to know all there is to know about ethics, legislation and TOS, but to take action when inappropriate situations are uncovered.

I don't pretend to know more than anyone else, in fact, I'm willing to be wrong! If you have comments or additional useful references, I would love to hear from you!

Monday, March 21, 2011

Web analytics ethic trivia

While I'm still working on the 3rd post in my series on "the math behind web analytics", I thought we could play a little game related to the WAA Code of Ethic.


Read the three cases below and for each one, think if this is something you might be doing already or would feel ok to do... or not. Then you'll be invited to vote and comment (anonymously).

  1. For an ecommerce site using your web analytics vendor of choice, the transaction id, along with traffic source data (referrer, campaign, search keywords, etc.) and micro-conversions info (which other business valuable tasks were completed) are extracted using the API. The transaction ids are then looked up against your sales database in order to do further segmentation and build a customer list (name, address, purchase details, demographics, etc.) that will be used to send a "thank you" snail mail, along with a 50% discount on a future purchase. Clearly, there is a customer-vendor relationship in place and the information for the purchase was collected with user consent.

    Is this legal? Is this allowed by your vendor TOS? Is this ethical?

  2. Still for the same ecommerce website, you extract the product SKUs, along with item quantity sold and the same source data and micro-conversions info. The data is merged against the back-end inventory database using the SKU and predictive models are developed to know which stock levels are optimal for each SKU.

    Is this legal? Is this allowed by your vendor TOS? Is this ethical?

  3. A financial institution typically has several types of requests: credit card, mortgage and other financing inquiries, retirement simulator, insurance quotes, etc. Completed requests are stored in back-end systems for processing - those transactions are frequent targets of fraudulent behavior or are abandoned along the way. One way to generate a unique key is this: first 3 letters of last name + 1st letter of first name + 4 digits zip code (or last 3 characters of postal code) + last 4 digits of phone number. The result, for me, would be HAMS4C02637.

    Is this legal? Is this allowed by your vendor TOS? Is this ethical?

    Update: to make it clearer, this type of key could be used for lookups against other systems - and could be hashed using MD5 to make it more obscure - but it is still built from input data even if the transaction isn't fully completed.
Let's see what you think - I'll share my point of view a bit later (along with pointers to reference material). I certainly don't pretend to be a lawyer or a professional of ethics, I'm just an analyst with some experience. Those three cases are inspired by real situations.
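To make the third case concrete, here is a minimal sketch of how such a key could be built and then obscured with MD5. Note that MD5 is a one-way hash rather than encryption. The function names and the input values are made up (chosen only so the result matches the sample key above), and the postal code variant is used rather than the zip code one.

```python
import hashlib


def build_key(last_name: str, first_name: str, postal_code: str, phone: str) -> str:
    """First 3 letters of last name + first letter of first name
    + last 3 characters of postal code + last 4 digits of phone number."""
    postal = postal_code.replace(" ", "")
    digits = "".join(ch for ch in phone if ch.isdigit())
    return (last_name[:3] + first_name[:1] + postal[-3:] + digits[-4:]).upper()


def obscure(key: str) -> str:
    """Return the MD5 hex digest of the key (a hash, not reversible encryption)."""
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Made-up inputs chosen only to reproduce the sample key from the text.
key = build_key("Hamel", "Stephane", "X0X 4C0", "555-555-2637")
hashed = obscure(key)
```

Hashing makes the key unreadable to the eye, but the same inputs always give the same digest, so it can still be used for lookups against other systems, which is exactly what the questions above are about.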

The question is... what would you do?

If you are up for it, head over here to vote and comment.

Monday, March 14, 2011

A major web analytics agency is born: Cardinal Path

Today at the eMetrics Marketing Optimization Summit, my good friends Alex Langshur, President and Founder of PublicInsite, John Hossack, President and CEO of VKI Studios, David Booth and Corey Koberg of WebShare announced they are joining forces to become one of the most significant players in the digital analytics world.


With offices in Ottawa, Vancouver, Boston, San Diego, Burlington, Phoenix, Mountain View and Chicago, Cardinal Path unites an impressive team of thought leaders across a range of disciplines, authors, speakers and top business consultants. Justin Cutroni of WebShare, a well known figure in our field comes to mind, but there are also Brian, Michael, Kent, Ken and Scott - about 35 of them!

I've known Alex & John for a long time and I have utmost respect for both of them. Their professionalism and the expertise they have built through their respective agencies is outstanding. Their success stems from hard work, obviously, but I've found in both of them that undefinable human touch, sense of ethic and respect. Interestingly, as an independent consultant, I was lucky to have both of them be what I called "angel advisors" - sounding boards for all my crazy ideas, but also sharing and collaborating on anything "analytics" related.

While there have been many mergers on the vendors side, this is the first big one on the services side and the strengths of the various partners are very complementary:
  • private, public, non-profit, education sectors; with several flagship clients like NBC, Harvard University, Library of Congress, Electronic Arts, Virgin, etc.
  • deep ecommerce expertise with leading brands, and boatloads of knowledge for non-commerce, lead-gen and brand based sites;
  • "build for success" approach - the new firm has site design/capability that enables them to architect the success elements and ensure full end to end visibility.

I want to be among the first to wish them success in what will undoubtedly be a great adventure and bright future!

Thursday, March 10, 2011

The math behind web analytics: mean, trend, min-max, standard deviation

In the second installment of this series, we will leverage Excel to take over where Google Analytics left us.
  1. The math behind web analytics: the basics

Basic charting in Excel

The very first thing to do is to show the data as a simple line graph. For this post, I simply used visits to my blog in January of 2011. After some minor visual adjustments we end up with something like this:
Figure a: simple Excel charting
Figure b: time series (visits)
There are already some striking things: peaks & valleys corresponding to weekdays and weekends, and a week apparently performing better than others. Now we can easily apply some basic statistics on our time series.

Mean

The mean [wikipedia] is often referred to as the "average", which, in reality, is the "arithmetic mean". This is very simple math: add all the numbers and divide by the number of data points.

Look at Figure c - what can you tell about the red line crossing the whole graph? In a time series like daily visits for a month, honestly... we can't tell much! Yet, only averages are reported by most web analytics tools - so please, don't even bother saying "the average number of visits this month was X"!
Figure c: showing mean, trend, min & max and control limits.
Learning point: The average is rarely a good indicator in a time series such as those found in web analytics because it is influenced by extreme values (known as outliers [wikipedia]). At best, in the case above, one might want to calculate the mean for weekdays and the mean for weekends. As a rule of thumb, if you have fewer than 30 data points, use the median.
Figure d: descriptive statistics

Median and mode

The median [wikipedia] is the middle value. The mode [wikipedia], on the other hand, is the value appearing the most frequently. Again, in a time series, where the spread of values (the standard deviation explained below) is large, those descriptive statistics [wikipedia] (Figure d) are usually of little interest.
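For readers who would rather check this outside Excel, here is a small sketch of the mean, median and mode, plus the weekday/weekend split suggested earlier. The visit numbers are made up (two weeks, Monday to Sunday, with one outlier day) and are not the data behind the figures.

```python
from statistics import mean, median, mode

# Made-up daily visits for two weeks (Mon..Sun, Mon..Sun), not the author's data.
visits = [320, 340, 335, 340, 300, 110, 95,
          330, 900, 345, 340, 310, 120, 100]

overall_mean = mean(visits)      # pulled up by the 900-visit outlier
overall_median = median(visits)  # middle value of the sorted series
most_common = mode(visits)       # value appearing most frequently

# Split by position in the week: indexes 5 and 6 of each week are the weekend.
weekdays = [v for i, v in enumerate(visits) if i % 7 < 5]
weekends = [v for i, v in enumerate(visits) if i % 7 >= 5]
weekday_mean = mean(weekdays)
weekend_mean = mean(weekends)
```

With these numbers the overall mean is about 306, a value that describes neither a typical weekday (mean 386) nor a typical weekend day (mean 106.25).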

Min & max

The min and max values are... well... the minimum and maximum values in a time series. Those could be qualified as "anecdotes" - we could be thrilled we've got so much traffic on a single day, or disappointed by a poorly performing day, but knowing that has absolutely no value if we can't explain why.

In the time series used in this example, the min value is 93 visits on Saturday, January 1st. What can we tell about that? Obviously, people were busy doing something other than visiting my blog. What happened during the 4th week, around January 25? I shared my views about our little web analytics community and recounted my contributions. In both cases, we have very plausible explanations and the min & max values were useful only because they made us ask "why?".

Trend

To me, the linear trend [wikipedia] (shown as a dotted line in Figure b) is one of the interesting modeling stats because it marks the beginning of our regression analysis [wikipedia] capabilities - our ability to explain the why's and "this, therefore that". Basically, it can help us do some predictive analytics (albeit very simple). Remember y = mx + b? That is, the position of a point on the y axis (the visits) depends on x (the day) multiplied by a factor m (the slope), plus a starting baseline b (the intercept). I can tell, based on historical data, that I should get approximately 350 visits next Tuesday.
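
As a rough illustration of y = mx + b, the sketch below fits a straight line by ordinary least squares and projects it one day ahead. The visit numbers and the linear_trend helper are hypothetical, and I am assuming the days are simply numbered 0, 1, 2... A spreadsheet trend line computes the same kind of fit.

```python
def linear_trend(values):
    """Least-squares fit of y = m*x + b, where x is the day index 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Made-up daily visits showing a gentle upward trend.
visits = [300, 310, 305, 320, 330, 325, 340, 350, 345, 360]
m, b = linear_trend(visits)

next_day = len(visits)  # index of the first day after the series
forecast = m * next_day + b
print(f"slope: {m:.2f} visits/day, baseline: {b:.1f}, forecast: {forecast:.0f}")
```

Keep in mind that a straight line ignores the weekday/weekend pattern, so a forecast like this is only a first approximation.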

Standard deviation

If we do max - min we get the range [wikipedia], another descriptive statistic. Interesting at best. What's much more interesting is the standard deviation [wikipedia] - the variability of the data. As we've seen, the average isn't of much use because it is largely influenced by outliers. Standard deviation gives an appreciation of the spread of values around the mean, or if you prefer, the variation in a distribution of values.

Why is this important?

Figure e: control limits at +/- 1.5 sigma
First because standard deviation will be used to set control limits [wikipedia] (Figure e) - which in turn will be useful to define our tolerance and targets (covered in a later post). While control limits are typically set to +/- 3 times the standard deviation from the mean - I have found +/- 1.5 times (for a total of 3) to provide a better and easier indicator of values going below or above our historical track record (shown as the grayed area in Figure b). Basically, it gives us an easy way to set alerts when our metric might be going out of whack!

Secondly, a large variation is an indication of an unstable process (think conversion rate), or low reproducibility (anecdotal campaign success), or if you prefer, a larger standard deviation reduces our ability to predict the value of Y given a certain X. Basically, as analysts, we want to explain the past, but we also want to provide insight on how to fix issues and seize opportunities - we eventually want to be able to predict outcomes of our recommendations.
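
Here is a minimal sketch of the control limits idea: compute the mean and standard deviation of a historical series, then flag new values falling outside mean +/- 1.5 sigma. The function names and numbers are hypothetical, and using the sample standard deviation (statistics.stdev) rather than the population one is my assumption.

```python
from statistics import mean, stdev


def control_limits(history, k=1.5):
    """Return (lower, upper) limits at mean +/- k standard deviations."""
    center = mean(history)
    sigma = stdev(history)  # sample standard deviation (an assumption)
    return center - k * sigma, center + k * sigma


def out_of_limits(history, new_values, k=1.5):
    """List (position, value) pairs of new values outside the control limits."""
    lower, upper = control_limits(history, k)
    return [(i, v) for i, v in enumerate(new_values) if v < lower or v > upper]


# Made-up historical daily visits, then three new days to check.
history = [100, 110, 90, 105, 95]
lower, upper = control_limits(history)
alerts = out_of_limits(history, [102, 130, 85])

print(f"limits: {lower:.1f} to {upper:.1f}")
print("alerts:", alerts)
```

Passing k=3 gives the more conventional limits; the narrower 1.5 simply raises alerts sooner.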

Coming up: normal distribution, histogram and box-plots

In the next installment we'll look at what a histogram is, as well as the normal distribution and their impact on our analysis. Also, although nifty spinning 3D-shadowed-shiny-Flash graphs are impressive... we'll look at the box plot, an elegantly simple yet powerful and under-used visualization tool.

What do you think of this series so far? What would you like to see discussed, and are there any examples you would like to see?

Monday, March 7, 2011

The math behind web analytics: the basics

Introduction

I tutored about 700 students enrolled in web analytics and business analysis classes at UBC and nearly a hundred in the new graduate-level class in online analytics I'm teaching at Laval University. Most students at ULaval are enrolled in an MBA specializing in ebusiness or marketing - one day, they will manage organizations and leverage analytics to make better business decisions. In the meantime, questions and assignments are an endless source of inspiration and challenges to solve.

This is the first post in a series entitled "the math behind web analytics". The idea stems from a question posted by a UBC student to the Yahoo! Web Analytics forum: "what mathematics does a web analyst need to know?" I was somewhat baffled by the replies: "plus, minus, min, max, average... not much practical use of it (mathematic/statistics) within web analytics" or "simple counts or averages" and of course, "percentage... because that's what you'll use most often, e.g. with KPIs, conversion rates, etc".

What?!

Assignment: basic analysis

One of the first assignments in the ULaval class is simply stated as "analyze the visits to website XYZ" and the students are provided with a data set.

Learning point: When referring to a metric broken down by a time-based dimension, we refer to a "time series": a sequence of data points measured at uniform time intervals.

In this first post we address what appears to be easy and obvious: graphing the data.


All web analytics tools provide basic visualization functions, as in the example shown above from Google Analytics. This graph shows visits by month.

Learning point: Notice how I used thirteen months of data instead of twelve. This is especially important to be able to compare year-over-year and more easily spot seasonality. Basically, we should always include at least one additional period in our analysis. Here, we clearly see an upward trend and certainly some year-to-year progress.
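
The reason for the thirteenth month shows up as soon as we do the arithmetic: with thirteen points, the first and last values fall on the same calendar month, so the comparison is like-for-like. A small sketch, with invented monthly totals and a hypothetical helper name:

```python
def year_over_year_change(monthly_values):
    """Relative change between the latest month and the same month a year earlier."""
    if len(monthly_values) < 13:
        raise ValueError("need at least 13 months to compare the same month year-over-year")
    latest = monthly_values[-1]
    same_month_last_year = monthly_values[-13]
    return (latest - same_month_last_year) / same_month_last_year


# Made-up monthly visits: thirteen consecutive months, oldest first.
monthly_visits = [4000, 4200, 4100, 4500, 4300, 3900, 3800,
                  4400, 4700, 4900, 4600, 4800, 5000]

change = year_over_year_change(monthly_visits)
print(f"year-over-year change: {change:+.1%}")
```

With only twelve months, the closest available comparison would be between two different calendar months, which mixes growth with seasonality.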

However, a monthly breakdown hides some interesting elements. When the same data is shown by day, we can see something slightly different:

The trap

This is the extent of visualization you'll get in most tools. And most would-be analysts will report something like "there were X visits, and the average was Y visits/day" simply because this is what the tool says. Some will mention an upward trend but won't be able to quantify it. At best, a few will switch from the monthly to the daily view and mention the very common weekday/weekend pattern.

Most importantly, I rarely see an explanation for what happened where we see spikes of traffic - which, in this case, are explained by marketing and external, business-related events.

Coming up: Excel to the rescue

In the next installment of "the math behind web analytics" we'll use Excel to do some basic analysis.

Monday, February 28, 2011

Advisory to the Higher Education Marketing agency

I'm proud to announce I will play an advisory role to Montreal-based agency Higher Education Marketing, which offers specialized online marketing services to colleges & universities across Canada: web strategies & development, analytics, SEO, SMS alerts, campaigns management, social media and mobile marketing, etc. See "Stéphane Hamel to Play Advisory Role to Higher Education Marketing" for the official press release.

Founder and CEO Philippe Taza inquired about the concepts and approach put forth in the Online Analytics Maturity Model. He said "the OAMM provides a clearly defined way to assess our clients' analytical capabilities and allows us to leverage their strengths while addressing their weaknesses - leading to a longer and more profitable client relationship".

Advisory role to agencies

The advisory role I offer to agencies is a win-win-win scenario:
  1. Partners can tap into more than 20 years of work experience on countless projects, the past 15 years dedicated to online strategies, analysis and bringing the most optimal and realistic solutions.
  2. Clients & prospects have the guarantee I will provide the best advice, and if I can't do it myself, I will leverage the growing network of partners to offer them the most appropriate alternative.
  3. For me, it's a way to remain independent while extending my capabilities and leveraging the strengths of the network.

A growing network

Last November, Napkyn became the first official partner agency; HEM is the second, and you can expect others to be announced in the coming weeks and months. This is directly in line with my desire to see the Online Analytics Maturity Model being used and officially adopted by more agencies - while allowing me to finance & continue to improve it. Of course, while doing so, those agencies also benefit from a sounding board, my expertise and my skills.

On the vendor side, iPerceptions was a natural partner since they bought WASP. In January I announced a similar advisory role with TagMan.

My goal is to continue to nurture and grow a network of vendors, agencies and other parties interested in exchanging and collaborating. If you would like to be part of it, don't hesitate to reach out!

Wednesday, February 9, 2011

Plagiarism in web analytics academia

Background

I have been tutoring the UBC Award of Achievement in Web Analytics program since 2007 - 700 students and counting. Add another 100 in the new graduate-level online analytics class I'm teaching at Laval University (Quebec City) and my own experience as a student.


Think of students' ability to become analysts as a normal distribution. You get a couple of students who are really outstanding - I would hire or recommend them anytime. At the lower end, there are a few students who are struggling - not because they can't do it, but maybe because they lack some experience, rigor and discipline, or their profile is radically distant from the online world. Some are in this program to understand WA but not actually do it, i.e. their job requires that they can work with web analytics but it is not part of their job responsibilities - sometimes this is the reason they sit at the low end: analysis is not what they want to do, or ever will do, as a career. Then, in the middle, you get the majority of students who have the potential to become good analysts.

Picture from annaOMline at stock.xchng
Then there are the "others" - those who are not only outliers, but are liars to themselves. Over the last couple of months I have seen the plague of plagiarism spreading.

The "easy now" syndrome

In "Undermining our future as web analysts" I was referring to a study highlighting the neurological changes happening in our brain as a result of quick problem solving abilities. I mentioned the following:
"In our search for immediate gratification we are quickly going into the tactical and forgoing the strategic aspect of analytics and longer term business optimization."
Some students are pretty efficient at conducting online "research" - finding relevant resources from blogs to support their argument - this is fine and valid. However, the "easy now" syndrome makes it sound like a couple of Google searches and a couple of well selected ctrl-c/ctrl-v can do the trick - this is called "plagiarism". The quick tactic to deliver an assignment is the wrong strategic choice for your career.

What is plagiarism?
UBC's Faculty of Arts has a great article on the topic entitled "Plagiarism avoided: taking responsibility for your work". From this article:
"Plagiarism is a form of academic misconduct in which an individual submits or presents the work of another person as his or her own (UBC Calendar, 44). Simply put, plagiarism is taking the words or ideas of another person, and submitting them without the proper acknowledgement of the original author."
They distinguish two forms of plagiarism: "complete", where the entire essay is copied from one or multiple authors, and "reckless", noting that "reckless plagiarism is often the result of careless research, poor time management and lack of confidence in your own ability to think creatively".

What I see - and the consequences

Academic rigor: Assignment quality varies a lot - a long litany of words without any document structure or formatting is much more susceptible to improper citations and references. I constantly remind students to carefully read the assignment guidelines and start with an empty skeleton with a cover page, an intro, each of the assignment points to be addressed, a conclusion and room for bibliography.

In this case, a strong warning is often sufficient and I consider it to be part of the learning process. Plus, if you can't structure your work and clearly communicate, how will you perform as an analyst?

Wait a minute! There are those phrases you read and you think "wait a minute, there's something odd here" - the writing style is too different, or it reminds me of something I've read somewhere else. Picking a couple of phrases at random, a quick Google Search, and bingo! I can very easily find the reference. As Michele Hinojosa mentioned on Twitter, "Does it not occur to them that you've probably read everything they could plagiarize?"

When grading an essay, especially when there is no clear right or wrong answer, I look at the thought process, arguments and supporting references, as well as overall quality.

Is it a single occurrence of a phrase or short paragraph wrongly cited, or is it a blatant and significant appropriation of someone else's work? I have no pity for the second scenario - automatic 0, advise the student so he/she can explain, and refer the situation to a formal review committee within UBC.

The consequences: The obvious consequence is failing the assignment or the whole course. Depending on the circumstances we can give the opportunity to do a make-up project. The extreme case is being suspended from the program and having a note of misconduct put on the student's permanent transcript.
"Plagiarism is a serious issue at the university and will not be taken lightly should it occur."
Winnie Low, Program Leader
Technology, Media and Professional Programs
UBC Continuing Studies

Plagiarism and the WAA Code of Ethics

While we are pushing for a WAA Code of Ethics, every resurgence of plagiarism is painful - not only as someone involved in academia, but also as an analyst endorsing ethical beliefs of privacy, transparency, consumer control, education and accountability. If, as students, they play the "easy now" game, what can we expect of liars and cheaters when they become analysts?

Plagiarism in the web analytics industry: how many times have I seen my work ripped off from my blog? There is a level of acceptance. What about my Excel dashboard sample or the Online Analytics Maturity Model being repurposed by unscrupulous freelancers and agencies removing credit and charging their clients? Plagiarism or fair use?

Don't hesitate to share your thoughts - have you played the "easy now" game? Do you think plagiarism is a serious issue?

Other resources:
UBC Regulation on Plagiarism
Plagiarism.org

Tuesday, January 25, 2011

What is one person's impact on the web analytics community?

I especially enjoyed reading Kevin Hillstrom's article this morning: "Hashtag Analytics: Removing a Member of the Community". Not only because Kevin is one of the top marketing analysts and writes great and useful articles, but especially because I was the subject of an experiment in the small Twitter #measure community.

What is the significance of "2%"?

I'm not sure which is most disturbing to me: "fine young man" or "about 2% of the community no longer participates".

I think Kevin's analysis is robust when it comes to the small #measure community on Twitter. However, it's also a bit disturbing to me...
  • 2%: is it a low or high community impact? It's hard to tell without some benchmarking
  • there is no "multiplicity" aspect to the analysis; it doesn't account for any other activities but Twitter.

It's not about me, it's about the community

When Eric Peterson mentioned several times "It’s not about you, it’s about the community..." on Twitter and in a blog post, there were some subtle messages that ended up creating some level of discomfort and misunderstanding in the community. Myself included.

First, we have to wonder: what is the "web analytics community"? Is it the Twituniverse? Is it only what happens online and is visible to most? Kevin's analysis doesn't account for the non-Twitter activities that might be contributing even more to the growth of our industry. Once we understand this limit we can better appreciate the quality of his work.

For example, I salute Eric's willingness to recognize others' contributions in our community and I was sincerely touched when he said Stéphane "brings an enthusiasm to his work in the web analytics community that few can match and so I appreciate his passion". The recent initiative from Jason Thompson to help bring water to local communities in need - although it has little to do with our web analytics community - is a great cause and several people in the #measure community were happy to contribute. This exemplifies a community of interest that extends toward other causes and important subject matters.

What do I (and you) want from the community?

It's not about "me", but it is certainly about a passion I have for what I do. In "The best promotion I never got: My new year’s resolution advice", Rommil Sandiago puts it this way:
  • I want to make my mark
  • I want to be recognized for my efforts
  • I seek to not only challenge the status-quo but achieve results in doing so
  • But most importantly, I want to be trusted with something strategically important
Regardless of our experience and how involved we are in the community I think most of us are motivated by a sense of accomplishment. At least for me, accomplishment is a much stronger motivator than money!

Accomplishments: personal

While I was enjoying a great job with very good compensation, I still wanted to accomplish something more. One of the key accomplishments of my career was to take online classes to do an eBusiness MBA while working, taking care of my family and starting my own business. It took me six years to get my degree. Despite the fact I didn't have an undergraduate degree, I made the honor roll twice - and I'm now teaching a graduate class on web analytics, which I'm especially proud of.

Make a mark; be recognized; challenge status-quo and shift toward a more strategic role.

Accomplishments: for myself, for the community

Helping the "community" can take different shapes and forms - be it our little #measure community, raising money to provide water in poor countries as Jason did, or being a volunteer medical first responder - as I did - but that's not the point... Here are some of the personal accomplishments I'm proud of, most of which had an impact on the web analytics community:
  • Web Analytics Solution Profiler: I pioneered the first true "in context" QA tool for web analytics in 2006. Since then, other products have emerged, be it tools like Ghostery which allows you to block web analytics tools or ObservePoint which came out and offered a slightly different approach. Note that WASP is now fully owned by iPerceptions. Maybe, in some ways, I have contributed to an increased awareness of the importance of tagging quality and made a little step toward Tag Management Systems - thus my interest and recent announcement of my involvement with TagMan.
  • Online Analytics Maturity Model: see the little story behind the Online Analytics Maturity Model.
  • gaAddons: enhancements for Google Analytics, is more recent and is gaining great momentum.
  • Web Analytics Association: I've been deeply involved with the WAA - to the extent of publicly taking a stance when folks were bashing it (think member value, globalization and Certification). Volunteering on a number of committees like Education and Certification. Spending two years on the Board and as Treasurer and countless hours doing volunteer work "behind the scenes".
  • Web Analysts Without Borders: Adam Laughlin, myself and a bunch of volunteers gave their time to literally become team members of SaveTheChildren.org.
  • UBC Award of Achievement in Web Analytics: over the years, I have tutored over 650 students and the feedback is always very positive. Students like the fact that I go beyond the course content and share my experience, tips & tricks, and they like my teaching style.
  • ULaval MRK-6005 web analytics, graduate level class: this is another way of getting involved in the community. I could spend more time consulting - it would be much more profitable - but teaching isn't just about sharing my knowledge, it's also an amazing way for me to continue to learn. Also, as part of the learning process, we helped the Sainte-Justine UHC Foundation.
  • Web Analytics Wednesdays local meetups: I started doing Web Analytics Wednesday in 2006 but eventually realized the WAW solution wasn't right for me: it was in English while my audience is French, and I could just as well find local sponsors and email members in my community directly.
  • Web Analytics Canada - Québec LinkedIn Group: I created the group about a week ago and it's already over 100 members strong. A small step toward an official WAA Chapter! This is in addition to the 400+ subscribers to my "Analytique Web au Québec" newsletter.
  • eMetrics and other conferences: with San Francisco and Toronto coming up, it will be my 15th time speaking at eMetrics. Being part of the community also means I invested time & money to get there, share and learn.

Parting thoughts

In this day and age of social media and ease of getting all kinds of metrics, it's easy to fall for popularity contests and boast about being among the top in any given industry. At the end of the day, when finally getting to bed after a long day of work, can we look back and feel we've accomplished something useful and positive for ourselves, our family and our community?

No metric will tell you the answer.

Monday, January 24, 2011

gaAddons v2.1.2 released!

The gaAddons user base is growing and I'm happy to announce the general availability of gaAddons v2.1.2.

Enhancements

Bug fixes

  • _trackDownload & _trackOutbound are now "stackable" - meaning you can have them appear multiple times in a single push() call.
  • Fix CSS3 selector for _trackOutbound, _trackMailTo, _formAnalysis
  • _setXDomain now adds parameters only when the click/mouseup event is fired instead of automatically changing all links (which caused issues when the page was reloaded)
  • _setXDomain now excludes email links automatically
  • Removed support for "area" elements in _trackOutbound and _trackMailTo

Ideas for the next release

Now that the code base is pretty strong & stable, I will be able to shift gear and work on other cool features such as:
  • Internal campaign tracking
  • Social media tracking (like the Facebook like, ShareThis, AddThis and such)
  • YouTube video tracking
  • eCommerce micro-format support
Please let me know which ones you would like to see first - or any other ideas!

Using gaAddons? I want to hear from you!

The user base is growing steadily, with the addition of over 125 new site owners testing or already using gaAddons in just a month. I'm receiving lots of positive feedback and I'm grateful to the early adopters who continue to help with the beta release cycles.

I'm tracking the number of downloads and obviously keeping an eye on sales, but one of the things I'm not doing very efficiently right now is keeping tabs on which sites are really using gaAddons. This will change in the coming weeks with the addition of a Google authentication requirement and some cool benefits for subscribers.

In the meantime, if you have stories to share about how gaAddons helped you I would love to hear them!

Friday, January 21, 2011

Joining the Tagman advisory board

Paul Cook, CEO and founder of TagMan, a Tag Management System (TMS) with real-time attribution, announced that Mike Peralta, COO of search re-targeting business Magnetic, and I have joined the company's advisory board. The announcement comes a few days after a Series A funding round.

To celebrate the fundraising, Tagman released a funny short video spoofing Star Wars (see it on YouTube).


I'm really excited to join the other members of TagMan's advisory board: John Marshall, founder of ClickTracks and Market Motive, providing training courses and certification in online marketing; Calvin Lui, former President & CEO of Tumri; and Tom Sipple, a Vice President at Interactive Corporation leading the monetization strategy, direct sales, aggregator partners, mobile and advertising operations groups for Dictionary.com

About my role with TagMan, Paul said "Stéphane is a leading voice in the enterprise analytics technology community" - I'm bringing my experience in resolving tagging challenges and will be sharing my views of the evolving web analytics market. TagMan pioneered the tag management system space and has developed a robust and effective approach. This is a logical step; with WASP, the Web Analytics Solution Profiler (now owned by iPerceptions), I pioneered the concept of “in-context quality assurance of tags”. I believe WASP contributed to increased awareness of the need for serious tag management solutions.

This role is directly aligned with my 2011 objective of strengthening my position as an independent thought leader. I'm already playing an advisory role to iPerceptions and the Napkyn agency in Ottawa. I'm also providing feedback and comments to a number of startups and other announcements are on the way.

Leveraging the Online Analytics Maturity Model approach, I'm also coaching several agencies that want to further develop their web analytics practice. Of course, I still offer my services directly to clients; however, in 2010 I had to pass on great opportunities because I was overloaded. I was often solicited for consulting or simply for guidance to select an agency or vendor. Especially for agencies, the non-exclusive agreements guarantee prospective clients will be directed to the partner best suited to answer their needs - based on their maturity, geographic region and verticals. This is creating a healthy and dynamic ecosystem where everyone is winning.

Don't hesitate to contact me to know more about Tagman or the services I can offer - be it for maturity audit and guidance or advisory role to vendors and agencies.