Tuesday, March 29, 2011

Web analytics ethics: from theory to practice

A week ago I published three short cases and invited people to comment on whether each one was legal, ethical and compliant with their web analytics vendor's Terms of Service (TOS). The cases were inspired by my own experience. After so much talk about the WAA Code of Ethics, the sessions at the recent eMetrics and the discussions I had with some vendors, I thought participation would be much higher.

Here’s my point of view and some info from the brave souls who were up for the task! You should really read the previous post before continuing!

Photo: stock.xchng
Disclaimer: I'm neither a lawyer nor an ethics specialist. This information is provided as is... do your homework!

The majority of the 14 respondents were from the US and UK with some participants from Canada and other European countries. Unsurprisingly, most respondents said they were using Google Analytics.

Case #1: matching transaction id against back-end

| | Unsure | No | Yes |
|---|---|---|---|
| It is legal | 20% | 0% | 80% |
| It is acceptable based on my TOS | 27% | 20% | 53% |
| It is ethical | 13% | 13% | 73% |

In my opinion, this is perfectly legal – the data was collected with user consent in the context of a commercial relationship. It is also ethical – it is common and accepted to send a “thank you” email, along with the purchase details and some offers. The fact that it is sent through traditional snail mail doesn’t matter – or does it? Since the transaction was done online, there is usually an expectation that communications will also be conducted online. As one of the respondents put it, “At the end of the day, 'ethical' depends more on your relationship with your customer than anything else”. As for the TOS, every serious tool vendor specifically prohibits sending Personally Identifiable Information (PII) to its system.

A transaction id, which is clearly not PII, is typically set by your back-end system and stored in your web analytics service of choice. Since this piece of data comes from your own system and is only used to merge back against it, there is generally no TOS issue – except with the Google Analytics TOS! (emphasis mine)
7. PRIVACY . You will not (and will not allow any third party to) use the Service to track or collect personally identifiable information of Internet users, nor will You (or will You allow any third party to) associate any data gathered from Your website(s) (or such third parties' website(s)) with any personally identifying information from any source as part of Your use (or such third parties' use) of the Service. You will have and abide by an appropriate privacy policy and will comply with all applicable laws relating to the collection of information from visitors to Your websites. You must post a privacy policy and that policy must provide notice of your use of a cookie that collects anonymous traffic data.
Repeat: "You will not associate any data gathered from your website(s) with any personally identifying information from any source as part of your use of Google Analytics". Essentially, if you use Google Analytics, you should not extract transaction ids to merge them back against your own system. This makes no sense to me, and I know of several organizations that are actually doing it – probably without realizing they are breaking their GA TOS. Let’s hope this will be revised.
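For tools whose TOS do allow it, the merge itself is simple. Here is a minimal Python sketch of the idea, not anyone's actual implementation: the field names (`transaction_id`, `source`, `pages_viewed`, `order_total`), the ids and the values are all made up for illustration, and the back-end side deliberately carries no PII.

```python
# Hypothetical exports: every field name and value here is illustrative only.
analytics_rows = [
    {"transaction_id": "T1001", "source": "email", "pages_viewed": 7},
    {"transaction_id": "T1002", "source": "organic", "pages_viewed": 3},
    {"transaction_id": "T9999", "source": "paid", "pages_viewed": 5},  # no back-end match
]
backend_orders = [
    {"transaction_id": "T1001", "order_total": 120.50},
    {"transaction_id": "T1002", "order_total": 35.00},
]


def merge_on_transaction_id(analytics, orders):
    """Inner-join analytics rows with back-end orders on transaction_id."""
    orders_by_id = {order["transaction_id"]: order for order in orders}
    merged = []
    for row in analytics:
        order = orders_by_id.get(row["transaction_id"])
        if order is not None:
            # Combine behavioural fields with the matching order fields.
            merged.append({**row, **order})
    return merged


merged = merge_on_transaction_id(analytics_rows, backend_orders)
for row in merged:
    print(row)
```

Which back-end columns get pulled into the merged rows is where the TOS and privacy questions arise: an order total is harmless, a customer name or email is not.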

Case #2: matching product id (SKU) against back-end

| | Unsure | No | Yes |
|---|---|---|---|
| It is legal | 7% | 0% | 93% |
| It is acceptable based on my TOS | 27% | 0% | 73% |
| It is ethical | 7% | 0% | 93% |

Legal, ethical and no TOS issue. The key element here is that no PII is involved. From a business standpoint, what’s interesting is the ability to correlate behavioural data with sales in order to build a predictive model where we “know” which online behaviours are early indicators of upcoming sales, and can therefore adjust inventories accordingly.
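As a rough illustration of what "early indicator" could mean, the sketch below checks how well daily product views for one SKU correlate with sales a few days later. It is only a sketch under assumptions: the SKU, the daily numbers and the two-day lag are invented, and a plain Pearson correlation stands in for whatever predictive model you would actually build.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(views, sales, lag):
    """Correlate views on day t with sales on day t + lag."""
    if len(views) != len(sales):
        raise ValueError("views and sales must cover the same days")
    if lag < 0 or lag >= len(views) - 1:
        raise ValueError("lag must leave at least two overlapping days")
    if lag == 0:
        return pearson(views, sales)
    return pearson(views[:-lag], sales[lag:])


# Hypothetical daily figures for one SKU (no PII involved).
views_by_sku = {"SKU-123": [40, 55, 90, 120, 80, 60, 45, 50]}
sales_by_sku = {"SKU-123": [2, 3, 3, 4, 8, 11, 7, 5]}

for sku in views_by_sku:
    r = lagged_correlation(views_by_sku[sku], sales_by_sku[sku], lag=2)
    print(f"{sku}: correlation of views with sales two days later = {r:.2f}")
```

A high value at some lag only suggests that views lead sales by about that many days. It is a starting point for a model, not proof of cause and effect.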

Case #3: key created from (potential) PII without user consent

| | Unsure | No | Yes |
|---|---|---|---|
| It is legal | 33% | 20% | 47% |
| It is acceptable based on my TOS | 53% | 40% | 7% |
| It is ethical | 33% | 33% | 33% |

If I got it right, in the US a last name alone, a 5-digit zip code or the last digits of a phone number are not considered PII.

However, in California, OPPA specifies that data which is typically not PII becomes PII when combined with other data (such as having gender associated with a specific person). In Canada, the PIPEDA law stipulates that data must be collected with user consent and used only for the purpose it was collected for. In Europe, and especially Germany, a last name is PII (so are IP addresses and a whole bunch of things!).

Is it ethical? In this specific case, the data is stored even if the transaction isn’t fully completed. Therefore, this practice goes against the 3rd WAA Code of Ethics guideline: User Control. It is also against PIPEDA in Canada.

What about the TOS? In general, this wouldn’t be an issue, and it doesn’t really matter if this string is further encoded to obfuscate it. However, the Google Analytics TOS still doesn’t allow us to use this key to merge with any other data that could contain PII.
In airports, the standby list typically shows the first three letters of the last name and the first letter of the first name.
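To see why encoding the key doesn’t change much, here is a small Python sketch. The exact composition of the key is an assumption made for illustration (last name, 5-digit zip code and last four phone digits), as are the function names and the obviously fake customer record. The point is that a hash is deterministic: anyone holding the same source fields can recompute it and link the "anonymous" key back to a person.

```python
import hashlib


def build_key(last_name, zip_code, phone):
    """Build a matching key from partial identifiers (assumed composition)."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return f"{last_name.strip().lower()}|{zip_code.strip()[:5]}|{digits[-4:]}"


def obfuscate(key):
    """Hash the key; the output looks opaque but is fully reproducible."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Hypothetical back-end record (fake data).
crm_records = [
    {"customer": "customer-42", "last_name": "Tremblay",
     "zip": "90210", "phone": "(555) 010-1234"},
]

# The back-end can precompute the same hashed key for every customer...
lookup = {
    obfuscate(build_key(r["last_name"], r["zip"], r["phone"])): r["customer"]
    for r in crm_records
}

# ...so the hashed value stored in the web analytics tool still points to a person.
token_from_web = obfuscate(build_key(" tremblay", "90210", "555-010-1234"))
print(lookup.get(token_from_web))
```

In other words, obfuscation hides the raw values from a casual reader of the reports, but the key still works as a join key, and the join is exactly what the TOS clause is about.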

My take

While there are passionate arguments on "free vs paid" in the #measure tweet universe, I was sincerely disappointed that a topic like ethics and legality didn’t raise much interest. Is it because of a lack of interest? Fear of being wrong?

Either way, it makes me wonder if web analysts happily embrace the WAA Code of Ethics because it feels good and it's a worthy cause... or are just full of it! I guess what’s most important for now isn’t to know all there is to know about ethics, legislation and TOS, but to take action when inappropriate situations are uncovered.

I don't pretend to know more than anyone else; in fact, I'm willing to be wrong! If you have comments or additional useful references, I would love to hear from you!