David Brooks, an author and a columnist over at nytimes.com, recently wrote an article titled The Philosophy of Data where he says: “If you asked me to describe the rising philosophy of the day, I’d say it is data-ism.”
And I couldn’t agree more. Companies want to be data-driven. This means making decisions based on data rather than gut feeling, and thus increasing the quality of those decisions. Makes sense. But unless all the online marketers who claim they increase your ROI by 300% through data-driven decisions are secretly statisticians, this is not as easy as it sounds.
While the advantages of basing your arguments and decisions on data are obvious, there are great risks tied to it, too. I believe these risks can be categorized into collection risks and interpretation risks.
The first mistakes, the ones that lead to dirty and biased data, can happen during data collection. Today more and more data is unstructured (e.g. Tweets, Facebook Posts, Images, Videos) and thus needs cleaning before analysis. If you fuck that up, your whole analysis and decision will be just plain bullshit. Shit in, shit out, as @anbraendle would say.
Another problem, and one far more familiar to the average online marketer and Google Analytics guru, is attribution modeling: last click, first click, etc. Attribution modeling basically tries to solve the issue of our multi-device, multi-touchpoint online marketing world. Example: User sees your AdWords ad on his mobile, clicks, surfs, leaves. Two days later he sees your display banner on his favorite news page on his MacBook, clicks, checks your page out, leaves. One week later he sees your LinkedIn ad on his iPad, clicks, buys your bloody product. Do you now credit this conversion to your shitty LinkedIn campaign and increase your spend because you think it is oh-so efficient? Or do you credit it to your AdWords mobile campaign? Would the user have bought your product even without the LinkedIn campaign? Did the click on his MacBook matter? Would he have bought your product if he had viewed LinkedIn on his MacBook? You get the point. Read one way of solving this problem here.
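To make the attribution question concrete, here is a minimal Python sketch that runs the journey from the example through three simple rules. First click and last click are the models named above. The linear (equal split) model is my own addition for comparison, and the function names and data layout are made up for illustration, not taken from any analytics tool.

```python
from collections import defaultdict

# The journey from the example, in chronological order: (channel, device).
journey = [
    ("AdWords", "mobile"),
    ("Display banner", "MacBook"),
    ("LinkedIn", "iPad"),
]

def first_click(touchpoints):
    """Give all credit for the conversion to the first touchpoint."""
    return {touchpoints[0][0]: 1.0}

def last_click(touchpoints):
    """Give all credit for the conversion to the last touchpoint."""
    return {touchpoints[-1][0]: 1.0}

def linear(touchpoints):
    """Split the credit equally across all touchpoints (added for comparison)."""
    credit = defaultdict(float)
    share = 1.0 / len(touchpoints)
    for channel, _device in touchpoints:
        credit[channel] += share
    return dict(credit)

for name, model in [("first click", first_click),
                    ("last click", last_click),
                    ("linear", linear)]:
    print(name, model(journey))
```

Same journey, same single sale, three different answers to "which campaign worked?". None of these rules tells you whether the user would have bought anyway, which is exactly the problem.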
I hope most of the people working in online marketing had one or two courses on basic statistics back at university. At that time, you didn’t know what to use this shit for, so you didn’t make an effort to remember all those distributions and Central Limit Theorems your fugly prof was telling you about. And then data-driven thinking came along. Now you wish you still remembered some of that stuff (at least I do). My point is, if you want to act data-driven you need to be able to understand and interpret the data correctly. And by correctly I mean based on science, maths, statistics. You remember that n > 30 means the data is always statistically significant? Well, you’re wrong. Read this. So, back to my point: As long as you’re not a good statistician, you can’t really help yourself, your company or your clients with data-driven decisions.
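A quick way to convince yourself that n > 30 guarantees nothing: the sketch below runs a two-proportion z-test on a made-up A/B test with 40 visitors per variant. The numbers are hypothetical, the helper name is mine, and the normal approximation is rough at counts this small, so treat it as an illustration rather than a recipe.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of "no difference".
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: variant A converts 5 of 40 visitors, variant B 8 of 40.
p_value = two_proportion_p_value(5, 40, 8, 40)
print(f"p-value: {p_value:.2f}")
print("significant at 5%?", p_value < 0.05)
```

Both groups have n > 30, and B converts 20% against A's 12.5%, yet the p-value lands well above 0.05. Sample size alone doesn't make a difference significant; the size of the effect and the noise in the data matter just as much.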
How does one solve this conundrum?
Easy. Educate yourself! Buy books (e.g. Head First Data Analysis is awesome), watch some online lectures on statistics (check out Introduction to Statistics at Udacity), listen to some podcasts or read Wikipedia articles. No one remembers the stuff from university. Humans forget. Therefore, we need to re-learn. And this time, it’s more fun!
So stop claiming that you’re a data-driven superhero before you can ace basic statistics. Because that’s just a big fat lie. But if you learn it and can interpret the data correctly, you’re going to make a significant difference to the quality and outcome of your work.