I recently saw the following post on LinkedIn:
“95% of banks in the study have created innovation labs.”
This figure seemed extremely high to me. I did a quick-and-dirty survey of three banks in my circle. Not one of them has an innovation lab.
Nevertheless, I couldn't conclude that the author of the post was lying because I couldn’t find evidence of the use of the three techniques to lie with Big Data that I described a couple of years ago. Nor could I spot any sleight-of-hand in his analysis.
https://twitter.com/GTM360/status/997084561208209409
I was about to reconcile myself to the 95% figure when I read the following comment by fintech thought leader Alex Jimenez:
“95% of the banks with innovations labs have innovation labs (one closed when they ran out of beer).”
Hmmm. Alex had a point.
I then saw the following snarky interpretation by another fintech thought leader Ron Shevlin:
“No issue. It said 95% of the bank IN THE STUDY. Clearly, the study was a handpicked sample of banks that have innovation labs. And one bank that didn't.”
Ron's emphasis on “IN THE STUDY” gave me an epiphany: this could be a brand new way to lie with Big Data.
Let me call it:
#4. Pixie Dust Sample
In this method, you compile a cohort of members that supports your hypothesis. You then add a small sprinkling of truly random subjects in order to give the impression that your sample is representative of the population. When you run the survey on this sample, you'll obviously be able to prove whatever you set out to prove.
In a properly conducted survey, you'd draw a random sample of banks and ask each bank if it had an innovation lab. I'd expect such a survey to reveal that 15-20% of banks have innovation labs (notwithstanding the 0% result of my personal survey).
In the pixie dust sample method, you compile a sample by Googling for "banks with innovation labs". You ask each bank in that sample whether it has an innovation lab. You'd expect all of them to say yes. But, realistically, Google is not infallible, so its search results might contain erroneous entries for a few banks that don't have an innovation lab. Besides, a couple of banks might have shut down their innovation labs because they ran out of beer or for some other reason. Ergo 95%, not 100%, of banks in your survey would say yes.
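To make the contrast concrete, here is a minimal Python sketch that compares the two sampling approaches on a simulated population. Everything in it is an assumption for illustration: the 18% true rate is simply a value inside my 15-20% guess, the population and sample sizes are arbitrary, and the names `simulate` and `lab_share` are hypothetical. It models the "sprinkling" of random banks directly rather than Google's errors or closed labs.

```python
import random


def lab_share(sample):
    """Fraction of sampled banks that report having an innovation lab."""
    return sum(sample) / len(sample)


def simulate(population_size=10_000, true_rate=0.18, sample_size=200,
             sprinkle=10, seed=42):
    """Return (random-sample share, pixie-dust-sample share).

    All parameter values are illustrative assumptions, not survey data.
    """
    rng = random.Random(seed)

    # True means the bank has an innovation lab.
    population = [rng.random() < true_rate for _ in range(population_size)]

    # Properly conducted survey: a simple random sample of all banks.
    random_sample = rng.sample(population, sample_size)

    # Pixie dust sample: handpick banks already known to have labs,
    # then sprinkle in a few truly random banks for appearances.
    with_labs = [bank for bank in population if bank]
    handpicked = rng.sample(with_labs, sample_size - sprinkle)
    pixie_sample = handpicked + rng.sample(population, sprinkle)

    return lab_share(random_sample), lab_share(pixie_sample)


if __name__ == "__main__":
    random_share, pixie_share = simulate()
    print(f"Random sample:     {random_share:.0%} have innovation labs")
    print(f"Pixie dust sample: {pixie_share:.0%} have innovation labs")
```

Under these assumptions the pixie dust figure can never fall below 95%, because 190 of the 200 sampled banks were chosen precisely for having a lab. The random sample, by contrast, should land somewhere near the assumed true rate.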
----------
This LinkedIn post brought back memories of a fintech whose "pay later" product I'd trialed a while ago.
I got a call after a few months from a market research agency asking me if I'd heard about this deferred payment product.
I said yes.
The caller said thanks and hung up without asking any further questions.
I haven't heard about this fintech since.
Connecting the dots, I guess this is what happened behind the scenes:
The agency presumably called only people who, like me, had already trialed the product. Most of them - like me - said yes when asked if they'd heard about the pay later product. A few people might have forgotten about the product and said no. Ergo the survey found that "95% of people have heard about the fintech's pay later product".
The fintech was delighted to hear that its pay later product enjoyed such a huge amount of brand awareness and promptly reported this figure to its VCs. Since VCs tend to have short attention spans, the fintech's founders didn't get into the details of the survey methodology or the composition of the sample. The VCs felt that the fintech's brand awareness was very high and directed its founders to stop all marketing campaigns.
Unsurprisingly, the fintech has disappeared from the market.
Companies that use the three techniques described in my blog post How To Lie With Big Data can score benefits, at least in the short term, before they're exposed.
However, anybody who uses the fourth tactic outlined in this post would be committing harakiri (aka seppuku aka suicide) by lying with big data. As this fintech did - unwittingly or otherwise.
Notwithstanding the exact status of the said fintech or the accuracy of my conjecture of the behind-the-scenes events, the purpose of this post is to highlight the strong possibility that lying with Big Data can have the unintended consequence of killing the liar.
This content is provided by an external author without editing by Finextra. It expresses the views and opinions of the author.