Marketing Thought

The Danger of Data Mining

Is data analysis leading to bigotry? It is a sensitive subject. Data enthusiasts (and I’d probably be in this camp) hope analysis can get rid of silly ideas. After all, when we get better information we will be able to combat old prejudices. I am genuinely optimistic. Still, there is a major problem when your theory is weak. This is the danger of data mining.

Your Prejudices Impact Your Choices

That said, it is important to understand that your prior beliefs always impact your analysis. If you come at a problem with a strongly held but incorrect perspective, you are likely to find an answer that suits your prior prejudice. If there is enough data, you will look and look until you find what you want to find.

This is the reason why data mining can be such a problem. If you look at enough data, you will find a vast array of patterns. Many of these will be innocuous. Some will be obviously absurd to all. Unfortunately, some patterns will appeal to some people and fit with their worldview. The problem is no different in scholarship. Too much marketing scholarship seems to just aim to find relationships in data. If you don’t think deeply about what you are doing, you end up with nonsense. At its worst, this can be plain offensive.

Casual Bigotry In Marketing

This is the case with Gerry Tellis and his colleagues’ look at how products take off. This paper is not too old, 2003, but reads like something from a bygone age. (Old history books often explained Greek successes by saying the Greeks were European, unlike the Persians. They haven’t dated well.) Tellis and his colleagues focused on product adoption in European countries. They give an explanation that, to my mind, should have been cast in the dustbin generations ago.

When doing a regression with secondary data, there is often much hay to be made by throwing in “cultural” factors. Throw in enough of them and some will “explain” what is happening (i.e., be associated with the outcome). This is the danger of data mining.
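
How easily this happens can be illustrated with a small simulation. This is only a sketch, not a reanalysis of any real study: the number of countries, the number of candidate factors, and every name in the code are assumptions made for illustration. Each “cultural” factor is pure random noise, so any association with the outcome is spurious by construction. With 40 candidates tested at the 5% level, the chance that at least one looks “significant” is roughly 1 - 0.95^40, or about 0.87.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_spurious_hits(n_countries, n_factors, r_crit, rng):
    """Count noise factors whose correlation with a noise outcome beats r_crit."""
    outcome = [rng.gauss(0, 1) for _ in range(n_countries)]
    hits = 0
    for _ in range(n_factors):
        factor = [rng.gauss(0, 1) for _ in range(n_countries)]
        if abs(pearson(factor, outcome)) > r_crit:
            hits += 1
    return hits


# Hypothetical setup, chosen only for illustration.
N_COUNTRIES = 16   # assumed sample size (one row per country)
N_FACTORS = 40     # assumed number of candidate "cultural" variables
N_TRIALS = 500     # how many times the whole exercise is repeated

# Two-tailed 5% critical t value for 14 degrees of freedom (16 - 2),
# converted into the equivalent critical correlation.
T_CRIT = 2.145
R_CRIT = T_CRIT / math.sqrt(T_CRIT ** 2 + (N_COUNTRIES - 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [
    count_spurious_hits(N_COUNTRIES, N_FACTORS, R_CRIT, rng)
    for _ in range(N_TRIALS)
]

share_with_hit = sum(h > 0 for h in trials) / N_TRIALS
mean_hits = sum(trials) / N_TRIALS

print(f"Critical |r| at the 5% level: {R_CRIT:.3f}")
print(f"Share of trials with at least one 'significant' factor: {share_with_hit:.2f}")
print(f"Average number of 'significant' factors per trial: {mean_hits:.2f}")
```

None of the factors here means anything, yet most runs hand the analyst at least one to tell a story about. Nothing in the output distinguishes a spurious hit from a real one. Only a theory stated before looking at the data can do that.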

These cultural factors are often poorly specified. It is not that differences between countries don’t exist. The problem is that the differences used are often poorly substantiated and badly explained. Once the data has been mined, a vast range of post hoc explanations can be fitted to it. You can then explain the data using any number of dubious stories. Some of the stories are pretty offensive. Given the vast amount of data associated with each country, culture, region, or whatever is being discussed, patterns will be there. That doesn’t mean you should rely on them if your theory is rubbish.

Protestants And Product Adoption

To explain new product adoption, Tellis and colleagues turn to the Protestant work ethic. This is a popular phrase often used to explain patterns in data when authors don’t want to think any more deeply. Of course, there are differences between European countries. Still, throwing ‘% of Protestants’ into a regression and stating that it is a “reason” for the difference is not substantially different to throwing in skin color. It might not be as obvious a red flag, but it is bigotry nonetheless.

“The major religious difference among nations in Europe is the ratio of Protestants to Catholics. There is strong evidence in sociology that Protestant religions are more supportive of a high need for achievement than is the Catholic faith (McClelland 1961, Weber 1958). Therefore, we will operationalize need for achievement by the percentage of Protestants (see Parker 1997).” (Tellis, Stremersch, and Yin, 2003, page 198)

Aren’t convinced that this is an example of a problematic lack of thought about potential prejudice? Then note that they say: “We use climate as a proxy for industriousness (reverse scaled).” (Tellis, Stremersch, and Yin, 2003, page 198). Really, they say this. I am not making it up. They say people in warm climates don’t work as hard. Let’s just think that one through for 5 seconds. This is a paper in a major marketing journal in 2003, not a Victorian piece justifying imperialism. Seriously.

The Danger Of Data Mining

Researchers need to be more careful with their “theory”. They don’t want their data mining to be associated with prejudice.

Read: Gerry Tellis, Stefan Stremersch, and Eden Yin (2003) The International Takeoff of New Products: The Role of Economics, Culture, and Country Innovativeness, Marketing Science, 22 (2), 188-208

For more on measuring culture see here.
