In a comment on Dan Woods’ article “Amazon’s John Rauser on What Is a Data Scientist?” (http://www.forbes.com/sites/danwoods/2011/10/07/amazons-john-rauser-on-w… ), “nyhacker” wrote:
“So far there is a standard obstacle to such work: Business is still organized as it was in Henry Ford’s day where the supervisor knows more and the subordinate is to apply labor to the ideas of the supervisor. So, unless the CEO has such a background, any such expert will have to report to someone who doesn’t understand the work, and that mostly can’t work. Net, for a person to get a career from such material, they have to start their own company to do such work and sell the results. For me, I need to get back to it!”
This commenter hit the nail on the head. I’d say the problem is less severe in net-savvy companies, but it is still an issue even in industries where “data science” has been a core competency all along (financial services, telco, and oil and gas E&P, for example). Consider Wall Street. Here the crème de la crème are recruited from the best schools, with exquisite skills and credentials, to become “quants” or, more accurately, “financial engineers,” inventing new types of securities out of thin air, like the credit default swaps that nearly destroyed the world financial system. Was this a failing of data science? I think not. Instead, these models were passed to portfolio managers (in my analytics types taxonomy, http://www.constellationrg.com/research/2012/02/trends-analytic-types-ro… , from Type II-A’s to Type II-B’s), who were free to run the models with their own assumptions, such as “home values will continue to rise forever.” As “nyhacker” said, the data scientist’s work is often passed on to people who don’t really understand it. When that happens, one of two things occurs. Either the receiving party doesn’t act, which was pretty typical in the days when we called this data mining, or they act, with either fantastic or disastrous consequences.
Tom Davenport advises that adopting an analytic culture requires a goodly number of these “data scientists,” gathered centrally and reporting to a C-level executive, preferably the CEO, who has an analytical background and does understand this stuff. But unless you’re on Wall Street or in Silicon Valley, you can swing a dead cat around by the tail and not hit one of these characters.
So what’s the answer? I like to say that data doesn’t speak for itself. Don’t put too much stock in the ability of data scientists to change the world. It takes the efforts of lots of people working together to make really good (or really evil) things happen. Data scientists play a role, but they aren’t prime movers.
I believe the danger here is that the current fascination with the terms “big data” and “data scientist” is counterproductive: we waste so much time trying to define them. Even worse, the data scientist thing has started a rush of programs and a sense of urgency to find, recruit, train, hire, seduce and cherish these guys. In point of fact, they are right in front of you, and have been all along.
Here is an example. In 1999, one of the three largest credit card companies had over one thousand analysts developing SQL queries against one of the largest (at the time) data warehouses in the world, over 50 terabytes. The data was extracted and separated into test datasets and production runs to develop and continuously refine quantitative models for fraud detection, churn action and other programs. The terms “big data” and “data scientist” didn’t exist yet, but that’s exactly what this was. And remember the number: one thousand. They didn’t grow on trees; they were recruited and learned on the job over time.
The good news, for an old quant like me, is that quantitative methods are suddenly fashionable and I get to have a little fun before they get benched in favor of the handsome newcomer, whatever that will be.