Last week, my boss and I had the pleasure of speaking with Anthony Goldbloom and Karthik Sethuraman of Kaggle (http://www.kaggle.com/), and we were both impressed by what this San Francisco startup has cooking. Kaggle is a platform for crowd-sourcing solutions to complex analytics problems. There’s a great writeup at Forbes about the company.
Here’s the gist of it, though:
Traditional thinking would say that if you have a complex data set that you’d like to unravel to increase your predictive capabilities, you would lock your best statistical minds in a room and tell them not to come out until they solved the problem. The logic goes that a room full of bright minds collaborating and sharing their work should be able to work its way up to the most effective solution.
Kaggle, however, turns that logic on its head by betting that competition is at least as productive as collaboration. In the world of Kaggle, participants sign on to create the algorithm that most effectively predicts future results, and the best entries rocket to the top of the leaderboard. How do they measure this? Let’s say you have two years of data containing weather info, Twitter sentiment, and consumption of a popular beverage. Kaggle withholds the last 6 months of data and releases the first 18 months to the participants for them to work with. Their algorithms are then tested against the withheld 6 months and an accuracy score is generated. The Kaggle team told us it is common to see entries reach 80% accuracy within the first 2 weeks.
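For the technically curious, here is a rough Python sketch of that withhold-and-score idea. It is only an illustration under my own assumptions, not Kaggle's actual code: the field names (`day`, `sales`), the monthly toy data, the "within a tolerance" definition of accuracy, and the naive baseline are all made up for the example.

```python
from datetime import date


def split_by_date(rows, cutoff):
    """Split dated rows into a released training set and a withheld set."""
    train = [r for r in rows if r["day"] < cutoff]
    holdout = [r for r in rows if r["day"] >= cutoff]
    return train, holdout


def accuracy(predictions, actuals, tolerance):
    """Fraction of predictions that land within `tolerance` of the actual value.

    This is one hypothetical way to define accuracy; a real competition
    would pick a metric suited to the problem.
    """
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    hits = sum(1 for p, a in zip(predictions, actuals) if abs(p - a) <= tolerance)
    return hits / len(actuals)


# Toy data: 24 monthly observations of beverage sales (values are invented).
rows = [
    {"day": date(2010 + m // 12, m % 12 + 1, 1), "sales": 100 + m}
    for m in range(24)
]

# Release the first 18 months, withhold the last 6.
train, holdout = split_by_date(rows, cutoff=date(2011, 7, 1))

# A deliberately naive entry: predict the last released value for every month.
last_seen = train[-1]["sales"]
predictions = [last_seen for _ in holdout]
actuals = [r["sales"] for r in holdout]

score = accuracy(predictions, actuals, tolerance=3)
print(f"released={len(train)} withheld={len(holdout)} accuracy={score:.2f}")
```

The key point is that the participant only ever touches `train`; the withheld rows stay with the organizer, which is what keeps the score honest.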
Unlike the room full of your analytics resources, the Kaggle participants do not share work and they only see the accuracy scores of the other participants. The drive to win is what brings them back to improve upon their previous attempts.
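A leaderboard like that can be sketched in a few lines of Python. Again, this is a toy under my own assumptions (the `Leaderboard` class, the participant names, and the rule of keeping each person's best score are hypothetical, not something the Kaggle team described to us); the point is simply that scores are public while the models behind them are not.

```python
class Leaderboard:
    """Tracks each participant's best score and nothing about their method."""

    def __init__(self):
        self._best = {}

    def submit(self, participant, score):
        """Record a score, keeping only the participant's best so far."""
        previous = self._best.get(participant)
        if previous is None or score > previous:
            self._best[participant] = score

    def standings(self):
        """Return (participant, best score) pairs, highest score first."""
        return sorted(self._best.items(), key=lambda item: (-item[1], item[0]))


board = Leaderboard()
board.submit("alice", 0.80)
board.submit("bob", 0.83)    # bob takes the lead
board.submit("alice", 0.86)  # alice comes back with an improved entry
board.submit("bob", 0.81)    # a worse resubmission does not lower bob's best

for rank, (name, best) in enumerate(board.standings(), start=1):
    print(rank, name, best)
```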
A similar approach (and similar success) was seen in the Fold.it (http://fold.it) initiative, where gamers were able to crack a protein folding puzzle in two weeks that had stymied scientists for years. Fold.it allowed regular gamers without knowledge of organic chemistry to participate by baking the rules of organic chemistry into the game. To win a Kaggle competition, there are no such shortcuts (you need deep analytics expertise), but the driving force remains competition and gaming.
I include this with my recent posts because it is another example of tribal behaviors forming around concepts, even ones as esoteric as predictive analytics.
Now, if they can just come up with a better algorithm for the BCS rankings…