Monday, March 24, 2014

Diversity and Collisions: A Problem-Solving Strategy

Last week I had the privilege of attending the Global Business Forum sponsored by Baylor's Hankamer School of Business. One of the most intriguing presentations was by Will Cukierski of Kaggle. Kaggle uses a competition model to engage data scientists in the development of predictive models. Cukierski indicated that Kaggle has seen that the groups most likely to solve a problem are frequently those most distant from the problem. He pointed to the work of Karim R. Lakhani on InnoCentive, a company that, much like Kaggle, provides a platform for crowdsourcing scientific problems. In "The Value of Openness in Scientific Problem Solving," Lakhani et al. (2007) found that "the more heterogeneous the scientific interests attracted to the solver base by a problem, the more likely the problem is to be solved" (p. 7).

On a related note, a story I heard on NPR last week highlighted the work of Richard B. Freeman and Wei Huang on the effects of ethnic identity on authorship of scientific papers. In their paper "Collaborating with people like me: Ethnic co-authorship within the US," Freeman and Huang examined the surnames of authors between 1985 and 2008. The focus of the research was the impact of homophily ("birds of a feather flock together") on the quality of science produced. I was most interested in their final interpretive remarks, in which they hypothesize, based on their data, that "greater diversity and breadth of knowledge of a research team contributes to the quality of the scientific papers that the team produces" (p. 19).

A related point, and the biggest takeaway for me from Cukierski's presentation, was Kaggle's approach to combining predictive models from different teams, and often from different academic domains, to produce even better predictions. When they combine the top two solutions, they tend to get better results than with the first-place model alone. Adding the third-place solution will likely improve the combined model further. Only after some number of approaches (the example graph put this n at 15) do additional models make the overall model less accurate. Better models come from combining multiple solutions from multiple perspectives, not from selecting the single best solution.
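To make the idea concrete for myself, here is a minimal Python sketch. It rests on assumptions that did not come from the presentation: I do not know how Kaggle actually combines models, so the sketch uses a plain unweighted average of predictions, scores it with root-mean-square error, and runs on made-up numbers. The names `rmse`, `blend`, and `blended_scores` are my own illustrations, not part of any Kaggle tool.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of numbers."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(actual))


def blend(models):
    """Average several models' predictions item by item (assumed method)."""
    return [sum(values) / len(values) for values in zip(*models)]


def blended_scores(ranked_models, actual):
    """Score the blend of the top 1, top 2, ..., top n ranked models."""
    return [
        rmse(blend(ranked_models[:k]), actual)
        for k in range(1, len(ranked_models) + 1)
    ]


# Hypothetical data: the true values and three teams' predictions,
# listed from first place to third place.
actual = [10.0, 20.0, 30.0, 40.0]
ranked_models = [
    [11.0, 19.0, 31.0, 39.0],  # first place: small errors
    [9.0, 22.0, 29.0, 41.0],   # second place: errors in different directions
    [16.0, 26.0, 36.0, 46.0],  # third place: consistently too high
]

scores = blended_scores(ranked_models, actual)
for k, score in enumerate(scores, start=1):
    print(f"top {k} blended: RMSE = {score:.3f}")
```

In this toy data the first-place model scores 1.0 on its own, while the average of the top two scores 0.25, because their errors point in different directions and partly cancel. Adding the weak third model makes the blend worse again, which mirrors the turning point in Cukierski's graph, although at a much smaller n than the 15 he showed.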

There are likely several factors at play here. First, by bringing together multiple research teams from around the world, Kaggle is playing the part of the weak tie in the social network (Granovetter, 1973). Second, in a more diverse network, one is more likely to find those outside of "normal science" (Kuhn, 1962), and therefore not bound by the associated disciplinary constraints. Third, the competition model takes advantage of what we know about brainstorming and individual creativity (Cain, 2012, ch. 3).

As I've written before, I see the research library of the future playing a key role in facilitating these kinds of interdisciplinary collisions, both through the spaces we provide (e.g., a "research commons") and the services we develop. So perhaps we can look to Kaggle and InnoCentive as models for a type of academic library programming. Could we host our own Kaggle-like competition? A component of Baylor University's strategic vision, Pro Futuris, is "Informed Engagement," which includes an intent to "address systemic problems facing our community." What if we hosted a competition to find the best interdisciplinary solutions to community problems, partnering with local agencies and initiatives? What if we engaged faculty to make participation in the competition a course assignment?


Cain, S. (2012). Quiet: The power of introverts in a world that can’t stop talking. New York: Crown Publishers.

Freeman, R. B., & Huang, W. (2014). Collaborating With People Like Me: Ethnic co-authorship within the US (Working Paper No. 19905). National Bureau of Economic Research. Retrieved from http://www.nber.org/papers/w19905

Granovetter, M. S. (1973). The Strength of Weak Ties. The American Journal of Sociology, 78(6), 1360–1380.

Granovetter, M. S. (1983). The Strength of Weak Ties: A Network Theory Revisited. Sociological Theory, 1, 201–233.

Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.

Lakhani, K. R., Jeppesen, L. B., Lohse, P. A., & Panetta, J. A. (2007). The Value of Openness in Scientific Problem Solving (No. 07-050). Cambridge, MA: Harvard Business School. Retrieved from http://www.hbs.edu/faculty/Pages/download.aspx?name=07-050.pdf
