Social Science in an Era of Corporate Big Data
In my last post, I explored the characteristics of Facebook’s model (i.e., exemplary) users. Today, I want to discuss the model users in the company–i.e., the data scientists who try to build stylized versions of reality (models) based on certain data points and theories. The Facebook emotion experiment is part of a much larger reshaping of social science. To what extent will academics study data driven firms like Facebook, and to what extent will they try to join forces with its own researchers to study others?
Present incentives are clear: collaborate with (rather than develop a critical theory of) big data firms. As Zeynep Tufekci puts it, “the most valuable datasets have become corporate and proprietary [and] top journals love publishing from them.” “Big data” has an aura of scientific validity simply because of the velocity, volume, and variety of the phenomena it encompasses. Psychologists certainly must have learned *something* from looking at over 600,000 accounts’ activity, right?
The problem, though is that the corporate “science” of manipulation is a far cry from academic science’s ethics of openness and reproducibility.* That’s already led to some embarrassments in the crossover from corporate to academic modeling (such as Google’s flu trends failures). Researchers within Facebook worried about multiple experiments being performed at once on individual users, which might compromise the results of any one study. Standardized review could have prevented that. But, true to the Silicon Valley ethic of “move fast and break things,” speed was paramount: “There’s no review process. Anyone…could run a test…trying to alter peoples’ behavior,” said one former Facebook data scientist.
Grant Getters and Committee Men
Why are journals so interested in this form of research? Why are academics jumping on board? Fortunately, social science has matured to the point that we now have a robust, insightful literature about the nature of social science itself. I know, this probably sounds awfully meta–exactly the type of navel-gazing Senator Coburn would excommunicate from the church of science. But it actually provides a much-needed historical perspective on how power and money shape knowledge. Consider, for instance, the opening of Joel Isaac’s article Tangled Loops, on Cold War social science:
During the ﬁrst two decades of the Cold War, a new kind of academic ﬁgure became prominent in American public life: the credentialed social scientist or expert in the sciences of administration who was also, to use the parlance of the time, a “man of affairs.” Some were academic high-ﬂiers conscripted into government roles in which their intellectual and organizational talents could be exploited. McGeorge Bundy, Walt Rostow, and Robert McNamara are the archetypes of such persons. An overlapping group of scholars became policymakers and political advisers on issues ranging from social welfare provision to nation-building in emerging postcolonial states.
Postwar leaders of the social and administrative sciences such as Talcott Parsons and Herbert Simon were skilled scientiﬁc brokers of just this sort: good “committee men,” grant-getters, proponents of interdisciplinary inquiry, and institution-builders. This hard-nosed, suit-wearing, business-like persona was connected to new, technologically reﬁned forms of social science. . . . Antediluvian “social science” was eschewed in favour of mathematical, behavioural, and systems-based approaches to “human relations” such as operations research, behavioral science, game theory, systems theory, and cognitive science.
One of Isaac’s major contributions in that piece is to interpret the social science coming out of the academy (and entities like RAND) as a cultural practice: “Insofar as theories involve certain forms of practice, they are caught up in worldly, quotidian matters: performances, comportments, training regimes, and so on.” Government leveraged funding to mobilize research to specific ends. To maintain university patronage systems and research centers, leaders had to be on good terms with the grantors. The common goal of strengthening the US economy (and defeating the communist threat) cemented an ideological alliance.
Government still exerts influence in American social and behavioral sciences. But private industry controls critical data sets for the most glamorous, data-driven research. In the Cold War era, “grant getting” may have been the key to economic security, and to securing one’s voice in the university. Today, “exit” options are more important than voice, and what better place to exit to than an internet platform? Thus academic/corporate “flexians” shuttle between the two worlds. Their research cannot be too venal, lest the academy disdain it. But neither can it indulge in, say, critical theory (what would nonprofit social networks look like), just as Cold War social scientists were ill-advised to, say, develop Myrdal’s or Leontief’s theories. There was a lot more money available for the Friedmanite direction economics would, eventually, take.
Intensifying academic precarity also makes the blandishments of corporate data science an “offer one can’t refuse.” Tenured jobs are growing scarcer. As MOOCmongers aspire to deskill and commoditize the academy, industry’s benefits and flexibility grow ever more alluring. Academic IRBs can impose a heavy bureaucratic burden; the corporate world is far more flexible. (Consider all the defenses of the Facebook authored last week which emphasized how little review corporate research has to go through: satisfy the boss, and you’re basically done, no matter how troubling your aims or methods may be in a purely academic context.)
So why does all this matter, other than to the quantitatively gifted individuals at the cutting edge of data science? It matters because, in Isaac’s words:
Theories and classiﬁcations in the human sciences do not “discover” an independently existing reality; they help, in part, to create it. Much of this comes down to the publicity of knowledge. Insofar as scientiﬁc descriptions of people are made available to the public, they may “change how we can think of ourselves, [and] change our sense of self-worth, even how we remember our own past.
It is very hard to develop categories and kinds for internet firms, because they are so secretive about most of their operations. (And make no mistake about the current PR kerfuffle for Facebook: it will lead the company to become ever more secretive about its data science, just as Target started camouflaging its pregnancy-related ads and not talking to reporters after people appeared creeped out by the uncanny accuracy of its natal predictions.) But the data collection of the firms is creating whole new kinds of people—for marketers, for the NSA, and for anyone with the money or connections to access the information.
More likely than not, encoded in Facebook’s database is some new, milder DSM, with categories like the slightly stingy (who need to be induced to buy more); the profligate, who need frugality prompts; the creepy, who need to be hidden in newsfeeds lest they bum out the cool. Our new “Science Mart” creates these new human kinds, but also alters them, as “new sorting and theorizing induces changes in self-conception and in behavior of the people classiﬁed.” Perhaps in the future, upon being classified as “slightly depressed” by Facebook, users will see more happy posts. Perhaps the hypomanic will be brought down a bit. Or, perhaps if their state is better for business, it will be cultivated and promoted.
You may think that last possibility unfair, or a mischaracterization of the power of Facebook. But shouldn’t children have been excluded from its emotion experiment? Shouldn’t those whom it suspects may be clinically depressed? Shouldn’t some independent reviewer have asked about those possibilities? Journalists try to reassure us that Facebook is better now than it was 2 years ago. But the power imbalances in social science remain as funding cuts threaten researchers’ autonomy. Until research in general is properly valued, we can expect more psychologists, anthropologists, and data scientists to attune themselves to corporate research agendas, rather than questioning why data about users is so much more available than data about company practices.
Image Note: I’ve inserted a picture of Isaac’s book, which I highly recommend to readers interested in the history of social science.
*I suggested this was a problem in 2010.