Unrepresentative Turkers?

Dave Hoffman

Dave Hoffman is the Murray Shusterman Professor of Transactional and Business Law at Temple Law School. He specializes in law and psychology, contracts, and quantitative analysis of civil procedure. He currently teaches contracts, civil procedure, corporations, and law and economics.


6 Responses

  1. Orin Kerr says:


  2. John Gastil says:

    Our research uses large Turk samples and generally finds them left-of-center (in terms of ideology/party), though not as skewed in most other respects as one might guess.

  3. Mike says:

    Innocent question: Is this subject to human subjects protocols?

  4. Dave Hoffman says:


    Basically, yes. Though of course most survey research is exempt (which is a determination that most IRBs will insist on making on their own).

    John – thanks, interesting. I’m not terribly surprised by the ideology finding in your research. I was surprised by how distinctive this skew was. Crazy result!

  5. anon says:

    Dave, I don’t know if this will make you copacetic or not, but as it happens, just out today in PLoS One is this study:

    Incorrect beliefs about memory have wide-ranging implications. We recently reported the results of a survey showing that a substantial proportion of the United States public held beliefs about memory that conflicted with those of memory experts. For that survey, respondents answered recorded questions using their telephone keypad. Although such robotic polling produces reliable results that accurately predicts the results of elections, it suffers from four major drawbacks: (1) telephone polling is costly, (2) typically, less than 10 percent of calls result in a completed survey, (3) calls do not reach households without a landline, and (4) calls oversample the elderly and undersample the young. Here we replicated our telephone survey using Amazon Mechanical Turk (MTurk) to explore the similarities and differences in the sampled demographics as well as the pattern of results. Overall, neither survey closely approximated the demographics of the United States population, but they differed in how they deviated from the 2010 census figures. After weighting the results of each survey to conform to census demographics, though, the two approaches produced remarkably similar results: In both surveys, people averaged over 50% agreement with statements that scientific consensus shows to be false. The results of this study replicate our finding of substantial discrepancies between popular beliefs and those of experts and shows that surveys conducted on MTurk can produce a representative sample of the United States population that generates results in line with more expensive survey techniques.

    Full text here: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0051876. (Full disclosure: I’m married to one of the authors.)

    For those asking about IRB involvement in MTurk and similar research, you’ll see in the ethics statement in the above paper that the U of Ill IRB reviewed (and approved, obviously) the study. The IRB waived the usual requirement for signed informed consent (pretty difficult to obtain online), which suggests that, in this case at least (IRBs differ dramatically in how they view this and much other research), the IRB didn’t merely review the study to confirm its exempt status but in fact reviewed it as regulated human subjects research (if research is exempt, the regulatory requirement for written informed consent doesn’t apply and, hence, needn’t be waived). Part I of this paper provides the statutory and regulatory background on this system: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2138624 (full disclosure: my paper).

    But I can do you one better. Most researchers use MTurk as a subject pool for conducting survey research, etc. But you can, of course, also use it for its original purpose — to crowdsource work (in this case, research) itself, in which case Turkers act as researchers rather than the subjects of research. For an argument that such crowd-sourced research work (using the example of the incredibly successful FoldIt site) should nevertheless be subject to IRB review because it “has the potential to cause harm to participants, manipulates the participant into continued participation, and uses participants as experimental subjects,” see this recent article in the J. of Med. Ethics: http://jme.bmj.com/content/early/2012/11/29/medethics-2012-100798.short?g=w_jme_ahead_tab, which I will probably blog about when I recover from my bewilderment (the risks the authors identify are that (1) citizen scientists could be doing something more valuable with their time (!), and (2) that they may suffer from internet addiction, undermining the voluntariness of their consent).

    Note that because IRBs necessarily have substantial discretion in interpreting and applying federal research regulations, and because the research ethics literature is often used for this gap-filling, papers like this aren’t the benign thought experiments of bioethicists, but, via IRBs, may have real-world impact on how knowledge production is regulated. (More (slightly more restrained) kvetching in this vein in my paper cited above.)

  6. anon says:

    Um, yeah. I hadn’t intended to post the above anonymously, obviously.