Unrepresentative Turkers?

Like many others, I’ve been using Amazon Mechanical Turk to recruit subjects for law & psychology experiments.  Turk is (i) cheap; (ii) fast; (iii) easy to use; and (iv) not controlled by the psychology department’s guardians.  Better yet, the literature to date has found that Turkers are more representative of the general population than you’d expect — and certainly better than college undergrads! Unfortunately, this post at the Monkey Cage provides a data point in the contrary direction:

“On Election Day, we asked 565 Amazon Mechanical Turk (MTurk) workers to take a brief survey on vote choice, ideology and demographics. . . . We compare MTurk workers on Election Day to actual election results and exit polling. The survey paid $0.05 and had seven questions: gender, age, education, income, state of residence, vote choice, and ideology. Overall, 73% of these MTurk workers voted for Obama, 15% for Romney, and 12% for “Other.” This is skewed in expected ways, matching the stereotypical image of online IT workers as liberal—or possibly libertarian, since 12% voted for a third party in 2012, compared to 1.6% of all voters. . . . In sum, the MTurk sample is younger, more male, poorer, and more highly educated than Americans generally. This matches the image of who you might think would be online doing computer tasks for a small amount of money…”

Food for thought.  What’s strange is that every sample of Turkers I’ve dealt with is older & more female than the general population.  Might it be that Turk workers who responded to a survey on election habits aren’t like the Turk population at large?  Probably so, but that doesn’t make me copacetic.

6 Responses

  1. Orin Kerr says:


  2. John Gastil says:

    Our research uses large Turk samples and generally finds them left-of-center (in terms of ideology/party), though not as skewed in most other respects as one might guess.

  3. Mike says:

    Innocent question: Is this subject to human subjects protocols?

  4. Dave Hoffman says:

    Basically, yes. Though of course most survey research is exempt (which is a determination that most IRBs will insist on making on their own).

    John – thanks, interesting. I’m not terribly surprised by the ideology finding in your research. I was surprised by how pronounced this skew was. Crazy result!

  5. anon says:

    Dave, I don’t know if this will make you copacetic or not, but as it happens, just out today in PLoS One is this study:

    Incorrect beliefs about memory have wide-ranging implications. We recently reported the results of a survey showing that a substantial proportion of the United States public held beliefs about memory that conflicted with those of memory experts. For that survey, respondents answered recorded questions using their telephone keypad. Although such robotic polling produces reliable results that accurately predict the results of elections, it suffers from four major drawbacks: (1) telephone polling is costly, (2) typically, less than 10 percent of calls result in a completed survey, (3) calls do not reach households without a landline, and (4) calls oversample the elderly and undersample the young. Here we replicated our telephone survey using Amazon Mechanical Turk (MTurk) to explore the similarities and differences in the sampled demographics as well as the pattern of results. Overall, neither survey closely approximated the demographics of the United States population, but they differed in how they deviated from the 2010 census figures. After weighting the results of each survey to conform to census demographics, though, the two approaches produced remarkably similar results: In both surveys, people averaged over 50% agreement with statements that scientific consensus shows to be false. The results of this study replicate our finding of substantial discrepancies between popular beliefs and those of experts and show that surveys conducted on MTurk can produce a representative sample of the United States population that generates results in line with more expensive survey techniques.

    Full text here: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0051876. (Full disclosure: I’m married to one of the authors.)
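
    For anyone curious what the abstract’s “weighting the results of each survey to conform to census demographics” involves mechanically, here is a minimal post-stratification sketch in Python. All brackets, shares, and responses below are hypothetical placeholders, not data from the study; real weighting schemes typically rake over several demographics (age, sex, education, region) at once rather than a single variable.

```python
from collections import Counter

# Hypothetical respondents: (age_bracket, agrees_with_false_statement)
respondents = [
    ("18-29", True), ("18-29", True), ("18-29", False),
    ("30-49", True), ("30-49", False),
    ("50+",   False),
]

# Hypothetical census shares for the same brackets (must sum to 1.0)
census_share = {"18-29": 0.21, "30-49": 0.34, "50+": 0.45}

# Shares actually observed in the sample
counts = Counter(bracket for bracket, _ in respondents)
n = len(respondents)
sample_share = {b: c / n for b, c in counts.items()}

# Each respondent is weighted by (population share / sample share) for their cell,
# so over-represented cells are down-weighted and under-represented cells up-weighted.
weights = {b: census_share[b] / sample_share[b] for b in counts}

weighted_yes = sum(weights[b] for b, agrees in respondents if agrees)
total_weight = sum(weights[b] for b, _ in respondents)

print(f"Unweighted agreement rate: {sum(a for _, a in respondents) / n:.2f}")
print(f"Weighted agreement rate:   {weighted_yes / total_weight:.2f}")
```

    The point of the exercise: if the two surveys give similar answers after this kind of reweighting, the skewed raw demographics of the MTurk pool matter less than one might fear.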

    For those asking about IRB involvement in MTurk and similar research, you’ll see in the ethics statement in the above paper that the U of Ill IRB reviewed (and approved, obviously) the study. The IRB waived the usual requirement for signed informed consent (pretty difficult to obtain online), and that waiver suggests that, in this case at least (IRBs differ dramatically in how they view this and much other research), the IRB didn’t merely review the study to confirm its exempt status but in fact reviewed it as regulated human subjects research (if it’s exempt research, the regulatory requirement for written informed consent doesn’t apply and, hence, needn’t be waived). Part I of this paper provides the statutory and regulatory background on this system: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2138624 (full disclosure: my paper).

    But I can do you one better. Most researchers use MTurk as a subject pool for conducting survey research, etc. But you can, of course, also use it for its original purpose — to crowdsource work (in this case, research) itself, in which case Turkers act as researchers rather than the subjects of research. For an argument that such crowd-sourced research work (using the example of the incredibly successful FoldIt site) should nevertheless be subject to IRB review because it “has the potential to cause harm to participants, manipulates the participant into continued participation, and uses participants as experimental subjects,” see this recent article in the J. of Med. Ethics: http://jme.bmj.com/content/early/2012/11/29/medethics-2012-100798.short?g=w_jme_ahead_tab, which I will probably blog about when I recover from my bewilderment (the risks the authors identify are that (1) citizen scientists could be doing something more valuable with their time (!), and (2) that they may suffer from internet addiction, undermining the voluntariness of their consent).

    Note that because IRBs necessarily have substantial discretion in interpreting and applying federal research regulations, and because the research ethics literature is often used for this gap-filling, papers like this aren’t the benign thought experiments of bioethicists, but, via IRBs, may have real-world impact on how knowledge production is regulated. (More (slightly more restrained) kvetching in this vein in my paper cited above.)

  6. Um, yeah. I hadn’t intended to post the above anonymously, obviously.