How to run successful experiments and get the most out of Amazon's Mechanical Turk

Monday, November 20, 2017

Strengths and Limitations of Mechanical Turk


Hundreds of academic papers are published each year using data collected through Mechanical Turk. Researchers have gravitated to Mechanical Turk primarily because it provides high-quality data quickly and affordably. However, although Mechanical Turk has revolutionized data collection, it is by no means a perfect platform. Some of the major strengths and limitations of MTurk are summarized below.
Strengths
A source of quick and affordable data
Thousands of participants look for tasks on Mechanical Turk throughout the day and can take your task with the click of a button. You can run a 10-minute survey with 100 participants for $1 each and have all your data within the hour.
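As a rough illustration of the arithmetic in this example, the sketch below computes the total participant payments and the hourly wage implied by paying $1 for a 10-minute survey. The function name and the `fee_rate` parameter are hypothetical. The platform fee defaults to zero because no fee is stated here, so substitute the current fee before budgeting a real study.

```python
def study_cost(n_participants, pay_per_participant, minutes, fee_rate=0.0):
    """Return (total_cost, hourly_wage) for a simple survey study.

    fee_rate is an assumed platform commission, expressed as a fraction of
    participant pay. It defaults to 0.0 because the article gives no figure.
    """
    payments = n_participants * pay_per_participant
    total_cost = payments * (1 + fee_rate)
    # Scale the per-survey payment up to a 60-minute hour.
    hourly_wage = pay_per_participant * 60 / minutes
    return total_cost, hourly_wage


total, wage = study_cost(n_participants=100, pay_per_participant=1.00, minutes=10)
print(f"Total cost: ${total:.2f}, implied hourly wage: ${wage:.2f}")
```

With the numbers from the example, participant payments come to $100 before fees, and the pay rate works out to $6 per hour.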
Data are reliable
Researchers who have examined data quality on MTurk have found that, by and large, the data are reliable, with participants performing tasks in ways similar to more traditional samples. MTurk also has a useful reputation mechanism: researchers can approve or reject a worker's performance on a given study, and each worker's reputation is based on the number of times their work was approved or rejected. A standard practice is to use data only from workers with an approval rating of at least 95%, which further helps ensure high-quality data (Peer, Vosgerau, & Acquisti, 2014).
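To make the reputation mechanism concrete, here is a minimal sketch of how an approval rating could be computed from approval and rejection counts and then compared against a 95% threshold. It illustrates the logic only and is not MTurk's own API. The function names, worker IDs, and counts are all made up for the example.

```python
def approval_rating(approved, rejected):
    """Percentage of a worker's reviewed assignments that were approved."""
    reviewed = approved + rejected
    if reviewed == 0:
        return None  # no history yet, so no rating can be computed
    return 100.0 * approved / reviewed


def eligible_workers(workers, min_rating=95.0):
    """Return IDs of workers whose approval rating meets the threshold."""
    kept = []
    for worker_id, (approved, rejected) in workers.items():
        rating = approval_rating(approved, rejected)
        if rating is not None and rating >= min_rating:
            kept.append(worker_id)
    return kept


# Hypothetical worker histories: worker ID -> (approved, rejected)
workers = {"W1": (190, 10), "W2": (90, 10), "W3": (0, 0)}
print(eligible_workers(workers))  # ['W1']
```

Here W1 has a rating of exactly 95% and is kept, W2 falls short at 90%, and W3 has no reviewed work. How to treat workers without any history is a design choice; this sketch excludes them.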
Participant pool is more representative than traditional subject pools
Traditional subject pools used in social science research are often samples that are convenient for researchers to obtain, such as undergraduates at a local university. Mechanical Turk has been shown to be more diverse, with participants who are closer to the U.S. population in terms of gender, age, race, education, and employment.
Limitations
There are two kinds of potential limitations on MTurk: technical limitations and more fundamental limitations of the platform. Many of the technical limitations have been resolved through scripts written by researchers or through platforms such as TurkPrime, which help researchers do things they were not previously able to do on MTurk, including:
  • Exclude participants from a study based on participation in a previous study
  • Conduct longitudinal research
  • Make sure larger studies do not stall out after the first 500 to 1000 Workers
  • Communicate with many Workers at a time.
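The first item in the list, excluding people who took part in an earlier study, comes down to filtering a list of worker IDs. The sketch below shows that logic in isolation. It is not TurkPrime's or MTurk's implementation, and the function name and worker IDs are hypothetical.

```python
def exclude_previous_participants(candidates, previous_participants):
    """Return candidates who did not take part in the earlier study.

    The order of the candidate list is preserved.
    """
    seen = set(previous_participants)  # set lookup keeps the filter fast
    return [worker for worker in candidates if worker not in seen]


# Hypothetical worker IDs
study_1 = ["A1", "A2", "A3"]
candidates = ["A2", "A4", "A3", "A5"]
print(exclude_previous_participants(candidates, study_1))  # ['A4', 'A5']
```

Longitudinal recruitment is the inverse filter: keep only the candidates who do appear in the earlier study's list.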
There are, however, several more fundamental limitations to data collection on MTurk:
Small population
There are about 100,000 Mechanical Turk workers who participate in academic studies each year. In any one month, about 25,000 unique Mechanical Turk workers participate in online studies. These 25,000 workers complete close to 600,000 assignments per month, and the more active workers complete hundreds of studies each month. The natural consequence of a small worker population is that participants are continuously recycled across research labs. This creates a problem of "non-naivete": most participants on Mechanical Turk have been exposed to common experimental manipulations, and this can affect their performance. Although the effects of this exposure have not been fully examined, recent research indicates that it may be reducing the effect sizes of experimental manipulations and compromising data quality.
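A quick calculation using only the figures quoted above shows how heavily the active pool is used. The variable names are chosen for this example.

```python
yearly_workers = 100_000       # workers taking part in academic studies per year
monthly_workers = 25_000       # unique workers active in any one month
monthly_assignments = 600_000  # assignments completed per month

assignments_per_worker = monthly_assignments / monthly_workers
monthly_share_of_pool = monthly_workers / yearly_workers

print(f"Average assignments per active worker per month: {assignments_per_worker:.0f}")
print(f"Share of the yearly pool active in a given month: {monthly_share_of_pool:.0%}")
```

This gives an average of 24 assignments per active worker per month, with about a quarter of the yearly pool active in any given month. The average understates the load on the most active workers, who complete hundreds of studies each month.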

Diversity

Although Mechanical Turk workers are significantly more diverse than the typical undergraduate subject pool, they are significantly less diverse than the general US population. Compared to the US population, MTurk workers are less politically diverse, more highly educated, younger, and less religious. This makes it harder to generalize findings from MTurk samples to the population as a whole.

Limited selective recruitment

Mechanical Turk has basic mechanisms to selectively recruit workers who have already been profiled. To build these profiles, Mechanical Turk runs profiling HITs that are continuously available to workers. However, the platform is structured in a way that makes it much more difficult to recruit people based on characteristics that have not been profiled. As a result, although rudimentary selective recruitment mechanisms exist, the ability to recruit specific segments of workers is significantly limited.


Solutions
TurkPrime offers researchers more specific selective recruitment options and has features in development to help researchers target participants who are less active and therefore more naive to common experimental manipulations and survey measures. TurkPrime also offers access to PrimePanels, which reaches over 10 million participants who can be selectively recruited and who are more diverse.


References:


Peer, E., Vosgerau, J., & Acquisti, A. (2014). Reputation as a sufficient condition for data quality on Amazon Mechanical Turk. Behavior Research Methods, 46(4), 1023-1031.