This is the first in a series of three posts -- a trilogy! -- about pain points on Amazon's Mechanical Turk, from a requester's perspective.
I'm a frequent user of mturk. I like the service, and spend a large fraction of my research budget there. That means I also feel its limitations pretty acutely. Today I want to write about a problem that I've noticed on mturk: skimming and cherry picking. (A few weeks ago, I complained about Ubuntu. Why is it that we only hurt the computing systems we love?)
Here's the problem: even within a batch, not all HITs are equally difficult. I've discovered that some workers (smart ones) will skim quickly through a batch and cherry-pick the easy HITs. For instance, given a list of blog posts to read and evaluate, some turkers will skip the long ones and only code the short ones.
Individually, skimming makes perfect sense. If you skim, you can certainly make more dollars per hour. As a bonus, you might even get a higher acceptance rate on your HITs, because short HITs lend themselves to unambiguous evaluation. The system rewards strategic skimming.
But from a social perspective, skimming is counterproductive. It wastes time overall, because time spent skimming is time not spent completing tasks*. It's not really fair to other workers. It wreaks havoc on many approaches for determining accuracy. (As a requester, I've experienced this personally.) From a scientific standpoint, it can also ruin the validity of some kinds of data collection.
I first ran into clear evidence of skimming over a year ago. At first, I didn't want to say anything about it, because I didn't want to give anyone ideas. At this point, I see it all the time. One easy-to-observe bit of evidence: the hourly rate on most HITs will start high, and fall over time**. This is because skimmers grab the quick, easy tasks first, leaving slower tasks for later workers.
I can't really blame turkers for approaching their work in a clever way. Instead, I lay the blame on Amazon, for making counterproductive behavior so easy.
It's especially galling because it would be very easy to fix the problem. On the HIT design page, they should add a "Turkers can only accept HITs in the order they're presented" flag to each batch. For tasks with this flag checked, turkers would be shown one HIT at a time. They'd be unable to view or accept others in the batch until they'd completed the HIT in front of them***. This would effectively deny turkers control over which HITs they choose to do within a batch****. It would end the party for skimmers, but make the market more efficient overall. A simple tweak to the market -- problem solved.
How about it, Amazon?
* You can think about the social deadweight loss from skimming like this:
Let T be the total time all workers spend completing HITs. Skimming doesn't change T -- the total amount of task work is constant. But skimming itself is time-consuming. Let S be the deadweight loss due to skimming on a given batch: the total time workers spend skimming rather than completing HITs. Like T, the total wage for a given batch is also constant. Call it W.
In aggregate, the effective hourly wage for the whole batch without skimming is W/T. With any amount of skimming (S > 0), it is strictly lower: W/(T+S). So although skimming may improve the hourly wage of the most aggressive cherry pickers, it always lowers the average hourly wage across the mturk market as a whole.
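To put numbers on the W/T versus W/(T+S) comparison, here is a minimal Python sketch. The function name and the figures (a $100 batch, 20 hours of task work, 2 hours of skimming) are made up for illustration; they are not measurements from any real batch.

```python
def batch_hourly_wage(total_wage, task_hours, skim_hours=0.0):
    """Aggregate hourly wage for a batch: W / (T + S)."""
    if task_hours <= 0:
        raise ValueError("task_hours (T) must be positive")
    if skim_hours < 0:
        raise ValueError("skim_hours (S) cannot be negative")
    return total_wage / (task_hours + skim_hours)


# Hypothetical numbers, chosen only to illustrate the formula.
W = 100.0  # total wage paid for the batch, in dollars
T = 20.0   # hours of actual task work
S = 2.0    # hours lost to skimming

no_skim = batch_hourly_wage(W, T)       # W / T
with_skim = batch_hourly_wage(W, T, S)  # W / (T + S)

print(f"without skimming: ${no_skim:.2f}/hour")
print(f"with skimming:    ${with_skim:.2f}/hour")
```

Because W and T are fixed, any positive S can only push the aggregate wage down, no matter how the gains are split among individual workers.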
** Yes, yes -- I know that this is not an acid test: there are other explanations for hourly rates that decline over the life of a task. Still, it's good corroborating evidence for an explanation that makes a lot of sense to begin with.
*** Only viewing one HIT at a time might make it harder for turkers to get a sense of what a given batch is like. There's a simple fix for this as well: allow turkers to see the next k tasks, where k is a small number chosen by the requester. This might make it harder to build a RESTful interface for turkers, though. I haven't thought it through in detail.
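For what it's worth, the ordered-batch-with-preview idea can be sketched in a few lines of Python. Everything here is hypothetical (the class name, the method names, the `lookahead` parameter standing in for k); it is not how mturk works or an API Amazon offers, and it ignores the real-world problem of many workers sharing one batch.

```python
from collections import deque


class SequentialBatch:
    """Hypothetical model of a batch whose HITs must be accepted in order."""

    def __init__(self, hit_ids, lookahead=0):
        if lookahead < 0:
            raise ValueError("lookahead (k) must be non-negative")
        self._queue = deque(hit_ids)
        self._lookahead = lookahead

    def preview(self):
        """Return the current HIT plus the next k HITs, for viewing only."""
        return list(self._queue)[: 1 + self._lookahead]

    def accept_next(self):
        """Hand out the HIT at the front of the queue; no other choice exists."""
        if not self._queue:
            return None
        return self._queue.popleft()


batch = SequentialBatch(["hit-1", "hit-2", "hit-3", "hit-4"], lookahead=2)
first_view = batch.preview()     # current HIT plus the next two
accepted = batch.accept_next()   # always the front of the queue
second_view = batch.preview()    # the window slides forward by one
```

The point of the design is that `preview` never affects what `accept_next` returns: a turker can look ahead a little, but can't pick.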
**** It's possible that requesters would abuse this power by doing a bait-and-switch: showing easy HITs first and then making them more difficult once workers have invested in learning the task. This seems like a minor concern -- if the tasks get tough or boring, turkers can always vote with their feet. But if we're worried about it, there's an easy fix here as well: take control of the HIT sequence away from requesters, just like we took it away from workers. It would be very easy to randomize the order of tasks when the "no skimming" box is checked. Or allow requesters to click a separate "randomize tasks" box, with Amazon acting as a credible intermediary for the transaction.
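The randomization step really is as simple as it sounds. Here's a rough Python sketch, assuming the intermediary (not the requester) picks the seed; the function name and the seed value are made up for illustration.

```python
import random


def randomized_order(hit_ids, seed):
    """Return a shuffled copy of hit_ids; the requester's list is untouched."""
    rng = random.Random(seed)  # private generator, so the seed fully determines the order
    order = list(hit_ids)
    rng.shuffle(order)
    return order


requester_order = ["hit-1", "hit-2", "hit-3", "hit-4", "hit-5"]

# Hypothetical seed held by the intermediary, never by the requester.
served_order = randomized_order(requester_order, seed=2718)
replayed_order = randomized_order(requester_order, seed=2718)
```

Using a fixed seed per batch means the served order can be reproduced later if anyone disputes it, while the requester still has no say in which HITs come first.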