Boto Mturk Tutorial: Fetch results and pay workers

This is another tutorial in the mturk series. In this one I will explain how to fetch the ready results from mturk through Python boto and how to approve or reject payments to the workers.
Before continuing, I suggest you read my first tutorial about boto and mturk if you haven’t already.

To have a good test case, I suggest you publish some HITs on the mturk sandbox and complete them through the workers sandbox, so that you will have some results ready to be fetched.
The protagonist of this tutorial is the method get_reviewable_hits:

get_reviewable_hits(hit_type=None, status='Reviewable', sort_by='Expiration', sort_direction='Ascending', page_size=10, page_number=1)

As you can understand from the name, this method fetches the HITs that have the status “Reviewable” or “Reviewing”, that is, all the HITs whose assignments (the number of answers required from different workers) are all completed, or that have expired.
As you can also see from the parameters, by default this method gives you back just the first 10 reviewable HITs. The maximum page size is 100, so if you have more than 100 HITs ready you have to call this method more than once with an incrementing page number.
The first thing we do is write a function that fetches all reviewable HITs. It accepts an mturk connection object as its only parameter.

def get_all_reviewable_hits(mtc):
    page_size = 50
    hits = mtc.get_reviewable_hits(page_size=page_size)
    print "Total results to fetch %s " % hits.TotalNumResults
    print "Request hits page %i" % 1
    total_pages = float(hits.TotalNumResults)/page_size
    int_total= int(total_pages)
    if(total_pages-int_total>0):
        total_pages = int_total+1
    else:
        total_pages = int_total
    pn = 1
    while pn < total_pages:
        pn = pn + 1
        print "Request hits page %i" % pn
        temp_hits = mtc.get_reviewable_hits(page_size=page_size,page_number=pn)
        hits.extend(temp_hits)
    return hits
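
The page arithmetic above can also be done with integer ceiling division, which avoids the round trip through float. What follows is only a minimal sketch, written so that it runs on Python 3 as well as Python 2. FakePage and FakeConnection are hypothetical stand-ins, not part of boto: they imitate just the two things the loop relies on (a list-like result carrying TotalNumResults, and the page_size/page_number parameters), so the loop can be tried without an mturk account. Treating TotalNumResults as a string is an assumption based on the float() conversion in the function above.

```python
class FakePage(list):
    """Hypothetical stand-in for the list-like result that boto returns."""
    TotalNumResults = "0"


class FakeConnection(object):
    """Hypothetical stand-in for MTurkConnection; serves items from memory."""

    def __init__(self, items):
        self.items = items

    def get_reviewable_hits(self, page_size=10, page_number=1):
        start = (page_number - 1) * page_size
        page = FakePage(self.items[start:start + page_size])
        # Assumption: the total arrives as text, as the float() call above suggests.
        page.TotalNumResults = str(len(self.items))
        return page


def count_pages(total_results, page_size):
    # Ceiling division with integers only: 101 results at 50 per page -> 3 pages.
    return (total_results + page_size - 1) // page_size


def get_all_reviewable_hits(mtc, page_size=50):
    hits = mtc.get_reviewable_hits(page_size=page_size)
    total_pages = count_pages(int(hits.TotalNumResults), page_size)
    # Page 1 is already fetched, so continue from page 2.
    for page_number in range(2, total_pages + 1):
        hits.extend(mtc.get_reviewable_hits(page_size=page_size,
                                            page_number=page_number))
    return hits


all_hits = get_all_reviewable_hits(FakeConnection(list(range(120))))
print(len(all_hits))
```

With 120 fake items and a page size of 50, the function requests three pages in total. Passing a real connection object in place of FakeConnection should work the same way, as long as the result behaves as described above.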

The list returned by the function is a list of boto HIT objects.
These objects don’t contain the assignments; you have to call another method to get the assignments for a particular HIT id.
The next step is to iterate through this list and, for each HIT, call the method get_assignments(hit_id)

This method will return all the answers to your HITs.
Below is the complete script that prints all the assignments of your HITs to the screen.

from boto.mturk.connection import MTurkConnection

ACCESS_ID ='your access id'
SECRET_KEY = 'your key'
HOST = 'mechanicalturk.sandbox.amazonaws.com'

def get_all_reviewable_hits(mtc):
    page_size = 50
    hits = mtc.get_reviewable_hits(page_size=page_size)
    print "Total results to fetch %s " % hits.TotalNumResults
    print "Request hits page %i" % 1
    total_pages = float(hits.TotalNumResults)/page_size
    int_total= int(total_pages)
    if(total_pages-int_total>0):
        total_pages = int_total+1
    else:
        total_pages = int_total
    pn = 1
    while pn < total_pages:
        pn = pn + 1
        print "Request hits page %i" % pn
        temp_hits = mtc.get_reviewable_hits(page_size=page_size,page_number=pn)
        hits.extend(temp_hits)
    return hits

mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                      aws_secret_access_key=SECRET_KEY,
                      host=HOST)

hits = get_all_reviewable_hits(mtc)

for hit in hits:
    assignments = mtc.get_assignments(hit.HITId)
    for assignment in assignments:
        print "Answers of the worker %s" % assignment.WorkerId
        for question_form_answer in assignment.answers[0]:
            for key, value in question_form_answer.fields:
                print "%s: %s" % (key,value)
        print "--------------------"

As you can see, the script calls the get_assignments method for each HIT id and then iterates through the result to fetch the answers.
On line 36 of the script you see assignment.answers[0]; maybe you are thinking “why not iterate through all the answers?”
To give a clear explanation, let’s first fix some definitions that will be used in the following lines.

  • A “question form answer” is the single answer to a single question of your form.
  • An “answer” element is the set of all the “question form answers” of your QuestionForm.
  • An “assignment” is the set of all the “answers” of the same worker.

In practice each worker can give just one “answer” to the HIT, so the assignment will always contain just one “answer”.
The “answers” element is just a reflection of the XML structure; boto translates it into an array of one element.
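
To make the nesting concrete, here is a minimal sketch that collects one worker’s answers into a dictionary. QuestionFormAnswer and Assignment are hypothetical stand-ins built with namedtuple; they mirror only the structure described above, and they assume that fields holds (key, value) pairs, as the loop in the script does. The real boto objects carry more attributes, and the layout of fields may differ between boto versions.

```python
from collections import namedtuple

# Hypothetical stand-ins that mirror only the nesting described above.
QuestionFormAnswer = namedtuple("QuestionFormAnswer", ["fields"])
Assignment = namedtuple("Assignment", ["WorkerId", "answers"])


def answers_as_dict(assignment):
    """Collect every (key, value) pair of the single "answer" into one dict."""
    collected = {}
    # answers[0] is the only "answer" element; it holds the question form answers.
    for question_form_answer in assignment.answers[0]:
        for key, value in question_form_answer.fields:
            collected[key] = value
    return collected


sample = Assignment(
    WorkerId="WORKER_1",
    answers=[[
        QuestionFormAnswer(fields=[("colour", "red")]),
        QuestionFormAnswer(fields=[("comment", "looks fine")]),
    ]],
)
print(answers_as_dict(sample))
```

The outer list in answers has exactly one element, which is why indexing with [0] is enough.
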
If this explanation is clear, you just need to know which methods to use to approve and reject payments to the workers.
Approving and rejecting are done at the level of the “assignment”; in fact, both methods accept the assignment id as a parameter.

approve_assignment(assignment_id, feedback=None)

reject_assignment(assignment_id, feedback=None)

Both methods also accept a feedback string. This is the message the workers will receive as an explanation for the approved or rejected assignment, so be kind :-D .
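
As an illustration of how the two calls could be combined, here is a minimal sketch of a helper that approves or rejects one assignment and always sends a feedback message. RecordingConnection is a hypothetical stand-in that only records the calls, so the example runs without touching mturk. The helper review_assignment and the feedback texts are made up for this example; only the two method signatures come from the text above.

```python
class RecordingConnection(object):
    """Hypothetical stand-in for MTurkConnection that only records calls."""

    def __init__(self):
        self.calls = []

    def approve_assignment(self, assignment_id, feedback=None):
        self.calls.append(("approve", assignment_id, feedback))

    def reject_assignment(self, assignment_id, feedback=None):
        self.calls.append(("reject", assignment_id, feedback))


def review_assignment(mtc, assignment_id, is_acceptable):
    """Approve or reject one assignment, always with a short feedback message."""
    if is_acceptable:
        mtc.approve_assignment(assignment_id,
                               feedback="Thank you for your work.")
    else:
        mtc.reject_assignment(assignment_id,
                              feedback="The answer was left empty, "
                                       "so it could not be accepted.")


fake_mtc = RecordingConnection()
review_assignment(fake_mtc, "ASSIGNMENT_A", True)
review_assignment(fake_mtc, "ASSIGNMENT_B", False)
for call in fake_mtc.calls:
    print(call)
```

How you decide whether an answer is acceptable is up to you; the boolean here is only a placeholder for that check.
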
When you no longer need a HIT you can “delete” it from mturk by calling the method

disable_hit(hit_id, response_groups=None)

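I suggest you read the documentation of the disable_hit method before using it: besides removing the HIT, it approves every submitted assignment that has not already been approved or rejected, and disposes of the HIT and its assignment data.
I leave you with an edited version of the loop that pays all the workers and disables the HITs.
See you soon ;-)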

for hit in hits:
    assignments = mtc.get_assignments(hit.HITId)
    for assignment in assignments:
        print "Answers of the worker %s" % assignment.WorkerId
        for question_form_answer in assignment.answers[0]:
            for key, value in question_form_answer.fields:
                print "%s: %s" % (key,value)
        mtc.approve_assignment(assignment.AssignmentId)
        print "--------------------"
    mtc.disable_hit(hit.HITId)

  • Trevor

    Thanks for the example(s)! This has been very helpful. Python is the first language I’ve studied intensely, and MechTurk is my first experience working with an API, so your explanation has been very useful in understanding a lot of the process.

    Thanks again!

    -TA