Boto Mturk Tutorial: Create hits

This tutorial is the first of several about Mturk and Boto, a Python interface to Amazon Web Services.
When I started writing Python tasks to automate some processes with Amazon Mturk, it was hard to find enough information about Boto and about Mturk itself. For this reason I want to make things easier for other developers who, like me some time ago, are just starting to deal with Amazon Mturk.
Let’s start from the beginning. What you need is:

Boto library (2.0b4): you can install it with easy_install or download the package from the GitHub page.
Amazon Web Services keys: create an account, or log in if you already have one, on aws.amazon.com. After that go to your account admin panel and then to the security settings.
On that page, scroll to the Access credentials section and take note of the Access Key ID and Secret Access Key.

We are almost ready to use these keys with Boto, but first we have to log in with the AWS account on https://requestersandbox.mturk.com/ to create the Mturk Sandbox account.

OK, now we can write some code to test whether our keys work and whether we can connect to the Mturk engine.
Below is the code to connect to the Mturk API and get the account balance.

from boto.mturk.connection import MTurkConnection

ACCESS_ID = 'your access key'
SECRET_KEY = 'your secret key'
HOST = 'mechanicalturk.sandbox.amazonaws.com'

mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                      aws_secret_access_key=SECRET_KEY,
                      host=HOST)

print(mtc.get_account_balance())

The expected result is a list with one value, normally $10,000.00 (I never see numbers like this in my bank account :-( )
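
If you prefer not to hardcode the keys in the script, a minimal sketch like the one below may help. It is not part of Boto: the helper names mturk_settings and connect are my own, the environment variable names AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are an assumption about how you store your keys, and the production host is included only to show where the switch from the sandbox would happen.

```python
import os

SANDBOX_HOST = 'mechanicalturk.sandbox.amazonaws.com'
PRODUCTION_HOST = 'mechanicalturk.amazonaws.com'


def mturk_settings(sandbox=True, environ=None):
    """Return the keyword arguments to pass to MTurkConnection.

    The keys are read from environment variables (assumed names) so
    that they never end up inside the script itself.
    """
    environ = os.environ if environ is None else environ
    return {
        'aws_access_key_id': environ['AWS_ACCESS_KEY_ID'],
        'aws_secret_access_key': environ['AWS_SECRET_ACCESS_KEY'],
        'host': SANDBOX_HOST if sandbox else PRODUCTION_HOST,
    }


def connect(sandbox=True):
    """Open a connection; defaults to the sandbox to avoid real payments."""
    # Imported here so mturk_settings can be used without boto installed.
    from boto.mturk.connection import MTurkConnection
    return MTurkConnection(**mturk_settings(sandbox))
```

With the two variables exported in your shell, mtc = connect() should give a sandbox connection equivalent to the one above.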

If everything works fine we can start creating an Mturk HIT.
Basically a HIT is a question, or a collection of questions. I strongly suggest reading about the HIT Data Structure before continuing with this tutorial.

Let’s create our first HIT with 2 questions:
one mandatory, with a list of choices, and one optional, with a free-text answer.

from boto.mturk.connection import MTurkConnection
from boto.mturk.question import (QuestionContent, Question, QuestionForm,
                                 Overview, AnswerSpecification,
                                 SelectionAnswer, FormattedContent,
                                 FreeTextAnswer)

ACCESS_ID = 'your access key'
SECRET_KEY = 'your secret key'
HOST = 'mechanicalturk.sandbox.amazonaws.com'

mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                      aws_secret_access_key=SECRET_KEY,
                      host=HOST)

title = 'Give your opinion about a website'
description = ('Visit a website and give us your opinion about'
               ' the design and also some personal comments')
keywords = 'website, rating, opinions'

# (label shown to the worker, value stored in the answer)
ratings = [('Very Bad', '-2'),
           ('Bad', '-1'),
           ('Not bad', '0'),
           ('Good', '1'),
           ('Very Good', '2')]

#---------------  BUILD OVERVIEW -------------------

overview = Overview()
overview.append_field('Title', 'Give your opinion on this website')
overview.append(FormattedContent('<a target="_blank"'
                                 ' href="http://www.toforge.com">'
                                 ' Mauro Rocco Personal Forge</a>'))

#---------------  BUILD QUESTION 1 -------------------

qc1 = QuestionContent()
qc1.append_field('Title', 'How does the design look?')

fta1 = SelectionAnswer(min=1, max=1,style='dropdown',
                      selections=ratings,
                      type='text',
                      other=False)

q1 = Question(identifier='design',
              content=qc1,
              answer_spec=AnswerSpecification(fta1),
              is_required=True)

#---------------  BUILD QUESTION 2 -------------------

qc2 = QuestionContent()
qc2.append_field('Title','Your personal comments')

fta2 = FreeTextAnswer()

q2 = Question(identifier="comments",
              content=qc2,
              answer_spec=AnswerSpecification(fta2))

#--------------- BUILD THE QUESTION FORM -------------------

question_form = QuestionForm()
question_form.append(overview)
question_form.append(q1)
question_form.append(q2)

#--------------- CREATE THE HIT -------------------

mtc.create_hit(questions=question_form,
               max_assignments=1,
               title=title,
               description=description,
               keywords=keywords,
               duration = 60*5,
               reward=0.05)

If you don’t see any output, everything worked well. To be sure, navigate to https://workersandbox.mturk.com/ and check that your HIT is in the list (it can take some time to appear; you can also check in the requester sandbox admin panel).
The code is fairly self-explanatory, but I will describe the main pieces anyway, with links to the Boto docs.

Overview: The overview is free content inside the question form, which means it can be anything (binary content, HTML content, etc.). Remember that it isn’t a “question object”, just arbitrary content; in my example it is a link to the website.
http://boto.cloudhackers.com/ref/mturk.html#boto.mturk.question.Overview

Question: A question is mainly made up of an AnswerSpecification object, which specifies what kind of answer has to be rendered for the question, and a QuestionContent object. The content can be, as for the Overview, text, binary content, HTML content, etc.
http://boto.cloudhackers.com/ref/mturk.html#boto.mturk.question.Question
http://boto.cloudhackers.com/ref/mturk.html#boto.mturk.question.AnswerSpecification
http://boto.cloudhackers.com/ref/mturk.html#boto.mturk.question.QuestionContent
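
The selections given to SelectionAnswer in the example are (label, value) pairs, and if two choices share the same value you cannot tell them apart in the results. The check below is a small sketch of my own, not something Boto provides; check_selections is a hypothetical helper, and the ratings list assumes the intended scale runs from -2 to 2.

```python
def check_selections(selections):
    """Raise ValueError if two choices share the same label or value."""
    labels = [label for label, value in selections]
    values = [value for label, value in selections]
    for name, items in (('label', labels), ('value', values)):
        duplicates = sorted(set(i for i in items if items.count(i) > 1))
        if duplicates:
            raise ValueError('duplicate %s(s): %s'
                             % (name, ', '.join(str(d) for d in duplicates)))
    return selections


# (label shown to the worker, value stored in the answer)
ratings = [('Very Bad', '-2'),
           ('Bad', '-1'),
           ('Not bad', '0'),
           ('Good', '1'),
           ('Very Good', '2')]

# Passes silently; a repeated label or value would raise ValueError.
check_selections(ratings)
```

Because the helper returns the list it was given, it can be called inline, for example selections=check_selections(ratings).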

QuestionForm: This is the container for all your questions and overviews; basically it is an extension of the standard Python list.
http://boto.cloudhackers.com/ref/mturk.html#boto.mturk.question.QuestionForm

create_hit method: This is the method that creates a HIT. To see which parameters it accepts, take a look at the docs.
A little tip: the question and questions parameters are different. The first one accepts a Question object, for when you want to create a HIT with just one question; the second one accepts a QuestionForm.
http://boto.cloudhackers.com/ref/mturk.html#boto.mturk.connection.MTurkConnection.create_hit
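
To keep the question / questions distinction and the duration arithmetic in one place, you could build the arguments with a small helper before calling create_hit. This is only a sketch: hit_kwargs and its single_question flag are hypothetical names of mine, and it assumes that a numeric duration is expressed in seconds, as the 60*5 in the example suggests.

```python
def hit_kwargs(form_or_question, title, description, keywords,
               minutes, reward, max_assignments=1, single_question=False):
    """Build the keyword arguments for MTurkConnection.create_hit."""
    if minutes <= 0:
        raise ValueError('minutes must be positive')
    if reward < 0:
        raise ValueError('reward cannot be negative')
    if max_assignments < 1:
        raise ValueError('max_assignments must be at least 1')

    kwargs = {
        'title': title,
        'description': description,
        'keywords': keywords,
        'duration': int(minutes * 60),  # time allowed per assignment, in seconds
        'reward': reward,
        'max_assignments': max_assignments,
    }
    # 'question' takes one Question object, 'questions' takes a QuestionForm.
    kwargs['question' if single_question else 'questions'] = form_or_question
    return kwargs


# Usage, with the objects from the example above:
# mtc.create_hit(**hit_kwargs(question_form, title, description, keywords,
#                             minutes=5, reward=0.05))
```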

Well, now you should be able to create a HIT. You can also complete your own HITs on the worker sandbox to see how they work and to have some results to fetch.
In the next tutorial I will talk about fetching results from your HITs and approving or rejecting payments to the workers.
See you soon ;-)

  • http://twitter.com/hfuecks Harry Fuecks

    Thanks for a great tutorial Mauro.

    One small mystery – when running your script above that creates a hit, why isn’t the hit shown under “Batches in progress” in the requester sandbox?

    I see it OK when I “Manage Hits Individually” and in the Worker sandbox, just not in batches.

  • Mauro Rocco

    Yep, it is still a mystery, just look at this thread

    https://forums.aws.amazon.com/thread.jspa?threadID=24993

    It seems that only the HITs created through the website are shown in batches.

  • http://twitter.com/hfuecks Harry Fuecks

    Ah well. Thanks for the tip!

  • http://twitter.com/hfuecks Harry Fuecks

    One further question – how to get a submit button at the bottom of the form ( like this http://screencast.com/t/I30cr0NRx6 – from one of the sample templates )?

    Your example uses the default “Submit Hit” button – http://screencast.com/t/fjGf9JXE – and looking at the boto API, there doesn’t seem to be an easy way to change this.

    Many thanks.

  • Mauro Rocco

    Yep, it seems there is no way. Otherwise you can do whatever you want with an external question; I really like them because they are just an iframe that points to a URL on your server.

  • Joseph Turian

    How long does it typically take for your HIT to appear in the list?

  • Mauro Rocco

    Between 5 and 30 seconds in the SandBox system, normally the production one is faster.

  • Gopal

    Hi, do you know how to get the RequesterAnnotation attribute from the hit? doing hit.RequesterAnnotation doesn’t work.

  • Mauro Rocco

    @Gopal when I can’t find an attribute I normally use Python’s dir() function; this should help you see whether the attribute is there.
    Remember that RequesterAnnotation is visible only to the user that created the HIT.

  • Thomas

    Question: I get this back – how can I use https connection with boto?

    NonSecureRequest: This request must be made over a secure channel. You must use ‘https’ rather than ‘http’….

  • Mauro Rocco

    @Thomas, which version of boto are you using?
    By default the latest version of boto uses https

    https://github.com/boto/boto/commit/547d45997c6674a408ca732c1a5816596c352cb6

  • Amir

    HITLayoutId can be passed to allow using a form designed on the mturk site.

    http://docs.amazonwebservices.com/AWSMechTurk/2012-03-25/AWSMturkAPI/ApiReference_HITLayoutArticle.html has: “You can use either the HITLayoutId or the Question parameter when calling CreateHIT, but not both.”

    But boto’s MTurkConnection.create_hit does not seem to support adding HITLayoutId.
    So I am trying to fit the existing forms we have and pass them through a structure with Questions / QuestionForm etc. as in your sample.
    Extending MTurkConnection.create_hit and adding code to support HITLayoutId is also an option I am considering.
    Any solution which works would be excellent. Let me know if you got a way to solve this. I can post back here when it’s solved.

  • Mauro Rocco

    Hi @Amir
    Looking here http://boto.cloudhackers.com/en/latest/ref/mturk.html#boto.mturk.connection.MTurkConnection.create_hit (2.0.6 dev)
    it looks like the create_hit method accepts a hit_layout parameter that defaults to None.
    This parameter is in fact the HITLayoutId, as you can see from the source code here:

    https://github.com/boto/boto/blob/develop/boto/mturk/connection.py#L218

    So I think that this is what you need.

    Let me know

    Regards

  • Amir

    Hey Mauro,

    I can’t pass the HitLayoutId with the current 2.5.2 boto code.
    It is lacking the support you got on latest.
    Following code in create_hit() is missing:
    “if hit_layout is None:”
    and
    “params['HITLayoutId'] = hit_layout”

    I could not use the latest (2.6.0?) code, it didn’t work at all, so I had to use 2.5.2. (Something to do with security if I recall; I saw a few posts on that online, and one said that the previous version works.)

    Either way, thanks for pointing out this solution, I will try the 2.5.2 code and have my updated version of create_hit match your 2.6 code and see if it works.

  • Mauro Rocco

    @Amir Keep me posted on this please. Thank you