This tutorial is the first of several about Amazon Mechanical Turk (MTurk) and Boto, a Python interface to Amazon Web Services.
When I started writing Python scripts to automate some processes with Amazon MTurk, it was hard to find enough information about using Boto and about MTurk itself. For this reason I want to make things easier for other developers who, like me some time ago, are just starting to deal with Amazon MTurk.
Let’s start from the beginning. What you need is:
Boto library (2.0b4): you can install it with easy_install or download the package from the GitHub page.
Amazon Web Services keys: create an account on aws.amazon.com, or log in if you already have one, then go to your account admin panel and open the security settings.
On that page, scroll to the Access Credentials section and note down the Access Key ID and the Secret Access Key.
We are almost ready to use these keys with Boto, but first we have to log in with the AWS account at https://requestersandbox.mturk.com/ to create the MTurk Sandbox account.
OK, now we can write some code to test whether our keys work and whether we can connect to MTurk.
The code below connects to the MTurk API and gets the account balance.
    from boto.mturk.connection import MTurkConnection

    ACCESS_ID = 'your access key'
    SECRET_KEY = 'your secret key'
    HOST = 'mechanicalturk.sandbox.amazonaws.com'

    mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                          aws_secret_access_key=SECRET_KEY,
                          host=HOST)

    print(mtc.get_account_balance())
The expected result is a list with one value, normally $10,000.00 (I never see numbers like this in my bank account).
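The connection settings used in the code above can also be gathered by a small helper. What follows is only a sketch: the function name `mturk_settings` and its `sandbox` flag are made up for this example, and it assumes the keys are stored in the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables. It only builds the keyword arguments, so it runs even without Boto installed.

```python
import os

SANDBOX_HOST = 'mechanicalturk.sandbox.amazonaws.com'
PRODUCTION_HOST = 'mechanicalturk.amazonaws.com'


def mturk_settings(sandbox=True, environ=None):
    """Return keyword arguments for MTurkConnection (hypothetical helper)."""
    # Assumption: the keys live in these two environment variables.
    environ = os.environ if environ is None else environ
    return {
        'aws_access_key_id': environ['AWS_ACCESS_KEY_ID'],
        'aws_secret_access_key': environ['AWS_SECRET_ACCESS_KEY'],
        # Default to the sandbox so that test HITs never cost real money.
        'host': SANDBOX_HOST if sandbox else PRODUCTION_HOST,
    }
```

With Boto installed, the connection would then be created with `MTurkConnection(**mturk_settings())`.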
If everything works, we can move on to creating an MTurk HIT.
Basically, a HIT is a question or a collection of questions. I strongly suggest you read about the HIT Data Structure before continuing with this tutorial.
Let’s create our first HIT with two questions:
one mandatory with a list of choices, and one optional with a free-text answer.
    from boto.mturk.connection import MTurkConnection
    from boto.mturk.question import (QuestionContent, Question, QuestionForm,
                                     Overview, AnswerSpecification,
                                     SelectionAnswer, FormattedContent,
                                     FreeTextAnswer)

    ACCESS_ID = 'your access key'
    SECRET_KEY = 'your secret key'
    HOST = 'mechanicalturk.sandbox.amazonaws.com'

    mtc = MTurkConnection(aws_access_key_id=ACCESS_ID,
                          aws_secret_access_key=SECRET_KEY,
                          host=HOST)

    title = 'Give your opinion about a website'
    description = ('Visit a website and give us your opinion about'
                   ' the design and also some personal comments')
    keywords = 'website, rating, opinions'

    # (text shown to the worker, identifier stored with the answer)
    ratings = [('Very Bad', '-2'),
               ('Bad', '-1'),
               ('Not bad', '0'),
               ('Good', '1'),
               ('Very Good', '2')]

    #--------------- BUILD OVERVIEW -------------------

    overview = Overview()
    overview.append_field('Title', 'Give your opinion on this website')
    overview.append(FormattedContent('<a target="_blank"'
                                     ' href="http://www.toforge.com">'
                                     ' Mauro Rocco Personal Forge</a>'))

    #--------------- BUILD QUESTION 1 -------------------

    qc1 = QuestionContent()
    qc1.append_field('Title', 'How does the design look?')

    fta1 = SelectionAnswer(min=1, max=1, style='dropdown',
                           selections=ratings,
                           type='text',
                           other=False)

    q1 = Question(identifier='design',
                  content=qc1,
                  answer_spec=AnswerSpecification(fta1),
                  is_required=True)

    #--------------- BUILD QUESTION 2 -------------------

    qc2 = QuestionContent()
    qc2.append_field('Title', 'Your personal comments')

    fta2 = FreeTextAnswer()

    q2 = Question(identifier='comments',
                  content=qc2,
                  answer_spec=AnswerSpecification(fta2))

    #--------------- BUILD THE QUESTION FORM -------------------

    question_form = QuestionForm()
    question_form.append(overview)
    question_form.append(q1)
    question_form.append(q2)

    #--------------- CREATE THE HIT -------------------

    mtc.create_hit(questions=question_form,
                   max_assignments=1,
                   title=title,
                   description=description,
                   keywords=keywords,
                   duration=60*5,   # seconds, so 5 minutes
                   reward=0.05)
If you don’t see any output, everything worked. To be sure, go to https://workersandbox.mturk.com/ and check that your HIT is in the list (it can take some time to appear; you can also check the requester sandbox admin panel).
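The ratings list passed to SelectionAnswer in the code above is a plain list of (label, value) pairs. As a sketch, such a list can be generated from the labels alone, which avoids typing the same value twice by mistake. The helper name `centered_scale` is made up for this example and is not part of Boto.

```python
def centered_scale(labels):
    """Pair each label with a string value centered on zero.

    Hypothetical helper: with five labels the values run from '-2' to '2'.
    It expects an odd number of labels so that the middle one gets '0'.
    """
    if len(labels) % 2 == 0:
        raise ValueError('expected an odd number of labels')
    middle = len(labels) // 2
    return [(label, str(position - middle))
            for position, label in enumerate(labels)]


ratings = centered_scale(['Very Bad', 'Bad', 'Not bad', 'Good', 'Very Good'])
```

The resulting list can be passed as the selections argument exactly like the hand-written one.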
The code is fairly self-explanatory, but here is a short description of each part, with links to the Boto docs.
Overview: The overview is free-form content in the question form, which means it can be anything (binary content, HTML content, etc.). Remember that it isn’t a “question object”, just arbitrary content; in my example it is a link to the website.
Question: A question is made up mainly of the AnswerSpecification object, which specifies what kind of answer is rendered for the question, and the QuestionContent object. As with the Overview, the content can be text, binary content, HTML content, etc.
QuestionForm: This is the container for all your questions and overviews; basically, it is an extension of the standard Python list.
create_hit method: This is the method that creates a HIT. To see which parameters it accepts, take a look at the docs.
A small tip: the question and questions parameters are different. The first accepts a Question object, for a HIT with just one question; the second accepts a QuestionForm.
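The difference between the two parameters can be sketched with a small wrapper. This is an illustration, not part of Boto: the name `create_hit_for` is made up, and `RecordingConnection` is a stand-in that only records the arguments, so the sketch runs without network access. It relies on QuestionForm being an extension of the standard list, as described above; plain lists and strings stand in for the real objects here.

```python
def create_hit_for(mtc, content, **hit_options):
    """Call mtc.create_hit with `question` or `questions` as appropriate.

    `content` is either a single Question or a QuestionForm. Because a
    QuestionForm extends the standard list, isinstance() tells them apart.
    """
    if isinstance(content, list):
        return mtc.create_hit(questions=content, **hit_options)
    return mtc.create_hit(question=content, **hit_options)


class RecordingConnection(object):
    """Stand-in for MTurkConnection: it only records keyword arguments."""

    def __init__(self):
        self.calls = []

    def create_hit(self, **kwargs):
        self.calls.append(kwargs)


fake = RecordingConnection()
# A list stands in for a QuestionForm, a string for a single Question.
create_hit_for(fake, ['q1', 'q2'], title='Two questions', reward=0.05)
create_hit_for(fake, 'q1', title='One question', reward=0.05)
```

With a real connection, passing the question_form from the example would take the questions branch, while passing q1 alone would take the question branch.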
Well, now you should be able to create a HIT. You can also work on your own HITs in the worker sandbox to see how they behave and to have some results to fetch.
In the next tutorial I will talk about fetching results from your HITs and approving or rejecting payments to the workers.
See you soon