Generic web-based annotation tool for Mechanical Turk (v0.1)
By Alexander Sorokin, University of Illinois at Urbana-Champaign.
sorokin2@uiuc.edu; syrnick@gmail.com
Our tools provide a general interface for annotating images. We currently support simple binary questions, bounding boxes and polygons.
Annotation process overview
Setting up the annotation process is simple. We set up a web server and put the web part of the tools and the data there. We define the annotation task in an XML file and place it on the web server as well. We install the Mechanical Turk command line tools and additional helpers. After that we submit the images to Mechanical Turk and get the results back. The results are converted into XML files, which can be read by Matlab or displayed on the web.
Download
- Amazon Mechanical Turk Command Line Tools - somewhere....
- MTurk CMD helpers
- Full demo web pack
- Full demo local pack
- Web code only (for deployment). Coming later; for now, simply copy the demo web pack.
- Flash sources. Only for those who want to dig through it.
Example setup
We use labeling human heads as a running example. We assume that we have two machines: a local one (e.g. a laptop) and the web server. Let's call the web server http://visual-hits.s3.amazonaws.com/demo-heads/. The web server only serves static web pages, JavaScript and Flash files. For example, Amazon S3 storage can be used immediately as such a web server.
The code and the data have to be placed in specific locations relative to our root folder:
Web location | Description |
---|---|
code/ | Contains all Flash, JavaScript and HTML files. The only files that require editing here are instruction***.html |
tasks/ | Contains XML files defining the annotation task. |
frames/ | Contains all images that we annotate. All frames are stored in the format "video/frame.jpg". Note that all frames must be JPEG, and the extension is automatically added to the frame name. |
annotations/ | Contains existing annotations that we want to display on the web. |
In our example, to annotate the frame frame001 from sequence sqn01, we will place the file frame001.jpg into
http://visual-hits.s3.amazonaws.com/demo-heads/frames/sqn01/frame001.jpg
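The mapping from a sequence and frame name to its web location can be sketched in a few lines of Python. This is only an illustration of the layout described above; the helper name frame_url is hypothetical and not part of the tools.

```python
# Root of the demo web server, as used in this example setup.
BASE_URL = "http://visual-hits.s3.amazonaws.com/demo-heads/"

def frame_url(sequence, frame, base_url=BASE_URL):
    """Return the URL under frames/ where <sequence>/<frame>.jpg must be served.

    The ".jpg" extension is appended here because the tools add it
    automatically to the frame name.
    """
    return f"{base_url.rstrip('/')}/frames/{sequence}/{frame}.jpg"

url = frame_url("sqn01", "frame001")
print(url)
```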
Now let's set up our local machine.
- Register at Amazon Mechanical Turk (consider depositing $1-$5 there for testing).
- Download and install the Amazon Command Line Tools for Mechanical Turk. When you are done with the installation, you should be able to run getBalance.sh.
- Download and install the command-line helpers for MTurk (see the wrappers page for details). Run "source source.env" (after editing it).
- Extract demo-heads-local.tgz into ~/mturk_demo_faces and go there.
- Run MT_run.sh -sandbox and see the tasks you requested (these use my web server).
- There are several important files here:
File | Description |
---|---|
workload.input | This tab-separated file lists the images that we want to annotate. In general, this file lists variable parameters of the tasks. The first row provides the names of the columns. |
workload.question | Contains the URL of the page that renders the question. Notice how ${Frame} is used: its value varies for each task. |
workload.properties | This file contains the Title, Description, Pay and other important information. We revisit it later. |
workload.results | The file containing all results downloaded from Mechanical Turk. It is parsed by parse_results.py into a collection of XML files. |
- Go to the sandbox and do your assignments.
- Download results: MT_getResults.sh -sandbox
- Run parser: ./parse_results.py run/ . This will create folder run/results with xml files containing all data from the interface.
- Approve and delete results (MT_approveAndDeleteResults.sh approves all, MT_only_approve.sh uses workload.approve_file to approve assignments, MT_only_reject.sh uses workload.reject_file to reject assignments). Of course, all should be called with "-sandbox" now.
- To switch to the production system, simply omit "-sandbox".
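As a rough illustration of the workload.input file described above, the following Python sketch writes a tab-separated file whose first row names the column and whose remaining rows list one frame each. The column name Frame matches ${Frame} in workload.question; the frame values and the "sequence/frame" form without extension are assumptions based on the frames/ layout, so check the demo pack for the exact format. The function name write_workload_input is hypothetical.

```python
import csv
import io

def write_workload_input(frames, out, column="Frame"):
    """Write a tab-separated workload: a header row, then one frame per row."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([column])          # first row: column names
    for frame in frames:
        writer.writerow([frame])       # one task per row

# Assumed frame identifiers, written without the .jpg extension.
buf = io.StringIO()
write_workload_input(["sqn01/frame001", "sqn01/frame002"], buf)
workload_text = buf.getvalue()
print(workload_text)
```

To produce a real file, pass an object opened with open("workload.input", "w", newline="") instead of the in-memory buffer.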
Switching web server
- Replicate directory structure: code/ frames/ tasks/ annotations/
- Edit workload.question and parse_results.py.
- That's it.
Running new task
- Make sure you understand and update workload.properties (task title, pay, number of assignments).
- Check the instructions (they should be as clear as possible).
- Test a couple of examples in the sandbox.
- Run in production and test a couple of examples. If they don't work, remove them immediately.
Task definition XML
A task definition consists of targets. Each target has a name and an annotation type. We have 3 types of annotations:
- Binary decisions: <annotation type="presence"/>
- Bounding box: <annotation type="bbox"/>
- Polygon: <annotation type="outline"/>
The target is specified as
<target name="TargetName"> ...annotation type... </target>
TargetName becomes the title of a button.
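A target element of this shape can be generated programmatically. The sketch below uses Python's standard xml.etree.ElementTree and only builds a single target; the enclosing root element of the task file is not shown here because its name is not given above. The function name make_target and the target name "Head" are hypothetical.

```python
import xml.etree.ElementTree as ET

# The three annotation types listed above.
ANNOTATION_TYPES = {"presence", "bbox", "outline"}

def make_target(name, annotation_type):
    """Build <target name="..."><annotation type="..."/></target>."""
    if annotation_type not in ANNOTATION_TYPES:
        raise ValueError(f"unknown annotation type: {annotation_type!r}")
    target = ET.Element("target", {"name": name})
    ET.SubElement(target, "annotation", {"type": annotation_type})
    return target

# "Head" is an illustrative target name; it would become the button title.
xml_text = ET.tostring(make_target("Head", "bbox"), encoding="unicode")
print(xml_text)
```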
Notes
- All data comes back in the user-interface coordinate frame. The image is resized so that its larger side is 500 pixels and is then centered in a 500x500 box. All coordinates are reported in that 500x500 box. The easy way to get image coordinates is to apply the following matrix to the UI coordinates (x, y, 1):
max(imW,imH)/500 | 0 | -max(0,(imH-imW)/2) |
0 | max(imW,imH)/500 | -max(0,(imW-imH)/2) |
0 | 0 | 1 |
In the case of 640x480 images, it's
1.28 | 0 | 0 |
0 | 1.28 | -80 |
0 | 0 | 1 |
Note that, due to the vector nature of Flash, the 500x500 box is an arbitrarily chosen coordinate system. It is unlikely to change in the near future.
- To parse the XML in Matlab, I use XMLTree. Unfortunately, it doesn't support attributes nicely. Here's an example of how I deal with it.
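The coordinate conversion from the first note can be sketched in Python as follows. It builds the matrix given above and applies it to a point in the 500x500 interface box. The function names ui_to_image_matrix and ui_to_image are hypothetical and not part of the tools.

```python
def ui_to_image_matrix(im_w, im_h, box=500.0):
    """Return the 3x3 matrix mapping 500x500 UI coordinates to image pixels."""
    scale = max(im_w, im_h) / box
    return [
        [scale, 0.0, -max(0.0, (im_h - im_w) / 2.0)],
        [0.0, scale, -max(0.0, (im_w - im_h) / 2.0)],
        [0.0, 0.0, 1.0],
    ]

def ui_to_image(x_ui, y_ui, im_w, im_h):
    """Apply the matrix to the homogeneous point (x_ui, y_ui, 1)."""
    m = ui_to_image_matrix(im_w, im_h)
    x = m[0][0] * x_ui + m[0][1] * y_ui + m[0][2]
    y = m[1][0] * x_ui + m[1][1] * y_ui + m[1][2]
    return x, y

# For a 640x480 image, the center of the UI box should map
# to the center of the image.
matrix = ui_to_image_matrix(640, 480)
center = ui_to_image(250, 250, 640, 480)
print(matrix, center)
```

For a 640x480 image the scale is 1.28 and the vertical offset is -80, in agreement with the example matrix above.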
Using Flash to display the annotations
The Flash interface can display annotations on top of images. parse_results.py creates results.html with proper links to the displays. However, for the links to work, you need to move all XML files to annotations/<task_name>/ on the web server. See the -web pack for an example or here.
Coming soon...
- Grading made easy (-ier)
- Single markers
- Label and give a name
- Map buttons on the right to keyboard
- Make bounding box nicer (e.g. click-and-drag)
- Make outline more like in GWAP - continuous drawing
Let me know if you have preferences over these features.
Also, if you want and can program some of those in Flash, any help is appreciated.