Hivemind Release — Handlebars, Agreement Checking, Tesseract, Editing & more

Riaz Karim
Published in Hivemind
Nov 11, 2018

We’re excited to announce the latest release of the Hivemind platform, which brings a host of new features.

Hivemind embraces Handlebars

With this release, you can now use the Handlebars syntax to template instance instructions and schemas. This means that you can upload many instances without repeating the instructions on each instance.

For example, if you set the instructions template on the task to:

Click on the following link: [{{Data.Name}}]({{Data.URL}}) and find {{Collect}}.

And you have the following instance data:

{
  "Data": {
    "URL": "https://hvmd.io",
    "Name": "Hivemind",
    "LastVisited": "2 July 2018"
  },
  "Collect": "the author of the most recent blog post"
}

Hivemind will render the instance as:

Click on the following link: Hivemind and find the author of the most recent blog post.

As you can see, you can combine this template with Markdown (used to render the link above) to get really neat instructions. You can also change the template by editing the task to change how the instance is displayed without having to cancel and recreate each one. Check out the relevant section in the docs for more information.
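To make the substitution step concrete, here is a minimal sketch in Python. It is not the Handlebars library and not Hivemind's implementation: it only handles plain `{{Dotted.Path}}` placeholders, and the `lookup` and `render` helpers are hypothetical names introduced for illustration.

```python
import re

# Matches {{ Some.Dotted.Path }} placeholders. Block helpers and other
# Handlebars features are deliberately not handled in this sketch.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def lookup(data, path):
    """Walk a dotted path such as 'Data.Name' through nested dicts."""
    value = data
    for key in path.split("."):
        value = value[key]
    return value


def render(template, data):
    """Replace every {{path}} placeholder with its value from data."""
    return PLACEHOLDER.sub(lambda m: str(lookup(data, m.group(1))), template)


template = (
    "Click on the following link: [{{Data.Name}}]({{Data.URL}}) "
    "and find {{Collect}}."
)
instance = {
    "Data": {
        "URL": "https://hvmd.io",
        "Name": "Hivemind",
        "LastVisited": "2 July 2018",
    },
    "Collect": "the author of the most recent blog post",
}

rendered = render(template, instance)
print(rendered)
```

The output of this sketch is still Markdown; rendering the `[Hivemind](https://hvmd.io)` part as a clickable link is a separate step. Unused fields such as `LastVisited` are simply ignored.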

Advanced Agreement Checking

More advanced agreement checking is now available in Hivemind Studio. Options include comparing strings while ignoring case, spaces or symbols, and using Jaro-Winkler or Levenshtein distance to check agreement based on string similarity. There are also a number of options for dealing with the proximity of numerical values and for omitting fields completely from any agreement check. Check out the section in the docs or the ‘Data Quality’ tab in Studio for more details.
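As a rough illustration of what these comparisons involve, the sketch below shows a normalised string comparison, a Levenshtein-distance threshold and a numerical tolerance check in Python. The function names, parameters and thresholds are assumptions made for this example; they are not Hivemind's option names, and Jaro-Winkler is left out for brevity.

```python
import re


def normalise(text, ignore_case=True, ignore_spaces=True, ignore_symbols=True):
    """Strip the differences that should not count as disagreement."""
    if ignore_case:
        text = text.lower()
    if ignore_symbols:
        # Drop anything that is not a letter, digit, underscore or whitespace.
        text = re.sub(r"[^\w\s]", "", text)
    if ignore_spaces:
        text = re.sub(r"\s+", "", text)
    return text


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def strings_agree(a, b, max_distance=0):
    """Two answers agree if their normalised forms are close enough."""
    return levenshtein(normalise(a), normalise(b)) <= max_distance


def numbers_agree(x, y, tolerance=0.0):
    """Two numerical answers agree if they differ by at most tolerance."""
    return abs(x - y) <= tolerance


print(strings_agree("Hive Mind!", "hivemind"))
print(strings_agree("Hivemind", "Hivemnid", max_distance=2))
print(numbers_agree(10.0, 10.4, tolerance=0.5))
```

With `max_distance=0` the string check reduces to an exact match after normalisation; raising it tolerates small typos at the cost of occasionally accepting answers that are genuinely different.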

Active Tasks are now editable

Certain fields on a task are now editable even after the task has been sent to contributors. You can edit the instructions, templates, output and agreement options, among others. Remember: changing these fields can have unintended ramifications for your data quality.

Extracting text from images with Tesseract

We have integrated the excellent Tesseract OCR library as a Hivemind agent to help when building workflows to digitise images of text. For consistently formatted text, Tesseract generally provides good results across a broad range of languages. However, OCR output quality may suffer when the document formatting is inconsistent or the image is of poor quality. In these cases, an augmented workflow, consisting of an initial task completed by the Tesseract agent followed by an OCR clean-up task completed by a human, can be used to deliver high-quality results. Look out for the ‘OCR’ tile on the Studio Task Creation page to get started.

Support for Locales and Qualifications on MTurk

You can now access the full power of MTurk qualifications to target specific pools of workers by qualification or locale. These are available on the ‘Settings’ tab when you create an MTurk task. See the documentation for further information.

Breaking API Change: Task Instructions -> Documentation

We don’t take breaking API changes lightly and only make them to address significant issues in usability or functionality. In response to feedback that the naming of the various Markdown fields was potentially confusing, we have renamed ‘instructions’ on a task to ‘documentation’. We feel this gives a clearer structure: the task contains the higher-level documentation, and each instance has its own specific instructions.
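If you build task-creation payloads in code, the rename can be handled in one place. The sketch below is only an illustration in Python: the payload shape and the `name` field are assumptions, and `migrate_task_payload` is a hypothetical helper, not part of any Hivemind client.

```python
def migrate_task_payload(payload):
    """Return a copy of payload with 'instructions' renamed to 'documentation'."""
    migrated = dict(payload)
    if "instructions" in migrated:
        if "documentation" in migrated:
            # Refuse to guess which of the two values should win.
            raise ValueError("payload has both 'instructions' and 'documentation'")
        migrated["documentation"] = migrated.pop("instructions")
    return migrated


# Hypothetical payload written against the old field name.
old_payload = {"name": "Find blog authors", "instructions": "Read the brief first."}
new_payload = migrate_task_payload(old_payload)
print(new_payload)
```

The helper copies the payload rather than mutating it, so existing callers that still hold the old dictionary are unaffected.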

Changelog

  • Launch of https://docs.hvmd.io: the official Hivemind Docs.
  • A new OCR Agent to extract text from images is now available in Studio and through the AGENT-OCR qualification on the API.
  • Active tasks are now editable. Certain fields have been made available for editing after a task has been created/started. See the task page in Studio for the ‘Edit’ button.
  • Advanced agreement checking is now available through the platform providing access to numerical, set and string-based operations. See: https://docs.hvmd.io/#agreementoptions
  • MTurk Qualifications are now supported via Studio. Settable on the ‘Settings’ tab when you create an MTurk task.
  • Contributor analytics have been moved to Workbench for Lead Contributors. Consequently, Studio access has been removed for Leads.
  • Instance templates for data-driven instances on tasks. You can now use Handlebars templates in your task’s documentation and instance instructions. See: https://docs.hvmd.io/#using-instance-templates
  • Templating for instance schema overrides. Handlebars templating has also been made available for override-schemas on instances. See: https://docs.hvmd.io/#adding-instances
  • Breaking API Change: On task creation, ‘instructions’ has been renamed to ‘documentation’ on the API.
