Dataturks On-prem: A fully self-hosted data annotation solution.

Supports end-to-end tagging of data items such as images, text, and video for your machine learning projects: http://dataturks.com

At Dataturks, we provide a streamlined set of tools and workflows that make it easy for large teams to collaborate and build high-quality datasets for their ML applications.

The cloud version of Dataturks provides an easy-to-use platform right in the browser and has been used by more than 4000 ML practitioners. Even so, there are situations where companies need to run the solution inside their own infrastructure. Below is a guide to setting up Dataturks internally.

Highlights:

  • Total time to set up: 15 minutes.
  • Works fully offline, no internet connection required.
  • Docker-based installation: we provide a Docker image.
  • Supports Linux, Windows, etc.
  • Supports all the features available on the Dataturks web version.

Steps:

  1. Install Docker (skip this step if you already know how to do it)

Below is an example of setting up Docker on an AWS EC2 instance running the Amazon Linux AMI:

[ec2-user ~]$ sudo yum update -y
[ec2-user ~]$ sudo yum install docker -y
# Start the Docker service
[ec2-user ~]$ sudo service docker start

2. Load the Dataturks Docker image:

When you sign up for an on-prem plan, we provide you with a link to a Docker image and a license file for your account.

# Download the image archive from the link
[ec2-user ~]$ curl -o dataturks_docker.tar.gz https://s3-us-west-2.amazonaws.com/images.onprem.com.dataturks/dataturks_docker_3_3_0.tar.gz
# Extract the archive
[ec2-user ~]$ tar -xvzf dataturks_docker.tar.gz
# Load the Dataturks image into Docker
[ec2-user ~]$ sudo docker load --input ./dataturks_docker.tar
# Start a container from the image, in the background, on port 80
[ec2-user ~]$ sudo docker run -d -p 80:80 dataturks/dataturks:3.3.0
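
A minimal sketch of how the last command could be wrapped, assuming you want a named container and the option of a different host port. The container name `dataturks` and the helper function names are illustrative assumptions, not part of the Dataturks distribution; `-d`, `--name`, and `-p` are standard `docker run` options.

```shell
# Hypothetical helpers; the container name "dataturks" is an assumption.
IMAGE="dataturks/dataturks:3.3.0"
NAME="dataturks"

# start_dataturks [HOST_PORT]: create and start the container in the background,
# mapping HOST_PORT on the host (default 80) to port 80 inside the container.
start_dataturks() {
  sudo docker run -d --name "$NAME" -p "${1:-80}:80" "$IMAGE"
}

# stop_dataturks / restart_dataturks: stop and later restart the SAME container.
stop_dataturks()    { sudo docker stop "$NAME"; }
restart_dataturks() { sudo docker start "$NAME"; }
```

For example, `start_dataturks 8080` would serve the UI on http://localhost:8080 instead of port 80. Stopping and starting the same container keeps what is stored inside it, whereas removing the container with `docker rm` discards it.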

Wait 2–3 minutes for the system to come up. Then open your browser and go to http://localhost (or use the server's IP address, DNS name, etc.).
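
Instead of waiting a fixed time, you can poll the server until it answers. The sketch below assumes `curl` is installed on the host; the function name `wait_for_url` and its defaults (36 tries with a 5-second pause between them) are illustrative choices, not something Dataturks ships.

```shell
# wait_for_url URL [MAX_TRIES] [DELAY_SECONDS]
# Polls URL until it answers without an HTTP error, or gives up after MAX_TRIES.
# Returns 0 once the URL responds, 1 if it never does.
wait_for_url() {
  url="$1"
  max_tries="${2:-36}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$max_tries" ]; do
    # -s: quiet, -f: fail on HTTP status >= 400, -o /dev/null: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $max_tries attempt(s): $url" >&2
  return 1
}
```

Calling `wait_for_url http://localhost` right after starting the container would return as soon as the web UI responds, and report a failure if it still has not responded after all tries.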

3. Add license [not needed anymore :) ]

As a one-time setup, you need to enter your license. The license describes what is provisioned for your account, such as the date until which the account is valid.

Note: The container needs internet access while you add the license.

4. All done and ready to use.

Now you can click “Login” to start creating accounts, create a new project, and so on.

Things to note:

  • All the data you upload and all your metadata are stored in the container. Always download your data before you destroy the container.
  • Prefer uploading image URLs: anything you upload directly, such as images, is stored locally and therefore requires sufficient disk space for the container.
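
Because direct uploads are stored inside the container, it can help to check free disk space on the host before a large upload. A minimal sketch using `df`; the function names and the 20 GiB threshold in the usage note are illustrative assumptions, and `/var/lib/docker` is Docker's default data directory on Linux, which your installation may have changed.

```shell
# free_kb PATH: print the free space, in KiB, of the filesystem holding PATH.
free_kb() {
  # -P forces the portable one-line-per-filesystem layout; column 4 is "Available".
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_gb PATH MIN_GB: succeed only if at least MIN_GB GiB are free.
has_free_gb() {
  avail_kb=$(free_kb "$1")
  need_kb=$(( $2 * 1024 * 1024 ))
  [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, `has_free_gb /var/lib/docker 20 || echo "Less than 20 GiB free"` would print a warning when space is running low.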

If you have any questions or need help, please contact us at support@dataturks.com