By Jorge Campos
Hi, I work at tagtog.net, and today I want to show you how to use our text annotation tool to train custom machine learning (ML) models, and how easy they are to use.
If you don’t have an account yet, just sign up. The start plan is free. You can manually annotate text, train custom ML models and annotate automatically up to a certain number of requests per month. We will use this plan to build a custom model for recognizing dates in text.
Once you have an account, create a new project. I will name mine
We don’t select any pre-trained model as we want to train a custom model from scratch 💪.
You can now define the type of entities you want to extract from text. In our case we will only use one Entity type: date. If you don't like the default color, pick your own. It will be used to highlight the annotations for this entity type.
You also need to activate Machine Learning, so go to the Annotations tab and check the option to create automatic annotations using ML.
We don’t plan to annotate any other text features, so let’s move on and import some text into the text annotation tool to start training our ML model. Go to the Documents tab and import the following sample plain text:
Mahatma Gandhi was born on 02/10/1869.
Microsoft released Windows 98 on June 25th, 1998.
I was born on April 7th, 1976.
The Haloid Company made the first public announcement of xerography on June 16th of 1911.
Hurricane Michelle hit Cuba on 2003.
SpaceX landed a Falcon 9 rocket on 12/22/2015.
Amazon was founded in July of 1994.
The PlayStation 4 was released in 2013.
Professor Stephen Hawking died in the early hours of 14 March 2018.
Python language was created by Guido van Rossum and first released in 1991.
Let’s now annotate the dates in the text annotation editor.
There is only one entity type defined, so every annotation you create will be a date.
When you have finished annotating, use the Confirm button to tell tagtog these annotations are ready to be used as training data. When you click this button, the custom model is trained and deployed automatically in the background, using all the confirmed annotations in your project.
The training and deployment process is done in one step and it is really quick, meaning you can use your shiny new model right away, and that is what we are going to do.
From now on, all new text you import will be automatically annotated using the latest version of your model. So let’s add some new text:
Texas Instruments announces the development of the first commercial transistor radio on October 18th of 1954
As you can see, the date was automatically annotated. You can import more text and check for yourself. Are you interested in a more specific date format? Provide tagtog with some examples, train the model and see the results immediately. If they are wrong, just correct them manually. tagtog learns from your feedback to give you more accurate results next time.
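If you want to teach the model a specific date format, you first need a few example sentences written in that format. Here is a minimal sketch of one way to produce them. The helper name make_examples and the ISO-style format are my own choices for illustration, not part of tagtog; the two events are taken from the sample text above.

```python
from datetime import date


def make_examples(events, fmt="%Y-%m-%d"):
    """Return one sentence per (subject, day) pair, with the date rendered in `fmt`.

    `fmt` is a standard strftime pattern; the default is an ISO-style date.
    """
    return [f"{subject} on {day.strftime(fmt)}." for subject, day in events]


# Two events from the sample text, rewritten with the date format we want to teach.
events = [
    ("SpaceX landed a Falcon 9 rocket", date(2015, 12, 22)),
    ("Microsoft released Windows 98", date(1998, 6, 25)),
]

for sentence in make_examples(events):
    print(sentence)
```

You would then import the printed sentences as plain text, annotate the dates and confirm them, exactly as we did with the first batch.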
If you want to automate a process using your new model, the API makes it easier. For example:
curl -u Sophia:pwd -X POST -d 'text=Sections of the Financial Reporting Manual have been updated on December 1, 1999' 'https://www.tagtog.net/-api/documents/v1?project=dateNER&owner=Sophia'
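If you would rather call the API from a script, the same request can be sketched in Python with only the standard library. This mirrors the curl command above: same URL, same project and owner query parameters, same text form field, and HTTP Basic authentication. The function names build_request and annotate are mine, and I make no assumption about the response format, so the body is returned as raw text.

```python
import base64
import urllib.parse
import urllib.request

API_URL = "https://www.tagtog.net/-api/documents/v1"


def build_request(text, project, owner, username, password):
    """Build (without sending) the POST request that the curl command makes."""
    query = urllib.parse.urlencode({"project": project, "owner": owner})
    body = urllib.parse.urlencode({"text": text}).encode("utf-8")
    # Equivalent of curl's -u user:password (HTTP Basic authentication).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=body,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


def annotate(text, project, owner, username, password):
    """Send the text to the project and return the raw response body as a string."""
    request = build_request(text, project, owner, username, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


# Build the request from the curl example and inspect it; nothing is sent here.
request = build_request(
    "Sections of the Financial Reporting Manual have been updated on December 1, 1999",
    project="dateNER",
    owner="Sophia",
    username="Sophia",
    password="pwd",  # placeholder from the curl example; use your own credentials
)
print(request.get_method(), request.full_url)
```

Call annotate(...) with your own credentials when you actually want to send the text. Keeping request construction separate from sending makes it easy to check the URL and form data before hitting the API.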
At 🍃tagtog.net we aim to democratize text analytics.
👏 👏 👏 if you liked the post and want to share it with others!