RITA 101

RITA is a language designed for analysing natural language, usually to find and extract information from some kind of text: a comment, an article, a legal document, etc.

Let’s teach a computer to understand basic small talk!

Start with “Hi”

One obvious statement we want to catch is a greeting.

Simple variations which come to mind are:

  • Hello
  • Hola
  • Good morning/evening/afternoon
  • … maybe even “Good day”

RITA allows us to create an array of these options. Each option is inside quotes; the array starts with { and ends with }. To be able to use it, let’s assign it to a variable.
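To make the idea of "an array of options assigned to a variable" concrete, here is a rough sketch in plain Python. It is not RITA syntax and not how RITA works internally. The names GREETINGS, build_greeting_pattern and find_greeting are made up for illustration, and the matching is a simple case-insensitive regular expression.

```python
import re

# Hypothetical list of greeting variations, mirroring the options listed above.
GREETINGS = [
    "Hello",
    "Hola",
    "Good morning",
    "Good evening",
    "Good afternoon",
    "Good day",
]


def build_greeting_pattern(options):
    """Compile one case-insensitive pattern that matches any of the options."""
    # Try longer options first so multi-word greetings are preferred.
    ordered = sorted(options, key=len, reverse=True)
    alternatives = "|".join(re.escape(option) for option in ordered)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


GREETING_RE = build_greeting_pattern(GREETINGS)


def find_greeting(text):
    """Return the first greeting found in the text, or None if there is none."""
    match = GREETING_RE.search(text)
    return match.group(0) if match else None
```

Calling find_greeting on a sentence returns the matched greeting exactly as it was written in the text, or None when no option from the list appears.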

So…


In this article I want to show that good ol’ templates are not the only option. The product itself is under intense development and many things can change along the way, but the core principle will remain the same: the ability to visually define the structure of a text and have the machine build fluent text for you.

Intro

AcceleratedText (https://github.com/tokenmill/accelerated-text) is an open-source tool for generating text.

We define how text is to be generated in what we call a Document Plan:

DocumentPlan of Restaurants

A DocumentPlan creates the structure, and custom data is usually filled in from a CSV file (while there’s an option for a direct API call…



Machine learning’s popularity is skyrocketing these days; everyone wants to build a model and solve some kind of problem.

However, an often forgotten truth is that you need proper data to train your model, and you need vast amounts of it. Actually, the correct term would be “Information” rather than “Data”, but “Data” has become a big buzzword these days, so I’m forced to use it.

What do I mean by “proper data”?

At this point I will focus on the language domain (NLP).

There are some exceptions where you can just supply loads of text, compute for several weeks on GPUs and get a proper result; however, the majority of cases require…


Riemann (http://riemann.io) is an awesome tool. There’s no doubt about it. However, its steep learning curve and the need for Clojure knowledge in order to configure it make it highly unpopular.

At Mintly we managed to integrate it as our monitoring and alerting framework and it works flawlessly. However, a decent amount of blood and sweat went into creating it. The point of this blog post is to guide potential users.

Gaining knowledge about Riemann

The first major problem you will encounter is the lack of documentation. I mean a serious lack of it.

I have found this book, https://www.artofmonitoring.com, to be ridiculously helpful. …


I want to begin by saying that this might not apply to every case or everyone; this is more of a case study about what worked wonderfully for us. By creating a middleman service you get a lot more freedom, but it is also a lot easier to break the whole system.

Let’s begin with some introduction…

Our data is all about movie recommendations. At first glance, any database would fit that; a standard relational database should work flawlessly. Yes, for storing movie data it is more than fine; all the relational stuff is handled very well thanks to the promised integrity. …

Šarūnas Navickas

Data Engineer @ TokenMill, working with NLP and NLG stuff. Loves using different programming languages, searching for the best tool for the job.
