Day in the Life of a Nitronaut | Data Team
In this installment of a ‘Day in the Life of a Nitronaut,’ we sat down with the two members of the Nitro Data team, Marek Kolodziej, Senior Research Engineer, and Malcolm Greaves, Research Engineer, to get an insider’s perspective on what it’s like to find treasure in chaotic, unstructured data.
Marek Kolodziej, Sr. Research Engineer
Marek, can you tell us a little bit about what it means to be a Research Engineer?
Simply put, as Research Engineers we’re trying to classify and understand loads of data.
First, we’re focused on coming up with an idea or an algorithm to figure out how to use that data to power new features for our application. We ask questions like, “How do we use this data to make a feature work?” and for Nitro specifically, we ask questions like, “How do we know what kind of PDF you uploaded?” Second, we’re focused on implementing that algorithm in a form that works with our platform.
If you had to come up with a metaphor for your role, what would it be?
We’re the head chefs in the kitchen. We’re using the ingredients to cook the meals. Sometimes, we make a meal and it’s not the tastiest. But we’re using a gas stove instead of a convection stove, and hey — sometimes the gas stove is just a few degrees off and things don’t come out the way you want them to.
What is the ultimate goal of the Data Team?
Our goal is to classify documents based upon the format and the data within the document. We want to get a deeper understanding of the semantics of the document — and by extracting the semantic information, we’re able to understand the document the way a human would understand a document. That is how we describe a “smart document.” We’re not up-selling, spamming or using our findings for internal data consumption. Ultimately, we want to empower the end user so that they’re able to understand the information within the document in the most clear and straightforward manner possible.
Malcolm Greaves, Research Engineer
What is your favorite part about being a Research Engineer?
One of my favorite parts about my role is the technology that we work with. We use Scala, Spark and other cutting-edge tools and frameworks. A wonderful aspect of being a Nitro engineer is that I don’t have to worry about untenable legacy code. Everyone, from platform to research, works really hard to make sure we have the right code for the right job. We’re constantly evaluating best practices, reevaluating our technical design choices, and striving to make an incredible product on a modern architecture. We aren’t dealing with old problems; we’re dealing with new ones. And with the Scala platform that we use, everything is modern and high quality.
What are some of the biggest challenges you face in your role?
The problems we face are often very open-ended, so we try a breadth of strategies to determine which ones fail and which ones work best. Instead of focusing on one solution for three weeks, we try a bunch of solutions in one week. That way we’re able to rule out unproductive lines of thought, fail fast, and come to a workable solution quickly.
You can have the best algorithm in the world, but somehow it can still be ill-suited for the specific problem you’re trying to solve. Sometimes it’s not necessarily the “hot” algorithm (anything machine learning, recommenders, Artificial Intelligence, or Natural Language Processing is usually considered “hot”) that you end up choosing, but instead you go with the algorithm that works well for what you need.
What are you looking forward to this year?
Well, earlier this year we attended the Text by the Bay conference and it was the best conference I’ve attended in years. The organizer is actually our Chief Scientist, Alexy Khrabrov. The great part was that it was extremely technical and it wasn’t led by Marketing or Sales. While those conferences definitely serve a great purpose, the technical audience is often alienated. Our team attends conferences with the goal of learning how we can do our job better and how we can be more productive and efficient.
These conferences share cutting-edge information about Big Data and technology, and they are an incredible way to learn new things as an engineer. I’m excited for Scala by the Bay and Big Data by the Bay because I know they’re going to be inspiring and informative. The speakers are literally celebrities in the world of Software Engineering. For instance, Martin Odersky, the creator of Scala, was a speaker at Scala by the Bay.
The Data Team is home to some of the most creative and curious individuals at Nitro. They’re tasked with discovering trends and assigning meaning in a world of big data, and their findings are the future of smart documents. Sound exciting to you? Then don’t forget to sign up for Scala by the Bay on August 13th-16th and stop by the Nitro booth to say hello to Marek and Malcolm!
Originally published at blog.gonitro.com on August 6, 2015.