Key takeaways from Data Natives 2018

Dânia Meira
Nov 25, 2018

Last Thursday and Friday I attended the Data Natives conference in Berlin for two immersive days of talks about different aspects of the Data Science field.

After meeting other practitioners, exchanging experiences, listening to more than 15 talks and attending 2 panels, I saw a lot of alignment between the speakers. They come from distinct backgrounds, hold different roles and work across various industries, and still the message was the same: lessons learned and guidelines to make your Data Science story a successful one.

Cassie Kozyrkov’s definition of Decision Intelligence

On the very first morning, Annina Neumann presented what I think is the best overview of this message. In her talk “AI in Business Applications: Do’s and Don’ts” she explored three main points, which I later saw developed in more detail by other speakers, whom I mention along the way:

Culture and Team

  • Data Science is an interdisciplinary field, so it’s not enough to hire one unicorn. Instead, build a team with a diverse skill set that works in collaboration. Data Scientists sit in the middle of a process which starts with the Decision Makers and ends with the Engineers who deploy models in production.
  • Business and Management need to be on board. Instead of overselling Machine Learning capabilities as if we were in some sci-fi movie, we could follow Noa Tamir’s advice in her talk “Building a Data Culture” and use storytelling to increase data literacy.
Noa Tamir and the Fundamentals of a Data Culture
  • Take calculated risks and learn from mistakes, because machine learning is not a plug-and-play solution. It requires putting many pieces together at the right time to make it relevant.

How to find Use Cases

  • Define specific scenarios together with business. This is also part of the culture of collaboration, because hoping that an artificial “intelligence” will fix a poorly defined, fuzzy problem is not a strategy.
Although sometimes it seems that hope is all there is at the kick-off of a project…
  • Have a clear benefit aligned with long-term business KPIs, human acceptance criteria, and company strategy and values. This topic was repeatedly mentioned in other talks, as it is an important step whether you are designing your A/B test (“A/B Testing: Lessons Learned” by Dan McKinley), building your chatbot (“Support Automation with Chatbots: The Buy or Make Dilemma” by Erik Pfannmöller) or working on more feature engineering to improve the results of your model.

Methodology

  • Evaluate whether the use case qualifies: Is there data describing the process we want to tackle with machine learning? Is it at the appropriate level of precision to model the process? Is there enough historical data to feed a model realistically? Quoting Marc Weimer-Hablitzel in his talk “Big data is dead”: we need to move from big data to ‘data thinking’. First determine where the value is, and only then start digging for data.
  • Integration is important: make it visible to end users. Sometimes the benefit is in a change in the process, not a change in the tech.
  • Iterate on the solution (the MVP approach). One great example of how to do this was brought up by Stephen Lumenta in his talk “The Relevance Engine: Detecting Engaging Language”. With the goal of building a model to detect news pieces that influence people’s opinions, a starting point is to work with tabular data and simpler models like logistic regression, SVMs and boosted trees. The next step is to engineer more features, such as boolean attributes describing user events and the news content, and to apply some time series models for trends. The following iteration goes deeper and feeds the entire text of the news into LSTMs. Beyond that, there are still opportunities in applying transfer learning from pre-existing models.
  • Understanding whether the end goal is inspiration or performance is crucial when choosing the right algorithm. In her talk “Decision Intelligence”, Cassie Kozyrkov explored the idea that trust in the models comes from crafting well-designed tests.
From Cassie’s slides again: where we are in grey and the future she envisions in bold.
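
To make the first step of the MVP approach above a bit more concrete, here is a minimal sketch of what a first iteration could look like: a logistic regression on tabular data, compared against a trivial majority-class baseline on held-out data. This is not the code from any of the talks. The data is synthetic, and every name in it (`make_toy_data`, `fit_logistic`, the two features, the noisy linear rule behind the labels) is an assumption made only for illustration. It uses plain Python so it runs without extra libraries.

```python
import math
import random


def make_toy_data(n, seed):
    """Generate hypothetical tabular rows: two numeric features, binary label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        # Assumed ground truth for the demo: a noisy linear rule.
        score = 2.0 * x1 - 1.0 * x2 + rng.gauss(0, 0.3)
        rows.append([x1, x2])
        labels.append(1 if score > 0 else 0)
    return rows, labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=200):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n, n_features = len(rows), len(rows[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predict class 1 when the linear score is positive (probability > 0.5)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Separate seeds give independent train and test samples.
train_x, train_y = make_toy_data(n=200, seed=0)
test_x, test_y = make_toy_data(n=100, seed=1)

# Iteration 0: always predict the most frequent training label.
majority = 1 if 2 * sum(train_y) >= len(train_y) else 0
base_acc = accuracy([majority] * len(test_y), test_y)

# Iteration 1: the simple model.
w, b = fit_logistic(train_x, train_y)
model_acc = accuracy([predict(w, b, x) for x in test_x], test_y)

print(f"majority baseline accuracy: {base_acc:.2f}")
print(f"logistic regression accuracy: {model_acc:.2f}")
```

The point of the sketch is the comparison, not the model: each later iteration (more features, sequence models, transfer learning) only earns its extra complexity if it clearly beats the number from the step before on the same held-out data.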

We know Data Science is a broad field, filled with inflated expectations and buzzwords, which still reflect its early stage. But such convergence among this diverse set of speakers made me leave DN18 with the clear impression that we are on a conscious path to evolve it into a mature discipline. We are now at a point where it is feasible to have real impact on many processes, given the accelerated data collection, the abundance of methodologies and, on top of those, the democratization brought by the tools that are now available.

Taken while I was exploring the molecules in the VR lab: Permutation City feelings xD

Being part of such a transformative and challenging movement makes me feel energized, and I am very excited to contribute to what comes next.
