Data + AI Summit 2020 from our Perspectives as Data Scientists, Presenters, and Marketers

Jan Hulka
Published in DataSentics
7 min read, Feb 10, 2021

What is Data + AI Summit?

Data + AI Summit is a gathering of thousands of data scientists and other professionals from the big-data world. It is held annually by Databricks, the company founded by the original creators of Apache Spark and behind Delta Lake, MLflow and Koalas. Some of you may know it as the physical event held in previous years under the name Spark + AI Summit. In 2020, the summit was transformed into a fully virtual experience.

In 2020, the Data + AI Summit offered more than 125 sessions and some great speakers. This free-to-attend online event welcomed more than 20,000 data scientists, data engineers, analysts, business leaders and other data professionals.

The online conference agenda included:

  • Official keynotes
  • Spark, Delta Lake, MLflow and Koalas training sessions
  • Certifications
  • Sessions from hosted speakers

Apart from that, the event featured:

  • Main-track sessions
  • Networking
  • Expo hall with sponsors
  • AI-personalised dashboard with tailored tips

Screenshot: Data+AI Summit online platform

As DataSentics, we took part as sponsors in the online Data + AI Summit 2020 by Databricks. We therefore had the chance to see this famous 3-day event from several perspectives, namely those of a data scientist, a presenter, and a marketer. This article will tell you what the event is like in the online environment, or remind you of it if you took part as well.

We would like to share our perspectives with you, to give you an idea of what the event was like and to recommend the best talks we followed.

Point of view: Data scientist

As a data scientist attending the Data + AI summit, you are probably interested in the talks on the main stage and obtaining some useful knowledge. Therefore, we asked Nikola Valesova, our data scientist, to tell us what her experience was like:

Data + AI Summit was a unique event thanks to the large number of speakers including well-known names, such as Matei Zaharia, Reynold Xin, and Malcolm Gladwell. The line-up consisted of a wide range of captivating topics that made it difficult to choose only one session at a time. Luckily, all sessions have been recorded and are still available (for free), so in case you missed a session, or you’d like to go back to some link or idea, it’s all just a few clicks away.

Here is an overview of the sessions that I found the most interesting and can recommend watching:

  • Building the Next-gen Digital Meter Platform for Fluvius by Maarten Herthoge
    Maarten walks us through the trend of innovation in the Belgian energy market and gives detailed insights into how they collected, stored, processed and served volumes of data to every single consumer and beyond in Flanders. He concludes his session with the key takeaways and explains why Databricks was the right platform for them.
  • Building a MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes by Milan Berka
    Milan describes how we at DataSentics think about the problem of MLOps in general and what our real-life experience with model productionalization is. Then, he introduces a dedicated MLOps platform, which provides the necessary tooling, automation and standards to speed up the model productionalization process and make it more robust.
  • The Pill for Your Migration Hell by Roy Levin
    Roy tells us the story of migrating an ML-based product with a huge daily workload. He describes the migration strategy, how they decomposed the system into migratable parts and what challenges they had to face.
  • End to End Supply Chain Control Tower by Tarun Rana
    Tarun explains how their supply chain control tower differs from traditional ones, what benefits it brings them and how the technologies by Microsoft and Databricks enable them to build cross-functional, data-based applications.
  • Bank Struggles Along the Way for the Holy Grail of Personalization: Customer 360 by Jakub Štěch and Veronika Pješčaková
    Jakub and Veronika focus on improving customer experience in the banking sector and on the challenges that come with implementing a Customer 360 solution as a Spark- and Databricks-centric analytics platform in the Azure cloud.
  • Personalization Journey: From Single Node to Cloud Streaming by Stefanos Doltsinis and Kostas Andrikopoulos
    Stefanos and Kostas describe their journey of bringing machine learning and AI into the online gaming industry, which often requires responses in real time and therefore poses a real challenge. They also talk about the technologies they used and how these helped them achieve their goal.

Point of view: Main track presenter

This time, we at DataSentics had two successful nominations for presenting on the virtual “main stage” of Data + AI Summit, namely “Building a MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes” and “Bank Struggles Along the Way for the Holy Grail of Personalization: Customer 360”. We were curious about what the experience would be like at such an event. So, we asked our machine learning architect Milan, who was on stage, to comment on it:

I cannot help but marvel at how the Data+AI Summit (previously Spark+AI Summit) is growing year over year. It’s frankly amazing to see. This was the 4th summit I attended and the 2nd one where I had the honour to be a presenter. However, this time the experience was completely different, because the event was virtual.

To be honest, I was a little sceptical at first, because presenting live in person has special, even intimate aspects: you can see how the audience reacts, you can communicate with them, you can tell whether to pause on a point of interest or move on, and the discussions after the talk are more fun. At the same time, I admit there is also the nervousness and the “you only have one chance to tell this right” aspect. I was therefore curious how different the virtual event would turn out.

The first thing I have to praise is how well the Data+AI organizational team managed the event and the preparation for it. The typical problem with events of this magnitude is how to accommodate all the people; usually, catering and venue space are the bottleneck. This is not a problem with virtual events, but they have other challenges: how to make things more personal and, of course, the internet connection and its stability. One giant Zoom call with a lagging presenter is not good.

Because of the latter, the organizers decided to record the sessions in advance. This was a sort of Hollywood experience for me, as a real “production team” was assembled and we went through several takes to get the perfect shot. Post-production followed: cutting, matching audio and video (camera, screen share) and so on. All this took quite some time to complete, but isn’t it pretty cool?

Compared to the recording session, the actual live event was much less dramatic. I logged into the platform (which worked surprisingly well), the video began to play, and I could watch it as the presenter and wait excitedly to see whether attendees would react and comment. Luckily, they did: we had a very fruitful Q&A session and are now in further talks with some of the session attendees. I was glad to see such interest from the audience, as it underlines how important the productionalization aspect of machine learning is. If you are interested, you can already watch the session on YouTube, and if you would like to know more, we are more than happy to discuss MLOps with you. :)

Watch the recording:

Building a MLOps Platform Around MLflow to Enable Model Productionalization in Just a Few Minutes

Point of view: Marketer

Last but not least, our colleague Jan from marketing took care of the visibility of DataSentics at the event and was in charge of the virtual “stand” (booth) in the virtual “expo hall”. He certainly spent a lot of time at the event, though in a somewhat different way than the data scientists or presenters:

Data + AI Summit is truly one of a kind, thanks to its large and specific audience of professionals from the data world. It was an honour to have two of our team members present on the official main track, and naturally we wanted to be present as official sponsors as well.

The communication and preparation for the event were really clear and simple. During the event, we had the opportunity to chat live at our virtual booth and to schedule 1:1 calls. On every day of the summit we also held two 30-minute live sessions, where we presented some of our main topics and had a brief discussion with attendees. Of course, nothing can beat the experience of an actual live event, but the organizers did a really good job on this one.

Picture: Data + AI attendee segmentation

Conclusion

Data + AI Summit was an outstanding event at which we all had a great time. In times of the pandemic, it was a nice change to attend a conference again and network with so many people with incredibly varied backgrounds. We came away from the event with many new contacts, interesting knowledge and fascinating ideas in our heads.

Lastly, hats off to the event organizers. Even though the event took place online for the first time, they handled it perfectly. Everything ran smoothly, and the online portal was nicely designed and easy to use.

We are already looking forward to the Data + AI Summit 2021. Are you joining us?
