Making the Right Call: A Data Engineer’s Guide to Switching Tools

Levi Pols
Published in Auraidata
Mar 13, 2023 · 5 min read


Select the right tools for your stack

TL;DR This article emphasizes the importance of adaptability for data engineers in the tech industry. It presents a case study of how the author’s team overcame a roadblock by switching from Shiny to Streamlit for a self-service analytics dashboard, and it highlights the need to stay up-to-date with the latest tools and techniques to remain relevant in the field.

Staying Relevant in a Changing Industry

The tech industry is constantly in flux: new functionalities and features are released all the time, and if we data engineers want to remain relevant, we have to stay on top of these changes. As a data engineer, you’re constantly conjuring up new tricks and techniques to transform raw data into something useful. That’s why it’s essential to keep up with the latest features and tools in our field, and not to be afraid to switch to alternatives when the situation calls for it. This article shares how adaptability allowed us at Aurai to overcome a roadblock for a client by changing our approach and tools.

Adaptability: Think outside the box

The Importance of Adaptability for Data Engineers

Being adaptable allows us to respond quickly to new challenges and opportunities. When a new tool or feature becomes available, you need to be able to assess its potential benefits and drawbacks and decide whether it’s worth incorporating into your data stack. This flexibility helps you stay ahead of the curve and makes your data infrastructure more efficient and effective. Sometimes that means being willing to switch tools when you run into the limitations of your current setup.

Case Study: a Self-Service Analytics Architecture

Here’s a recent example of how adaptability helped us overcome a roadblock. We were tasked with designing, developing, and deploying a self-service analytics architecture for one of our clients. The service would be accessed through a web app in which users could connect to a Snowflake database and select the data they wanted predictions for. Under the hood, the app ran a preprocessing script and called a predictive model, created by our data scientist, through an API.

Roadblock: Limitations with Shiny and RStudio Connect

The customer was using RStudio Connect to deploy R apps built with the Shiny library. However, we soon discovered that Shiny and the server it ran on were limited in resources such as memory: the preprocessing pipeline would slow the server down severely or sometimes even crash it. Additionally, Shiny’s integration with Snowflake was inadequate for the customer’s complex requirements. We needed a more robust and flexible solution, one that would ideally bypass running the pipeline on the RStudio Connect server altogether.

Streamlit logo

Discovering Streamlit

In our quest for alternatives, we discovered that our customer’s deployment platform had started supporting Streamlit, a cutting-edge framework for building data dashboards and machine learning apps. Streamlit’s intuitive, pure-Python syntax made it a natural fit for our team, since our data scientists were already developing in Python. Another plus was that Streamlit had recently been acquired by Snowflake. After examining the published roadmap for further Snowflake integrations and features, we were confident it would be an excellent fit for our requirements.
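To give a sense of that syntax, here is a minimal sketch of a Streamlit page. The widgets and column handling are illustrative, not the client’s actual app:

```python
import streamlit as st
import pandas as pd

# A few lines of plain Python are enough for a working dashboard page.
st.title("Self-service predictions")

# Illustrative input: let the user provide the records they want scored.
uploaded = st.file_uploader("Upload a CSV with records to score", type="csv")

if uploaded is not None:
    df = pd.read_csv(uploaded)
    st.write("Preview of the selected data:")
    st.dataframe(df.head())

    if st.button("Generate predictions"):
        st.success(f"Would send {len(df)} rows to the prediction service.")
```

Running `streamlit run app.py` serves this as a web app, with no HTML, callbacks, or routing to write.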

Leveraging Streamlit’s Integration with Snowflake

One of the perks of using Streamlit was its ability to access Snowflake’s Snowpark library. Snowpark allows developers to build and run custom code in the Snowflake environment using popular programming languages like Java, Python, and Scala. This allowed us to trigger our preprocessing pipeline directly from our Streamlit app and have it carried out on Snowflake compute optimized for heavy data processing.
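As a rough sketch of that pattern, the app can open a Snowpark session and express the preprocessing as Snowpark DataFrame operations so the work runs inside Snowflake. The connection parameters, table names, and transformation below are placeholders, not the client’s actual pipeline:

```python
import streamlit as st
from snowflake.snowpark import Session

# Placeholder credentials; in practice these would come from st.secrets or a vault.
connection_parameters = {
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}

@st.cache_resource
def get_session() -> Session:
    # Reuse a single Snowpark session across Streamlit reruns.
    return Session.builder.configs(connection_parameters).create()

session = get_session()

# The transformation is expressed as a Snowpark DataFrame, so it executes
# on Snowflake's compute rather than on the dashboard's server.
raw = session.table("RAW_EVENTS")  # hypothetical source table
features = (
    raw.filter(raw["STATUS"] == "ACTIVE")
       .group_by("CUSTOMER_ID")
       .count()
)

if st.button("Run preprocessing"):
    # Materialize the result inside Snowflake; only a small preview comes back.
    features.write.save_as_table("FEATURES_FOR_SCORING", mode="overwrite")
    st.dataframe(features.limit(20).to_pandas())
```

Because the heavy lifting is pushed down to Snowflake, the app server only has to handle the small preview that comes back, which was exactly the behaviour we were after.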

Switching Tools for a More Robust and Flexible Dashboard

In the end, switching from Shiny to Streamlit was a wise decision. Streamlit’s ability to integrate with Snowflake and easily call machine learning models through an API was exactly what we needed for this project. We were able to build a more robust and flexible dashboard that met the client’s requirements, while also improving our data processing times and simplifying our data pipeline. By staying open-minded and being willing to switch tools, we found the best solution for our client’s needs and ultimately delivered a high-quality product.
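The model call itself could stay simple as well. A sketch of that pattern, with a hypothetical endpoint URL and payload shape:

```python
import requests
import streamlit as st

PREDICTION_ENDPOINT = "https://models.example.com/predict"  # hypothetical URL

def get_predictions(records: list[dict]) -> list[dict]:
    # Send the preprocessed rows to the data scientist's model API.
    response = requests.post(PREDICTION_ENDPOINT, json={"records": records}, timeout=60)
    response.raise_for_status()
    return response.json()["predictions"]

if st.button("Score selected data"):
    # In the real app these rows come from the Snowpark preprocessing step above.
    selected_rows = [{"customer_id": 1, "feature_count": 42}]  # illustrative
    st.write(get_predictions(selected_rows))
```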

Adapt to changing environments

The Lesson: Stay Adaptable and Stay Ahead of the Curve

The lesson here is simple: being adaptable is crucial for us data engineers. Don’t get too attached to a specific set of tools or techniques. Be willing to try new things and stay up-to-date with the latest features and possibilities of the tools and software you use. By doing so, you’ll be able to overcome any roadblocks and stay ahead of the curve.

Aurai provides custom data solutions that help companies gain insights into their data. We engineer your company’s future by simplifying, organizing, and automating data, giving you back your time and a foundation of relevant, reliable, and durable information on which to build better processes. Interested in what Aurai can mean for your organisation? Don’t hesitate to contact us!
