What’s Next? The Coming Low-code Revolution for Data

Sushant Rao
Prophecy.io
3 min read · Apr 7, 2022

When my parents got our first computer, it was the original IBM PC. I remember trying it out and thinking, “Wow, this is kind of a pain. You have to remember all these commands. How is the rest of my family going to use it?” Sure enough, no one but me touched the computer (not very “personal,” was it? :-). But that was the state of the art for personal computers. Then Apple came out with the Lisa, one of the first personal computers with a graphical user interface (GUI). While it was priced too high for anyone but businesses, you could see the potential. When Apple introduced the Macintosh, we got one, and it was the first computer the rest of my family could actually use. Just point ’n’ click and go. The transition from command-line interfaces (CLIs) to GUIs made computers truly personal and let more and more people use them.

If you think about most data tools, they are still in the age of CLIs. Sure, there are tools like Tableau and Looker that make it easier for people to do analytics without knowing SQL. But what about building data pipelines? When processing data on Apache Spark, you use a notebook, which is basically a text editor, not far removed from a CLI. This means anyone who wants to do data preparation or integration needs to become a Spark developer, which puts it out of reach for many data practitioners (data analysts, data scientists, and even some data engineers). This is why low code matters. Just as the transition from CLIs to GUIs opened up personal computers, low code will make data tools accessible to anyone who needs to process data. One side note: even though the interface is low code, it must produce actual Spark code, an important consideration for avoiding lock-in.
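To make that concrete, here is a rough sketch of the kind of hand-written PySpark code that even a simple pipeline requires today, and roughly what a good low-code tool should be generating behind the scenes. The paths, tables, and columns are made up purely for illustration:

```python
# A hand-written PySpark pipeline: read raw orders, clean them, join against
# a customer table, and write the result out. Paths and columns are illustrative.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_prep").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/raw/orders")
customers = spark.read.parquet("s3://example-bucket/raw/customers")

cleaned = (
    orders
    .filter(F.col("order_status") == "COMPLETE")              # keep only completed orders
    .withColumn("order_date", F.to_date("order_timestamp"))   # normalize timestamp to a date
)

enriched = cleaned.join(customers, on="customer_id", how="left")

enriched.write.mode("overwrite").parquet("s3://example-bucket/curated/orders")
```

Writing, testing, and maintaining code like this is exactly the skill barrier a visual interface should remove, without hiding the Spark code it produces.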

A low-code user experience is really just the beginning of making it easier for more people to build, deploy, and manage data pipelines. You also need to bring in best practices from software engineering, which has already solved these problems for application code. What does that look like? Interactive development, execution, and debugging (through a visual drag-and-drop interface) on live Spark clusters ensure the code works as intended. But that’s just the start. You need integration with Git to track and version changes, and unit tests so every change is validated before it ships. These capabilities are part of a larger CI/CD process that de-risks moving code changes to production.
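As an illustration of the testing piece, here is a minimal sketch of unit testing a single pipeline transformation with pytest and a local Spark session. The `clean_orders` function and its columns are hypothetical, not taken from any particular product:

```python
# Unit-testing one pipeline transformation with pytest and a local Spark session.
import pytest
from pyspark.sql import SparkSession, functions as F


def clean_orders(df):
    """Keep completed orders and derive an order_date column."""
    return (
        df.filter(F.col("order_status") == "COMPLETE")
          .withColumn("order_date", F.to_date("order_timestamp"))
    )


@pytest.fixture(scope="module")
def spark():
    # A small local Spark session is enough for fast, isolated tests.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_clean_orders_drops_incomplete_rows(spark):
    df = spark.createDataFrame(
        [("1", "COMPLETE", "2022-04-01 10:00:00"),
         ("2", "PENDING", "2022-04-01 11:00:00")],
        ["order_id", "order_status", "order_timestamp"],
    )
    result = clean_orders(df)
    assert result.count() == 1
    assert result.first()["order_id"] == "1"
```

Tests like this are what a CI/CD process runs automatically on every change, so a visual edit to a pipeline gets the same safety net as hand-written application code.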

Since the job of a data pipeline is to ingest, process, and load data, pipelines need to be treated as business-critical infrastructure. End-to-end monitoring is required to verify that all the data is processed correctly and that jobs run on schedule. And because this is data, practitioners also need metadata and lineage, especially at the column level, so they can trace data all the way back to its source.
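As a rough sketch of what basic monitoring can look like, each pipeline run can record a few health metrics (row counts, null rates on key columns) that a downstream check or alerting job compares against expectations. Every name here is illustrative:

```python
# Record simple run metrics after a pipeline finishes, so later checks can
# alert when row counts drop or null rates spike. All names are illustrative.
from datetime import datetime, timezone
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

curated = spark.read.parquet("s3://example-bucket/curated/orders")

metrics = curated.agg(
    F.count(F.lit(1)).alias("row_count"),
    F.sum(F.col("customer_id").isNull().cast("int")).alias("null_customer_ids"),
).first()

run_record = spark.createDataFrame(
    [(datetime.now(timezone.utc).isoformat(),
      metrics["row_count"],
      metrics["null_customer_ids"])],
    ["run_at", "row_count", "null_customer_ids"],
)

# Append one row per run; a scheduler or alerting job reads this table later.
run_record.write.mode("append").parquet("s3://example-bucket/metrics/orders_runs")
```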

Whew! That’s a lot of capabilities! How many different tools are going to be needed to do all of this? And how does juggling so many tools make anything “easy”?

Well, that’s the magic of Prophecy. Prophecy is a complete, low-code data engineering platform. Data practitioners of all skill and experience levels can quickly, easily, and interactively develop, execute, and deploy data pipelines with 100% open-source Spark code. Prophecy incorporates software engineering best practices so any changes to data pipelines are tracked and tested, providing high confidence when code is moved to production. It also helps companies achieve operational excellence through end-to-end monitoring of data pipelines to ensure they are working as intended, along with search and column-level lineage to trace data all the way back to the source. Prophecy provides all of these capabilities in an easy-to-use visual IDE. It’s the only tool anyone doing data engineering needs. That’s why I joined Prophecy.
