What is Data Engineering?

I explain to my past self the concept and challenges of Data Engineering

Luis Doriz
Nowports Tech and Product
5 min read · Jan 24, 2022

Data Engineering on a digital platform

I have learned a lot throughout my first year as a Data Engineer, from debugging errors to taking ownership of and responsibility for both products and features in the data area. I continue to learn more about this profession every day.

At the beginning of this journey, the meaning of data engineering seemed very broad to me. After living through several stages in the growth of the Data Science area at Nowports, my concept of the role is now clear.

For the record, I joined Nowports in 2019 as the second tech employee, starting as a software engineer intern and becoming a junior software engineer months later. My responsibilities were to maintain full-stack JavaScript systems, such as the client and internal applications.

After a year and a half, I had the opportunity to move from Engineering to Data Science as a Data Engineer. Since then, I have watched the area grow while learning more every day.

What is Data Engineering?

In a nutshell, the Data Engineer is the provider in a data ecosystem, and Data Analysts and Data Scientists are the consumers.

The Data Engineer as a provider works on creating mechanisms and interfaces for data flow and access. These interfaces must be usable and valuable to the consumers. The Data Engineer is also responsible for all the movement, manipulation, and management of data.

“Data Engineering is the development, implementation, and maintenance of systems and processes that take in raw data and produce high-quality, consistent information that supports downstream use cases, such as analysis and machine learning.” — Fundamentals of Data Engineering.
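As a minimal sketch of what "take in raw data and produce high-quality, consistent information" can look like in practice, the example below normalizes a few raw records, drops the ones that cannot be trusted, and removes duplicates. The record fields (`shipment_id`, `weight_kg`, `departed`) and the sample values are hypothetical and are not taken from Nowports' systems.

```python
from datetime import datetime

# Hypothetical raw records, as they might arrive from different sources.
RAW_RECORDS = [
    {"shipment_id": " A-001 ", "weight_kg": "1200.5", "departed": "2022-01-03"},
    {"shipment_id": "a-001", "weight_kg": "1200.5", "departed": "2022-01-03"},  # duplicate id
    {"shipment_id": "A-002", "weight_kg": "n/a", "departed": "2022-01-05"},     # unusable weight
    {"shipment_id": "A-003", "weight_kg": "830", "departed": "05/01/2022"},     # unexpected date format
]


def clean_record(raw):
    """Return a normalized record, or None if the record cannot be trusted."""
    try:
        return {
            "shipment_id": raw["shipment_id"].strip().upper(),
            "weight_kg": float(raw["weight_kg"]),
            "departed": datetime.strptime(raw["departed"], "%Y-%m-%d").date(),
        }
    except (KeyError, ValueError, TypeError, AttributeError):
        return None


def clean_records(raw_records):
    """Normalize records, drop invalid ones, and keep the first copy of each id."""
    seen = set()
    cleaned = []
    for raw in raw_records:
        record = clean_record(raw)
        if record is None or record["shipment_id"] in seen:
            continue
        seen.add(record["shipment_id"])
        cleaned.append(record)
    return cleaned


consistent = clean_records(RAW_RECORDS)
```

A real pipeline would usually log or quarantine the rejected records instead of silently dropping them, so that the source of the bad data can be fixed.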

At Nowports, Data Engineers help Software Engineers with the data that app features need. We deliver quality data so that Data Analysts can create awesome dashboards and reports. For Data Scientists, we find and collect good data sources for building powerful machine learning models.

Data Engineers play a vital role as the middleware that brings data to various areas and functions. Their work allows their teammates to focus on their main activities, including creating features, machine learning models, dashboards, and reports, instead of spending time collecting and transforming data before they can use it.

How do Data Engineers help others focus on their primary responsibilities?

Data Engineers and Data Scientists complement each other. As I told you before, Data Engineers are the providers, and Data Scientists are one of their many consumers.

Ideally, Data Engineers are responsible for collecting, moving, storing, exploring, and transforming data, while Data Scientists spend their time only on the top layers of the following pyramid: aggregating, labeling, learning, and optimizing.

Source: Fundamentals of Data Engineering
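The split of responsibilities in the pyramid can be sketched roughly as follows, assuming a tiny in-memory SQLite database as the storage layer. The table, view, and column names and the sample rows are hypothetical. The Data Engineer owns `collect`, `store`, and `transform`; the final aggregate query stands in for the consumer's side of the work.

```python
import sqlite3


def collect():
    """Collect: stand-in for reading from an API, a file, or an application database."""
    return [
        ("A-001", "Monterrey", 1200.5),
        ("A-002", "Bogota", 830.0),
        ("A-003", "monterrey", 410.0),
    ]


def store(conn, rows):
    """Move and store: land the raw rows in a table without changing them."""
    conn.execute(
        "CREATE TABLE raw_shipments (shipment_id TEXT, origin TEXT, weight_kg REAL)"
    )
    conn.executemany("INSERT INTO raw_shipments VALUES (?, ?, ?)", rows)


def transform(conn):
    """Transform: expose a cleaned view that consumers can query directly."""
    conn.execute(
        """
        CREATE VIEW shipments AS
        SELECT shipment_id,
               UPPER(origin) AS origin,
               weight_kg / 1000.0 AS weight_tonnes
        FROM raw_shipments
        """
    )


def build_pipeline():
    """Run the Data Engineer's layers and return the connection consumers use."""
    conn = sqlite3.connect(":memory:")
    store(conn, collect())
    transform(conn)
    return conn


# Consumer side: an analyst or data scientist only aggregates the prepared data.
conn = build_pipeline()
totals = conn.execute(
    "SELECT origin, SUM(weight_tonnes) FROM shipments GROUP BY origin ORDER BY origin"
).fetchall()
```

Because the consumer queries the `shipments` view rather than the raw table, they never have to deal with inconsistent casing or unit conversion, which is the kind of work the lower layers are meant to absorb.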

To be clear, Data Engineering is fundamentally not about:

  • Building machine learning models.
  • Creating reports or dashboards.
  • Performing data analysis.
  • Building KPIs.
  • Developing software applications.

These activities belong to the main stakeholders, whom the Data Engineer serves by providing the data they need.

How did I learn what Data Engineering is through data maturity?

Since the start of the Data Science area at Nowports, I’ve seen the growth of its specific subareas: Data Engineering, Data Science, and Data Analysis. To reach this maturity, the area went through several stages:

Stage 1: Starting with data

In this stage, the only members of Data Science were the current Head and the first Data Engineer. They had to plan and develop the existing architecture to satisfy the company’s mission and vision.

At this stage, a Data Engineer should focus on:

  • Getting involved with the stakeholders: This is important for taking all the stakeholders’ objectives into account.
  • Defining the right architecture: Determining the business goals and making data a competitive advantage.
  • Building a solid data foundation: With it, consumers such as Data Scientists and Analysts can generate reports and models that add to the competitive advantage data aims to provide.

This stage was fundamental in Nowports for tracking, fare predictions, and automation. These features are still active, and their value increases thanks to the initial vision and architecture.

Stage 2: Scaling with data

Results of applying Data Engineering

At this stage at Nowports, every automation and architecture began to tie in with new features and capabilities, making it a more data-driven company. Additionally, Data Engineering roles evolved from generalists to specialists.

Data Engineers’ goals at this stage are to:

  • Establish formal data practices.
  • Create scalable and robust data architectures.
  • Adopt DevOps and DataOps practices.
  • Build systems that support ML.

Nowports keeps growing and adopting technologies that get the most out of the available data. The information we hold is an asset for the organization, and the value it delivers shows that the data infrastructure has matured successfully.

Stage 3: Leading with data

Nowports is a data-driven company with pipelines, reports, and models that allow consumers, such as application users, to perform self-service analysis.

Integrations of new data sources or technologies are more seamless than before, and immediate value is derived after implementations.

Now, Data Engineering will be based on a cyclic flow of:

  • Developing custom tools and systems that leverage data as a competitive advantage.
  • Focusing on the enterprise side of data.
  • Creating a community and environment where people can collaborate and speak openly, no matter their role or position.

For a Data Engineer, this cyclic flow optimizes the proper usage and quality of data in order to maximize its value. At this point, I realized that Data Engineering is not only about providing data but also about the whole process of knowing your consumers, your providers, the mission, the vision, and how you satisfy each of them.

After going through these three stages in the growth of Nowports, it has been incredible to see how a startup goes from having a local Tech team of fewer than ten people to employing more than 30 across many Latin American cities, with a solid structure and well-defined roles. Some of our roles are in Software Engineering, Data Engineering, DevOps, Data Science, Data Analysis, UX and UI Design, Product Ownership, and Management.

This growth has been matched by data control and management, allowing each team to work with valuable resources to create amazing products and features.
