Transitioning into Data Engineering with a non-CS degree

Aniket Mitra · Published in CodeX · Jan 2, 2022

As companies transition to making data-driven decisions, every function in the organization, be it Engineering, Sales, Marketing, Supply Chain, or Finance, is building agile Data Teams. A Data Team mostly consists of Data/Business Analysts, Data Engineers, Data Scientists, Software Engineers, ML Engineers, and Product Managers, who work in conjunction to build Data Products that the business consumes to take informed actions and propel growth.

There are numerous reports of demand for Data Engineers increasing year over year, with no signs of slowing down. The Data Engineer's role is pivotal to building a reliable, high-quality Data Product. It is not just about writing ETL/ELT pipelines; to be a successful Data Engineer, one has to look at the bigger picture.
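For illustration only, here is a toy ETL pipeline in Python. The record fields and the validation rule are invented for this sketch; real pipelines read from databases, APIs, or files and usually run under an orchestrator such as Airflow.

```python
import sqlite3

def extract():
    # In production this would pull from a source system; here we fake a batch.
    return [
        {"order_id": 1, "amount": "19.99", "country": "US"},
        {"order_id": 2, "amount": "5.00", "country": "us"},
        {"order_id": 3, "amount": None, "country": "DE"},  # bad record
    ]

def transform(rows):
    # Validate and normalize: drop records with missing amounts,
    # cast amounts to float, upper-case country codes.
    clean = []
    for row in rows:
        if row["amount"] is None:
            continue  # no data beats inaccurate data
        clean.append((row["order_id"], float(row["amount"]), row["country"].upper()))
    return clean

def load(rows, conn):
    # Load the cleaned batch into the target table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # prints 2
```

The bigger picture mentioned above is everything around this code: data quality checks, monitoring, scheduling, and knowing why the business needs the numbers at all.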

Data Engineering is considered a specialization of Software Engineering, so people might assume that entering the domain requires a CS degree. I agree that someone with a CS degree is at a huge advantage, but with the right trajectory and mindset it is definitely possible to land a Data Engineering job without one. It took me roughly three years to break into the domain, but that span can definitely be shorter for many individuals.

If you are in college pursuing another engineering degree but want to venture into Data Engineering as a career path once you graduate, set aside part of your time to get familiar with SQL and one high-level programming language (preferably Python). A bonus would be to incorporate data-intensive projects into your curriculum.

There is no fixed blueprint that can guarantee entry into this domain, but some must-haves are:

  • Advanced SQL skills (DDL and DML)
  • Proficiency in one high-level language (Python and Java are the most sought after at the moment)
  • Knowledge of cloud services (one of GCP, AWS, or Azure)
  • A generalist's view of the Big Data ecosystem and its tools and technologies
  • Experience building dashboards
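As a taste of the DDL and DML in the first bullet, here is a small sketch using Python's built-in sqlite3 driver. The `sales` table and its columns are invented for illustration; the same statements apply to any SQL warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute("""
    CREATE TABLE sales (
        sale_id  INTEGER PRIMARY KEY,
        region   TEXT NOT NULL,
        amount   REAL NOT NULL
    )
""")

# DML: insert a batch, then apply a correction.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("NA", 120.0), ("NA", 80.0), ("EU", 200.0)],
)
conn.execute("UPDATE sales SET amount = amount * 1.1 WHERE region = 'EU'")

# Analytical query: revenue per region, rounded for stable output.
totals = dict(conn.execute(
    "SELECT region, ROUND(SUM(amount), 2) FROM sales GROUP BY region ORDER BY region"
).fetchall())
print(totals)  # {'EU': 220.0, 'NA': 200.0}
```

Advanced SQL goes well beyond this, into window functions, CTEs, and query-plan tuning, but the DDL/DML split shown here is the foundation.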

There are a ton of resources online for building the above skills; if there is considerable interest in the exact resources that helped me, I can write a follow-up blog with the relevant information.

To wrap up, some tenets that I practice in my day-to-day work are:

  • Business acumen is essential to being successful
  • Automation is key for scalability, but do thorough research before deploying anything to production
  • “No data” is better than inaccurate or partial data
  • Metrics and logging should be embedded from the design phase, not bolted on as an afterthought
  • Picking mature tooling (open-source or enterprise) is critical for longevity and prevents rework

If you enjoyed this article, share it with your friends and colleagues!
