Nerd For Tech

NFT is an Educational Media House. Our mission is to bring the invaluable knowledge and experiences of experts from all over the world to the novice. To know more about us, visit https://www.nerdfortech.org/.

Engineer vs Data Engineer: what’s the difference?

4 min read · Mar 16, 2021

What are the main differences between working as a data engineer and working as a web or backend developer? I spent a couple of years in both roles, and this is my 6-item list of things that are quite different :)

Size does matter
The size of the data you are processing is really important. When designing a pipeline you need to take it into account and choose an algorithm that fits the data, so that the pipeline is stable, quick enough, and doesn’t cost too much (what “too much” means will of course vary a lot from company to company).

For a backend developer, the size of the data is something you look at, but often it only matters at companies that actually get huge traffic to their apps/websites/systems. So chances are you will not need to optimize for it very much when implementing new features and maintaining services.

Algorithm matters
As mentioned before, the algorithm you use also matters a lot. Usually there are many ways to process the data, and some of them are much better than others. They may require additional work and logic, but they can, for example, make the pipeline much faster or less costly (needing fewer resources). Some of the bigger tasks you work on will be about swapping one algorithm for another that, at that point, performs much better.

When working as a full-stack engineer, you usually don’t focus on algorithms too much, mostly because it’s not needed. Most of the focus is on writing code that is easy to change, maintain, and add features to.
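
To make the point about algorithm choice concrete, here is a minimal sketch in plain Python of the same join done two ways. The `orders` and `users` records and both function names are made up for illustration; a real pipeline would usually let an engine like Spark pick the join strategy, but the trade-off is the same: a bit more logic up front for far fewer comparisons.

```python
from collections import defaultdict


def nested_loop_join(orders, users):
    """Compare every order with every user: len(orders) * len(users) checks."""
    joined = []
    for order in orders:
        for user in users:
            if order["user_id"] == user["id"]:
                joined.append((order["order_id"], user["name"]))
    return joined


def hash_join(orders, users):
    """Index users by id once, then look up each order: one pass over each input."""
    users_by_id = defaultdict(list)
    for user in users:
        users_by_id[user["id"]].append(user)

    joined = []
    for order in orders:
        for user in users_by_id.get(order["user_id"], []):
            joined.append((order["order_id"], user["name"]))
    return joined


# Hypothetical sample data, small enough to check by eye.
users = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [
    {"order_id": 10, "user_id": 2},
    {"order_id": 11, "user_id": 1},
    {"order_id": 12, "user_id": 3},  # no matching user, dropped by both joins
]

print(nested_loop_join(orders, users))
print(hash_join(orders, users))
```

Both functions return the same pairs. The difference only shows up as the inputs grow: with a million orders and a million users, the first version does on the order of a trillion comparisons, while the second does a few million dictionary operations.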

You will have a lot of time
This may actually be one of the downsides of working as a data engineer (or one of the upsides :) If you are used to instant feedback on whether a given thing works or not, you will not get it so easily here. A lot of changes require testing on larger amounts of data, so you will often be waiting for some job to finish to get that feedback. It’s good if you find something productive to work on during that time, for example writing tests or learning something new. Most importantly, you should still work on reducing that time to a minimum: it’s much harder to debug jobs that take hours than jobs that take minutes. Shorter jobs are usually also much cheaper ;) but that takes us to a different thing.
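
One possible way to shorten that feedback loop is to run a change on a small, stable sample of the data before running it on everything. The sketch below is only an illustration of that idea, not something from a specific tool: `in_sample`, `sample_rows`, and the `events` records are hypothetical names. It hashes a key with `zlib.crc32` so that the same keys land in the sample on every run, which makes results comparable between runs.

```python
import zlib


def in_sample(key, percent):
    """Return True if this key falls into the deterministic sample."""
    # crc32 gives the same value on every run, unlike Python's built-in
    # hash(), which is randomized per process for strings.
    bucket = zlib.crc32(str(key).encode("utf-8")) % 100
    return bucket < percent


def sample_rows(rows, key, percent):
    """Keep only the rows whose key falls into the sample."""
    return [row for row in rows if in_sample(row[key], percent)]


# Hypothetical input: one record per user.
events = [{"user_id": n, "value": n * 2} for n in range(1000)]

# Run the quick check on roughly 5% of users first.
small = sample_rows(events, "user_id", 5)
print(len(small), "of", len(events), "rows selected")
```

Sampling by key rather than by row keeps all records of a given user together, so per-user logic still behaves the same on the sample as on the full data.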

Your work costs
Of course, running production with their code also costs money for backend/frontend engineers, but here you will feel it on another level. Quite often everything you do, testing included, costs money (often quite a lot). And if you make a mistake and need to re-run something, that costs as well.

Overall it’s nothing to worry about (unless you have a boss who asks you about it quite often). It’s also possible you will spot and make improvements that reduce costs by your monthly salary or more, and then you can easily feel that you deserve the money ;)

You will not be sure if the code works
That’s one of the most frustrating things about data engineering work. For web developers, if your site seems to be working correctly, it ‘usually’ is. For data engineers, the only output of your work is often transformed data, so it’s really easy for your pipelines to happily and silently produce output that is totally incorrect. If you discover it a day later, that’s quite good, but you may only find out a week or a month after the errors were introduced (not necessarily by your team).

To actually help with that, I built an open-source tool for finding such issues (you can find it here :)). A lot of the time you will also build custom dashboards that help identify those cases and make sure that what you are producing is correct.
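
As a rough illustration of what catching silent errors can look like, here is a minimal sketch of a few sanity checks run on a pipeline’s output. It is not the API of the tool mentioned above; `find_data_issues`, its parameters, and the sample rows are all hypothetical, and the 50% threshold is an arbitrary assumption.

```python
def find_data_issues(rows, key, required, previous_count, max_drop=0.5):
    """Return a list of human-readable problems found in the output rows."""
    issues = []

    # A sudden drop in volume compared to the last run is a common warning sign.
    if previous_count and len(rows) < previous_count * (1 - max_drop):
        issues.append(f"row count dropped from {previous_count} to {len(rows)}")

    # Columns that should always be filled in.
    for column in required:
        missing = sum(1 for row in rows if row.get(column) is None)
        if missing:
            issues.append(f"{missing} rows have no value for '{column}'")

    # The key column should identify each row uniquely.
    keys = [row.get(key) for row in rows]
    duplicates = len(keys) - len(set(keys))
    if duplicates:
        issues.append(f"{duplicates} duplicate values in '{key}'")

    return issues


# Hypothetical output of today's run; yesterday's run produced 10 rows.
output = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 2, "amount": 5.0},
]

for issue in find_data_issues(output, key="id", required=["amount"], previous_count=10):
    print("WARNING:", issue)
```

The job itself would finish without any error on this output; only checks like these (or a dashboard showing the same numbers over time) make the problem visible.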

Less googling and fewer libraries to use
As a web developer you often google half of the things you are doing, especially when doing something new, trying out a new library, or adding a feature that is of course already covered by some Django/Flask library. This is most often not the case for data engineers. You will need to learn a bit (for example Spark), and you will google errors from running pipelines a lot (when using Spark), but new features are usually not implemented by adding a new library and googling how to use it :)

So whether you are thinking about changing your career path to become a data engineer or already are one, let me know your opinion on these points. Do you agree and see it in your work (whether as a data engineer or web developer), or do you see it differently?


Published in Nerd For Tech


Written by Mateusz Klimek

Data Engineer at heart, author of the re_data framework, which helps you monitor data quality.
