It is incredible how fast data processing tools and technologies are evolving, and the nature of the data engineering discipline is changing with them. The tools I use today are very different from what I used ten or even five years ago; however, many of the lessons I learned are still relevant.

I started working in the data space long before data engineering became a thing and data scientist became the sexiest job of the 21st century. I ‘officially’ became a big data engineer six years ago, and I know firsthand the challenges that developers with a background in “traditional” data development face on this journey. Of course, this transition is not easy for software engineers either; it is just different.

Even though technologies keep changing (and that’s the reality for anyone working in the tech industry), some of the skills I had to learn are still relevant, yet they are often overlooked by data developers who are just starting to make the transition to data engineering. These are usually the skills that software developers take for granted.

In this post, I will talk about the evolution of data engineering and the skills “traditional” data developers might need to learn today (hint: it is not Hadoop).

The birth of the data engineer

Before the Big Data craze, data teams were composed of business intelligence and ETL developers. Typical BI/ETL developer activities involved moving data sets from location A to location B (ETL) and building web-hosted dashboards with that data (BI). Specialised technologies existed for each of those activities, with the knowledge concentrated within the IT department. Apart from that, however, BI and ETL development had very little to do with software engineering, a discipline that was maturing rapidly at the beginning of the century.

As data volumes grew and interest in data analytics increased over the past ten years, new technologies were invented. Some of them died and others became widely adopted, which in turn changed the demand for skills and the structure of teams. As modern BI tools allowed analysts and business people to create dashboards with minimal support from IT teams, data engineering became a new discipline, applying software engineering principles to ETL development using a new set of tools.


Data engineering in 2020