What exactly is data engineering? How has it evolved in recent years and where is it going? How do you get started in the field? In this episode, Maxime Beauchemin joins me to discuss these questions and more.
Do you want to try out some of the tools and applications that you heard about on the Data Engineering Podcast? Do you have some ETL jobs that need somewhere to run? Check out Linode at www.dataengineeringpodcast.com/linode or use the code DATAENGINEERING17 and get a $20 credit (that’s 4 months free!) to try out their fast and reliable Linux virtual servers. They’ve got lightning fast networking and SSD servers with plenty of power and storage to run whatever you want to experiment on.
- Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
- Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
- You can help support the show by checking out the Patreon page which is linked from the site.
- To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
- Your host is Tobias Macey and today I’m interviewing Maxime Beauchemin
- How did you get involved in the field of data engineering?
- How do you define data engineering and how has that changed in recent years?
- Do you think that the DevOps movement over the past few years has had any impact on the discipline of data engineering? If so, what kinds of cross-over have you seen?
- For someone who wants to get started in the field of data engineering what are some of the necessary skills?
- What do you see as the biggest challenges facing data engineers currently?
- At what scale does it become necessary to differentiate between someone who does data engineering vs data infrastructure and what are the differences in terms of skill set and problem domain?
- How much analytical knowledge is necessary for a typical data engineer?
- What are some of the most important considerations when establishing new data sources to ensure that the resulting information is of sufficient quality?
- You have commented on the fact that data engineering borrows a number of elements from software engineering. Where does the concept of unit testing fit in data management and what are some of the most effective patterns for implementing that practice?
- How has the work done by data engineers and managers of data infrastructure bled back into mainstream software and systems engineering in terms of tools and best practices?
- How do you see the role of data engineers evolving in the next few years?