Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

444 Episodes

Build Your Own End To End Customer Data Platform With Rudderstack - E263

Summary

Collecting, integrating, and activating data are all challenging activities. When that data pertains to your customers, it can become even more complex. To simplify the work of managing the full flow of your customer data and keep you in full control, the team at Rudderstack created their…

14 February 2022 | 00:47:35


Scale Your Spatial Analysis By Building It In SQL With Syntax Extensions - E262

Summary

Along with globalization of our societies comes the need to analyze the geospatial and geotemporal data that is needed to manage the growth in commerce, communications, and other activities. In order to make geospatial analytics more maintainable and scalable there has been an increase in…

07 February 2022 | 00:59:54


Scalable Strategies For Protecting Data Privacy In Your Shared Data Sets - E261

Summary

There are many dimensions to the work of protecting the privacy of users in our data. When you need to share a data set with other teams, departments, or businesses then it is of utmost importance that you eliminate or obfuscate personal information. In this episode Will Thompson explores…

06 February 2022 | 01:00:06


A Reflection On Learning A Lot More Than 97 Things Every Data Engineer Should Know - E260

Summary

The Data Engineering Podcast has been going for five years now and has included conversations and interviews with a huge number of guests, covering a broad range of topics. In addition to that, the host curated the essays contained in the book "97 Things Every Data Engineer Should…

31 January 2022 | 00:41:36


Effective Pandas Patterns For Data Engineering - E259

Summary

Pandas is a powerful tool for cleaning, transforming, manipulating, or enriching data, among many other potential uses. As a result it has become a standard tool for data engineers for a wide range of applications. Matt Harrison is a Python expert with a long history of working with data…

31 January 2022 | 01:00:22


The Importance Of Data Contracts As The Interface For Data Integration With Abhi Sivasailam - E258

Summary

Data platforms are exemplified by a complex set of connections that are subject to a set of constantly evolving requirements. In order to make this a tractable problem it is necessary to define boundaries for communication between concerns, which brings with it the need to establish…

23 January 2022 | 00:56:00


Building And Managing Data Teams And Data Platforms In Large Organizations With Ashish Mrig - E257

Summary

Data engineering is a relatively young and rapidly expanding field, with practitioners having a wide array of experiences as they navigate their careers. Ashish Mrig currently leads the data analytics platform for Wayfair, as well as running a local data engineering meetup. In this episode…

23 January 2022 | 00:52:45


Automated Data Quality Management Through Machine Learning With Anomalo - E256

Summary

Data quality control is a requirement for being able to trust the various reports and machine learning models that rely on the information that you curate. Rules-based systems are useful for validating known requirements, but with the scale and complexity of data in modern…

15 January 2022 | 01:02:30


An Introduction To Data And Analytics Engineering For Non-Programmers - E255

Summary

Applications of data have grown well beyond the venerable business intelligence dashboards that organizations have relied on for decades. Now data is being used to power consumer-facing services, influence organizational behaviors, and build sophisticated machine learning systems. Given this…

15 January 2022 | 00:50:14


Open Source Reverse ETL For Everyone With Grouparoo - E254

Summary

Reverse ETL is a product category that evolved from the landscape of customer data platforms, with a number of companies offering their own implementation of it. While struggling with the work of automating data integration workflows with marketing, sales, and support tools, Brian Leonard…

08 January 2022 | 00:44:57