Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

469 Episodes

Making The Total Cost Of Ownership For External Data Manageable With Crux - E307

Summary

There are extensive and valuable data sets that are available outside the bounds of your organization. Whether that data is public, paid, or scraped, it requires investment and upkeep to acquire and integrate it with your systems. Crux was built to reduce the total cost of acquisition and…

17 July 2022 | 01:07:12


Joe Reis Flips The Script And Interviews Tobias Macey About The Data Engineering Podcast - E308

Summary

Data engineering is a large and growing subject, with new technologies, specializations, and "best practices" emerging at an accelerating pace. This podcast does its best to explore this fractal ecosystem, and has been at it for the past 5+ years. In this episode Joe Reis, founder…

17 July 2022 | 00:56:39


Charting the Path of Riskified's Data Platform Journey - E306

Summary

Building a data platform is a journey, not a destination. Beyond the work of assembling a set of technologies and building integrations across them, there is also the work of growing and organizing a team that can support and benefit from that platform. In this episode Inbar Yogev and Lior…

10 July 2022 | 00:39:57


Maintain Your Data Engineers' Sanity By Embracing Automation - E305

Summary

Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of…

10 July 2022 | 01:05:08


Be Confident In Your Data Integration By Quickly Validating Matching Records With data-diff - E304

Summary

The perennial challenge of data engineers is ensuring that information is integrated reliably. While it is straightforward to know whether a synchronization process succeeded, it is not always clear whether every record was copied correctly. In order to quickly identify if and how two data…

03 July 2022 | 01:10:57


The View From The Lakehouse Of Architectural Patterns For Your Data Platform - E303

Summary

The ecosystem for data tools has been going through rapid and constant evolution over the past several years. These technological shifts have brought about corresponding changes in data and platform architectures for managing data and analytical workflows. In this episode Colleen Tartow…

03 July 2022 | 00:58:44


Bring Geospatial Analytics Across Disparate Datasets Into Your Toolkit With The Unfolded Platform - E301

Summary

The proliferation of sensors and GPS devices has dramatically increased the number of applications for spatial data, and the need for scalable geospatial analytics. In order to reduce the friction involved in aggregating disparate data sets that share geographic similarities, the Unfolded…

27 June 2022 | 01:07:01


Strategies And Tactics For A Successful Master Data Management Implementation - E302

Summary

The most complicated part of data engineering is the effort involved in making the raw data fit into the narrative of the business. Master Data Management (MDM) is the process of building consensus around what the information actually means in the context of the business and then shaping…

27 June 2022 | 01:09:08


Combining The Simplicity Of Spreadsheets With The Power Of Modern Data Infrastructure At Canvas - E300

Summary

Data analysis is a valuable exercise that is often out of reach of non-technical users as a result of the complexity of data systems. In order to lower the barrier to entry, Ryan Buick created the Canvas application with a spreadsheet-oriented workflow that is understandable to a wide…

19 June 2022 | 00:42:58


Level Up Your Data Platform With Active Metadata - E299

Summary

Metadata is the lifeblood of your data platform, providing information about what is happening in your systems. A variety of platforms have been developed to capture and analyze that information to great effect, but they are inherently limited in their utility due to their nature as storage…

19 June 2022 | 00:52:36