Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

454 Episodes

Self Service Open Source Data Integration With AirByte - E173

Summary

Data integration is a critical piece of every data pipeline, yet it is still far from being a solved problem. There are a number of managed platforms available, but the list of options for an open source system that supports a large variety of sources and destinations is still embarrassingly…

23 February 2021 | 00:52:15


Building The Foundations For Data Driven Businesses at 5xData - E172

Summary

Every business aims to be data driven, but not all of them succeed in that effort. In order to be able to truly derive insights from the data that an organization collects, there are certain foundational capabilities that they need to have capacity for. In order to help more businesses…

16 February 2021 | 00:52:16


How Shopify Is Building Their Production Data Warehouse Using DBT - E171

Summary

With all of the tools and services available for building a data platform, it can be difficult to separate the signal from the noise. One of the best ways to get a true understanding of how a technology works in practice is to hear from people who are running it in production. In this…

09 February 2021 | 00:46:31


System Observability For The Cloud Native Era With Chronosphere - E170

Summary

Collecting and processing metrics for monitoring use cases is an interesting data problem. It is eminently possible to generate millions or billions of data points per second, and the information needs to be propagated to a central location, processed, and analyzed in timeframes on the order of…

02 February 2021 | 01:04:50


Making It Easier To Stick B2B Data Integration Pipelines Together With Hotglue - E169

Summary

Businesses often need to be able to ingest data from their customers in order to power the services that they provide. Each new source that they need to integrate with is another custom set of ETL tasks that they need to maintain. In order to reduce the friction involved in…

26 January 2021 | 00:34:05


Using Your Data Warehouse As The Source Of Truth For Customer Data With Hightouch - E168

Summary

The data warehouse has become the central component of the modern data stack. Building on this pattern, the team at Hightouch have created a platform that synchronizes information about your customers out to third party systems for use by marketing and sales teams. In this episode Tejas…

19 January 2021 | 00:59:34


Enabling Version Controlled Data Collaboration With TerminusDB - E167

Summary

As data professionals we have a number of tools available for storing, processing, and analyzing data. We also have tools for collaborating on software and analysis, but collaborating on data is still an underserved capability. Gavin Mendel-Gleason encountered this problem first hand while…

11 January 2021 | 00:57:48


Bringing Feature Stores and MLOps to the Enterprise at Tecton - E166

Summary

As more organizations are gaining experience with data management and incorporating analytics into their decision making, their next move is to adopt machine learning. In order to make those efforts sustainable, the core capability they need is for data scientists and analysts to be able to…

05 January 2021 | 00:47:41


Off The Shelf Data Governance With Satori - E165

Summary

One of the core responsibilities of data engineers is to manage the security of the information that they process. The team at Satori has a background in cybersecurity and they are using the lessons that they learned in that field to address the challenge of access control and auditing for…

28 December 2020 | 00:34:24


Low Friction Data Governance With Immuta - E164

Summary

Data governance is a term that encompasses a wide range of responsibilities, both technical and process oriented. One of the more complex aspects is that of access control to the data assets that an organization is responsible for managing. The team at Immuta has built a platform that aims…

21 December 2020 | 00:53:33