Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

444 Episodes

Build More Reliable Distributed Systems By Breaking Them With Jepsen - E143

Summary

A majority of the scalable data processing platforms that we rely on are built as distributed systems. This brings with it a vast number of subtle ways that errors can creep in. Kyle Kingsbury created the Jepsen framework for testing the guarantees of distributed data processing systems and…

28 July 2020 | 00:49:38


Making Wind Energy More Efficient With Data At Turbit Systems - E142

Summary

Wind energy is an important component of an ecologically friendly power system, but there are a number of variables that can affect the overall efficiency of the turbines. Michael Tegtmeier founded Turbit Systems to help operators of wind farms identify and correct problems that contribute…

21 July 2020 | 00:40:48


Open Source Production Grade Data Integration With Meltano - E141

Summary

The first stage of every data pipeline is extracting the information from source systems. There are a number of platforms for managing data integration, but there is a notable lack of a robust and easy-to-use open source option. The Meltano project is aiming to provide a solution to that…

13 July 2020 | 01:05:19


DataOps For Streaming Systems With Lenses.io - E140

Summary

There are an increasing number of use cases for real-time data, and the systems to power them are becoming more mature. Once you have a streaming platform up and running, you need a way to keep an eye on it, including observability, discovery, and governance of your data. That’s what…

06 July 2020 | 00:45:36


Data Collection And Management To Power Sound Recognition At Audio Analytic - E139

Summary

We have machines that can listen to and process human speech in a variety of languages, but dealing with unstructured sounds in our environment is a much greater challenge. The team at Audio Analytic are working to impart a sense of hearing to our myriad devices with their sound recognition…

30 June 2020 | 00:57:29


Bringing Business Analytics To End Users With GoodData - E138

Summary

The majority of analytics platforms are focused on use internal to an organization by business stakeholders. As the availability of data increases and overall literacy in how to interpret it and take action improves, there is a growing need to bring business intelligence use cases to a…

23 June 2020 | 00:52:24


Accelerate Your Machine Learning With The StreamSQL Feature Store - E137

Summary

Machine learning is a process driven by iteration and experimentation, which requires fast and easy access to relevant features of the data being processed. In order to reduce friction in the process of developing and delivering models, there has been a recent trend toward building a…

15 June 2020 | 00:46:13


Data Management Trends From An Investor Perspective - E136

Summary

The landscape of data management and processing is rapidly changing and evolving. There are certain foundational elements that have remained steady, but as the industry matures, new trends emerge and gain prominence. In this episode Astasia Myers of Redpoint Ventures shares her perspective…

08 June 2020 | 00:54:59


Building A Data Lake For The Database Administrator At Upsolver - E135

Summary

Data lakes offer a great deal of flexibility and the potential for reduced cost for your analytics, but they also introduce a great deal of complexity. What used to be entirely managed by the database engine is now a composition of multiple systems that need to be properly configured to…

02 June 2020 | 00:56:17


Mapping The Customer Journey For B2B Companies At Dreamdata - E134

Summary

Gaining a complete view of the customer journey is especially difficult in B2B companies. This is due to the number of different individuals involved and the myriad ways that they interface with the business. Dreamdata integrates data from the multitude of platforms that are used by these…

25 May 2020 | 00:47:00