Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

459 Episodes

Fast And Flexible Headless Data Analytics With Cube.JS - E248

Summary

One of the perennial challenges of data analytics is having a consistent set of definitions, along with a flexible and performant API endpoint for querying them. In this episode Artyom Keydunov and Pavel Tiunov share their work on Cube.js and the various ways that it is being used in the…

21 December 2021 | 00:54:43


Building A System Of Record For Your Organization's Data Ecosystem At Metaphor - E247

Summary

Building a well-managed data ecosystem for your organization requires a holistic view of all of the producers, consumers, and processors of information. The team at Metaphor are building a fully connected metadata layer to provide both technical and social intelligence about your data. In…

20 December 2021 | 01:05:34


Building Auditable Spark Pipelines At Capital One - E246

Summary

Spark is a powerful and battle-tested framework for building highly scalable data pipelines. Because of its proven ability to handle large volumes of data, Capital One has invested in it for their business needs. In this episode Gokul Prabagaren shares his use for it in calculating your…

13 December 2021 | 00:42:10


Deliver Personal Experiences In Your Applications With The Unomi Open Source Customer Data Platform - E245

Summary

The core of providing your users with excellent service is understanding them and offering a personalized experience. Unfortunately, many sites and applications take that to the extreme and collect too much information. In order to make it easier for developers to build customer profiles in a…

12 December 2021 | 00:57:34


Data Driven Hiring For Data Professionals With Alooba - E244

Summary

Hiring data professionals is challenging for a multitude of reasons, and as with every interview process there is a potential for bias to creep in. Tim Freestone founded Alooba to provide a more stable reference point for evaluating candidates to ensure that you can make more informed…

04 December 2021 | 00:50:03


Experimentation and A/B Testing For Modern Data Teams With Eppo - E243

Summary

A/B testing and experimentation are the most reliable way to determine whether a change to your product will have the desired effect on your business. Unfortunately, being able to design, deploy, and validate experiments is a complex process that requires a mix of technical capacity and…

04 December 2021 | 00:58:01


Creating A Unified Experience For The Modern Data Stack At Mozart Data - E242

Summary

The modern data stack has been gaining a lot of attention recently, with a rapidly growing set of managed services for different stages of the data lifecycle. With all of the available options it is possible to run a scalable, production-grade data platform with a small team, but there are…

27 November 2021 | 00:58:31


Doing DataOps For External Data Sources As A Service at Demyst - E241

Summary

The data that you have access to affects the questions that you can answer. By using external data sources you can drastically increase the range of analysis that is available to your organization. The challenge comes in all of the operational aspects of finding, accessing, organizing, and…

27 November 2021 | 00:59:17


Exploring Processing Patterns For Streaming Data Integration In Your Data Lake - E240

Summary

One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. With the improvements in streaming engines it is now possible to perform all of your data integration in near real time, but it can be challenging to understand the proper processing…

20 November 2021 | 00:52:53


Laying The Foundation Of Your Data Platform For The Era Of Big Complexity With Dagster - E239

Summary

The technology for scaling storage and processing of data has gone through massive evolution over the past decade, leaving us with the ability to work with massive datasets at the cost of massive complexity. Nick Schrock created the Dagster framework to help tame that complexity and scale…

20 November 2021 | 01:05:25