Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

486 Episodes

Bringing The Modern Data Stack To Everyone With Y42 - E295

Summary

Cloud services have made highly scalable and performant data platforms economical and manageable for data teams. However, they are still challenging to work with and manage for anyone who isn’t in a technical role. Hung Dang understood the need to make data more accessible to the…

06 June 2022 | 00:59:02


Data Cloud Cost Optimization With Bluesky Data - E293

Summary

The latest generation of data warehouse platforms has brought unprecedented operational simplicity and effectively infinite scale. Along with those benefits, they have also introduced a new consumption model that can lead to incredibly expensive bills at the end of the month. In order to…

30 May 2022 | 01:03:25


A Multipurpose Database For Transactions And Analytics To Simplify Your Data Architecture With Singlestore - E294

Summary

A large fraction of data engineering work involves moving data from one storage location to another in order to support different access and query patterns. Singlestore aims to cut down on the number of database engines that you need to run so that you can reduce the amount of copying that…

30 May 2022 | 00:41:22


Unlocking The Value Of Data Across The Organization Through User Friendly Data Tools With Prophecy - E292

Summary

The interfaces and design cues that a tool offers can have a massive impact on who is able to use it and the tasks that they are able to perform. With an eye to making data workflows more accessible to everyone in an organization, Raj Bains and his team at Prophecy designed a powerful and…

23 May 2022 | 01:10:56


Cloud Native Data Orchestration For Machine Learning And Data Engineering With Flyte - E291

Summary

Machine learning has become a meaningful target for data applications, bringing with it an increase in the complexity of orchestrating the entire data flow. Flyte is a project that was started at Lyft to address their internal needs for machine learning and integrated closely with…

23 May 2022 | 01:07:08


Designing And Deploying IoT Analytics For Industrial Applications At Vopak - E289

Summary

Industrial applications are one of the primary adopters of Internet of Things (IoT) technologies, with business critical operations being informed by data collected across a fleet of sensors. Vopak is a business that manages storage and distribution of a variety of liquids that are critical…

16 May 2022 | 00:47:55


Insights And Advice On Building A Data Lake Platform From Someone Who Learned The Hard Way - E290

Summary

Designing a data platform is a complex and iterative undertaking which requires accounting for many conflicting needs. Designing a platform that relies on a data lake as its central architectural tenet adds additional layers of difficulty. Srivatsan Sridharan has had the opportunity to…

16 May 2022 | 00:58:11


Scaling Analysis of Connected Data And Modeling Complex Relationships With The TigerGraph Graph Database - E288

Summary

Many of the events, ideas, and objects that we try to represent through data have a high degree of connectivity in the real world. These connections are best represented and analyzed as graphs to provide efficient and accurate analysis of their relationships. TigerGraph is a leading…

09 May 2022 | 00:39:56


Exploring The Insights And Impact Of Dan Delorey's Distinguished Career In Data - E287

Summary

Dan Delorey helped to build the core technologies of Google’s cloud data services for many years before embarking on his latest adventure as the VP of Data at SoFi. From being an early engineer on the Dremel project, to helping launch and manage BigQuery, on to helping enterprises…

09 May 2022 | 01:00:51


Leading The Charge For The ELT Data Integration Pattern For Cloud Data Warehouses At Matillion - E286

Summary

The predominant pattern for data integration in the cloud has become extract, load, and then transform, or ELT. Matillion was an early innovator of that approach, and in this episode CTO Ed Thompson explains how they have evolved the platform to keep pace with the rapidly changing ecosystem.…

02 May 2022 | 00:53:20