Data Engineering Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

The Grand Vision And Present Reality of DataOps - E183

Summary

The data industry is changing rapidly, and one of the most active areas of growth is automation of data workflows. Taking cues from the DevOps movement of the past decade, data professionals are orienting around the concept of DataOps. More than just a collection of tools, there are a number…

04 May 2021 | 00:57:08

Self Service Data Exploration And Dashboarding With Superset - E182

Summary

The reason for collecting, cleaning, and organizing data is to make it usable by the organization. One of the most common and widely used methods of access is through a business intelligence dashboard. Superset is an open source option that has…

27 April 2021 | 00:47:25

Moving Machine Learning Into The Data Pipeline at Cherre - E181

Summary

Most of the time when you think about a data pipeline or ETL job, what comes to mind is a purely mechanistic progression of functions that move data from point A to point B. Sometimes, however, one of those transformations is actually a full-fledged machine learning project in its own right.…

20 April 2021 | 00:48:05

Exploring The Expanding Landscape Of Data Professions with Josh Benamram of Databand - E180

Summary

"Business as usual" is changing, with more companies investing in data as a first class concern. As a result, the data team is growing and introducing more specialized roles. In this episode Josh Benamram, CEO and co-founder of Databand, describes the motivations for these…

13 April 2021 | 01:08:36

Put Your Whole Data Team On The Same Page With Atlan - E179

Summary

One of the biggest obstacles to success in delivering data products is cross-team collaboration. Part of the problem is the difference in the information that each role requires to do their job and where they expect to find it. This introduces a barrier to communication that is difficult to…

06 April 2021 | 00:57:37

Data Quality Management For The Whole Team With Soda Data - E178

Summary

Data quality is at the top of everyone’s mind lately, but getting it right is as challenging as ever. One of the contributing factors is the number of people who are involved in the process and the potential impact on the business if something goes wrong. In this episode Maarten…

30 March 2021 | 00:58:00

Real World Change Data Capture At Datacoral - E177

Summary

The world of business is becoming increasingly dependent on information that is accurate up to the minute. For analytical systems, the only way to provide this reliably is by implementing change data capture (CDC). Unfortunately, this is a non-trivial undertaking, particularly for teams…

23 March 2021 | 00:49:58

Managing The DoorDash Data Platform - E176

Summary

The team at DoorDash has a complex set of optimization challenges to deal with using data that they collect from a multi-sided marketplace. In order to handle the volume and variety of information that they use to run and improve the business, the data team has to build a platform that…

16 March 2021 | 00:46:05

Leave Your Data Where It Is And Automate Feature Extraction With Molecula - E175

Summary

A majority of the time spent in data engineering is copying data between systems to make the information available for different purposes. This introduces challenges such as keeping information synchronized, managing schema evolution, building transformations to match the expectations of…

09 March 2021 | 00:51:40

Bridging The Gap Between Machine Learning And Operations At Iguazio - E174

Summary

The process of building and deploying machine learning projects requires a staggering number of systems and stakeholders to work in concert. In this episode Yaron Haviv, co-founder of Iguazio, discusses the complexities inherent to the process, as well as how he has worked to democratize…

02 March 2021 | 01:06:28