Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.



469 Episodes

Business Intelligence In The Palm Of Your Hand With Zing Data - E348

Summary

Business intelligence is the foremost application of data in organizations of all sizes. The typical conception of how it is accessed is through a web or desktop application running on a powerful laptop. Zing Data is building a mobile native platform for business intelligence. This opens…

05 December 2022 | 00:46:47


Adopting Real-Time Data At Organizations Of Every Size - E347

Summary

The term "real-time data" brings with it a combination of excitement, uncertainty, and skepticism. The promise of insights that are always accurate and up to date is appealing to organizations, but the technical realities to make it possible have been complex and expensive. In…

05 December 2022 | 00:50:25


Supporting And Expanding The Arrow Ecosystem For Fast And Efficient Data Processing At Voltron Data - E346

Summary

The data ecosystem has been growing rapidly, with new communities joining and bringing their preferred programming languages to the mix. This has led to inefficiencies in how data is stored, accessed, and shared across process and system boundaries. The Arrow project is designed to…

28 November 2022 | 00:50:25


Analyze Massive Data At Interactive Speeds With The Power Of Bitmaps Using FeatureBase - E345

Summary

The most expensive part of working with massive data sets is the work of retrieving and processing the files that contain the raw information. FeatureBase (formerly Pilosa) avoids that overhead by converting the data into bitmaps. In this episode Matt Jaffee explains how to model your data…

28 November 2022 | 00:59:25


A Look At The Data Systems Behind The Gameplay For League Of Legends - E344

Summary

The majority of blog posts and presentations about data engineering and analytics assume that the consumers of those efforts are internal business users accessing an environment controlled by the business. In this episode Ian Schweer shares his experiences at Riot Games supporting…

21 November 2022 | 01:01:29


Tame The Entropy In Your Data Stack And Prevent Failures With Sifflet - E343

Summary

The problems that are easiest to fix are the ones that you prevent from happening in the first place. Sifflet is a platform that brings your entire data stack into focus to improve the reliability of your data assets and empower collaboration across your teams. In this episode CEO and…

21 November 2022 | 00:46:47


Taking A Look Under The Hood At CreditKarma's Data Platform - E341

Summary

CreditKarma builds data products that help consumers take advantage of their credit and financial capabilities. To make that possible they need a reliable data platform that empowers all of the organization’s stakeholders. In this episode Vishnu Venkataraman shares the journey that he…

14 November 2022 | 00:52:03


Build Data Products Without A Data Team Using AgileData - E342

Summary

Building data products is an undertaking that has historically required substantial investments of time and talent. With the rise in cloud platforms and self-serve data technologies, the barrier to entry is dropping. Shane Gibson co-founded AgileData to make analytics accessible to companies…

14 November 2022 | 01:12:30


Build Better Data Products By Creating Data, Not Consuming It - E339

Summary

A lot of the work that goes into data engineering is trying to make sense of the "data exhaust" from other applications and services. There is an undeniable amount of value and utility in that information, but it also introduces significant cost and time requirements. In this…

07 November 2022 | 01:05:20


Clean Up Your Data Using Scalable Entity Resolution And Data Mastering With Zingg - E340

Summary

Despite the best efforts of data engineers, data is as messy as the real world. Entity resolution and fuzzy matching are powerful utilities for cleaning up data from disconnected sources, but they have typically required custom development and training machine learning models. Sonal Goyal…

07 November 2022 | 00:46:47