Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

469 Episodes

Data Quality Starts At The Source - E238

Summary

The most important gauge of success for a data platform is the level of trust in the accuracy of the information that it provides. In order to build and maintain that trust, it is necessary to invest in defining, monitoring, and enforcing data quality metrics. In this episode Michael Harper…

14 November 2021 | 00:58:55


Eliminate Friction In Your Data Platform Through Unified Metadata Using OpenMetadata - E237

Summary

A significant source of friction and wasted effort in building and integrating data management systems is the fragmentation of metadata across various tools. After experiencing the impacts of fragmented metadata and previous attempts at building a solution, Suresh Srinivas and Sriharsha…

10 November 2021 | 01:06:55


Business Intelligence Beyond The Dashboard With ClicData - E236

Summary

Business intelligence is often equated with a collection of dashboards that show various charts and graphs representing data for an organization. What is overlooked in that characterization is the level of complexity and effort that are required to collect and present that information, and…

06 November 2021 | 01:02:00


Exploring The Evolution And Adoption of Customer Data Platforms and Reverse ETL - E235

Summary

The precursor to widespread adoption of cloud data warehouses was the creation of customer data platforms. Acting as a centralized repository of information about how your customers interact with your organization, they drove a wave of analytics about how to improve products based on actual…

05 November 2021 | 01:02:07


Removing The Barrier To Exploratory Analytics with Activity Schema and Narrator - E234

Summary

The perennial question of data warehousing is how to model the information that you are storing. This has given rise to methods as varied as star and snowflake schemas, data vault modeling, and wide tables. The challenge with many of those approaches is that they are optimized for answering…

29 October 2021 | 01:08:49


Streaming Data Pipelines Made SQL With Decodable - E233

Summary

Streaming data systems have been growing more capable and flexible over the past few years. Despite this, it is still challenging to build reliable pipelines for stream processing. In this episode Eric Sammer discusses the shortcomings of the current set of streaming engines and how they…

29 October 2021 | 01:09:32


Data Exploration For Business Users Powered By Analytics Engineering With Lightdash - E232

Summary

The market for business intelligence has been going through an evolutionary shift in recent years. One of the driving forces for that change has been the rise of analytics engineering powered by dbt. Lightdash has fully embraced that shift by building an entire open source business…

23 October 2021 | 01:06:03


Completing The Feedback Loop Of Data Through Operational Analytics With Census - E231

Summary

The focus of the past few years has been to consolidate all of the organization’s data into a cloud data warehouse. As a result, there have been a number of trends in data that take advantage of the warehouse as a single focal point. Among those trends is the advent of operational…

21 October 2021 | 01:09:06


Bringing The Power Of The DataHub Real-Time Metadata Graph To Everyone At Acryl Data - E230

Summary

The binding element of all data work is the metadata graph that is generated by all of the workflows that produce the assets used by teams across the organization. The DataHub project was created as a way to bring order to the scale of LinkedIn’s data needs. It was also designed to be…

16 October 2021 | 01:08:19


How And Why To Become Data Driven As A Business - E229

Summary

Organizations of all sizes are striving to become data driven, starting in earnest with the rise of big data a decade ago. With the never-ending growth in data sources and methods for aggregating and analyzing them, the use of data to direct the business has become a requirement. Randy Bean…

14 October 2021 | 01:02:00