Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.


444 Episodes

Streaming Data Pipelines Made SQL With Decodable - E233

Summary

Streaming data systems have been growing more capable and flexible over the past few years. Despite this, it is still challenging to build reliable pipelines for stream processing. In this episode Eric Sammer discusses the shortcomings of the current set of streaming engines and how they…

29 October 2021 | 01:09:32


Data Exploration For Business Users Powered By Analytics Engineering With Lightdash - E232

Summary

The market for business intelligence has been going through an evolutionary shift in recent years. One of the driving forces for that change has been the rise of analytics engineering powered by dbt. Lightdash has fully embraced that shift by building an entire open source business…

23 October 2021 | 01:06:03


Completing The Feedback Loop Of Data Through Operational Analytics With Census - E231

Summary

The focus of the past few years has been to consolidate all of the organization’s data into a cloud data warehouse. As a result there have been a number of trends in data that take advantage of the warehouse as a single focal point. Among those trends is the advent of operational…

21 October 2021 | 01:09:06


Bringing The Power Of The DataHub Real-Time Metadata Graph To Everyone At Acryl Data - E230

Summary

The binding element of all data work is the metadata graph that is generated by all of the workflows that produce the assets used by teams across the organization. The DataHub project was created as a way to bring order to the scale of LinkedIn’s data needs. It was also designed to be…

16 October 2021 | 01:08:19


How And Why To Become Data Driven As A Business - E229

Summary

Organizations of all sizes are striving to become data driven, starting in earnest with the rise of big data a decade ago. With the never-ending growth in data sources and methods for aggregating and analyzing them, the use of data to direct the business has become a requirement. Randy Bean…

14 October 2021 | 01:02:00


Make Your Business Metrics Reusable With Open Source Headless BI Using Metriql - E228

Summary

The key to making data valuable to business users is the ability to calculate meaningful metrics and explore them along useful dimensions. Business intelligence tools have provided this capability for years, but they don’t offer a means of exposing those metrics to other systems.…

08 October 2021 | 00:43:37


Adding Support For Distributed Transactions To The Redpanda Streaming Engine - E227

Summary

Transactions are a necessary feature for ensuring that a set of actions are all performed as a single unit of work. In streaming systems this is necessary to ensure that a set of messages or transformations are all executed together across different queues. In this episode Denis Rystsov…

06 October 2021 | 00:45:59


Building Real-Time Data Platforms For Large Volumes Of Information With Aerospike - E226

Summary

Aerospike is a database engine that is designed to provide millisecond response times for queries across terabytes or petabytes. In this episode Chief Strategy Officer, Lenley Hensarling, explains how the ability to process these large volumes of information in real-time allows businesses…

02 October 2021 | 01:07:38


Delivering Your Personal Data Cloud With Prifina - E225

Summary

The promise of online services is that they will make your life easier in exchange for collecting data about you. The reality is that they use more information than you realize for purposes that are not what you intended. There have been many attempts to harness all of the data that you…

30 September 2021 | 01:12:11


Digging Into Data Reliability Engineering - E224

Summary

The accuracy and availability of data have become critically important to the day-to-day operation of businesses. Similar to the practice of site reliability engineering as a means of ensuring consistent uptime of web services, there has been a new trend of building data reliability…

26 September 2021 | 00:58:07