Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.


454 Episodes

Datapreneurs - How Today's Business Leaders Are Using Data To Define The Future - E383

Summary: Data has been one of the most substantial drivers of business and economic value for the past few decades. Bob Muglia has had a front-row seat to many of the major shifts driven by technology over his career. In his recent book "Datapreneurs" he reflects on the people and businesses that he has known and worked with and how they relied on…

17 July 2023 | 00:54:45


Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling - E382

Summary: For business analytics the way that you model the data in your warehouse has a lasting impact on what types of questions can be answered quickly and easily. The major strategies in use today were created decades ago when the software and hardware for warehouse databases were far more constrained. In this episode Maxime Beauchemin of Airflow…

09 July 2023 | 01:12:55


How Data Engineering Teams Power Machine Learning With Feature Platforms - E381

Summary: Feature engineering is a crucial aspect of the machine learning workflow. To make that possible, there are a number of technical and procedural capabilities that must be in place first. In this episode Razi Raziuddin shares how data engineering teams can support the machine learning workflow through the development and support of systems…

03 July 2023 | 01:03:30


Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh - E380

Summary: Data transformation is a key activity for all of the organizational roles that interact with data. Because of its importance and outsized impact on what is possible for downstream data consumers it is critical that everyone is able to collaborate seamlessly. SQLMesh was designed as a unifying tool that is simple to work with but powerful…

25 June 2023 | 00:50:19


How Column-Aware Development Tooling Yields Better Data Models - E379

Summary: Architectural decisions are all based on certain constraints and a desire to optimize for different outcomes. In data systems one of the core architectural exercises is data modeling, which can have significant impacts on what is and is not possible for downstream use cases. By incorporating column-level lineage in the data modeling process…

18 June 2023 | 00:46:20


Build Better Tests For Your dbt Projects With Datafold And data-diff - E378

Summary: Data engineering is all about building workflows, pipelines, systems, and interfaces to provide stable and reliable data. Your data can be stable and wrong, but then it isn't reliable. Confidence in your data is achieved through constant validation and testing. Datafold has invested a lot of time into integrating with the workflow of dbt…

11 June 2023 | 00:48:22


Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service - E377

Summary: A significant portion of the time spent by data engineering teams is on managing the workflows and operations of their pipelines. DataOps has arisen as a parallel set of practices to that of DevOps teams as a means of reducing wasted effort. Agile Data Engine is a platform designed to handle the infrastructure side of the DataOps equation,…

04 June 2023 | 00:54:06


A Roadmap To Bootstrapping The Data Team At Your Startup - E376

Summary: Building a data team is hard in any circumstance, but at a startup it can be even more challenging. The requirements are fluid, you probably don't have a lot of existing data talent to manage the hiring and onboarding, and there is a need to move fast. Ghalib Suleiman has been on both sides of this equation and joins the show to share his…

29 May 2023 | 00:42:32


Keep Your Data Lake Fresh With Real Time Streams Using Estuary - E375

Summary: Batch vs. streaming is a long-running debate in the world of data integration and transformation. Proponents of the streaming paradigm argue that stream processing engines can easily handle batched workloads, but the reverse isn't true. The batch world has been the default for years because of the complexities of running a reliable…

21 May 2023 | 00:55:51


What Happens When The Abstractions Leak On Your Data - E374

Summary: All of the advancements in our technology are based around the principles of abstraction. These are valuable until they break down, which is an inevitable occurrence. In this episode the host Tobias Macey shares his reflections on recent experiences where the abstractions leaked and some observations on how to deal with that situation in a…

15 May 2023 | 00:26:42