Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

478 Episodes

Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service - E377

Summary A significant portion of the time spent by data engineering teams is on managing the workflows and operations of their pipelines. DataOps has arisen as a parallel set of practices to that of DevOps teams as a means of reducing wasted effort. Agile Data Engine is a platform designed to handle the infrastructure side of the DataOps equation,…

04 June 2023 | 00:54:06


A Roadmap To Bootstrapping The Data Team At Your Startup - E376

Summary Building a data team is hard in any circumstance, but at a startup it can be even more challenging. The requirements are fluid, you probably don't have a lot of existing data talent to manage the hiring and onboarding, and there is a need to move fast. Ghalib Suleiman has been on both sides of this equation and joins the show to share his…

29 May 2023 | 00:42:32


Keep Your Data Lake Fresh With Real Time Streams Using Estuary - E375

Summary Batch vs. streaming is a long-running debate in the world of data integration and transformation. Proponents of the streaming paradigm argue that stream processing engines can easily handle batched workloads, but the reverse isn't true. The batch world has been the default for years because of the complexities of running a reliable…

21 May 2023 | 00:55:51


What Happens When The Abstractions Leak On Your Data - E374

Summary All of the advancements in our technology are based around the principles of abstraction. These are valuable until they break down, which is an inevitable occurrence. In this episode the host Tobias Macey shares his reflections on recent experiences where the abstractions leaked and some observations on how to deal with that situation in a…

15 May 2023 | 00:26:42


Use Consistent And Up To Date Customer Profiles To Power Your Business With Segment Unify - E373

Summary Every business has customers, and a critical element of success is understanding who they are and how they are using the company's products or services. The challenge is that most companies have a multitude of systems that contain fragments of the customer's interactions, and stitching that together is complex and time consuming. Segment…

07 May 2023 | 00:54:35


Realtime Data Applications Made Easier With Meroxa - E372

Summary Real-time capabilities have quickly become an expectation for consumers. The complexity of providing those capabilities is still high, however, making it more difficult for small teams to compete. Meroxa was created to enable teams of all sizes to deliver real-time data applications. In this episode DeVaris Brown discusses the types of…

24 April 2023 | 00:45:26


Building Self Serve Business Intelligence With AI And Semantic Modeling At Zenlytic - E371

Summary Business intelligence has been chasing the promise of self-serve data for decades. As the capabilities of these systems have improved and become more accessible, the target of what self-serve means changes. With the availability of AI powered by large language models combined with the evolution of semantic layers, the team at Zenlytic have…

16 April 2023 | 00:49:19


An Exploration Of The Composable Customer Data Platform - E370

Summary The customer data platform is a category of services that was developed early in the evolution of the current era of cloud services for data processing. When it was difficult to wire together the event collection, data modeling, reporting, and activation it made sense to buy monolithic products that handled every stage of the customer data…

10 April 2023 | 01:11:42


Mapping The Data Infrastructure Landscape As A Venture Capitalist - E369

Summary The data ecosystem has been building momentum for several years now. As a venture capital investor Matt Turck has been trying to keep track of the main trends and has compiled his findings into the MAD (ML, AI, and Data) landscape reports each year. In this episode he shares his experiences building those reports and the perspective he has…

03 April 2023 | 01:01:57


Unlocking The Potential Of Streaming Data Applications Without The Operational Headache At Grainite - E368

Summary The promise of streaming data is that it allows you to react to new information as it happens, rather than introducing latency by batching records together. The peril is that building a robust and scalable streaming architecture is always more complicated and error-prone than you think it's going to be. After experiencing this unfortunate…

25 March 2023 | 01:13:34