Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.



459 Episodes

Pushing The Limits Of Scalability And User Experience For Data Processing With Jignesh Patel - E408

Summary Data processing technologies have dramatically improved in their sophistication and raw throughput. Unfortunately, the volumes of data that are being generated continue to double, requiring further advancements in the platform capabilities to keep up. As the sophistication increases, so does the complexity, leading to challenges for user…


07 January 2024 | 00:50:26


Designing Data Platforms For Fintech Companies - E407

Summary Working with financial data requires a high degree of rigor due to the numerous regulations and the risks involved in security breaches. In this episode Andrey Korchack, CTO of fintech startup Monite, discusses the complexities of designing and implementing a data platform in that sector. Announcements Hello and welcome to the Data…


01 January 2024 | 00:47:57


Troubleshooting Kafka In Production - E406

Summary Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka Troubleshooting in Production". In this episode he…


24 December 2023 | 01:14:44


Adding An Easy Mode For The Modern Data Stack With 5X - E405

Summary The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools for every job. The reality was that it left data teams in the position of spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X…


18 December 2023 | 00:56:12


Run Your Own Anomaly Detection For Your Critical Business Metrics With Anomstack - E404

Summary If your business metrics looked weird tomorrow, would you know about it first? Anomaly detection is focused on identifying those outliers for you, so that you are the first to know when a business critical dashboard isn't right. Unfortunately, it can often be complex or expensive to incorporate anomaly detection into your data platform.…


11 December 2023 | 00:51:18


Designing Data Transfer Systems That Scale - E403

Summary The first step of data pipelines is to move the data to a place where you can process and prepare it for its eventual purpose. Data transfer systems are a critical component of data enablement, and building them to support large volumes of information is a complex endeavor. Andrei Tserakhau has dedicated his career to this problem, and in…


04 December 2023 | 01:03:57


Addressing The Challenges Of Component Integration In Data Platform Architectures - E402

Summary Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components. In this episode Tobias Macey shares his thoughts on the challenges that he is…


27 November 2023 | 00:29:43


Unlocking Your dbt Projects With Practical Advice For Practitioners - E401

Summary The dbt project has become overwhelmingly popular across analytics and data engineering teams. While it is easy to adopt, there are many potential pitfalls. Dustin Dorsey and Cameron Cyr co-authored a practical guide to building your dbt project. In this episode they share their hard-won wisdom about how to build and scale your dbt…


20 November 2023 | 01:16:04


Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine - E400

Summary Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI powered assistant for software engineers. In this episode Eran…


13 November 2023 | 01:07:52


Shining Some Light In The Black Box Of PostgreSQL Performance - E399

Summary Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. In this episode Lukas Fittl shares some hard-won wisdom about the causes and solutions of many performance bottlenecks and the work that he is…


06 November 2023 | 00:54:52