Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

463 Episodes

Addressing The Challenges Of Component Integration In Data Platform Architectures - E402

Summary Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components. In this episode Tobias Macey shares his thoughts on the challenges that he is…

27 November 2023 | 00:29:43


Unlocking Your dbt Projects With Practical Advice For Practitioners - E401

Summary The dbt project has become overwhelmingly popular across analytics and data engineering teams. While it is easy to adopt, there are many potential pitfalls. Dustin Dorsey and Cameron Cyr co-authored a practical guide to building your dbt project. In this episode they share their hard-won wisdom about how to build and scale your dbt…

20 November 2023 | 01:16:04


Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine - E400

Summary Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI-powered assistant for software engineers. In this episode Eran…

13 November 2023 | 01:07:52


Shining Some Light In The Black Box Of PostgreSQL Performance - E399

Summary Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. In this episode Lukas Fittl shares some hard-won wisdom about the causes and solutions of many performance bottlenecks and the work that he is…

06 November 2023 | 00:54:52


Surveying The Market Of Database Products - E398

Summary Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she…

30 October 2023 | 00:47:12


Defining A Strategy For Your Data Products - E397

Summary The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products being the delivery mechanism for information. In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the…

23 October 2023 | 01:03:50


Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable - E396

Summary Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems…

15 October 2023 | 01:08:29


Using Data To Illuminate The Intentionally Opaque Insurance Industry - E395

Summary The insurance industry is notoriously opaque and hard to navigate. Max Cho found that fact frustrating enough that he decided to build a business of making policy selection more navigable. In this episode he shares his journey of data collection and analysis and the challenges of automating an intentionally manual…

09 October 2023 | 00:51:58


Building ETL Pipelines With Generative AI - E394

Summary Artificial intelligence applications require substantial high-quality data, which is provided through ETL pipelines. Now that AI has reached the level of sophistication seen in the various generative models, it is being used to build new ETL workflows. In this episode Jay Mishra shares his experiences and insights building ETL pipelines with…

01 October 2023 | 00:51:37


Powering Vector Search With Real Time And Incremental Vector Indexes - E393

Summary The rapid growth of machine learning, especially large language models, has led to a commensurate growth in the need to store and compare vectors. In this episode Louis Brandy discusses the applications for vector search capabilities both in and outside of AI, as well as the challenges of maintaining real-time indexes of vector…

25 September 2023 | 00:59:16