This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Support the show!Listen in your favorite app:
Fountain TrueFans Podverse Podcast Guru Apple Podcasts Spotify Pick your app with Episodes.fmHere are shows you might like
Summary The rapid growth of machine learning, especially large language models, have led to a commensurate growth in the need to store and compare vectors. In this episode Louis Brandy discusses the applications for vector search capabilities both in and outside of AI, as well as the challenges of maintaining real-time indexes of vector…
Summary The rapid growth of machine learning, especially large language models, have led to a…
25 September 2023 | 00:59:16
Summary A significant amount of time in data engineering is dedicated to building connections and semantic meaning around pieces of information. Linked data technologies provide a means of tightly coupling metadata with raw information. In this episode Brian Platz explains how JSON-LD can be used as a shared representation of linked data for…
Summary A significant amount of time in data engineering is dedicated to building connections and…
17 September 2023 | 01:01:31
Summary Data systems are inherently complex and often require integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations. This offers a single location for managing visibility and error handling so that data platform engineers can manage complexity. In this…
Summary Data systems are inherently complex and often require integration of multiple technologies.…
10 September 2023 | 01:01:26
Summary Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to…
Summary Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of…
04 September 2023 | 00:42:13
Summary Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low…
Summary Data persistence is one of the most challenging aspects of computer systems. In the era of…
28 August 2023 | 01:01:10
Summary Generative AI has unlocked a massive opportunity for content creation. There is also an unfulfilled need for experts to be able to share their knowledge and build communities. Illumidesk was built to take advantage of this intersection. In this episode Greg Werner explains how they are using generative AI as an assistive tool for creating…
Summary Generative AI has unlocked a massive opportunity for content creation. There is also an…
20 August 2023 | 00:54:52
Summary Data pipelines are the core of every data product, ML model, and business intelligence dashboard. If you're not careful you will end up spending all of your time on maintenance and fire-fighting. The folks at Rivery distilled the seven principles of modern data pipelines that will help you stay out of trouble and be productive with your…
Summary Data pipelines are the core of every data product, ML model, and business intelligence…
14 August 2023 | 00:47:03
Summary As businesses increasingly invest in technology and talent focused on data engineering and analytics, they want to know whether they are benefiting. So how do you calculate the return on investment for data? In this episode Barr Moses and Anna Filippova explore that question and provide useful exercises to start answering that in your…
Summary As businesses increasingly invest in technology and talent focused on data engineering and…
06 August 2023 | 01:01:53
Summary All software systems are in a constant state of evolution. This makes it impossible to select a truly future-proof technology stack for your data platform, making an eventual migration inevitable. In this episode Gleb Mezhanskiy and Rob Goretsky share their experiences leading various data platform migrations, and the hard-won lessons that…
Summary All software systems are in a constant state of evolution. This makes it impossible to…
31 July 2023 | 01:09:53
Summary Real-time data processing has steadily been gaining adoption due to advances in the accessibility of the technologies involved. Despite that, it is still a complex set of capabilities. To bring streaming data in reach of application engineers Matteo Pelati helped to create Dozer. In this episode he explains how investing in high performance…
Summary Real-time data processing has steadily been gaining adoption due to advances in the…
24 July 2023 | 00:40:43