This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Summary Generative AI promises to accelerate the productivity of human collaborators. Currently, the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. To simplify the integration of AI capabilities into developer workflows, Tsavo Knott helped create Pieces, a powerful…
28 April 2024 | 00:50:10
Summary Generative AI has rapidly transformed everything in the technology sector. When Andrew Lee started work on Shortwave, he was focused on making email more productive. When AI started gaining adoption, he realized that he had even more potential for a transformative experience. In this episode he shares the technical challenges that he and his…
21 April 2024 | 00:53:43
Summary Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing…
14 April 2024 | 01:16:02
Summary Maintaining a single source of truth for your data is the biggest challenge in data engineering. Different roles and tasks in the business need their own ways to access and analyze the data in the organization. In order to enable this use case, while maintaining a single point of access, the semantic layer has evolved as a technological…
07 April 2024 | 00:56:23
Summary Working with data is a complicated process, with numerous chances for something to go wrong. Identifying and accounting for those errors is a critical piece of building trust in the organization that your data is accurate and up to date. While there are numerous products available to provide that visibility, they all have different…
31 March 2024 | 00:50:44
Summary A core differentiator of Dagster in the ecosystem of data orchestration is their focus on software-defined assets as a means of building declarative workflows. With their launch of Dagster+ as the redesigned commercial companion to the open source project, they are investing in that capability with a suite of new features. In this episode…
24 March 2024 | 00:55:40
Summary A significant portion of data workflows involve storing and processing information in database engines. Validating that the information is stored and processed correctly can be complex and time-consuming, especially when the source and destination speak different dialects of SQL. In this episode Gleb Mezhanskiy, founder and CEO of Datafold,…
17 March 2024 | 00:58:14
Summary Data lakehouse architectures are gaining popularity due to the flexibility and cost effectiveness that they offer. The link that bridges the gap between data lake and warehouse capabilities is the catalog. The primary purpose of the catalog is to inform the query engine of what data exists and where, but the Nessie project aims to go beyond…
10 March 2024 | 00:40:55
Summary Artificial intelligence technologies promise to revolutionize business and produce new sources of value. Making those promises a reality requires a substantial amount of strategy and investment. Colleen Tartow has worked across all stages of the data lifecycle, and in this episode she shares her hard-earned wisdom about…
03 March 2024 | 00:46:25
Summary Building a database engine requires a substantial amount of engineering effort and time investment. Over the decades of research and development into building these software systems, a number of common components have emerged that are shared across implementations. When Paul Dix decided to rewrite the InfluxDB engine, he found the Apache Arrow…
25 February 2024 | 00:56:01