Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

459 Episodes

Accelerate Development Of Enterprise Analytics With The Coalesce Visual Workflow Builder - E278

Summary

The flexibility of software-oriented data workflows is useful for fulfilling complex requirements, but for simple and repetitious use cases it adds significant complexity. Coalesce is a platform designed to reduce repetitive work for common workflows by adopting a visual pipeline builder to…

03 April 2022 | 00:42:46


Repeatable Patterns For Designing Data Platforms And When To Customize Them - E277

Summary

Building a data platform for your organization is a challenging undertaking. Building multiple data platforms for other organizations as a service without burning out is another thing entirely. In this episode Brandon Beidel from Red Ventures shares his experiences as a data product manager…

03 April 2022 | 00:47:02


Eliminate The Bottlenecks In Your Key/Value Storage With SpeeDB - E276

Summary

At the foundational layer many databases and data processing engines rely on key/value storage for managing the layout of information on the disk. RocksDB is one of the most popular choices for this component and has been incorporated into popular systems such as ksqlDB. As these systems…

27 March 2022 | 00:46:53


Building A Data Governance Bridge Between Cloud And Datacenters For The Enterprise At Privacera - E275

Summary

Data governance is a practice that requires a high degree of flexibility and collaboration at the organizational and technical levels. The growing prominence of cloud and hybrid environments in data management adds additional stress to an already complex endeavor. Privacera is an enterprise…

27 March 2022 | 01:02:35


Exploring Incident Management Strategies For Data Teams - E274

Summary

Data assets and the pipelines that create them have become critical production infrastructure for companies. This adds a requirement for reliability and management of up-time similar to application infrastructure. In this episode Francisco Alberini and Mei Tao share their insights on what…

20 March 2022 | 00:57:26


Accelerate Your Embedded Analytics With Apache Pinot - E273

Summary

Data and analytics are permeating every system, including customer-facing applications. The introduction of embedded analytics to an end-user product creates a significant shift in requirements for your data layer. The Pinot OLAP datastore was created for this purpose, optimizing for low…

20 March 2022 | 01:12:56


Accelerating Adoption Of The Modern Data Stack At 5X Data - E272

Summary

The modern data stack is a constantly moving target, which makes it difficult to adopt without prior experience. In order to accelerate the time to deliver useful insights at organizations of all sizes that are looking to take advantage of these new and evolving architectures, Tarush Aggarwal…

14 March 2022 | 00:53:51


Taking A Multidimensional Approach To Data Observability At Acceldata - E271

Summary

Data observability is a term that has been co-opted by numerous vendors with varying ideas of what it should mean. At Acceldata, they view it as a holistic approach to understanding the computational and logical elements that power your analytical capabilities. In this episode Tristan…

14 March 2022 | 01:03:17


Move Your Database To The Data And Speed Up Your Analytics With DuckDB - E270

Summary

When you think about selecting a database engine for your project, you typically consider options focused on serving multiple concurrent users. Sometimes what you really need is an embedded database that is blazing fast for single-user workloads. DuckDB is an in-process database engine…

05 March 2022 | 01:17:02


Developer Friendly Application Persistence That Is Fast And Scalable With HarperDB - E269

Summary

Databases are an important component of application architectures, but they are often difficult to work with. HarperDB was created with the core goal of being a developer-friendly database engine. In the process they ended up creating a scalable distributed engine that works across edge and…

05 March 2022 | 00:49:34