Data Engineering Podcast
Episode Archive
422 episodes of Data Engineering Podcast since the first episode, which aired on January 7th, 2017.
-
Maintaining Your Data Lake At Scale With Spark
June 16th, 2019 | 50 mins 50 secs
A conversation with the architect of Delta Lake on the challenges of building a sustainable data lake at scale
-
Managing The Machine Learning Lifecycle
June 9th, 2019 | 1 hr 2 mins
An interview about how the open source Hydrosphere platform simplifies management of the full machine learning lifecycle
-
Evolving An ETL Pipeline For Better Productivity
June 4th, 2019 | 1 hr 2 mins
An interview about how and why Greenhouse migrated their homegrown ETL pipeline onto DataCoral
-
Data Lineage For Your Pipelines
May 26th, 2019 | 49 mins 1 sec
An interview about how the open source Pachyderm platform makes it easy to build flexible data pipelines with first-class support for data lineage
-
Build Your Data Analytics Like An Engineer With DBT
May 19th, 2019 | 56 mins 46 secs
An interview about how dbt enables your data teams to build better analytics in your data warehouse
-
Using FoundationDB As The Bedrock For Your Distributed Systems
May 6th, 2019 | 1 hr 6 mins
An interview about the FoundationDB project and how it simplifies the work of building custom distributed systems applications
-
Running Your Database On Kubernetes With KubeDB
April 28th, 2019 | 50 mins 54 secs
An interview about how to run your database on Kubernetes with the creator of KubeDB
-
Unpacking Fauna: A Global Scale Cloud Native Database
April 22nd, 2019 | 53 mins 50 secs
A deep dive on building the Fauna database and how it supports transactions at global scale
-
Index Your Big Data With Pilosa For Faster Analytics
April 15th, 2019 | 43 mins 41 secs
An interview about the Pilosa bitmap index server and how it can be used to run fast, continuous analytics on large and complex data sets
-
Serverless Data Pipelines On DataCoral
April 7th, 2019 | 53 mins 41 secs
An interview about how DataCoral is building an abstraction layer over data pipelines using microservices built on serverless technologies
-
Why Analytics Projects Fail And What To Do About It
March 31st, 2019 | 36 mins 30 secs
An interview about the common factors that contribute to failure in analytics projects and how data engineers can help keep them on the path to success
-
Building An Enterprise Data Fabric At CluedIn
March 25th, 2019 | 57 mins 49 secs
An interview about building an enterprise data fabric at scale to ease enterprise data integration
-
A DataOps vs DevOps Cookoff In The Data Kitchen
March 18th, 2019 | 54 mins 31 secs
An interview about the current state of DataOps and how it's not just DevOps for data
-
Customer Analytics At Scale With Segment
March 4th, 2019 | 47 mins 46 secs
An interview about the platform Segment has built for routing streams of customer analytics data
-
Deep Learning For Data Engineers
February 24th, 2019 | 42 mins 46 secs
An interview about what data engineers need to know about deep learning
-
Speed Up Your Analytics With The Alluxio Distributed Storage System
February 18th, 2019 | 59 mins 44 secs
An interview about the Alluxio distributed virtual in-memory file system