Data Engineering Podcast
Episode Archive
422 episodes of Data Engineering Podcast since the first episode, which aired on January 7th, 2017.
-
Migrate And Modify Your Data Platform Confidently With Compilerworks
August 18th, 2021 | 1 hr 6 mins
An interview with Shevek, CTO of Compilerworks, about how they are using compiler technology to aid in migrating your data processing between platforms and gain insight into your dependencies through advanced data lineage.
-
Prepare Your Unstructured Data For Machine Learning And Computer Vision Without The Toil Using Activeloop
August 14th, 2021 | 48 mins 39 secs
An interview with Davit Buniatyan about his work on Activeloop and the open source Hub framework for reducing the toil involved in getting your unstructured data ready for computer vision and machine learning projects.
-
Build Trust In Your Data By Understanding Where It Comes From And How It Is Used With Stemma
August 10th, 2021 | 52 mins 36 secs
An interview with Stemma founder and CEO Mark Grover about how the platform can be used to establish trust in your data and an understanding of how that data is being used.
-
Data Discovery From Dashboards To Databases With Castor
August 6th, 2021 | 52 mins 46 secs
An interview about how the Castor platform approaches the problem of data discovery and preserving context for your organization.
-
Charting A Path For Streaming Data To Fill Your Data Lake With Hudi
August 3rd, 2021 | 1 hr 9 mins
An interview about the Hudi project and how it allows for integrating streaming data sources into analytical queries across your data lake.
-
Adding Context And Comprehension To Your Analytics Through Data Discovery With SelectStar
July 30th, 2021 | 51 mins 23 secs
An interview with Shinji Kim about her experience building the SelectStar data discovery platform to streamline communications about data across your organization.
-
Building a Multi-Tenant Managed Platform For Streaming Data With Pulsar at Datastax
July 27th, 2021 | 1 hr 12 secs
An interview about the operational and architectural complexities of building a managed service of Apache Pulsar at scale for Datastax to power streaming data workloads.
-
Bringing The Metrics Layer To The Masses With Transform
July 22nd, 2021 | 1 hr 1 min
An interview with Nick Handel about the benefits of a unified metrics layer for improving the confidence of your analytics and his work at Transform to make it accessible to everyone.
-
Strategies For Proactive Data Quality Management
July 19th, 2021 | 1 hr 1 min
An interview with Gleb Mezhanskiy about his work at Datafold and how it has informed his strategies for proactive management of data quality across your organization.
-
Low Code And High Quality Data Engineering For The Whole Organization With Prophecy
July 16th, 2021 | 1 hr 12 mins
An interview with Raj Bains about how the Prophecy platform provides a smooth experience for the whole organization to build high quality data engineering workflows with a unified model that brings engineers and business users together in one experience.
-
Exploring The Design And Benefits Of The Modern Data Stack
July 12th, 2021 | 49 mins 1 sec
A conversation about the design and motivation of the "modern data stack" and how it can simplify the work of building a self-service data platform that enables everyone in the business to ask and answer questions with data.
-
Democratize Data Cleaning Across Your Organization With Trifacta
July 9th, 2021 | 1 hr 7 mins
An interview with Trifacta CEO Adam Wilson about how the platform is used to democratize data cleaning for everyone in the organization.
-
Stick All Of Your Systems And Data Together With SaaSGlue As Your Workflow Manager
July 5th, 2021 | 55 mins 31 secs
An interview about how the SaaSGlue workflow manager simplifies the process of sticking together all of your clouds, services, and data pipelines.
-
Leveling Up Open Source Data Integration With Meltano Hub And The Singer SDK
July 2nd, 2021 | 1 hr 5 mins
An interview with the Meltano team about how they are investing in the Singer ecosystem for data integration with the Meltano Hub and Singer SDK.
-
A Candid Exploration Of Timeseries Data Analysis With InfluxDB
June 28th, 2021 | 1 hr 6 mins
An interview with Paul Dix about the technology that powers the Influx Data platform and the architectural evolution that keeps delivering better performance for your timeseries data.
-
Lessons Learned From The Pipeline Data Engineering Academy
June 25th, 2021 | 1 hr 11 mins
An interview with the co-founders of the Pipeline Data Engineering Academy about the lessons that they learned along with their first cohort of students.