Tobias Macey
Host of Data Engineering Podcast
Tobias Macey is a dedicated engineer with experience spanning many years and even more domains. He currently leads the Technical Operations team at MIT Open Learning, where he designs and builds cloud infrastructure to power online access to education for the global MIT community. He also owns and operates Boundless Notions, LLC, where he offers design, review, and implementation advice on data infrastructure and cloud automation.
In addition to the Data Engineering Podcast, he hosts Podcast.__init__, where he explores the universe of ways that the Python language is being used. By applying his experience in building and scaling data infrastructure and processing workflows, he helps the audience explore and understand the challenges inherent to data management.
Tobias Macey has hosted 423 Episodes.
-
The Workflow Engine For Data Engineers And Data Scientists
June 24th, 2019 | 1 hr 8 mins
An interview about how the Prefect workflow engine unifies the needs of data engineers and data scientists with a pure Python API
-
Maintaining Your Data Lake At Scale With Spark
June 16th, 2019 | 50 mins 50 secs
A conversation with the architect of Delta Lake on the challenges of building a sustainable data lake at scale
-
Managing The Machine Learning Lifecycle
June 9th, 2019 | 1 hr 2 mins
An interview about how the open source Hydrosphere platform simplifies management of the full machine learning lifecycle
-
Evolving An ETL Pipeline For Better Productivity
June 4th, 2019 | 1 hr 2 mins
An interview about how and why Greenhouse migrated their homegrown ETL pipeline onto DataCoral
-
Data Lineage For Your Pipelines
May 26th, 2019 | 49 mins 1 sec
An interview about how the open source Pachyderm platform makes it easy to build flexible data pipelines with first-class support for data lineage
-
Build Your Data Analytics Like An Engineer With DBT
May 19th, 2019 | 56 mins 46 secs
An interview about how dbt enables your data teams to build better analytics in your data warehouse
-
Using FoundationDB As The Bedrock For Your Distributed Systems
May 6th, 2019 | 1 hr 6 mins
An interview about the FoundationDB project and how it simplifies the work of building custom distributed systems applications
-
Running Your Database On Kubernetes With KubeDB
April 28th, 2019 | 50 mins 54 secs
An interview about how to run your database on Kubernetes with the creator of KubeDB
-
Unpacking Fauna: A Global Scale Cloud Native Database
April 22nd, 2019 | 53 mins 50 secs
A deep dive on building the Fauna database and how it supports transactions at global scale
-
Index Your Big Data With Pilosa For Faster Analytics
April 15th, 2019 | 43 mins 41 secs
An interview about the Pilosa bitmap index server and how it can be used to run fast, continuous analytics on large and complex data sets
-
Serverless Data Pipelines On DataCoral
April 7th, 2019 | 53 mins 41 secs
An interview about how DataCoral is building an abstraction layer over data pipelines using microservices built on serverless technologies
-
Why Analytics Projects Fail And What To Do About It
March 31st, 2019 | 36 mins 30 secs
An interview about the common factors that contribute to failure in analytics projects and how data engineers can help keep them on the path to success
-
Building An Enterprise Data Fabric At CluedIn
March 25th, 2019 | 57 mins 49 secs
An interview about building an enterprise data fabric at scale to ease enterprise data integration
-
A DataOps vs DevOps Cookoff In The Data Kitchen
March 18th, 2019 | 54 mins 31 secs
An interview about the current state of DataOps and how it's not just DevOps for data
-
Customer Analytics At Scale With Segment
March 4th, 2019 | 47 mins 46 secs
An interview about the platform Segment has built for routing streams of customer analytics data
-
Deep Learning For Data Engineers
February 24th, 2019 | 42 mins 46 secs
An interview about what data engineers need to know about deep learning