Tobias Macey
Host of Data Engineering Podcast
Tobias Macey is a dedicated engineer with experience spanning many years and even more domains. He currently manages and leads the Technical Operations team at MIT Open Learning, where he designs and builds cloud infrastructure to power online access to education for the global MIT community. He also owns and operates Boundless Notions, LLC, where he offers design, review, and implementation advice on data infrastructure and cloud automation.
In addition to the Data Engineering Podcast, he hosts Podcast.__init__, where he explores the universe of ways that the Python language is being used. By applying his experience in building and scaling data infrastructure and processing workflows, he helps the audience explore and understand the challenges inherent to data management.
Tobias Macey has hosted 423 Episodes.
-
Keeping Your Data Warehouse In Order With DataForm
October 14th, 2019 | 47 mins 4 secs
An interview about Dataform and how it helps you to keep your data warehouse in good working order
-
Fast Analytics On Semi-Structured And Structured Data In The Cloud
October 7th, 2019 | 54 mins 38 secs
An interview about the architecture of Rockset and how they built a serverless platform for fast and flexible analytics on your semi-structured data
-
Ship Faster With An Opinionated Data Pipeline Framework
September 30th, 2019 | 35 mins 8 secs
An interview about how the open source Kedro framework makes it faster and easier to build your end-to-end data pipeline for machine learning projects
-
Open Source Object Storage For All Of Your Data
September 22nd, 2019 | 1 hr 8 mins
An interview on the open source MinIO platform for fast and flexible object storage for data intensive applications and analytics that runs everywhere
-
Navigating Boundless Data Streams With The Swim Kernel
September 18th, 2019 | 57 mins 55 secs
An interview about using stateful computation on data streams with the SwimOS kernel to improve your analytics
-
Building A Reliable And Performant Router For Observability Data
September 9th, 2019 | 55 mins 19 secs
An interview about building the Vector project to unify delivery of logs and metrics for better system observability
-
Building A Community For Data Professionals at Data Council
September 2nd, 2019 | 52 mins 46 secs
An interview with Pete Soderling about building and growing the Data Council events and helping engineers build businesses
-
Building Tools And Platforms For Data Analytics
August 26th, 2019 | 48 mins 6 secs
An interview on what data engineers need to know about building tools and platforms for data analytics
-
A High Performance Platform For The Full Big Data Lifecycle
August 19th, 2019 | 1 hr 13 mins
An interview about the HPCC Systems platform, its journey to open source, and how it handles the full lifecycle of big data for enterprise scale analytics
-
Digging Into Data Replication At Fivetran
August 12th, 2019 | 44 mins 40 secs
An interview about how the Fivetran platform is designed to handle data replication as a service
-
Solving Data Discovery At Lyft
August 5th, 2019 | 51 mins 48 secs
An interview about the open source Amundsen platform for data discovery and how Lyft is using it to improve their analytics workflow
-
Simplifying Data Integration Through Eventual Connectivity
July 28th, 2019 | 53 mins 47 secs
An interview about a new pattern for data integration that reduces the amount of effort required to find connections in numerous data sets
-
Straining Your Data Lake Through A Data Mesh
July 22nd, 2019 | 1 hr 4 mins
An interview about how the data mesh architectural and organizational pattern can lead to a more maintainable data platform
-
Data Labeling That You Can Feel Good About With CloudFactory
July 14th, 2019 | 57 mins 50 secs
An interview about the CloudFactory platform for data labeling and social good in developing nations
-
Scale Your Analytics On The Clickhouse Data Warehouse
July 8th, 2019 | 1 hr 11 mins
An interview about ClickHouse, an open source, columnar data warehouse built for massive scale and speed to enable interactive analytics
-
Stress Testing Kafka And Cassandra For Real-Time Anomaly Detection
July 1st, 2019 | 38 mins 2 secs
An interview about testing the limits of scaling Kafka and Cassandra for real-time anomaly detection at Instaclustr