
Tobias Macey
Host of Data Engineering Podcast
Tobias Macey is a dedicated engineer with experience spanning many years and even more domains. He currently manages and leads the Technical Operations team at MIT Open Learning, where he designs and builds cloud infrastructure to power online access to education for the global MIT community. He also owns and operates Boundless Notions, LLC, where he offers design, review, and implementation advice on data infrastructure and cloud automation.
In addition to the Data Engineering Podcast, he hosts Podcast.__init__, where he explores the universe of ways that the Python language is being used. By applying his experience in building and scaling data infrastructure and processing workflows, he helps the audience explore and understand the challenges inherent to data management.
Tobias Macey has hosted 360 Episodes.
-
Build A Common Understanding Of Your Data Reliability Rules With Soda Core and Soda Checks Language
September 25th, 2022 | 41 mins 1 sec
An interview with Tom Baeyens about the Soda Checks Language and how it was designed to express the various concerns involved in data reliability engineering in a format that is approachable by everyone.
-
Building A Shared Understanding Of Data Assets In A Business Through A Single Pane Of Glass With Workstream
September 18th, 2022 | 54 mins 51 secs
An interview with Nicholas Freund about his efforts at Workstream to build a single view of data assets and their status across the organization in a context that is understandable by everyone.
-
Operational Analytics To Increase Efficiency For Multi-Location Businesses With OpsAnalitica
September 18th, 2022 | 1 hr 32 mins
In this episode Tommy Yionoulis talks about how incorporating deliberate data collection into business processes can drive important operational insights in multi-location businesses and his work at OpsAnalitica to make it manageable.
-
Build Confidence In Your Data Platform With Schema Compatibility Reports That Span Systems And Domains Using Schemata
September 11th, 2022 | 59 mins 39 secs
An interview with Ananth Packildurai about the Schemata project and how it provides visibility into the connections and compatibility of schemas that flow from source systems through all of your transformations and into your data assets.
-
Building Data Pipelines That Run From Source To Analysis And Activation With Hevo Data
September 11th, 2022 | 57 mins 15 secs
An interview with Manish Jethani about the Hevo Data platform for building end-to-end data pipelines that automate flows from source systems, into the warehouse, and out to operational platforms without all of the maintenance overhead.
-
Introduce Climate Analytics Into Your Data Platform Without The Heavy Lifting Using Sust Global
September 4th, 2022 | 54 mins 18 secs
An interview with Gopal Erinjippurath about Sust Global's work to bring climate analytics into your data platform through robust APIs and curated data sets.
-
A Reflection On Data Observability As It Reaches Broader Adoption
September 4th, 2022 | 58 mins 39 secs
-
An Exploration Of What Data Automation Can Provide To Data Engineers And Ascend's Journey To Make It A Reality
August 28th, 2022 | 1 hr 3 mins
An interview with Sean Knapp about the potential impact of data automation and the various considerations and capabilities that are required to make it a reality.
-
Alumni Of AirBnB's Early Years Reflect On What They Learned About Building Data Driven Organizations
August 28th, 2022 | 1 hr 10 mins
An interview with alumni of AirBnB's formative years as a data driven organization about the lessons that they learned there and how they are carrying them forward in the founding of new data companies.
-
An Exploration Of The Expectations, Ecosystem, and Realities Of Real-Time Data Applications
August 21st, 2022 | 1 hr 6 mins
An interview with Shruti Bhat about the state of the ecosystem for real-time data applications and the motivating factors for when and how to build them.
-
Understanding The Role Of The Chief Data Officer
August 21st, 2022 | 47 mins 10 secs
An interview with Tracy Daniels, CDO of Truist, about the role and responsibilities of the Chief Data Officer and when your organization might need one.
-
Bringing Automation To Data Labeling For Machine Learning With Watchful
August 13th, 2022 | 1 hr 20 mins
An interview with Shayan Mohanty about the challenges of building repeatable data labeling processes and how Watchful is building a platform to let domain experts codify their knowledge for automated labeling of training data for machine learning projects.
-
Collecting And Retaining Contextual Metadata For Powerful And Effective Data Discovery
August 13th, 2022 | 53 mins 24 secs
An interview with Shinji Kim about the challenges of collecting contextual metadata for your information assets and how to organize it to power effective data discovery for everyone in the business.
-
Useful Lessons And Repeatable Patterns Learned From Data Mesh Implementations At AgileLab
August 6th, 2022 | 48 mins 30 secs
An interview with Paolo Platter about the experience that he and his team at AgileLab have had implementing Data Mesh strategies at multiple organizations and the repeatable patterns that they have built into their Data Mesh Boost product.
-
Optimize Your Machine Learning Development And Serving With The Open Source Vector Database Milvus
August 6th, 2022 | 58 mins 51 secs
An interview with Frank Liu about the open source vector database Milvus and how its native storage of vector embeddings reduces the friction involved in building and deploying machine learning models.
-
Interactive Exploratory Data Analysis On Petabyte Scale Data Sets With Arkouda
July 31st, 2022 | 40 mins 37 secs
An interview with David Bader about the Arkouda framework for exploratory data analysis at interactive speeds across massive data sets and how it supports operating from a single laptop to multiple servers in the cloud or thousands of cores on a supercomputer.