The most interesting and challenging bugs always happen in production, but recreating them is difficult because the data you work with in development differs from what is in production. Building your own scripts to replicate data from production is time-consuming and error-prone. Tonic is a platform designed to solve the problem of having reliable, production-like data available for developing and testing your software, analytics, and machine learning projects. In this episode Adam Kamor explores the factors that make this such a complex problem to solve, the approach that he and his team have taken to turn it into a reliable product, and how you can start using it to replace your own collection of scripts.
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Truly leveraging and benefiting from streaming data is hard - the data stack is costly, difficult to use and still has limitations. Materialize breaks down those barriers with a true cloud-native streaming database - not simply a database that connects to streaming systems. With a PostgreSQL-compatible interface, you can now work with real-time data using ANSI SQL including the ability to perform multi-way complex joins, which support stream-to-stream, stream-to-table, table-to-table, and more, all in standard SQL. Go to dataengineeringpodcast.com/materialize today and sign up for early access to get started. If you like what you see and want to help make it better, they're hiring across all functions!
- Data and analytics leaders, 2023 is your year to sharpen your leadership skills, refine your strategies and lead with purpose. Join your peers at Gartner Data & Analytics Summit, March 20 – 22 in Orlando, FL for 3 days of expert guidance, peer networking and collaboration. Listeners can save $375 off standard rates with code GARTNERDA. Go to dataengineeringpodcast.com/gartnerda today to find out more.
- Your host is Tobias Macey and today I'm interviewing Adam Kamor about Tonic, a service for generating data sets that are safe for development, analytics, and machine learning
- How did you get involved in the area of data management?
- Can you describe what Tonic is and the story behind it?
- What are the core problems that you are trying to solve?
- What are some of the ways that fake or obfuscated data is used in development and analytics workflows?
- Challenges of reliably subsetting data
- Impact of ORMs and bad habits developers get into with database modeling
- Can you describe how Tonic is implemented?
- What are the units of composition that you are building to allow for evolution and expansion of your product?
- How have the design and goals of the platform evolved since you started working on it?
- Can you describe some of the different workflows that customers build on top of your various tools?
- What are the most interesting, innovative, or unexpected ways that you have seen Tonic used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Tonic?
- When is Tonic the wrong choice?
- What do you have planned for the future of Tonic?
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email email@example.com with your story.
- To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.
- Ruby on Rails
- Entity Framework
- Oracle DB