Summary
In this episode, Shilpa Kolhar, SVP of Product and Engineering at MongoDB, discusses using MongoDB as a unified foundation for AI-driven and agentic applications. She explains how the Application Modernization Platform (AMP) accelerates the transition from legacy relational systems to a document-first architecture, driven by the need for AI-readiness and speed of change. Shilpa highlights MongoDB's features, such as its native JSON document model, Atlas Vector Search, auto-embeddings, and integrated search, which help eliminate drift and latency across operational data, indexing, and vectors, emphasizing the importance of keeping context, transactions, and embeddings together for real-time AI use cases. She shares best practices for re-architecting legacy systems, including schema validation and versioning patterns to tame schema drift, aggregation pipelines for consistent reads, and pragmatic standardization across services, while also detailing AMP's approach to scoping large estates and the balance of LLM-powered automation with human-in-the-loop governance.
Announcements
Parting Question
In this episode, Shilpa Kolhar, SVP of Product and Engineering at MongoDB, discusses using MongoDB as a unified foundation for AI-driven and agentic applications. She explains how the Application Modernization Platform (AMP) accelerates the transition from legacy relational systems to a document-first architecture, driven by the need for AI-readiness and speed of change. Shilpa highlights MongoDB's features, such as its native JSON document model, Atlas Vector Search, auto-embeddings, and integrated search, which help eliminate drift and latency across operational data, indexing, and vectors, emphasizing the importance of keeping context, transactions, and embeddings together for real-time AI use cases. She shares best practices for re-architecting legacy systems, including schema validation and versioning patterns to tame schema drift, aggregation pipelines for consistent reads, and pragmatic standardization across services, while also detailing AMP's approach to scoping large estates and the balance of LLM-powered automation with human-in-the-loop governance.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- If you lead a data team, you know this pain: Every department needs dashboards, reports, custom views, and they all come to you. So you're either the bottleneck slowing everyone down, or you're spending all your time building one-off tools instead of doing actual data work. Retool gives you a way to break that cycle. Their platform lets people build custom apps on your company data—while keeping it all secure. Type a prompt like 'Build me a self-service reporting tool that lets teams query customer metrics from Databricks—and they get a production-ready app with the permissions and governance built in. They can self-serve, and you get your time back. It's data democratization without the chaos. Check out Retool at dataengineeringpodcast.com/retool today and see how other data teams are scaling self-service. Because let's be honest—we all need to Retool how we handle data requests.
- Your host is Tobias Macey and today I'm interviewing Shilpa Kolhar about using MongoDB as the foundation for AI-driven applications
- Introduction
- How did you get involved in the area of data management?
- Can you describe what MongoDB is and the core primitives that it offers?
- The MongoDB engine has gone through substantial evolution since it was first introduced over 20 years ago. What are some of the most notable features that have been added in recent years?
- You recently launched the MongoDB Application Modernization Platform (AMP). What are the key elements of modernization that it is focused on?
- How do the core primitives of the MongoDB engine align with modernization objectives?
- There is a lot of attention being paid now to AI applications where data is the most critical element for success. What are the features of MongoDB that lend itself to being the context store for generative AI services?
- Besides the data used for context and grounding, AI applications also want to track user interactions and form short and long term memory to improve the system over time. How can MongoDB assist in that work as well?
- While the lack of schema enforcement on write can be beneficial to rapid evolution of software, it can also be a detriment if not managed well. How can MongoDB help in avoiding schema drift over time that leads to old data being incompatible with current code?
- What are the most interesting, innovative, or unexpected ways that you have seen MongoDB used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on MongoDB and application modernization?
- When is MongoDB/AMP the wrong choice?
- What do you have planned for the future of AMP?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- MongoDB
- MongoDB AMP
- Google Gemini
- Voyage AI
- Qdrant
- ChromaDB
- Weaviate
- Pinecone
- MongoDB Autoembedding
- Retool
- ODM == Object Document Mapper
- RAG == Retrieval Augmented Generation
- Agentic Memory