Data Engineering Podcast


This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

01 February 2026

Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows - E499


Summary 
In this episode, Tim Sehn, founder and CEO of DoltHub, talks about Dolt, the world’s first version‑controlled SQL database, and why Git‑style semantics belong at the heart of data systems and AI workflows. Tim explains how Dolt combines a MySQL/Postgres‑compatible interface with a novel storage engine built on a “Prolly Tree” to enable fast, row‑level branching, merging, and diffs of both schema and data. He digs into real production use cases: powering applications that expose version control to end users, reproducible ML feature stores, managing massive configuration for games, and enabling safe agentic writes via branch‑based review flows. He compares Dolt’s approach to LakeFS, Neon, and PlanetScale, and explores developer workflows unlocked by decentralized clones, full audit logs, and PR‑style data reviews.
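To make the idea of row-level diffs and merges concrete, here is a minimal Python sketch of a branch-based review flow. It is a toy model under stated assumptions, not Dolt's storage engine or API: tables are plain dictionaries keyed by primary key, a "branch" is a full copy, and the names `diff_rows`, `merge_rows`, `main`, and `agent_branch` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple                # hypothetical: a row is a tuple of column values
Table = Dict[int, Row]     # hypothetical: primary key -> row


def diff_rows(old: Table, new: Table) -> Dict[int, Tuple[Optional[Row], Optional[Row]]]:
    """Return {pk: (old_row, new_row)} for every row that differs.

    None on the left means the row was inserted; None on the right
    means it was deleted.
    """
    changes = {}
    for pk in old.keys() | new.keys():
        before, after = old.get(pk), new.get(pk)
        if before != after:
            changes[pk] = (before, after)
    return changes


def merge_rows(base: Table, ours: Table, theirs: Table) -> Tuple[Table, List[int]]:
    """Three-way merge at row granularity.

    Returns (merged_table, conflicts). A conflict is a primary key that
    both sides changed, relative to base, to different values.
    """
    our_changes = diff_rows(base, ours)
    their_changes = diff_rows(base, theirs)
    merged = dict(ours)
    conflicts = []
    for pk, (_, their_row) in their_changes.items():
        if pk in our_changes and our_changes[pk][1] != their_row:
            conflicts.append(pk)       # leave our version in place for review
            continue
        if their_row is None:
            merged.pop(pk, None)       # they deleted the row
        else:
            merged[pk] = their_row     # they inserted or updated the row
    return merged, sorted(conflicts)


# An agent works on its own branch while main keeps moving.
main = {1: ("alice", 10), 2: ("bob", 20)}
agent_branch = dict(main)              # toy "branch": a full copy
agent_branch[2] = ("bob", 25)          # agent updates a row
agent_branch[3] = ("carol", 30)        # agent inserts a row
main_later = dict(main)
main_later[1] = ("alice", 11)          # an unrelated change lands on main

review = diff_rows(main, agent_branch)                 # what a reviewer would inspect
merged, conflicts = merge_rows(main, main_later, agent_branch)
```

In this sketch the reviewer inspects `review` before the merge, which mirrors the PR-style flow described in the summary. Copying the whole table per branch is the part a real versioned storage engine would avoid by sharing unchanged data between versions.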

Announcements 
  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • If you lead a data team, you know this pain: Every department needs dashboards, reports, custom views, and they all come to you. So you're either the bottleneck slowing everyone down, or you're spending all your time building one-off tools instead of doing actual data work. Retool gives you a way to break that cycle. Their platform lets people build custom apps on your company data while keeping it all secure. Type a prompt like 'Build me a self-service reporting tool that lets teams query customer metrics from Databricks' and they get a production-ready app with the permissions and governance built in. They can self-serve, and you get your time back. It's data democratization without the chaos. Check out Retool at dataengineeringpodcast.com/retool today and see how other data teams are scaling self-service. Because let's be honest: we all need to Retool how we handle data requests.
  • Your host is Tobias Macey, and today I'm interviewing Tim Sehn about Dolt, a version-controlled database engine, and its applications for agentic workflows.

Interview
 
  • Introduction
  • How did you get involved in the area of data management?
  • Can you describe what Dolt is and the story behind it?
  • What are the key use cases that you are focused on solving by adding version control to the database layer?
  • There are numerous projects related to different aspects of versioning in different data contexts (e.g. LakeFS, Datomic). What are the versioning semantics that you are focused on?
  • You position Dolt as "the database for AI". How does data versioning relate to AI use cases?
  • What types of AI systems are able to make best use of Dolt's versioning capabilities?
  • Can you describe how Dolt and Doltgres are implemented?
  • How have the design and scope of the project changed since you first started working on it?
  • What are some of the architecture and integration patterns around relational databases that change when you introduce version control semantics as a core primitive?
  • What are some anti-patterns that you have seen teams develop around Dolt's versioning functionality?
  • What are the most interesting, innovative, or unexpected ways that you have seen Dolt used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dolt?
  • When is Dolt the wrong choice?
  • What do you have planned for the future of Dolt?

Contact Info
 

Parting Question
 
  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
 
  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.

Links
 

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA


© 2025 Boundless Notions, LLC.
Episode Sponsors

Retool

If you lead a data team, you know this pain: Everyone needs dashboards and reports, and they all come to you. You're either the bottleneck slowing everyone down, or you're spending all your time on one-off requests. Retool gives you a way out. Their AppGen platform lets people build their own apps on company data—with governance you control. They get self-service. You get your time back.

https://retool.com/?utm_source=data_eng_podcast&utm_medium=podcast&utm_campaign=we_retool&rcid=701Qo00001JUyeRIAT