Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass - a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost control strategies, and the implications of agentic collaboration on Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Your host is Tobias Macey and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loop
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Compass is and the story behind it?
- context repository structure
- how to keep it relevant/avoid sprawl/duplication
- providing guardrails
- how does a tool like Compass help provide feedback/insights back to the data teams?
- preparing the data warehouse for effective introspection by the AI
- LLM selection
- cost management
- caching/materializing ad-hoc queries
- Why Slack and enterprise chat are important to b2b software
- How AI is changing stakeholder relationships
- How not to overpromise AI capabilities
- How does Compass relate to BI?
- How does Compass relate to Dagster and Data Infrastructure?
- What are the most interesting, innovative, or unexpected ways that you have seen Compass used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?
- When is Compass the wrong choice?
- What do you have planned for the future of Compass?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- Dagster
- Dagster Labs
- Dagster Plus
- Dagster Compass
- Chris Bergh DataOps Episode
- Rise of Medium Code blog post
- Context Engineering
- Data Steward
- Information Architecture
- Conway's Law
- Temporal durable execution framework
[00:00:11]
Tobias Macey:
Hello, and welcome to the Data Engineering podcast, the show about modern data management. Data teams everywhere face the same problem. They're forcing ML models, streaming data, and real time processing through orchestration tools built for simple ETL. The result, inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed, flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high memory machines or distributed compute.
Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI engineering, streaming, Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workloads, see what it can do for you at dataengineeringpodcast.com/prefect. Are you tired of data migrations that drag on for months or even years? What if I told you there's a way to cut that timeline by up to a factor of six while guaranteeing accuracy? DataFold's migration agent is the only AI powered solution that doesn't just translate your code. It validates every single data point to ensure a perfect parity between your old and new systems.
Whether you're moving from Oracle to Snowflake, migrating stored procedures to dbt, or handling complex multisystem migrations, they deliver production ready code with a guaranteed timeline and fixed price. Stop burning budget on endless consulting hours. Visit dataengineeringpodcast.com/datafold to book a demo and see how they turn months long migration nightmares into week long success stories. Your host is Tobias Macey, and today I'm welcoming back Nick Schrock to talk about building an AI analytical system that keeps data teams in the loop in the form of Compass. So, Nick, can you start by introducing yourself for people who haven't heard any of your past appearances?
[00:02:09] Nick Schrock:
Yeah. Sure. And thanks for having me, Tobias. It's always a pleasure being on. So, yeah, briefly, I'm Nick Schrock. I'm the CTO and founder of Dagster Labs, which is the company behind Dagster, an open source data orchestration platform, and Dagster Plus, which is our commercial hosted product on top of that, and now an additional product called Compass, which I'm super excited to talk about. Before that, I cut my teeth in Facebook engineering, and the thing I was best known for was being one of the co-creators of GraphQL. So that's kind of my story.
Founded Dagster in 2018, so a while ago now, but really got the company off the ground in 2019, hired my first employee then. And we've been working really hard for a long time and have an at-scale open source project and a really healthy commercial business, and I'm looking forward to many more years of success.
[00:03:06] Tobias Macey:
You've been running Dagster for almost as long as I've been running this podcast.
[00:03:11] Nick Schrock:
It's true. Your podcast is actually one of the major ways I got up to speed on the domain. In particular, the episode you did about DataOps with Chris Bergh was kind of a real unlock for me. So I feel like you and I have been on the journey together in some ways.
[00:03:27] Tobias Macey:
Absolutely. And it's it's been a crazy ride over the past, what, eight years now. So Yeah. So I guess the next stop in that ride is Agentic Systems. And so because you're working in the technology space, you're obligated to build an Agentic system. So I'm wondering if you could just give a bit of an overview about what your thoughts are on the application of agentic systems to data analysis and some of the ways that you thought about the approach to Compass that keeps data teams in the loop without just leaving them on the sidelines and letting the AI run rampant over all of their hard work.
[00:04:06] Nick Schrock:
Yeah. It's been a fascinating journey, actually. I think both me and Dagster Labs as a company have been fairly conservative when it comes to AI and agentic systems up until now. You know, last summer I wrote this piece, a blog post about what I called the rise of medium code and the properties a software system needs to have to be an amenable target for AI code generation. And it really focused on minimizing slop and having a limited technical blast radius so the AI can't do much damage to your system, and all that. So I've always thought about it in those terms, but I was always a little skeptical about how good the agents could get. And the progress has really exceeded my expectations in the last year. I didn't realize it at the time, but I think a huge release was in February, when Anthropic released Sonnet 3.7, I think it was, and Claude Code in the same release.
And those two things were a simultaneous innovation at the model layer but, also, I think maybe even more importantly, the application layer over that model layer. And that moment and the period right afterwards was a huge wake-up call for me: oh, these systems are super ready for prime time now if you apply the right tools and techniques. And that momentum has really been building. Actually, you know, in June, this term context engineering became part of the ether. Tobi Lütke popularized it in a post, and then it was canonized by Karpathy.
But it really described how, you know, it's kind of a rebrand of prompt engineering, but it describes how you can programmatically inject the right context in the right place at the right time to the right model. And that mentality really clicked with me in terms of, like, oh, me, the lowly product and infrastructure engineer who doesn't know the higher-order math necessary to build a foundation model, like, I can really participate in this in a super first-class way. So that's kind of the context of my journey. At this point, I'm very, you know, AI-pilled, I would say, in that I believe we are witnessing the disruption of multiple layers of the stack simultaneously in a way that we've never experienced before. So AI is revolutionizing the way we build software, the way we structure infrastructure, the way our stakeholder relationships work, and also the consumption layer. Right? ChatGPT has been, like, a dramatic change in the way that we interact with computer systems.
And that has not really reached the enterprise at all either, which is a very interesting topic of discussion. So I think we're at this massive wave. And you kind of alluded to it: okay, you're in software, so you have to be thinking about agentic AI. And, unfortunately, that is kind of true. And I think some people are more obnoxious about it. But ignoring agentic AI would be like ignoring the Internet in the nineties, you know, and not really thinking about how that's gonna impact your system. And the Internet completely revolutionized all domains of computing. Right? Even the ones where it wasn't, putatively. It kind of started with consumer, but then all our infrastructure changed too. So I think this is a similar wave. I think it's extremely exciting, and I'm just really enthusiastic about the future.
[00:07:52] Tobias Macey:
And to help frame the rest of the conversation, can you describe the scope and purpose of Compass and some of the problems that you're trying to solve with it?
[00:08:03] Nick Schrock:
Yeah. Without going into features or technologies, the problem we are trying to solve, in sort of human terms, is to completely restructure the relationship between a data platform team and their stakeholders. Meaning that right now, I think data teams feel like they are cogs in a machine, that they are cost centers, that they are there to do a job. Business stakeholders ask for data. Business stakeholders ask for dashboards. But then you're kind of disconnected from those business users because your work is intermediated by these tools, which are often not that pleasant to deal with, BI tools being an example. I often joke that the BI category feels like it was invented by Dostoevsky because all BI tools are terrible, but they're all terrible in their own ways. And so rather than think about it as complete self-serve (you mentioned the term self-serve), we wanna redefine the relationship so that the data team is collaborating with the business stakeholders in real time in a highly positive way where, instead of being viewed as a cost center, they are the face of the value. So they're collaboratively working with their stakeholders, and they're empowering way more of those stakeholders.
Now that is the problem we're trying to solve. And in the end, that means much more accessibility to data, and you can leverage your data platform to do more things in the organization, thereby increasing its value. Okay, so that was a whole lot of stuff that I just talked about. But we're redefining the stakeholder relationships such that the perceived and real value of a data platform is higher in the organization. Now how do we do that? You know, Compass looks fairly innocent at first blush.
It is a Slack-native experience where you can interact with your data in natural language. It's processed by AI, which sort of acts like a junior analyst, and it can interrogate your data warehouse and do interesting analyses. That is the user experience. But it ends up being fairly, I would say, transformative, dare I say revolutionary, and it has been internally, because you have the stakeholders interacting with this agentic tool, the analyst. But then, because there's an agent analyst, the data team members that are in that Slack thread with the business stakeholders are no longer analysts.
They act much more like data stewards. They're, like, guiding the user to do the correct analysis, and then they manage the context. And they manage the context store so they can govern the AI in a very scalable fashion. And the business stakeholders never leave Slack. There's data vis right there. The stakeholders can make data requests. They can request context corrections. Often, the AI just figures out how to do that for them. So they plug in all sorts of workflows. They can schedule these analyses on a regular basis. And in our demos, we almost brag.
Right? Like, you're not gonna see a web UI during this entire demo, and that is super deliberate. Because who wants to learn another web UI? You get bounced to a web UI. You have to auth to it. You have to learn a completely new information architecture. Right? You have to learn completely new concepts, and it bounces you out of your collaborative zone. You can no longer, like, at-mention people, etcetera, etcetera. So, yeah, that's kind of the approach. It's a Slack-native natural language analytics experience that is collaborative, governed, and driven by AI.
[00:12:01] Tobias Macey:
So the common challenge when dealing with all of these agent based systems, as you already pointed out, is this challenge of context engineering, where the alchemy of turning raw bits into useful information is the entire purpose of data engineering, to be very reductive about it. And so those two things are in tension, because the data engineer doesn't want to forego their purpose and hand it off to an AI, particularly if they have any pride in their work, because they know that the AI isn't going to do an appropriate job of understanding the business needs, the business context, all of the hard-won knowledge that they've already encoded into the data assets that they're building. Whereas the consumers of that data, to your point, don't wanna have to deal with learning all the information architecture. They don't want to have to dig through all the docs or go through all the pipelines to really understand what it is that they're actually looking at. And so I'm wondering if you can just talk to some of the ways that you're thinking about that context management and the handoff between the data teams, who are doing all the hard work of bringing all this information together and hydrating it with that business context, and the ways that the agentic analyst is able to actually retrieve and interact with that context to be able to understand how to map the probably very vaguely worded request from the stakeholders into a concrete plan of action and means of discovery and enumerating all of the information that's required to be able to fulfill that request.
[00:13:35] Nick Schrock:
Right. So there's a lot there. Just trying to tease the thesis apart here. So I guess, dispositionally, and because I also think it's how the world works, I think of the AI as a bicycle for your brain as opposed to a replacement for it. And in some ways, actually, almost in every way, these AIs make judgment and taste that much more leveraged. Because if you have good taste and judgment, you can get the AIs to do an extraordinary amount of work on your behalf that's high quality. But if you don't have that, then, you know, I call it a technical debt superspreader. It can copy bad patterns, and it can go off to the races and hallucinate lord knows what. So that's kind of my starting place: we build tools that keep humans in the loop, that are governed, and that accelerate and amplify the work of subject matter experts rather than sort of eviscerating it.
So, where do you want me to take it next? That was, like, kind of high-level philosophical.
[00:14:44] Tobias Macey:
So I think the interesting bit, and I'll admit to everybody listening, I've already witnessed the demo of Compass, so I already know a lot of the details of how this is operating. So these are somewhat leading questions and a little bit of inside baseball. But my understanding is that the Compass utility relies on a repository of context artifacts for being able to understand how to map some of the semantics of these analytical requests into the actual data assets that are available, with these data assets, at least in its current formulation, largely being restricted to a data warehouse environment for being able to create and execute SQL queries. And so I'm wondering if you can just talk to how you're thinking about the initialization of that context repository from that set of tables and data assets that already exist in a manner that isn't just a lot of extra busy work on the behalf of the data team, but provides all of the necessary information and guidance to the agentic system for being able to fulfill the requests of the business stakeholders.
[00:15:47] Nick Schrock:
Right. Okay. That makes total sense. So there is an initial step where we bootstrap the context store with as much information as possible. Setup is super easy: all you do is plop in your data warehouse creds, and you're good to go. What happens is that, for the tables that you allow us to see, we query the information schema. We get as much information as possible from there. We also sample the data, and then we programmatically generate context. I think where a lot of people go wrong with that type of step is that they aren't thoughtful enough about producing the precise context that your application expects.
They just, like, dump raw metadata into some context window and hope and pray that the agent figures stuff out. I'm very much of the belief that you need to very deliberately produce context programmatically in a way that's guided and specific to your application. That means there's a level of precision and control. Like, I firmly believe that increasingly we're gonna move from data pipelines to context pipelines, meaning that context will be computed. It'll be computed from other context and other data, and that is data pipelining. Right? So that's one of the reasons we have an engineering approach to it. And the automatically generated data docs are kind of an initial step to do that. The second critical piece that's in the context store is... oh, actually, I'll start with this: all of this is managed with Git, and we think that's very important. Context occupies this fascinating space that's sort of in between code and data, meaning that context is computed just like data, but it also very directly determines system behavior just like code does. This is why we've set out on this path of having programmatically generated context checked into a Git repository.
Because if it's checked in, you can track changes, you can revert things, you can write tooling over it to change it in place. It allows all sorts of flexibility and precision. Right? So imagine you're really developing this thing at scale. You have evals so you can evaluate the performance of the agent. Right? You can do, like, a bisect against the context just like code, and that's a super powerful dynamic. The second piece that we capture in Git are these manual context corrections that we get directly from the business stakeholders. So this is another critical piece of the context puzzle. The first one, again, being programmatic generation of context. And the second piece is actually getting the information out of the brains of your stakeholders, who actually know the domain, and into some governed context store where the agents can utilize it to give correct results. For example, one of our early customers uses the term core as a kind of special code word for a project that doesn't really mean core, and they flagged this. You know? They're like, this has screwed up AIs before, and they totally hallucinate because, obviously, the AI has its own idea of what core means from the foundation layer. So we created a context correction that very specifically laid out, like, okay, core actually means this and this, etcetera, etcetera, and the system performed well. Now what's really magical about the way Compass works is that all of this is captured either explicitly or in an ambient sense from the interactions that are happening in Slack. So, for example, the business stakeholder is presented within Slack with a data visualization that looks wrong. Like, the demo that we give is, some sales rep has a 90% win rate, and that never happens.
So the demo is, like, you look into it, you do some investigation, you figure out that the sales rep is actually a customer success manager, and that automatically submits a context correction back to the context store that says, like, hey, don't count CSMs as sales reps. And then it's checked into the context store. Knowledge is captured, and then the data team can take that and manage it very explicitly. And that's a very powerful model to bootstrap. You know, the alternative with more heavyweight systems like semantic layers is generally to do this upfront process that is very burdensome and complicated, and you need to send the business stakeholder to a custom tool or a web app or something. And they never do that. So the knowledge stays captured in their head, which does no good to anyone except for them, maybe. But as we demonstrated, humans have limited context windows too, and what's good for the goose is good for the gander here. So it's important to get that out of your head and into a place where the agents can take advantage of it. And it's really this lightweight interface within Slack that's the magic in that part of the process.
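To make the two context sources Nick describes concrete, here is a minimal sketch of what programmatic context generation plus a stakeholder context correction could look like as Git-tracked text artifacts. Everything here is hypothetical: the function names, the `context/` layout, and the file formats are illustrative assumptions, not Compass's actual (non-public) implementation.

```python
import json
from pathlib import Path

# Hypothetical root of a Git-tracked context repository.
CONTEXT_ROOT = Path("context")

def generate_table_doc(table: str, columns: dict[str, str], sample_rows: list[dict]) -> str:
    """Render a context doc from schema metadata (e.g. information_schema)
    plus a few sampled rows, instead of dumping raw metadata into a prompt."""
    lines = [f"# Table: {table}", "", "## Columns"]
    for name, dtype in columns.items():
        lines.append(f"- {name} ({dtype})")
    lines += ["", "## Sample rows"]
    for row in sample_rows[:3]:
        lines.append(f"- {json.dumps(row)}")
    return "\n".join(lines) + "\n"

def record_correction(term: str, meaning: str) -> str:
    """Render a stakeholder-supplied context correction, like the
    'sales rep vs. CSM' example from the demo."""
    return f"# Correction: {term}\n\n'{term}' actually means: {meaning}\n"

doc = generate_table_doc(
    "sales_reps",
    {"rep_id": "INTEGER", "name": "TEXT", "role": "TEXT"},
    [{"rep_id": 1, "name": "Ada", "role": "CSM"}],
)
correction = record_correction("sales rep", "excludes customer success managers (CSMs)")

# In a real pipeline these strings would be written under CONTEXT_ROOT and
# committed, so changes can be diffed, reverted, and even bisected against evals.
print(doc)
print(correction)
```

The point of keeping both artifacts as plain text in Git, as Nick notes, is that standard code tooling (diff, revert, bisect) then applies directly to the context.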
[00:20:35] Tobias Macey:
One of the other aspects of data teams who are responsible for the care and feeding of analytical systems, particularly when you're dealing with business intelligence, is that it's often very difficult to gain any real insight into how the stakeholders are interacting with those systems. You might be able to have some audit logs to see how frequently people are running certain queries or dashboards. But beyond that, you don't know why they're going to those dashboards, what they're doing with the information once they retrieve it. And I'm curious how you're thinking about the ways of bringing more visibility both to the data teams and at the organizational level of how the overall company is interacting with the data assets that you have and are creating and some of the ways that that can create some feedback loops to the data teams either to prune unused data assets or to understand what new data assets they need to generate to be able to fulfill the needs of the organization, and in particular, how using the conversational system of record for the company helps to provide some of that visibility?
[00:21:44] Nick Schrock:
Oh, it's a great question. And I think I could talk for hours on this subject because it's not just about what's currently happening; I think the roadmap on this front is super, super bright. You know, we're still in the early stages here. So, I guess, the first thing that happens is that the business stakeholder is in the same channel as the analyst, and the analyst can literally see how what the business stakeholder wants gets compiled to SQL in the data warehouse. That gap between business language and, like, your column and table names communicates so much about what is actually happening and what people actually want. With a traditional BI tool or exploratory data analysis tool, that translation does not exist in a format that is discernible by one of the analysts.
So I think that is the dynamic. Just the social dynamics end up really producing a ton of insight. And we are just at the beginning of being able to use, I like your term, the conversational system of record to drive more value out of that. The initial feature we have that I think demonstrates that is the ability to create data requests in a ticketing system automatically based on what's been happening in a specific thread. So let me give you an example of how this works. I was asking Compass specifically about product analytics, but it expands beyond that. I was asking what I thought was a very simple question: how many of our customers use declarative automation, which is one of our scheduling features? It turns out our warehouse didn't really have that particular feature explicitly modeled well. So what the AI did, and this was awesome and terrifying to watch, is this: we have Gong transcripts, Gong being a system that records and transcribes sales calls. The AI decided to use Snowflake's AI-based analysis features. It couldn't find the exact information, but what it did do is it found all the customers that had mentioned declarative automation in one of their sales calls, which was obviously imprecise, but gave me a hint, or at least a floor, of how many people use it, and interesting customers that use it. It was also terrifying because those capabilities are extremely expensive in Snowflake. But as a result of this experience, I was like, make a data request ticket so that we have this information first class in the data warehouse. And what the system does is it scoops up that entire conversation. And think about all the context you have. You have who asked for it. You have all this SQL that was generated that is navigating the warehouse and trying to figure out where things are. You have follow-up questions. You might have a conversation.
Maybe our analyst jumps in and says, like, oh yeah, we don't have that because of this historical reason. It scoops all that up, synthesizes it, and creates a data request. Right? And that's an LLM-assisted process. So we're really building what Databricks calls compound AI applications.
But what it means is injecting, where appropriate, LLM augmentation and processing into parts of the workflow. And I think we're just at the beginning of this. As we develop this product over time, we will be leveraging the conversational system of record much more. You can imagine doing post hoc processing on threads that allows you to discern with more detail what context corrections to suggest, for example, scooping up all that information. You can imagine having observability and insights tools across all the conversations happening in all the different channels across your business, so you can understand what's happening, what data is being requested the most frequently, what data is not being requested.
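The "scoop up the thread into a data request" step above can be sketched as a plain function over thread messages. This is an illustrative assumption about the shape of the data, not Compass's API: in the real product this synthesis is LLM-assisted, and the message fields (`user`, `kind`, `text`) and the `thread_to_data_request` helper are hypothetical names.

```python
# Hypothetical sketch: turn a Slack thread into a data-request ticket payload.
# In Compass this synthesis is LLM-assisted; a plain function stands in here
# so the shape of the inputs and output is visible.

def thread_to_data_request(messages: list[dict]) -> dict:
    """Synthesize a ticket from a thread: who asked, what they asked,
    the SQL the agent tried, and any analyst commentary."""
    requester = messages[0]["user"]
    question = messages[0]["text"]
    return {
        "title": f"Data request: {question[:60]}",
        "requester": requester,
        # Analyst/stakeholder commentary captured from the thread.
        "context": [m["text"] for m in messages if m.get("kind") == "comment"],
        # The queries the agent generated while exploring the warehouse.
        "generated_sql": [m["text"] for m in messages if m.get("kind") == "sql"],
    }

thread = [
    {"user": "nick", "text": "How many customers use declarative automation?"},
    {"user": "compass", "kind": "sql", "text": "SELECT ... FROM gong_transcripts ..."},
    {"user": "analyst", "kind": "comment", "text": "We never modeled this feature explicitly."},
]
ticket = thread_to_data_request(thread)
```

In practice an LLM would replace the mechanical field-picking here, summarizing the thread and proposing a title, but the inputs (requester, question, generated SQL, human commentary) are exactly the context Nick describes the thread already containing.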
You know, my vision is kind of like, you know how during COVID, with Google Trends, you could figure out where COVID was spreading, and it was ahead of any reporting, because people would start asking about, you know, "I can no longer taste anything"? You could kind of see that move through the country and be a leading indicator of the true metrics that get reported. And I'm imagining a future where, at a company, you can really get a sense of what people care about in aggregate. If the analytical queries are shifting in the business, that's actually a good insight in a large organization about what people care about, what they're worried about, etcetera, etcetera. So I think there's a huge space here to get broad-based intelligence from this conversational system of record and how the context is being accessed too. So I think it's a very perceptive question, and I think there's just a huge amount of greenfield around there.
[00:26:35] Tobias Macey:
It's interesting too. I was actually, just earlier today having a conversation with somebody who's building an agentic coding platform for doing software engineering in an autonomous fashion, and it brought up the whole idea of Conway's Law about how the structure of the software is defined by the communication patterns of the organization. And once you start introducing these agentic systems, that changes the communication patterns to also incorporate those LLMs, which by necessity modifies the structures of the software that gets created. And I'm wondering how you see that analogy play out in the context of these agentic analytical systems and the role that it plays in terms of the design and orchestration of the data assets that you're building and the ways that people are interacting with those data systems. But because we have these LLMs in play, it is no longer human to human interaction or human to deterministic machine interaction. The LLM then plays a role in that communication system and modifies the ways that people are interacting with it.
[00:27:43] Nick Schrock:
That is really interesting. The Conway's Law analogy I hadn't thought about, but it makes total sense. Because one of the things I think is happening in the AI era is that nearly every stakeholder relationship is going to be reimagined. And part of that is because the new consumption layers facilitate new team organizations, because of Conway's Law. So just as an example that's not in data platforms: there was, like, an "all software engineering is going away" boomlet. Right? That was kind of a big conversation, which I thought was a complete load of, you know, whatever.
It's a family show. Right, Tobias? But, you know, PMs vibe coding does not mean that software engineers are going away. However, the ability of PMs to prototype and build things in the native system of the engineers fundamentally and completely transforms their stakeholder relationship. And I think this is partially one of these Conway's-Law-esque effects. And that's also what's happening in Compass between the business stakeholder and the analyst, where the business stakeholder can now do vibe analytics in their own way and communicate directly in the native medium that the analyst can understand. So with this Conway's Law effect, I am incredibly bullish on this UI interaction of multiplayer agentic chat in B2B contexts. You know, single-player agentic chat, like ChatGPT and its competitors, has completely remade consumer software and is still in the process of doing so. And I think that this multiplayer collaborative chat is gonna be the same order-of-magnitude change in the enterprise.
You know, we're seeing it right now in Compass because, if you stack that on top of the data platform, you effectively don't need reporting functionality across all of your vertical SaaS apps. It's just in this one spot, which is super, super exciting. And I think this agentic chat is what does it, because you're bringing in people. You can bring in the random stakeholders. I think a lot of people's mental model of the agent is that someone's alone and talking to the agent. In an enterprise context, that doesn't make sense. What the Slack modality does is make the agent a participant in a collective conversation that incorporates workflows, and that is a super powerful dynamic that also changes the communication structures here. So I think people have the wrong mental model of this. There's also a boomlet about, oh, there's gonna be one-person startups that are billion-dollar companies. And I don't really think that's true either, because I just don't imagine a world where one human is talking to n agents and building a company like that.
I think it's more like there could be fewer people, more hyper-empowered people, but it's always gonna be hybrid, where there's lots of humans and lots of agents and the humans are sort of up-leveling their work. Maybe that's just the way I want the world to work, but I think it is the way the world will work.
[00:31:01] Tobias Macey:
I think it's also indicative of the overall tendency for people to take a proof of concept and extrapolate to a larger scale in a way that doesn't hold. As with anything in software and technology, it's, oh, I built this system in a weekend, so therefore I can build an entire production company by the end of the week. But the factors of scale are something that nobody ever properly accounts for, where you're dealing with exponential complexity but logarithmic capability. And so you're going to diverge sooner rather than later in terms of what you could actually feasibly maintain. And so, similarly, with that idea of the one-person company where I just have 50 different agents, your head's going to explode trying to keep up with them. And, eventually, you're going to hit the law of diminishing returns, where the inaccuracies of the agents are going to start compounding, and it's going to drive your multibillion-dollar company into the ground before it ever takes off. And I think it's also indicative of the hype cycle that came with the initial release of ChatGPT, saying, oh, well, AGI is now just around the corner. We're going to have it by the end of next year, and now it's the end of two years, or five years, and it keeps getting pushed back.
[00:32:13] Nick Schrock:
Yeah. And what even is AGI? You know? It's a very difficult thing to define. You know, I use the term technical debt superspreader and things like that. I actually think that's a specific instantiation of a more general trend that's gonna cut across many domains. Because I think with AI, we're effectively going to be entering a complexity crisis. The ability of agentic systems, and humans empowered by agentic systems, to produce complexity, junk content, and interrelated concepts that you fundamentally don't understand is very, very high. So I think that the ability to manage and model complexity will only become more and more leveraged. You know? And that's what I think about when I'm doing agentic engineering: really compartmentalizing complexity in a way where the agents can contribute the right things at the right time. But, yeah, it's going to be a very complicated world with all these agents running around.
[00:33:13] Tobias Macey:
And now bringing us back around to Compass and these agentic systems for AI and the role of the data infrastructure and the data teams in that landscape, what are some of the ways that the requirements of the data infrastructure change when you have to support these agentic systems, and what are some of the aspects that can remain the same and the agent is able to just use systems as they exist today?
[00:33:42] Nick Schrock:
You know, it's such a broad question. And the agentic systems are so new, and not that many people have deployed them at real scale, that I think it's actually very difficult to understand at this time exactly how they're going to impact everything. You know, a lot of people are like, oh, there's gonna be more unstructured data. I don't even know if that's true, for example, because for these AI systems to operate and do super-leveraged things, you actually want tons of structure, tons of metadata, tons of context. You know, I think that real-time, more complex workflows are going to be incredibly important.
I'm quite bullish on systems like Temporal, for example, to manage the complicated agentic workflows that go on, because the ability to pause and resume compute will be very important. One of the interesting things happening is that agentic workflows are so high latency. Right? Users are now trained to wait for a computer to think for minutes on end on their behalf, which is very different from, say, the web era, where every millisecond counted. And utilizing computational resources efficiently in those contexts, I think, is actually quite challenging. There's any number of things I could spout off about, but I think anyone who gives you definitive answers about how all this stuff is gonna impact infrastructure doesn't really know what they're talking about. Because, like I said, every layer in the stack is getting disrupted, and the consumption layer is changing.
That implies changes to the compute that's actually running, but AI is also impacting the way these infrastructure things are built. So there are multiple dimensions of variability right now, and I think it's very difficult to project beyond pure conjecture what's gonna be changing.
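The pause-and-resume idea discussed here can be sketched in miniature with a checkpointing loop: a toy illustration only, not the Temporal API, and all function and file names are hypothetical. The point is that each expensive, high-latency agent step persists its result, so an interrupted run resumes without re-paying for completed work.

```python
import json
import os

def run_workflow(steps, checkpoint_path):
    """Run named steps in order, checkpointing after each one so a
    paused or interrupted run can resume without redoing earlier
    (expensive, high-latency) steps."""
    state = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume prior progress
    for name, fn in steps:
        if name in state:
            continue  # already computed in an earlier run; skip the cost
        state[name] = fn(state)
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)  # persist progress after every step
    return state

# Toy steps standing in for slow agent calls.
steps = [
    ("fetch", lambda s: 40),
    ("enrich", lambda s: s["fetch"] + 2),
]
result = run_workflow(steps, "workflow_state.json")
os.remove("workflow_state.json")  # clean up the demo checkpoint
```

A durable-execution system does this with far stronger guarantees (atomic persistence, retries, timers), but the cost-shaped control flow is the same.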
[00:35:40] Tobias Macey:
One of the other aspects of bringing an agent into the equation is obviously cost, because LLMs have very unpredictable cost patterns. And so you don't want to route every request through the LLM, especially having it do a huge body of work if it's something that you already have a stored data asset for. And I'm wondering how you're thinking about some of the methods for taking the common questions and interaction patterns with that agent and either caching them for quick retrieval or materializing them into a more durable asset so it's not something that gets recomputed every time, or just some of the ways to shape the interaction patterns of the stakeholders, to say, you don't have to ask the LLM this question every time, you can go here for it, or it's going to deliver this to you without you having to take any action, as some of the means of mitigating unbounded cost.
[00:36:42] Nick Schrock:
Yeah. You know, I felt this very personally, because I went from zero to, like, a 100 on agentic coding this summer, and I hadn't signed up for the Claude Max plan. I just used our corporate account, which doesn't have that sort of high usage limit. And in my first two weeks of Claude Code usage, I cost the company $3,000. We were able to get it under control, but, you know, you can consume a lot of cost doing that. I don't even wanna think about how much natural gas was burned to produce those, you know, 10,000 lines of code or whatever. So I mentioned a complexity crisis before. I also think there's a cost crisis coming. And I think the first answer here is that, earlier in this episode, I mentioned that I think context pipelines are kind of the new data pipelines.
One piece of that is that you wanna be precise about when and how you recalculate context. And that means it's a data pipelining problem: doing event-based computation, crafting the computation in a very specific way, and then producing it in a highly tailored way so it's perfect for your application. So writing data pipelines that become context pipelines, and then matching that with context engineering, meaning taking those produced artifacts and feeding them to the right model at the right time. The combination of those two techniques, I think, is going to be essential for controlling costs. You know, because the larger the context window, the more expensive the compute is. Prefill is quadratic with respect to context window size, and it determines a ton about model performance. But I think the cost crisis coming is real.
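The recompute-only-on-change behavior of a context pipeline can be sketched with a content fingerprint: a minimal illustration, assuming the upstream inputs are JSON-serializable, and all names here are illustrative rather than any real product API.

```python
import hashlib
import json

_cache = {}  # fingerprint -> rendered context artifact

def fingerprint(inputs: dict) -> str:
    # Stable hash of the upstream data the context depends on.
    blob = json.dumps(inputs, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def build_context(inputs: dict, render) -> str:
    """Return the rendered context artifact, re-running the expensive
    render/summarization step only when the inputs actually change."""
    key = fingerprint(inputs)
    if key not in _cache:
        _cache[key] = render(inputs)  # the costly step: only on change
    return _cache[key]

render_calls = []
def render(inputs):
    render_calls.append(1)  # count how often the expensive path runs
    return f"{len(inputs['tables'])} tables: " + ", ".join(inputs["tables"])

data = {"tables": ["orders", "users"]}
first = build_context(data, render)
second = build_context(data, render)  # unchanged inputs: cache hit
```

In a real pipeline the invalidation would be event-driven (a table changed, a schema migrated) rather than hash-on-read, but the shape of the saving is the same: the expensive computation runs once per change, not once per request.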
I think the chickens will come home to roost for a lot of these firms who aren't passing through enough of the compute cost to their customers, and their customers will have a rude awakening and churn. And I think some of the coding startups are encountering that challenge right now. So, yeah, I think there's gonna be a huge number of techniques, and those techniques will stay extremely relevant even as the models get better and even as they get cheaper too. Because some of this context management, I view almost like big-O notation or algorithmic complexity. Meaning that no matter how good Moore's Law is, an O(n squared) sort algorithm can only go so far, no matter how fast the processor is. And I think the same thing is gonna be true with context engineering. You know, we're even seeing this now. We're getting to a million tokens and even enormously large context windows, but they have enormous diminishing returns. And it can even be a negative thing if you pollute the context with contradictory information. This is famously called context poisoning or context rot and all this stuff. So I think context engineering is gonna be more and more expansive, and I think that is gonna be a common theme to control cost. And then beyond that, having more control over fine-tuning. I think there's a whole undiscovered country in terms of democratizing fine-tuning, and then having the model providers build in capabilities so you can do fine-tuning over their closed-source models. But it is early days. We are going to burn a lot of money and energy along the way, and it's gonna become increasingly important to control it.
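The caching idea raised in the question above can be sketched very simply: route a recurring question to a stored answer instead of re-invoking the model every time. This is a toy, and the normalizer is an assumption; a real system would also key the cache on data freshness and expire entries.

```python
import re

answer_cache = {}  # normalized question -> cached answer

def normalize(question: str) -> str:
    # Collapse casing, punctuation, and stray whitespace so trivially
    # rephrased duplicates hit the same cache entry.
    return re.sub(r"[^a-z0-9 ]", "", question.lower()).strip()

def ask(question: str, call_model) -> str:
    """Answer from cache when possible; fall back to the expensive
    model call only for genuinely new questions."""
    key = normalize(question)
    if key not in answer_cache:
        answer_cache[key] = call_model(question)  # the costly path
    return answer_cache[key]

model_calls = []
def fake_model(q):
    model_calls.append(q)  # stand-in for an LLM invocation
    return "revenue is up 12% week over week"

a1 = ask("What's revenue this week?", fake_model)
a2 = ask("whats revenue this week", fake_model)  # cache hit
```

String normalization only catches near-exact repeats; production systems often use embedding similarity for semantic caching, which trades some correctness risk for a much higher hit rate.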
[00:40:12] Tobias Macey:
As you have been going through this journey of building Compass, testing it out, and getting it in front of some early adopters, what are some of the most interesting or innovative or unexpected ways that you've seen teams apply these agentic capabilities on top of their existing data investments?
[00:40:30] Nick Schrock:
That is a good question. You know, in our funnel, we're still on the order of dozens of users, and, you know, this is the week of October 6, and we're opening the floodgates a little bit. We have hundreds of people on the wait list. I think the thing that has stuck out to me is that, effectively, once people connect their data warehouse to the system, we have a 100% retention on the platform, which is crazy. People start using it, the usage is intense, and they get tons and tons of stakeholders in the system. Right? And, internally, you know, we actually purchased some datasets that we're gonna make public that are effectively the moral equivalent of the PitchBook data: companies and their fundraising histories and their revenue numbers and all this sort of stuff. And then a people database, which is kinda like the LinkedIn dataset. And Compass plus those things makes, like, the best prospecting and recruiting tool (prospecting meaning salespeople finding customers that will be open to purchasing the product). It's more powerful than LinkedIn Sales Navigator.
It's crazy. You know, SQL is so powerful, and natural language on top of SQL, doubly so. So we've already seen every single ops role use this tool effectively: recruiting, HR, FinOps, RevOps, sales ops. There's lots of ops these days. Product ops queries, doing these sorts of things on our own data platform, in fact. So the breadth of use cases has been pretty awesome. And a lot of our early product-market fit is actually investment firms. They use it for interesting stuff. We thought they would use it for trying to find new companies to invest in, but they have a sales pipeline just like anyone else; it's just that their "sale" is investing in something. So they know what stage each company they're looking at is in. They have a pretty formalized pipeline. And, generally, there's one investment ops person who manages that, and they have to field requests from the partners, which are often very time sensitive and stressful. But they've actually gotten their partners, the people who run the firms, to use this tool directly, which has been both an efficiency gain and an incredible stress reducer, which is literally why, on our marketing site, we can have a pull quote that says, quote, unquote, Compass saved my life, which is always something you wanna hear as a founder.
But the reason that person said that is because we saved her not just time, but enormous amounts of stress dealing with time-sensitive requests from very important people. So I think this investor use case has been pretty interesting to see.
[00:43:24] Tobias Macey:
In your work of building the system and understanding the capabilities and use cases and limitations of an agentic analytics platform and how to tie it into existing data infrastructure and data assets, what are some of the most interesting or unexpected or challenging lessons that you learned in the process?
[00:43:44] Nick Schrock:
I mean, it's still early days. It's amazing how, once you go from one person on the go-to-market team being able to interact with the data warehouse to 80% of your team being able to interact with the data warehouse, you really start to see how many gaps there are, both in your data model and in your understanding of what people actually care about. So it has been super interesting to watch that roll out in real time.
[00:44:17] Tobias Macey:
And what are the situations where you would advise against going down the agentic path for these exploratory or analytical use cases?
[00:44:29] Nick Schrock:
Yeah. So, you know, we don't call it a BI tool. We call it exploratory data analysis because it's actually a very distinct use case. BI tools often drive absolutely mission-critical things, like revenue reporting that is subject to regulatory scrutiny, or comp decisions, or pricing decisions. And Compass is explicitly not designed for that use case. It is for exploratory, rapid, directionally correct data analysis, which is a very different use case. So we have no desire to be a replacement for those core BI assets. We think those should be managed by the BI tools. Kind of one of our principles here is what designers call truth in materials. We don't want to pretend it's not an LLM. We don't want to pretend that it's a 100% accurate or bulletproof. That's not its purpose. Right? We want it to be rapid, directionally correct, and eventually correct. And by eventual correctness, I mean that the context store gets added to, and the queries get more and more accurate over time to some kind of asymptotic level. So, you know, there are domains where absolute precision in all cases is absolutely required. Compass is not for that. It is for facilitating, as I said, directionally correct, rapid analyses.
[00:45:50] Tobias Macey:
And as you continue to invest in and iterate on this agentic exploratory analytics use case, what are some of the things you have planned for the near to medium term or any particular projects or problem areas or capabilities that you're excited to explore?
[00:46:07] Nick Schrock:
Yeah. So one thing I'm super interested in, I think for obvious reasons, is deep integration between Compass and Dagster and Dagster+. And this comes in many, many different forms: using data pipelines to produce and manage context, integrating the context store with our operational system of record, and then also using this tool itself. You know, we have this ability to create data requests, which can be very detailed, and then using those as the basis of agentic AI authoring workflows, which we actually have kind of working already and is very, very effective. So I'm very excited for that dimension, integrating Compass even more first class into data platforms. I'm very excited to work on a more enterprise SKU of Compass. I think these kinds of organizational observability features will be part of that, as well as sort of on-prem versions, which will have their own challenges but will really unlock usage in a ton of places, deliver a ton of value, and, we feel, be very successful in terms of being a healthy business. Yeah. And then, you know, the way this is set up, we can attack all sorts of interesting use cases one by one. Even in the initial stages, right, every dashboard in every vertical SaaS app is in effect our opportunity.
And that's very exciting to see. So much of the information and knowledge work that happens today is still drudgery: manually fielding a request to add such and such to this Salesforce dashboard and then, you know, hooking this and that up. And I think people are a little too pessimistic about, like, AI taking all of our jobs. I don't think that will happen. I think people will move up the stack and have to deal with much less drudgery. And that's kind of the way I approach this and what I seek to do in participating in and helping with this product. I think the future is bright. It kind of always comes up, and maybe I'm anticipating a question you might ask: should my kids study software engineering? Is software engineering gonna have a future? And blah blah blah. And I couldn't be more bullish about the future of software engineering. It's just gonna change the definition of what software engineering is. But the core foundations of learning how computation works, learning how to think about this stuff from first principles, will only become more leveraged.
[00:48:37] Tobias Macey:
Are there any other aspects of this space of agentic analytics, the work that you're doing on Compass, the leveraging of existing data infrastructure and data assets into this more AI driven interaction pattern that we didn't discuss yet that you'd like to cover before we close out the show?
[00:48:56] Nick Schrock:
No, I think we've covered a lot of ground, so I think we'll leave it here.
[00:49:02] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology for data management today.
[00:49:16] Nick Schrock:
It's always extremely unfair when you ask vendors this because we're morally obligated to talk our own book. But I am super interested in, like, obsessed with, this context engineering notion. And I think it's gonna be a defining discipline for the next ten years. I think it's super, super early days. I actually think about it a lot because my other passion project right now is figuring out how to deploy AI and agentic authoring in real and large software systems. And I am very interested in a problem that sounds simple but I think is a big one:
keeping markdown files checked into a project up to date with the underlying code. Because these markdown files, generally computed by agents, are to me just token caches. Right? The LLM has evaluated a bunch of tokens in the codebase and then materialized that knowledge in a more condensed form. Right? And I think that's actually gonna happen recursively in large software projects. But keeping them up to date is actually another instance of a data pipelining problem, because you can't recompute them every time; it ends up being too expensive. So how can you do that intelligently and keep them up to date? I think it's just one pillar of what is gonna be needed to do AI-accelerated software engineering at scale. That's my term, by the way. I despise the term vibe coding and hope we don't talk about it here.
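The markdown-as-token-cache idea described here reduces to a freshness check: store a digest of the source files a summary was derived from, and re-run the expensive agent summarization only when that digest changes. A rough sketch under that assumption; every name here is illustrative.

```python
import hashlib

def source_digest(files: dict) -> str:
    """Stable hash over path -> contents, in sorted order, so the
    digest changes exactly when the underlying code changes."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode())
        h.update(files[path].encode())
    return h.hexdigest()

def refresh_summary(files, stored_digest, stored_summary, summarize):
    """Re-run the expensive summarization only when sources changed;
    otherwise reuse the checked-in summary (the 'token cache')."""
    digest = source_digest(files)
    if digest == stored_digest:
        return stored_summary, digest  # cache still valid: no agent call
    return summarize(files), digest

agent_runs = []
def summarize(files):
    agent_runs.append(1)  # stand-in for an expensive agent invocation
    return f"# Module notes\nCovers {len(files)} files."

files = {"app.py": "def main(): ...", "db.py": "def query(): ..."}
summary, digest = refresh_summary(files, None, None, summarize)
# Same sources again: the stored summary is reused, no agent call.
summary2, _ = refresh_summary(files, digest, summary, summarize)
```

In practice the digest and summary would be committed alongside the markdown file, and the "recursive" version would summarize summaries, with each layer invalidated only by changes in the layer below.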
[00:50:37] Tobias Macey:
Yeah. An alternative term that I heard recently is AI native engineering.
[00:50:43] Nick Schrock:
That's pretty good. I will take it. I will take it. Agentic engineering is pretty good too, but I don't know. Agentic is like one of these words now, which I, like, only use as a last resort.
[00:50:54] Tobias Macey:
Absolutely. Well, thank you very much for taking the time today to join me and share the work that you've been doing on your agentic analytics system and the lessons that you've learned there. I appreciate that, and I hope you enjoy the rest of your day. Alright.
[00:51:09] Nick Schrock:
Thanks, Tobias. Thanks for having me.
[00:51:18] Tobias Macey:
Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result: inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed: flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute.
Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI engineering, streaming: Prefect runs it all, from ingestion to activation, in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workloads, see what it can do for you at dataengineeringpodcast.com/prefect. Are you tired of data migrations that drag on for months or even years? What if I told you there's a way to cut that timeline by up to a factor of six while guaranteeing accuracy? DataFold's migration agent is the only AI-powered solution that doesn't just translate your code; it validates every single data point to ensure perfect parity between your old and new systems.
Whether you're moving from Oracle to Snowflake, migrating stored procedures to dbt, or handling complex multisystem migrations, they deliver production-ready code with a guaranteed timeline and fixed price. Stop burning budget on endless consulting hours. Visit dataengineeringpodcast.com/datafold to book a demo and see how they turn months-long migration nightmares into week-long success stories. Your host is Tobias Macey, and today I'm welcoming back Nick Schrock to talk about building an AI analytical system that keeps data teams in the loop, in the form of Compass. So, Nick, can you start by introducing yourself for people who haven't heard any of your past appearances?
[00:02:09] Nick Schrock:
Yeah. Sure. And thanks for having me, Tobias. It's always a pleasure being on. So, briefly, I'm Nick Schrock. I'm the CTO and founder of Dagster Labs, which is the company behind Dagster, an open source data orchestration platform, and Dagster+, which is our commercial hosted product on top of that, and now an additional product called Compass, which I'm super excited to talk about. Before that, I cut my teeth at Facebook engineering, and the thing I was best known for was being one of the co-creators of GraphQL. So that's kind of my story.
I founded Dagster in 2018, so a while ago now, but really got the company off the ground in 2019, when I hired my first employee. And we've been working really hard for a long time, and we have an at-scale open source project and a really healthy commercial business, and we're looking forward to many more years of success.
[00:03:06] Tobias Macey:
You've been running Dagster for almost as long as I've been running this podcast.
[00:03:11] Nick Schrock:
It's true. Your podcast is actually one of the major ways I got up to speed on the domain. In particular, the episode you did about DataOps with Chris Bergh was, like, a real unlock for me. So I feel like you and I have kind of been on the journey together in some ways.
[00:03:27] Tobias Macey:
Absolutely. And it's been a crazy ride over the past, what, eight years now. So I guess the next stop in that ride is agentic systems. And because you're working in the technology space, you're obligated to build an agentic system. So I'm wondering if you could just give a bit of an overview of your thoughts on the application of agentic systems to data analysis, and some of the ways that you thought about the approach to Compass that keeps data teams in the loop without just leaving them on the sidelines and letting the AI run rampant over all of their hard work.
[00:04:06] Nick Schrock:
Yeah. It's been a fascinating journey, actually. I think both I and Dagster Labs as a company have been fairly conservative when it comes to AI and agentic systems up until now. You know, last summer I wrote a blog post about what I called the rise of medium code, and the properties a software system needs to have to be an amenable target for AI codegen. And it really focused on minimizing slop and limiting the technical blast radius so the AI can't do much damage to your system, and all that. So I've always thought about it in those terms, but I was always a little skeptical about how good the agents could get. And the progress has really exceeded my expectations in the last year. I didn't realize it at the time, but I think a huge release was the one in February where Anthropic released Sonnet 3.7, I think it was, and Claude Code in the same release.
Those two things were a simultaneous innovation at the model layer, but also, I think maybe even more importantly, at the application layer over that model layer. And that moment and the period right afterwards was a huge wake-up call for me: oh, these systems are super ready for prime time now if you apply the right tools and techniques. And the momentum has been building. Actually, in June, this term became part of the ether: context engineering. Tobi Lütke posted about it, and then it was canonized by Karpathy.
It's kind of a rebrand of prompt engineering, but it describes how you can programmatically inject the right context in the right place at the right time to the right model. And that mentality really clicked with me, in terms of: oh, me, the lowly product and infrastructure engineer who doesn't know the higher-order math necessary to build a foundation model, I can really participate in this in a super first-class way. So that's kind of the context of my journey. At this point, I'm very, you know, AI-pilled, I would say, in that I believe we are witnessing the disruption of multiple layers of the stack simultaneously, in a way that we've never experienced before. So AI is revolutionizing the way we build software, the way we structure infrastructure, the way our stakeholder relationships work, and also the consumption layer. Right? ChatGPT has been a dramatic change in the way that we interact with computer systems.
And that has not really reached the enterprise at all yet, which is a very interesting topic of discussion. So I think we're at this massive wave. And you kind of alluded to it: okay, you're in software, so you have to be thinking about agentic AI. And, unfortunately, that is kind of true, and I think some people are more obnoxious about it than others. But ignoring agentic AI would be like ignoring the Internet in the nineties, you know, and not really thinking about how that's gonna impact your system. And the Internet completely revolutionized all domains of computing. Right? Even the parts that weren't putatively affected. It kinda started with consumer, but then all our infrastructure changed too. So I think this is a similar wave. I think it's extremely exciting, and I'm just really enthusiastic about the future.
[00:07:52] Tobias Macey:
And to help frame the rest of the conversation, can you describe the scope and purpose of Compass and some of the problems that you're trying to solve with it?
[00:08:03] Nick Schrock:
Yeah. Without going into features or technologies, I think at the highest level, the problem we are trying to solve in human terms is to completely restructure the relationship between a data platform team and their stakeholders. Right now, I think data teams feel like they are cogs in a machine, that they are cost centers, that they are there to do a job. Business stakeholders ask for data. Business stakeholders ask for dashboards. But then you're kinda disconnected from those business users because your work is intermediated by these tools, which are often not that pleasant to deal with, BI tools being an example. I often joke that the BI category feels like it was invented by Dostoevsky because all BI tools are terrible, but they're all terrible in their own ways. And so, you mentioned the term self serve. Rather than think about it as complete self serve, we wanna redefine the relationship so that the data team is collaborating with the business stakeholders in real time in a highly positive way, where instead of being viewed as a cost center, they are the face of the value. So they're collaboratively working with their stakeholders, and they're empowering way more of those stakeholders.
Now that is the problem we're trying to solve. And in the end, that means much more accessibility to data, and you can leverage your data platform to do more things in the organization, thereby increasing its value. Okay. So that was a lot of stuff I just talked about. But we're redefining the stakeholder relationships such that the perceived and real value of a data platform is higher in the organization. Now how do we do that? You know, Compass looks fairly innocent at first blush.
It is a Slack native experience where you can interact with your data in natural language. It's processed by AI, which sort of acts like a junior analyst, and it can interrogate your data warehouse and do interesting analyses. That is the user experience. But it ends up being fairly transformative, dare I say revolutionary, and it has been internally, because you have the stakeholders interacting with this agentic tool, the analyst. But then, because there's an agent analyst, the data team members that are in that Slack thread with the business stakeholders are no longer analysts.
They act much more like data stewards. They're, like, guiding the user to do the correct analysis, and then they manage the context. And they manage the context store so they can govern the AI in a very scalable fashion. And then the business stakeholders never leave Slack. There's data viz right there. The stakeholders can make data requests. They can request context corrections. Often, the AI just figures out how to do that for them. So they plug into all sorts of workflows. They can schedule these analyses on a regular basis. And in our demos, we almost brag.
Right? Like, you're not gonna see a web UI during this entire demo, and that is super deliberate. Because who wants to learn another web UI? It bounces you out to a web UI. You have to auth to it. You have to learn a completely new information architecture. Right? You have to learn completely new concepts, and it bounces you out of your collaborative zone. You can no longer, like, at mention people, etcetera, etcetera. So, yeah, that's kind of the approach. It's a Slack native, natural language analytics experience that is collaborative, governed, and driven by AI.
[00:12:01] Tobias Macey:
So the common challenge when dealing with all of these agent based systems, as you already pointed out, is this challenge of context engineering, where the alchemy of turning raw bits into useful information is the entire purpose of data engineering, to be very reductive about it. And so those two things are in tension, because the data engineer doesn't want to forego their purpose and hand it off to an AI, particularly if they have any pride in their work, because they know that the AI isn't going to do an appropriate job of understanding the business needs, the business context, all of the hard-won knowledge that they've already encoded into the data assets that they're building. Whereas the consumers of that data, to your point, don't wanna have to deal with learning all the information architecture. They don't want to have to dig through all the docs or go through all the pipelines to really understand what it is that they're actually looking at. And so I'm wondering if you can just talk to some of the ways that you're thinking about that context management and the handoff between the data teams, who are doing all the hard work of bringing all this information together and hydrating it with that business context, and the ways that the agentic analyst is able to actually retrieve and interact with that context, to understand how to map the probably very vaguely worded request from the stakeholders into a concrete plan of action and means of discovery, enumerating all of the information that's required to fulfill that request.
[00:13:35] Nick Schrock:
Right. So there's a lot there. Let me just try to tease that apart. Dispositionally, and because I also think it's how the world works, I think of the AI as a bicycle for your brain as opposed to a replacement for it. And in some ways, actually, almost in every way, these AIs make judgment and taste that much more leveraged. Because if you have good taste and judgment, you can get the AIs to do an extraordinary amount of work on your behalf that's high quality. But if you don't have that, then, I call it a technical debt superspreader, it can copy bad patterns and go off to the races and hallucinate lord knows what. So that's my starting place: we build tools that keep humans in the loop, that are governed, and that accelerate and amplify the work of subject matter experts rather than sort of eviscerating it.
So where do you want me to take it next? That's kind of the high level philosophical answer.
[00:14:44] Tobias Macey:
So I think the interesting bit is, I'll admit to everybody listening, I've already witnessed the demo of Compass, so I already know a lot of the details of how this is operating. So these are somewhat leading questions and a little bit of inside baseball. But my understanding is that Compass relies on a repository of context artifacts for understanding how to map the semantics of these analytical requests onto the actual data assets that are available, with those data assets, at least in its current formulation, largely being restricted to a data warehouse environment for creating and executing SQL queries. And so I'm wondering if you can just talk to how you're thinking about the initialization of that context repository from the set of tables and data assets that already exist, in a manner that isn't just a lot of extra busy work on behalf of the data team, but provides all of the necessary information and guidance to the agentic system for fulfilling the requests of the business stakeholders.
[00:15:47] Nick Schrock:
Right. Okay. That makes total sense. So I think there is an initial step where we bootstrap the context store with as much information as possible. Setup is super easy. All you do is plop in your data warehouse creds, and you're good to go. Now what happens is that, for the tables that you allow us to see, we query the information schema. We get as much information as possible from there. We also sample the data, and then we programmatically generate context. I think where a lot of people go wrong with that type of step is that they aren't thoughtful enough about producing the precise context that your application expects.
They just, like, dump raw metadata into some context window and, like, hope and pray that the agent figures stuff out. We're very much of the belief that you need to very deliberately produce context programmatically in a way that's guided and specific to your application. That means there's a level of precision and control. Like, I firmly believe that increasingly, we're gonna move from data pipelines to context pipelines, meaning that context will be computed. It'll be computed from other context and other data, and that is data pipelining. Right? So that's one of the reasons we have an engineering approach to it. And the automatically generated data docs are kind of an initial step to do that. Before I get to the second critical piece that's in the context store, I'll note that all of this is managed with Git, and we think that's very important. Context occupies this fascinating space that's sort of in between code and data, meaning that context is computed just like data, but it also very directly determines system behavior just like code does. This is why we've set out on this path of having programmatically generated context checked in to a Git repository.
Because if it's checked in, you can track changes, you can revert things, you can write tooling over it to change it in place. It allows all sorts of flexibility and precision. Right? So imagine you're really developing this thing at scale. You have evals so you can evaluate the performance of the agent. Right? You can do, like, a bisect against the context just like code, and that's a super powerful dynamic. The second piece that we capture in Git are these manual context corrections that we get directly from the business stakeholders. So this is another critical piece of the context puzzle, the first one, again, being programmatic generation of context. The second piece is actually getting the information out of the brains of your stakeholders, who actually know the domain, and into some governed context store where the agents can utilize it to give correct results. For example, at one of our early customers, they use the term core as a kind of special code word for a project, where it doesn't really mean core, and they flagged this. You know? They're like, this has screwed up AIs before, and they totally hallucinate because, obviously, the model has its own idea of what core means in its foundation layer. So we created a context correction that very specifically laid out what core actually means, etcetera, etcetera, and the system performed well. Now what's really magical about the way Compass works is that all of this is captured either explicitly or in an ambient sense from the interactions that are happening in Slack. So, for example, the business stakeholder is presented within Slack with a data visualization that looks wrong. The demo that we give is, like, some sales rep has a 90% win rate, and that never happens.
So the demo is, you look into it, you do some investigation, you figure out that the sales rep is actually a customer success manager, and that automatically submits a context correction back to the context store that says, hey, don't count CSMs as sales reps. And then that's checked into the context store. Knowledge is captured, and then the data team can take that and manage it very explicitly. And that's a very powerful model. You know, generally, the alternative with more heavyweight systems like semantic layers is to do this upfront process that is very burdensome and complicated, and you need to send the business stakeholder to a custom tool or a web app or something. And they never do that. So the knowledge stays captured in their head, which does no good to anyone except for them, maybe. But as we demonstrated, humans have limited context windows too. And what's good for the goose is good for the gander here. So it's important to get that out of your head and into a place where the agents can take advantage of it. And it's really this lightweight interface within Slack that's the magic in that part of the process.
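As a rough illustration of the two pieces described above, programmatic context generation from warehouse metadata plus human context corrections checked into a Git-tracked store, a minimal sketch might look like the following. All function names, record fields, and file layouts here are illustrative assumptions, not Compass internals, and SQLite stands in for the data warehouse so the example is self-contained.

```python
import json
import pathlib
import sqlite3
import tempfile

def generate_table_context(conn, table, sample_rows=3):
    """Bootstrap step: read the schema and sample rows, emit a context doc."""
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    sample = conn.execute(f"SELECT * FROM {table} LIMIT {sample_rows}").fetchall()
    return {
        "table": table,
        "columns": [{"name": c[1], "type": c[2]} for c in cols],
        "sample_rows": [list(r) for r in sample],
        "corrections": [],  # stakeholder corrections accumulate here over time
    }

def record_correction(store_dir, doc, correction):
    """Correction step: append stakeholder knowledge and write the doc to a
    file tree that would then be committed to Git for review and revert."""
    doc["corrections"].append(correction)
    path = pathlib.Path(store_dir) / "context" / f"{doc['table']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(doc, indent=2))
    return path

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (rep TEXT, amount REAL, won INTEGER)")
conn.execute("INSERT INTO deals VALUES ('alice', 1200.0, 1)")

doc = generate_table_context(conn, "deals")
path = record_correction(
    tempfile.mkdtemp(), doc,
    "Do not count customer success managers (CSMs) as sales reps "
    "when computing win rates.",
)
print(path.read_text())
```

Because the context lives as plain files in a repository, changes like the CSM correction arrive as reviewable diffs rather than opaque state.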
[00:20:35] Tobias Macey:
One of the other aspects of data teams who are responsible for the care and feeding of analytical systems, particularly when you're dealing with business intelligence, is that it's often very difficult to gain any real insight into how the stakeholders are interacting with those systems. You might be able to have some audit logs to see how frequently people are running certain queries or dashboards. But beyond that, you don't know why they're going to those dashboards, what they're doing with the information once they retrieve it. And I'm curious how you're thinking about the ways of bringing more visibility both to the data teams and at the organizational level of how the overall company is interacting with the data assets that you have and are creating and some of the ways that that can create some feedback loops to the data teams either to prune unused data assets or to understand what new data assets they need to generate to be able to fulfill the needs of the organization, and in particular, how using the conversational system of record for the company helps to provide some of that visibility?
[00:21:44] Nick Schrock:
Oh, it's a great question. And I think I could talk for hours on this subject, because it's not just what's currently happening; I think the roadmap on this front is super, super bright. You know, we're still early stages here. So I guess the first thing that happens is that when the business stakeholder is in the same channel as the analyst, the analyst can literally see how what the business stakeholder wants gets compiled to SQL in the data warehouse. That gap between business language and, like, your column and table names communicates so much about what is actually happening and what people actually want. With a traditional BI tool or exploratory data analysis tool, that translation does not exist in a format that is discernible by one of the analysts.
So I think that is the dynamic. Just the social dynamics end up really producing a ton of insight. And we are just at the beginnings of being able to use, I like your term, the conversational system of record to drive more value out of that. The initial feature we have that I think demonstrates that is the ability to create data requests in a ticketing system automatically, based on what's been happening in a specific thread. So let me give you an example of how this works. I was asking Compass specifically about product analytics, but this expands beyond that. I was asking what I thought was a very simple question, which was, how many of our customers use declarative automation, which is one of our scheduling features? It turns out our warehouse didn't really have that particular feature explicitly modeled well. So what the AI did, and this was awesome and terrifying to watch: we have Gong transcripts, meaning Gong is a system that records and transcribes sales calls, and the AI decided to use Snowflake's AI based analysis features. It couldn't find the exact information, but what it did do is it found all the customers that had mentioned declarative automation in one of their sales calls, which was obviously imprecise, but gave me a hint, or at least a floor, of how many people use it, and interesting customers that use it. It was also terrifying because those capabilities are extremely expensive in Snowflake. But as a result of this experience, I was like, make a data request ticket so that we have this information first class in the data warehouse. And what the system does is it scoops up that entire conversation. And think about all the context you have. You have who asked for it. You have all this SQL that was generated that is navigating the warehouse and trying to figure out where things are. You have follow-up questions. You might have a conversation.
Maybe our analyst jumps in and says, oh, yeah, we don't have that because of this historical reason. It scoops all that up, synthesizes it, and creates a data request. Right? And that's an LLM assisted process. So we're really building what Databricks calls compound AI applications.
What that means is injecting, where appropriate, LLM augmentation and processing into parts of the workflow. And I think we're just at the beginnings of this. As we develop this product over time, we will be leveraging the conversational system of record much more. You can imagine doing post hoc processing on threads that allows you to discern suggested context corrections with more detail, for example, scooping up all that information. You can imagine having observability and insights tools across all the conversations happening in all the different channels across your business, so you can understand what's happening, what data is being requested the most frequently, what data is not being requested.
You know, my vision is kinda like, you know how during COVID, with Google Trends, you could figure out where COVID was spreading, and it was ahead of any reporting, because people would start asking, oh, I can no longer taste something. You could see that move through the country and be a leading indicator of the true metrics that get reported. I'm imagining a future where, at a company, you can really get a sense of what people care about in aggregate. If the analytical queries are shifting in the business, that's actually a good insight in a large organization about what people care about, what they're worried about, etcetera, etcetera. So I think there's a huge space here to get broad based intelligence from this conversational system of record and how the context is being accessed, too. So I think it's a very perceptive question, and there's just a huge amount of greenfield around there.
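The "scoop up the thread" step Nick describes, collecting the conversation, the generated SQL, and the participants into one payload that an LLM could then summarize into a data request ticket, can be sketched roughly as follows. The message shape, field names, and SQL are illustrative assumptions; the actual LLM summarization call is left out.

```python
# Assemble a Slack thread into a payload for LLM-assisted ticket synthesis.
# The structure below is a guess at what such a payload could contain.
def build_data_request_payload(thread):
    messages = thread["messages"]
    return {
        "requested_by": messages[0]["user"],                  # who asked
        "conversation": [m["text"] for m in messages],        # full dialogue
        "generated_sql": [m["sql"] for m in messages if "sql" in m],
        "participants": sorted({m["user"] for m in messages}),
    }

thread = {"messages": [
    {"user": "nick", "text": "How many of our customers use declarative automation?"},
    {"user": "compass",
     "text": "No explicit model found; searching Gong call transcripts instead.",
     "sql": "SELECT customer FROM gong_calls "
            "WHERE transcript ILIKE '%declarative automation%'"},
    {"user": "nick", "text": "Make a data request ticket so this is modeled first class."},
]}
payload = build_data_request_payload(thread)
print(payload["requested_by"], len(payload["generated_sql"]))
```

The point of the structure is that the ticket inherits the requester, the failed navigation attempts (the SQL), and the human discussion, rather than starting from a blank description field.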
[00:26:35] Tobias Macey:
It's interesting too. I was actually, just earlier today having a conversation with somebody who's building an agentic coding platform for doing software engineering in an autonomous fashion, and it brought up the whole idea of Conway's Law about how the structure of the software is defined by the communication patterns of the organization. And once you start introducing these agentic systems, that changes the communication patterns to also incorporate those LLMs, which by necessity modifies the structures of the software that gets created. And I'm wondering how you see that analogy play out in the context of these agentic analytical systems and the role that it plays in terms of the design and orchestration of the data assets that you're building and the ways that people are interacting with those data systems. But because we have these LLMs in play, it is no longer human to human interaction or human to deterministic machine interaction. The LLM then plays a role in that communication system and modifies the ways that people are interacting with it.
[00:27:43] Nick Schrock:
That is really interesting. The Conway's Law analogy, I hadn't thought about, but it makes total sense. Because one of the things that I think is happening in the AI era is that nearly every stakeholder relationship is going to be reimagined for this era. And part of that is because the new consumption layers facilitate new team organizations, because of Conway's Law. Just as an example that's not in data platforms: there was, like, an all-software-engineering-is-going-away boomlet. Right? That was kind of a big conversation, which I thought was a complete load of, you know, whatever.
It's a family show, right, Tobias? But, you know, PMs vibe coding does not mean that software engineers are going away. However, the ability of PMs to prototype and build things in the native system of the engineers fundamentally and completely transforms their stakeholder relationship. And I think this is partially one of these Conway's Law-esque effects. And that's also what's happening in Compass between the business stakeholder and the analyst, where the business stakeholder can now do vibe analytics in their own way and communicate directly in the native medium that the analyst can understand. Because of this Conway's Law effect, I am incredibly bullish on this UI interaction of multiplayer agentic chat in B2B contexts. So, you know, single player agentic chat, like ChatGPT and its competitors, has completely remade consumer software, or is in the process of doing so. And I think that this multiplayer collaborative chat is gonna be the same order of magnitude change in the enterprise.
You know, we're seeing it right now in Compass, because if you stack that on top of the data platform, you effectively don't need reporting functionality across all of your vertical SaaS apps. It's just in this one spot, which is super, super exciting. And I think this agentic chat is what does it, because you're bringing in people. You can bring in the random stakeholders. I think a lot of people's mental model of the agent is someone alone, talking to the agent. In an enterprise context, that doesn't make sense. What the Slack modality does is that the agent is a participant in a collective conversation that incorporates workflows, and that is a super powerful dynamic that also changes the communication structures here. So I think people have the wrong mental model of this. There's also a boomlet about, like, oh, there's gonna be one person startups that are billion dollar companies. And I don't really think that's true either, because I just don't imagine a world where one human is talking to n agents and building a company like that.
I think it's more like there could be fewer people, more hyper empowered people, but it's always gonna be hybrid, where there's lots of humans and lots of agents, and the humans are sort of up leveling their work. At least, maybe that's just the way I want the world to work, but I think it is the way the world will work.
[00:31:01] Tobias Macey:
I think it's also indicative of just the overall tendency for people to take a proof of concept and extrapolate to a larger scale that is not realistic. I mean, as with anything in software and technology, it's, oh, I built this system in a weekend, so therefore, I can build an entire production company by the end of the week. But the factors of scale are something that nobody ever properly accounts for, where you're dealing with exponential complexity but logarithmic capability. And so you're going to diverge sooner rather than later in terms of what you could actually feasibly maintain. And so I think similarly with that idea of the one person company where I just have 50 different agents: your head's going to explode trying to keep up with them. And, eventually, you're going to hit the law of diminishing returns, where the inaccuracies of the agents are going to start compounding, and it's going to drive your multibillion dollar company into the ground before it ever takes off. And I think it's also indicative of the hype cycle that came with the initial release of ChatGPT, saying, oh, well, AGI is now just around the corner. We're going to have it by the end of next year, and now it's two years out, or five years, and it keeps getting pushed back.
[00:32:13] Nick Schrock:
Yeah. And what is even AGI? You know? Like, it's a very difficult thing to define. You know, I use the term technical debt superspreader and things like that. I actually think that's a specific instantiation of a more general trend that's gonna play out across multiple domains. Because with AI, we're going to be entering a complexity crisis, effectively. Like, the ability of agentic systems, and humans empowered by agentic systems, to produce complexity, junk content, interrelated concepts that you fundamentally don't understand, is very, very high. So I think that the ability to manage and model complexity will become only more and more leveraged. You know? And that's what I think about when I'm doing agentic engineering: really compartmentalizing complexity in a way where the agents can contribute the right things at the right time. But, yeah, it's going to be a very complicated world with all these agents running around.
[00:33:13] Tobias Macey:
And now bringing us back around to Compass and these agentic systems for AI and the role of the data infrastructure and the data teams in that landscape, what are some of the ways that the requirements of the data infrastructure change when you have to support these agentic systems, and what are some of the aspects that can remain the same and the agent is able to just use systems as they exist today?
[00:33:42] Nick Schrock:
You know, it's such a broad question. And the agentic systems are so new, and not that many people have deployed them at real scale, that I think it's actually very difficult to understand at this time how exactly it's going to impact everything. You know? A lot of people are like, oh, there's gonna be more unstructured data. I don't even know if that's true, for example, because, actually, for these AI systems to operate over data and do super leveraged things, you actually want tons of structure and tons of metadata, tons of context. You know, I think that real time, more complex workflows are going to be incredibly important.
I'm quite bullish on systems like Temporal, for example, to manage the complicated agentic workflows that go on, because the ability to pause and resume compute will be very important. One of the interesting things happening is that agentic workflows are so high latency. Right? Users are now trained to let a computer think for minutes on end on their behalf, which is very different from, say, the web era, where every millisecond counted. And utilizing computational resources efficiently in those contexts, I think, is actually quite challenging. There's any number of things I could spout off about, but I think anyone who gives you definitive answers about how all this stuff is gonna impact infrastructure doesn't really know what they're talking about. Because, like I said, every layer in the stack is getting disrupted, and the consumption layer is changing.
That implies changes to the compute that's actually running, but also the AI is impacting the way these infrastructure things are built. So there's multiple dimensions of variability right now, and I think it's very difficult to project beyond pure conjecture what's gonna be changing.
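The pause-and-resume pattern Nick attributes to systems like Temporal can be illustrated, in a much simplified and non-durable form, with a Python generator: the workflow yields while waiting on a slow agentic step and resumes when the result arrives. A real durable-execution engine persists this suspended state across process restarts; the generator here only models the control flow.

```python
# Toy model of a pausable agentic workflow. Each `yield` is a suspension
# point where a real engine would persist state and release the worker.
def agent_workflow():
    prompt = yield "waiting_for_model"        # pause until the LLM responds
    plan = f"plan({prompt})"                  # resume with the model's input
    result = yield "waiting_for_warehouse"    # pause again for the slow query
    return f"answer({plan}, {result})"        # final answer on completion

wf = agent_workflow()
state = next(wf)                 # advance to the first suspension point
state = wf.send("q3 revenue?")   # deliver the prompt, suspend on the query
final = None
try:
    wf.send("rows=42")           # deliver the query result; workflow finishes
except StopIteration as done:
    final = done.value
print(final)
```

The design point is that no thread or process sits blocked during the minutes-long waits; compute is only consumed at the resumption points.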
[00:35:40] Tobias Macey:
One of the other aspects of bringing an agent into the equation is obviously cost, because LLMs have very unpredictable cost patterns. And so you don't want to route every request through the LLM, especially having it do a huge body of work if it's something that you already have a stored data asset for. And I'm wondering how you're thinking about some of the methods around taking the common questions and interaction patterns with that agent and either caching them for quick retrieval or materializing them into a more durable asset, so it's not something that gets recomputed every time, or just some of the ways to shape the interaction patterns of the stakeholders, to say, you don't have to ask the LLM this question every time, you can go here for it, or it's going to deliver this to you without you having to take any action, as some of the means of mitigating unbounded cost.
[00:36:42] Nick Schrock:
Yeah. No. You know, I felt this very personally, because I went from zero to, like, a 100 on agentic coding this summer, and I hadn't signed up for the Claude Max plan. I just used our corporate account, which doesn't have that sort of high usage limit. And in my first two weeks of Claude Code usage, I cost the company $3,000. We were able to get it under control, but, you know, you can consume a lot of cost doing that. I don't even wanna think about how much natural gas was burned to produce those, you know, 10,000 lines of code or whatever. So, I mentioned a complexity crisis before; I also think there's a cost crisis coming. And I think the first answer here is that, earlier in this episode, I mentioned that I think context pipelines are kind of the new data pipelines.
One part of that is that you wanna be precise about when and how you recalculate context. And that means it's a data pipelining problem: doing event based computation, crafting the computation in a very specific way, and then producing it in a highly tailored way so it's perfect for your application. So writing data pipelines that become context pipelines, and then matching that with context engineering, meaning taking those produced artifacts and feeding them to the right model at the right time, the combination of those two techniques, I think, is going to be essential for controlling costs. Because the larger the context window, the more expensive the compute is. Like, prefill is quadratic with respect to context window size, and it determines a ton about model performance. But I think that the cost crisis coming is real.
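The "be precise about when you recalculate context" idea can be sketched as an incremental pipeline that rebuilds a context artifact only when the upstream schema it derives from actually changes. The fingerprint trigger below is one plausible mechanism, an assumption for illustration, not a description of how Compass does it; the expensive step it guards would typically be an LLM call.

```python
import hashlib
import json

def schema_fingerprint(schema):
    """Stable hash of a table schema, used as the recompute trigger."""
    return hashlib.sha256(json.dumps(schema, sort_keys=True).encode()).hexdigest()

class ContextPipeline:
    def __init__(self):
        self._seen = {}   # table name -> fingerprint at last build
        self.builds = 0   # counts how often the expensive generation ran

    def context_for(self, table, schema):
        fp = schema_fingerprint(schema)
        if self._seen.get(table) != fp:
            self._seen[table] = fp
            self.builds += 1  # expensive context generation would run here
        return f"context for {table}@{fp[:8]}"

pipe = ContextPipeline()
schema = {"columns": ["rep", "amount"]}
pipe.context_for("deals", schema)
pipe.context_for("deals", schema)  # schema unchanged: no rebuild, no spend
pipe.context_for("deals", {"columns": ["rep", "amount", "region"]})  # rebuild
print(pipe.builds)
```

Two of the three requests reuse the cached artifact, which is the whole cost argument: the generation step runs on schema change events, not on every question.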
I think the chickens will come home to roost for a lot of these firms who aren't passing through enough of the compute cost to their customers, and their customers will have a rude awakening and churn. And I think some of the coding startups are encountering that challenge right now. So, yeah, I think there's gonna be a huge number of techniques, and those techniques will stay extremely relevant even as the models get better and even as they get cheaper too. Because some of this context management, I view it as almost like big O notation or algorithmic complexity, meaning that no matter how good Moore's Law is, an O(n squared) sort algorithm can only go so far, no matter how fast the processor is. And I think the same thing is gonna be true with context engineering. You know? Like, we're even seeing this now. We're getting to a million tokens, and even enormously larger context windows, but they have enormous amounts of diminishing returns. And it can even be a negative thing if you pollute the context with contradictory information. This is famously called context poisoning or context rot and all this stuff. So I think context engineering is gonna be more and more expansive. I think that is gonna be a common theme to control cost. And then beyond that, having more control over fine tuning. I think there's a whole undiscovered country in terms of democratizing fine tuning, and having the model providers offer built in capabilities so you can do fine tuning over their closed source models. But it is early days. We are going to burn a lot of money and energy along the way, and it's gonna become increasingly important to control it.
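One of the mitigations raised in the question above, caching answers to repeated natural-language asks so the model is not re-invoked every time, could look like the sketch below. Keying on a normalized question string is a deliberate simplification, an assumption for illustration only; a production system would need semantic matching and invalidation when the underlying data refreshes.

```python
def normalize(question):
    """Collapse trivial variation so repeated asks hit the same cache key."""
    return " ".join(question.lower().split()).rstrip("?")

class AnswerCache:
    def __init__(self, answer_fn):
        self.answer_fn = answer_fn  # the expensive LLM + warehouse path
        self.cache = {}
        self.llm_calls = 0          # track how often we actually pay

    def ask(self, question):
        key = normalize(question)
        if key not in self.cache:
            self.llm_calls += 1
            self.cache[key] = self.answer_fn(question)
        return self.cache[key]

# Stand-in for the real agentic pipeline.
cache = AnswerCache(lambda q: f"answer({q})")
cache.ask("What was Q3 revenue?")
cache.ask("what was q3 revenue")  # trivially rephrased: served from cache
print(cache.llm_calls)
```

The same shape generalizes to the heavier option in the question: instead of a dictionary, materialize the frequent answers as durable warehouse assets on a schedule.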
[00:40:12] Tobias Macey:
As you have been going through this journey of building Compass, testing it out, getting it in front of some early adopters. What are some of the most interesting or innovative or unexpected ways that you've seen teams apply these agentic capabilities on top of their existing data investments?
[00:40:30] Nick Schrock:
That is a good question. You know, we're still on the order of dozens of users in our funnel. This is the week of October 6, and we're opening the floodgates a little bit; we have hundreds of people on the wait list. The thing that has really stuck out to me is that, effectively, once people connect their data warehouse to the system, we have 100% retention on the platform, which is crazy. So people start using it, the usage is intense, and they get tons and tons of stakeholders in the system. Right? And, internally, you know, we actually purchased some datasets that we're gonna make public that are effectively the moral equivalent of the PitchBook data: companies and their fundraising histories and their revenue numbers and all this sort of stuff. And then a people database, which is kinda like the LinkedIn dataset. And Compass plus those datasets makes, like, the best prospecting and recruiting tool, where prospecting means salespeople finding customers that will be open to purchasing the product. It's more powerful than LinkedIn Sales Navigator.
It's crazy. You know, SQL is so powerful, and natural language on top of SQL, doubly so. So we've already seen every single ops role use this tool effectively: recruiting, HR, FinOps, RevOps, sales ops. There's lots of ops these days. Product ops queries, doing these sorts of things on our own data platform, in fact. So the breadth of use cases has been pretty awesome. And, yeah, a lot of our early product market fit is actually with investment firms, and they use it for interesting stuff. We thought they would use it to try to find new companies to invest in, but they have a sales pipeline just like anyone else; their "sale" is just investing in something. So they know what stage they're looking at a company in, and they have a pretty formalized pipeline. Generally, there's one investment ops person who manages that, and they have to field requests from the partners, which are often very time sensitive and stressful. But they've actually gotten their partners, the people who run the firms, to use this tool directly, which has been both efficient and an incredible stress reducer, which is literally why, on our marketing site, we can have a pull quote that says, quote, unquote, Compass saved my life, which is always something you wanna hear as a founder.
But the reason that person said that is because we not only saved her time but spared her enormous amounts of stress dealing with time sensitive requests from very important people. So I think this investor use case has been pretty interesting to see.
[00:43:24] Tobias Macey:
In your work of building the system and understanding the capabilities and use cases and limitations of an agentic analytics platform and how to tie it into existing data infrastructure and data assets, what are some of the most interesting or unexpected or challenging lessons that you learned in the process?
[00:43:44] Nick Schrock:
I mean, it's still early days. It's amazing how, once you go from one person on the go to market team being able to interact with the data warehouse to 80% of your team being able to, you really start to see how many gaps there are, both in your data model and in your understanding of what people actually care about. So that has been super interesting to watch roll out in real time.
[00:44:17] Tobias Macey:
And what are the situations where you would advise against going down the agentic path for these exploratory or analytical use cases?
[00:44:29] Nick Schrock:
Yeah. So, you know, we don't call it a BI tool. We call it exploratory data analysis because it's actually a very distinct use case. BI tools often drive absolutely mission critical things, like revenue reporting that is subject to regulatory scrutiny, or comp decisions, or pricing decisions. And Compass is explicitly not designed for that use case. It is for rapid, exploratory, directionally correct data analysis, which is a very different use case. So we have no desire to be a replacement for those core BI assets. We think those should be managed by the BI tools. One of our principles here is what designers call truth in materials. We don't want to pretend it's not an LLM. We don't want to pretend that it's 100% accurate or bulletproof. That's not its purpose. Right? We want it to be directionally correct and eventually correct. And by eventual correctness, I mean that the context store gets added to, and the queries get more and more accurate over time toward some kind of asymptote. So, you know, there are domains where absolute precision in all cases is absolutely required. That is not this tool. It is for facilitating, as I said, rapid, directionally correct analyses.
[00:45:50] Tobias Macey:
And as you continue to invest in and iterate on this agentic exploratory analytics use case, what are some of the things you have planned for the near to medium term or any particular projects or problem areas or capabilities that you're excited to explore?
[00:46:07] Nick Schrock:
Yeah. So one thing I'm super interested in, I think for obvious reasons, is deep integration between Compass and Dagster and Dagster+. And this comes in many different forms: using data pipelines to produce and manage context, integrating the context store with our operational system of record, and then also using the tool itself. You know, we have this ability to create data requests, which can be very detailed, and then using those as the basis of agentic AI authoring workflows, which we actually have working already and which is very, very effective. So I'm very excited for that dimension, integrating Compass even more first class into data platforms. I'm also very excited to work on a more enterprise SKU of Compass. I think these kinds of organizational observability features will be part of that, as well as on prem versions, which will have their own challenges but will really unlock usage in a ton of places, deliver a ton of value, and, we feel, be very successful in terms of being a healthy business. And then, the way this is set up, we can attack all sorts of interesting use cases one by one. In the initial stages, every dashboard in every vertical SaaS app is, in effect, our opportunity.
And that's very exciting to see. So much of the information and knowledge work that happens today is still drudgery: manually fielding a request to add such and such to this Salesforce dashboard and then, you know, hooking this and that up. And I think people are a little too pessimistic about, like, AI taking all of our jobs. I don't think that will happen. I think people will move up the stack and have to deal with much less drudgery. And that's the way I approach this and what I seek to do in participating in and helping with this product. I think the future is bright. You know, it kind of always comes up, and maybe I'm anticipating a question you might ask, but: should my kids study software engineering? Does software engineering have a future? And I couldn't be more bullish about the future of software engineering. It's just gonna change the definition of what software engineering is. But the core foundations, learning how computation works, learning how to think about this stuff from first principles, will only become more leveraged.
[00:48:37] Tobias Macey:
Are there any other aspects of this space of agentic analytics, the work that you're doing on Compass, the leveraging of existing data infrastructure and data assets into this more AI driven interaction pattern that we didn't discuss yet that you'd like to cover before we close out the show?
[00:48:56] Nick Schrock:
No, I think we've covered a lot of ground, so I think we'll leave it here.
[00:49:02] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology for data management today.
[00:49:16] Nick Schrock:
It's always extremely unfair when you ask vendors this because we're morally obligated to talk our own book. But I am super interested in, like, obsessed with, this context engineering notion, and I think it's gonna be a defining discipline for the next ten years. It's super, super early days. I actually think about it a lot because my other passion project right now is figuring out how to deploy AI and agentic authoring in real and large software systems. And I am very interested in one problem that sounds simple but, I think, is a big one.
Keeping markdown files checked into a project up to date with the underlying code. I think of these markdown files, generally computed by agents, as just token caches. Right? An LLM has evaluated a bunch of tokens in the code base and then materialized that knowledge in a more condensed form. And I think that's actually gonna happen recursively in large software projects. But keeping them up to date is actually another instance of a data pipelining problem, because you can't recompute them every time; that ends up being too expensive. So how can you do that intelligently and keep them up to date? I think it's just one pillar of what is gonna be needed to do AI accelerated software engineering at scale. That's my term, by the way. I despise the term vibe coding and hope we don't talk about it here.
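One way to treat an agent-written markdown file as a token cache with pipeline-style invalidation is to stamp it with a hash of the source files it summarizes, and regenerate it only when that hash drifts. The sketch below is a hypothetical illustration of that idea; the marker format and helper names are invented, not from any real tool:

```python
import hashlib
from pathlib import Path

# Hypothetical stamp embedded in the generated markdown file.
MARKER = "<!-- source-hash: {} -->"


def code_hash(source_paths):
    """Fingerprint the contents of the source files the markdown summarizes."""
    h = hashlib.sha256()
    for p in sorted(source_paths):
        h.update(Path(p).read_bytes())
    return h.hexdigest()[:16]


def is_stale(md_path, source_paths):
    """The summary is stale if its embedded hash no longer matches the code."""
    path = Path(md_path)
    text = path.read_text() if path.exists() else ""
    return MARKER.format(code_hash(source_paths)) not in text


def refresh(md_path, source_paths, summarize):
    """Regenerate the 'token cache' only when the underlying code changed."""
    if is_stale(md_path, source_paths):
        body = summarize(source_paths)   # expensive agent / LLM step
        stamp = MARKER.format(code_hash(source_paths))
        Path(md_path).write_text(f"{stamp}\n{body}\n")
        return True                      # recomputed
    return False                         # cache hit, no tokens spent
```

Calling `refresh` on every commit is then cheap: the expensive summarization runs only when the hash check says the condensed knowledge has actually gone stale.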
[00:50:37] Tobias Macey:
Yeah. An alternative term that I heard recently is AI native engineering.
[00:50:43] Nick Schrock:
That's pretty good. I will take it. I will take it. Agentic engineering is pretty good too, but I don't know. Agentic is like one of these words now, which I, like, only use as a last resort.
[00:50:54] Tobias Macey:
Absolutely. Well, thank you very much for taking the time today to join me and share the work that you've been doing on your agentic analytics system and the lessons that you've learned there. I appreciate that, and I hope you enjoy the rest of your day.
[00:51:09] Nick Schrock:
Thanks, Tobias. Thanks for having me.
[00:51:18] Tobias Macey:
Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Guest intro: Nick Schrock and Compass focus
Agentic systems hype, progress, and context engineering
What Compass is: Slack native, human-in-the-loop analytics
Keeping data teams in control: context management philosophy
Bootstrapping the context store and Git-governed corrections
From conversations to data requests and org visibility
Conway's Law, multiplayer agentic chat, and workflows
Impact on data infrastructure and architectural unknowns
Cost control: context pipelines, precision, and caching
Early user patterns, ops and investor use cases
Lessons learned and where agentic EDA does not fit
Roadmap: deeper Dagster integration and enterprise features
Biggest tooling gap: context engineering for AI-native work