In this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data modeling that directly captures a company's business model. Vijay shares insights from his decade-long experience building data practices at Rent the Runway and explains how the modern data stack has led to a proliferation of dashboards without a coherent way for business consumers to reason about cause, effect, and action. He explores how metric trees differ from and interoperate with other data modeling approaches, serve as a backend for analytical workflows, and provide concrete examples like modeling Uber's revenue drivers and customer journeys. Vijay also discusses the potential of AI agents operating on metric trees to execute workflows, organizational patterns for defining inputs and outputs with business teams, and a vision for analytics that becomes invisible infrastructure embedded in everyday decisions.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Your host is Tobias Macey and today I'm interviewing Vijay Subramanian about metric trees and how they empower more effective and adaptive analytics
- Introduction
- How did you get involved in the area of data management?
- Can you describe what metric trees are and their purpose?
- How do metric trees relate to metric/semantic layers?
- What are the shortcomings of existing data modeling frameworks that prevent effective use of those assets?
- How do metric trees build on top of existing investments in dimensional data models?
- What are some strategies for engaging with the business to identify metrics and their relationships?
- What are your recommendations for storage, representation, and retrieval of metric trees?
- How do metric trees fit into the overall lifecycle of organizational data workflows?
- When creating any new data asset it introduces overhead of maintenance, monitoring, and evolution. How do metric trees fit into existing testing and validation frameworks that teams rely on for dimensional modeling?
- What are some of the key differences in useful evaluation/testing that teams need to develop for metric trees?
- How do metric trees assist in context engineering for AI-powered self-serve access to organizational data?
- What are the most interesting, innovative, or unexpected ways that you have seen metric trees used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on metric trees and operationalizing them at Trace?
- When is a metric tree the wrong abstraction?
- What do you have planned for the future of Trace and applications of metric trees?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- Metric Tree
- Trace
- Modern Data Stack
- Hadoop
- Vertica
- Luigi
- dbt
- Ralph Kimball
- Bill Inmon
- Metric Layer
- Dimensional Data Warehouse
- Master Data Management
- Data Governance
- Financial P&L (Profit and Loss)
- EBITDA (Earnings before interest, taxes, depreciation, and amortization)
Hello, and welcome to the Data Engineering podcast, the show about modern data management. Data teams everywhere face the same problem. They're forcing ML models, streaming data, and real time processing through orchestration tools built for simple ETL. The result, inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed, flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high memory machines or distributed compute.
Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI engineering, streaming: Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workloads, see what it can do for you at dataengineeringpodcast.com/prefect. Are you tired of data migrations that drag on for months or even years? What if I told you there's a way to cut that timeline by up to a factor of six while guaranteeing accuracy? Datafold's migration agent is the only AI-powered solution that doesn't just translate your code. It validates every single data point to ensure perfect parity between your old and new systems.
Whether you're moving from Oracle to Snowflake, migrating stored procedures to dbt, or handling complex multisystem migrations, they deliver production-ready code with a guaranteed timeline and fixed price. Stop burning budget on endless consulting hours. Visit dataengineeringpodcast.com/datafold to book a demo and see how they turn months-long migration nightmares into week-long success stories. Your host is Tobias Macey. And today, I'm interviewing Vijay Subramanian about metric trees and how they empower more effective and adaptive analytics. So, Vijay, can you start by introducing yourself?
[00:02:06] Vijay Subramanian:
Hey, Tobias. Great to be here. I'm Vijay Subramanian. I'm the founder and CEO of a metric tree based analytics startup called Trace.
[00:02:16] Tobias Macey:
And do you remember how you first got started working in data?
[00:02:19] Vijay Subramanian:
Oh, yes. Back in 2010, I joined the then seed-stage startup called Rent the Runway to head up all things data. And that became an almost ten-year journey to IPO. And I had, like, a front row seat to what is now considered the modern data stack, the full birth and evolution of the modern data stack. And I don't know if you remember, but back in 2010 the rage at that time was the Hadoop ecosystem. So that's how far back that was. We actually went with Vertica early on. We had a Vertica on-prem data warehouse before we migrated to Snowflake.
We had all these custom scripts for data pipelines before we tried this tool called Luigi, and finally settled on dbt. And then early on, all of our reporting was just spreadsheets before we moved to Tableau and then Looker. So we were basically the poster child for adopting the modern data stack. And it's precisely for that reason that I'm doing what I'm doing now: I had a front row seat to the innovation that was happening to enable the data producer to ingest data, build the pipelines, and maintain the pipelines, but I didn't see as much happening on the consumer side. In fact, I would even argue that enabling the producer almost led to a proliferation of data assets, so dashboards and data models, without regard to the consumer and how they're gonna piece everything together in order to make their workflows work.
So we're hoping that metric trees are that framework and that platform on which we can regulate this and also really provide value to the end consumer.
[00:03:52] Tobias Macey:
Yeah. Absolutely. It's definitely interesting how the late twenty tens into the early twenty twenties was really a sea change in terms of investment in and application of data at a broader scale than what had previously been the case where, for a long time, organizations would have some form of business intelligence. It was generally fairly small scale, limited, focused on a very small set of data sources largely driven by their core applications. And then the introduction of Hadoop brought in the idea of big data of just capture everything, and, eventually, it'll be useful. But the investment required to actually run an operation of that scale meant that it was largely limited to the big tech players and early adopters of the enterprise. And then as we moved later into the late twenty tens, things like Redshift and Snowflake helped to reduce the barrier to entry for data warehousing and business intelligence, led to what you mentioned as far as the rise of the modern data stack as well as, introductions of things like Fivetran, Airbyte, all of these data ingest tools that made it easier to onboard new data sources. Yet, to your point, led to this proliferation of data without any real solution of how to actually integrate it and make it useful.
And going back to the point of data warehouses in the nineties, there was the introduction of the Kimball-style star schemas. There was Bill Inmon with his third normal form data warehousing. I think maybe the early to mid two thousands saw the introduction of Data Vault. So we had all of these frameworks for, in the abstract, how to organize data into reusable structures. But, again, that required a lot of investment. It required a team who was very well versed in those practices as well as being very well versed in the business and how it operated. And I'm wondering if you could, given all that context, describe a bit about how the idea of metric trees fits into this evolutionary ecosystem, where we are today with this proliferation of data sources, the challenges of being able to actually activate it, especially now that data is being applied to a much broader range of use cases than just retrospective analytics.
[00:06:16] Vijay Subramanian:
Yeah. Exactly. I mean, that was a great timeline there. In fact, I don't know if you followed the timeline, but we kinda got away from data modeling for a bit in the modern data stack. We just thought we'll throw all the data in. The compute is cheap. Storage is cheap. We'll just run whatever we need to for the business. And we got away from a lot of the principles, which I think we're coming back to now. And if you really read those classics, if you read Kimball and Inmon, they were actually obsessed with the idea of gathering business requirements in order to do the data modeling. It wasn't just go and build a bunch of data models.
Everything derived from what the business needed to do and working backwards to build the data models. So if you look at it through that lens, a metric tree is simply an evolution of that concept. Because, you know, you can imagine going to the growth team and saying, what are the five or six different output metrics you care about, and what are the factors that drive them? Let's design those. Those become your metric trees. And then let's work backwards to design the data models you need to populate those metric trees. Because you're really going business first before you think about the data. And from that lens, I think metric trees are a very logical evolution of what data modeling can be.
And if I were to step back and define what metric trees are before we dive deep into it, that is the producer angle. Right? From a data producer angle, it's just the next evolution of data modeling. Maybe it's the final frontier of data modeling. From a consumer standpoint, a metric tree can be seen as a metric template that captures the business process or the business model. So the way I think about it is to combine these two concepts, and I like to define metric trees as the data model that captures the business model.
So that's sort of how we think about it. And so, you know, why are we doing that? Right? Like, what's the purpose of doing all this work? If you could capture the business model through data, and you could capture that in code, think of all of this work you mentioned. Right? I mean, we are spending so much time today on tedious work, manual work, repetitive work, time-consuming work, even boring work, trying to understand what is happening to my metrics. Why are they up? Why are they down? How are we performing versus budgets and plans, OKRs? What does the forecast look like if things stay the course? What does the forecast look like if things change?
Are my experiments working? Are my features working? What should I be doing? If I move this metric x, what does it do to revenue? So all of these questions that animate organizations, we could actually start streamlining them in software with this data model that captures the business model. And in fact, the most ambitious vision would be, can we automate them? Because, you know, what if there is a certain set of metric tree templates that govern the various business models in the world? And what if there is a finite, contained list of analytical operations, or functions if you will, that can be applied to the metric trees to power the workflows that users need? What if that were true? Right? That's where I'm excited about the purpose of metric trees.
[00:09:29] Tobias Macey:
And metric trees are another layer that builds on top of those dimensional structural elements of warehousing. And in terms of the nomenclature, they beg a comparison to the introduction of the idea of the metric layer, also known as the semantic layer, also known as headless BI, that became popular in the 2021 to 2022 time frame and gained a lot of popularity in the initial introduction, has since faded a little bit in terms of the hype around it, but is still very much part of the data ecosystem. And I'm wondering if you can talk to the ways that the work that you're doing with this idea of metric trees relates to the principles as well as maybe the technology layers that were brought in with those concepts of the metric layer or the semantic layer.
[00:10:24] Vijay Subramanian:
Yeah. Maybe the best way to illustrate this is with an example, so we can make this whole thing more concrete for us and for the audience. So let's take Uber or Lyft, because we all know how they operate. In Uber or Lyft, the key metrics would be number of rides, number of riders, revenue. Right? And if you had a semantic layer or a metrics layer where you could define these metrics, you could then ask that system, hey, give me the number of rides in New York City in the month of September, and that system will generate the SQL and give you the output. Right? That's the purpose of a metric layer or semantic layer. In a metric tree formulation of this problem, individual metrics don't really play a role, right, because you're defining the metric tree. The output metric, let's say, is revenue, and you would define that as a function of number of rides, ride frequency, average price of the ride, maybe a promotion rate if that's applied, and the take rate that Uber takes, right, you know, one minus the commission that they're paying to the drivers. So you express this function and you say these five or six different things ladder up to this thing called revenue. That's what a metric tree is. Now, you can obviously see that a metric layer, if you have already built it, could be an input to hydrating and populating a metric tree. Right? So if you have already defined this metric layer and you can access it in a nice way, then you can populate the metric tree. But you also don't need a metric layer to populate a metric tree, because you can define a metric tree and push a lot of these metric definitions into the metric tree system itself. You don't need a separate layer to define the metrics in order to populate it.
So the way I think about this concept is, if there is a standardization in the ecosystem where there is a standard metric layer, then, yeah, that can be consumed to hydrate and populate the metric tree, but you don't necessarily need that in order for a metric tree to operate on its own. And I should make this point because it is also a question of purpose here. Right? A metric layer's purpose is almost self-evident. Right? You define the metric, then you ask for the metric, and it gives you the metric, and that ends the purpose. But with a metric tree, if you simply hydrate the metric tree, that is not sufficient, in my opinion, to drive value. It's just a visualization of a bunch of metrics in a canvas with a bunch of boxes. Right? I don't think the value is realized unless you're building these analytical functions I spoke about, unless you're building a real workflow around it. So I do think there's a lot more to be built around metric trees than what the metric layer ecosystem has today.
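The revenue example above can be sketched in code. This is only an illustrative sketch, not how Trace or any other product represents a metric tree: the `Metric` class, the node names, the purely multiplicative formula (riders × rides per rider × average price × share kept after promotions × take rate), and all the numbers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from math import prod
from typing import Dict, List


@dataclass
class Metric:
    """A node in a metric tree: a leaf input, or the product of its child metrics."""
    name: str
    inputs: List["Metric"] = field(default_factory=list)

    def value(self, leaves: Dict[str, float]) -> float:
        # Leaves read their value from the supplied data; parents multiply their children.
        if not self.inputs:
            return leaves[self.name]
        return prod(m.value(leaves) for m in self.inputs)


# Hypothetical revenue tree for a ride-hailing business (assumed structure).
revenue = Metric("revenue", [
    Metric("gross_bookings", [
        Metric("riders"),
        Metric("rides_per_rider"),
        Metric("average_price"),
    ]),
    Metric("share_after_promotions"),  # 1 - promotion rate
    Metric("take_rate"),               # 1 - commission paid to drivers
])

# Made-up leaf values, purely for illustration.
september = {
    "riders": 1_000,
    "rides_per_rider": 4.0,
    "average_price": 20.0,
    "share_after_promotions": 0.90,
    "take_rate": 0.25,
}

print(f"revenue = {revenue.value(september):,.2f}")
```

In this sketch the leaf values are passed in as a plain dictionary. They could just as well come from a metric layer or from SQL run directly against the warehouse, which matches the point that a metric layer can hydrate the tree but is not required.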
[00:12:54] Tobias Macey:
Another interesting aspect of this question of metrics, particularly in juxtaposition to those dimensional schemas that have been the bread and butter of data warehousing for the past thirty years now, is if you're doing the dimensional modeling correctly, then why do you even need a separate representation of metrics? And it also introduces comparisons to things such as the differentiation between a data warehouse and an OLAP server as well as comparison to practices such as master data management as well as data governance policies and how that reflects to the technical representation and just the idea being that raw data is only useful if you contextualize it. The way that you contextualize it is to compare it to the questions of business process.
And then the canonical example of why you need effective definitions of metrics is that different people across the business have a different concept of what a specific phrase might mean, where "customer" has a different meaning to a marketing team versus a finance team versus a sales team. And so the business events that cause somebody to go from a prospect to a customer have different thresholds across those different use cases. And I'm wondering if you can just maybe wrap that all together and explain why just the practice of dimensional data modeling is not sufficient, or what are some of the aspects of that as a general practice and policy that leave gaps that necessitate the introduction of a new concept and a new technology layer.
[00:14:36] Vijay Subramanian:
Totally. I think it really gets down to the purpose question. The Kimball framework, the Inmon framework, they were all designed to thoughtfully build data models, with the use case being that a business is able to retrieve flexible calculations of metrics, right, inside some BI tool, for example. So give me revenue in New York City for Uber. Right? Give me rides. Give me riders. Give me September. Give me the rolling twenty-eight days. Cut it by UberX versus Uber share. All of these various cuts can be provided in a well-governed way, but that was the end purpose. The purpose is, can I generate these metrics, which is really what the metric layer is also, in a sense, trying to do. Right? If you look at it through that lens, the metric layer is just a formalization, on top of Kimball, of a structure that also captures the metrics themselves as a concept. So it's not just facts and events and dimensions. It's also metrics and dimensions. It's just one layer further up in terms of abstraction.
Whereas I think of a metric tree as really getting as close to the business model as you can. So back to the point of it being a data model that captures the business model. And now that is a back end on which we can build analytical functions that can directly affect the workflows of the work that's being done around data in the organization. So I don't think there was any drawback to Kimball. It's just that that framework was useful to generate metrics. And now we're talking about users who wanna activate the data, to use your phrase. They wanna use the data. They wanna get insights quickly. They wanna make decisions quickly. And how do we make all that happen? The answer is we gotta keep pushing the abstraction, keep pushing the frontier of modeling forward, and that's where metric trees really fit in.
[00:16:24] Tobias Macey:
And so digging into that purpose and the definition of the metrics and their relations and how that factors into the questions that the business is asking and needs answers to. What are some of the strategies that you have found most effective and that you recommend for teams who are in the early stages of trying to adopt metric trees and understand what are the definitions, how do they relate to each other, how do they ladder up from the very fine-grained events into the more abstracted, contextualized business questions that are being asked about the data that underlies those calculations?
[00:17:11] Vijay Subramanian:
That's a really interesting question because if you talk to business folks, and, you know, I helped a startup go from seed stage to IPO, so I worked with the business extremely closely, and you talk to them about what are the drivers, what are the input levers you have, what are the output metrics, the KPIs, and have this conversation, you will realize very quickly that they already have a rough mental model that kinda looks like a metric tree, if you were to think about it in that context. They wouldn't call it a metric tree. They definitely don't have it fleshed out and thought through and refined, but they're already thinking of the connections. They think of these metrics that they are tracking and how they're connected, and that's what they're trying to accomplish when they log in to a BI tool and look at a metric and then hop into another dashboard and apply a filter and look at that metric. So they're kinda dancing around these dashboards or spreadsheets, in a way tracing the connections between the metrics, because they know these are all laddering up to something important. Right? And there are some connections between them. So, really, the job to be done here is not anything radically new. In fact, it is pushing the frontier of how Kimball and Inmon and those classics thought about it: can we work with the business to define and flesh out how they really think about all the different metrics and the levers and how it all ladders up to the outcomes that they care about? Because, in a sense, the reason why the dashboards have proliferated is that the requirements come piecemeal. Hey, give me this cut of this data. Give me this cut of that metric. And then it starts to grow and grow.
But, really, there is a connective tissue underlying all of that, which is a framework in which they are thinking about what are the input levers, what am I moving, what metrics are moving in the output layer. So I think the job to be done is relatively simple to say, but kinda hard to do in an organization, which is to have a very active conversation between data teams and business teams and flesh out what are the metrics, how are they connected, what are your levers, what are your outputs, and keep refining that mental model as you learn more and more about the business. That's the job to be done. In fact, by the way, a financial P&L, which is probably one of the oldest data models ever built, is actually a metric tree. It just ladders up a bunch of concepts to derive net margin and EBITDA. It might be the OG metric tree. Right? A financial model, a financial P&L. And if you go to any FP&A team at any organization of reasonable scale, they have a sophisticated financial model, and they're operating on that because it describes how they make money, how cash flows. And so all we're talking about here is taking that model, taking that discipline, and ensuring that we can do that for all these different functions, sales and marketing and product and growth, even the exec team, and how they think about metrics. So just taking the discipline of modeling the business across all these various functions and operations, right, and making that come to life is really what we're talking about here, and working with the business on that.
[00:20:05] Tobias Macey:
You mentioned too that a dedicated metrics layer as a separate technology implementation is not a requirement for creating and using metric trees, and I'm curious what your overall recommendations are as far as the storage and technology implementation of the metric tree. Is it just another set of tables within an overarching data warehouse? Is there some other access layer or security layer? Or what are some of the considerations to go into as you're starting to think about, okay, well, metric trees as a practice seem beneficial. I'm going to go ahead and define what are the different calculations and how they roll up into the top level events that I care about for being able to then also support this drill down approach and just some of the overall planning and implementation details that go into actually creating and activating these metric trees in the business context.
[00:21:05] Vijay Subramanian:
I should mention upfront that we're very early in this, and we're gonna figure all this out as the market evolves. I think it's a very important question that you just asked because it gets to the utility question. Like, why are we doing this work? Why are we storing metric trees? What are we generating? How are we retrieving it? What are we gonna do with it? And I think it goes back to the point I made earlier about the analytical functions that we can operate on top of the metric trees to provide value to the business. Right? So through that lens, maybe the best way to think about this is to do an example again. Let's go back to the Uber example. Let's assume you had a metric tree of revenue with these terms. Right? Rides, ride frequency, average price, commission rate, etcetera. And let's say revenue is down 5%, and you wanna understand why, what is driving that. So, a, you first have to apply an analytical function to see which of these terms is driving that 5%. And so now let's assume that ride frequency is the driver, and that's explaining 4.3 of the 5%, hypothetically.
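One way such an analytical function could work, assuming the tree is purely multiplicative, is to split the percentage change across drivers in proportion to each driver's log-ratio. This is a common attribution technique chosen here for illustration, not a method stated in the conversation. The function name `attribute_change`, the driver names, and the numbers below are all hypothetical.

```python
from math import log, prod
from typing import Dict


def attribute_change(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Split the % change of a product of drivers across those drivers.

    For a multiplicative metric, log(after/before) is the sum of the drivers'
    log-ratios, so each driver gets a share of the total % change proportional
    to its own log-ratio. The contributions sum to the total % change.
    """
    total_ratio = prod(after.values()) / prod(before.values())
    total_pct = total_ratio - 1.0
    total_log = log(total_ratio)
    if total_log == 0.0:
        # No net change to attribute (offsetting moves are not split in this sketch).
        return {name: 0.0 for name in before}
    return {
        name: total_pct * log(after[name] / before[name]) / total_log
        for name in before
    }


# Made-up driver values for two periods.
before = {"riders": 1000.0, "rides_per_rider": 4.00, "average_price": 20.00, "take_rate": 0.25}
after = {"riders": 1000.0, "rides_per_rider": 3.83, "average_price": 19.85, "take_rate": 0.25}

for driver, pct in attribute_change(before, after).items():
    print(f"{driver:>16}: {pct:+.2%}")
```

With these inputs, revenue falls by roughly 5% and most of the decline is attributed to `rides_per_rider`, mirroring the hypothetical in the conversation. Trees that mix sums and products would need a different attribution rule.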
You may wanna go to the ride frequency and say, well, how does it vary by all these various markets? Right? New York, or in the US, or, you know, globally. And when you do that, you may also wanna know, is it because you have certain markets where the frequency is going down? Or is it because markets are shifting around, and a higher frequency market like New York is going down in market share and a lower frequency market is going up, and that shift alone is causing the overall ride frequency to go down? Right? So these are the things that the business wants to understand instantly. So take that question and apply it in terms of how you would think about storing and retrieving metric trees. A, you need to first define the core revenue tree and save that somewhere. Okay, this revenue tree has these terms. You need to be able to apply an analytical function to that to figure out which term is driving the change.
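The distinction between frequency falling within markets and the mix of markets shifting can be made concrete with a standard mix/rate decomposition. The sketch below is an assumption about how one might compute it, not a description of any particular tool. The function name `mix_rate_split`, the two markets, and their numbers are made up.

```python
from typing import Dict, Tuple


def mix_rate_split(
    share_before: Dict[str, float],
    freq_before: Dict[str, float],
    share_after: Dict[str, float],
    freq_after: Dict[str, float],
) -> Tuple[float, float]:
    """Split the change in overall ride frequency into a mix effect and a rate effect.

    Overall frequency is the share-weighted average of per-market frequency.
    mix  = change caused by rider share moving between markets (at old frequencies)
    rate = change caused by frequency moving within markets (at new shares)
    mix + rate equals the total change in overall frequency.
    """
    markets = share_before.keys()
    mix = sum((share_after[m] - share_before[m]) * freq_before[m] for m in markets)
    rate = sum(share_after[m] * (freq_after[m] - freq_before[m]) for m in markets)
    return mix, rate


# Hypothetical data: New York loses rider share, per-market frequency is unchanged.
share_before = {"new_york": 0.5, "other": 0.5}
share_after = {"new_york": 0.4, "other": 0.6}
freq_before = {"new_york": 6.0, "other": 3.0}
freq_after = {"new_york": 6.0, "other": 3.0}

mix, rate = mix_rate_split(share_before, freq_before, share_after, freq_after)
print(f"mix effect: {mix:+.2f} rides per rider, rate effect: {rate:+.2f} rides per rider")
```

Here the overall frequency drops from 4.5 to 4.2 rides per rider even though no market's frequency changed, so the whole decline shows up as a mix effect.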
Then take that term and mutate the tree, because you're now saying, okay, let me take the ride frequency and mutate it and further decompose it, if you will, into all the different markets, or maybe into UberX and Uber share. Right? So you wanna mutate it and then apply another function to further explain what is causing the frequency to go down. So you can see that if you really look at it this way, it is not sufficient to just generate the metrics and populate the metric tree. You really need to think of the metric tree almost as a back end.
The metric tree data model is almost like a back end to a software system in order to empower the workflows for the consumer. And I make this point because I have definitely seen people out there talk about metric trees, and you can see a visualization of a bunch of boxes of metrics and arrows, and they're populated. Yes, you can run SQL through them and populate these boxes. And there's some value to that, to seeing all the metrics in one place. You can think of it almost as a dashboard of dashboards. Right? Like, you have all the different dashboards in one dashboard. But I think the utility is fairly limited unless you're operating on it, unless you can apply analytical functions to it, unless you can mutate it, unless you can do things with it in an interactive system. So that's why I said metric layers can be useful as a mechanism to populate the data. But, ultimately, you have to treat the metric tree itself as a key data artifact that can be a back end of a larger system.
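Treating the tree as a data artifact that can be mutated might look like the following sketch, which breaks one term out by a dimension and returns a new tree while leaving the stored one untouched. The `Node` class, the `decompose` function, and the `name[segment]` naming scheme are hypothetical choices for illustration, not an existing API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    """An immutable metric tree node; leaves have no children."""
    name: str
    children: Tuple["Node", ...] = ()


def decompose(node: Node, target: str, segments: List[str]) -> Node:
    """Return a copy of the tree in which `target` is broken out by segment.

    The original tree is not modified, so the saved core tree stays intact
    while an analysis drills further down. Any existing children of `target`
    are replaced by the per-segment nodes in this sketch.
    """
    if node.name == target:
        return Node(node.name, tuple(Node(f"{target}[{s}]") for s in segments))
    return Node(node.name, tuple(decompose(c, target, segments) for c in node.children))


# Hypothetical core revenue tree, saved once.
core = Node("revenue", (
    Node("riders"),
    Node("ride_frequency"),
    Node("average_price"),
    Node("take_rate"),
))

# Drill into ride frequency by market, the way the walkthrough describes.
by_market = decompose(core, "ride_frequency", ["new_york", "chicago", "other"])
print([child.name for child in by_market.children[1].children])
```

An analytical function, such as an attribution over the per-market nodes, could then run on `by_market` while `core` remains the saved definition.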
[00:24:28] Tobias Macey:
To your point of the metric tree being something that can be used as a dashboard of dashboards, it also brings up the question of what are the systems that are going to be interacting with the metric tree? Is this just an optimization to the business intelligence dashboard where you're doing this retrospective analysis, or maybe you're using some sort of forecasting model to give projections in the dashboard, but, ultimately, it's a very static and manual process of interacting with that data? Or do you see metric trees as something that feeds into a broader range of data consumption use cases? And what are some of the ways that you're seeing metric trees feeding into the broader applications of data maybe beyond just these static dashboard focused use cases?
[00:25:15] Vijay Subramanian:
Yeah. That's sort of the key question here. Like, how are things gonna play out? The way I'm thinking about it again is, well, let's take all the workflows that users wanna do, and let's try to use the metric tree as a back end to empower these workflows. Can we automate them? Right? That's the ambitious state here. Can we automate this work? How much can we automate? So in that sense, yeah, it is almost like we're taking all the work that's being done manually today within BI tools or spreadsheets and all of these things, and we can build workflows around it.
Now, does that mean the metric tree will be a back end that feeds into existing tools, and they're gonna adopt that framework? Or is it that we need a new set of tools that will do these functions that we're talking about? Obviously, our vision is that we wanna be that place where you can do all those functions, but you never really know how it's gonna play out. There are already players out there, older tools. Take Mixpanel, for example, a tool that does web analytics and product analytics, and they are releasing metric trees into their framework. So, yeah, we just have to see how this all plays out, whether this is another input into another tool or whether this can actually be a net new thing on top of the BI infrastructure. Honestly, if you asked me for my grandest vision, I would say this probably supplants BI. Because BI as it's formulated today, which is a catalog of metric dashboards that can be created and disseminated, will be replaced by analytical functions that are directly embedded within the business user's workflow. And what does that look like? That's, I think, what metric trees can enable, and we have to see how that all plays out.
[00:26:56] Tobias Macey:
Because metric trees have that relational aspect to them as well, relational in the graph sense and not just the relational database sense, it invites comparison to things like GraphRAG, where the introduction of a graph and the connections and relationships that objects have to each other enhances the ability of generative AI models to perform better reasoning tasks across that underlying data asset. And I'm wondering how you're seeing the utility of metric trees as a means of context engineering and context management for these generative AI cases that are maybe doing some of that question answering, whether that's text-to-SQL or talk-to-your-data use cases, or if it's something where you're using those AI models to generate new derived data assets, using those metric elements as context to understand what the actual resulting context is that it outputs, beyond just being some scalar value.
[00:28:06] Vijay Subramanian:
That's a super exciting topic. You know, we've been chasing this holy grail of self-serve in data for a long time. And what's interesting to me is how self-serve has always been formulated as self-serve data. Right? You're going to have a business user be able to ask some question to retrieve some custom dataset, and this back-end system is going to find the right tables and do the right joins and aggregations and filters and write the SQL and give you back a dataset that satisfies your request. But the interaction is still, like, give me some data. And I can tell you, I've been in this space for two decades now, and the vast, vast majority of business folks really do not want to spend hours poring over datasets, trying to make sense of them, doing their own analysis.
They just want very contextual insights into what they need at that moment. You know, what is working, what's not working, what do I do next? So through that lens, I'm thinking that this text-to-SQL, chat-with-your-data kind of formulation of AI is very much in that same vein. Right? I can ask a question, and then AI is going and doing all this work, and it generates the SQL and gives you the data. But is that really the right problem to be solving? I'm really not convinced, honestly. Not to mention the complexities of doing this in a deterministic, reliable way and making sure that the numbers you're generating are accurate. Is that even the right problem to be solving? So the way I'm thinking about generative AI is, and let's just use the word agents, because we haven't used the word agents in almost half an hour in a podcast, and that's criminal. Right? So we gotta say AI agents. I'm thinking of AI almost as: you give it this graph, as you said, the graph which actually represents the business model. Here are these, like, 15 different ways in which the business thinks about its metrics and how they all relate to each other. And you give it access to the analytical functions we talked about. Can different agents blend these two concepts together and perform workflows for the business user? That is the vision that I'm actually pretty excited about: that AI is not necessarily in the business of going to the raw data and trying to write some SQL and generate some data, but it's actually using the framework of how the business thinks about metrics and the kinds of operations that need to happen, and cobbling these together to drive a workflow for the business user.
So that's sort of the vision that we wanna work towards, and I think that is probably the more exciting problem to be solving than, can we do some text-to-SQL and give the business users some datasets.
[00:30:36] Tobias Macey:
The other aspect of this is, as you mentioned, there is a purpose to this, but it's also not free. It requires work in terms of the initial definition and creation of these metric trees and the underlying metrics and the computation and derivation and storage thereof. It also, once you have it in place, requires ongoing maintenance and monitoring and validation as well as occasional pruning as either you discover that an initial metric that you thought was useful turns out to not be useful or the business evolves such that those metrics are no longer relevant to the questions that are being actively asked and sought after. And I'm wondering if you can just talk to some of that overall life cycle and workflow management of metric trees and maybe some of the places where you're seeing the responsibility lie in the organizational structure of the investment in and maintenance of those metric trees as a core organizational data asset?
[00:31:44] Vijay Subramanian:
Yeah. So I think the producer angle is probably a bit more clear and straightforward, because you can think of the metric tree artifact as an extension of the data pipeline, if you will. You have your ingest. You build your dimensional models, which we talked about. You have your facts, your dimensions. And while a metric layer will build maybe metric cubes, for lack of a better word, a metric tree is building a metric tree cube. It's generating all these different metrics and how they relate to each other, and that could be viewed as an extension of the existing data models that are being generated.
So through that lens, the creation and maintenance of these assets will follow sort of the same principles that have already been established in the data domain. You care about things like: is the data up to date? Is it fresh? Is it running reliably? Is it accurate? So the same ideas that we've been working on for the last decade will still apply to those assets. Now, again, what if I change the metric tree? What if I build a new metric tree? The same sort of ideas will apply. Now you have to generate a new asset. Can you borrow nodes from an existing metric tree into this new metric tree, as opposed to recreating the business logic again? The same ideas apply there because a metric tree is a graph. It has DAG-like properties. So if you're deriving another metric tree that can be derived from metric trees you already have, then you wanna be able to borrow those nodes so you don't have to rerun the data for those nodes.
So in a sense, it actually falls nicely within the work that we've been doing for the last decade and builds nicely on those principles.
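The node-borrowing point above can be sketched as memoized evaluation over a shared graph of metric definitions. This is a hedged illustration under assumptions: the `defs` dictionary, the `evaluate` function, the metric names, and the constant leaf values (standing in for real queries) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Each metric maps to (input metric names, function combining their values).
# Leaves take no inputs and return a constant, standing in for a SQL query.
Definition = Tuple[List[str], Callable[..., float]]

compute_calls: Dict[str, int] = {}  # how many times each node was computed


def evaluate(name: str, defs: Dict[str, Definition], cache: Dict[str, float]) -> float:
    """Evaluate a metric, reusing any node that is already in the cache."""
    if name in cache:
        return cache[name]
    inputs, fn = defs[name]
    args = [evaluate(dep, defs, cache) for dep in inputs]
    compute_calls[name] = compute_calls.get(name, 0) + 1
    cache[name] = fn(*args)
    return cache[name]


# Two small trees ("revenue" and "cost_per_order") share the "orders" node.
defs: Dict[str, Definition] = {
    "orders": ([], lambda: 500.0),
    "avg_order_value": ([], lambda: 40.0),
    "marketing_spend": ([], lambda: 5000.0),
    "revenue": (["orders", "avg_order_value"], lambda o, a: o * a),
    "cost_per_order": (["marketing_spend", "orders"], lambda s, o: s / o),
}

cache: Dict[str, float] = {}
revenue = evaluate("revenue", defs, cache)
cost_per_order = evaluate("cost_per_order", defs, cache)
print(revenue, cost_per_order, compute_calls["orders"])
```

Because both trees read `orders` from the same cache, the shared node is computed once, which is the same reuse property a DAG-based pipeline gives you for upstream models.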
[00:33:30] Tobias Macey:
Digging now into some of the challenges and edge cases of metric trees, it's definitely very easy to see the value when you have clearly defined hierarchies of events that roll up into one another or things that are easily attributable based on a cause and effect relationship. But in business, as in life, there are numerous situations where you maybe have a correlation but not a causation, but you still want to try to understand what are the correlated activities that roll up into some higher order insight. And I'm wondering if you can just talk through some of the ways that you think through how to manage some of those fuzzier relations of events and how they roll up into metrics and some of the nondeterministic, but at least tangentially related business events that you want to be able to report on.
[00:34:27] Vijay Subramanian:
Yeah. I mean, I do get that question a lot, and it's surprising how often it comes up in conceptual thinking. But when you go and work with the business, you realize that 95% of how they are operating the business is actually modelable through relationships, whether it's mathematical, whether it's through dimensional cuts, or what have you. So I would probably put that in the category of the known unknowns, to use the Rumsfeld phrase. There might be metrics floating around where you don't know how they relate to each other. By the way, they may or may not have a relationship to each other, and they might even be unknown unknowns. Right? Things that are related that you don't know about yet. The way we think about that is that the best humans can do at this point is put them in a tool and try to run, as you said, correlations and regressions and see if there are relationships there. If you can come up with a tool that automatically figures out causality, then I think we're approaching AGI. Right? Because you basically can have a metric tree that models the business, and there is this machine that's figuring out what is driving the business and starts moving the levers automatically. That would be an extremely exciting time to live in, but I don't think we're anywhere close to that right now.
So yeah. While metric trees can sort of hint at causality, as you said, we're very much focused at this point on modeling the known relationships. And the best we can do for the unknowns is to try to regress and come up with some metrics, for lack of a better word. But it's not, at this point, a tool where you can automatically just divine causality. Right? Because that's an incredibly hard problem.
[00:36:04] Tobias Macey:
There's definitely no magic bullet, as you said. And so from what I'm gathering of your response, there's no definitive way for you to be able to intuit those correlations. But building on top of the core principles of metric trees and their definition, you can create the relations as you go through the process of defining how they fit into the overall business processes. But I'm wondering if maybe there are some primitives in terms of how you think about the definition of metric trees that convey some of the level of uncertainty, or the lack of a strict deterministic relationship between two metrics, that can be reflected in the resultant analysis or the resultant display and presentation of that information.
[00:36:55] Vijay Subramanian:
Right. You're referring to the fact that we wanna be careful not to imply a relationship when the relationship may not be there. Right? Yeah. I mean, this happens a lot even when the finance team builds a financial model. They'll just go in and say, what if I move this cell? The output metric changes because you've expressed a relationship, but is that actually valid at all times? It may not be valid in the future. If you throw more money at paid marketing, it doesn't mean you're gonna grow revenue proportionally. You expect the conversion rates to go down. Right?
So what you may be hinting at, I would think about differently. I don't think it's that the relationship is completely unknown, or completely confusing or fuzzy. I think it's that the relationships are changing over time, that there may be boundaries within which they apply and boundaries where they may not apply. If you think about the example I was giving, as you spend more money, maybe you expect another metric to go down, and then there's a net result of that, which is your cost of acquisition. So I do think we can model those kinds of concepts in the product. Right? We can say, okay, we know that these two metrics ladder up to this output metric, and we know that these two metrics tend to move in inverse directions: as you spend more money, the other one goes down. But, by the way, is that true for every company in every regime? Probably not. In some regimes, you probably can spend more money and still grow successfully at the same profitable rate. So there are definitely some contextual business aspects of this that are not as straightforward as the mathematical expressions themselves.
But the concept that these metrics can behave in certain ways in relation to each other can be expressed. The key is to get the relationships modeled in the first place, I think, because that's what's not happening today. I don't think we're expressing relationships explicitly on paper for the organization to align and have a discussion around. And the metric tree is just a way to force that conversation. And I don't just mean that for things that are clearly related, like cost of acquisition and conversion rate. I mean it even for metrics that you only think are related. Right? If I improve my customer service response time, I'm gonna drive retention of my customers.
Those are fuzzy. There's no mathematical relationship. It's sort of implicitly there, and people will talk about it. But who actually goes in and takes those two metrics and actively runs a regression today? Not a lot of companies, actually. You'd be surprised. So a lot of the discussions happen in theory. They say, hey, I think this will be affecting that, so I'm gonna go work on this. But you don't actually express it and try to run something to validate or invalidate it. So, really, I just think of this ultimately as a way to start forcing these conversations, putting these things on paper. Let's actually see if there's a regression, even if it's not causation. Is there even a correlation between customer response time and retention rate? Let's see what the data says. So that's kinda how I think about it: this is just a forcing function to align the business and the execs and the data teams around the metrics that matter. And then the actual analytical functions and the causality and those hard things will just follow over time.
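The "let's see what the data says" step for response time versus retention can be as small as a correlation check. The sketch below is an assumption-laden illustration: the `pearson` helper is written out by hand, and the `response_hours` and `retention_rate` series are invented numbers, not data from any company discussed here. A correlation like this says nothing about causation.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly figures: average support response time (hours)
# and the retention rate observed in the same month.
response_hours = [2.0, 4.0, 6.0, 8.0, 10.0]
retention_rate = [0.92, 0.90, 0.85, 0.83, 0.78]

r = pearson(response_hours, retention_rate)
print(f"correlation between response time and retention: {r:.3f}")
```

With these made-up numbers the coefficient comes out strongly negative, which would be a reason to keep the edge between the two metrics in the tree as a hypothesis worth testing, not proof that faster responses cause retention.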
[00:40:05] Tobias Macey:
That brings up an interesting question too as far as the behavioral impact of having metric trees as a more declarative representation of the cause and effect of different business activities, and the ability to do some of that analysis, and also maybe how that fits into some of the patterns around experimentation, both in terms of technical systems doing things like A/B testing of different features, as well as organizational practices of saying, hey, we're going to experiment with maybe changing the allocation of our salespeople and the regions and how we define those regions, and seeing how that impacts the overall outcomes. And maybe, before you embark upon those experiments, doing some of that definition of, okay, we think that doing this activity is going to have this outcome. But before we bother making that change, let's go ahead and define these metric relations so that we can get a clearer before-and-after picture and see what is the actual causation and not just correlation.
[00:41:23] Vijay Subramanian:
That's an excellent point, actually. Right? Because the bleeding-edge companies that are operating on data, who can afford to run experiments at scale, do think about this all the time. What is driving what? Can we isolate the impacts? Can we run an A/B test? But only a small, select group of people are doing that today. And in a sense, I'm hoping the metric tree is a simple but more accessible and more intuitive way to enable the business and the data teams and exec teams to have this conversation.
And you don't necessarily need a cutting edge data science team doing experimentation for that. Right? It just becomes part of how they talk about everything as opposed to just give me this dashboard.
[00:42:13] Tobias Macey:
And in your work of building a business focused on metric trees as this core data asset and encouraging its use in organizational exploration of their practices, what are some of the most interesting or innovative or unexpected ways that you've seen metric trees used or implemented?
[00:42:40] Vijay Subramanian:
Yeah. When we started working on this, of course, we knew that the most obvious metrics would be time series metrics like revenue and users and whatnot, and that is still the case. But we have built a bunch of customer journeys, for lack of a better word. You take a cohort and you track how they evolve over time in order to ultimately ladder up to some lifetime value. We're modeling all the critical junctures of a customer and what the business actually wants them to do versus what they're actually doing, which, if you think about it, is really a very pragmatic way to think about levers. Because, ultimately, you're building something to facilitate a customer's existing demand or to try to change their behavior, on the margin or radically.
So I found that to be a pretty interesting use case of building a tree: tracking a customer journey over time. As businesses, we've all done those, but to see that in a tree format is pretty powerful.
[00:43:41] Tobias Macey:
And as you have been digging into this ecosystem and building a company around it, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:43:52] Vijay Subramanian:
Well, yeah. Honestly, I've been surprised by a few things. One is because I was a practitioner myself for a decade. As I talk to a lot of companies now, with all this focus on data and the investments that have gone into it, I'm surprised how many companies are still early in their journey on metrics maturity, for lack of a better word: knowing deeply how their metrics connect to each other, what the drivers are, and what the hypotheses around the drivers are. I've been surprised by that because I think it reflects my original point at the very beginning: so much focus has gone to, can we ingest all the data? Can we put it in one place? Can we make it accessible? And I don't think enough time has been spent on what the business model is. Are we capturing it? Are we understanding the drivers? I'm saying something very obvious, but I've been surprised by that conversation that we're having with organizations, and it has not even been a function of company size or scale. Right? Even with large companies, I'm finding that to be the case. The other thing that is surprising me is that I'll come across two companies that are very similar to each other, and they will talk about some overlapping metrics.
But they do think about these metric trees and these templates quite differently, which I found fascinating as well. It's pretty much the same business model, and I can clearly see one company is doing a better job of expressing it than the other one. You know? I can see the differences. So all this to say, I've been generally surprised at the maturity curve and the standardization, if you will, of metrics and how people think about them, because that's the lifeblood of data. Right? That's what data is really for: to capture the business. Which just means there's a lot of opportunity out there for organizations to level up and really spend their time on this problem, and also to level up in terms of building your metrics in a way that actually is best of breed for your business model. And it's not a discussion that happens a lot publicly. Maybe it's because companies believe they might have an arbitrage opportunity here. I don't know. But that part of the equation is just not discussed as much as we discuss what database we're using and what BI tool we're using. Right? So that's interesting to observe.
[00:46:02] Tobias Macey:
Yeah. I think it's also interesting as you're talking about the ways that businesses, even if they're effectively doing the same thing, are going to reflect their metric trees in different ways because they think about their business in different ways. And the lack of standardization or componentization of those metric trees also brings to mind the work that, for instance, companies like Segment were trying to do or are trying to do in the realm of things like customer data platforms, where you say, oh, you're a business that is working in ecommerce, so we're going to presume that you're using Shopify and HubSpot and Marketo or whatever. And so we're going to standardize the way that we ingest those datasets and the way that we generate the schemas, and we're going to try to create a one-size-fits-all solution for this segment of business or this style of industry. And over the many times and many versions of that effort that have happened over the past several decades, it never actually quite works, because everybody has a different way that they're thinking about their data. Maybe they're able to get some initial value from that out-of-the-box standardized set of visualizations and reports, but they always wanna ask different questions than their neighbor who's doing the same thing. And so it's just an interesting perspective and an interesting insight into the fact that even if it's the same business, it's not going to operate the same way.
[00:47:40] Vijay Subramanian:
Yeah. I think that's actually a really good point. So the question really is: is it because they have different strategies and tactics to drive what they're doing that the way they think about how the metrics connect to each other is different? Or is one of them doing something wrong and the other doing it right? I think this is the interesting question. And this is obviously independent of the data sources, which I think you were talking about: can you standardize the data models? That I can see being very complicated, because there are edge cases. You need two customers in the same business to be using exactly the same data sources end to end, and it's always the tail that gets you. Even if you use Shopify for your main reporting, there's probably this additional tool you're using for customer service that the other company is not, and now that whole premise that you can build a unified data model sort of breaks, I think. At least that's my 2¢ on it. Whereas on the consumption side, I'm actually intrigued. Why is it that two companies in the same business are thinking about it differently? Maybe it is the fact that they have different approaches to how they wanna grow, and both might be valid for the phases of the company that they're at. But I will also say I see a hierarchy for sure. It's not necessarily just that A and B are different. I do see that some companies are much more mature and thoughtful. They've thought about how their business works in a much deeper way than the other company. And, honestly, I don't think that has anything to do with tooling. I think it's a people thing.
I think it's just that the individuals there have been thinking about this, and they are thinking about things at a deeper level than the other organization, for whatever reason. That would be my humble hypothesis. I don't know if you have any other thoughts on that.
[00:49:18] Tobias Macey:
Yeah. I mean, I think it's also maybe just a factor of people wanting to be contrary and not wanting to be pigeonholed, and saying, no, even if you have these out-of-the-box reports and maybe they're useful, I wanna think about it differently. And so I'm going to do the work to customize it because I have my own opinions about it.
[00:49:37] Vijay Subramanian:
Right. So we're creating work to give ourselves value, is what you're saying. Okay.
[00:49:44] Tobias Macey:
And so as people become aware of and explore the idea of metric trees, what are the situations where you would say that that is the wrong abstraction and they should just lean on standard dimensional models, or maybe there is some other approach to determining that causation or determining the appropriate context for a given business process?
[00:50:09] Vijay Subramanian:
Yeah. I mean, wrong might be a strong word, but I've definitely come across conversations where it might be either overkill or just fundamentally not interesting or applicable. Maybe that's a better way of thinking about it for me. A few weeks ago, I had a conversation with someone who has an early-stage startup, and they are running a few hundred projects. These projects go through complicated steps to lead to an outcome in the end, where they realize their revenue in the physical world. And they were interested in the idea of modeling that through metric trees so we can identify bottlenecks and optimize the process. Yes, we could do that. That's certainly doable. But my gut reaction was, you have a hundred projects, just put that into a spreadsheet and pivot your way to glory. Maybe I was wrong, but that was my gut reaction. It's like, do you really want to build a metric tree? Maybe that was more a reaction to the effort it takes to get something up and running. I don't know what it is, but I felt that maybe it's overkill right now for you to do this. In terms of applicability, I think it really comes down to the business model itself. Right? I've definitely come across business models where, let's say, you're highly partnership driven and you have a few deals that drive your business. That's not the best use case even for metrics in the first place. Right? Even metrics are not that useful, fundamentally.
What are you tracking, really? And consequently, metric trees and metric layers are also not very useful. So you do need, I guess, some level of data volume and some level of breadth and complexity, such that the business can be measured through data, for it to start becoming useful. And that's where I think I've come across cases where it's probably overkill or just not applicable at all.
[00:51:48] Tobias Macey:
And as you continue to explore and popularize and define this concept of metric trees, what are some of the things you have planned for the near to medium term, either just as far as metric trees and how to think about and popularize them or the work that you're doing at Trace to make them operational?
[00:52:09] Vijay Subramanian:
Yeah. I have to go back to this formulation I had of these analytical functions. A big vector in our road map is to keep building out these functions and these use cases. It's one thing to do retrospective trend analysis. It's another to ask how am I doing versus my budgets. Right? That's a different kind of analysis; you're doing a variance analysis. It's another to compare how channel A and channel B are doing. It's another to report on experiment outputs and A/B test outputs. So we're really focused on these analytical functions or workflows or what have you. The other vector, which I'm genuinely excited about, and not just because everyone's talking about it, is the use of AI agents.
I keep thinking that if we have this abstraction that captures a business model, and it's already hydrated with data, it's already sitting there, and I have these functions that I know people want, it seems like the perfect tailor-made case for an agent to figure out how to combine them on demand for a business user's request. And maybe in the future, even connect to some activation tool, as you mentioned. Maybe it leads to some action in some tool, and that can all be orchestrated in one layer. In the long run, maybe there is no such thing as static dashboards, and everything is a fluid system of agents figuring out the business drivers and starting to push the users into taking action. So that's obviously a super exciting vision that I can paint, but making it reality is gonna be incredibly hard. So we are starting to take baby steps in that direction as well.
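As one illustration of an analytical function applied to a metric tree, the "how am I doing versus my budgets" variance analysis can be sketched for a single multiplicative node. This is a minimal sketch under assumptions: the `variance_decomposition` function, the "revenue = orders x average order value" node, and all numbers are hypothetical. It uses one common convention (volume effect at the budgeted rate, rate effect at the actual volume); other splits are equally valid.

```python
from typing import Dict


def variance_decomposition(
    budget_orders: float,
    budget_aov: float,
    actual_orders: float,
    actual_aov: float,
) -> Dict[str, float]:
    """Split the revenue variance (actual minus budget) into two driver effects.

    Assumes revenue = orders * average order value (AOV).
    """
    volume_effect = (actual_orders - budget_orders) * budget_aov
    rate_effect = (actual_aov - budget_aov) * actual_orders
    total = actual_orders * actual_aov - budget_orders * budget_aov
    return {
        "volume_effect": volume_effect,
        "rate_effect": rate_effect,
        "total_variance": total,
    }


# Made-up plan versus actuals for one period.
result = variance_decomposition(
    budget_orders=1000.0, budget_aov=50.0,
    actual_orders=1100.0, actual_aov=48.0,
)
print(result)
```

The two effects sum exactly to the total variance under this convention, so a business user can read the output as "more orders than planned, partly offset by a lower order value" without opening a spreadsheet.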
[00:53:45] Tobias Macey:
Are there any other aspects of this overall concept and application of metric trees that we didn't discuss yet that you'd like to cover before we close out the show?
[00:53:54] Vijay Subramanian:
Yeah. Maybe the one aspect that jumps out to me, and I may have made this point earlier, is that when I talk to folks in the wild who have never heard this term before and I start describing it, their first reaction is that it's a visualization. Right? We're gonna draw some boxes, and you can do that in a canvas tool. You can do that in Miro or Mural. You can draw boxes. Okay. Now let's say the boxes can be hydrated with some data. By the way, I've seen people do that in spreadsheets. You just draw some boxes, and then in that cell you have a SQL query running or something, and you have some data populated. They can do that, and then they ask: okay, let's say I do that. Let's say I visualize it. That's cool. Now I can have a dashboard of dashboards. I can see them all in one place. But then they start to hit the question of, what is the value? What is the long-term value? Is there sustainable value? These are all very important questions.
So I do wanna make the point again that our realization is that the visualization is interesting and useful as an alignment vehicle. But I think the real value is going to be when that can serve as a back end to a system where you can actually operate on the metrics and build workflows around them. That's what I'm really excited about. And I think that's not super obvious in the discourse around metric trees. It's just seen as, oh, yet another visualization. We have tables, we have charts, and now we have a graph. And I'm like, yes, that is true on the surface, but I think there is a lot more to it, and there's a lot more complexity to making that a real thing in organizations, and that's why we're doing what we're doing. Otherwise, why would I be building a company around it? So yeah.
[00:55:38] Tobias Macey:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today?
[00:55:54] Vijay Subramanian:
Well, I mean, no surprise here. I feel that we have spent enormous effort, and will continue to, on how we ingest data and clean data and build pipelines and monitor them and maintain them. To me, I hope the next decade is really about how we activate, which is a word I love, by the way. I think I'm gonna borrow that. How do we activate data? How do we make it operational? How do we make it useful? Can we build an operating system for business folks? Right? So it's not just self-serve data, but they can actually do workflows around data. So that's the thing that I think is the biggest gap. That's why companies build data teams and invest millions of dollars, and the execs still feel frustrated, honestly.
Because I think that's the gap that I don't think people are quite able to articulate. And I also think the problem is interpreted incorrectly. The execs think, like, we built all of this, now data should just tell me what to do, which is not how data works. Right? It's a reflection of reality. It's not a cause, is it? In the magic world where it can causally find the drivers, then you're reaching AGI. Right? So I don't think that's data's job today. I think it's to reflect the business and help you be smarter and more thoughtful in how you operate the business and the levers. So I think the biggest gap is how we make it useful and operational, where people feel the value day in and day out. And, honestly, maybe to a point where it becomes almost invisible. Can we aspire to a world where you don't have these dashboards of numbers and cells floating around and it's just invisible in the way the business operates? That, I think, is the goal I want us to work towards. I mean, is it achievable in our lifetimes? I don't know, but that's what I wanna aspire to, where data feels as invisible as software and AI or whatever framework du jour we're dealing with today.
[00:57:41] Tobias Macey:
And I think too, on that point of breaking out of the static dashboard mindset, I've also been seeing some recent work on moving that analytical question-asking and answering workflow into the communication platforms of the company, where the data analysis happens through agentic workflows in the context of Slack. You can talk to an agent and say, hey, I'm curious about the sales numbers from last week and how they correlate with the number of calls that my salespeople made, or what have you. And then the agent is able to, using some of these contextual cues, such as the work being done with metric trees, retrieve that information, generate an analysis, summarize it, and provide a visualization on demand, in a fashion that also invites other people in the business to participate. Dashboards are more of a single-player option: everybody can look at the same dashboard, but they're doing it in isolation unless they all happen to be in the same room at the same time talking about it. Bringing that into a more contextualized and conversational workflow, I think, is an interesting evolution that we're going through now as well.
[00:59:01] Vijay Subramanian:
Totally. Yeah. I think that's a great way of thinking about it. You wanna go away from static dashboards, and you want dynamic, on-demand computations and insights. You do need that. You do need a back end that is going to be powerful, and the back end has to understand the business. It has to understand the analytical operations, or functions, as the term I've been using internally in my company. And if all of that comes together, then, yeah, the UI is just a slick interface where you ask a question and you get back a very powerful answer that you don't have to dig five levels deep to understand. So yeah, absolutely. I mean, I'm seeing it happen as well. I think that's probably what it'll eventually morph into, and you can then just tell your chat system to go do something as well, and it goes and does something in response to that. So, absolutely, all exciting stuff. But we have to make it a reality, and we have to make it reliable, right, so that it can actually do these things correctly.
[00:59:53] Tobias Macey:
Well, thank you very much for taking the time today to join me and share your thoughts and experiences around this idea of metric trees and the work that you're doing to help support them. It's definitely a very interesting new addition and a new style of data asset that I think will be very valuable. So I appreciate all the work you're doing on that, and I hope you enjoy the rest of your day.
[01:00:15] Vijay Subramanian:
Thank you so much. It's a pleasure to be here chatting with you.
[01:00:25] Tobias Macey:
Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email host@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Intro and guest setup: Metric trees for adaptive analytics
Vijay’s background and witnessing the modern data stack evolve
From Hadoop to Snowflake to BI: producer focus vs. consumer needs
Defining metric trees: data model that captures the business model
Purpose and promise: automating analytics via metric templates
Metric trees vs. metric/semantic layers: Uber revenue example
Why dimensional modeling isn’t enough: pushing closer to decisions
Adopting metric trees: partnering with the business on drivers
Implementation considerations: storing, mutating, and operating on trees
Beyond dashboards: workflows, tools, and the future of BI
Metric trees as context for AI agents and operational insights
Lifecycle and ownership: building, maintaining, and reusing nodes
Handling ambiguity: correlations, causality, and changing relationships
Experimentation and behavior change: making hypotheses explicit
Use cases and surprises: customer journeys as trees
Market insights: metrics maturity and divergent templates
Standardization limits, strategy differences, and people factors
When metric trees are overkill or inapplicable
Roadmap: analytical functions and agentic automation
Not just a visualization: trees as a backend for workflows
Closing: biggest gap—activating data beyond ingestion