Summary
Collecting and cleaning data is only useful if someone can make sense of it afterward. The latest evolution in the data ecosystem is the introduction of a dedicated metrics layer to help address the challenge of adding context and semantics to raw information. In this episode Nick Handel shares the story behind Transform, a new platform that provides a managed metrics layer for your data platform. He explains the challenges that occur when metrics are maintained across a variety of systems, the benefits of unifying them in a common access layer, and the potential that it unlocks for everyone in the business to confidently answer questions with data.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- You listen to this show to learn about all of the latest tools, patterns, and practices that power data engineering projects across every domain. Now there’s a book that captures the foundational lessons and principles that underlie everything that you hear about here. I’m happy to announce I collected wisdom from the community to help you in your journey as a data engineer and worked with O’Reilly to publish it as 97 Things Every Data Engineer Should Know. Go to dataengineeringpodcast.com/97things today to get your copy!
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Are you bored with writing scripts to move data into SaaS tools like Salesforce, Marketo, or Facebook Ads? Hightouch is the easiest way to sync data into the platforms that your business teams rely on. The data you’re looking for is already in your data warehouse and BI tools. Connect your warehouse to Hightouch, paste a SQL query, and use their visual mapper to specify how data should appear in your SaaS systems. No more scripts, just SQL. Supercharge your business teams with customer data using Hightouch for Reverse ETL today. Get started for free at dataengineeringpodcast.com/hightouch.
- Atlan is a collaborative workspace for data-driven teams, like Github for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3,000 on an annual subscription.
- Your host is Tobias Macey and today I’m interviewing Nick Handel about Transform, a platform providing a dedicated metrics layer for your data stack
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Transform is and the story behind it?
- How do you define the concept of a "metric" in the context of the data platform?
- What are the general strategies in the industry for creating, managing, and consuming metrics?
- How has that been changing in the past couple of years?
- What is driving that shift?
- What are the main goals that you have for the Transform platform?
- Who are the target users? How does that focus influence your approach to the design of the platform?
- How is the Transform platform architected?
- What are the core capabilities that are required for a metrics service?
- What are the integration points for a metrics service?
- Can you talk through the workflow of defining and consuming metrics with Transform?
- What are the challenges that teams face in establishing consensus or a shared understanding around a given metric definition?
- What are the lifecycle stages that need to be factored into the long-term maintenance of a metric definition?
- What are some of the capabilities or projects that are made possible by having a metrics layer in the data platform?
- What are the capabilities in downstream tools that are currently missing or underdeveloped to support the metrics store as a core layer of the platform?
- What are the most interesting, innovative, or unexpected ways that you have seen Transform used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Transform?
- When is Transform the wrong choice?
- What do you have planned for the future of Transform?
Contact Info
- @nick_handel on Twitter
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
- Transform
- Transform’s Metrics Framework
- Transform’s Metrics Catalog
- Transform’s Metrics API
- Nick’s experiences using Airbnb’s Metrics Store
- Get Transform
- BlackRock
- Airbnb
- Airflow
- Superset
- Airbnb Knowledge Repo
- Airbnb Minerva Metric Store
- OLAP Cube
- Semantic Layer
- Master Data Management
- Data Normalization
- OpenLineage
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Have you ever woken up to a crisis because a number on a dashboard is broken and no one knows why? Or sent out frustrating Slack messages trying to find the right dataset? Or tried to understand what a column name means? Our friends at Atlan started out as a data team themselves and faced all this collaboration chaos. They started building Atlan as an internal tool for themselves. Atlan is a collaborative workspace for data driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets and code, Atlan enables teams to create a single source of truth for all of their data assets and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker, and more.
Go to dataengineeringpodcast.com/atlan today. That's a t l a n, and sign up for a free trial. If you're a data engineering podcast listener, you get credits worth $3,000 on an annual subscription. When you're ready to build your next pipeline and want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's now even easier to deploy and scale your workflows or try out the latest Helm charts from tools like Pulsar, Pachyderm, and Dagster. With simple pricing, fast networking, object storage, and worldwide data centers, you've got everything you need to run a bulletproof data platform.
Go to dataengineeringpodcast.com/linode today. That's l i n o d e, and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. You listen to this show to learn about all of the latest tools, patterns, and practices that power data engineering projects across every domain. Now there's a book that captures the foundational lessons and principles that underlie everything that you hear about here. I'm happy to announce I collected wisdom from the community to help you in your journey as a data engineer and worked with O'Reilly to publish it as 97 Things Every Data Engineer Should Know. Go to dataengineeringpodcast.com/97things today to get your copy.
Your host is Tobias Macey. And today, I'm interviewing Nick Handel about Transform, a platform providing a dedicated metrics layer for your data stack. So Nick, can you start by introducing yourself? Yeah. Thanks for having me, Tobias. I'm a big fan of the show. So I am the cofounder and CEO of Transform. And do you remember how you first got involved in data management?
[00:02:41] Unknown:
Yeah. So for me, I originally studied math and then joined BlackRock out of college. And so I was, you know, working on a bunch of different technologies that I think now would be considered legacy tooling, but learned a lot about just, you know, how BlackRock was using various macroeconomic datasets to build models and do analysis on some of their portfolios. And so from there, it kind of progressed towards wanting to do things in kind of, I'd say, more modern tooling. And so started exploring different opportunities and moved over to Airbnb in 2014, originally as a data scientist.
And this was kind of a golden era of Airbnb's data team. There was a bunch of investment in tooling like Airflow and then Superset, the experimentation platform, the knowledge repo, just a bunch of great tools. And so, you know, kind of progressed from there.
[00:03:42] Unknown:
My understanding is that your work at Airbnb and experiencing the work that they were doing with their metrics layer was some of the inspiration for what you're building at Transform now. So I'm wondering if you can just give a bit of the backstory of how you ended up where you are now and what you're building at Transform.
[00:04:00] Unknown:
I actually joined a few weeks before Airbnb released the very first version of its metric store. It was called metrics repo, and it was actually within the experimentation tool the company was building. So Airbnb was going through this shift of kind of being a very design led company to being a both design plus data led company. And as a part of that, it was really investing in tooling around product experimentation. And so I had joined the growth team and the, you know, primary job that I had as a data scientist was to run experiments. And when I first joined, I was, you know, really just a bottleneck to the product team that I was on because it took me so long to run analysis on each of these individual experiments. And this tool came around that basically just made it easy to define the various metrics that I wanted to use and do analysis on my experiments with and just built out the pipelines to then serve those metrics to this kind of experimentation readout.
And so, you know, very quickly went from running an individual experiment a week to running tens of experiments at the same time and actually getting to dive a lot deeper into the interesting parts of them because all of the metrics and all of the different basic analysis, the kind of stats testing and whatnot was served to me in this nice clean readout. And so over time, Airbnb invested more and more in that framework. And, you know, originally, it really kind of served the use case of experimentation, but data scientists started to see that there were different applications. And so my, you know, very naive approach was to start running fake experiments and generating metrics out of this, kind of automated data pipelining tool to then pull into analysis. And then later on, it evolved into the tool that is now Minerva, which Airbnb talks a lot about. In the context of
[00:06:04] Unknown:
analytics and data platforms, I'm wondering if you can just share your definition of what a metric actually is and some of the ways that they manifest throughout the data life cycle.
[00:06:16] Unknown:
Yeah. So a metric is a bit of an abstract concept. And to make it a bit more concrete, I might dive into a specific example from Airbnb. So one of our key metrics was nights booked. It was kind of the North Star metric for the whole company. Every team tracked it. Every experiment run at the company was either trying to impact it or make sure that it didn't impact it and get something else done. And so that metric actually makes sense in a bunch of different contexts. So, you know, it makes sense as how many nights booked were there by country, by listing type, by super host status, by whether it was a tree house or not. These things are called dimensions, and they bring context to numerical data. And so being able to aggregate that metric to many different dimensions is really powerful, and there's a clear relation here to OLAP cubes.
And data engineers and, you know, today kind of more and more analytics engineers are responsible for building these nice clean interfaces into the data warehouse for broader business consumption. And so by, you know, capturing these definitions for metrics in a somewhat abstract way and then being able to flexibly build them to various different dimensional levels, we can, you know, serve these nice clean datasets to the company that then allow less technical users to consume them. You know, we've seen a bunch of different solutions here around kind of summary tables or just, you know, queries existing in a bunch of different downstream tools from BI tools to really, really a wide range of different places where people want to consume metrics.
And so the point of this definition of a metric in our framework is to then be able to both build those datasets in the warehouse and also build them in downstream tools consistently.
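As a rough sketch of the idea Nick describes here, the same metric can be aggregated to many different dimensions. The records and field names below are invented for illustration, not Airbnb's actual schema:

```python
from collections import defaultdict

# Hypothetical booking records; field names are illustrative only.
bookings = [
    {"country": "US", "listing_type": "apartment", "nights": 3},
    {"country": "US", "listing_type": "treehouse", "nights": 2},
    {"country": "FR", "listing_type": "apartment", "nights": 5},
]

def aggregate_metric(rows, measure, dimension):
    """Aggregate one measure to one dimension: the same metric, many cuts."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

print(aggregate_metric(bookings, "nights", "country"))
# {'US': 5, 'FR': 5}
print(aggregate_metric(bookings, "nights", "listing_type"))
# {'apartment': 8, 'treehouse': 2}
```

An OLAP cube is effectively this aggregation precomputed for every combination of dimensions; a metrics layer keeps the single definition and produces whichever cut is requested.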
[00:08:05] Unknown:
One of the other pieces of terminology that I've encountered that is reminiscent of what we're discussing here with the idea of metrics is the concept of master data management, where you have this one golden table that says, if you need to be able to query against nights booked, to use the example that you gave, then you query against this table because we did the calculation ahead of time for you. And I'm wondering if you can just draw some parallels between some of the ways that master data management has been done historically and some of the challenges that it poses and what you are working towards with Transform to enable this more sort of flexible category of metrics that can be calculated sort of at query time.
[00:08:50] Unknown:
Yeah. So there's this really interesting history of, you know, semantic layers in general. And there are, you know, a wide range of takes historically, whether they existed inside of business intelligence tools or they existed, you know, within kind of data warehousing type solutions. And the point of this tool is really to kind of pull that out and separate it from the various pieces of infrastructure that are either storing or applying compute to data, and then all of the different places where people want to consume metrics.
[00:09:24] Unknown:
As you said, there have been a few different generational shifts, with the idea of the metric store being the most recent one, and one that's been gaining a lot of attention at least in the past few months that I've been seeing it popping up. And I'm wondering if you can just talk through some of the ways that those different semantic layers have been managed and some of the challenges and complexities that teams face when trying to create and manage the context and the semantic meaning around data
[00:09:54] Unknown:
and sort of what you see as driving the shift towards this dedicated metrics layer? Yeah. So it might help to kind of back up and define what a metric store is and then kind of dive into the various takes. And so I see a metric store as really these four pieces. The first is the semantics for how you capture the information. And it seems relatively simple. It's, you know, various tables in the data warehouse, and they have connections or relationships to each other. But actually, it's probably one of the most important pieces, and it's something that Airbnb iterated on for years.
And it's also quite hard to change once you start capturing information, so kind of moving between different ways of capturing the semantic information is a challenging evolution. The second piece is really around performance. And that's kind of getting at this question that I think you're asking around, you know, static versus dynamic. Are you building the datasets in the data warehouse, building them to some kind of location that can serve them really quickly? Or are you asking on the fly for some kind of metric denormalized dataset to get constructed?
The next two are really kind of how you are exposing that data to the rest of the company. And so the third piece is governance. How do you apply life cycle management? How are you managing the definitions of these metrics? And the last one is interfaces. How are you exposing these metrics to all of the different places where they're getting consumed? And so, you know, when I look across various tools that exist, I think the techniques that they're applying can largely be bucketed into those four categories, and there's varying levels of investment in each of those. And so, you know, I think that there are quite a number of tools out there that solve problems in each of those spaces.
But the kind of metric store in my mind is a holistic solution to
[00:11:57] Unknown:
how am I consuming data off of the data warehouse to how am I making sure it's right and getting it into the various tools where it needs to get consumed from. As you're saying, historically, there have been a few different approaches to solving different pieces of the problem where a lot of it will live maybe in the business intelligence tool where there's a way to add context to a particular calculation. But then if you need to be able to use that same calculation in a Spark job, for instance, then there's no clean way to be able to access that because it doesn't live in a place that Spark can easily get to without reaching into the metadata database for the business intelligence tool.
And I'm wondering, what are some of the potential negative impacts of having slight differences or inconsistencies in how these metrics are calculated and maintained, and differences in life cycle that can come about if you think that you have sort of replicated a metric, you know, accurately in two different places, but then later find out that maybe you, you know, flipped an operation or changed an order of operations somewhere, and all of a sudden you're wildly divergent.
[00:13:07] Unknown:
Yeah. Yep. Exactly. And so the challenge here is, how do you in this process of doing denormalization, once you have, you know, these nice clean normalized models sitting in your data warehouse, how are you then going and kind of consistently building the datasets that you wanna consume in all of the different places that you wanna consume them. And so, you know, there are a lot of different negative consequences, but I think that it kind of all boils down to lost trust in data and a lack of productivity amongst the kind of data consumers.
Can they easily access the metric that they're trying to consume, and do they trust others when they say they have some kind of insight? When I joined Airbnb, there were three definitions of the company's North Star metric, bookings. And so, you know, the big challenge there was that different teams would come to a meeting and say they saw this thing happening in the business, and then there would be some disagreement. And ultimately, what it would boil down to is two data analysts staying after that meeting and just hashing out, you know, specific nuances of the SQL that they had written. And so it was an incredibly inefficient process, but worse, it, you know, led to the higher ups coming to those meetings to just say, I'm just gonna use, you know, intuition here. Let these data analysts figure this out, and then, you know, next time, we'll come back and look at the data.
[00:14:31] Unknown:
And in terms of what you're building at Transform, what are sort of the primary goals that you have for the platform and the target users that you have in mind as you're building out the overall system and the user experience design and the integration points?
[00:14:49] Unknown:
So the company's mission is to make data accessible. And the philosophy, you know, behind how we're going to do that is that there needs to be better interfaces for data producers and data consumers, broadly bucketed, to communicate with each other. And so, you know, our hypothesis is that a metric is a really great interface, because in some ways, it's the language that nontechnical users, you know, use to then communicate around data. And so, you know, this all starts with establishing a definition in our metrics framework, and then exposing that broadly to be, you know, both computed, but also to kind of share that definition and that metadata with a wide range of tools. And so that's where our APIs and our metrics catalog kind of come in. And then on top of that, there are kind of a bunch of different ideas for how we can use those metric datasets to do interesting things.
So there are, you know, ideas around forecasting and anomaly detection and, you know, applying annotations to metrics and building datasets for experimentation and, you know, really just kind of pushing metrics into the various places where people can then make use of them. So the two users of the tool in my mind are kind of these, you know, broad buckets of data producers and consumers. And I think to get a little bit more granular on, you know, data producers, it's some combination of data engineers, analytics engineers, data analysts. They're the people who build the normalized datasets in the warehouse and have a hypothesis around how they should be consumed by the broader company. On the consumer side, you know, probably about 97% of most companies are not, you know, data workers. They're not data analysts or data engineers.
And so really, I think, you know, these metrics should be consumable much more broadly. And so that means building nice interfaces that allow them to then consume those datasets or to pull them into the interfaces that they know and like to consume datasets from. And so in order to accomplish that, the metrics framework, which is really aimed at that data producer, is, you know, a framework that's built around YAML and SQL. It's committed to Git in order to do version control, and then that publishes these metrics through our catalog, which is kind of the first demonstration of the power of some of our APIs.
And, you know, hopefully, that catalog makes it easy for this data consumer to then kind of ask basic questions. Show me this metric sliced by this dimension. And then beyond that, you know, there are a bunch of different interfaces that data producers also want to expose their datasets in. And so we publish a number of different APIs that can then connect to anything from business intelligence tools and Jupyter Notebooks to, you know, GraphQL and React, which allow front end developers to build on top of Transform.
[00:17:58] Unknown:
And can you dig a bit deeper into the way that the platform is architected and some of the system design considerations that you had to deal with as you were building out the initial versions of the platform and some of the ways that it has grown and evolved since those initial prototypes?
[00:18:15] Unknown:
The core of this platform is really this semantic layer, where the data producer is defining these YAML files. And these YAML files have some amount of kind of SQL expressions in them. But really, the most important part are the abstractions that we've chosen for how to capture this information and what those abstractions enable. And so those files then get parsed by the semantic layer. And we have a server which then basically builds SQL against the customer's underlying data warehouse. So everything that we do is built on top of the customer's data warehouse. We use their existing storage and compute, but we can kind of do two deployments because of that infrastructure.
So one is where we're actually deploying on their virtual premise. That means that they are connecting their data warehouse to Transform. It's all staying in their ecosystem, and they kind of get all of the security guarantees that they want. The other option is a hosted version where we're basically just building SQL to their data warehouse and not actually passing any data back to our ecosystem. So in terms of the specifics of what's built out, the metrics framework that we use is written in Python. The front end is TypeScript, GraphQL, React, and then the APIs are written around a GraphQL core.
But, really, there are, you know, any number of interfaces that we can build on top of that, and that would be in whatever language that's being consumed by. So our command line interface and our Python client are both built in Python. Our JDBC is built in Java, and then our front end is built using the same React components and GraphQL interface that we are then exposing to our customers so they can build on top of those same APIs.
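To make the compile-to-SQL step concrete, here is a minimal sketch of a semantic layer that turns a stored metric definition into warehouse SQL on request. The definition format, table, and column names are invented for this example and are not Transform's actual MQL syntax:

```python
# Toy semantic layer: one stored metric definition, compiled into
# warehouse SQL on demand. All names here are hypothetical.
METRICS = {
    "nights_booked": {
        "table": "fct_bookings",
        "expr": "SUM(nights)",
    },
}

def compile_metric_sql(metric_name, dimension):
    """Build a GROUP BY query for a metric sliced by one dimension."""
    m = METRICS[metric_name]
    return (
        f"SELECT {dimension}, {m['expr']} AS {metric_name} "
        f"FROM {m['table']} GROUP BY {dimension}"
    )

print(compile_metric_sql("nights_booked", "country"))
# SELECT country, SUM(nights) AS nights_booked FROM fct_bookings GROUP BY country
```

Because every interface goes through this one compile step, changing the definition in one place changes the generated SQL for every consumer.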
[00:20:05] Unknown:
In terms of the actual workflow of building a set of metrics and then consuming it downstream, what's involved in actually defining a metric, populating that into Transform, validating that, you know, in terms of any sort of organizational discussion that needs to happen around that, and then being able to consume that from a downstream system, whether that's business intelligence or a Jupyter Notebook or a Spark pipeline, for instance?
[00:20:33] Unknown:
The actual definition workflow is, you know, typically done locally, and we have a command line interface that makes it easy to iterate on these config files, test them, you know, run variations of metrics that already exist or define new metrics. Then it, you know, follows kind of the standard code commit practices that the company is using. So those files will get committed to Git. Those, you know, once merged, would go to our MQL server, get parsed into the current active semantic layer. And then any API requests coming in would be made against that current semantic layer.
And so that means that, you know, our front end is then building on top of these current definitions. But another really cool thing about this is that if a metric definition changes in that semantic layer, then all of the different places that the company is referencing that metric, so through our JDBC over SQL, or through some notebook, you know, really any of the kinds of interfaces that they're consuming it would then be consistent because they're getting the current definition of that metric. The nice thing about this is that we're really building on top of the same interface that we're exposing to our customers, which means that once a metric is defined in this framework, it should be consistent across all of the different places that they're consuming it. And as far as the integration
[00:21:59] Unknown:
with the customer's data systems and data platform, you mentioned that Transform sits on top of the data warehouse layer. And I'm wondering what types of validation and introspection you need to do to be able to provide useful feedback to the engineers who are building the metrics definitions as they iterate on defining them and creating the code representation that they're then going to commit and populate into Transform.
[00:22:26] Unknown:
Yeah. So the core of this dev workflow is to basically be able to run this semantic layer against whatever set of configs you're using. And so, you know, the objective here is to really be able to iterate off of the current version of this kind of semantic mapping of the data warehouse and then to be able to use those configs in the same way that you would use the configs that are currently in production. And so it effectively gives the end user the same experience as if they were just querying the production MQL server.
[00:23:00] Unknown:
Because of the fact that you are targeting the data warehouse, I'm wondering if there are any challenges in being able to extend this layer or if it even makes sense to try and extend this layer to account for more semi structured or unstructured data storage locations or if it's purely something that only really makes sense on a
[00:23:23] Unknown:
data warehouse that already has some measure of structure applied to it? Yeah. Right now, you know, we're really focused on kind of the data analytics use case. And because of that, we're primarily building on top of the data warehouse as it exists and the structured datasets that are already there. I think that that probably satisfies the large majority of applications for metrics. And so I think, you know, it'd be probably good to understand what kinds of metrics are getting built off of unstructured or semi structured datasets to really be able to answer that question.
[00:24:00] Unknown:
In terms of the actual life cycle of a metrics definition, I'm wondering what are some of the interesting stages that it progresses through from when it's first instantiated and somebody determines that they need to create this calculation through to, you know, many years down the road where the business shifts and maybe the underlying meaning of the metrics change, or you need to be able to incorporate additional factors into how the metric is calculated or what the overall value should be? So this is a really interesting and important evolution
[00:24:34] Unknown:
of the framework that we saw at Airbnb. And, you know, for the first two years of this framework, there was really very little governance aside from the fact that it was being committed to Git. There was very little kind of oversight of what these metrics were and who was consuming them and how they were consuming them and which ones were old. And so there was a big push to basically think through what are the stages of a metric life cycle. And I think that, you know, there's been a lot of iteration, and Airbnb published some great blog posts about this, but we have our own definition. Our take is that there are four stages. So it starts with definition.
I have an idea of how I want to measure something. How do I define this? Is it different than the other metrics that exist in this framework? How do I compare it to existing metrics? How do I test it? Who do I want to consume this, and how do I want them to consume it? And that kind of leads into the second stage, consumption. If I want to consume this, am I using this right? Does it mean what I think it means? Is it up to date? Is it still accurate? Is the data good? And, you know, generally, am I able to pull it into the tools that I want to consume this from? The third stage is iteration.
So I think that this metric needs to change. Who needs to know? Why is it changing? How is it different than before? What's actually changed about this metric? How do I compare it to the old version? And how do I then, you know, in the UI or in kind of these APIs, generate the old version if for some reason I still need to do that? The final stage is archival. So if this metric is old, how can I stop others from consuming it? Where does it go? You know, do I still want to maybe calculate it at some point in the future, but I wanna make sure that nobody else is calculating it? And how do I retain the knowledge that's been built around this? So I, you know, don't want people necessarily to consume this, but I still probably learned some valuable things around this metric over time, and we used it to make decisions.
And so there's some kind of lasting institutional knowledge that's been created that needs to be tracked over time.
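The four stages can be sketched as a small state machine. The transition rules below are one plausible reading of the life cycle described here, not Transform's actual implementation:

```python
from enum import Enum

class MetricStage(Enum):
    DEFINITION = "definition"
    CONSUMPTION = "consumption"
    ITERATION = "iteration"
    ARCHIVAL = "archival"

# Hypothetical transition rules: a metric moves forward through the
# stages, and iteration loops back into consumption once a change ships.
TRANSITIONS = {
    MetricStage.DEFINITION: {MetricStage.CONSUMPTION},
    MetricStage.CONSUMPTION: {MetricStage.ITERATION, MetricStage.ARCHIVAL},
    MetricStage.ITERATION: {MetricStage.CONSUMPTION, MetricStage.ARCHIVAL},
    MetricStage.ARCHIVAL: set(),  # archived metrics stay archived
}

def can_transition(current, target):
    """Check whether a metric may move from one stage to another."""
    return target in TRANSITIONS[current]

print(can_transition(MetricStage.CONSUMPTION, MetricStage.ARCHIVAL))  # True
print(can_transition(MetricStage.ARCHIVAL, MetricStage.CONSUMPTION))  # False
```

Making archival a terminal state captures the point that retired metrics should stop being consumed while their definitions and history remain on record.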
[00:26:47] Unknown:
There are a few interesting points from that that are largely based on the organizational aspects of the metric, particularly in terms of who needs to know about this metric changing, who needs to be brought in to help with the definition of the metric or validate that the way that I'm calculating it is accurate. And I'm wondering if you can just talk through some of the collaboration aspects, what you're building with Transform and how you think about enabling these organizational workflows beyond just the technical implementation?
[00:27:19] Unknown:
So in our minds, the biggest challenge around, you know, helping an organization to define these metrics is really kind of creating that interface between the data consumer and the data producer. So I said this previously, but we really do believe that the metric is the ideal interface because it is currently the language that data consumers around a company are using to describe data and to understand it. And so by enabling the data producer to then go out and define these metrics and kind of follow some process, at the very least it establishes a standard for, you know, where it's located and how it connects to these various systems.
And so that enables an organization, I think, to build some of their own process around metric definition. And hopefully, you know, on the other side of this, there is a product that can then support the process that they're trying to build. And so I think that that is probably 1 of the biggest things that we will be working on in the future as we continue to expand our customer base is just understanding all of the differences between how these organizations are consuming metrics. And, you know, what that means for the actual process that they want to follow to make sure that those metrics are agreed upon and trusted across the organization.
[00:28:44] Unknown:
Your point about the metrics layer being the interface between data producers and data consumers puts me in mind of the feature store, which is another layer that's been gaining a lot of ground recently that acts as that same kind of interface point with the difference being that that's primarily for the machine learning workflow versus the analytics workflow that the metric store empowers. And I'm wondering if you have any thoughts on sort of the juxtaposition of the metrics store versus the feature store and the relative utility of metrics versus features and maybe some of the overlap that might exist where you might want to have some level of communication between your metrics and your feature stores and how those different calculations are defined and performed?
[00:29:27] Unknown:
That's a really great question and something that I kind of glossed over in my background was, for a while, I was working as a product manager at Airbnb, and the team I was working on was building out Airbnb's feature store, Zipline. And so at the core, I think these 2 things are very similar. But there are some really significant differences that I think make it a long way off of being a kind of similar piece of infrastructure that is gonna get built out. But at the core, you know, what they're doing is creating derived data and then serving that derived data to a specific application.
The really hard part here around the feature store is that there are much stricter requirements around the way that a feature is defined, and it tends to be a lot more granular. And that means that it doesn't necessarily serve the analytical application nearly as well where you want to be able to slice and dice and ask different questions. There are some other complicated ones around timeliness, you know, feature stores require some kind of melding of real time and batch data construction. Machine learning models tend to require something called point in time correctness or time travel.
And it's a complicated subject, but it's also something that, you know, is fairly different between analysis and feature construction. And then the last really big difference between the 2 is consumption and reuse. And so there are really strong forces within organizations that push metric consumption to be consistent. At the core, a metric is, you know, really just a way of kind of compressing a bunch of information that a company is collecting into something that's useful for decision making or analysis. And what that means is that broadly, you have companies that are trying to push for a consistent definition across teams, across individual data analysts.
It just makes the world simpler if everything is clean and consistent. And that's a really big difference compared to features because features can perform better in certain models, and sometimes you want many different iterations of the same features. And so, you know, the ways that I saw feature stores being adopted was primarily taking a feature, iterating on it, and then, you know, ending up with another variant to that feature. And that's not really something that you do with a metric or if you, you know, do that kind of analysis, it is through some dimensions, and it's not actually changing the core definition of the metric.
You're just kind of aggregating it to some different granularity.
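The point-in-time correctness requirement mentioned above can be illustrated with a small as-of lookup: when a feature store builds a training example for a given event time, it must return the feature value as it was known at that time, never a later one. This is a simplified stdlib-only sketch; the function name and data shapes are assumptions for illustration.

```python
from bisect import bisect_right


def point_in_time_value(history, as_of):
    """history: list of (timestamp, value) pairs sorted by timestamp.
    Return the latest value known at or before `as_of`, which avoids
    leaking information from the future into a training example."""
    times = [t for t, _ in history]
    i = bisect_right(times, as_of)
    if i == 0:
        return None  # feature not yet observed at that time
    return history[i - 1][1]


# A toy feature history: a user's lifetime booking count over time.
bookings = [(1, 0), (5, 1), (9, 3)]
```

An analytical metric query, by contrast, typically just aggregates over a window as of now, which is part of why the two systems diverge.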
[00:32:12] Unknown:
In terms of the granularity and dimensionality of the metric, I'm wondering if you can dig a bit deeper into some of the complexities that come up and some of the ways that somebody who's trying to build a metric definition can shoot themselves in the foot when they're trying to figure out how do I calculate this metric and then be able to actually explore it at sort of different levels of granularity and dimensionality and just some of the sort of technical and cognitive complexity that arises from that. When I think about the
[00:33:16] Unknown:
most, you know, complicated part of this tool, the most complicated technical challenge, I really think that it's denormalization. And so, you know, to kind of back up and just quickly define normalization and then denormalization. So, you know, normalization is defined as reducing data redundancy and improving data integrity. And so the goal there is to basically define these nice clean datasets that don't replicate data around the warehouse because then they're much easier to manage. There are a bunch of great tools that have come out, you know, more recently that have enabled companies to build better cleaned normalized datasets.
And there's been a ton of research in this space and a ton of discussions of different techniques of normalization, like Kimball and Inmon, etcetera. And so, you know, when I think about what do you do with the data from there? Well, that's really great that that data is clean, but then you need to go and make it useful. And in order to make it useful, you need to start merging datasets. You need to start, you know, applying filters and doing all of the different things that happen in SQL or in Python to kind of transform data. That's really where this framework is aimed at supporting our end users technically.
And so the input into our framework is typically these nice clean normalized datasets. And you can put in raw datasets and partially denormalized datasets. But really, you get a lot more out of this framework if you've gone through the work of building these nice, clean, and normalized datasets. And so from there, you know, denormalization is happening across so many different tools today. It's happening in the data warehouse where we're building summary tables. It's happening in the BI tool where we're asking some question. It's happening in dashboards where we've, you know, asked a bunch of specific questions.
And so what we really wanna be able to do is build those metrics to a wide variety of granularities consistently across all of those tools. And, you know, 1 of the biggest challenges there is what are you doing ahead of time and what are you doing on the fly? And so, you know, ideally, you want those datasets to be really snappy. Right? You want your BI dashboards to load quickly. But the more you've kind of baked into your tables in the data warehouse, the fewer questions you can ask. And so the power of these data modeling frameworks is that they enable you to ask a wide variety of questions while also consuming those datasets in all of the different places where you want to consume them, and hopefully it's making them much faster.
And so, you know, the kind of core technical challenge of our framework then is enabling denormalization to happen in all of these different places efficiently and consistently. And so in order to solve for that, we've worked on a bunch of different approaches to caching datasets and trying to make that end result, whatever the question is, whether it's something that that the end user has, you know, pre specified as a question that they ask frequently, or if it's a question that is new, trying to make that dataset as fast as possible.
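As a toy illustration of the denormalization step described here, a metrics layer joins a normalized fact table to its dimension tables and pre-aggregates a measure to a requested granularity. Everything below is an assumption for illustration; a real planner would emit SQL against the warehouse rather than aggregate in Python.

```python
from collections import defaultdict


def rollup(fact_rows, dim_lookup, measure, dims):
    """Denormalize and pre-aggregate: join each fact row to its dimension
    attributes, then sum `measure` grouped by the requested `dims`.
    fact_rows: list of dicts; dim_lookup: {dim_id: dict of attributes}."""
    totals = defaultdict(float)
    for row in fact_rows:
        attrs = {**row, **dim_lookup.get(row["dim_id"], {})}
        key = tuple(attrs[d] for d in dims)
        totals[key] += row[measure]
    return dict(totals)


# Hypothetical normalized inputs: a fact table plus a country dimension.
facts = [
    {"dim_id": 1, "amount": 10.0},
    {"dim_id": 2, "amount": 5.0},
    {"dim_id": 1, "amount": 2.5},
]
countries = {1: {"country": "US"}, 2: {"country": "FR"}}
```

The ahead-of-time versus on-the-fly tradeoff in the passage is about whether results like this rollup are materialized as summary tables or computed at query time.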
[00:36:32] Unknown:
And then as far as the actual platform integration, as far as the data source, it's fairly obvious that you connect up to the customer's data warehouse and use either, you know, ODBC or JDBC for that connection. And then on the other side, you know, you mentioned that you have these JDBC interfaces or you have GraphQL APIs. But for somebody who maybe connects it up to their business intelligence dashboard and then wants to run a query that uses data from their data warehouse and also factors in this metric, is that something where they would just pass everything through the transform layer, and then you would pull in your metrics definition and then also push down a query into the data warehouse and then join those 2 on their return flight back? Or what's the story of being able to query against the existing database tables and the calculated metrics?
[00:37:25] Unknown:
We basically have an API, and we call it MQL, Metrics Query Language. And it allows the end user to ask questions in the format of metric by dimension. And so you're asking for, you know, some metric aggregated to some dimension. And you can also apply filters and, you know, ordering and whatnot. But that API request can basically be expressed within a SQL query. So I can say from MQL, you know, metric by dimensions, and that will return to me some dataset that kind of comes in as metric and then the various dimensions that I've aggregated that metric to. And so I can then express that API request within some broader SQL query where I'm using the full power of the customer's underlying data warehouse.
So, really, what this is doing is it's just building a denormalized dataset on the fly and then querying that dataset and joining it or applying aggregations or transformations in whatever SQL the end user has expressed.
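A rough sketch of what that expansion could look like: a metric-by-dimensions request gets compiled into the GROUP BY query that is pushed down to the warehouse, and the resulting dataset can then be joined or further transformed inside the user's own SQL. The function and its MQL-like arguments are illustrative assumptions; Transform's real MQL syntax and query planner are not shown here.

```python
def mql_to_sql(metric_expr, table, dims, where=None):
    """Expand a metric-by-dimensions request into the SQL a metrics layer
    might push down to the warehouse. All names are illustrative."""
    select = ", ".join(dims + [f"{metric_expr} AS metric"])
    sql = f"SELECT {select} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    if dims:
        sql += " GROUP BY " + ", ".join(dims)
    return sql
```

For example, a "revenue by country" question becomes a single aggregation query whose output is a denormalized metric-plus-dimensions dataset.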
[00:38:36] Unknown:
So in some ways, it's kind of the inverse of a, you know, stored procedure or user defined function in that instead of you pushing a function definition into the database, you're pushing the database into the function definition.
[00:38:48] Unknown:
Yep. That's exactly right.
[00:38:51] Unknown:
You mentioned that the interface for the data producers is this code first YAML and SQL sort of combined format. And I'm wondering what your process was for deciding whether to go with a code first and code native approach versus more of a sort of low code or no code, UI driven framework for somebody who's maybe coming from the business side who wants to be able to define these dimensions and just what you see as the trade offs of having this sort of text based flat file definition versus a more UI driven approach?
[00:39:28] Unknown:
I think that there's probably a future where those files get pushed into a UI or an IDE kind of experience. I think we just wanted to start with an interface that gave us the maximum flexibility and ability to iterate. And so, you know, in the early days, when we kind of thought about that, what are the tools that our end users are using right now? Well, SQL and YAML are pretty widely adopted in kind of the data
[00:39:55] Unknown:
engineering and analytics engineering world. And so we wanted to kind of meet them where they were. Another interesting element of this emerging space is how much support there is in downstream tools, thinking particularly around things like business intelligence dashboards and, you know, other analytics frameworks for being able to introspect and understand the additional context that can be defined and exposed by the metrics layer as far as having a, you know, prose definition of, you know, this is what this metric is for. This is, you know, how you might want to use it, and this is, you know, some of the metadata about who owns it and who created it kind of thing. And what are some of the missing pieces of the overall data ecosystem that you hope to see filled in in the coming months and years as the metrics layer becomes a more established architectural sort of quantum?
[00:40:50] Unknown:
The challenge here for us is that the entire data ecosystem is really built around tables today. And it's not necessarily a significant challenge, but it is a missed opportunity. And so, you know, we can build tables off of our API requests. And by exposing this JDBC, you know, we can build datasets that make sense and share the metadata that we want over kind of whatever connection is coming in. But really what's kind of missing here is that you're not necessarily getting that rich experience that you get when you connect to an underlying database where you can browse the various tables and you can, you know, look at all of the different columns and kind of get some kind of summary information around it.
And so, you know, ultimately, I think that it's not necessarily a challenge for our end users to get that information because we can expose it as tables to them. And so if they want to look at a metric and look at the various dimensions that they could aggregate that metric to, we can share that with them. But it's coming in the form of a table and obviously to kind of conform to the world as it exists today. It's more about a missed opportunity to share that information and the kind of interesting information that can come with a semantic layer.
[00:42:12] Unknown:
In terms of having this semantic layer and this more sort of holistically defined and uniformly exposed method of creating and managing these metrics, what are some of the capabilities or projects or organizational capacities that are unlocked by adding this to the data platform that are either impractical or intractable otherwise?
[00:42:37] Unknown:
Just to start with the core value proposition, just consistent consumption of metrics in various tools. I think that it sounds obvious, but it really just doesn't exist at the majority of companies that we've talked to. It seems like it's 1 of the most universal challenges in the data stack right now. And then, you know, looking out to the future, I think that there are a number of different applications that are enabled if you have this information. So just thinking about the first 1 that really got me hooked on this type of tooling, product experimentation, when I was at Airbnb, I ran 150 experiments in something like 2 years.
And, you know, I was looking at 100-plus metrics on every single 1 of those. That is just not possible today. People don't have that kind of tooling broadly. You know, this is 1 of the core things that this enables. Beyond that, I have a lot of ideas for our product around this connection between forecasting and anomaly detection, annotations, and then notifications in context that can be pushed out to a company more broadly. A forecast is, you know, where do you think the metric is gonna go? An anomaly is when it's outside of, you know, wherever you think it's going to go. And then an annotation is kind of the addition of some context for whenever that metric moved outside of what you expected.
And then, you know, that's an important piece of structured information that can then be pushed out to an organization. And so I think that that is a very significant paradigm shift, where today, we're creating a lot of data objects where we expect data consumers, so business users, to come to a dashboard and pull some insight out of it. And that's a really, really tall ask. It's not just a tall ask because it's hard to get the data. It's a tall ask because, you know, having all of the context that's necessary to pull some interesting and valuable insight out of that data typically takes somebody who has kind of seen the data go end to end, to that place.
And so I think that we can create these really interesting interfaces beyond just the APIs and pushing the data out to actually add context to these metrics. And that kind of takes me to this last point, which is that a metric is an incredible vehicle for information. They're 1 of the most consistent objects in a company over time. They don't switch teams. They don't quit. You know, they are consistent and long lasting, especially if they're well managed. And so by actually tying knowledge to them over time, you have the potential to add a lot of context that, you know, I think people don't have in many of the organizations that they're working in. So just to kind of tie that down to a concrete example, it just happens so frequently in just about every organization that I was in that, you know, somebody asked me what happened on this specific date. I know you were at this company 3 years ago.
Help me kind of understand that. And, you know, oftentimes that information just gets lost. And I think that a metric is a really interesting unit to kind of carry that information forward.
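The forecast / anomaly / annotation chain described in this answer could look roughly like this, with a deliberately naive mean forecast standing in for a real forecasting model; the function name and tolerance band are invented for illustration.

```python
def detect_and_annotate(history, latest, tolerance=0.2):
    """Sketch of the chain: forecast the next value as the mean of recent
    history, flag an anomaly when the observed value falls outside a
    +/- tolerance band, and return an annotation string that could be
    attached to the metric and pushed out to the organization."""
    forecast = sum(history) / len(history)
    lo, hi = forecast * (1 - tolerance), forecast * (1 + tolerance)
    if lo <= latest <= hi:
        return forecast, None  # within expectations, nothing to annotate
    direction = "above" if latest > hi else "below"
    note = f"metric moved {direction} the expected range [{lo:.1f}, {hi:.1f}]"
    return forecast, note
```

In the paradigm shift the answer describes, the annotation (not a raw dashboard) is the structured piece of context that travels with the metric.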
[00:45:47] Unknown:
Yeah. There's definitely a huge risk of loss of context and loss of value in an organization when somebody who has that useful understanding and experience either changes roles or responsibilities or leaves the company entirely and doesn't actively document it. And so being able to have this as the long term artifact of somebody's experience, I can definitely see a lot of potential value from that. Yep. In terms of the users of the platform and customers who are starting to onboard with Transform, what are some of the most interesting or innovative or unexpected ways that you're seeing it used?
[00:46:26] Unknown:
I think that probably the most interesting thing is defining interfaces between teams. I think that I took this for granted when I was at Airbnb. I kind of just assumed that this was normal, but we've seen a lot of teams adopt this tool and then define various metrics in different parts of the company that historically have not been kind of consumed or kind of crossed the boundaries of various teams. And, you know, we've gotten some really fun feedback from our customers around, hey, I've just never sliced this metric by this dimension before because, you know, this 1 existed in a dataset that this team relied on, and this 1 existed in a dataset that my team relied on. And so that's really exciting, and I think it demonstrates a lot of the potential of this framework. And, you know, I kind of think back now to my time at Airbnb where I was on the growth team. Right? And so I consumed metrics from a wide variety of teams because oftentimes the growth team's work impacts some other team. And so I was consuming, you know, the customer service contact rate or the, you know, account takeovers related to sign up and log in flow work that I was doing. And I, to this day, don't know the definitions of those metrics. I could not have written the SQL to calculate them, but I know that I consumed them. And I know that the teams that reviewed my analysis trusted the analysis because they had defined the SQL.
And so it's this kind of incredible unlock to basically just be able to communicate with another team reliably. I think that this actually touches on 1 of the kind of core principles of data mesh. You know, that's an exciting future that we are moving towards.
[00:48:09] Unknown:
Yeah. The data mesh aspect is definitely an interesting element to pull out because it's been gaining a lot of ground over the past couple of years and has a lot of sort of utility in terms of how you think about building out the technical underlayment of the organizational capacity for data. And I can definitely see the metric as being a useful sort of exposed artifact for a given data team to be able to propagate and let other teams consume and combine them without necessarily having to understand the underlying calculations and computation that happens. That's an interesting point worth noting. Mhmm. And then in terms of your experience creating the transform product and building the business around it, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:48:58] Unknown:
You know, I think that the majority of these come from generalization. So we saw this tool work within 1 company, and we went out and talked to, you know, maybe the 10 or 15 companies that have gone out and built similar tooling. But that's a very narrow picture of how people build and consume metrics. And there are a lot of really complicated factors in there that, you know, require us to then generalize the way that the tool is built such that it'll be more useful broadly. And so, you know, some of these include just different data modeling techniques.
You know, Airbnb had a good mixture, I think, of nice clean normalized datasets, semi denormalized datasets, and then raw datasets that were finding their ways into metrics. But it wasn't even close to representative of all of the different, you know, data modeling and data engineering techniques that companies are using. And so a lot of lessons there. I think that also different scale puts different requirements on this framework. So when I think about this, I think about that denormalization challenge of what are we building statically? So, you know, what are we building to the data warehouse ahead of time?
And what are the kinds of questions that we're making it so that even if there are 100 billion rows in this fact table, we can still answer the question of, you know, how many rows were there per dimension that I'm trying to aggregate it to. And what that takes is basically pre aggregating datasets. And that's something that Airbnb got really good at because it had large amounts of data. But a lot of the companies that we're working with really just want to be able to do these things dynamically and on the fly, and they still don't wanna wait that much time. And so it's, you know, some combination of building datasets and then storing intermediate representations of them such that incremental questions can be answered quickly.
But they don't necessarily have the time to go out and build a bunch of, you know, nice clean, denormalized summary tables that they can expose to their organization. You know, that's been a really big challenge, but also a really big learning. And I think that it pushed us towards making our APIs dynamic so that you can ask for any metric and dimension combination. But there's a bunch of interesting work that we're doing around caching to make it so that those results can get returned quickly. You know, the last 1 I think is just organizational challenges associated with metric definition and the whole life cycle management process that I mentioned.
It's tough and just about every company has a different idea of how this works. And so there is a big challenge around kind of productizing that. You know, what that means is that there needs to be a lot of configurability because this catalog really needs to work in the ways that companies expect it to work for that process that they want to run.
[00:52:00] Unknown:
And your point about precalculating summary tables is interesting because I've had a lot of conversations with people where the sort of general guidance is that you should have 1 or a small set of tables that can answer 80% of the questions in your business. And with the introduction of metrics and the amount of information that you have about what data is being used, how, and by whom, there's the potential for an interesting feature where you can recommend a set of summary tables that would be useful to precompute to increase the speed at which you're able to generate these other sort of metrics views of the underlying data. Yeah. That's right. And that's kind of why we have really 2 primary layers of caching.
[00:52:46] Unknown:
The first 1 is 1 where the company can say, I know that I wanna compute, you know, this metric and this dimension together, and I want it to be really, really quick. I want the queries to be really quick on top of that. And so, you know, that's something where they know ahead of time, and we can get that query down to, you know, in a really fast data warehouse under 1 second. And on the other end of that, there are times when people just ask new questions, but if they find something interesting, they're gonna keep asking it. And so we have this layer that we call dynamic caching, which basically allows you to ask questions. And then if you go and ask that same question again, it's gonna be really fast because we're saving that dataset in a similar way to the way that we're saving that materializations dataset.
And this really enables people to ask these metric questions really quickly, but also enables them to ask a wide variety of them. And so I've definitely heard that 80% of questions can be answered by core summary tables. And I think I would push back on that, and I would say that it might be that the people who are consuming data at your company have just given up, and so you're not discovering the rest of the data questions that they have because you're kind of just seeing the ones where they ask a question and it's not answerable, and then they give up. And so I think that, you know, what we're seeing is that as more and more people are adopting this tool and there are more combinations of metrics and dimensions that people can ask questions about, they will just ask more questions, and hopefully, that leads to, you know, more interesting and valuable insights getting pulled out of the data.
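The dynamic caching behavior described in this exchange can be sketched as a keyed memo over metric questions: the first ask of a (metric, dimensions, filters) combination runs the query, and repeat asks are served from the saved result. This is an invented sketch under those assumptions, not Transform's implementation.

```python
import hashlib


class DynamicCache:
    """First ask of a (metric, dimensions, filters) question computes it;
    repeat asks hit the saved result, much like materializing on demand."""

    def __init__(self, compute):
        self._compute = compute  # callable that actually runs the warehouse query
        self._store = {}
        self.hits = 0

    def _key(self, metric, dims, where):
        # Sort dimensions so equivalent questions share a cache entry.
        raw = repr((metric, tuple(sorted(dims)), where))
        return hashlib.sha256(raw.encode()).hexdigest()

    def ask(self, metric, dims, where=None):
        key = self._key(metric, dims, where)
        if key in self._store:
            self.hits += 1
        else:
            self._store[key] = self._compute(metric, dims, where)
        return self._store[key]
```

The pre-specified materializations mentioned earlier would be the same idea run ahead of time, so the first ask is already warm.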
[00:54:27] Unknown:
For people who are interested in the idea of a metrics layer and they want to be able to add some uniformity to how the metrics are defined across their different tools, and they want to be able to enable their business users to explore more of the dimensionality of their data, what are the cases where Transform is the wrong choice and they might be better suited with some in house tool or something purpose built for their particular use case? I mean, we've talked to a lot of fairly small companies because
[00:54:58] Unknown:
I think that they have productivity challenges, but they don't yet have the trust challenges that our framework and the rest of our product is really aimed at solving. And the reason there is just that if you have 1 or 2 data analysts on a team, you already have metrics consistency. Right? It's already in the heads of those data analysts. They know the definitions and, you know, they are kind of the interface to data for the rest of the company. And, you know, there are some productivity challenges associated with that because if it's a data hungry organization, there's gonna be a lot of consumption of metrics, and that's a significant thing to support.
But then, you know, what inevitably happens is they add more data analysts to that team, and then you start to have some of those trust challenges. And so I would say that, you know, fairly small companies should probably just kind of focus on the core of getting good clean data into their warehouse and normalized and ready for consumption. And then they need to start thinking about, you know, what are the different applications where I wanna consume metrics? Because Transform is really valuable once you have, you know, more than 1 application.
You know, just because if you're consuming in multiple places, that is where, you know, this framework adds a lot of value. The second 1 I would say is, you know, there's a whole kind of set of companies that consumes metrics off of, you know, Salesforce or Zendesk or any number of other tools. And because we're built on top of the company's centralized data warehouse, we, you know, just can't serve those customers yet. But, you know, generally, I would say that just about every medium to large company has metrics problems. And that's kind of the, you know, set of companies that we're working with in the early days.
[00:56:43] Unknown:
And as you continue to build out the product and build out the business, what are some of the things you have planned for the near to medium term that you're excited
[00:56:51] Unknown:
for? There's just so much foundational work. And, you know, the reason there is that if you are going to define a single source of truth for metrics, there's kind of a core product philosophy that I think you have to have. 1 is that you have to be able to consume metrics from, you know, wherever it's located, and you need to be able to build whatever metric types a customer wants, and we're still working on that. There are a lot of different types of metrics that companies wanna consume, and, you know, I would put us in the kind of 90% at this point. We can support all of the kind of core types of metrics, but still working to support some of the kind of edge cases that specific companies are interested in tracking.
And then on the other side of this, you have to be able to connect to every single tool that a company wants to consume those metrics in. Because in order for this truly to be a single source of truth, it has to be consumable in all of them. The moment it's not consumable in 1 of them, they will go around this tool, and it is no longer a single source of truth. So there's just a lot of foundational work to enable that vision. But, you know, beyond that, I think that once you have consistent metric definitions, there are a bunch of really interesting applications. And these are the things that I already called out around forecasting and anomaly detection, you know, interesting correlation analysis between them, building metrics for different applications like experimentation, you know, executive reporting.
There are just so many different applications, and I think some of those are well served today. You know, BI is an example of something that there are many different takes on how BI should work, and there are many people who are kind of building the future there. But I think that there are a lot of different applications for metrics where people are still just kind of starting from home base and trying to figure out how am I going to build this application. And it all starts with building metrics out in the data warehouse and then figuring out how to then kind of productionize that. And so I think we can help with some of those, I call them long tail applications of metrics.
[00:58:56] Unknown:
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as a final question, I'd just like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available in the data ecosystem today.
[00:59:16] Unknown:
just making metrics a first class citizen of the data ecosystem and generally making data more accessible. But maybe more broadly, 1 that I'm passionate about is, I think, in order for data to really truly be accessible, we need to make a lot of progress with the data tools that we've built out. And I think in order to do that, there needs to be much broader cooperation between the various companies working in this industry. And so, you know, I'm excited about projects like OpenLineage. I'm excited that we are kind of pushing the specs of how our semantic layer works out into the open. And I think that this is something that will hopefully allow more companies to build on top of Transform.
[01:00:02] Unknown:
Well, thank you very much for taking the time today to join me and share the work that you're doing at Transform. It's a very interesting product and an interesting problem space. I'm definitely excited to see more energy behind it and the wider availability of metrics across the overall data ecosystem. So thank you for all of the time and energy you're putting into that, and I hope you enjoy the rest of your day. Thanks for having me, Tobias. This was a lot of fun. Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com to learn about the Python language, its community, and the innovative ways it is being used.
And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction to Transform with Nick Handel
Nick Handel's Journey in Data Management
Defining Metrics and Their Importance
Challenges and Evolution of Metric Stores
Transform's Mission and User Experience
Platform Architecture and System Design
Handling Semi-Structured Data and Metric Life Cycle
Metrics Layer vs. Feature Store
Complexities in Metric Definition and Denormalization
Unlocking Capabilities with a Metrics Layer
Customer Use Cases and Organizational Impact
Lessons Learned and Future Plans for Transform