Summary
Generative AI has rapidly transformed the technology sector. When Andrew Lee started work on Shortwave he was focused on making email more productive. When AI started gaining adoption he realized that there was potential for an even more transformative experience. In this episode he shares the technical challenges that he and his team have overcome in integrating AI into their product, as well as the benefits and features that it provides to their customers.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Dagster offers a new approach to building and running data platforms and data pipelines. It is an open-source, cloud-native orchestrator for the whole development lifecycle, with integrated lineage and observability, a declarative programming model, and best-in-class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise-class hosted solution that offers serverless and hybrid deployments, enhanced security, and on-demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster today to get started. Your first 30 days are free!
- Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.
- Your host is Tobias Macey and today I'm interviewing Andrew Lee about his work on Shortwave, an AI powered email client
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Shortwave is and the story behind it?
- What is the core problem that you are addressing with Shortwave?
- Email has been a central part of communication and business productivity for decades now. What are the overall themes that continue to be problematic?
- What are the strengths that email maintains as a protocol and ecosystem?
- From a product perspective, what are the data challenges that are posed by email?
- Can you describe how you have architected the Shortwave platform?
- How have the design and goals of the product changed since you started it?
- What are the ways that the advent and evolution of language models have influenced your product roadmap?
- How do you manage the personalization of the AI functionality in your system for each user/team?
- For users and teams who are using Shortwave, how does it change their workflow and communication patterns?
- Can you describe how I would use Shortwave for managing the workflow of evaluating, planning, and promoting my podcast episodes?
- What are the most interesting, innovative, or unexpected ways that you have seen Shortwave used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Shortwave?
- When is Shortwave the wrong choice?
- What do you have planned for the future of Shortwave?
Contact Info
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
Links
- Shortwave
- Firebase
- Google Inbox
- Hey
- Superhuman
- Pinecone
- Elastic
- Hybrid Search
- Semantic Search
- Mistral
- GPT 3.5
- IMAP
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Sponsored By:
- Starburst: ![Starburst Logo](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/UpvN7wDT.png) This episode is brought to you by Starburst - a data lake analytics platform for data engineers who are battling to build and scale high quality data pipelines on the data lake. Powered by Trino, Starburst runs petabyte-scale SQL analytics fast at a fraction of the cost of traditional methods, helping you meet all your data needs ranging from AI/ML workloads to data applications to complete analytics. Trusted by the teams at Comcast and Doordash, Starburst delivers the adaptability and flexibility a lakehouse ecosystem promises, while providing a single point of access for your data and all your data governance allowing you to discover, transform, govern, and secure all in one place. Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Try Starburst Galaxy today, the easiest and fastest way to get started using Trino, and get $500 of credits free. [dataengineeringpodcast.com/starburst](https://www.dataengineeringpodcast.com/starburst)
- Dagster: ![Dagster Logo](https://files.fireside.fm/file/fireside-uploads/images/c/c6161a3f-a67b-48ef-b087-52f1f1573292/jz4xfquZ.png) Data teams are tasked with helping organizations deliver on the premise of data, and with ML and AI maturing rapidly, expectations have never been this high. However data engineers are challenged by both technical complexity and organizational complexity, with heterogeneous technologies to adopt, multiple data disciplines converging, legacy systems to support, and costs to manage. Dagster is an open-source orchestration solution that helps data teams rein in this complexity and build data platforms that provide unparalleled observability, and testability, all while fostering collaboration across the enterprise. With enterprise-grade hosting on Dagster Cloud, you gain even more capabilities, adding cost management, security, and CI support to further boost your teams' productivity. Go to [dagster.io](https://dagster.io/lp/dagster-cloud-trial?source=data-eng-podcast) today to get your first 30 days free!
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Dagster offers a new approach to building and running data platforms and data pipelines. It is an open source, cloud native orchestrator for the whole development life cycle with integrated lineage and observability, a declarative programming model, and best in class testability. Your team can get up and running in minutes thanks to Dagster Cloud, an enterprise class hosted solution that offers serverless and hybrid deployments, enhanced security, and on demand ephemeral test deployments. Go to dataengineeringpodcast.com/dagster today to get started, and your first 30 days are free. Data lakes are notoriously complex.
For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte scale SQL analytics fast at a fraction of the cost of traditional methods so that you can meet all of your data needs ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and DoorDash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first class support for Apache Iceberg, Delta Lake, and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey, and today I'm interviewing Andrew Lee about his work on Shortwave, an AI powered email client. So, Andrew, can you start by introducing yourself?
[00:01:49] Unknown:
Good morning, Tobias. Yeah, happy to be here. So the thing I think a lot of people know me for is that I was one of the founders of Firebase, which is a developer products platform. But I've got a new thing. It's a few years old now, but a newish thing. It's called Shortwave, and we're building an AI email app. I'm one of the founders and the CEO here.
[00:02:08] Unknown:
And so can you give a bit of an overview about what it is that you're building at Shortwave, some of the story behind it, and the role that data plays in the product?
[00:02:16] Unknown:
Absolutely. So we are building what is, right now, the world's smartest email app, where we're trying to make the data in your email archive super valuable to you. But that's actually not how we started. Shortwave started at the beginning of 2020, and the motivation behind it was really that I have a passion for email and I want to make it really great. That came from a couple of things that happened in 2019, one of those being the democracy protests in Hong Kong. I don't know if you remember back then, but there were a bunch of protesters in Hong Kong, and people were using WeChat as a means to coordinate. And the Chinese government was actually using WeChat as a means of suppression and control of the folks there. And I looked at that and said, hey, that's a bad thing. But at least these people can use email, which is this open, federated protocol. If they're unhappy with what's going on with WeChat, they can use this other protocol.
And then Google killed off Google Inbox, if you remember that product, which at the time seemed like the next-gen, coolest new email thing. And I got to thinking, hey, these centralized services really can't be trusted for some of the most sensitive communications, and the big companies that are supposed to be investing in email are not gonna do it. Maybe I should do something about that. So I basically pitched a bunch of my ex-Firebase friends and said, why don't we go start a thing? Maybe we can make email awesome again. And they said yes. And so we started doing Shortwave, and we kicked off the company in 2020.
And a couple of years into that, after we built the basics of email (you could receive emails and send emails, your inbox worked, you could do searches and things like that), we realized that, holy crap, we have this massive archive of everything you've ever sent or received. And not just your personal correspondence, but the notifications from your bank with your balance, everything that Asana ever sent you with all the projects you're working on and your to-dos, your calendar invites, all your newsletters. This is an incredible treasure trove of data, and suddenly we have this thing called LLMs that can make that data actually useful for you rather than it just sitting on a disk of ours heating up storage.
This is potentially very valuable. And so a couple of years ago, we pivoted hard and said, hey, we are going to make the primary focus of what we're doing at Shortwave making that email archive super valuable for you, using LLMs.
[00:04:48] Unknown:
In terms of the Shortwave product, you mentioned that you wanted to make email useful, and that there's all of this useful information that you can extract from it. But what is the core foundational problem that needs to be solved with email, given that everybody hates it and yet it has survived for decades?
[00:05:08] Unknown:
Really, I think there are two big problems that are worth talking about. The first actually has nothing to do with AI, and it's simply the legacy of the thing: it has been around forever, it hasn't changed a ton, but the world has changed around it. As an example, Gmail is 20 years old as of, I think, last Monday. And it was invented in a time when we didn't really have smartphones, and we didn't have nearly the Internet bandwidth that we have today. People didn't have cameras with them everywhere, and people hadn't gotten used to messaging apps. It is this very outdated way of communicating, and it's still the most modern thing out there. Right? A lot of people are still on other, even more archaic solutions.
And so dealing with all of that legacy and trying to bring people forward and give them an experience that really feels modern, that works well on mobile, that's real time, that works really well with media: I think that's one of the big challenges. So a lot of the stuff we're doing in Shortwave is just, how do we make it feel much more like a modern chat application? How do we make it collaborative? How do we help it work better with media? Things like that. The other problem is overwhelm. People get a ton of emails. They send a ton of emails. They are inundated with spam. They're inundated with automated emails. A lot of people, like I mentioned, feed in the stuff from their Asana and from their bank, and they've got all kinds of marketing stuff, and so they need to deal with this incredible volume of email. And we're really trying to tackle both problems. On the UX side, how do we make it a really great experience for end users? On the data side, how do we help you deal with this overwhelm of data and help you sift through the noise and get to what matters?
[00:06:52] Unknown:
Another product that has come out recently that is trying to revive email is Hey, from the folks at 37signals. I'm wondering what your sense is of the way that you're thinking about this problem and how to approach it, versus the way that they envision email and how it's supposed to be.
[00:07:11] Unknown:
Yeah. I think there's a lot to like about Hey. You might have seen the article in The New York Times last week from Ezra Klein talking about Hey and why he likes it. They have the screener, for example, which allows you to choose whether or not someone gets to send emails into your inbox, which is one of their ways of dealing with overwhelm. They have some cool categorization features. They have some cool collaboration features. So I love what they're doing, and I like having other people working in the space. Really, the biggest difference that I think matters to most people is that Hey is a standalone, independent email service. So if you have a Gmail account or your company uses Gmail, you can't just use Hey. You have to change your DNS settings over and point everything at Hey to make it work. So I think for most of our customers, that's the most relevant difference: it's just not an option for them. Besides that, from a product perspective, I think we're trying to build something that's much more for folks that are power users. Hey, I think, is designed more for SMBs where you have a moderate volume of email; it's important for your business, but it's not maybe something where you're getting hundreds of emails every day and running a much larger operation. We are much more focused on founders, CEOs, execs, product managers at big companies, things like that. So we see our number one competitor as Superhuman, which is more of a power user product.
[00:08:39] Unknown:
And now in terms of email as a source of information and data, what are some of the unique challenges that it poses as compared to some of the other types of data sources that organizations are already dealing with?
[00:08:57] Unknown:
I think the volume is really the biggest issue, and I'll give you some examples here. We embed all your email. When it comes in, we take it, we have an open source embedding model, we throw it in the vector database, and we use that to power a bunch of stuff. My personal Gmail has 3,000,000 messages in it. And taking 3,000,000 messages, embedding them, and throwing them in a vector database is expensive, and it takes a while to actually do that. And if you want to change the embedding model, you have to re-embed all that stuff, and that can be a challenge. So you have to think about how you deal with a very large number of messages. How do you find the needle in the haystack that you want to find, especially when you have messages of many different types and many different levels of importance? The vast majority of that email is actually marketing content of some kind, promotions or whatnot, that I just archived and didn't look at. In fact, I was looking at the stats the other day: only about two and a half to three percent of all the emails that are received through our service ever get opened. So the vast, vast majority of the email you get is actually junk. So I really think the big data management challenges are the tremendous volume, the diversity in importance of that stuff, and finding ways to help users sift through all that and find the stuff that matters.
[00:10:16] Unknown:
To the point of generating embeddings of all of the content, as you said, it's expensive, it's time consuming, and 3, 4, 5 years ago it was probably not even really feasible because the technology didn't really exist for natively storing the data in that format and being able to actually do anything meaningful with it. And I'm curious how the advent of large language models, the commensurate rise of vector databases and vector search, and just the overall ecosystem around generative AI has changed the ways that you thought about what you could and should be doing with email and its role in your product direction?
[00:11:00] Unknown:
It's totally changed it. If you rewind two years, we were following all of the developments in LLMs online, and we were playing with some of the demos and stuff, but we weren't seriously working on it at Shortwave because we didn't think it could actually be used to deliver things that work end to end, like real end user experiences. And around late summer, early fall, a couple of years ago, we started seeing that, hey, actually, maybe this stuff is close to production ready. Maybe we can find a way to duct tape it all together into a product that actually works end to end. And I think ChatGPT was really the thing that made it obvious that, okay, this stuff is ready for prime time. And I want to say that I think the tech that's out there right now is just barely good enough. There have been a lot of cases where we built the thing the way we think it should work and it doesn't quite work, and there are a lot of little weird things in the prompt and little fudge factors and a lot of duct tape to make the whole thing work. But I think for a startup, that's actually the right time to build a system: figure it out right before everybody else thinks they can actually build a product. This is the time for you to figure out a way to duct tape it together and launch it as a startup. And I think we managed to do that.
[00:12:22] Unknown:
As we've said, email has been around for a long time. Lots of other protocols and products have come along claiming that they're going to unseat email for all time, and yet it is one of the stickiest pieces of technology that we have, aside from maybe the TCP/IP protocol. And I'm curious how you view the strengths of email as both a protocol and an ecosystem, as compared to all of the different means of communication that we have tried to generate to supplant it over those years?
[00:12:53] Unknown:
Yeah. I think the most important thing to understand about email is that it exists at scale and it has this ongoing network effect. I have many complaints about the standards and the way the ecosystem has evolved: authenticating emails, for example, is a total mess. The types of data that you can send in emails is a mess. The way the headers work is a mess. The way it works with DNS is a mess. But it exists, a ton of people use it, and it is, at this point, universal. And it's the only system like that. There are lots of other little open source projects to do communication this way or that way, but there's nothing that's working at a scale where it can actually be a viable competitor to the centralized messaging services that I talked about. Right? If you want to build the thing that's going to compete with WhatsApp or Messenger, email is really the only viable open protocol that exists today. And maybe someone will come up with one in the future, but I'm skeptical. It's very hard to bootstrap that sort of thing. So I think the fact that it has been bootstrapped, that it does exist, that it has been scaled to the size it is, is really important. And that's why we chose to focus on it. It wasn't necessarily that we thought it was an amazing protocol. It was the thing that everybody used.
[00:14:13] Unknown:
And digging into Shortwave as a product and as a platform, I'm wondering if you can talk through some of the ways that you've thought about the architecture and design at the system level, and how the overarching user experience goals have been reflected in the architectural elements that you have focused on?
[00:14:35] Unknown:
Yeah. So the history of the company, actually, I think is really relevant here. When we started Shortwave, our initial idea for differentiating from other products was to focus on a collaborative use case, where we said, hey, what if your inbox was a place where you could communicate really effectively with other people on your team, in real time? So we built typing indicators, we made this feature called channels, and you can emoji react on stuff. And to make that work, we built a client-server architecture where the ingestion of your email, the syncing of your email from Gmail, actually happens on the server side, and a lot of that heavy lifting is done on the server side. And our client talks over a protocol that we define between your web browser or your phone and our servers. We did this initially to make collaborative use cases work really well. And it turns out that architecture is also great for doing AI things, because what do you need if you want to build a bunch of cool AI features? You need a bunch of GPUs that are right next to your data, and you need maybe a big vector database that's right next to your data.
And you can do that now because we're ingesting your email on the server side, where we have access to all of that stuff. So we have this client-server architecture where, when you sign in, we connect to Gmail, we sync all your stuff to our cloud service, and then we index it. We have traditional full text search infrastructure that we throw it into; we use Elastic for that. We use Postgres, so we put a whole bunch of stuff in Postgres. And then we use Pinecone and an open source embedding model, and we embed all of that and throw it into that database. So we have those three sources of truth available for all the features that we want to make work down the line. And when you're doing something in the app that needs to look at data: if it's something that needs full text search, we can run full text search. If it's something that needs real time live updates, we can use Postgres and the streaming stuff that we've built. If you need to do some sort of semantic search, we've got it in Pinecone, and we can run searches there really easily as well.
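To make that fan-out concrete, here is a minimal sketch of a server-side ingestion step that writes one message to a relational store, a full text index, and a vector index. The client objects, field layout, and index names are assumptions for illustration, not Shortwave's actual code; the calls follow the general shapes of the standard Postgres, Elasticsearch, and Pinecone Python clients.

```python
# Hypothetical sketch of the server-side ingestion fan-out described above.
# Client objects, index names, and field layouts are assumptions.
from dataclasses import dataclass

@dataclass
class EmailMessage:
    id: str
    thread_id: str
    sender: str
    subject: str
    body: str
    received_at: str  # ISO timestamp

def ingest_message(msg: EmailMessage, pg, es, pinecone_index, embed_model):
    # 1. Postgres as the transactional source of truth (threads, labels, read state).
    pg.execute(
        "INSERT INTO messages (id, thread_id, sender, subject, received_at) "
        "VALUES (%s, %s, %s, %s, %s) ON CONFLICT (id) DO NOTHING",
        (msg.id, msg.thread_id, msg.sender, msg.subject, msg.received_at),
    )

    # 2. Elasticsearch for traditional full-text search.
    es.index(index="messages", id=msg.id, document={
        "thread_id": msg.thread_id,
        "subject": msg.subject,
        "body": msg.body,
        "received_at": msg.received_at,
    })

    # 3. A vector index (e.g. Pinecone) for semantic search over an embedding
    #    produced by an open source embedding model.
    vector = embed_model.encode(f"{msg.subject}\n{msg.body}")
    pinecone_index.upsert(vectors=[{
        "id": msg.id,
        "values": list(vector),
        "metadata": {
            "thread_id": msg.thread_id,
            "sender": msg.sender,
            "received_at": msg.received_at,
        },
    }])
```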
[00:16:39] Unknown:
The aspect of discovery and search, and the fact that you have to have the same data in multiple different locations for these different particular use cases, is the state of the world today. But I'm curious what are some of the ways that you've been thinking about the aspects of specialization, given that there are those different storage systems focused on different representations of the data, and how you think about, in particular, this rising topic of hybrid search: being able to move across both text based search and semantic search. And, in particular, as a business trying to be profitable, the fact that you have to have all of these copies of data in all these different systems, which can become increasingly expensive as you scale.
[00:17:27] Unknown:
Yeah. No. This is a real thing that we face. So we have many copies of the data. Right? Like I said, we have the stuff in Postgres. We have the stuff in Elastic. We have the stuff in Pinecone. We actually have other copies as well for other purposes. So, yeah, all of your data is replicated a ton. And all of that's copied outside of Gmail. So Gmail also has copies of this, and they replicate that, I'm sure, as well. So, yeah, we're definitely storing way more than we need to. I would love for some of these systems to be able to do more than one thing. So I'll give you an example here. Today, the public facing version of our AI search, so if you talk to the assistant and you ask it a question, the way it is retrieving data today is purely using, well, I shouldn't say purely. I'll give you a little background on how it works and then this will make more sense. When you run one of those queries, we're going off and searching in multiple different stores. We are pulling in the information that you want, and some of that is coming from Elastic, some of that's coming from Pinecone, and then we are re-ranking that whole set. We return it to the agent and then we ask the LLM to answer your question.
The data that we are gathering there is not using hybrid search, and it is not being filtered by any sort of metadata. So if you ask a question like, what are my favorite foods, it might be able to go find things that are related to the foods that you like. But if you ask it a question like, of the threads that are in my inbox that have this label, what are my favorite foods, it's not smart enough to restrict the domain to just that set of threads and then look for the foods. And if you had a specific rare word in there, like, what are my favorite types of pineapple, it's not going to do the hybrid search thing where it looks for this rare word.
So we have a new version coming out soon that does do this, where, at the time that you ask your question, it figures out what metadata restrictions the user has asked for. They said, okay, I just want you to look in my inbox, and I just want you to look for things that have a certain label, and then inside that, I want to do a semantic search. And we're also connecting hybrid search: if you mentioned pineapples, maybe we'll also look for that as a sparse vector and we'll consider that as well. And this combination of identifying the metadata constraints, constraining via the metadata, and doing the hybrid search is something that's all been written in our code. Our code is using Pinecone and it's using Elastic behind the scenes, but there's a ton of stuff that we've done on top. And it leads to performance issues and, to some extent, correctness issues. For the metadata filtering, for example, we will go and pull a whole bunch of results, and then we will post filter that. And if you happen to have a case where there are a lot of semantic results and only very few of those match the metadata constraints, you might not be able to load very many results very quickly that actually match your constraints.
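As a rough illustration of the retrieval flow described here (extract constraints, run dense plus keyword retrieval, post-filter on metadata, re-rank, then answer), a minimal sketch follows. The constraint extraction is hard-coded, and the helper functions and index APIs are stand-ins, so treat this as the shape of the pipeline rather than the real implementation.

```python
# Illustrative only: constraint extraction, index names, and helper signatures
# are assumptions, not Shortwave's actual code.
def answer_question(question, user_id, embed_model, dense_index,
                    keyword_search, rerank, llm):
    # 1. Figure out metadata constraints (in the new version this is itself an
    #    LLM call; hard-coded here as a stand-in).
    constraints = {"in_inbox": True, "label": "Podcast"}

    # 2. Dense (semantic) retrieval, pushing down whatever filters the store supports.
    q_vec = list(embed_model.encode(question))
    dense_hits = dense_index.query(vector=q_vec, top_k=200,
                                   filter={"owner": {"$eq": user_id}},
                                   include_metadata=True)["matches"]

    # 3. Sparse/keyword retrieval for rare terms ("pineapple"), e.g. via Elastic.
    keyword_hits = keyword_search(question, owner=user_id, top_k=200)

    # 4. Post-filter on constraints that were not pushed down. This is where the
    #    performance/correctness issue shows up: if few candidates survive the
    #    filter, you may have too little context even after pulling many results.
    pool = {h["id"]: h for h in list(dense_hits) + list(keyword_hits)}
    candidates = [h for h in pool.values()
                  if h["metadata"].get("label") == constraints["label"]
                  and h["metadata"].get("in_inbox") == constraints["in_inbox"]]

    # 5. Re-rank the merged pool and hand the top results to the answering model.
    top = rerank(question, candidates)[:10]
    context = "\n\n".join(h["metadata"]["snippet"] for h in top)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")
```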
And so if we could have a native version that allowed us to say, in Pinecone, for example, hey, here are the metadata constraints that I want, and I want you to do a hybrid search on the semantic component, and have it do all of that in a native, super efficient way, that would be awesome. But we haven't seen anyone with a solution we thought would work there yet; we're definitely looking around for that.
[00:20:58] Unknown:
Another piece of the stack that quickly becomes expensive is the AI model itself and having to execute it across all these GPUs. I'm wondering how you think about the optimization of balancing user experience with your cost of executing the AI: to say, for this class of problem, we don't need to bring the AI into it at all, we can just rely on good old fashioned software engineering; and this is the point where we actually have to use the AI to turn the query into an embedding so that we can put it into the appropriate database and execute these vector searches. And just some of the ways you think about the balance of: we want to use AI because it's powerful, but we don't want to use AI because it's expensive.
[00:21:43] Unknown:
So within the decision between using AI at all and not using AI, I have at least one fun example here, which is snooze in our app. For a long time, we have had smart snooze, where you can type in any time and it'll figure out when it is. You can say Christmas, or you can say August 5th, or you can say 9:42 PM, and it gives you a suggestion and you can be on your way, and it's natural language. Right? And that system is entirely rules based. There's no statistical method or AI or anything. It's just a block of code where we said, hey, there aren't that many versions of this, and we can make this fully client side and really fast and really cheap if we just write out the rules. So there are things like that that we do. We also use some of Gmail's tech. Some of the labeling of different emails and things: if they think it's a promotion, for example, we think it's a promotion as well. So in many cases, we don't do the heavy lifting.
We use what somebody else has done, or we use some rules based system, and it gets the job done. Within the realm of AI, I think there's also a tremendous differentiation in terms of how powerful a model we want to use. And there are many considerations here. Cost is a big one, of course, but we also need to think about latency and privacy implications, and I'll give you some examples here as well. So the AI assistant that we have: when you open up Shortwave, on the side there's a chat that you can talk to, and you can discuss your email with the bot. That is run on GPT-4 Turbo. And the reason is that it is, or at least it was at the time (maybe Claude 3 is better now), the best, smartest, latest model out there. And we thought in that case, smart was more important than speed or cost, and we wanted to have the very best. So that one is running GPT-4 Turbo for all of the output. But inside that assistant, actually, every time you type a command and press enter, there's a whole bunch of LLM calls going on to figure out what data to feed into that final prompt. So when you ask a question, we are trying to figure out, hey, what data sources should we be searching? Should we be searching their email history? Should we be looking at the thread on the screen? Should we be looking at their draft? Should we be looking at some of their settings?
We're also extracting constraints and keywords and things from that query. And those operations are done in a variety of different ways. The latest version that we're working on right now is using a fine-tuned GPT-3.5 to try to figure out what search constraints to look at. On the back end, when we run those searches, we have, obviously, an embedding model that embeds the stuff in the beginning, but we also have a re-ranking model that's used to go through a whole bunch of potential candidate results and figure out the ones that we want to use. So the final output there is an expensive, slow model, but there are many LLM calls being made when you type that, some of them much cheaper and much faster, and some other types of models being run on the back end. I'll give you another good example here, which is our summaries. When you open up an email, right at the top of the email we put a very short little summary of the email, which has been, I think, a feature that people really love. And that one is using Mistral running on some GPUs of ours, and that is primarily done for latency reasons: you open a lot of emails every day, and if the summary takes a while to display, it doesn't really help you. Right? So we've got to do it really fast, and waiting for GPT-4 would take too long. So, yeah, we're using Mistral, and we're trying to get something out to you really, really quick.
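For the rules-based snooze parsing mentioned above, here is a toy example of what a no-AI, rules-only parser can look like. The actual feature runs client side and its rules are not public; the patterns and the 8 AM default below are purely illustrative.

```python
# Toy rules-based "smart snooze" parser in the spirit described above.
# The rule set and defaults are illustrative assumptions, not Shortwave's code.
import re
from datetime import datetime, timedelta

def parse_snooze(text, now=None):
    now = now or datetime.now()
    t = text.strip().lower()

    if t == "tomorrow":
        return (now + timedelta(days=1)).replace(hour=8, minute=0, second=0, microsecond=0)

    if t == "christmas":
        target = now.replace(month=12, day=25, hour=8, minute=0, second=0, microsecond=0)
        return target if target > now else target.replace(year=now.year + 1)

    # Times like "9:42 pm" snooze to the next occurrence of that time.
    m = re.fullmatch(r"(\d{1,2}):(\d{2})\s*(am|pm)?", t)
    if m:
        hour, minute = int(m.group(1)), int(m.group(2))
        if m.group(3) == "pm" and hour < 12:
            hour += 12
        target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        return target if target > now else target + timedelta(days=1)

    # Dates like "august 5th" snooze to the next occurrence of that date.
    months = {name: i for i, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"], start=1)}
    m = re.fullmatch(r"([a-z]+)\s+(\d{1,2})(st|nd|rd|th)?", t)
    if m and m.group(1) in months:
        try:
            target = now.replace(month=months[m.group(1)], day=int(m.group(2)),
                                 hour=8, minute=0, second=0, microsecond=0)
        except ValueError:
            return None
        return target if target > now else target.replace(year=now.year + 1)

    return None  # no rule matched: show no suggestion

print(parse_snooze("August 5th"))  # next August 5th at 08:00
print(parse_snooze("9:42 PM"))     # next 9:42 PM
```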
[00:25:24] Unknown:
Another aspect of email and all of the different products that have been built around it is the question of security and trust because email can often have a lot of information in it that you don't necessarily want to have generally available or widely known. The other aspect of this in your product is the question of personalization for that AI model and the types of responses that you're getting. I'm curious how you manage the complexities of the security and sensitivity of the information and the personalization capabilities that you're building with these different AI models.
[00:26:04] Unknown:
Yeah. So I think one of the interesting things that we're doing here is that, so far, we have not done any model training on end user data. That may change in the future, but today we do all of our personalization through retrieval. As an example, when you're writing an email in Shortwave, we advertise that we write it in your style. The way that we're actually doing that is by finding relevant examples, where we say, hey, what are some emails that this person wrote in response to similar email threads? Or what are some emails that they wrote that started in a similar way to the draft they've written so far? And we're throwing those into the prompt and saying, basically, the last ten times you wrote an email like this, these are the emails that you wrote. And then we're having an LLM synthesize that together with what you've written so far and give you a suggestion, or give you a full email.
And that method actually works really well. In this particular case, we do use some fine tuning, but that fine tuning was not done on any user data; it was done on data from our team internally, and the fine tuning was just to get the output formatting right. So if you use the AI autocomplete that we have, the fine tuning is there to make sure that it gives you completions of the appropriate length and the appropriate format so that it feels natural as you're getting them. It's not to give you a particular style. I think in the future, we definitely do want to unlock more value with models trained on user data. And, yeah, privacy is a huge factor, a huge consideration. We want to make sure that everyone is aware that we're doing this and has control over what data it's trained on. I think the fear that everyone has is: you train on my data and then that data somehow ends up in somebody else's account. And we'll make sure that anything we do is extremely careful not to let that sort of thing happen.
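A minimal sketch of that retrieval-based style personalization, assuming a hypothetical vector index over previously sent mail and generic embedding and completion helpers; the prompt wording and function names are invented for illustration, not Shortwave's actual prompts.

```python
# Illustrative sketch of style personalization via retrieval instead of training.
# sent_index, embed_model, and llm are hypothetical stand-ins.
def draft_in_user_style(incoming_thread, draft_so_far, sent_index, embed_model, llm, k=10):
    # Find past sent emails that replied to similar threads, or that started
    # similarly to the current draft.
    query_vec = list(embed_model.encode(incoming_thread + "\n" + draft_so_far))
    hits = sent_index.query(vector=query_vec, top_k=k, include_metadata=True)["matches"]
    examples = [h["metadata"]["body"] for h in hits]

    # Assemble the examples into the prompt so the model mimics the user's style.
    prompt = (
        "Here are emails this user previously wrote in similar situations:\n\n"
        + "\n---\n".join(examples)
        + "\n\nThey are now replying to this thread:\n" + incoming_thread
        + "\n\nTheir draft so far:\n" + draft_so_far
        + "\n\nContinue the draft in the same tone, style, and typical length."
    )
    return llm(prompt)
```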
[00:27:59] Unknown:
1 of the recurring questions that I found myself coming back to when speaking with anybody who is building a product that has an LLM as any component of it is the question of platform risk where particularly in the case of the API focused models of OpenAI, Claude, etcetera, you have no control over what the end company is doing with their model, what their future direction is going to look like, and so there is inherent risk there. But even with the open source models, there's platform risk of, is it going to be maintained? What does the future evolution of this model look like? Or I want to be able to do something custom, or I would like to push the evolution of this type of model in a certain direction.
There's a lot of complexity and time and money investment that goes in that direction. I'm curious how you think about the balance of platform risk, the level of control that you have over the models themselves and their capabilities, and then also the question of the repeatability and reliability of the models, given that they are probabilistic and not deterministic.
[00:29:06] Unknown:
I think, honestly, those are not huge concerns for us. And the reason is just that we're a startup, and we are under super rapid development. The reality is that six months from now, our system is going to be totally different, not because the vendor changed things out from under us, but because we decided we want to build things a whole different way. Our business is inherently risky, and we know that. So right now, I'm not too worried about that kind of stuff. I think it's actually a really fun time to work on this, because there is so much change and so much churn, there are new people launching models all the time, and the competition between the different vendors is super fierce. There may come a time when this stuff matures more, where your choice of what platform to build on is a much bigger deal. But right now, my advice to folks would be: be comfortable trying all of it, build your system in a modular way so you can swap things out, and expect that you're going to rebuild your whole thing every few months. And that's just the way it is, because the tech is evolving so fast.
[00:30:12] Unknown:
You mentioned earlier the expense and complexity of having to rebuild all of your embeddings when the models change. I'm curious given your comment about focusing on modularity and the ability to experiment with a number of different models as the space evolves, how that has influenced the way that you think about when and how often to rebuild those embeddings or how those embeddings should be constructed to minimize the need to be constantly rebuilding them?
[00:30:44] Unknown:
So our AI assistant has been live for just over six months, and we are actually still running on the same embedding model that we launched the product with. We are very soon going to be swapping in a new model. We haven't actually decided which one yet; we're looking at some different options. We're also looking at embedding fine tuning, and we see a lot of potential there as well. And we're actually going to be doing this at the same time that we're shifting to a different vector database setup. Pinecone has a new serverless offering, and the big benefit is that they separate compute and storage. With the old pod system that they had, you paid for a set amount of compute and storage. And since our usage is super storage heavy and not super compute heavy, it hasn't been great for us. But they have a new offering that fixes that. Anyway, we're going to be switching to the serverless offering at the same time that we are re-embedding all of these emails with a new model.
And we're hoping to kill a few birds with one stone. So I think the answer to your question is that it just makes it more work, and it makes us try to be more careful about doing this. Rather than trying a new model every week, we're going to try a new model every six months and be a bit more thoughtful in our testing before we do it. I think the other thing that we've done here is that we are currently restricting what we embed pretty aggressively at the pricing level. We don't, for example, embed anything for our free users; the embedding is a paid feature. We have a whole bunch of free users, and they get some basic AI features, but the cost of the embedding and the storage and the search is not something we want to take on for them right now. We also limit, for the paid users, the time range for the embedding. Our business plan only goes back five years, whereas some people obviously have much older Gmail accounts. So as we get more confidence in the embedding models that we've chosen and we're doing fewer of these migrations, we'll be willing to extend the histories and provide the embeddings for more people, because we won't be so worried about the migration cost down the road.
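A small sketch of what plan-gated embedding plus a re-embedding migration can look like. The five-year business-plan window and the "no embeddings for free users" policy come from the conversation; the plan names, batching, and client APIs are assumptions.

```python
# Sketch of a plan-gated embedding policy and a re-embedding migration loop.
# Plan names and migration mechanics are illustrative assumptions.
from datetime import datetime, timedelta, timezone

EMBED_WINDOW = {          # how far back each plan gets embedded
    "free": None,         # free users: no embeddings at all
    "business": timedelta(days=5 * 365),
}

def should_embed(plan, received_at):
    window = EMBED_WINDOW.get(plan)
    if window is None:
        return False
    return received_at >= datetime.now(timezone.utc) - window

def migrate_embeddings(message_store, new_index, new_model, batch_size=256):
    """Re-embed eligible messages with the new model into the new (serverless) index."""
    batch = []
    for msg in message_store.iter_messages():  # hypothetical iterator over all mail
        if not should_embed(msg.plan, msg.received_at):
            continue
        batch.append(msg)
        if len(batch) == batch_size:
            _flush(batch, new_index, new_model)
            batch.clear()
    if batch:
        _flush(batch, new_index, new_model)

def _flush(batch, index, model):
    vectors = model.encode([m.subject + "\n" + m.body for m in batch])
    index.upsert(vectors=[
        {"id": m.id, "values": list(v),
         "metadata": {"received_at": m.received_at.isoformat()}}
        for m, v in zip(batch, vectors)
    ])
```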
[00:32:47] Unknown:
The other data challenge in email is synchronization. You mentioned that you store a copy of all of the emails in your own servers for purposes of data locality being able to do all of this processing more rapidly, but then the users also want to have the emails that they write and send reflected back into their Gmail accounts. I'm curious. What are some of the complexities and sharp edges that you've run up against with managing that data synchronization, particularly given that you're constrained to things like the IMAP protocol?
[00:33:19] Unknown:
Yeah, that's a great question. It's a lot of work, actually. A ton of logic and a ton of effort has gone into making that sync happen smoothly. The Gmail APIs are pretty good in this regard. They, for example, give you a real time feed of most things. And so if you're using Shortwave, in some cases you actually get updates to your email before you see them in Gmail; you get them in Shortwave first because the API is faster than Gmail. But there definitely are some sharp edges, so I'll give a couple of examples here. One of them is that not all of their APIs are real time, and some of them are kind of expensive to poll. For example, contacts.
If you go and make changes to a contact manually in Gmail, we don't get a real time update of that. So there's a button in the settings where, if you happen to have made some contact changes, you can click it and we poll to get those updates. Otherwise we might not pick them up quickly enough for your case, so if you need to get an update faster, you can do that. Same thing with certain kinds of settings. If you change your signature in Gmail and you want it to show up in Shortwave, you either have to wait a little while, or you have to click a button, or you have to sign out and sign back in for us to refresh that.
The other big sharp edge I want to talk about (there's a bunch of stuff like that, but this is the big one) is that if you have an app that presents the data in a different format than people are used to in Gmail, figuring out how to keep those data models in sync with each other can be a really interesting challenge. So we actually have a different threading model than Gmail. In Gmail, if I write an email to you and one other person, and each of those people responds just to me, I'm going to have three messages: the one I sent, the response from you, and the response from the other person. And it's up to me as a reader to see all three of those messages and figure out, oh, there are actually two conversations going on here.
In Shortwave, we actually split those up, and we say, hey, you've got two threads: the me-and-Tobias thread and the me-and-the-other-person thread, and they actually present as different entries in your inbox. They mention each other, and you can see that they split in the UI, but we treat them separately. You can label them separately, pin them separately, and do all of these things separately. And syncing that with Gmail, making sure we handle all the weird edge cases that can come up when one side treats the conversation as a flat collection of messages and the other side treats it as this threaded tree, is quite a fun challenge.
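A toy version of that participant-based splitting is sketched below: it groups a flat Gmail-style thread into sub-threads keyed by who, besides the owner, is on each message, and shows the original broadcast message in every sub-thread it covers. The heuristic and data shapes are assumptions, not Shortwave's actual threading logic.

```python
# Toy illustration of splitting one flat Gmail-style thread into per-audience
# sub-threads, as described above. The grouping heuristic is an assumption.
def split_thread(messages, me):
    def audience(msg):
        # Everyone on the message except the mailbox owner.
        return frozenset({msg["from"], *msg.get("to", []), *msg.get("cc", [])} - {me})

    # The distinct audiences define candidate sub-threads; drop audiences that are
    # strict supersets of another (e.g. the original email sent to everyone).
    keys = {audience(m) for m in messages}
    keys = {k for k in keys if not any(other < k for other in keys)}

    subthreads = {k: [] for k in keys}
    for msg in messages:
        aud = audience(msg)
        for k in keys:
            if aud == k or aud > k:  # exact match, or a broadcast covering this audience
                subthreads[k].append(msg)
    return subthreads

# Example: I email Tobias and one other person; each replies only to me.
thread = [
    {"id": "1", "from": "me@example.com", "to": ["tobias@example.com", "other@example.com"]},
    {"id": "2", "from": "tobias@example.com", "to": ["me@example.com"]},
    {"id": "3", "from": "other@example.com", "to": ["me@example.com"]},
]
for aud, msgs in split_thread(thread, me="me@example.com").items():
    print(sorted(aud), [m["id"] for m in msgs])
# -> ['tobias@example.com'] ['1', '2'] and ['other@example.com'] ['1', '3']
```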
[00:35:47] Unknown:
Another aspect of your product is the focus on email as a means of productivity, where we are exchanging a lot of valuable information, particularly when using email in a business context. A lot of those messages will yield one or more to-do items, and you need to be able to track your progress on completing whatever that actual deliverable is. I'm curious how the use of Shortwave changes the workflow and communication patterns, either for individuals or teams, and some of the ways that you think about the balance of the communication elements with the productivity elements.
[00:36:25] Unknown:
Yeah. So, as I mentioned earlier, we're really focused on power email users: execs and founders and CEOs and those folks. A big goal that we have is to help you treat your email inbox like a to-do list. And I think no one likes the fact that their email inbox is a to-do list, but I think you just have to admit it and say, hey, it is a to-do list that other people can add stuff to, and rather than pretend it's not, let's give you the tools to actually treat it like a to-do list. As some examples of things you can do in Shortwave: you can take threads and pin them to the top of your inbox. You can drag to reorder them. You can stack them up and give them names.
And very soon, we're going to have an expansion of this feature set, where you can take those and put them on a separate page, group them into sections, give them notes, and things like that. So this is a pretty core part of our product. And I'd say it is not so much changing behavior as it is helping people who are already doing this do it in a more native way in their product. I think when you go talk to users, you find that a good chunk of users don't want to use these features, and they don't want to do things this way. And my answer for them is, mostly, well, maybe we're not the right product for you. Usually the reason is that they just don't have the volume of email where they really need a system. If they can more or less just handle things in the order that they come in, they probably don't have enough volume of email to really need something like Shortwave.
But the other folks, you go and talk to them and they say, oh yeah, I get a ton of email, and I've got this whole system in Asana, or I have a ton of paper notes on my desk and I track them that way, or I have a thing in Apple Notes that I use to track all this stuff. These people have a system for doing this, and we're trying to give them a way to do it natively in the client. For example, what I used to do is literally create a task in my to-do list that said, respond to this email, and then I would link to the email and then archive it. That way I could get to inbox zero, so I knew that I was on top of everything, but I could still have my to-do list over here. We're just trying to automate that. We're trying to make it really simple and say, hey, if what you want to do is group related emails together, give them a name that extracts the action item that needs to be taken, prioritize things, and treat things that you have triaged separately from things that are new to your inbox, we're going to have those tools right there for you.
[00:38:49] Unknown:
I have a good example of that, being a podcast host. I have a lot of inbound messages of, oh hey, I really want to be on your show, or you should have this person on your show, or I just want to send you a message about my appreciation for the podcast. All of those different things require different workflows, or somebody emails with a question about sponsorship. And so I have had to develop a set of different labels where I'll drag and drop the email into the label to say, okay, I can't respond to this right now because it requires too much thought and consideration, but I need to follow up on it, and I don't want it to just get completely buried. So for a situation like that, how would you go about using Shortwave to simplify and streamline that type of workflow? For the case of a podcast guest: hey, I want to be on your show. So I say, okay, I need to evaluate this suggestion of yes or no for the show. Then I need to respond and say, okay, either I have some questions about it or let's go to scheduling.
It's scheduled. Now we need to write questions, so that's another to-do, and it goes all the way until the show gets published, and then I need to follow up to say, hey, the show is published, why don't you go ahead and share it?
[00:40:02] Unknown:
Yeah, that is exactly the kind of workflow that we're trying to help with here. And I think the fundamental problem that we're trying to avoid is that if you just let all of that stuff pile up in your inbox, you start to miss new things. Right? You start to have a hundred to-dos, and then new stuff comes in, and you don't notice, and you don't get through it because there's just so much junk in there that you haven't read anyway. And then maybe something really important comes up, and you miss it. And that's what we're trying to avoid. So what we want to do is say, hey, it's much easier for you to decide what the next step is for an email than for you to actually handle the email. So we want you, as soon as you see that email, to figure out what the next step is, put it in the right place, and then not worry about it. A podcast guest would be a great example. Let's say every day you get ten people recommending podcast guests. You could take all of those and put them in a single to-do called "consider for podcast guest."
You put all of those there and set it to the side, and as new ones come in, you keep putting them onto that pile. And if you have podcasts that are already scheduled, maybe a particular person you're talking to, you could have a pile just for the conversations relating to that recording: the notes that you're exchanging and the scheduling threads could all be in a single stack. And so your inbox in Shortwave would look like several groupings of different types of work, with clearly identified action items in each one of those, and otherwise an empty inbox. As new stuff comes in, you can decide, hey, do I care about this at all? If not, boom, it's gone.
If it's something related to an existing project, put it on that project. If it's a new project, you can create a new project.
[00:41:29] Unknown:
And as you have been building this system, riding the wave of large language models and all of the turbulence that that brings along, what are some of the most interesting or innovative or unexpected ways that you've seen Shortwave applied?
[00:41:44] Unknown:
Lots of fun examples. I'd say outside of the AI world, it's people just going crazy on grouping and organizing to-dos. I've seen screens where people have dozens of different projects. So, for example, consultants very often have many different people they're working with, and they might have a stack for everyone that they're working with. But in the realm of AI, I think it's even more interesting. People are doing childcare stuff. Right? One of the big applications has been parents dealing with their schools, where the school will send them a ton of information about all the events going on, and they'll be like, hey, what are all the key dates here that I actually have to show up for? Rather than reading every email that comes in, they're just extracting the information and making sure they know the action items. People have been writing blog posts based on email history. This is actually something that I've done, where the first draft of some of the stuff on the Shortwave blog was written by me saying, search my email for relevant emails about this product launch, look at customer feedback, and assemble a blog post outline. Scheduling is another one.
So today we support the creation of calendar events. The reason we support the creation of calendar events is that so many people were trying this before we had the feature. And, funnily enough, the AI would hallucinate that it could actually do this. So people would be like, hey, schedule a calendar event for me. And the AI would say, yes, sir, here it is. And we didn't actually have that feature. But a bunch of people complained that, hey, it said it scheduled a calendar event, but it was not in my calendar. And we said, oh, we don't do that, but we could. And so we started adding that in there as well. So, yeah, just a few examples, but a lot of our product road map is informed by seeing people do interesting stuff in the AI assistant, complain that it doesn't work the way they expect, and us saying, oh, okay, maybe we can make this work.
[00:43:36] Unknown:
And in your work of building the product, investigating the capabilities and features both in the email space and in the AI space, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:43:53] Unknown:
I think as a founder, the toughest lesson for me has been how little transferability there is in some of the core founding skills that I thought I had. When we were at Google with Firebase, we had been doing this for several years, and we really knew the developer market well. And toward the end, basically everything we launched went well. People liked it. It grew. We got a lot of users on it. And I thought I had learned the process by which you do this kind of product development, and I had a good intuition for how to figure out what users wanted. And I thought that would transfer well to Shortwave. The reality is I had learned how developers think and how developers want products to work. I had not necessarily learned how the typical email user works or thinks. And it turns out the typical email user is not a developer. In fact, developers tend to not be very happy email users.
And so it's been a very humbling experience trying to build this new product in a domain where the quality of our APIs is irrelevant, the quality of our documentation is not particularly important, and what really matters is getting the workflows right for very specific UX interactions, the performance of the UI, design polish, and things like that that weren't necessarily my forte. So, yeah, definitely a lot of hard lessons about what it means to build a more prosumer, UX heavy product versus a developer product, and what it means to build a product for a very nontechnical, in most cases, audience.
[00:45:25] Unknown:
And for people who are curious about what you're building, what are the cases where Shortwave is the wrong choice?
[00:45:32] Unknown:
I think anyone who doesn't have a lot of email volume, or who isn't using it for a business use case. If you're someone for whom Gmail is generally working reasonably well, who doesn't need a system to manage the to-dos in their inbox, and who isn't sending a lot of emails in a given day, you probably don't need it. It's probably not worth learning. It's probably not worth paying for. So, yeah, for low volume users, or personal users who don't have a lot of challenges or are happy with Gmail, I wouldn't recommend it.
[00:46:07] Unknown:
And as you continue to build and iterate on the product, as you continue to keep tabs on the ways that the LLM space is evolving, the challenges around data storage and retrieval for AI, what are some of the things you have planned for the near to medium term of the Shortwave product?
[00:46:25] Unknown:
A ton. So our AI assistant today can answer certain types of questions reasonably well. So you can go in there and, for example, I could say, hey, who am I recording a podcast with today? And it'll probably give the right answer. But if you say, you know, give me a list of all of the emails where I've discussed recording podcasts, it might struggle. If you say, you know, who are the 10 best podcasters discussed by our team, it'll probably struggle. So anything where you're working with a much larger set of data, it's struggling right now. Also, I mentioned constraints earlier. So if you say things like, you know, what are the emails in my inbox that are discussing podcast recording, it's going to struggle with that. So we want to really double down on that AI assistant and have it move from a thing that can answer relatively isolated, specific questions to something that can answer much broader questions, or questions that require aggregating a lot more information, and that can answer questions with clear constraints, where you can say, you know, only consider these types of emails or look in these places for this data.
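One way to read the "clear constraints" goal is to have the model emit a structured search plan rather than free text, so the retrieval layer can enforce filters like mailbox, sender, or date range deterministically. The shape below is a hypothetical illustration, not Shortwave's actual query format.

```python
# Hypothetical structured search plan an assistant could emit so that
# constraints ("only my inbox", "only from my team") are enforced by
# the retrieval layer rather than left to free-text output.
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class EmailSearchPlan:
    semantic_query: str                   # text to embed for vector search
    mailbox: Optional[str] = None         # e.g. "inbox", "sent"
    senders: list[str] = field(default_factory=list)
    after: Optional[date] = None
    before: Optional[date] = None
    limit: int = 50

# Example: "what are the emails in my inbox discussing podcast recording?"
plan = EmailSearchPlan(
    semantic_query="emails discussing podcast recording",
    mailbox="inbox",
    after=date(2024, 1, 1),
)
```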
We also want to pull in more data sources. So today, like I said, you can look at the thread on the screen and the draft, and you can search your email history. You can look at your calendar, a few things like that. But you can't, for example, check out your Asana or look in your Notion docs or read stuff in Google Drive, and we would love to do that sort of thing: have integrations where, when you're asking a question that might benefit from additional data, we can authenticate you to that app and go look at that data for you. And then a lot of iteration on the UX side. I think one of the really exciting things about AI is that a lot of UX ideas about how this stuff really should work optimally have been thrown out the window, now that you can have voice interfaces for things and you can have text interfaces for things that are actually much nicer than they used to be.
And there's different types of information you want to present. So I expect to see UIs for products change rapidly over the next few years as people learn new patterns. I'll give you a good example of this. On Google Maps, you can pinch to zoom, and I think people got really used to that idea, and I think it's, like, a really cool UX pattern. I've seen some demos recently of pinch to summarize, where you can take a text block and, like, squeeze it on the screen and it summarizes, and if you expand it, it gets more verbose. And I don't know if that particular interaction is one that we want to have, but I think there can be lots of basic primitives of, like, what certain sorts of gestures and things do on your phone in regards to AI that maybe we can standardize and make really useful for folks.
[00:48:57] Unknown:
The pinch to summarize gesture, that's an interesting one I haven't really ever thought of. So I'd be curious to see how those types of interaction patterns evolve as we bring on these technologies more broadly. And are there any other aspects of the work that you're doing at Shortwave, the ways that you have thought about architecting the technical elements of a product that is dependent on AI, the vagaries of the email ecosystem that we didn't discuss yet that you'd like to cover before we close out the show? So many little tidbits we could probably discuss, but nothing comes to mind right now. Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:49:49] Unknown:
Yeah. I think there's a couple that I'll cite here. One of them is testing. So today, with all of the AI assistant features and the summaries that we do and the autocomplete, the way we test this is just with a few golden examples, where we say, hey, we know this used to work, let's try that again. And because each of those test cases depends on you having the right corpus of email in your history, it's actually really hard to make those in an isolated way. Like, it's very hard to have a repo that everyone on the team can check out and run these tests from if it's testing against, like, 3,000,000 emails that you had personally, especially if you don't want to share them. And so I think our product, as it evolves forward, is constantly regressing in different ways, in ways that we're unaware of, and that's a really hard thing. And I've talked to a lot of other people who are struggling with the same issue of, like, how do we test this thing? How do we make sure we're actually making forward progress? So I think testing is a big piece of that.
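As a rough sketch of the golden-example approach described here, the test below checks that answers to a fixed set of questions still contain expected fragments. The `assistant_answer` helper and the tiny synthetic corpus are hypothetical stand-ins; the hard part Andrew describes is that real regression tests would depend on a private mailbox that can't be checked into a shared repo.

```python
# Sketch of a golden-example regression test over a synthetic corpus.
# assistant_answer() is a toy placeholder for a real retrieval + LLM pipeline.
SYNTHETIC_CORPUS = [
    {"subject": "Podcast recording Tuesday", "body": "We record with Andrew at 2pm."},
    {"subject": "Invoice 42", "body": "Payment due Friday."},
]

GOLDEN_CASES = [
    # (question, substrings the answer must contain)
    ("Who am I recording a podcast with?", ["Andrew"]),
    ("When is the invoice due?", ["Friday"]),
]

def assistant_answer(question: str, corpus) -> str:
    """Toy stand-in: return the body of the email whose subject shares
    the most words with the question."""
    q_words = set(question.lower().split())
    best = max(corpus, key=lambda m: len(q_words & set(m["subject"].lower().split())))
    return best["body"]

def test_golden_cases():
    for question, expected in GOLDEN_CASES:
        answer = assistant_answer(question, SYNTHETIC_CORPUS)
        for fragment in expected:
            assert fragment in answer, f"{question!r} regressed: missing {fragment!r}"
```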
I think another piece is building LLMs that are designed more for internal usage and agents, rather than externally facing, end-user output. So as I mentioned with our AI assistant today, we use GPT-4 Turbo to actually output information to the user. But every time you ask a question, we're also running, like, 10 LLM calls to make decisions about which data to pull, to extract certain features from the text, and to rewrite things in a way that's friendly for doing semantic search, and things like that. Those calls never present something to the end user, and it's very important that those calls produce output of a certain format. And we run into a lot of bugs and errors where those internal calls are not behaving well. So a good example of this is we do this thing called query reformulation, where we take the history of your conversation so far and the context of whatever else is on the screen, and we rewrite the question that you just posed in a way that's ready for embedding. And that often fails because the AI apologizes. Right? You'll ask the LLM to do this and it's like, "I'm sorry, but..." And then we embed "I'm sorry, but" and we go search that, which is obviously not what we want. And it would be nice if we could say, hey, this is an internal call, I want you to give the closest thing you can to this type of output.
And I think there are people looking at this sort of stuff, but I think there's definitely underinvestment here and a lot of opportunity. And we'd love some better tools for, like, LLM calls inside software, rather than as end-user output.
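To make that query-reformulation failure mode concrete, here is a minimal sketch of a guard around an internal LLM call: if the model refuses ("I'm sorry, but..."), fall back to the raw question rather than embedding the apology. The prompt, refusal markers, and `complete` callable are all assumptions for illustration, not Shortwave's actual code.

```python
# Illustrative guard around query reformulation: never embed an apology.
# `complete` is any text-in/text-out LLM call supplied by the caller.
from typing import Callable

REFUSAL_MARKERS = ("i'm sorry", "i am sorry", "i apologize", "as an ai")

def reformulate(question: str, history: str, complete: Callable[[str], str]) -> str:
    prompt = (
        "Rewrite the user's latest question as a standalone search query "
        "suitable for embedding. Output only the query.\n\n"
        f"Conversation so far:\n{history}\n\nLatest question: {question}"
    )
    candidate = complete(prompt).strip()
    lowered = candidate.lower()
    if not candidate or any(lowered.startswith(m) for m in REFUSAL_MARKERS):
        return question  # fall back to searching the raw question instead
    return candidate
```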
[00:52:21] Unknown:
Well, thank you very much for taking the time today to join me and share the work that you're doing on the Shortwave product. It's definitely a very interesting problem space that you're addressing, as well as sharing your experiences of the data challenges and product challenges of working with AI and data embeddings. It's a constantly evolving space, so I appreciate the time and energy you're putting into understanding and tackling that, and your taking the time to share some of your hard-won knowledge with myself and the listeners. So, thank you again for that, and I hope you enjoy the rest of your day. Thank you so much for having me.
[00:53:02] Unknown:
Thank you for listening. Don't forget to check out our other shows, Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast, which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a product from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction to the Data Engineering Podcast
Interview with Andrew Lee: Background and Shortwave Overview
Core Problems with Email and Shortwave's Approach
Challenges of Email as a Data Source
Shortwave's Architecture and Design
Balancing AI Use and Cost in Shortwave
Security, Trust, and Personalization in Shortwave
Platform Risks and Model Reliability
Data Synchronization Complexities
Email as a Productivity Tool
Lessons Learned in Building Shortwave
Future Plans for Shortwave
Gaps in Data Management Tooling