Gain Visibility And Insight Into Your Supply Chains Through Operational Analytics Powered By Roambee

Hello, and welcome to the Data Engineering Podcast, the show about modern data management.

Atlan is the metadata hub for your data ecosystem.

Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan's active metadata capabilities.

Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value.

Go to dataengineeringpodcast.com/atlan today, that's A-T-L-A-N, to learn more about how Atlan's active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork, and Unilever achieve extraordinary things with metadata.

When you're ready to build your next pipeline or want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends at Linode.

With their new managed database service, you can launch a production-ready MySQL, Postgres, or MongoDB cluster in minutes with automated backups, 40 gigabit connections from your application hosts, and high-throughput SSDs.

Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don't forget to thank them for their continued support of this show.

Your host is Tobias Macey. And today, I'm interviewing Sanjay Sharma about how Roambee is using data to bring visibility into shipping and supply chains. So, Sanjay, can you start by introducing yourself?

Good afternoon. My name is Sanjay Sharma, and I'm the CEO of Roambee.

We are headquartered here in Silicon Valley, California.

And do you remember how you first got started working in data?

Yes. My previous startup was in the RFID space.

And while there were a lot of merits to this technology,

companies never had a line item budget to use RFID

in their processes.

So, obviously,

for that company, our business model was outcome based.

So we would go in, we take a manual process,

we deploy our RFID platform,

the hardware,

and we start deriving data, translate data into events, and finally, events into some business outcomes.

And then the way we made money was basically proving

that the benefits of RFID, the ROI

was much better than being in the previous process.

So, obviously, it's all hard. Right? I mean, you want to prove value to the customer

before and after savings.

It's always hard to be an outcome based

business.

But

after a few implementations,

we figured out, you know, what the secret sauce was to prove to the customer,

you know, how RFID could save money. And a piece of that was sort of our licensing fees for the platform and the services. So while other companies were doing, you know, licensing of their platforms, we actually ended up translating that into being probably one of the few successful RFID companies at the time that actually were profitable

and growing and making money.

That's a little bit of a journey

in translating data into revenue.

In terms of Roambee, can you give a bit of an overview about what it is that you're doing there and some of the story behind how you got it started and why you decided that this was a problem area that you wanted to spend your time and focus on?

Yeah. Goes back to

the previous startup. Right? So with RFID, what we were solving was

delivering visibility

inside the 4 walls of the enterprise.

Whether it was monitoring of inventory, whether it was monitoring movement of assets,

or any other problem statement the customer had within the facility, within the 4 walls.

After the company got acquired,

you know, we started looking at, you know, what the next problem would be. And after talking to many enterprises, we found out that the real problem was and a bigger problem was, you know, not having visibility outside the 4 walls of the enterprise.

And initially, we thought, like, you know, the enterprises would know where their goods and assets are,

and all we have to do is take all of that data and translate into

some very interesting efficiency outcomes.

But it was quite surprising when we started talking to the customers. They didn't even know, you know, where their goods or assets were once they left their dock door or left their facility.

And that's sort of where the journey of Roambee

sort of started.

Roambee is actually a data science company.

But we think that in order to deliver

very interesting analytics that can impact our customers' supply chains,

it is very important to collect

highly granular

and accurate data.

And that's sort of how we got into

bringing the hardware component to our service.

And our thesis is if there is a sensor

or a device that's

available in the market,

we will bring that device in our ecosystem.

But if there is no device that solves the use case, we will go build one. So we have a hardware engineering team. We constantly look at this ecosystem.

And, obviously, one thing that I learned

when you walk back or top down from data analytics to data collection,

if you are collecting

10, 20,

you know, pieces of data for a shipment moving from, I don't know, Shanghai to Hamburg,

it's very easy. But when you are talking about millions of shipments,

500,000,000 bins

going through the supply chain,

or

11,000,000 trailers

moving from point a to point b delivering items. It's a completely different ballgame.

And that's where the enterprise grade

sensing devices come into play

that can, you know, self-heal.

If there is a problem,

we can basically make sure it has a heartbeat. And if it doesn't have a heartbeat, we can basically send a heartbeat, make it come alive.

But at any point of time, collect

good amount of data

that then we can use

and extrapolate

using third party data streams as well to basically deliver a business outcome.

Right? And I can give you some examples. Right? So our customer basically

is moving

vaccine products, say, for example, you know, from USA to Bangladesh.

Now if that vaccine is sitting at Dubai airport

in room temperature for 5 hours,

the efficacy of the vaccine is dramatically reduced from 1 year to maybe 4 months,

which impacts how these vaccines are going to be consumed

by the individuals in Bangladesh.

So collecting that data

granularly

and delivering value or delivering actionable intelligence for our customers is what Roambee does

and does that very well.
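
As a rough sketch of the vaccine example above, the cumulative time a shipment spends above a temperature limit can be computed from periodic sensor readings. The `Reading` type, the 8 °C limit, and the sample data are assumptions made for illustration. Only the 5-hour exposure figure comes from the conversation, and this is not a description of Roambee's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    timestamp: datetime
    temp_c: float


def excursion_time(readings, max_temp_c):
    """Total time above max_temp_c, assuming each reading holds until the next one."""
    total = timedelta(0)
    for current, nxt in zip(readings, readings[1:]):
        if current.temp_c > max_temp_c:
            total += nxt.timestamp - current.timestamp
    return total


# Hypothetical data: one reading every 15 minutes for 8 hours.
# The shipment sits at room temperature from hour 2 to hour 7.
start = datetime(2022, 1, 1, 0, 0)
readings = [
    Reading(start + timedelta(minutes=15 * i), 22.0 if 8 <= i < 28 else 5.0)
    for i in range(33)
]

exposure = excursion_time(readings, max_temp_c=8.0)  # 8 °C limit is an assumption
needs_attention = exposure >= timedelta(hours=5)
print(exposure, needs_attention)
```

With granular readings like these, the exposure can be flagged while the shipment is still in transit rather than discovered at delivery.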

So as far as

the personas of the people that are looking to a solution like Roambee, I'm wondering if you can

give some of the kind of broad categories of

what their positions are, the type of work or industry that they're working in, and some of the questions that they're looking to be able to answer about

the different kind of assets and resources

that they are in charge of managing.

We can break these personas into a few flavors. Right? So,

obviously, what comes to mind when you put a sensor on your goods and assets

is

it must be high value,

and, hence, it needs to be secured.

So our customers basically

would naturally

start with the low-hanging fruit: can I basically use this technology

to secure

my products as they move from point a to point b? And secure

means,

you know, security from theft,

security from getting misplaced,

security from you know, there are many handshakes that happen within the supply chain. Make sure there is an audit trail.

So right off the bat, the individuals that we talk to are

security

personnel who are responsible

for moving

or managing the security within the supply chain. So that's sort of one persona.

The second persona

is the supply chain

logistics persona.

These are individuals

who are looking at

optimizing the supply chain.

And the first thing they want to do is they don't even know where to optimize. Right? So the first thing they want to do is basically

fix these sensors

on various goods and assets

and light up their supply chain. And when they light this up, we basically do a good job of that. And when they light this up, they identify the glitches

that are there in the supply chain. And then these glitches become objectives

either to be improved

or to be fixed. Right? And then there is

a third persona

which is basically

the

business owner

or the finance owner

or either it's a CFO or it's a GM of a business unit.

And their problem statement is, okay. Now that I know in real time where these glitches are and I have confidence in my team that will fix it,

how much more can I derive

in forecasting?

So today, as an example, right, demand forecasting is a very nontrivial

problem statement

every customer faces.

Can Roambee

deliver some very interesting demand signals

or inventory shock signals

or delay signals

that basically I can extrapolate

and adjust my inventory or adjust my delivery efficiencies. Right?

And it's not a 1 month or a 2 month engagement. Right? It's iterative.

You continuously are feeding data,

training the models, and making sure you are getting closer to the truth.

And that's what Roambee helps enterprises with. Right?

All our customers are Global 1000, Global 2000 kind of companies.

So the value we deliver,

you know, has a very exponential benefit, not only in terms

of either increasing the top line or reducing the bottom line, but also there are intangibles

where we are delivering this data

to the field force,

which can now act on it. Right? Whether it's a warehouse manager, whether it's a logistics manager, which never used to be the case

in a technology

category where, you know, you are using barcode and

EDI and some of those things like that because they were never enabled to take decisions.

Now with real time visibility,

you can empower some of these field workforce to take decisions and decisions quickly.

In terms of the

types of information sources that you're working with, you mentioned barcodes and EDI. I don't know if you can expand on what that acronym stands for when I hand it back over to you. But I'm wondering what are the sources of information that you're dealing with and some of the

structure that that information takes as you are, you know, collecting it at the source and then managing it into, you know, the raw stage of your data infrastructure?

The dinosaur era of data collection when it comes to real time visibility

is barcode. An analogy I'd give you is a typical FedEx or DHL package.

Right? You know where it is only when it gets scanned in 1 of their hubs.

And by the time you know,

the state or the movement of your courier

or your documents has changed. Okay? So it's time delayed. You can't act on it. And most often because it's a manual process,

it's also error prone.

EDI,

think about it as old-school email communication.

Right? So when I am, let's say,

a Lenovo or an Apple or one of those companies,

when I basically hand over my container of products

or air cargo or pallets,

I hand it over to a transporter.

I tell my transporter, hey.

Every time you basically, you know, have a meaningful update,

you have to

send it through

electronic communication. It's like an email, but it's, like, got its own language. You have to send it through an email format, and that's sort of what EDI means. So electronic data

interchange.

And what we are doing is we are basically

replacing some of these

old school technologies

with real time visibility. So for example, if you were to put a tracker on your shipment,

of course, you will get a barcode scan from the FedEx guys or the DHL guys, but we can do better. We can tell you every 15 minutes,

you know, as granular as that, where it is. We can tell you the health of the shipment. So is it hot and cold? Was it mishandled?

Did somebody tamper?

Is it separated from its other packages and cartons? So there is a lot more you can

gain out of this real time visibility,

and this then becomes extremely actionable.

As far as the

types of latencies that you are

aiming for in this kind of real time

approach, I'm wondering what are the delays that your customers are willing to accept? And,

you know, going from the barcode stage of, I just hope to be able to get some information because hopefully it actually gets scanned at the right places and doesn't just skip a terminal

to you're using Roambee, so you can expect a latency of whatever that time delay is. And some of the other types of guarantees that you're focused on providing, whether it's in terms of accuracy of the information,

how you manage things like data quality and being able to manage some of the kind of provenance of that information of, you know, I can verify that this sensor was processed through this port because I have some

signature of the scanning device that registered the scan event or something like that?

Yeah. Absolutely.

So if you look at it, right, I mean, there are various

flavors of customer requirements. So if you're a coffee filter company,

of course, you don't need to know every 15 minutes where your shipments are.

But if you're a pharmaceutical company, you do. Okay?

Now what the coffee filter company is most interested in, or a retailer

who is making,

you know, bath products

and delivering to the Costcos and the Walmarts of the world,

their problem statement is,

did the product get delivered

on time in full,

and did it get delivered

in a quality condition?

For example, one of the retailers I was talking

to was saying, you know, I'm shipping 300 pallets to, for example, Costco.

And Costco says I received only 290,

or sometimes they say I received 300, 10

were broken.

Now I'm in a receivable dispute with Costco,

which is elongating

my order to cash cycle. So I don't get paid for a long time. And if I'm operating at 2, 3% margin,

it is impacting my cash flows. It will be impacting

other KPIs from a finance perspective.

And if I'm a $1 billion company,

even, like, 2 days

savings on the order to cash cycles is huge.

So they come to Roambee and say, Roambee,

I don't want you to come and tell me whether the shipment is on highway 101

or, you know, I5.

All I need you to tell me, with a high level of granularity and confidence,

is that this shipment, these 300 pallets, were indeed delivered.

They were delivered on a Wednesday at 2 o'clock when I was supposed to be delivering that. It was in great quality. It was not tampered.

And if I can get that information with a high level of confidence from Roambee,

I can share it not only with the likes of Costco,

but I can use that to automatically trigger an invoicing process in my SAP.

So the meaning of tracking is now completely different from a retailer

compared to a customer that was using us for security,

compared to a customer that was using us to

improve or maintain the efficacy

of the pharma product.

So

there are various inspirations the customer uses us for. And what we like to tell our customers is basically this is a journey. Okay? You start with collecting data first that basically tells you, you know, where the issues are. Then you basically use that data to start making some improvements.

And our customers come to us and say, you know, on day 1, they would want all kind of alerts. You know? For example, the truck stopped for more than it was scheduled. The truck took a left turn. The truck is 50 miles away from destination.

But once they start seeing the confidence in the data,

most times our customers will come and say, Roambee, can you not send me the 5,000 alerts? Can you send me the 5 alerts that really matter?

This means that the customer has now graduated

to exception handling.

And that's sort of one part of the journey. Right? So we start sending them

only the problem alerts that matter, and that's sort of where AI, ML, and all of these interesting technologies come into play.
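
One simple way to picture the move from 5,000 alerts to the 5 that matter is a severity ranking. The sketch below uses fixed, made-up weights as a stand-in for the AI/ML models mentioned above. The `Alert` type, the alert kinds, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    shipment_id: str
    kind: str
    detail: str


# Hypothetical severity weights; a real system would likely learn these.
SEVERITY = {
    "tamper": 100,
    "temperature_excursion": 90,
    "route_deviation": 60,
    "long_stop": 30,
    "near_destination": 5,
    "left_turn": 1,
}


def top_alerts(alerts, limit=5):
    """Return the `limit` most severe alerts; unknown kinds rank last."""
    ranked = sorted(alerts, key=lambda a: SEVERITY.get(a.kind, 0), reverse=True)
    return ranked[:limit]


alerts = [
    Alert("S1", "left_turn", "truck took a left turn"),
    Alert("S2", "tamper", "seal opened in transit"),
    Alert("S3", "long_stop", "stopped longer than scheduled"),
    Alert("S4", "near_destination", "50 miles from destination"),
]

for alert in top_alerts(alerts, limit=2):
    print(alert.shipment_id, alert.kind)
```

The low-severity alerts are still recorded in this sketch; they are simply not pushed to the customer.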

Then the customer starts thinking about, hey. You have given me 5 alerts,

but

you have given me as and when they occurred. Can you start predicting

some issues in my supply chain or disruptions?

So now because of, you know, our data collection being extremely savvy and granular,

we are able to predict

a spoilage risk, for example. We are able to predict the theft risk, able to predict the damage risk. We can do better ETA on when it's going to be delivered to a Best Buy store in Florida

much better than anybody else. So that's the next level of graduation

of the customer.

And the customer then starts saying to us, hey, I love the fact that you're now delivering some predictions.

Can you now integrate

with my planning system?

And it could be an ERP. It could be a warehouse management system. And if you do that, we would be able to get a visualization

of what was planned

and what is happening on the ground from Roambee. And now we are in a better position to compare

and start playing with some interesting knobs

in the business that we are in. Right? And it could be demand forecasting or it could be vendor managed inventory

or it could be as simple as customer satisfaction.

And then finally, the final state that our customers look for

is

having lived through all of this efficiency

data,

can I start modeling it? So, you know, you most times hear the buzzword called digital twins.

Can I basically build a digital twin on one or more processes within my supply chain

and feed that digital twin with Roambee-like sensor-based data?

And if I'm able to do that, I can model a much more efficient supply chain network.

So that's sort of how we graduate our customers from a very basic,

I'll tell you where things are, to actually be more efficient.

And our ambition

is to enable the autonomousness

in our customer supply chain.

It's easier said than done. Right? I mean, autonomous means self healing,

contextual,

dynamic.

And we think we are far ahead of the pack

because

we are using sensors

and driving some sensor intelligence

to enable various signals within the customer supply chain.

So that hopefully gives you some idea on how we are unpacking

the data journey for our customers.

Data engineers don't enjoy writing, maintaining, and modifying ETL pipelines all day

every day, especially once they realize that 90% of all major data sources like Google Analytics, Salesforce, AdWords,

Facebook, and spreadsheets are already available as plug and play connectors with reliable intuitive SaaS solutions.

Hevo Data is a highly reliable and intuitive data pipeline platform used by data engineers from over 40 countries to set up and run low-latency ELT pipelines with zero maintenance.

Boasting more than 150 out-of-the-box connectors that can be set up in minutes, Hevo also allows you to monitor and control your pipelines.

You get real time data flow visibility with fail safe mechanisms and alerts if anything breaks, preload transformations and auto schema mapping precisely control how data lands in your destination,

models and workflows to transform data for analytics, and reverse ETL capability to move the transformed data back to your business software to inspire timely action.

All of this plus its transparent pricing and 24/7 live support makes it consistently voted by users as the leader in the data pipeline category on review platforms like G2. Go to dataengineeringpodcast.com/hevodata today and sign up for a free 14-day trial that also comes with 24/7 support.

In terms of the

implementation of the Roambee platform,

I'm wondering if you can break it down into the

kind of sensor level capabilities and integrations that you're working with and the

data infrastructure and architectural elements that you're providing and some of the types of integrations that you have to build to be able to support the downstream use cases that your customers are looking to power?

Obviously, the first layer in the architecture is the hardware abstraction layer.

This is something we learned through a lot of failures. Right? You got fast throughput data coming from these sensors.

You need to really have, like, a limitless

messaging pipeline.

That is,

one, listening to this data

and acting on that data.

So how can you basically build this limitless message pipeline

that is processing messages at speed?

So we actually ended up, you know, our first bottom layer, I would call it as the hardware abstraction

layer because we can translate

any message. So even if it's a proprietary device, which is a non-Roambee sensor, we should be able to translate that message,

parse it, and push it into our pipeline so that we can process it very fast.

For that, we've been basically, you know, using,

you know, things like Kubernetes,

Kafka architecture,

you know, also

dockerizing our container based services at scale. Right?
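
To make the hardware abstraction idea concrete, here is a minimal sketch of translating messages from different device vendors into one common record before they enter a pipeline. The vendor names, wire formats, and the `Reading` type are invented for illustration, and the Kafka and Kubernetes pieces mentioned above are deliberately left out.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Reading:
    device_id: str
    lat: float
    lon: float
    temp_c: float


def parse_vendor_a(raw: bytes) -> Reading:
    # Hypothetical vendor A sends JSON with the temperature in Fahrenheit.
    msg = json.loads(raw)
    return Reading(msg["id"], msg["lat"], msg["lng"], (msg["temp_f"] - 32) * 5 / 9)


def parse_vendor_b(raw: bytes) -> Reading:
    # Hypothetical vendor B sends comma-separated text: id,lat,lon,temp_c
    device_id, lat, lon, temp_c = raw.decode("utf-8").split(",")
    return Reading(device_id, float(lat), float(lon), float(temp_c))


# Registry of translators; supporting a new device means adding one entry.
PARSERS: Dict[str, Callable[[bytes], Reading]] = {
    "vendor_a": parse_vendor_a,
    "vendor_b": parse_vendor_b,
}


def normalize(vendor: str, raw: bytes) -> Reading:
    """Translate a vendor-specific message into the common Reading format."""
    try:
        parser = PARSERS[vendor]
    except KeyError:
        raise ValueError(f"no parser registered for {vendor!r}")
    return parser(raw)


print(normalize("vendor_a", b'{"id": "A1", "lat": 1.0, "lng": 2.0, "temp_f": 41.0}'))
print(normalize("vendor_b", b"B7,31.2,121.5,4.5"))
```

Everything downstream of `normalize` only ever sees `Reading`, which is what lets the later layers stay independent of any particular device.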

Once you have collected,

translated,

extrapolated,

and indexed this data,

you're now trying to make sense of, are there any meaningful

events

that I can break out of it? And maybe the data is an event in itself,

or maybe the event has to be derived

out of multiple streams of data. And that's the decision making that sort of happens. I think about it like the eventing layer. Right?

And then you start translating

those events into signals.

And think about the signal could be, as I said, a demand shock signal. A signal could be, I delivered on time in full signal. A signal could be a delivery confirmation.

A signal could be

noncompliance

with that temperature threshold.

A signal could be your shipment is going in the wrong direction.

Right? So over a period of time,

we think we will build these 500, 600, 700

signals

that are derived from sensor and sensorless data.

And sensorless data could mean

weather data, it could mean traffic data, it could mean ocean data, it could mean just a variety of other signals that basically we can bring.

And maybe

we have 5, 6, 7 weak signals,

but the combination of 7 weak signals

could be a derived, very high-confidence

event that basically we can deliver to our customers.
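
The idea that several weak signals can add up to one high-confidence event can be illustrated with a noisy-OR combination. This is a textbook technique chosen here for illustration, not something the conversation attributes to Roambee, and it assumes the signals are independent, which real sensor and weather feeds often are not.

```python
def combined_confidence(confidences):
    """Noisy-OR: probability that at least one independent signal is correct."""
    p_none = 1.0
    for p in confidences:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each confidence must be between 0 and 1")
        p_none *= 1.0 - p
    return 1.0 - p_none


# Seven hypothetical weak signals, each only 30% convincing on its own.
weak_signals = [0.3] * 7
print(round(combined_confidence(weak_signals), 3))
```

Under the independence assumption, seven signals at 0.3 each combine to roughly 0.92, which shows why a derived event can be far more trustworthy than any single input.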

So that's sort of how I see this architecture play out. Our integration capabilities are extremely consumable

through webhooks.

You know, we think integration

should not take months. It should take days.

We have, you know, very classic standard plugins when it comes to SAP or Oracle or one of those planning systems that our customers have that basically can take advantage of it. So that's how our architecture is laid out. And obviously,

you know, we have some,

you know, applications, enterprise

applications on top of it.

We think it's best to deliver the flexibility to the customer when it comes to the dashboards. The visualization of this data can either reside in an SAP

kind of a system,

or it could be in some kind of a BI tool. Or if the customer wants to have it live within the Roambee ecosystem,

that would work as well. So that's how we basically

deploy our solution.

It's running on any PaaS because it's completely dockerized. So we run on Azure, the classic Amazon,

IBM Bluemix, and Red Hat. But there are also customers who want us to run on private cloud.

And having this architected in a manner that we can

scoop these containers and deploy it on any data center or bare

metal makes it easy from a DevOps perspective to manage this very well globally.

In terms of the

evolution of the

ecosystem around the types of sensors that you're working with, the formats that the

data packets are generated in from those different sensors,

the volume of shipment, the types of analysis

that companies are looking to perform on their supply chain logistics.

I'm wondering how those

shifting targets have influenced the way that you think about the kind of design and focus

of Roambee and where you have been

targeting your investment to be able to kind of stay ahead of the curve and be able to provide value to your customers

as their

capabilities

and goals are constantly in flux as well?

It applies to all silos or modules that bring this service together. So let's start with the sensor. Right? We think,

at scale,

the sensor

should be

easy to use and smart. Right? So our sensors basically today

are purposefully built for monitoring of these goods and assets. What I mean by that is the sensor is smart enough to know the journey has begun. The sensor is smart enough to know the journey has ended. The sensor is smart enough to know it's on land or air or ocean and possibly reconfigure itself properly. The sensor is smart enough to know, hey. I'm going to die. I have only x amount of battery life,

so give me a boost.

So that's the part of the sensing capability that's very smart on the sensor side. But when it comes to processing or decision making,

we can do some of it at the edge or we can push it to the cloud. And that's sort of where, you know, we bring that flexibility of that decision making.

The second part is basically starting to think about

once this data gets collected

into our cloud stack,

how this is going to be massaged or how it could be basically filtered through. And there are some very interesting industry-standard filters that are, you know, available that we use, but there are some proprietary ones as well. Right? And the fact is about learning.

Unfortunately,

these devices are not as powerful as our iPhones.

The iPhone you charge

daily,

these devices have to live for 90, 100 days on a single charge,

transmitting at

15, 20 minute interval or even 1 hour interval.

So the use cases are very different

when you start thinking about

mobility

in the supply chain. The other piece I also see is customers are having a very diverse,

you know, ecosystem,

IT ecosystem. Right? You got Wi Fi, you got Bluetooth, then you got

ERP, you got transportation management system, you got yard management system.

And the customer is basically now wanting to take advantage of some of the signals we deliver. For example,

we have a customer who said, Roambee, can you tell us when the shipments are 50 miles away from my destination

so that I can take that feed.

I can start

thinking about what is the open dock door that I can assign you to. So when you come in, you are not spending 7, 8 hours waiting for deliveries. You exactly know which gate, which dock door you are going to go. And by doing that, I'm compressing my transportation cost, for example.
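
A "50 miles from destination" trigger like the one described can be sketched with a great-circle distance check that fires once, on the reading where the shipment first crosses into the radius. The function names and coordinates are illustrative assumptions, and the haversine formula gives straight-line distance rather than road distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def crossed_into_radius(prev_pos, curr_pos, destination, radius_miles=50.0):
    """True only on the reading where the shipment first enters the radius."""
    was_outside = haversine_miles(*prev_pos, *destination) > radius_miles
    is_inside = haversine_miles(*curr_pos, *destination) <= radius_miles
    return was_outside and is_inside


# Hypothetical positions as (latitude, longitude) pairs.
destination = (0.0, 0.0)
print(crossed_into_radius((1.0, 0.0), (0.5, 0.0), destination))  # entering the radius
print(crossed_into_radius((0.5, 0.0), (0.4, 0.0), destination))  # already inside
```

Firing only on the crossing keeps the receiving workflow, such as dock door assignment, from being triggered again on every later reading.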

There are many, many

scenarios that our customers come up with. The ones we embrace are standard and repeatable.

The others are something that we empower our broader partner ecosystem to build on.

In that situation that you're giving an example of where your customer says, I want you to trigger an alert when you're within a particular range of the destination so that I can do some appropriate routing.

Is that a situation where then there would be bidirectional communication

from you to them and then them sending information back to you so that you can feed it into some other destination systems?

Or are you largely dealing with a 1 way feed of information and then they do the processing on their end to figure out what that routing looks like?

Yeah. I think 70, 80% of the time, while we can operate bidirectionally, it's one-way. So think about it. We are triggers to their workflows.

So these workflows are already defined.

We don't want to win the workflow business. We don't claim to displace any of them. But what we do and do very well is trigger this workflow. As an example,

right, I can only invoice if I deliver.

So, Roambee, can you tell

confidently

I have delivered?

Okay. So that's a trigger that I can give you. Another trigger you talked about, and I gave an example. Right? 50 miles away from destination.

That's a trigger. So these are interesting triggers that allow our customers to trigger some of those workflows

which have been built for many years

and which are being used by their field force for many, many years. So we don't want to change any of that. Right?

But then there are other triggers that basically we will enable the customers with. For example,

can you basically detect

there is a tamper?

And if there is a tamper, there is a completely different escalation process

that needs to be handled.

Now interestingly,

in the barcode world,

the tampering was never

detected or it was detected only at the end of the journey. So there was no escalation procedure to follow through. But now with Roambee,

the customers are basically reinventing themselves because now we are giving more triggers

that they could not even

get out of their existing infrastructure.

And that basically helps the customer,

you know, what we all call digitally transform themselves

to be more agile and more proactive

in some of their decision making through this complex supply chain.

Another interesting

element of the problem that you're describing is for the case where you wanna be able to trigger based on a particular

time or distance to destination,

is that that implies that you have to have information about what the destinations are, what the rate of travel is, and being able to incorporate some sources of information that aren't strictly tied to the specific asset or mode of transportation that you're dealing with. And so I'm wondering if you can talk to some of the platform capabilities that you've had to build in order to allow your customers to

provide you with some of those additional

values and some of the other sources of information that you're pulling in, whether it's things like weather data,

climate information, political events to be able to add in some of your predictive capabilities around, oh, okay, I can see that there is a traffic event that's happening. So there was an accident on the highway, so I'm going to let you know ahead of time that your shipment is going to be delayed by 3 hours. Just some of those other data feeds and sources of input that you're building to be able to have that kind of richer context around the alerting and triggering and predictive capabilities.

It was interesting. When we started about 5, 6 years ago,

we wanted our customers to enter the origin and destinations.

Okay? And they loved it, okay, because they were only doing 10 shipments.

And then 10 became a 100,

and they said, why don't you integrate with our back end and pull some data, the origin and destinations from that shipment?

And then a few months later,

you know, customers said,

even I don't know,

you know, until the last minute when the shipment is leaving

where it's destined to. Okay?

Or sometimes that information is fed in a very time delayed manner.

So

we don't expect our customers to give us the origin and destinations.

Okay. We are in the business of visibility, so we know when our device moves out of a location. We know,

through a lot of interesting ML and things like that, that that's been the origin

and the shipment has started.

And let's say from that origin,

based on the patterns that we have seen in the last 6, 7, 10 months,

the origin

to destination

pairs, let's say there are 8 origin-destination pairs.

So when it starts, there are 8

possible destinations the shipment could go. But as the journey starts

progressing,

we start basically shortlisting those origin destination pairs because we know

which are the possible destinations.

It's basically getting shortlisted to a point where now we have absolute clarity

through our technology

that the destination is 100%

point b and not point c. So this is all built into our system.

We're using some very interesting probability theories here to start

deciding

where this shipment is destined to.
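
The shortlisting described above can be sketched as a repeated Bayesian update over candidate destinations. The conversation only says that probability theory is involved, so the method, the three candidate destinations (instead of eight, for brevity), the historical shares, and the likelihood values below are all assumptions for illustration.

```python
def update_beliefs(beliefs, likelihoods):
    """One Bayesian update: posterior is prior times likelihood, renormalized."""
    unnormalized = {dest: beliefs[dest] * likelihoods.get(dest, 0.0) for dest in beliefs}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation is impossible under every candidate")
    return {dest: p / total for dest, p in unnormalized.items()}


# Prior from hypothetical history: share of past shipments from this origin.
beliefs = {"B": 0.5, "C": 0.3, "D": 0.2}

# Hypothetical likelihood of each observed position given each destination,
# for example how well the position fits the usual route to that destination.
observations = [
    {"B": 0.9, "C": 0.1, "D": 0.1},
    {"B": 0.9, "C": 0.1, "D": 0.1},
]

for likelihoods in observations:
    beliefs = update_beliefs(beliefs, likelihoods)
    print({dest: round(p, 3) for dest, p in beliefs.items()})
```

Each new position reading sharpens the estimate, and a later feed from the ERP can be used to check the prediction against the planned destination.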

So that's one. And second is verification. Right? So once we basically make some of these estimations, at some point of time when the data feed of origin and destinations comes from the ERP into our system, we are able to verify, self-correct, and make it more automated.
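To make the shortlisting idea concrete, here is a minimal sketch in Python. It is an illustration only, not Roambee's system: it assumes the candidate destinations and their historical trip counts are already known, and it uses a deliberately simple Bayesian update in which a candidate gains weight whenever a new GPS ping is closer to it than the previous ping was. The function names, the `p_closer` parameter, and the coordinates in the usage note are all hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def priors_from_history(trip_counts):
    """Turn historical trip counts per destination into prior probabilities."""
    total = sum(trip_counts.values())
    return {dest: n / total for dest, n in trip_counts.items()}


def update_beliefs(beliefs, candidates, prev_ping, ping, p_closer=0.9):
    """One Bayesian update from a pair of consecutive pings.

    Assumed likelihood model: a candidate the shipment moved toward gets
    likelihood p_closer, one it moved away from gets 1 - p_closer.
    """
    posterior = {}
    for dest, prior in beliefs.items():
        before = haversine_km(prev_ping, candidates[dest])
        after = haversine_km(ping, candidates[dest])
        posterior[dest] = prior * (p_closer if after < before else 1 - p_closer)
    total = sum(posterior.values())
    return {dest: p / total for dest, p in posterior.items()}


def verify_against_erp(beliefs, erp_destination):
    """Check the top inferred destination against the late-arriving ERP feed."""
    return max(beliefs, key=beliefs.get) == erp_destination
```

Calling `update_beliefs` once per new ping narrows the distribution as the journey progresses, and `verify_against_erp` stands in for the self-correction step once the ERP data arrives. A production system would need a richer likelihood (road network, mode of transport, dwell patterns) than this distance check.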

Mind you, we are basically talking about not 1 to 5 shipments. We are talking about 10,000 shipments. We are talking about, you know, we worked with a very interesting pharmacy company that basically delivers cancer medication. This medication, from the time it was developed, needs to be delivered to the patient within 8 hours. So there is absolutely no room for error. You basically make sure that your ETA is very, very, very, very calibrated, and it's within minutes of what was forecasted. So those are the kind of applications we basically are a very strong fit on.

Now, from a data perspective, right, most companies think that this is a very big data play. IoT is generating tons and tons of data, and we need to put in all these big data lakes and things like that. But it's interesting. After we started looking at and consuming this data, we found that data has its own expiry dates.

Okay. So for example, you know, the sensor said the temperature has gone up by 1 degree. Now that 1 degree exception or excursion has probably a life of 2 hours: either you act within 2 hours or you don't. But after that, that data is meaningless. So once you start, you know, putting some expiry dates around the various kinds of data streams you get, and then you start deriving some of those contextual signals, you're dealing with a small pool of data for our customers. So that's how we basically view the world of data, IoT data, and other data streams in our ecosystem.
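The "expiry date" idea can be sketched as a time-to-live filter over incoming readings. This is a minimal illustration under assumptions, not a description of Roambee's pipeline: the `Reading` type, the `TTL` table, and the signal names are hypothetical, and only the 2-hour life of a temperature excursion comes from the conversation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical time-to-live per signal type. The 2-hour figure for a
# temperature excursion is from the conversation; "shock" is a placeholder.
TTL = {
    "temperature_excursion": timedelta(hours=2),
    "shock": timedelta(minutes=30),
}


@dataclass(frozen=True)
class Reading:
    kind: str
    observed_at: datetime
    value: float


def is_live(reading, now):
    """True while the reading is still within its actionable window."""
    ttl = TTL.get(reading.kind)
    # Kinds with no expiry rule are kept rather than silently dropped.
    return ttl is None or now - reading.observed_at <= ttl


def live_readings(readings, now):
    """Keep only readings that have not expired as of `now`."""
    return [r for r in readings if is_live(r, now)]
```

Filtering on ingestion or on query with a rule like this is one way the working set stays small even when the raw sensor stream is large.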

In addition to that, you also have capabilities for customers to be able to build and provide their own integrations for different downstream data systems that they wanna be able to work with. In addition to the sensor based capabilities for adding context to a shipment, there are options for user generated or user input data. And I'm wondering how you manage some of the quality controls around those kinds of destination integrations and some of the user inputs that are outside of the control and ownership of Roambee?

It's hard. And it starts with the physical addresses themselves. Okay? So we are working with a chemical company, and the chemical company said, here are the 60,000 physical addresses that we deliver to. Now when you consume those 60,000 physical addresses, they're all wrong. Right? They are missing a digit or they are in a place that's not identified by the map. So now you need to start cleaning this data, because this data is so important, because you've got to geofence this data. And geofencing in our business is very key. If you put the right geofence, you can get a lot of accurate events. But if your physical address is here, and Google Maps or any of the other mapping technologies says down the street, and my sensor is going in a third direction entirely, which one is accurate?

So it is very easy to manually move these dots and say, this is my new dot and this is my new physical address. But you're dealing with 6, 7 million records. How can you do that manually? So you almost need an engine that basically starts looking at, actually, where did the sensor go, and where did the sensor go not 1 time but 10 times? What does the map lookup look like? What is the address that was given? And combine that into one single source of truth. Okay. And to do that a million times, over and over again.
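One way to picture such an engine is a rule that trusts repeated sensor evidence over a single map lookup. The sketch below is an assumption-laden illustration, not Roambee's method: `resolve_location` and its arguments are hypothetical names, the per-coordinate median is a simple stand-in for whatever robust estimator a real system would use, and the threshold of 10 visits echoes the "not 1 time but 10 times" remark.

```python
from statistics import median


def resolve_location(geocoded, sensor_stops, min_visits=10):
    """Pick a single (lat, lon) for a delivery address.

    geocoded:     (lat, lon) from a map lookup of the customer's address,
                  or None if the lookup failed.
    sensor_stops: list of (lat, lon) points where trackers actually came
                  to rest on past deliveries to this address.
    Returns ((lat, lon) or None, source_label).
    """
    if len(sensor_stops) >= min_visits:
        # Enough repeat visits: the median is robust to a few stray pings.
        lat = median(p[0] for p in sensor_stops)
        lon = median(p[1] for p in sensor_stops)
        return (lat, lon), "sensor"
    if geocoded is not None:
        # Too little sensor evidence yet: fall back to the map lookup.
        return geocoded, "geocoder"
    return None, "unresolved"
```

Returning the source label alongside the point makes it possible to re-run the resolution later, once an address has accumulated enough visits to override the geocoder.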

And then applying geofencing automatically. Right? So the entire world of tracking today is, you know, drawing these orthogonal lines or boundaries around the geofence you want. Geofencing is a nontrivial problem. How do you apply geofences on 60,000 addresses automatically? Okay? And geofencing is also only as good as the signal strength of your tracker. So if you make the geofence too small, okay, your tracker might give you a false positive, saying it is in, and sometimes it is not in. Right? Even though the physical goods might still be within a warehouse. So your geofencing has to be elastic, and that's sort of where ML comes in. Right? So you start with a bigger geofence because you want to capture all of the dots that are there in that region, and you expand and contract until the geofence finds its right shape and size. And to do that, you know, a few million times over without human intervention is again nontrivial.
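The expand-and-contract behaviour can be illustrated with a much simpler stand-in than the ML described here. The sketch below assumes a circular geofence around a known center and a list of distances (in meters) from that center to pings known to belong to the site; `fit_radius` and all of its parameters are hypothetical. It grows the radius until a target share of pings is covered, then shrinks it for as long as that coverage holds.

```python
def fit_radius(distances_m, start_m=1000.0, target=0.95, shrink=0.9, min_m=25.0):
    """Fit a circular geofence radius to observed ping distances.

    distances_m: distances in meters from the fence center to pings that
                 are known to belong to this site.
    target:      share of pings the fence must contain (0 < target <= 1).
    shrink:      multiplicative step (0 < shrink < 1) used to contract;
                 expansion divides by the same factor.
    min_m:       floor that keeps the fence above the tracker's noise level.
    """
    if not distances_m:
        return start_m  # nothing observed yet: keep the generous default

    def coverage(radius):
        return sum(d <= radius for d in distances_m) / len(distances_m)

    radius = start_m
    # Expand while too many pings fall outside the fence.
    while coverage(radius) < target:
        radius /= shrink
    # Contract while a smaller fence would still cover enough pings.
    while radius * shrink >= min_m and coverage(radius * shrink) >= target:
        radius *= shrink
    return radius
```

Leaving `target` below 1.0 lets the fence ignore a few outlier pings, and `min_m` reflects the point made above that a fence smaller than the tracker's location error produces spurious in/out events.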

So that's how we basically up our game when it comes to location accuracy. We up our game in terms of ensuring that when we say the goods and assets are within a warehouse or within a local address, we have a high level of confidence in that information that we deliver to the customers. So that's how we basically do that. We, of course, integrate with any of the third party tools, whether it's a planning tool of the customer, or sometimes it's also integrated with third party cloud analytics tools that are available to the customer.

Prefect is the data flow automation platform for the modern data stack, empowering data practitioners to build, run, and monitor robust pipelines at scale.

Guided by the principle that the orchestrator shouldn't get in your way, Prefect is the only tool of its kind to offer the flexibility to write workflows as code.

Prefect specializes in gluing together the disparate pieces of a pipeline and integrating with modern distributed compute libraries to bring power where you need it, when you need it. Trusted by thousands of organizations and supported by over 20,000 community members, Prefect powers over 100,000,000 business critical tasks a month.

For more information on Prefect, go to dataengineeringpodcast.com/prefect today. That's prefect.

As far as the onboarding process, supply chain logistics is a very complex and interdependent area and requires a lot of moving pieces and integrations to be able to fully realize the potential for a solution like Roambee or being able to understand what is the actual impact on my business and my shipment capabilities and my, you know, business logistics. And so I'm curious how you think about the overall onboarding process of, you know, doing a proof of concept with a customer, figuring out what is the actual targeted use case that we want to invest in, and being able to identify, is this actually going to be useful for this user? Do they have the appropriate capabilities on their end to be able to take advantage of the capabilities that we're providing? And then being able to demonstrate the overall utility of Roambee in the shortest time possible?

From our service perspective, a lot of the use case and product-solution fit happens during the sales process. So we know exactly what is going on. For example, you know, we are working with one of the largest retailers in Europe, who basically used to move products by truck. And now, with the fuel cost gone up and the truck journey becoming unpredictable, they are moving to railcars. Now when you move to railcars, yeah, you get a much more predictable ETA. But railcars, you know, if a railcar is misplaced, you're talking about loads of misplacement. Right?

So, there are a few things that we do and do very well on the onboarding side, which is a combination of making the solution very simple. So it starts again with the hardware. Right? The hardware is always on. So when I ship this hardware to the customer, it is preconfigured. You know, all of the black magic that goes into the hardware is all ready to go for that use case. The customer doesn't even have to push a button to turn it on. Because imagine, you've got a warehouse operator who wants to stick this on their pallet, and they wanna do that 10,000 times. You don't want to be at the mercy of the operator to know if the button was turned on or off on the device. So our devices are always on. You can only shut them off with a hammer or by throwing them in water. So that's sort of an example of simplicity. Right? So we bring a lot of simplicity in our technology, in our solution.

Then the second part is the people. Right? So part of the onboarding exercise is to train the people. And for training the people, you know, we have a Roambee University. We sort of, you know, make this more certification driven. So we will basically bring some personas into a virtual environment and train them and certify them. So that's the second part.

And the third part is process. Supply chain, through my experience, is all about SOPs. Okay. Can you give me a Word document with these 10 steps that I have to do all day long? And that's what we have. So we have invested in a lot of technology that allows us to build these SOPs out for the customers, that we can give to them and make a part of their process. And if you do these 3 things very well, there's very little friction when it comes to customers embracing this new solution and realizing the benefits from the deployment.

In your experience of building Roambee and working with your customers and working in this space of supply chain management, particularly in this, shall we say, interesting time of the past couple of years with the pandemic and the changes in how workers are orienting themselves around how they think about the value and the types of jobs that they want to do, which introduces challenges around staffing levels, what are some of the most interesting or innovative or unexpected ways that you've seen Roambee used to be able to navigate this time?

From an internal perspective, we also got hammered by the short supply of semiconductors. Right? So we had to innovate our own supply chain in making the devices. The demand is so huge right now that we are not able to, you know, deliver these services to our customers fast enough. So we had to take parts of our own device manufacturing supply chain into our own hands, and that sort of was one interesting exercise.

The second thing that we learned from the customers is that every use case is very different. So how do you basically derive what parts of these use cases can be repeatable and either bring them into a process or bring them into a technology for automation? And it could be as simple as affixing this device. Okay. There are probably 10 different ways of doing this. Okay. You talk to the customer. So earlier, we all became very anxious because every customer wants to apply this device in a very different fashion. And then we started applying science behind it. Right? What are all the possible ways that could happen? So if your shipment is circular, well, how do you deploy it? If it's a rectangle, how do you deploy it? If it's a wooden crate, how do you deploy it? If it's a container? So really making it a cookie cutter approach was something that we benefited from by early access to these customers, by doing a lot of field visits and understanding the customer's problem statement. Right?

And lastly, in the very unique ways the customer is using this data, sometimes the customer is using our data to really rate transporter behavior and transporter performance, which we didn't think was the value that we were delivering. Then there are some customers who are rating route quality. So if I were to ship this on the 280 versus if I ship this on Highway 101, you know, what is the quality of these routes? So can I score these routes? And based on the product, can I pick one route or the other? It is okay to have a longer route, but a better route. Right? So that's something that we didn't realize we were built for, but the customers are using it.

Then the customer is also wanting to look at load and unload times. Right? So in our technology, when your shipment enters a geofence, we deem it as delivered. And the customer said, that's not good enough. It may be in a geofence, but it's still waiting to be unloaded. So can you basically give me a dwell time between unload and load? And that's sort of where our technology basically pushed the envelope based on the customer ask. Right? So this is some very interesting feedback that we got from the customers and brought back into our onboarding process or into our technology and application to make it just super simple. Right?
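As a rough illustration of the dwell-time request, the sketch below pairs geofence enter and exit events and reports how long each visit lasted. It is a simplification under stated assumptions: the event tuple layout and the `dwell_times` name are hypothetical, and time spent inside the fence is used as a proxy for the unload wait, since separating waiting from actual unloading would need extra signals that are not described here.

```python
from datetime import datetime, timedelta


def dwell_times(events):
    """Compute the duration of each completed geofence visit.

    events: iterable of (timestamp, geofence_id, kind) tuples sorted by
            time, where kind is "enter" or "exit".
    Returns a list of (geofence_id, timedelta), one per completed visit.
    """
    entered = {}  # geofence_id -> timestamp of the currently open visit
    visits = []
    for ts, fence, kind in events:
        if kind == "enter":
            entered[fence] = ts
        elif kind == "exit" and fence in entered:
            visits.append((fence, ts - entered.pop(fence)))
    return visits


if __name__ == "__main__":
    sample = [
        (datetime(2022, 6, 1, 9, 0), "warehouse-A", "enter"),
        (datetime(2022, 6, 1, 11, 30), "warehouse-A", "exit"),
    ]
    for fence, dwell in dwell_times(sample):
        print(fence, dwell)
```

Exits with no matching entry are ignored, and a visit that has not ended yet produces no row, so a shipment still waiting inside a fence would have to be reported separately.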

Our customers constantly tell us, if you deliver value with no mouse clicks, that's the biggest benefit. But if you want to, you know, have me click 1, 2, 3 times in a warehouse that gets really mad after 3 o'clock in the afternoon, we might have visibility value, but it's not helping the field force that's actually doing this.

In your own experience of working at Roambee and working in this space, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?

There are quite a few challenging lessons. You know, because our entire business hinges on the sensors, the sensors have to be certified by country. Every country has its own laws. It has to work on almost any type of communication medium, and cell tower infrastructure is very different in different parts of the world. So just dealing with this, we are a technology company. We were not prepared to deal with a lot of this paperwork. So, you know, obviously, that's sort of one challenge and one surprise that came with running the business.

The second was, basically, how do you charge the customers? Right? So, obviously, our charging model is very much like a talk time model. So, you know, my family is on a 5,000 minute talk time plan. Whether I use it or not, I pay for 5,000 minutes. And that's sort of how we do this for our customers. But I think there is a better way, which is usage based. Right? So we are embarking on some ideas, working with the customer, to make it usage based.

And lastly is, can we basically bring the cost of the solution down so that it is on all packages and all shipments? Obviously, you know, we can't be a chip company overnight to build a system on a chip. But we feel that if the industry can go in the direction of a system on a chip, plus an antenna, plus a battery, all combined together in a disposable form factor for sub $1, I think there is no reason why every box shouldn't have this kind of technology. I call that a challenge, but I also call that an opportunity for the industry in general to rally behind the transformation of supply chain.

For people who are running a business that has significant supply chain requirements or who are trying to gain better visibility into the supply chain that they have, what are the cases where Roambee is the wrong choice?

I think Roambee is the wrong choice when you're looking to monitor vehicles and not shipments. So most times, we basically get compared with a Fleetmatics solution. Right? They say, hey, but that vendor is offering a $10 OBD-II device on a truck. We are more granular. So if you are looking to monitor a shipment within a truck, within a container, within a plane, within a rail car, we are it. Right? But if you're looking to just track the vehicle itself or the carrier itself, then we are not the one.

Second is basically, if you are looking for high volume monitoring, we are the right company. So most times, you know, our customers will come and say, I've got a problem lane from, I don't know, California to Rhode Island, and I only want to do that lane. And I do 10 shipments on that lane. And that's sort of what my universe looks like. We are not the right guys. We might fail in very small implementations, but we are classically built for, and thrive on, chaotic multi-tier supply chains. You know, there's a lot of movement, a lot of volume of shipments moving between various nodes.

And the third part is last mile. Right? So there are a lot of last mile delivery solutions. If the last mile delivery solution does not have a requirement for condition monitoring, we are probably not the right company to address that use case. You are better off, you know, having solutions that monitor the driver through a driver app and combine that with telematics of the truck. And by doing that, you can get that last mile visibility at a much cheaper cost than Roambee.

As you continue to build out Roambee's capabilities and evolve the platform, evolve the company, and continue to explore this drastically evolving space of logistics and supply chain management and supply chain analysis, what are some of the things you have planned for the near to medium term or any projects that you're particularly excited to engage with?

I think we have 2 ambitions. One, in the short term, is addressing spoilage in a big way. So we recently won an award from the FDA. Our solution is not only low cost, but actually has a lot of promise when it comes to farm to fork kind of implementations. And part of that is also about choosing the right packaging for moving these products. And because we monitor a lot of things around the package, not only its health and its handling, but also its temperature, humidity, its tilt and shock and tamper, and many of these parameters, we think we are the company most suited to deliver a packaging recommendation for our customers.

And if the right packaging recommendation can take 50% of the risk of spoilage off the table, then the remaining 50% is in transit. And if we can solve that problem through granular visibility and real time intelligence, I think we can solve some of the glaring problems in the industry, not only just on the food, fresh produce side, but also, you know, on the automotive side, on the retail side, on the pharmaceutical side.

The second ambition, which is a little bit long term: our thesis is that the supply chain of tomorrow, and when I say tomorrow, it could take 3, 5, 7 years, is going to be autonomous. And we are seeing a lot of early steps being taken by academia, by companies like us, by large companies like the SAPs to go in that direction. And we feel that Roambee is uniquely positioned to enable the autonomousness in the supply chain through a portfolio of these real time signals that we can derive from the movement of goods and assets, and translate those signals to bring that autonomousness.

So I feel we are just basically scratching the surface right now, and we are quite excited about where we are and where we are positioned in the tech stack. Obviously, there is a lot of learning. So we always reach out to, you know, industry veterans. We have a customer advisory board. We also bring a lot of, you know, advising teams in place from time to time because the last thing we wanna do is step on a landmine without knowing we are going south. So a lot of humility, a lot of transparency as a culture in the company to learn and be hungry and to commercialize some of the technologies for our customers.

Are there any other aspects of the work that you're doing at Roambee or the overall space of supply chain analytics and logistics that we didn't discuss yet that you'd like to cover before we close out the show?

The one thing that, you know, we are watching in this category, in this space, is satellite communication.

All of the IoT technology today works on cellular. Is satellite communication the next kind of technology upgrade for a lot of IoT sensing? The second part that we are also watching very closely is energy harvesting. At some point, can we basically bring perpetual energy to some of these devices? And maybe it's not applicable to the Roambee use case. But when you start talking about disaster management and things like that, you almost need to have, you know, some form of perpetual energy in your IoT stack.

And then lastly is, can we basically take data that we have collected and that many others have collected, in a federated manner? And can we start thinking about the impact of an event and how it's going to propagate in the network? So if you have, you know, 3 ships of avocados stuck at the Suez Canal, can you basically take that event and look at the impact on pricing, look at the impact on just the availability of avocados in one or many regions around the world? So I think there's quite a few things that are unsolved, but I think, you know, the industry generally is quite excited. And these problems are coming up front, which never used to happen. These problems are coming to us in a much more accelerated fashion because of the broken supply chain ecosystem we live in today. And, you know, COVID has just accelerated some of that brokenness.

Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you and your team are doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.

I think it's cleansing of data.

A lot of companies have done a lot of work on deriving value from the data. But I think the data cleansing, data scrubbing part of this chain is underserved. We wish we could have more technology looking at that. And lastly, simulating. Right? So one is cleaning. And if you've cleaned it well, can you basically, you know, multiply it a few more times so that you can use that data for simulation? I think these 2 pieces would really solve sort of the, you know, pain points that data engineers and data scientists are, you know, impacted with.

Thank you very much for taking the time today to join me and share the work that you and your team at Roambee are doing on being able to bring better visibility and autonomy to supply chain and logistics. It's definitely a very important and increasingly complex area. I appreciate the efforts that you're making to make it a more tractable and maintainable problem. So thank you again for taking the time today to join me, and I hope you have a good rest of your day.

Thanks, Tobias. Bye bye.

Don't forget to check out our other shows, Podcast.__init__, which covers the Python language, its community, and the innovative ways it is being used, and the Machine Learning Podcast,

which helps you go from idea to production with machine learning. Visit the site at dataengineeringpodcast.com

product from the show, then tell us about

it.

Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
