Summary
Industrial applications are one of the primary adopters of Internet of Things (IoT) technologies, with business critical operations being informed by data collected across a fleet of sensors. Vopak is a business that manages storage and distribution of a variety of liquids that are critical to the modern world, and they have recently launched a new platform to gain more utility from their industrial sensors. In this episode Mário Pereira shares the system design that he and his team have developed for collecting and analyzing sensor data, and how they have split the data processing and business logic responsibilities between physical terminals and edge locations, and centralized storage and compute.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription
- So now your modern data stack is set up. How is everyone going to find the data they need, and understand it? Select Star is a data discovery platform that automatically analyzes & documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who’s using it in the company, and how they’re using it, all the way down to the SQL queries. Best of all, it’s simple to set up, and easy for both engineering and operations teams to use. With Select Star’s data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets. Try it out for free and double the length of your free trial today at dataengineeringpodcast.com/selectstar. You’ll also get a swag package when you continue on a paid plan.
- Your host is Tobias Macey and today I’m interviewing Mário Pereira about building a data management system for globally distributed IoT sensors at Vopak
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Vopak is and what kinds of information you rely on to power the business?
- What kinds of sensors and edge devices are you using?
- What kinds of consistency or variance do you have between sensors across your locations?
- How much computing power and storage space do you place at the edge?
- What level of pre-processing/filtering is being done at the edge and how do you decide what information needs to be centralized?
- What are some examples of decision-making that happens at the edge?
- Can you describe the platform architecture that you have built for collecting and processing sensor data?
- What was your process for selecting and evaluating the various components?
- How much tolerance do you have for missed messages/dropped data?
- How long are your data retention periods and what are the factors that influence that policy?
- What are some statistics related to the volume, variety, and velocity of your data?
- What are the end-to-end latency requirements for different segments of your data?
- What kinds of analysis are you performing on the collected data?
- What are some of the potential ramifications of failures in your system? (e.g. spills, explosions, spoilage, contamination, revenue loss, etc.)
- What are some of the scaling issues that you have experienced as you brought your system online?
- How have you been managing the decision making prior to implementing these technology solutions?
- What are the new capabilities and business processes that are enabled by this new platform?
- What are the most interesting, innovative, or unexpected ways that you have seen your data capabilities applied?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on building an IoT collection and aggregation platform at global scale?
- What do you have planned for the future of your IoT system?
Contact Info
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don’t forget to check out our other show, Podcast.__init__ to learn about the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers
Links
- Vopak
- Swinging Door Compression Algorithm
- IoT Greengrass
- OPC UA IoT protocol
- MongoDB
- AWS Kinesis
- AWS Batch
- AWS IoT SiteWise Edge
- Boston Dynamics
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Have you ever woken up to a crisis because a number on a dashboard is broken and no one knows why? Or sent out frustrating Slack messages trying to find the right dataset? Or tried to understand what a column name means? Our friends at Atlan started out as a data team themselves and faced all this collaboration chaos. They started building Atlan as an internal tool for themselves. Atlan is a collaborative workspace for data driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets and code, Atlan enables teams to create a single source of truth for all of their data assets and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker, and more.
Go to dataengineeringpodcast.com/atlan today. That's a t l a n, and sign up for a free trial. If you're a data engineering podcast listener, you get credits worth $3,000 on an annual subscription. When you're ready to build your next pipeline and want to test out the projects you hear about on the show, you'll need somewhere to deploy it. So check out our friends over at Linode. With their managed Kubernetes platform, it's now even easier to deploy and scale your workflows or try out the latest Helm charts from tools like Pulsar, Pachyderm, and Dagster. With simple pricing, fast networking, object storage, and worldwide data centers, you've got everything you need to run a bulletproof data platform.
Go to dataengineeringpodcast.com/linode today. That's l i n o d e, and get a $100 credit to try out a Kubernetes cluster of your own. And don't forget to thank them for their continued support of this show. Your host is Tobias Macey. And today, I'm interviewing Mário Pereira about building a data management system for globally distributed IoT sensors at Vopak. So, Mário, can you start by introducing yourself?
[00:02:07] Unknown:
Hi. First, thank you for having me. So my name is Mário. I'm from Portugal, but I'm currently living in the Netherlands. I work as a senior engineer at Vopak.
[00:02:16] Unknown:
And do you remember how you first got started working in data?
[00:02:19] Unknown:
So first, I started in tech startups. I worked at OutSystems, it's a low-code platform, and then I worked at MessageBird, it's a Twilio competitor. And I worked mostly on the cloud data infrastructure part, and now I've started working on the edge data infrastructure. So it's around 10 years that I've been doing this, but mostly using cloud technology.
[00:02:44] Unknown:
So the company that you're at now is Vopac, and I'm wondering if you can just describe a bit about what that business is and the types of information that you rely on to be able to power the business.
[00:02:55] Unknown:
So Vopak is the leading independent liquid storage company. It's a Dutch company, 1 of the oldest Dutch companies; it's around 400 years old. The business, what they do: Vopak stores vital products, so everything that is vital for mankind. It could be oil and gas, industrial chemicals. Currently, Vopak has 47, sorry, 73 locations. When I say locations, it's plants, facilities, terminals, which store the oil and gas, in 23 countries. The information that powers the business is the status of the liquid inside of the tanks, the flow meters, the state of the pumps, the trucks, the information that is inside of the truck.
So all this information that is on-site powers the operational part of the business, the supply chain of the business, and maintains the operability of the business itself, the facility itself. So what we do at Vopak, and I will explain later on, is how we ingest this data and transform this data at the edge in a way that powers the business.
[00:04:07] Unknown:
And you mentioned that you've got a few different categories of locations and different types of information that you're collecting. I'm wondering if you can talk through the different types of sensors and edge devices that you're relying on, and what types of consistency you might have within a particular category of location, and the degree of variance that you're dealing with across your overall fleet of sensors, and the sort of matrix of complexity that you're dealing with?
[00:04:33] Unknown:
So we have 73 locations. Currently live in production, we are at almost 20, 18 right now. And every location is different, totally different, with different sensors, different edge devices. So currently, what we do at the edge, we ingest the data from flow meter sensors, so industrial sensors, not your normal sensors. Flow meters, so we know the speed of the liquid; temperature; the level of the tank; weighbridges, so we can weigh the trucks; license plate readers, so we can ingest the data from license plates.
This is funny: from drones. We have drones that inspect the tanks. We have robots that inspect the tanks, cameras, in this case image cameras. Everything that has a data point, everything that has a sensor, we grab it, we ingest it. And then this data is processed at the edge before sending to the cloud, regardless of the variance. What happens is that, for example, we have a tank, and we are pumping the liquid inside of the tank. You don't want to send all data to the cloud. Maybe you only want when it starts, at the middle, then at the end, or you need to let it get stable, flat, right? In this case, we use what they call a dead band, and on some other sensors, we use what they call the swinging door compression algorithm.
What happens is that this algorithm cuts values, to some degree, that are in the middle and which you don't need. So, for example, we are pumping, and the value is constantly changing. We only keep some of these values, the values that really change, because it can go to a value of 6 digits, and you don't want all of this big value, so we cut it. This is all done at the edge. What we also do at the edge with the data is something like this, but it's complex event processing: based on some data change, we trigger another change.
So, for example, if a truck arrives at a weighbridge, automatically we get the license plate, and, automatically, we get the weight of the truck. And then, based on the billing, we are able to open the gate or not for the truck to come in. So there is a lot that we do at the edge before sending to the cloud.
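To make the dead band and swinging door ideas concrete, here is a minimal Python sketch of both filters. This is an illustrative reading of the techniques as described, not Vopak's actual edge code; the `(timestamp, value)` tuple format, parameter names, and thresholds are assumptions.

```python
def deadband(samples, band):
    """Yield a sample only when it moves more than `band` from the last kept value."""
    last = None
    for t, v in samples:
        if last is None or abs(v - last) > band:
            last = v
            yield t, v

def swinging_door(samples, dev):
    """Swinging door trending: keep only the points needed to reconstruct
    the series within +/- dev. Minimal sketch; assumes strictly increasing
    timestamps, and a real implementation would also flush the final point."""
    samples = iter(samples)
    archived = next(samples, None)   # last point we decided to keep
    if archived is None:
        return
    yield archived
    prev = archived
    slope_hi, slope_lo = float("inf"), float("-inf")  # the two "doors"
    for t, v in samples:
        dt = t - archived[0]
        slope_hi = min(slope_hi, (v + dev - archived[1]) / dt)
        slope_lo = max(slope_lo, (v - dev - archived[1]) / dt)
        if slope_lo > slope_hi:      # doors crossed: previous point must be kept
            yield prev
            archived = prev
            dt = t - archived[0]
            slope_hi = (v + dev - archived[1]) / dt
            slope_lo = (v - dev - archived[1]) / dt
        prev = (t, v)
```

For a steadily changing tank level, for example, both filters drop most of the intermediate readings and keep only the points where the trend actually changes.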
[00:07:16] Unknown:
As far as the selection of sensors and the deployment of sensors, I'm wondering how much control or oversight you have as the person who's responsible for dealing with the data that they're collecting versus just having to deal with accepting whatever the terminal manager happens to decide is the most economical or, you know, best option for them to be able to install and maintain in their physical locations?
[00:07:43] Unknown:
In this case, it really depends on which sensor it is, where it is, and which type of terminal. For example, and this is public, we recently invested in a company that provides the sensors, industrial sensors, and the reason is because in the industrial field, in our field, a lot of the sensors need to have specific certifications, for example the ATEX certification, because these are places which have hazardous conditions. The majority of the time, as the edge team, what we ask, where we have some say, is that we would like to have a sensor that connects to a DCS. And then from the DCS, we would like to have the data from that sensor.
DCS, it's a common term from the industrial world; it's a distributed control system. Sometimes we're able to say, hey, if this is a new terminal, bear in mind... but changing these sensors is not something that you do every day, okay? You are not gonna build terminals every month. We are talking about facilities that are 30, 40 years old, and so when you build it, you build it to last. Some of our terminals are, like, a 100 years old, you know? Of course, you constantly upgrade your sensors and your terminals, but it's not something that you upgrade or change every day. Yeah, regarding the selection of the sensors, that's what we do in this case.
[00:09:14] Unknown:
And then as far as the general capacity for computation and storage, both within the sensors and at the edge locations where those sensors are operating, I'm wondering if you can talk, at least in terms of orders of magnitude, about what kind of compute and storage and processing capacity you have to be able to push more of your logic to that edge location so that it doesn't all have to be centralized.
[00:09:38] Unknown:
Currently, in general, we are processing 65,000,000 events per day at the edge, and in the cloud, we are storing around 8,000,000,000 events. For the computing of this, I can already tell you, we are using IoT Greengrass from Amazon, and IoT Greengrass from Amazon allows us to run Lambdas, which you'd run in the cloud, on prem, inside of a Raspberry Pi. So for the computing power of this, you don't need a really big machine, okay? You can run this on a Raspberry Pi. From a computing power perspective, it's a very low spec computer that we have there. Currently, we run it on a physical server in a rack, and that's because of compliance.
Once again, because of the ATEX certification, I cannot just put the server or the Raspberry Pi inside of a tank or near a tank, because, you know, it can explode. So it's a very low spec computer which processes all that information. And this is because we use AWS Lambda at the edge, so on-site in this case, because we use AWS IoT Greengrass.
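As a rough illustration of what running Lambdas at the edge looks like, here is a minimal Greengrass v1 style Python handler; a sketch under assumptions, not Vopak's code. The topic name and payload shape are hypothetical, and `greengrasssdk` is the SDK Greengrass exposes to Lambdas for publishing to the local core.

```python
import json

import greengrasssdk

# Client for the local Greengrass core's message router.
iot = greengrasssdk.client("iot-data")

def handler(event, context):
    # Greengrass hands JSON payloads to Python Lambdas already deserialized;
    # guard anyway in case a raw string arrives.
    reading = json.loads(event) if isinstance(event, str) else event
    # ... dead-band filtering / enrichment would happen here ...
    iot.publish(topic="terminal/normalized", payload=json.dumps(reading))
```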
[00:10:48] Unknown:
As far as the processing that you're doing at the edge with that Greengrass service, what are some of the types of filtering and preprocessing and logic that you are pushing out to that edge location in order to determine what events and what aggregate data you actually want to centralize?
[00:11:06] Unknown:
To give a bit more information on that question: in order to ingest the data, first, we use OPC UA. OPC UA is a machine-to-machine industrial protocol. So what we do, we connect to an OPC UA server, we have a client, and we ingest this data. The data that comes in is very hard to understand from an IT, software engineering, BI, or data science perspective. For example, the temperature from a tank, as raw data, comes in as a tag like CV337.CV37. So what we do at the edge, besides the filtering with the dead band, as I told you before, is we add context information.
For example, and this depends on the terminal, by the way: which type of sensor it is, which type of measurement it is, which unit of measurement it is. So we have much more information, and then we have some business-related information, like the specific location of the sensor, and timestamps. We have the server timestamp, we have the sensor timestamp, we have the ingestion timestamp, the transformation timestamp. So at the edge, besides filtering, we are adding information, transforming the information at the edge. So, giving already a little bit of the architecture: we have IoT Greengrass, and then we have 3 Lambdas.
Okay? So 1 Lambda runs the OPC UA client, which ingests the data. This data, data that is very hard to understand, depending on the terminal, can come on change, so every time there is a new value, we get it, or on request, on batch; for example, we can call the OPC UA server every minute or every second and see if the value changed or not. Then we have a Lambda that transforms the data, that gives the context: which type of sensor it is, in which location, which unit of measurement it is using, the method that is being used to measure, okay? Because a tank, it's like a glass of water, right? It could be half full or half empty, and this can change on the tank; the sensor could be measuring if the tank is half full or half empty. And then we add this information, as I was telling you. We transform this information, and then we have a Lambda that stores the data locally.
And the normalization Lambda, transformation Lambda is what we call it, sends the data to the cloud. In the cloud, we store it in MongoDB, but going a little bit back, we have AWS Kinesis that streams the data to an S3 bucket for the BI and data science teams, and we have AWS Batch, which batches the data every day for other departments to use. So the complexity of things, the transformation, as you asked, is done at the edge, and filtered at the edge, using the normalization Lambda. The ingest Lambda ingests everything.
The normalization Lambda transforms, and then we have 1 that stores locally in case we need it; for example, if the Internet fails, we are able to cache. By the way, I didn't mention it, but we are able to cache the information for around 1 hour. But if the Internet fails for a longer period, we may need to go to the local data storage. So this data is then, as I told you, sent to the cloud through the normalization Lambda, regarding your question on transformation and filtering.
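To illustrate the contextualization step, here is a hedged sketch of what a normalization Lambda like this might do: map a cryptic OPC UA tag onto an event enriched with sensor type, unit, location, and the several timestamps mentioned above. Every field, tag, and site name is an illustrative assumption.

```python
from datetime import datetime, timezone

# Hypothetical context table; in practice this kind of information comes
# from the terminal's configuration, not from code.
TAG_CONTEXT = {
    "CV337": {
        "sensor_type": "temperature",
        "unit": "degC",
        "measurement_method": "half-full",
        "asset": "tank-12",
        "site": "terminal-example",
    },
}

def normalize(raw):
    """Turn a raw OPC UA sample into a contextualized event (sketch)."""
    ctx = TAG_CONTEXT.get(raw["tag"])
    if ctx is None:
        return None  # unknown tag: a real pipeline would dead-letter this
    return {
        "value": raw["value"],
        "sensor_timestamp": raw["source_ts"],     # from the device
        "server_timestamp": raw["server_ts"],     # from the OPC UA server
        "ingestion_timestamp": raw["ingest_ts"],  # when the client read it
        "transformation_timestamp": datetime.now(timezone.utc).isoformat(),
        **ctx,
    }
```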
[00:14:49] Unknown:
To the point of the network failures and the potential for not being able to receive data in the central location for, you know, up to an hour or more. I'm wondering what the tolerance is for those types of latencies and some of the ways that you're able to mitigate issues due to that lack of central coordination by pushing more of the logic to the edge for any types of decision making that has to happen in a more real time fashion?
[00:15:19] Unknown:
So as I told you, we can cache for 1 hour. As a team, we try to keep it to a maximum of 50 minutes of data; we shouldn't lose more than 50 minutes of data, okay? So we cache 1 hour of data. If everything goes down in a location, then even the industrial system goes down, and that's very hard, because these industrial systems are made to keep running forever, okay? They are very redundant, and our system is very redundant too. We have 2 ingest instances; in case 1 fails, we can use the other 1. Even if that fails, there are safety mechanisms. We currently only ingest; we don't write down, so we don't give instructions yet to the sensors. Okay?
Maybe later we will give instructions to the sensors, but today we only ingest. So if there is a failure, everything fails on-site, not only us. What I'm trying to say is, it's very hard to have something that really, really, really fails. For the Internet, we have a double connection. So, yeah, when we built it, we built it thinking about scalability and high availability.
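The local store plus roughly 1 hour of caching amounts to a store-and-forward buffer. Here is a minimal sketch under assumed semantics: an in-memory buffer and a `send` callable that raises `ConnectionError` while the uplink is down.

```python
import collections
import time

class StoreAndForward:
    """Buffer events while the uplink is down; replay in order once it returns."""

    def __init__(self, send, max_age_s=3600):
        self.send = send              # pushes one event to the cloud
        self.max_age_s = max_age_s    # ~1 hour cache window
        self.buffer = collections.deque()

    def publish(self, event):
        self.buffer.append((time.time(), event))
        # Evict anything older than the cache window, oldest first.
        while self.buffer and time.time() - self.buffer[0][0] > self.max_age_s:
            self.buffer.popleft()
        self._flush()

    def _flush(self):
        while self.buffer:
            _, event = self.buffer[0]
            try:
                self.send(event)
            except ConnectionError:
                return                # uplink still down; keep buffering
            self.buffer.popleft()
```

Longer outages fall through to the local data store, as described above.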
[00:16:31] Unknown:
In terms of the retention periods that you have, I'm curious what the utility is of this sensor data as you move beyond the time horizons of maybe an hour or a day and into weeks, months, years, and what types of longitudinal analysis you are looking to be able to do over those longer time periods versus just saying, you know, this particular category of sensor data is useless to me past 5 minutes?
[00:17:01] Unknown:
So at the edge, we store for 2 months, and the main reason is because if we lose data or something, we are able to fetch it again, or if a local application wants to access it in case of emergency, it's there. In the cloud, as I told you, it goes to MongoDB, okay? And then we have an API on top of it. The majority of this data, it goes... I didn't mention this, but Vopak is going through a massive digital transformation, and the reason that edge, this product, was built is because during the digital transformation, we found out that we store more than oil and gas and other liquids.
We store vital data too, okay? That is very important for us and for customers and partners. So the API provides data like end-of-day reporting, or give me the current value right now. And this data, the majority of the time, is called by our ERP systems, CRM systems. Even our customers, and this is 1 of the good value adds, are currently able to see the state of their tanks. We have a mobile app, a Power App, where the customer can see the data; not only the people on-site, the operations, but the customers too. So they can see the status, what is happening at the terminal. And then the longer-lived data, we store because we have a data science team, okay, and a machine learning team besides the BI team. So what happens is we store it in the bucket, and, of course, on the bucket we have a lifecycle, okay? And this lifecycle is coordinated with the data science team for when they train their data models. Currently, they do predictive maintenance, and, for example, we do analytics over alarms and events.
Alarms and events is a type of communication protocol that exists in the industrial field or domain. For example, if you are having a leak, it triggers a data point, or a pump is too busy, that triggers a data point, and then we do analytics over this. Our data science team does asset performance optimization, in order to understand why an asset doesn't work as well. We do loading and offloading optimizations. We have these jetties and these arms that are used to pump the liquid, and we do optimizations on them. And this data, of course, needs to be stored for many, many years. You know, when you train your data model, you need a lot of data, and good data, to train it. For example, for the maintenance, you need, for example, vibration.
Vibration in the spectrum is 1 of the things that everyone uses; it's easier to detect. But for that, you need a lot of data. So, currently, we are storing all data in S3 buckets for that, but bear in mind that we started building this in 2018, so we only have data since 2018. And probably, to have more confidence in our data models, we're gonna need even more data, from more terminals, for more years.
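The coordinated bucket lifecycle he mentions can be expressed as an S3 lifecycle configuration. Below is a sketch using boto3; the bucket name, prefix, and transition tiers are illustrative assumptions, not Vopak's actual policy.

```python
import boto3

s3 = boto3.client("s3")

# Keep recent data hot for BI, then tier older training data to cheaper
# storage classes instead of deleting it (models need years of history).
s3.put_bucket_lifecycle_configuration(
    Bucket="sensor-events-example",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-raw-sensor-data",
                "Filter": {"Prefix": "raw/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
            }
        ]
    },
)
```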
[00:20:12] Unknown:
You mentioned that this overall effort of building this full end to end IoT platform is a fairly recent exercise, and I'm curious. You know, you mentioned that some of these terminals have been in operation for on the order of a 100 years, and I'm sure that, maybe not for the full 100 years, but at least on the order of decades, they've had some sort of electronic sensors, and they've been doing something with that. And I'm wondering what the existing situation looked like before you started this project and what some of the evaluation and planning looked like as you were deciding what is an optimal architecture given the existing physical investments that we have and the constraints that we have as far as access, power supply, and reliability, you know, at these various locations, and just what that evaluation and planning looked like for the technical selection and architecture design, and how you have been thinking about being able to reverse engineer this end to end system on top of physical limitations that have been in existence since before you became involved?
[00:21:19] Unknown:
What happened here is that the industrial IoT, not only IoT, but more precisely industrial IoT, the field is very fragmented, okay? You have a lot of companies that are building industrial IoT platforms: ABB, Amazon, Yokogawa, Honeywell. It's so fragmented currently. You have 1 company that is very good at ingesting. You have another company that is very good at transforming at the edge, but is not so good at ingesting. You have another company that is very good at doing the predictive maintenance. For example, you have FogHorn, which is very good at the edge, doing machine learning at the edge, AI at the edge. During the selection period, we went through multiple suppliers and providers that were already being used on the terminal, and we gave them, like, a problem: we have this problem, and we want to solve it. It's like an MVP.
We gave it to AWS, and to AWS partners, in this case all these companies, for example, PTC with Kepware, Siemens. We gave them this problem and said, hey, show us how we can solve this; do a demo, a prototype. And then they all came up with it, and then we did an evaluation. Which 1 fits our enterprise architecture? Which 1 fits the current knowledge, and the application landscape, we already have in house? So there was this process of selection, okay? We didn't just jump to 1. It took, like, 6 months to go through the selection phase, to proof of concept, to testing in the field, and we didn't do, like, a big test. We tested very small, like, with 1 sensor or 2, and then we saw, okay, how much can this scale?
I didn't mention this, but we have a very small team. I was lucky to be hiring at a time when it is very hard to hire good people: I hired 3 very good senior engineers who helped me build this platform. So, we have a very small team. We did the design, the solution design, what we wanted, and then we went through all the suppliers to see which 1 was better for us at the time. But bear in mind, we are already built, we are at 18, and we are constantly still looking at other platforms. You know, we never stop evaluating. For example, for the people that are listening to this: we built our own product, right? But Amazon now has IoT SiteWise Edge, which is, in the end, the same as what we built, which is funny, because while we built it, we were providing feedback to Amazon all the time: hey, we are building this.
Maybe you want this, and now they recently did the same. But it's not only Amazon. Google, I think, has something similar. Microsoft has something similar in Azure. And we are still checking: what happens if we go to SiteWise Edge? What would we need to upgrade? What would we need to change? Yeah. So that was a little bit of our process, and it's a continuous process, you know? We're constantly still building the platform. We didn't build it and say, okay, it's done. No. We are still going step by step.
[00:24:32] Unknown:
So now your modern data stack is set up. How is everyone going to find the data they need and understand it? Select Star is a data discovery platform that automatically analyzes and documents your data. For every table in Select Star, you can find out where the data originated, which dashboards are built on top of it, who's using it in the company, and how they're using it, all the way down to the SQL queries. Best of all, it's simple to set up and easy for both engineering and operations teams to use. With Select Star's data catalog, a single source of truth for your data is built in minutes, even across thousands of datasets.
Try it out for free and double the length of your free trial at dataengineeringpodcast.com/selectstar. You'll also get a swag package when you continue on a paid plan. Given the fact that you are still in this phase of proving out the solution and scaling it up, I'm wondering what the criteria for success are and when you can determine that, yes, this is exactly what we're looking for, we're just going to continue investing in this structure and strategy, and we're just going to scale it out across our entire infrastructure, or, to the contrary, saying, okay, this isn't quite meeting our expectations, we actually need to stop and rethink and maybe tweak the formula a little bit before we try to hit that full scale company wide adoption?
[00:25:51] Unknown:
The product itself was a success, you know? Because as a company, we were not able to see this data, to get this data; it was in silos. Our team is even like a use case inside of the company, because the company is not a tech company, yeah? We wrote the code, we built the product, we automated the deployment. It's quite funny when I say, okay, we are a non-tech house, and we do microservices, Lambdas, and we have code coverage of 99.5%, and we have unit tests, regression tests, smoke tests, static code analysis, everything inside of CI/CD, and everything is automated, plus product monitoring and logging and tracing. It's quite funny when we say we did that. So when I say the product itself, it's a success, and we are already scaling it.
But as I probably mentioned before, we are always looking to see if there is something even better; we're not just gonna stop. And this product is already giving a lot of insights and information to other departments. It's quite nice to see that now we are doing digital twins, where we create a model of our terminal in 3D, and people can see in real time the liquid flowing and the temperature. It's quite nice, besides the other initiatives, because it's something that you can see yourself. Or being on the terminal, and you see people using smart glasses, and they are able to see, in real time, through the smart glasses, the data of the assets, of the tanks, because they are getting the information from us. So there are a lot of things where you already say, okay, it was a success.
We are already scaling it. Of course, as you said, there is tweaking. Sometimes we start noticing that there is a limit in things; every technology has a limit. So when we start seeing, okay, we need to filter the data even more, or maybe we need to deploy multiple Greengrasses in a location, of course, there is always, as I told you, some tweaking.
[00:28:01] Unknown:
In terms of those scaling issues, as you are starting to bring more terminals online and starting to maybe experience a greater degree of heterogeneity across the different sensors and the availability of data, I'm wondering how you have been addressing that and some of the complexities that you're dealing with as you continue to expand on the usage of this architecture that you've designed.
[00:28:25] Unknown:
1 of the challenges, and this is the big challenge, is the contextualization at our scale, okay? Grabbing the information from the sensor itself is easy, because of OPC UA; it's very standardized, and our suppliers have it, and the configuration of it is standardized already. It's the contextualization, because every terminal is different. It's not the technology; the majority of the time, it's a process mindset that you need to change, and communication. The challenge is the contextualization of the data: so you know that that really is a flow meter, and that is the type of measurement, and that information is correct, okay? So the data quality of it, let's say, the trustworthiness of the data.
That, when scaling, is the biggest problem in this case, because you need to communicate very well with the people on-site, with engineers, and you need to explain, hey, I need this information, and why I need it. So you need to adapt very easily to the people that you talk to. And not only that, but expectations, you know? When scaling, I think the biggest challenge is mindsets: people knowing that this data, this type of data with this context, can do this and this and this; the adoption and the data quality and the context, I think. Yeah. Not the technical part, because the technical part itself, in our case, once again, I was very lucky: we built with a mindset of automation from the beginning.
So the more terminals we onboard, the work itself is the same: we grab the data and transform the data with the correct information. So it's about giving the correct context at the edge.
[00:30:17] Unknown:
As far as the quality aspect of the data, I imagine that that's quite challenging: understanding what is the lowest common denominator of information that I can expect across all of these different sensors, and then being able to also take advantage of the variance, where maybe 1 sensor, say it's a flow meter, measures in terms of liters per second, and then another 1 is gallons per minute, and just being able to make sure that units are correct and that you're able to aggregate the information and try to maintain fidelity and not just kind of bring everything to the coarsest level.
[00:30:55] Unknown:
That's the biggest challenge right now. Currently, with our automation process, I know it sounds very strange, but the terminal on-site gives us a file with all the context information. We then have a CI/CD pipeline that transforms the file and sends it automatically downstream to the edge part. And we have some validation rules that validate the file that the engineer on-site gives to us, so we have some kind of control on the validation of that data. What happens if the data is wrong? Then we do a data migration, right?
A data migration to fix that, yeah. If it's in the cloud, as I told you, we are storing 8,000,000,000 records, so doing the change is then a little bit more problematic, you know, because of the scale of the amount of data; it takes a little bit more time to change it. I think when you work at this scale, people are already used to it. So yeah.
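As a hedged sketch of what those validation rules might look like, here is a small check over a parsed context file; the required fields and allowed units are invented for illustration.

```python
ALLOWED_UNITS = {"degC", "bar", "m3/h", "kg", "mm"}
REQUIRED_FIELDS = {"tag", "sensor_type", "unit", "asset", "site"}

def validate_context_file(rows):
    """Return a list of problems; the CI/CD pipeline fails if it is non-empty."""
    errors = []
    for i, row in enumerate(rows):
        missing = REQUIRED_FIELDS - row.keys()
        if missing:
            errors.append(f"row {i}: missing fields {sorted(missing)}")
        elif row["unit"] not in ALLOWED_UNITS:
            errors.append(f"row {i}: unknown unit {row['unit']!r}")
    return errors
```

Catching a wrong unit or a missing asset here is far cheaper than migrating billions of records after the fact.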
[00:31:57] Unknown:
And then another interesting challenge of dealing with IoT data, and maybe juxtaposing that against what a lot of people who are working in sort of standard corporate data ecosystems are thinking about, is things like lineage. I'm wondering how that translates for an IoT perspective, where you say, you know, this data point or this aggregate set of data points is coming from this category of sensors in this physical location, this is the trip that it took from the edge location to our central storage, these are the transformations that were applied to it, and just managing data lineage and sort of data cataloging at that scale, where you're dealing with, you know, thousands or hundreds of thousands of sensors in dozens of different physical locations and some innumerable number of pipelines that are going to be processing the data for different machine learning or analytical use cases.
[00:32:50] Unknown:
The data governance itself, it's quite funny, because I was listening to this podcast the other day regarding Monte Carlo and data quality monitoring. We kind of built our own Monte Carlo, where we built SLOs and SLIs over the governance and the time, for example, the processing time, okay? So currently, from the sensor to the cloud, it takes about a second, so ingesting, processing, and storing. Our monitoring system monitors that time. We try to keep it under 1 minute; the max is 1 minute. If it's higher than that, then someone will wake up. And the 50 minutes, as I told you. For the data quality, we monitor, let's say, if we have a flow meter and suddenly the flow meter is measured in kilos, and then we get, like, 2,000,000,000, let's say not 2,000,000,000, but 100,000 outliers like that, we have monitors that trigger an incident.
Luckily, once again, our system is very robust. In 3, sorry, 4 years, I was only woken up at night 3 times. We don't have a lot of incidents till now. Let's see what happens when we are live in 72 terminals, but for now, in 18, we didn't have any problems yet. We are building the strategy, the data governance monitoring for it, in order to prepare us for the future. Yeah, it's a little bit of a leap of faith, and we'll see what we're gonna get next.
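A minimal sketch of the latency SLO check described here; the event fields and the `page_on_call` hook are assumptions standing in for the real monitoring system.

```python
def page_on_call(message):
    """Hypothetical hook into the incident system."""
    print("ALERT:", message)

def check_latency_slo(events, threshold_s=60):
    """Flag events whose sensor-to-storage time exceeds the 1-minute SLO."""
    breaches = [e for e in events if e["stored_ts"] - e["sensor_ts"] > threshold_s]
    if breaches:
        page_on_call(f"{len(breaches)} events exceeded the {threshold_s}s latency SLO")
```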
[00:34:20] Unknown:
And then going back to what you were saying as far as some of the business users looking at real time aggregate flow information, you know, from a central terminal or terminal managers walking around with smart glasses to be able to see, you know, these are some of the predictive maintenance things that I need to be considering. What are some of the types of new capabilities and business processes that are enabled by being able to actually collect and aggregate all of this information in near real time and some of the new kind of business use cases that are yet to be considered?
[00:34:55] Unknown:
As I told you, the customer is now able to access real-time information about the tanks. We can even provide it to our partners and suppliers, okay? Because Vopak is something like a man in the middle in the distribution of liquids: we store, and then we give it to someone else sometimes. So from a business perspective, now the customer and partners can see that information. From an operational perspective, as I told you, now the terminal manager can use smart glasses, autonomous robots, like the Boston Dynamics ones, for asset inspection, so we don't need to send people inside to see what is happening.
Because if there is a problem in the tank or the flow meter or the pump, the operations will stop, and it means that the customer doesn't get the liquid. And because it's a vital product, it can happen, it never happened, but it can happen, that we stop a country. For example, if we look at when Colonial Pipeline was hacked in the United States, like, half of the country stopped, you know? So with all these things, we need to avoid stopping as much as possible.
So our data is 1 of the key pillars of the digital transformation of Vopak. With this data, you then create all these initiatives that help the customer and the company.
[00:36:19] Unknown:
Continuing on this discussion of the types of impact that a failure in the physical plant can have and some of the interaction with the sensor data. I'm wondering what are some of the other types of ramifications that might happen due to failures in the data collection and analysis or the physical failures that can be reflected or have an impact on the sensor data that you're processing?
[00:36:46] Unknown:
As I told you, we only read right now; we don't write down to the systems, we just ingest. So if there is a failure, it's not because of us. In the case of a failure, we don't get any data. Of course, then the person on-site clicks on the app and doesn't get the actual information. However, they are still on-site, you know? They have this control system, imagine something like a power plant's control room. They have this control system, and I'm sure the person will have an alert on their industrial system, or they already have someone looking at it.
Because if there is a problem with the sensor, or we don't get the data, that's why we use OPC UA: if there is any problem with the sensor, OPC UA itself gives a status code. And our data quality monitoring system monitors the status code, okay? OPC UA has a lot of status codes; once again, it's an industrial protocol. The status code indicates a type of problem in the sensor, and we monitor that. And based on that, we can notify the local team: hey, if you are not looking, and probably they're already looking at it, this is happening with this sensor. So we can already do that on behalf of the operations teams on-site.
We can already monitor for them: hey, this is happening with this sensor. So with the disruption itself, of course, they will not have the most recent data, but because of our monitoring system, we can already notify the people on-site so that they can fix the problem. This data will not go to the cloud, by the way, because of the status code. It's 1 of the rules of the transformation: we only send to the cloud the tags, in this case OPC UA tags, the data points, with the status code Good. So if it's not Good, we're not gonna send it to the cloud, and our monitoring system alerts the people on-site, or asks the people on-site, what is happening with this.
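A sketch of that routing rule follows. The numeric value of the Good status code (0x00000000) comes from the OPC UA specification; the record shape and the helper functions are hypothetical.

```python
GOOD = 0x00000000  # OPC UA status code meaning "Good" quality

def send_to_cloud(sample):
    """Hypothetical uplink call (Kinesis, MQTT, ...)."""

def notify_site_team(sample):
    """Hypothetical local alerting call for the on-site operations team."""

def route(sample):
    # Only Good-quality samples are forwarded to the cloud; anything else
    # stays local and raises an alert for the people on-site.
    if sample["status_code"] == GOOD:
        send_to_cloud(sample)
    else:
        notify_site_team(sample)
```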
[00:39:00] Unknown:
And in terms of the uses of the system that you're building and the data that you're providing, what are some of the most interesting or innovative or unexpected applications of it that you've seen?
[00:39:12] Unknown:
A lot, to be honest. At the beginning, I was not expecting the smart glasses, the digital twin, the analytics over alarms and events. We have 1 project, which we are still starting to work on, it's in the early stage, that is a little bit like alarms and events, but where we can predict, or try to predict, when an accident can happen and in which situations, but it needs a lot of data. And, I don't know, from a software engineer perspective: a lot of software engineers, probably, build a feature and send it to production, and they never see anyone using it. Maybe people at, like, Google see people using it. But in my case, when I go to terminals and I see people using the smart glasses now, and seeing that dashboard, and I can see the impact, it's like, well, I was never expecting this to happen. Or, for example, the energy management project; you can see it in Forbes, we talked about that.
Reducing the consumption of energy, for example, for me, was quite mind blowing, because I didn't expect such an impact on the company itself, you know? When we built the system, I was not expecting it to become, like, a key pillar of the company itself. It's a little bit hard to say which 1 it was, because there are so many initiatives taking place because of the data that we are able to provide. Yeah, it's hard to select 1. Maybe the drones, the drone inspection, is nice. I was not expecting that.
[00:40:49] Unknown:
In your own experience of designing and implementing and evolving this platform, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:41:00] Unknown:
I think, when we were building at the beginning, early stage, we started very small, and it was the expectations: we started with 1 or 2 sensors, and people said, okay, then we can already start doing machine learning. No. At our scale, maybe it was the operational part of it that I was not expecting, like, for example, fixing data, right? As I told you, data migration was something that we were not expecting, from the beginning, to be so challenging, you know? Or even giving context to the data was something we were not used to, or expecting to be so hard. So in the design, while designing it and then building it, if it was now, maybe I would look at the transformation Lambda, the normalization Lambda, more carefully. But this is now, right? And this can change, right?
We focused a little bit more at the beginning on the ingesting, because you need to give value. You need to give value to the business as fast as possible, and they want the data, so let's ingest it. We focused a lot on ingesting the data, understanding how the OPC UA protocol works, how the Modbus protocol works, how all the industrial side of things works. And then when it got to the parts of transformation and normalization of the data, we didn't focus so much on it, you know? Yeah.
[00:42:21] Unknown:
In terms of the near to medium term future of the work that you're doing, what are some of the things that you have planned or particular areas that you're excited to dig into?
[00:42:31] Unknown:
I think the capability to run machine learning at the edge. Greengrass allows that; they call it machine learning inference. We are doing proofs of concept with it, even in the energy management part. For example, we see that a pump is consuming too much energy, and we do peak shaving. It shouldn't be consuming so much energy if it's not gonna be used for 2 days or 3 days, but you need to warm up the pump in order for it to work. We send an instruction to the operator saying, hey, based on previous history and based on future schedules, this pump doesn't need to be on. And the operator can say yes or no, and we shut down the pump.
So that's something that we are currently evaluating and testing: the capability to run the machine learning, the AI part, at the edge, with the help of our data team. And, by the way, our data team is doing very good, awesome work, because I'm saying what excites us or what we're gonna do next, but the majority of the time, it's what the business wants next, you know? And for that, you need to teach. We are even creating, like, a data academy in house, where we teach the business people on-site, managers and directors, what they can do with the data.
And based on that, they are the ones that can give us ideas, you know? And then, okay, yeah, we can build it, so let's do it. We need the business and the people on-site to help us. You know, without them, we are not gonna solve anything.
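As a sketch of that human-in-the-loop recommendation: if a pump has no scheduled use within a look-ahead window, suggest shutting it down and let the operator decide. The schedule format, the 2-day window, and the function names are assumptions.

```python
from datetime import datetime, timedelta, timezone

def recommend_shutdown(pump_id, schedule, lookahead=timedelta(days=2)):
    """Return a suggestion string, or None if the pump should stay on.

    `schedule` maps pump ids to timezone-aware datetimes of future scheduled uses.
    """
    now = datetime.now(timezone.utc)
    next_use = min((s for s in schedule.get(pump_id, []) if s > now), default=None)
    if next_use is None or next_use - now > lookahead:
        return f"Pump {pump_id}: no use scheduled within {lookahead}; suggest shutdown"
    return None  # the operator always confirms before anything is switched off
```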
[00:44:02] Unknown:
Are there any other aspects of the systems that you have designed and implemented or the overall complexities of dealing with IoT data at distributed scale or just the overall space of building and managing these complex data systems that we didn't discuss yet that you'd like to cover before we close out the show?
[00:44:21] Unknown:
1 thing, it seems, that someone who wants to build these IoT systems, or any distributed system, or these hybrid systems, needs to think about first is deployment and testing. Of course, you need to give value, but an IoT system is very hard to test, right? Because there are many factors that can change. So testing, and automated testing, is very important. Automation is a very important topic for the IoT field, okay? Yeah. If you control the sensors, let's say if you are a company that builds your own sensors, you don't have that problem, right? But if you are someone that ingests data from different sensors, you really need that. Yeah.
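To make the testing point concrete, here is a pytest-style sketch: feed a synthetic sensor trace through an edge filter and assert on what survives. The filter and trace are illustrative, not Vopak's test suite.

```python
def deadband(samples, band):
    """Same dead-band filter sketched earlier in the conversation."""
    last = None
    for t, v in samples:
        if last is None or abs(v - last) > band:
            last = v
            yield t, v

def test_deadband_drops_noise():
    trace = [(0, 20.0), (1, 20.05), (2, 20.4), (3, 20.41)]
    kept = list(deadband(trace, band=0.3))
    assert kept == [(0, 20.0), (2, 20.4)]
```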
[00:45:07] Unknown:
Alright. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:45:22] Unknown:
From a technical perspective, it's the fragmentation in the IoT sector, at least at the edge, okay? It's good that companies focus on 1 problem, but when we were evaluating, it's very hard if you're gonna need to maintain, like, 3 or 4 applications at the edge. It's very hard to find an application that does everything where the cost is still not very high, you know what I mean? I can have 4 applications that are not very expensive and are easy to maintain and scale, but in the long run, I will need to hire 20 engineers to maintain them. So, yeah, the fragmentation.
From a people perspective, it's the management of the data when you are deploying worldwide, as in our case. This is not a sprint; it's a marathon, you know? You're not gonna start ingesting and processing this data in 1 day, and everyone that you're gonna deal with is gonna have different expectations for it. So I think that's the biggest gap, because you're gonna come in as an engineer, a software engineer, full of ideas, and on the other side, you don't know what the expectations of the people are. I think the gap is a gap in acknowledgment.
[00:46:40] Unknown:
Alright. Well, thank you very much for taking the time today to join me and share the work that you've been doing at Vopak. It's definitely a very interesting problem space and an interesting design challenge that you've been working through. So I appreciate all of the time and energy that you have put into that and the time that you've taken to share your experiences with me and the audience. Thank you again for that, and I hope you enjoy the rest of your day. Thank you. Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used.
And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.
Introduction to Mário Pereira and Vopak
Types of Sensors and Edge Devices
Control and Oversight of Sensor Deployment
Computation and Storage at the Edge
Filtering and Preprocessing at the Edge
Network Failures and Latency Tolerance
Retention Periods and Longitudinal Analysis
Evaluation and Planning for IoT Platform
Criteria for Success and Scaling
Challenges in Scaling and Contextualization
Data Lineage and Governance
New Business Capabilities and Use Cases
Unexpected Applications and Lessons Learned
Future Plans and Machine Learning at the Edge
Final Thoughts and Closing Remarks