Summary
The information about how data is acquired and processed is often as important as the data itself. For this reason metadata management systems are built to track the journey of your business data to aid in analysis, presentation, and compliance. These systems are frequently cumbersome and difficult to maintain, so Octopai was founded to alleviate that burden. In this episode Amnon Drori, CEO and co-founder of Octopai, discusses the business problems he witnessed that led him to starting the company, how their systems are able to provide valuable tools and insights, and the direction that their product will be taking in the future.
Preamble
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- When you’re ready to build your next pipeline you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to run a bullet-proof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
- For complete visibility into the health of your pipeline, including deployment tracking, and powerful alerting driven by machine-learning, DataDog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you’ll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new T-Shirt.
- Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
- Your host is Tobias Macey and today I’m interviewing Amnon Drori about Octopai and the benefits of metadata management
Interview
- Introduction
- How did you get involved in the area of data management?
- What is Octopai and what was your motivation for founding it?
- What are some of the types of information that you classify and collect as metadata?
- Can you talk through the architecture of your platform?
- What are some of the challenges that are typically faced by metadata management systems?
- What is involved in deploying your metadata collection agents?
- Once the metadata has been collected what are some of the ways in which it can be used?
- What mechanisms do you use to ensure that customer data is segregated?
- How do you identify and handle sensitive information during the collection step?
- What are some of the most challenging aspects of your technical and business platforms that you have faced?
- What are some of the plans that you have for Octopai going forward?
Contact Info
- Amnon
- @octopai_amnon on Twitter
- Octopai
- @OctopaiBI on Twitter
- Website
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links
- Octopai
- Metadata
- Metadata Management
- Data Integrity
- CRM (Customer Relationship Management)
- ERP (Enterprise Resource Planning)
- Business Intelligence
- ETL (Extract, Transform, Load)
- Informatica
- SAP
- Data Governance
- SSIS (SQL Server Integration Services)
- Vertica
- Airflow
- Luigi
- Oozie
- GDPR (General Data Protection Regulation)
- Root Cause Analysis
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hello, and welcome to the Data Engineering podcast, the show about modern data management. When you're ready to build your next pipeline, you'll need somewhere to deploy it, so you should check out Linode. With private networking, shared block storage, node balancers, and a 200 gigabit network, all controlled by a brand new API, you get everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And for complete visibility into the health of your pipeline, including deployment tracking and powerful alerting driven by machine learning, Datadog has got you covered. With their monitoring, metrics, and log collection agent, including extensive integrations and distributed tracing, you'll have everything you need to find and fix performance bottlenecks in no time. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial and get a sweet new t-shirt.
And go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. Your host is Tobias Macey, and today I'm interviewing Amnon Drori about Octopai and the benefits of metadata management. So, Amnon, could you start by introducing yourself?
[00:01:17] Unknown:
Hi, Tobias. My name is Amnon. I'm from Octopai. And as you said, one of the things that we do is to help companies manage their data in a more automated manner.
[00:01:26] Unknown:
And how did you first get involved in the area of data management?
[00:01:31] Unknown:
So we have to go a couple years back. One of the things that I remember, as an executive in my previous company, is that we went through this very important board meeting where I was showing some numbers about our sales. And for some reason, when I showed those numbers, the CFO, who was sitting next to me, was saying that my numbers were higher than his numbers. So during that meeting, we were asked why there's a difference between my sales numbers and the CFO's sales numbers, because he was pulling his from the financial system and I was pulling mine from the CRM system. And the difference between the numbers was a couple million dollars of revenue. Yeah. That's a pretty substantial difference.
Right. And one of the things that we were asked is where this differentiation was coming from. So we were asking each other, what is the name of the report that you're looking at? And surprisingly, we were looking at the same report. So the other question that came up is, how come two executives are looking at the same report but they get different numbers? From that point on, over the next seven days, we had to invest three people's time to track the data movement process that landed the data on these reports. And what we eventually found out is that one of the processes that was responsible for streaming the data to that report was modified.
And it indeed updated the data on only one of the reports, and this is why the discrepancy came to that meeting. And when we looked around, we saw that there are so many reports that we were using, and we look at the data in order to make major decisions. And the discrepancy of data, which led to data integrity issues, was all over the place. Actually, we found out that it's not only our challenge; when I talked to my peers in other companies, it seemed that this was also their challenge. And at some point in time, we said, well, these things happen.
Maybe you cannot change that, but you can definitely change how fast and how accurate you can be when you manage your data in order to hand it over in a trusted manner to the business users for them to make decisions. And at that point in time, I remember watching the business intelligence team manually trying to track the data movement process that landed the data on that report, and asking myself, there must be a better way to do this, rather than spending weeks of many people's time in the team trying to track the data movement process for a single report. And we were using thousands of reports. That was kind of the moment.
[00:04:22] Unknown:
And as you mentioned, Octopai is intended to help solve some of the issues associated with being able to collect, track, maintain, and gain inference from this metadata. So I'm wondering if you can discuss a bit more about what it is that the business does. You've already spoken a bit to your motivation for founding it, but is there any other background that you want to provide about creating the business and getting it up off the ground?
[00:04:49] Unknown:
Yes. Absolutely. One of the things that you can see in any organization is that they're using a lot of business applications: CRM, ERP, PLM, HR, finance, marketing, and there are tons of different applications that serve the company's purpose. If you think about insurance companies, banking, retail, telecom, government, any large company has tens of business applications. The problem is that a lot of business consumers, the business users and their affiliates, wanna get a grip on that data in order to understand what's going on in the business so they can make further decisions. The team that is responsible for landing the data to those business users is the business intelligence team, transforming the data from those business applications until it lands on that report. And that involves a lot of business intelligence systems. One of the things that we found out is that data travel, or the data movement process, is very, very complicated to track.
In a typical enterprise, you can find hundreds of millions, if not billions, of ways that the data moves. And almost every day, you need to try and find different routes of the data journey because of different use cases, and that takes time. What we wanted to do is create a product that automatically extracts, centralizes, analyzes, and visualizes every data journey possible within that infrastructure. And in that way, you can understand how to get from point A to point B, or from source to target, in five seconds. Not five hours, not five days, not five weeks. Five seconds.
[00:06:46] Unknown:
And I'm wondering if you can enumerate some of the types of information that you classify and collect as metadata and a bit of the mechanism that you use to be able to collect that information.
[00:07:01] Unknown:
Right. So metadata is not a new term. Metadata is the description, the context, the labeling of the data. Without metadata, the data doesn't have any meaning. If I ask you, could you give me 20? You're gonna think about my question for a second, and then you're gonna ask me, 20 what? That "what" is the description of the data. As you know, it's data about the data. Actually, in the market, people use the words data and information. Information is the context of the data, through which we understand the meaning of the data. The challenge is that the metadata that describes the data lands in a lot of, I would say, business intelligence systems.
And sometimes they are even coming from different vendors. So one of the challenges that we had to face was how to extract the metadata from each system that uses metadata for its own purpose. So ETL systems are using it to create business processes, databases or data warehouses store that metadata in tables and views, and reports are using the metadata to project it on a table or a dashboard. So the structure of the data, the synonyms, the expressions, the SQL statements, the tables, anything that has to do with the description of that data, whether it's physical metadata, business metadata, or operational metadata. Collecting all of these pieces of metadata in order to understand the data journey between all of those systems, until it finally lands on a single report, was very, very challenging.
So what we do is collect all of that metadata and then centralize it in one place.
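To make the physical/business/operational distinction concrete, here is a minimal sketch of what a normalized metadata record could look like once everything is centralized in one place. The structure and all field names are hypothetical, for illustration only; this is not Octopai's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class MetadataElement:
    """One normalized metadata record, unified across source systems.

    All field names here are invented, for illustration only.
    """
    source_system: str          # e.g. "Informatica", "SSIS", "Vertica"
    element_type: str           # "etl_field", "table_column", "report_field"
    name: str                   # the element's name in its source system
    physical: dict = field(default_factory=dict)     # data type, table, schema
    business: dict = field(default_factory=dict)     # description, owner, synonyms
    operational: dict = field(default_factory=dict)  # load times, job names

# The same logical field as an ETL tool sees it, and as a report sees it:
etl_field = MetadataElement(
    source_system="Informatica", element_type="etl_field",
    name="CUSTOMER_PRODUCT",
    physical={"datatype": "VARCHAR(64)", "target_table": "DW.SALES"},
)
report_field = MetadataElement(
    source_system="SSRS", element_type="report_field",
    name="Product Name",
    business={"report": "Quarterly Sales"},
)
```

Centralizing records like these is what makes it possible to ask cross-system questions at all, since each source tool only knows about its own slice.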
[00:08:57] Unknown:
And what does the technical architecture of your platform look like for being able to collect and aggregate all of this information?
[00:09:08] Unknown:
So one of the things, as you've heard, is that it's not a single-step process. Our product combines different technologies that together provide the output that we give our customers. I think that the first layer, or the first challenge that we had to face, is to extract the metadata from those systems. Because, as I said, metadata is distributed among so many different BI systems, some from Oracle or Informatica or Microsoft or SAP or HP or even other systems, it's very challenging to extract the metadata from their metadata repositories. So that was one of the challenges we had to face. In order to cope with that, we've created a layer of extractors. What they do is extract granular metadata from each one of these systems, even though they might be from different vendors. So that was the first challenge. The second challenge in the architecture was, once you've extracted the metadata, where are you gonna put all those metadata files?
So we created a product that runs on the cloud, either on the customer's private cloud or on Microsoft Azure and AWS. Once those metadata files have been extracted, they are encrypted by running them through a vault and then uploaded to a dedicated server for that specific customer. Now we have all the metadata centralized. The third thing that we do, once all the metadata is centralized, is the analysis of that metadata, by running a lot of modeling, very smart, or I would say deep, algorithms and principles of machine learning. They do two major things. They index each and every metadata element or data element. And the second thing, which is really cool, is to understand the relationship of each and every metadata item in order to create the dependencies and the connections between them, in order to draw a full, exact map of that data journey. The fourth layer is a very, very robust search engine.
Because when you wanna see a data journey that needs to run over more than a billion possible ways the data journey can flow, and you wanna get that in five seconds, we are firing tens of thousands of search requests in order to build that map that you see in five seconds. And the last layer is the visualization, the ability to see what you were looking for in a nice, graphical, easy, intuitive way. The entire stack beneath that, as I described, the extraction, the centralization, the analysis, the searchability, and then the visualization, all of this magic happens in one click and generates that in five seconds.
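As a rough illustration of the analysis and search layers described above, the centralized metadata can be modeled as a directed graph in which a "data journey" becomes a path query. This sketch uses networkx and invented node names; it is a toy model, not Octopai's implementation.

```python
import networkx as nx

# Build a directed lineage graph from extracted metadata.
# Each node is a metadata element; each edge a discovered dependency.
lineage = nx.DiGraph()
lineage.add_edge("CRM.orders.amount", "ETL.load_sales.map_amount")
lineage.add_edge("ETL.load_sales.map_amount", "DW.sales.amount")
lineage.add_edge("DW.sales.amount", "Report.quarterly_sales.revenue")

# "Source to target in five seconds": enumerate every path between two elements.
for path in nx.all_simple_paths(lineage,
                                source="CRM.orders.amount",
                                target="Report.quarterly_sales.revenue"):
    print(" -> ".join(path))

# Root cause analysis: everything upstream of a suspect report field.
upstream = nx.ancestors(lineage, "Report.quarterly_sales.revenue")
print(sorted(upstream))
```

At enterprise scale the graph has millions of nodes, which is why a dedicated indexing and search layer matters; a naive traversal of every path would not return in five seconds.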
[00:12:06] Unknown:
Once the metadata has been collected from the customer system and uploaded to your platform, is there a way for the customers to go in and annotate or modify the metadata, to manually create some linkings that might not be easily inferred, for adding that additional relationship information?
[00:12:28] Unknown:
So the answer is yes. What we've learned is that our product can be a viewer and also an editor, and that is one of the things that our technology is very, very strong at. But we also learned that metadata is very much managed in the source systems, in what we call the business applications. And in some cases, you don't wanna use Octopai to edit that metadata and then inject it back into the business application; rather, you want it to be a viewer for the business intelligence team to understand what the hell is going on there. So as much as we would allow editing the metadata, typically Octopai is used as a viewer of the metadata, and through that to understand where the metadata is and understand the data journey.
One of the things people also use Octopai for is to add descriptions, to understand the metadata, and to write and add that within Octopai. So they use it as the main platform for metadata management. But I think the thing that we introduced to the market is that you can still do metadata management, but in an automated manner. You leverage automation to do that, rather than the manual way that you've been used to working so far.
[00:13:42] Unknown:
And in terms of the volume and searchability of the data that you're using, are you primarily leveraging existing data storage technologies such as Elasticsearch or PostgreSQL, or have you found the need to create your own custom data stores for being able to index and analyze all that information?
[00:14:02] Unknown:
So that's a great question. It's actually a combination. There are some techniques that we are leveraging, but there was a need to do a lot of modifications and add-ons that we have developed ourselves in order for the concept of the product to work. So we invested a lot in extending the development to be able to leverage what already exists, but also make it robust and very capable of handling masses of metadata and analyzing them in a very, very thorough manner.
[00:14:34] Unknown:
And once the data has been ingested into your system, is there a way for customers to then be able to export that for their own analysis within their own platforms, or if they need to migrate to an in-house metadata platform after they have, you know, maybe built some additional in-house capabilities?
[00:14:57] Unknown:
Absolutely. One of the ways we are being used is as the central repository of the BI metadata. So whereas we collect metadata and analyze it and visualize this analysis in the form of discovery, comparing ETLs and reports, and also the data lineage, we are also capable of sharing that metadata outside the product, for example for analysts or data scientists, or, as one example, injecting it into data governance. There are some data governance tools that are relying on Octopai's automation to analyze metadata from a variety of business intelligence systems, not only so fast but also so accurate, in a way that they can receive that metadata that has been analyzed by Octopai and use it for their own users.
[00:15:51] Unknown:
And as you mentioned earlier on, in the broad sense, metadata management systems are difficult to build and scale and maintain. So I'm wondering if you can speak to some of the difficulties that are inherent and unique to metadata platforms as opposed to your more general data storage systems.
[00:16:14] Unknown:
Yes. I think I need to go back about three years, to when we were asking ourselves what is missing in the market that we can help with. Because we ourselves were using different tools to manage the metadata, and we were using different vendors' tools because we had ETL systems, we had database systems, and we had reporting systems. So what is missing? And one of the things that we've seen is that most vendors help manage the metadata, but only for their tools. There's no view that is cross-platform. And when I say cross-platform, I don't mean just the ability to see how the data moves from an ETL to a database to analysis and to reporting, but also that within each and every one of these pillars, there might be different vendors participating in that data journey. So we were challenged by one of our customers to reverse engineer how the data landed on a specific report. That specific report was in SSRS and BO, consuming data from an Oracle database, and Informatica and DataStage were running ETLs to upload data to those database tables that then generated reports in different vendors' tools. So how do you unify and how do you flatten the view of the data journey, even though there are different systems from different vendors responsible for the different steps of the data movement process?
And this is really, really difficult. So this is why we established the company. Fortunately for us, we see that the demand to understand cross-platform data movement, or the data journey, is something that is really, really important. The second thing that we saw is how fast you can get this up and running. As we've experienced with other tools, and this is verified every day that we talk to a new prospect, some of them are actually using different tools in the market, but those require a lot of customization, a lot of preparation.
And in a dynamically changing environment, any company changes their data all the time. They create new processes. They change processes. They modify processes. They delete tables. They add fields. They generate reports. They obsolete reports. Things are being changed all the time. Therefore, they need to continue to invest in understanding what already exists in their infrastructure and what will be changed tomorrow. And with Octopai, one of the things that we've challenged ourselves with is how fast you can get our product up and running, and what it would require from the customer to enjoy the latest refreshed metadata within the organization.
And the third thing that we wanted to put in as a challenge was that this tool needs to be intuitive, and this tool needs to be as cool as possible for everybody to use. I remember myself using tools that were designed 20 years ago. There's nothing wrong with that, but if you wanna modernize yourself, and you wanna be able to refresh the metadata and upload it with very minimal work on your side, and you would like it to cover the entire infrastructure, those are the three values: cross-platform, easy to get up and running, and very, very simple to use. Either each and every one of them, or even the combination of all three of them, we haven't seen in any one of the products from the different vendors that we used to work with.
So one of our customers even shared that Octopai is a great example where the whole is greater than the sum of its parts, and we are very, very proud of it.
[00:20:13] Unknown:
And when somebody is getting started with onboarding onto your platform, what is involved in actually deploying and configuring the collection agents that you use for being able to load the metadata into your platform?
[00:20:28] Unknown:
Right. So a typical customer would probably go through the following process. We will create an account for the prospect on our cloud. They will get a dedicated server and a link. They will click on the link, and they will choose which systems they want to extract metadata from and upload to Octopai. Then they will download these extractors, a very thin client that extracts the metadata, and they need to run those extractors. And that's the only thing they need to do. That extractor runs anything from half an hour to an hour, extracts the metadata, and creates an XML file, which is readable, so you can see that the extraction is only about metadata, not data.
Then you just need to point this file to a link, and that link leads to a vault where the metadata is encrypted. And from that point on, it's uploaded to your account on Octopai. That's the only thing you need to do. The next thing you know, you get a link 24 hours later saying all the metadata that you've uploaded has been analyzed and is ready to be used. And the good stuff about it is that once you run the extractor, you can even schedule how often you would like that extractor to continue extracting the latest metadata that you have in your organization.
So companies use Octopai to refresh metadata sometimes every day, every week, every 2 weeks, every month. That's the only thing that the customer needs to do.
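For a sense of what a readable, metadata-only extract might contain, here is a toy example. The XML layout is invented for illustration, not Octopai's actual format; the point is that only structure and mappings are captured, never row-level data.

```python
import xml.etree.ElementTree as ET

# Hypothetical extractor output: structure and mappings only, no business data.
extract = """
<metadata system="SSIS" extracted="2018-04-01T02:00:00Z">
  <package name="LoadSales">
    <mapping source="CRM.Orders.Amount" target="DW.Sales.Amount"/>
  </package>
</metadata>
"""

root = ET.fromstring(extract)
for mapping in root.iter("mapping"):
    print(mapping.get("source"), "->", mapping.get("target"))
```

Because the file is plain, human-readable XML, a customer can inspect exactly what is about to leave their environment before it is encrypted and uploaded.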
[00:22:05] Unknown:
And for people who are using ETL tools that are more free-form, such as the Airflow project from Apache, or Luigi, or some of the projects from the Hadoop ecosystem, are there considerations that they should keep in mind as they're building out these workflows for being able to generate useful metadata?
[00:22:31] Unknown:
That's a great question. I think this is a great tip or suggestion for everyone, because one of the things that has been neglected in the past years is that people were very, very busy generating business processes, not really paying a lot of attention to the metadata. Now it's becoming a challenge. And I will actually follow your recommendation: as you build your next ETL or your next business process, pay attention to how the metadata relates there. How it's being described, how it's being manipulated, how it's being transformed.
So then you can easily understand its meaning. One of the things that we see is that metadata tends to change as it's modified between different systems. You can find in the ETL a field called customer product, and then on the semantic layer of the reporting, you can see name of product or product name. So now the question starts: do they carry the same meaning, even though they are named in different ways? So one of the things that Octopai paid attention to is, first of all, to solve what you'd call the legacy systems of today, whether that has to do with more traditional systems that have been used for many, many years, like DataStage, Informatica, and SSIS, and then also take care of the less structured, or I would say freehand, capabilities of writing ETL.
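The customer product versus product name problem is essentially a field-matching problem. As a crude sketch of why it is hard, here is a naive string-similarity check using only the Python standard library; Octopai's relationship analysis is certainly more sophisticated than this.

```python
from difflib import SequenceMatcher

def likely_same_field(a: str, b: str, threshold: float = 0.7) -> bool:
    """Guess whether two field names from different systems refer to
    the same data element, ignoring case and separator conventions."""
    norm = lambda s: s.lower().replace("_", " ").strip()
    return SequenceMatcher(None, norm(a), norm(b)).ratio() >= threshold

print(likely_same_field("CUSTOMER_PRODUCT", "customer product"))  # True
print(likely_same_field("product_name", "Name of Product"))
# False: word order alone defeats naive string similarity,
# which is why lineage needs structural evidence, not just names.
```

This is exactly why consistent naming in free-form ETL code pays off later: the less the metadata drifts between systems, the less guesswork any lineage tool has to do.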
[00:24:10] Unknown:
And one of the reasons that I wanted to talk to you in particular is some of the interesting use cases that you have documented for people who are leveraging your platform. So I wonder if you can speak to some of the types of workflows and information analysis that are enabled by using Octopai and having a robust metadata management platform in place for your business information.
[00:24:35] Unknown:
Right. I think that what we see, very interestingly, is the combination of the capabilities that we have been able to make available for the business intelligence team once the metadata has been analyzed, and what they're using it for. So I would say that, from a capability point of view, after the analysis of the metadata we generate the data lineage, and we generate the discovery and the mapping of the data. We provide, for example, the ability to compare between reports and between ETLs. And now the question is, what are you using these types of capabilities for? And interestingly enough, we see that the BI team is dealing with a lot of different, I would say, use cases. One of the most popular ones is how I started this call.
Business user X complains about a report called Y, and I need to reverse engineer how the data landed in that report. Unfortunately, this use case happens a lot, in any organization, every week. When we look at our customers over the past 18 months, there's not a single customer that hasn't needed to draw a data lineage from a report backwards, all the way through the ETL to the source system, tens of times on a weekly basis. These incidents of "I cannot trust the data that I see" happen a lot. The second use case for which discovery and data lineage are important is when you want to design a change in an ETL. Typically, what would happen is you would go to your data architect and say, well, we need to modify some of our ETL processes because we want to add more data consumed from the source systems through those ETL processes. Now, that change may be very, very difficult, because one of the things you need to pay attention to when you wanna make a change is anticipating, ahead of time, before going to production, what the impact of that change could be on anything that depends on that ETL process. And, unfortunately, since you don't know exactly which reports are gonna be affected by a change that you make downstream in the ETL, you just go live. And then you're faced with use case number one: business users complaining about a data mismatch or data integrity issues that were caused by a modification done three or four weeks earlier. The third thing we've seen customers looking to metadata management for, leveraging our automation to do it, is when you run cross-organizational changes. That happened a lot in the past year, and probably will continue as you adapt to, for example, meet the GDPR.
Two use cases that happened to us: one is an insurance company that needed to change how they calculate the number of employees in their organization. To do that, they needed to find a field that is calculated based on three different formulas. Those formulas run in different ETL processes, storing data in different database tables and showing it in hundreds of reports. The challenge was, where is this formula? How do we find it in all of the thousands of ETL processes, where each ETL process has different layers of maps and jobs that run that formula? Another interesting use case was a telecom company where, for some reason, the GK, the field that generates those subscriber numbers, had gone bananas and started to generate fake numbers that caused the financials to be completely inaccurate. So they needed to find almost 5,400 places where this GK field exists, and it took them about four months to do that. So finding a data element within the entire landscape is very, very difficult. Another use case was a big telecom company that bought an Internet company, and that big telecom company had an ETL system.
And the Internet company had a different vendor's ETL system, and they wanted to join them together. So one of the questions was, are there any ETL processes that we can eliminate from the Internet company instead of restructuring them in the existing ETL system? The question they were using Octopai for: are there any ETLs that are not being consumed by any reports? Because if there are no reports showing data from these ETL processes, we actually don't need them. We can delete them. This saved them 60% of the modifications that they would have needed to make. So: anything that has to do with root cause analysis or reverse engineering of data integrity in a report, that's use case number one. Use case number two, understanding the impact of changes, if you wanna change anything in a table or an ETL process or create a new one. Use case number three, if you wanna find a specific field, formula, map, or calculation in order to modify it, and understand in five seconds where that mapping leads you, to find that specific data element. These top four use cases happen a lot, every week, every month, across the year, and take a huge amount of time to do manually. And this is where automation comes into place, to expedite and enhance the capability to track, analyze, and understand metadata across the BI landscape with automation rather than in a manual way.
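That last use case, deleting ETLs that no report consumes, maps neatly onto a reachability query over a lineage graph. A minimal sketch, reusing the networkx-style model from earlier, with hypothetical node names:

```python
import networkx as nx

lineage = nx.DiGraph()
lineage.add_edge("ETL.load_sales", "DW.sales")
lineage.add_edge("DW.sales", "Report.quarterly_sales")
lineage.add_edge("ETL.load_legacy_feed", "DW.legacy_feed")  # no report downstream

def unused_etls(graph: nx.DiGraph) -> list[str]:
    """ETL processes from which no report is reachable: removal candidates."""
    reports = {n for n in graph if n.startswith("Report.")}
    etls = [n for n in graph if n.startswith("ETL.")]
    return [e for e in etls
            if not (nx.descendants(graph, e) & reports)]

print(unused_etls(lineage))  # ['ETL.load_legacy_feed']
```

The impact-analysis use case is the same query in the other direction: before modifying an ETL node, compute its descendants to see every report the change could affect.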
[00:30:25] Unknown:
And when you're collecting the metadata from your customer systems, are there any particular types of information that you need to be careful of collecting in terms of sensitivity that might be exposed if it's not managed properly, or are there ways that customers can exclude certain attributes of their metadata from being collected to meet certain compliance requirements?
[00:30:50] Unknown:
The answer is yes. First of all, let's remember that we are analyzing only metadata. So from a data point of view, we don't show it, we don't extract it, we don't manage it. Nevertheless, there are some cases where our customers wanted to choose what type of metadata, and which metadata, is going to be uploaded to Octopai. And they have the freedom to decide which metadata will be analyzed by Octopai. And even within the application, they are capable of hiding some of the metadata if they need to decide which metadata is gonna be seen by different people within the BI team. So they have full control of the visibility of the metadata that is being extracted, analyzed, and then shown to users within Octopai.
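Choosing which metadata gets uploaded amounts to filtering the extract before it leaves the customer's environment. A toy sketch of such an exclusion filter, with invented element names, purely to illustrate the idea:

```python
def filter_metadata(elements: list[dict],
                    excluded_prefixes: tuple[str, ...]) -> list[dict]:
    """Drop metadata elements the customer chose not to upload,
    e.g. anything under an HR schema (names are hypothetical)."""
    return [e for e in elements
            if not e["name"].startswith(excluded_prefixes)]

elements = [{"name": "DW.sales.amount"}, {"name": "HR.salaries.amount"}]
print(filter_metadata(elements, excluded_prefixes=("HR.",)))
# [{'name': 'DW.sales.amount'}]
```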
[00:31:41] Unknown:
And what have been some of the most challenging aspects of building and growing Octopai, both from the technical and business perspectives?
[00:31:53] Unknown:
So I think that the first challenge was to make the dream a reality: the ability to take these three values that we set for ourselves three and a half years ago and say, can we have a cross-platform product that can be up and running within a day and can be used by everyone? I think that taking those values and understanding what they mean from the product perspective was very, very challenging. Actually, we were talking to large companies, asking them, why don't you do that? And they said, well, you know, it requires very deep technology understanding to make that happen. It's not just more of the same. This should take a very disruptive approach, and I'm very, very happy that we were able to look at the different technologies and make that happen. So, first of all, our challenge was to make this dream a reality, which means that we have a product that actually does what we want that product to do. The second thing, I think, from the business point of view, is that we are a small company that introduced a very disruptive technology. And in the enterprise, very much of the business decision to choose one product over another has to do with trust. So one of the things that we are introducing to the market is saying, hey, stop doing everything you're doing in a manual way.
Stream that metadata to Octopai, which from their point of view is a black box, and the output is gonna show you the metadata in the most accurate manner. And that leap of faith needs to happen. Fortunately for us, one of the things that we were able to do is find those customers that have been using the product for the past 18 months because they are innovators, they are industry leaders, and they were in a situation where they were really curious how that product could actually help them. How can it be that there is a product that combines all these three values? And by that, one of the things we are doing is creating a very good reputation in the market, so that we can be trusted by additional customers that have seen that other customers are using it. So one of our next challenges is to grow our business, because we know that we have a very good product. We know that it works, and we know that it is solving a problem that is on the agenda to be solved within thousands of companies. And we wanna be able to reach the point where they're taking a look at our product as a valid option to solve their problem.
[00:34:30] Unknown:
And are there any particular plans that you have for Octopai going forward, in terms of new features for the platform, improvements in your internal processes, or anything customer-facing that people should be keeping an eye out for?
[00:34:38] Unknown:
So, absolutely. First of all, one of the things that we're doing on the product side is to extend its reach. And when I say that, I mean we want to grow wide, adding more and more systems that can be analyzed by Octopai, not only in the BI space, serving the BI technical team, but also on the business application side. And the other axis is what we call the deep. The deep has to do with additional functionality that helps our customers benefit from the metadata being centralized and analyzed.
So the product not only becomes a viewer, but turns more proactive. I'll give you an example. Say a customer refreshes the metadata constantly with Octopai, which means they've now uploaded April metadata, and we already have March metadata. Octopai's product can be proactive in a way that alerts the customer to any discrepancies before going to production. Not only showing what exists, but also helping them understand which reports have been deleted or modified, and which data elements have been deleted, modified, or added. And that turns the product into something more proactive for the customer, rather than the customer having to use Octopai to find things out. And the third thing that we are very focused on is building our market reach by partnering with very strong system integrators, and also expanding our marketing activities to create more awareness and knowledge about who we are, what we can do, how we have been trusted over the past 18 months by a lot of customers, and why you, too, should consider looking at Octopai and actually start using it.
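The proactive alerting described here amounts to diffing two metadata snapshots, for example March against April. A minimal sketch with invented structures, not Octopai's engine:

```python
def diff_snapshots(previous: dict, current: dict) -> dict:
    """Compare two {element_name: definition} metadata snapshots
    (e.g. March vs. April) and report what changed."""
    prev_keys, curr_keys = set(previous), set(current)
    return {
        "added":    sorted(curr_keys - prev_keys),
        "deleted":  sorted(prev_keys - curr_keys),
        "modified": sorted(k for k in prev_keys & curr_keys
                           if previous[k] != current[k]),
    }

march = {"Report.sales": "rev = sum(amount)", "DW.sales.amount": "NUMBER(10,2)"}
april = {"Report.sales": "rev = sum(net_amount)", "DW.orders.id": "NUMBER"}
print(diff_snapshots(march, april))
# {'added': ['DW.orders.id'], 'deleted': ['DW.sales.amount'],
#  'modified': ['Report.sales']}
```

Surfacing a diff like this before a release is what turns a lineage viewer into an early-warning system for the "report broke three weeks after the ETL change" use case.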
[00:36:41] Unknown:
So for anybody who is interested in getting in touch with you and keeping track of the work that you're doing, I'll have you add your preferred contact information to the show notes. And as a final question, as somebody who is working very closely with a lot of the tooling that's being used for building and managing data systems, is there anything that you view as being the biggest gap in the tooling or technology that's available for data management today?
[00:37:14] Unknown:
Yes. I think that I see a lot of tools that are still looking at the cloud as a difficulty. It's still kind of on-prem, very heavy-duty professional services customization. I think that more and more customers feel more and more comfortable with going to the cloud, which is where our product is. And one of the challenges that we had to face when talking about our solution was, oh, you're on the cloud. I don't think any financial institution or banking or insurance company or retail or health care would agree to use your product on the cloud. And guess what? They do. Because they do understand that metadata and data management tools are more robust, easier, and less costly when they are managed on the cloud, either on our cloud or their cloud.
This is something that needs to be discussed with each and every customer. But the cloud is something that is taking over from the traditional type of installation. So I think that's one of the things that we see a lot. The second thing is automation, leveraging technologies like machine learning to ease the work that the end customer needs to put in. You can build a product that can show you data lineage, but if it involves the customer doing documentation and manual mapping and everything that then needs to be created just to connect the dots in the modeling, and that takes about three or four weeks, if not months, well, that's only half good. We want the customer to focus on what they do best, getting results from their data, and help them not to deal with the cumbersome, tiring, sometimes frustrating infrastructural work. And for that, we wanna provide them automation. And we didn't see enough tools that are leveraging automation, rather than continuing the practice that we've seen in the past three or five years of, let's have a lot of people mapping and documenting your existing processes and just investing a huge amount of time and labor. So these are the two areas: automation and the cloud.
[00:39:27] Unknown:
Alright. Well, thank you for taking the time out of your day to join me and discuss the work that you're doing with Octopai. It's definitely a very interesting platform, and one that obviously solves a very real need for a lot of businesses. So thank you again for that, and I hope you enjoy the rest of your day. Thanks so much. Have a great day.
Introduction to Amnon Drori and Octopai
Challenges in Data Management
Octopai's Solution for Metadata Management
Technical Architecture of Octopai
Customer Interaction and Metadata Annotation
Data Storage and Export Capabilities
Unique Challenges in Metadata Platforms
Onboarding and Deployment Process
Considerations for ETL Tools
Interesting Use Cases and Workflows
Handling Sensitive Metadata
Challenges in Building and Growing Octopai
Future Plans for Octopai
Contact Information and Final Thoughts