Summary
In this episode of the Data Engineering Podcast, Arun Joseph talks about developing and implementing agent platforms to empower businesses with agentic capabilities. From leading AI engineering at Deutsche Telekom to his current entrepreneurial venture focused on multi-agent systems, Arun shares insights on building agentic systems at an organizational scale, highlighting the importance of robust models, data connectivity, and orchestration loops. Listen in as he discusses the challenges of managing data context and cost in large-scale agent systems, the need for a unified context management platform to prevent data silos, and the potential for open-source projects like LMOS to provide a foundational substrate for agentic use cases that can transform enterprise architectures by enabling more efficient data management and decision-making processes.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- This episode is brought to you by Coresignal, your go-to source for high-quality public web data to power best-in-class AI products. Instead of spending time collecting, cleaning, and enriching data in-house, use ready-made multi-source B2B data that can be smoothly integrated into your systems via APIs or as datasets. With over 3 billion data records from 15+ online sources, Coresignal delivers high-quality data on companies, employees, and jobs. It is powering decision-making for more than 700 companies across AI, investment, HR tech, sales tech, and market intelligence industries. A founding member of the Ethical Web Data Collection Initiative, Coresignal stands out not only for its data quality but also for its commitment to responsible data collection practices. Recognized as the top data provider by Datarade for two consecutive years, Coresignal is the go-to partner for those who need fresh, accurate, and ethically sourced B2B data at scale. Discover how Coresignal's data can enhance your AI platforms. Visit dataengineeringpodcast.com/coresignal to start your free 14-day trial.
- Your host is Tobias Macey and today I'm interviewing Arun Joseph about building an agent platform to empower the business to adopt agentic capabilities
- Introduction
- How did you get involved in the area of data management?
- Can you start by giving an overview of how Deutsche Telekom has been approaching applications of generative AI?
- What are the key challenges that have slowed adoption/implementation?
- Enabling non-engineering teams to define and manage AI agents in production is a challenging goal. From a data engineering perspective, what does the abstraction layer for these teams look like?
- How do you manage the underlying data pipelines, versioning of agents, and monitoring of these user-defined agents?
- What was your process for developing the architecture and interfaces for what ultimately became the LMOS?
- How do the principles of operating systems help with managing the abstractions and composability of the framework?
- Can you describe the overall architecture of the LMOS?
- What does a typical workflow look like for someone who wants to build a new agent use case?
- How do you handle data discovery and embedding generation to avoid unnecessary duplication of processing?
- With your focus on openness and local control, how do you see your work complementing projects like Oumi?
- What are the most interesting, innovative, or unexpected ways that you have seen LMOS used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on LMOS?
- When is LMOS the wrong choice?
- What do you have planned for the future of LMOS and MASAIC?
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
- LMOS
- Deutsche Telekom
- MASAIC
- OpenAI Agents SDK
- RAG == Retrieval Augmented Generation
- LangChain
- Marvin Minsky
- Vector Database
- MCP == Model Context Protocol
- A2A (Agent to Agent) Protocol
- Qdrant
- LlamaIndex
- DVC == Data Version Control
- Kubernetes
- Kotlin
- Istio
- Xerox PARC
- OODA (Observe, Orient, Decide, Act) Loop
[00:00:11] Tobias Macey:
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI powered migration agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Poor quality data keeps you from building best in class AI solutions. It costs you money and wastes precious engineering hours. There is a better way. Coresignal's multi source, enriched, cleaned data will save you time and money. It covers millions of companies, employees, and job postings and can be accessed via API or as flat files. Over 700 companies work with Coresignal to develop AI solutions in investment, sales, recruitment, and other industries. Go to dataengineeringpodcast.com/coresignal and try Coresignal's self-service platform for free today. Your host is Tobias Macey, and today I'm interviewing Arun Joseph about building an agent platform to empower the business to adopt agentic capabilities. So, Arun, can you start by introducing yourself?
[00:01:34] Arun Joseph:
Hey. Hello, Tobias. Thanks for inviting me to this podcast. So I'm Arun, Arun Joseph. I'm based out of Bonn, Germany. Well, I've always built distributed systems, and I've been leading engineering organizations. In my most recent role, I was heading the AI engineering program for Deutsche Telekom Group, where we built something referred to as LMOS, which is the agent platform, and which is open source. And right now I'm on my entrepreneurial path, building something around multi agent systems, because this has been close to my heart for quite a long time. I'm originally from India. I used to work for large scale enterprises, leading engineering teams across the globe, including in San Francisco: Juniper, then Merck Pharmaceuticals, where I built their industrial IoT platform. I've always built large scale distributed systems, but those were mostly entrepreneurial endeavors inside an organization. This is the first time I'm jumping headfirst into entrepreneurship outside one. So I'm passionate about distributed systems and large scale AI systems, which are going to define the future.
[00:02:51] Tobias Macey:
And do you remember how you first got started working in data and also in the AI and ML space?
[00:02:58] Arun Joseph:
Yeah. I think my first introduction to large scale data was when I used to work for Nielsen, which is a market research company. Nielsen had this interesting subsidiary called Arbitron, and Arbitron had a device, placed in the homes of people who signed up, which listened to radio signals and collected listenership data. That data used to go through large data pipelines to build listenership market intelligence. So that was the first time we started thinking about large scale data pipelines. I think this was right around the time Hadoop was brought in. And AI/ML pipelines and MLOps were also there in several of their endeavors, but that's essentially the more recent experience.
[00:03:49] Tobias Macey:
And so in terms of the overall space of building agentic systems, particularly when you're looking at the organizational scale and not just a little toy implementation or a proof of concept, there is a lot of complexity involved. The early stages were very oriented around just simple RAG bots, and I say simple in air quotes because they could be very complex. But as we move to more agentic capabilities, where we're adding to the sets of tools and giving more free rein to the language models to make decisions and determine the execution paths in nondeterministic fashion, there is a lot more that goes into it. And particularly, if you are trying to empower nontechnical stakeholders who don't necessarily have the deep domain expertise in how these systems work, you wanna make sure that they're fairly foolproof.
And I'm wondering if you can talk to some of the ways that you thought about the different components and domain segments that go into building something like an agentic system and then being able to scale that to the overall organization, just how you started to approach that overall problem definition.
[00:05:02] Arun Joseph:
Yeah. Absolutely. So this is a fascinating topic, because there are several constructs to unpack here, of which one of the most important is that English is the new programming language, or your thought is the new design. Essentially, what is happening right now is this thing called agentic orchestration, which is merely a feedback loop if you look at it. Right? You provide a goal to an intelligent system. This is how even people work: you give a goal, and the entity attempts to achieve it, given enough tools.

If it doesn't work, it loops through, gets new insights, and then iterates. This orchestration loop means that if you define your instructions very precisely, like a program, add two numbers, it doesn't have to go through that loop multiple times. But if you specify something in an abstract manner, I have two chocolates and somebody gives me some more and I need to figure out the best way forward without knowing the construct of addition, then the loop first figures out what mathematical construct to apply. Okay, this is good, now let's apply it. Great. And then let's verify it. So in this example, what just happened? You move from programming, which was very deterministic and used to be written by SMEs, where specialized expertise was required, to a capability that allows people to describe in a broader manner what they want, and the loop takes care of it to some degree. The implications are profound, especially in enterprises.

What do most enterprise information systems do? They merely move data from one place to another and then do some kind of transformation. And, of course, there is resilience, reliability, and distributed systems. But most of the requirements are around moving data from one place to another, and some business stakeholder would say, I want it this way, I want it that way. And now, tying it back to the analogy: the engineers of an organization need not build systems exactly as the business people say, but rather build systems which allow the business to state what is required, and the system makes it come true. That is the fundamental mental model. But in order to make that possible, you need great models, a robust way to connect data and tools, a robust way to run orchestration loops, a robust way to manage the growing complexity as you build more such programs, and the complexity of how these programs interact with each other. That, in essence, is the anatomy of a multi agent system.
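To make that loop concrete, here is a minimal sketch of the goal, act, observe cycle described above, assuming an OpenAI-style chat completions client. The single add tool and the model name are illustrative placeholders, not part of LMOS:

```python
import json
from openai import OpenAI

client = OpenAI()

def add(a: float, b: float) -> float:
    return a + b

TOOLS = [{
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two numbers",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    },
}]

def run(goal: str, max_turns: int = 5) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):  # the feedback loop: act, observe, iterate
        reply = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=TOOLS
        ).choices[0].message
        if not reply.tool_calls:  # the model considers the goal met
            return reply.content
        messages.append(reply)
        for call in reply.tool_calls:  # execute tools, feed results back
            args = json.loads(call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": str(add(**args)),
            })
    return "gave up after max_turns"

print(run("I have two chocolates and someone gives me three more. How many?"))
```

A precise request resolves in one pass; a vaguer goal simply spends more turns in the same loop.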
[00:07:56] Tobias Macey:
And in terms of the primitives that you have available to build a system like that, that's obviously a very rapidly evolving ecosystem. To your point about multi agent systems, that of course brings to mind the A2A protocol that Google just donated to the Linux Foundation. In terms of tool call definitions, it brings up the hype that's been growing around the Model Context Protocol. There are obviously custom tool definitions that you can create, and the idea of handing tools to these language models has been around for at least a year or two. And I'm wondering how you think about what the stable primitives are that you have available to build an enterprise scale system on top of, without risking that one of these protocol authors introduces a new revision that massively changes the capabilities or requires a lot of rework in terms of how you actually think about your implementation?
[00:08:53] Arun Joseph:
Yeah. This is a very, very good question, because we went through the other side of it. We built it all first, before MCP, before A2A, in LMOS, the platform that we built. But then these protocols came up, and our stakeholders as well as our team started to ask, what are you gonna do now? So this is a valid question, but let's unpack a few things here. We went into production back in the first quarter of 2024 as an agentic platform; the first two agents went to production in the first quarter of 2024. They used to connect to telco APIs, which are really complex.

So there was no MCP. There was no A2A. But we could do it. The first principles don't change; that's the whole point, essentially. And then, when we started to build more such agents against more such APIs, the need for some sort of harmonization came in. What is the pattern you tell an engineer so they don't reinvent the wheel? This was the beginning of what we refer to as the LMOS protocol, which was based on Web of Things, because LMOS was built on the fundamental construct that everything should be open and we should not reinvent the wheel. So let's pick the best protocols out there: without inventing a new protocol, how do you allow agents to collaborate? At the same time, organizations needed models to connect to data, and MCP was the first one which came out addressing that topic. It immediately gained a lot of adoption because of the size and scale of Claude and Claude Desktop, and because it could connect the models to the tools quickly. But the first principle of how you connect your data is not going to change. Most of these enterprise APIs have been there for years. When GraphQL came in, a lot of people tried to build GraphQL wrappers on top. Now MCP comes in, and people are trying to retrofit MCP onto the traditional APIs. And then, on top of it, A2A. My suggestion is, before betting on any big protocol: it's pretty simple to connect your existing API to any of these LLM prompts.

Start there, experiment, and then, if it scales, bring in additional layers. MCP looks like it is being adopted by OpenAI, Anthropic, and Microsoft, and there are enough MCP servers that it is good enough to bet on, but large scale asynchronous communication is not solved yet in MCP. And with A2A, similarly, whether large scale agent collaboration is even required is still a question, because if you need large scale agent collaboration, you need large scale state management to hold true; otherwise two agents cannot collaborate. So A2A is still a shady area in my view, and most enterprise systems might not require every problem to be seen as agents. That's another thing. In microservices, we say don't start with microservices, right? Start with the monolith.

So it's easier to build your particular department's use case as a single system. Connect your data with MCP, or if it's a legacy API, don't try to MCPify it. Keep it simple, the KISS principle: keep it simple, stupid. And then think about scaling it to other protocols.
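In that spirit, here is a minimal sketch of the start-simple advice, assuming a hypothetical legacy billing endpoint: the existing API is wrapped as a plain Python function and registered as a tool in the same kind of loop shown earlier, with no MCP or A2A layer involved. The URL and field names are illustrative:

```python
import requests

def get_open_invoices(customer_id: str) -> list[dict]:
    """Fetch open invoices from a (hypothetical) legacy billing API."""
    resp = requests.get(
        f"https://billing.internal.example/v1/customers/{customer_id}/invoices",
        params={"status": "open"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["invoices"]

# Register this as a function tool exactly like the `add` example above.
# Only when this pattern repeats across many teams does a protocol layer
# like MCP earn its keep.
```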
[00:12:39] Tobias Macey:
The other interesting challenge when you're dealing with an organizational scale enablement system, particularly for something like agentic use cases, is to your point, you need to be able to connect it to the underlying data that the agent needs to be able to use either to operate against or to use as context. And that's a substantial challenge in and of itself, again, because of the fact that data in and of itself is challenging, but, also, you're working with people who don't have that domain knowledge to understand all of the complexity that goes into all of the data prep, all of the ETL pipelining, how to manage the different chunking strategies, etcetera.
And so I'm wondering how you think about that element of exposing the existing organizational data and then streamlining the work of actually doing the additional preparatory work for being able to load that context into something like a vector database and standardizing on that core technology.
[00:13:39] Arun Joseph:
In fact, this clearly has been a missing piece in most enterprises. Enterprises used to have data pipelines, but mostly for transactional data: transactional systems would emit data, which went through huge data pipelines, your NiFis and so on, landing in warehouses. What has emerged now is a need for unstructured natural language data to be parsed into semantic vectors, which can be queried and finally collapsed into an answer by the LLMs. So this is a new technical skill set that most organizations need to learn, but the first principles of how large scale data ingestion works still apply, which means the data still needs to go through phases and pipelines. For example, at Deutsche Telekom, when we wanted to ingest the corporate knowledge base and FAQs for customer support people, if you just ingested it as is, the search endpoints did not perform as we wanted them to. The accuracy rate was subpar.

So we had to clean up the domain objects and domain topics to create an ontology of the knowledge for the customer support groups, and build that ontology into the pipeline. It requires two skill sets. One is the technical skill set: what is a chunking strategy, what is an embedding model, what kinds of vectorization approaches can you use. The other is the domain knowledge: hey, my customer support is broken into the billing domain, the customer domain, contracts, Magenta TV, and all that. Both kinds of knowledge are essential, and they need to come hand in hand to create those pipelines. At the same time, there are only a few levers you can pull to get the best answers, especially in RAG. One is definitely the vector database. The second is the pipeline itself: how you create the ontology and do the cleanup.

And the third is the actual search endpoint that you build, which should be able to rely not just on vectorization but also on additional dimensions, to create a hybrid search approach. These are the levers you have. And I think the vector side plays a huge role, in the sense that not many developers are going to create their own vector databases, for sure. The rest of the pieces they can tweak: what kind of search algorithm to put in, what kind of pipelines to build. So the choice of vectorization approach, and how that system will scale with large ingestion pipelines, is something which needs to be really thought through. At DT we brought in Qdrant, which served the purposes really well with the operational simplicity that Qdrant baked in. In terms of vectorization, Qdrant has a brilliant user interface which shows the entire embedding space, and the operational simplicity was unparalleled.

And this helped the developers focus on the ingestion pipelines and not on the operational characteristics of maintaining the topology of the vector databases, because you need multi tenancy and all that. This resulted in the pipeline we referred to as Wurzel, which was developed by a couple of engineers in my team, Thomas Weigel among them. Wurzel, in German, means root. It was built as pipelines that are like roots going into the soil: the enterprise world is the soil, and the pipeline goes in and collects the nutrients, which is the data, in many different formats. Essentially, it did not try to replace anything, but rather asked: what is the best embedding strategy, or the best framework? A lot of frameworks were coming up, and within each framework there would be one or two capabilities which were really good. How do you club together these capabilities and not bet on only one LlamaIndex or something like that? So this was Wurzel, a pipeline: it ingests the data, lands it in the Qdrant cluster that we had, and then the search endpoints were built.
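A minimal sketch of those levers with the qdrant-client library follows: ontology tags stored in the payload, a vector index for similarity, and a search that combines the two. The collection name, vector size, and domain field are illustrative, not the actual Deutsche Telekom setup:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

client = QdrantClient(url="http://localhost:6333")
client.recreate_collection(
    collection_name="support_kb",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

def ingest(chunk_id: int, text: str, vector: list[float], domain: str) -> None:
    # The ontology work shows up here: each chunk carries its cleaned-up
    # domain (billing, contract, Magenta TV, ...) as payload metadata.
    client.upsert(
        collection_name="support_kb",
        points=[PointStruct(id=chunk_id, vector=vector,
                            payload={"text": text, "domain": domain})],
    )

def hybrid_search(query_vector: list[float], domain: str, limit: int = 5):
    # Vector similarity narrowed by an ontology filter: one simple form of
    # the "additional dimensions" mentioned above.
    return client.search(
        collection_name="support_kb",
        query_vector=query_vector,
        query_filter=Filter(must=[FieldCondition(
            key="domain", match=MatchValue(value=domain))]),
        limit=limit,
    )
```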
[00:18:12] Tobias Macey:
In terms of the embeddings and the additional context that you need to bring in, one of the common pieces of wisdom when you're building RAG systems is that you don't know what chunking strategy you want to use until after you've tried it and experimented a little bit. And so I'm wondering how you thought about either determining that this is the standard strategy you're going to use because it's good enough for most use cases, or how much control you wanted to give to the end builders of these agents, to say: you know the content that you're working with, so you're going to set some parameters to determine which chunking approach to use. And just how to think about building that into a flexible system without making it flexible to the point of being useless.
[00:19:00] Arun Joseph:
Yeah. Absolutely. I think too much abstraction is a risk; it's a balance we had to strike, and that thinking went into the system as well. Essentially, we were not building a platform right from the start; we were supposed to solve these use cases. We were measured by one metric: how accurate are the answers from the German customer service bot for the customer? And in order to achieve that, we did not start by thinking about what flexibility we needed in the framework. We did the heavy lifting of figuring out the right chunking strategy. Even for embedding models, we ended up with our own fine tuned embedding models, because it was German. We started building our own embedding models back then, which proved more accurate than the off-the-shelf models from OpenAI at the time. But once we figured this out, the next country came in, which, I believe rightly, was Croatia.

And then you needed to abstract away the best way to create the Croatian knowledge ingestion pipeline, such that the data scientists and the people who know Croatian, at the Croatian Telekom group, can manage that pipeline. So we built Wurzel in such a manner that they could play around with multiple chunking strategies, not at a UI level, because these are small atomic units. It was built on DVC, the DVC pipelines, right? So Wurzel was built on top of DVC, and you have something called Wurzel steps.

One of the steps is a default chunking strategy step, which, again, is not exposed as a UI. It's a simple module, a Wurzel step, which is a Python program. You can bring in your favorite library with better utilities for the chunking strategies you might want, import it into this unit, try it out, and then plug it into the rest of the pipeline, and it will work. So it's the Unix pipes approach that we used to build the Wurzel pipeline, as sketched below. This flexibility was crucial if you want to expose it to people who know the language, because otherwise you are limited to the engineering group, who might know only one language. That's how we started to expand it to Hungarian and Croatian, so that the data scientists in those groups are not working against a black box with some UI dropdown of chunking strategies; they can bring their favorite library into the step and change only that. The rest of the pipeline remains the same.
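A sketch of what such a swappable step might look like follows. The step signature is illustrative, not the actual Wurzel API; the point is the Unix-pipes shape, where only the chunker changes and the rest of the pipeline stays fixed:

```python
from typing import Callable, Iterable

Chunker = Callable[[str], list[str]]

def default_chunker(text: str, size: int = 500) -> list[str]:
    # Naive fixed-width chunking as the out-of-the-box behavior.
    return [text[i:i + size] for i in range(0, len(text), size)]

def chunking_step(docs: Iterable[str],
                  chunker: Chunker = default_chunker) -> Iterable[str]:
    """One atomic pipeline step: documents in, chunks out."""
    for doc in docs:
        yield from chunker(doc)

# A data scientist swaps in a favorite library by matching the signature,
# e.g. (hypothetically):
#   from langchain_text_splitters import RecursiveCharacterTextSplitter
#   chunks = chunking_step(docs, RecursiveCharacterTextSplitter().split_text)
```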
[00:21:42] Tobias Macey:
So based on the work that you did at Deutsche Telekom, you ultimately created the LMOS open source project. And one of the perennial challenges of releasing something as open source that you first built for a particular organizational use case is that the design of the system, as implemented in the business, implicitly encodes a lot of the organizational patterns of that organization, which don't necessarily translate externally. And so I'm wondering how you approached that translation process: figuring out which pieces are generally applicable, but not so general that they're useless, and then designing it in a way that it can be used as a foundational substrate that other people can build on top of, without making it so complicated or confusing that they're never going to start down that path?
[00:22:34] Arun Joseph:
That's a very sharp question. It's like the Conway's law design of a framework, right? Two things really worked out well in terms of reaching a just-enough point in abstraction. For example, when we started in 2023, there were these frameworks; LangChain was a major one in 2023. We started by picking LangChain, but almost all our transactional systems and APIs, the profile API, the billing API, all of that was built on the JVM stack. The second point is that almost the entire operational estate was built on the Kubernetes CNCF stack: Kubernetes, the observability, Grafana, Prometheus, everything. Now comes a new framework invented somewhere else, based on some other use cases. How do you fit this in with the people who actually know the data and the domain, and what happens to all the client SDKs that have been built? That's the reason we went back to the drawing board in creating LMOS. And we did not start creating LMOS by wanting to create a framework. The problem statement was quite simply this: we implemented something in LangChain, it took a couple of months, and after that no one knew how to build it into a platform because it was so chaotic, or how it actually fit into the rest of the platform stack. So we went back to the drawing board and came to a realization: only if we provide the right amount of tooling to the people who know these APIs and domains, and let them build it, will this scale. Otherwise you'll have a new team building something else, and then you need to ask for data from this team, and Jira tickets, and this and that: Conway's law. The second point is that you cannot dictate how the framework should be. What you can build is a way to shorten the loop of doing an experiment, say, changing something in an agent and taking it to staging. For that, you need a robust pipeline: from the moment a developer changes some behavior in an agent, an ephemeral environment is immediately spun up to test it in isolation.

And this is all coming from the distributed systems background, right, from the Kubernetes world. So this is how we approached it: not as a framework, but as a way to shorten the feedback loop for testing, because no one knows how to build these systems reliably. That resulted in the stack, and LMOS is not a framework. LMOS has something called ARC, the agent reactor as we call it, right, like the Jarvis arc reactor. It was built in Kotlin so that we could build a DSL with just enough surface for engineers who know their APIs, who are on the JVM, to build agents, without having to figure out hundreds of new APIs, Spring AI, and this and that. Then these agents need to live somewhere, which is lifecycle management. So we built the LMOS platform to deploy these agents with one Git push, for example; it spins up an ephemeral environment. The LMOS platform is entirely Kubernetes based, which means you're not reinventing the wheel. Agents were created as first class citizens there, so you could do kubectl get agents, and the lifecycle management of an agent is taken care of. When you push an agent, you could say: I'm a billing agent, I can handle billing queries, and I can handle billing disputes.

I advertise this as an agent to the network when I am deployed into the Kubernetes platform. So you use the discovery mechanisms already proven in Kubernetes, through the Istios of the world, and bring that into the Kubernetes registry, the Istio registry. You're not reinventing an A2A registry or something like that. It's all a stack that enterprise and operational teams are familiar with. And the LMOS stack should be universally applicable for distributed systems teams, without trying to do too much inside ARC. Wurzel, for example, is similar. We knew exactly where we wanted to stop. If you try to build a UI with dropdowns for chunking strategy and this and that, you immediately lose the flexibility of bringing in the best Python frameworks or stacks on the pipeline side. So we just stopped there, and picked DVC for point-in-time recovery and all that, and bet on large scale ingestion pipelines based on Kubernetes, etcetera. So it's universally applicable in enterprises, would be my answer.
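To illustrate the agents-as-first-class-citizens idea, here is a hedged sketch of what kubectl get agents looks like programmatically, using the Kubernetes Python client. The group, version, plural, and capabilities field are hypothetical stand-ins, not the actual LMOS custom resource definition:

```python
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# List a hypothetical Agent custom resource across the cluster.
agents = api.list_cluster_custom_object(
    group="lmos.example.org", version="v1", plural="agents"
)
for item in agents["items"]:
    spec = item.get("spec", {})
    # Each agent advertises what it can handle on deploy, so discovery is
    # a registry lookup instead of a bespoke A2A registry.
    print(item["metadata"]["name"], spec.get("capabilities", []))
```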
[00:27:18] Tobias Macey:
Interestingly, the LMOS acronym is language model operating system, and operating systems have been a ubiquitous concept in computing for decades now. I'm curious what you were thinking, in terms of the naming, about what you're trying to convey with that concept of operating systems in this language model context, and which of the core theories of operating system design you brought into this framework to enable such a generalized substrate for agentic use cases?
[00:27:53] Arun Joseph:
Yeah, this is clearly one of the reasons I'm starting the startup itself, because I was so fascinated by the idea when AI took off. Different people saw it as, oh, it's magic, right? But I love thinking in terms of fundamental abstractions, whether in physics or biology or computational systems. And I used to wonder: I wasn't there when Linux was born, when a new era of computing was emerging and the question was how you build programs. When I first interacted with the language models, it felt like you now have a new microprocessor, made of some magical silicon. Instead of precise x86 instructions, you can give natural language instructions, and it's going to emit some response into your registers. So, suddenly, it flips.

This would mean you need to build new programming and operating constructs from scratch for a new computational unit, which could be agents; that was the thought process. That shift, language models as microprocessors, led to thinking: let's build all the layers above, starting with how you interact with these models. Models are going to emit strings as tokens, and those strings and tokens need to control the program flow, which is a totally different programming paradigm, right? And how do you build scheduling on top of it? For example, say a request comes in and the program needs to do some planning while, at the same time, it needs to respond to a task. So you need a scheduler which optimizes the resource, which in this case is the language models, whether for cost or for units of time. All of those constructs needed to be revisited, along with how you fundamentally handle nondeterminism in a program. That resulted in the thought process behind LMOS, and since Linux had Tux the mascot, we brought Sesame Street's Elmo into this as the mascot for the next stack for agentic computing. This was the vision in 2023, like the Xerox PARC group: that was how most of the engineers joined the team as well, you know, like when personal computing took off and Xerox PARC came up with object oriented programming and all that. We were a couple of engineers passionate about building something great. So: let's build the foundations for agentic computing, and call it LMOS. Let's build agent communication protocols, agent computational units, and the scheduler for interacting with language models, while at the same time solving the customer use cases for Deutsche Telekom. That was the background and the storyline behind LMOS.
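The scheduling analogy can be made concrete with a small sketch: treat the model as the scarce resource and order pending calls by priority within a cost budget. This is entirely illustrative, not LMOS's actual scheduler:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class LlmTask:
    priority: int  # 0 = interactive reply, 9 = background planning
    estimated_cost: float = field(compare=False)
    prompt: str = field(compare=False)

def run_schedule(tasks: list[LlmTask], budget: float) -> list[str]:
    """Serve the highest-priority LLM calls first, within a spend budget."""
    heapq.heapify(tasks)
    served = []
    while tasks and budget > 0:
        task = heapq.heappop(tasks)
        if task.estimated_cost > budget:
            continue  # defer work the remaining budget cannot cover
        budget -= task.estimated_cost
        served.append(task.prompt)  # here you would actually call the model
    return served
```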
[00:30:51] Tobias Macey:
One of the other interesting challenges, which has been exacerbated by the scale of capabilities of these language models, is the high degree of variability in their pricing structures and the complete unpredictability of the number of input and output tokens that get used for the context. How do you think about reporting on and managing the costs and budgeting of these different agents, to make sure that you're not going to bankrupt the company in the process of solving its problems?
[00:31:25] Arun Joseph:
Absolutely. Essentially, this also needs to be looked at fundamentally from a computational point of view. Most of the agent building process these days falls into the category of: let's use LLM invocations for every computation. Let's say there is a flow where a customer requests a refund: call the refund API, get the response, then do the account update API, etcetera. What is being observed today is that most people are using LLMs all the time for these invocations. When you take a step back and think about how computation works in nature, for energy conservation, we don't reinvent the same process using our brains every time. Once something becomes deterministic, like the habit formation loop, you put it into a low energy execution phase. For agents, too, this sort of paradigm needs to emerge, the way I'm thinking. Instead of invoking LLMs all the time, if the agent has figured out that 80% of the use cases can be solved by a deterministic flow, the agent itself constructs that deterministic flow, uses it for the invocations that come in, and falls back to the LLM for the remaining 20%. But all of this requires rigorous instrumentation of costs. Essentially, just like in an operating system, right: for each process you assign CPU and memory, and then you do the accounting and observation via process IDs.

You need to think of the unit, in this case the agent, in terms of such resource allocation, then monitor it, with massive observability platforms and intelligent decision making platforms keeping this in check. So, two things. It's like the OODA loop, right? Observe: you need to keep observing what is going on. And then you need to decide what needs to be optimized; in this case, computation needs to be optimized. It doesn't make any sense to do LLM calls all the time. It is soon going to come crashing down, is what I bet.
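A minimal sketch of that 80/20 split, with hypothetical handler names: known intents are routed around the model entirely, and only the long tail falls back to the full agent loop:

```python
def call_refund_api(req: dict) -> str:      # stand-in for a real transactional API
    return "refund issued"

def call_account_update(req: dict) -> str:  # stand-in for a real transactional API
    return "address updated"

def classify_cheaply(text: str) -> str:
    # In practice a small or local model; here a trivial keyword check.
    return "refund_request" if "refund" in text.lower() else "unknown"

def llm_orchestrate(req: dict) -> str:
    # The full agent loop from earlier in the conversation would run here.
    return "escalated to LLM orchestration"

DETERMINISTIC_FLOWS = {
    "refund_request": call_refund_api,
    "address_update": call_account_update,
}

def handle(request: dict) -> str:
    intent = classify_cheaply(request["text"])
    flow = DETERMINISTIC_FLOWS.get(intent)
    if flow is not None:
        return flow(request)         # low-energy path: no LLM call at all
    return llm_orchestrate(request)  # the remaining ~20%: full agent loop

print(handle({"text": "I want a refund for last month"}))
```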
[00:33:46] Tobias Macey:
And then there's the focus on openness and local control for businesses that want to capitalize on these agentic capabilities: the ability to use local models for more cost control, or for more predictability, being able to freeze or pin a specific model version so that you're not at the whim of whatever the model provider does under the covers of their API. I'm curious how you see the work that you're doing complementing or overlapping with other projects in the ecosystem, such as Oumi, which is focused more on enabling organizations to build their own foundation models and control more of that element of the life cycle as well.
[00:34:33] Arun Joseph:
Yeah. So this is also one of the reasons I'm going on this entrepreneurial journey, betting on one of these key constructs. What we have observed in enterprises is that once a pattern has been set, let's say somebody has built on the best model available for some use cases, there is no incentive for it to be shifted later. In enterprises, typically, once something has been set, it's very difficult to shift. But the worrisome point is all the information in the feedback loop. For example, if you had control over fine tuned models, or rigorously tracked the model outputs, that is a wealth of information you could use going forward to do fine tuning, maybe bring down the computational cost, and build better intelligence for your organization. That opportunity is missed if you keep betting on these large models. For that to happen, you need a platform or layer which allows this mitigation strategy to be baked in. When you call the completions API, instead of going directly to, let's say, OpenAI or Gemini or whatever it is, there should be a way in which the call is mediated by a semi proxy layer which allows different models to be quickly plugged in. And, essentially, as OpenAI recently showed with the Responses API, which is a higher order API, there is a beautiful example where, with three lines of code, you write a call saying, add this toner pad to my shopping cart, against a Shopify MCP server whose tools are plugged into the Responses API. That is the only input given, plus the MCP server configuration. There is no additional code. It does the agentic orchestration, calls search product, add product, and checkout shopping cart, all underneath the platform. So if organizations start to use these APIs, it is super simple; the simplicity of such an API, and the value it brings, is tremendous. You could write any number of use cases in a given day. But the problem is, if you don't have the mitigation strategy for how your organization is orchestrated against some black box API, you will miss out on a lot of information which you could have preserved for the time when model tuning becomes much cheaper, and you will soon be tied to these large model companies, just consuming some black box higher order API. So the layer that I'm building is something like the Responses API. One of the components we are building allows the same OpenAI-like structure, but different models can be plugged in while it is deployed on your premises. And that data is immediately accessible for you to fine tune your models, immediately traceable, with the same simplicity as OpenAI: in four lines of code, you can build a number of orchestrations.
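A hedged sketch of that mediation layer, using the OpenAI Python SDK's Responses API with a configurable base_url so the model behind the call can be swapped and every exchange logged. The gateway URL and the MCP server are illustrative, not Masaic's actual component:

```python
from openai import OpenAI

# Point the standard SDK at an on-premises proxy instead of api.openai.com.
client = OpenAI(base_url="https://llm-gateway.internal.example/v1")

response = client.responses.create(
    model="gpt-4.1",  # the gateway may remap this to a self-hosted model
    input="Add this toner pad to my shopping cart",
    tools=[{
        "type": "mcp",
        "server_label": "shopify",
        "server_url": "https://example-shop.example/mcp",  # hypothetical
        "require_approval": "never",
    }],
)
print(response.output_text)
```

Because every request and response passes through the gateway, the organization keeps the interaction data it would otherwise lose to a black box API.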
[00:37:47] Tobias Macey:
One of the other interesting elements of this new world of agentic capabilities is the broad applicability of the problem spaces that they can be applied to. And I'm wondering how you see the potential for something like LMOS to be employed in the context of a data team, to build their own agents to help manage some of the pipeline design and implementation, run some of the analytical queries that you want to expose to the business, or provide some of the other end user capabilities that are particularly interesting or innovative, that you have either conceived of or seen in action?
[00:38:30] Arun Joseph:
Yeah. So, essentially, there was recently a meetup for the Eclipse Software Defined Vehicle group, I think it was in Copenhagen, where we demonstrated, with LMOS, vehicles emitting telemetry data. The Software Defined Vehicle group is a consortium of some of the major companies in Europe focusing on telemetry data emitted from vehicles, software defined vehicles. We demonstrated a simple agent, built with LMOS ARC, the agent developer framework, the agent reactor framework, which could connect to this telemetry data. It was a time series database, and the query building was being done by the agent using OpenAI APIs. The fun part is that you could ask questions like, how many of my vehicles are running low on fuel, among maybe a thousand vehicles, and all of that comes back, in a few lines of code. So, essentially, what this means is that for data analytics teams, it is all about querying.

Right? The query language could be SQL or Cypher; a number of approaches are there. But this acts as a layer on top, letting them quickly build dynamic query creation agents even though the underlying systems don't change. You don't have to change your Athena, your Redshift, or your Snowflake. You could build an agent that constructs these queries, because language models are also good at understanding existing query languages. And it can be done pretty easily, as we have seen so far.
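A small sketch of that dynamic query-building pattern, with an illustrative schema and a crude read-only guard (a production version needs much stricter validation): the model writes the query, and the underlying database stays unchanged:

```python
import sqlite3
from openai import OpenAI

client = OpenAI()
SCHEMA = "vehicle_telemetry(vehicle_id TEXT, ts TIMESTAMP, fuel_pct REAL)"

def answer(question: str, conn: sqlite3.Connection) -> list:
    # Ask the model to translate the question into a single SELECT statement.
    sql = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Schema: {SCHEMA}\nWrite one SQLite SELECT statement "
                       f"(no prose, no markdown) answering: {question}",
        }],
    ).choices[0].message.content.strip()
    if not sql.lower().startswith("select"):  # crude read-only guard
        raise ValueError(f"refusing to run: {sql}")
    return conn.execute(sql).fetchall()

# e.g. answer("How many vehicles are below 10% fuel?", sqlite3.connect("fleet.db"))
```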
[00:40:15] Tobias Macey:
And in your experience of building these agentic frameworks and figuring out how to manage the creation and maintenance of the context corpus that enables these agents to work effectively, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:40:36] Arun Joseph:
Yes. Andrej Karpathy recently pointed out that all of information engineering is going to converge into context engineering. Simple questions are easy to handle, but as the system grows in complexity, how do you pass the right context to the same agent or to different agents? There are many approaches emerging; nobody has fully figured out this pattern yet, because as it grows and scales, this is going to become challenging. But one of the things is this: take an ecommerce organization, a company whose business is commerce. If different departments start to build different agents, what you end up with is today's spaghetti systems.

You would lose out on deriving the true value of AI, because the context is now segregated. The true value of AI in enterprises is going to emerge only from unifying at least the foundation of the context, so that other agents can be built. What that means is a big shift in enterprise architectures. The microservices world taught that small is good: small, nimble teams that can go and do whatever they want. But with AI there is a problem, because if you don't have a unified view of the truth of your enterprise stored somewhere, the agents being built will not be as effective as they could have been. That brings in the need for a core context and model management platform, where you connect your enterprise tools so that your departments and engineers can build easily without asking for permission.

Hey, can I use your department's data? Because what matters is the business outcome. Suddenly, if you tell your orchestrator, I want to optimize my profits for teenagers, what suggestions exist? If the platform has the tools, you know, a sales projections analysis tool, a buyer patterns tool, an inventory tool, and a market analysis tool, then with that one question the agentic orchestration kicks in and comes up with immediate results. So this layer is the essence. Building your organization for agentic orchestration: that is the definition of AI native, as I would call it, the architectural equivalent of an AI native. How do you prepare your organization to be AI native? Models are going to get better and cheaper, and new programming paradigms will emerge. But one thing certainly is not going to change: you need this context building platform layer for the world which is coming, and that is going to be the core of your AI native transformation journey.
[00:43:30] Tobias Macey:
And I think, extending your reference to the microservices architecture, we went through a similar problem, where as you go from the monolith to the microservices, you still have to figure out where that state gets created and maintained and how it gets distributed. And that also leads to all of the complexity of reconstituting the relevant context in the warehouse, where you pull the data out of all of these different microservice systems and then have to figure out what the linkages are, recreating some of the API based connections at the data layer to bring it all back together and make it semantically meaningful. And we're in a similar stage with these AI systems: if you have that warehouse catalog, you can use it to a certain extent, but we need to figure out how to feed a lot of these LLM interactions back into it, to help maintain and grow and evolve the corpus without having it shard into different domains that then have to be reconstituted, if you even know that they exist in the first place.
[00:44:38] Arun Joseph:
Absolutely. Absolutely spot on. And that's what most people are seeing. From what I've seen, and some of the stories from other places, the spaghetti we saw in the previous world is only going to get multiplied. Everyone is building three agents on top of an existing microservice. Now the microservice is four units of computation, and no one knows the context anymore. My agent can only respond with the customer address; now you think about what protocol you should use, A2A, to connect to the order agent. And now you have scrum meetings, and it's insanity.
[00:45:22] Tobias Macey:
Alright. For people who are figuring out how to build their own enablement layer for these agentic use cases in their organization, what are the cases where you would say that LMOS is the wrong choice?
[00:45:37] Arun Joseph:
So LMOS has multiple components. LMOS itself has the agent framework per se, which is built around Kotlin. So if you're a non-Kotlin, non-JVM organization, it doesn't make sense to use that agent framework. The platform layer is still in the incubation phase; it's standard Kubernetes, and it will at least provide a lot of inspiration for you to pick up, irrespective of whether you run Python agents or something else. So for organizations that are not into the JVM and Kubernetes, or are entirely based on Python or C#, it might not make sense, though there is still a lot to learn from it about managing the life cycle of agents. That's also why the learnings, and where this goes, extend beyond agent frameworks, which is one of the components being built under the new startup: the Masaic open responses component.

Essentially, it does not restrict you to any framework. As I described earlier, it's the complete Responses API, just like OpenAI's. Once this component is deployed, every agent framework you can think of, the OpenAI Agents SDK, AutoGen, Kotlin, or whatever, should work, because it addresses the problem of context building for large scale enterprises and model switching. If the OpenAI developer platform were deployed on your premises, what would it be like? That's the way we are thinking. That's the essence of what you need to build in your organization, irrespective of whether you use this component or not.
[00:47:20] Tobias Macey:
And as you continue to contribute to LMOS and continue along your own venture that you're building, what are some of the new capabilities that you have planned? What are some of the ecosystem evolutions that you're paying particular attention to, or any of the capabilities that you're particularly excited to dig into building and growing?
[00:47:44] Arun Joseph:
Yeah. I think on the LMOS side there are now two initiatives. One is LMOS, which is under the Eclipse Foundation, where we are listening to the industry on what is actually required and not trying to make it bloated with, you know, let's add this feature or that feature. At least the agent framework is very stable; the Kotlin framework is used in production. The LMOS platform we originally built so that you could deploy any agent, Python or otherwise, and convert them into agents as first class citizens in Kubernetes.

This we are rethinking, because if you try to fit everybody, it will never work. So we want to convert that into a minimalistic registry based on Kubernetes, which does external orchestration. For agents on the JVM stack, LMOS should hold true today; I would say it's one of the best frameworks out there, in that it allows business and engineers to work on the same thing, which is very rare. If you're using ARC, the business can define the use case: I want to define the billing dispute use case. The business writes it in the English language, and we are able to run it, because if you just randomly write prompts it will never work; you need a runtime for the English language, a meta language which we created, referred to as ADL in ARC. So it works. The part which I'm most excited about is this open responses component, which is being built by my cofounders, Jasbir and Amanth.

They are building this component to become, like I mentioned, that orchestration layer for models and the context building layer for the whole enterprise, so that your developers and your departments, and no one is going to change the department structure, can still build agents on their favorite stack, with the minimum guidance: prepare yourself to change models quickly, don't lose context, have the ability to connect to any number of tools, and don't bring your own siloed tool registry and all that. That layer is called the Masaic open responses layer. And by the way, Masaic is the name of the company I'm building; Masaic stands for multi agent systems, MAS. So this is the thing which I'm most looking forward to. It's an open core model: open source, then building on top. Yeah.
[00:50:17] Tobias Macey:
Are there any other aspects of the work that you're doing on LMOS or Masaic, or the overall space of enabling agentic use cases and all of the data requirements that go into it, that we didn't discuss yet that you'd like to cover before we close out the show?
[00:50:32] Arun Joseph:
Yeah. I think we went into the idea of orchestration, and I would say everything revolves around the idea of orchestration from here on. And the idea of orchestration is simply feedback loops: feedback loops of experimentation, like in the real world. Think of the Apollo model versus the SpaceX model. In the Apollo model, failure was not an option; the SpaceX model fails fast and learns. So orchestration is a programming, engineering, and systems engineering view which is going to transform organizations. And it's more than a programming construct; it's a mindset construct, a behavioral-change-inducing thing, which enterprises need to think through as well. If you try to retrofit your existing processes and ways of working onto any framework and AI, it will not work. It should be used as a lever to become the SpaceXes of the world, so that you focus on only one thing: shortening the feedback loop, learning, and reapplying. That's the AI nativity part, I would rather say.
[00:51:38] Tobias Macey:
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:51:54] Arun Joseph:
I have my personal opinion on this. First of all, I have seen large corporations try everything, and essentially it's time to take things out rather than add more things. That's the way I would rather see it. Every tool that has come in has resulted in: there is a transactional system, then there is a data pipeline, then there is some other team sitting somewhere building some other new tool. And when a business stakeholder asks, hey, what is going wrong, now you're raising Jira tickets between five teams, and every team would say, I have the greatest framework and tool. So I think it is time for a cleansing, and it's not about adding more but taking things out. For example, can there be new kinds of systems which can be queried both transactionally and operationally, because agents might be able to reconstruct the truth using computation as well? If I were to frame it as one statement: over engineering has crept into the data space as well. It's time to clean it up, and AI provides a good lever to do it.
[00:52:59] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and share the work that you've been doing on building LMOS and the thought that you've put into how to enable the creation and maintenance of these agentic use cases at organizational scale. It's definitely a very interesting, complex, and timely problem, so I appreciate the time and energy that you're putting into making it more tractable for everyone else. Thank you, Tobias. I really enjoyed the conversation as well. So thanks for the invite, and thanks for the thoughtful questions. Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. And the AI Engineering Podcast is your guide to the fast moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. Just to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI powered migration agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Poor quality data keeps you from building best in class AI solutions. It costs you money and wastes precious engineering hours. There is a better way. Coresignal's multi source, enriched, cleaned data will save you time and money. It covers millions of companies, employees, and job postings and can be accessed via API or as flat files. Over 700 companies work with Coresignal to develop AI solutions in investment, sales, recruitment, and other industries. Go to dataengineeringpodcast.com/coresignal and try Coresignal's self-service platform for free today. Your host is Tobias Macey, and today I'm interviewing Arun Joseph about building an agent platform to empower the business to adopt agentic capabilities. So, Arun, can you start by introducing yourself?
[00:01:34] Arun Joseph:
Hey. Hello, Tobias. Thanks for inviting me to this podcast. So I'm Arun, Arun Joseph. I'm based out of Bonn, Germany. Well, I've always built distributed systems, and I've been leading engineering organizations. In my most recent role, I was heading the AI engineering program for Deutsche Telekom Group, where we built something referred to as LMOS, which is the agent platform, which is open source. And right now, I'm on my entrepreneurial path, building something around multi agent systems, because this has been close to my heart for quite a long time. Well, I'm originally from India. I used to work for large scale enterprises, leading engineering teams across the globe, also in San Francisco, at Juniper, and then Merck Pharmaceuticals, where I built their industrial IoT platform. I've always built large scale distributed systems. But it was mostly intrapreneurial kinds of endeavors; this is the first time I'm jumping headfirst into, you know, entrepreneurial stuff outside an organization. So I'm passionate about distributed systems and large scale AI systems, which are gonna define the future.
[00:02:51] Tobias Macey:
And do you remember how you first got started working in data and also in the AI and ML space?
[00:02:58] Arun Joseph:
Yeah. I think the first introduction to large scale data was when I used to work for Nielsen, which is a market research company. Nielsen had this interesting subsidiary called Arbitron. And Arbitron had this device, placed in the homes of people who signed up, which used to listen to radio signals and collect listenership data. And this used to go through these large data pipelines to build listenership market intelligence. So this was the first time we started thinking about large scale data pipelines. I think this was right around the time Hadoop was brought in. And AI/ML pipelines and MLOps were also there in several of their endeavors, but that's essentially the more recent experience.
[00:03:49] Tobias Macey:
And so in terms of the overall space of building agentic systems, particularly when you're looking at the organizational scale and not just a little toy implementation or a proof of concept, there is a lot of complexity involved. The early stages were very oriented around just simple RAG bots. I say simple in air quotes because they could be very complex. But as we move to more agentic capabilities, where we're adding to the sets of tools and we're giving more free rein to the language models to make decisions and determine the execution paths in nondeterministic fashion, there is a lot more that goes into it. And particularly, if you are trying to empower nontechnical stakeholders who don't necessarily have the deep domain expertise in how these systems work, you wanna make sure that they're fairly foolproof.
And I'm wondering if you can talk to some of the ways that you thought about the different components and domain segments that go into building something like an agentic system and then being able to scale that to the overall organization, just how you started to approach that overall problem definition.
[00:05:02] Arun Joseph:
Yeah. Absolutely. So this is a fascinating topic, because there are several constructs to unpack in here, of which one of the most important is: English is the new programming language, or your thought is the new design. And, essentially, what is happening right now is this thing called agentic orchestration, which is merely a feedback loop if you look at it. Right? You provide a goal to an intelligent system. This is how even people work. You give a goal, and the entity attempts to do it, given enough tools.
If it doesn't work, loop through it, get new insights, and then iterate. So this orchestration loop makes it possible that if you're defining your instructions very precisely, like a program, add two numbers, it doesn't have to go through that loop multiple times. But if you're specifying something in an abstract manner, I have two chocolates and somebody's giving me something and I need to figure out the best way to handle it without knowing the construct of addition, then the loop first figures out what mathematical construct to apply. Okay, this is good. Now let's apply this. Great. And then let's verify it. So in this example, what just happened? You moved from programming, which was very deterministic and used to be written by SMEs, where specialized expertise was required, to a capability that allows people to describe what they want in a broader manner, and the loop takes care of it to some degree. The implications are profound, especially in enterprises.
What do most enterprise information systems do? They merely move data from one place to another and then do some kind of transformation. And, of course, there is resilience, reliability, and distributed systems. But most of the requirements are around moving data from one place to another, and some business stakeholder would say, I want it this way, I want it that way. And now, tying it back to the analogy, the engineers of an organization need not be building systems exactly as the business people describe them, but rather build systems which allow the business to tell what is required, and the system makes it come true. So this is the fundamental mental model. But in order to make that possible, you need great models. You need a robust way to connect data and tools, a robust way to do orchestration loops, a robust way to manage the growing complexity as you build more such programs, and the complexity of how these programs can interact with each other. That, in essence, is the anatomy of a multi agent system.
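To make that loop concrete, here is a minimal sketch of a goal driven orchestration loop, assuming the OpenAI Python SDK. The model name and the verify() check are illustrative placeholders supplied by the caller, not part of any framework discussed here.

```python
from openai import OpenAI

client = OpenAI()

def orchestrate(goal: str, verify, max_turns: int = 5) -> str:
    """The loop: attempt the goal, check the result, feed failures back in."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):
        answer = client.chat.completions.create(
            model="gpt-4o-mini",               # any chat model works here
            messages=messages,
        ).choices[0].message.content
        ok, feedback = verify(answer)          # domain specific check, supplied by the caller
        if ok:
            return answer                      # precise instructions exit on the first pass
        messages += [                          # abstract goals iterate with new insight
            {"role": "assistant", "content": answer},
            {"role": "user", "content": f"That attempt failed: {feedback}. Try again."},
        ]
    raise RuntimeError("goal not reached within the loop budget")
```

A precise instruction exits on the first pass; an abstract goal consumes more turns, which is exactly the tradeoff described above.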
[00:07:56] Tobias Macey:
And in terms of the primitives that you have available to build a system like that, that's obviously a very rapidly evolving ecosystem. To your point on multi agent systems, that of course brings to mind the A2A protocol that Google just donated to the Linux Foundation. In terms of tool call definitions, it brings up the hype that's been growing around the Model Context Protocol. There are obviously custom tool definitions that you can create. The idea of handing tools to these language models has been around for at least a year or two. And I'm wondering how you think about what are the stable primitives that you have available to build an enterprise scale system on top of, without risking that one of these protocol authors is going to introduce a new revision that massively changes the capabilities or requires a lot of rework in terms of how you actually think about your implementation?
[00:08:53] Arun Joseph:
Yeah. This is a very, very good question, in the sense that we went through the other side of it. Because we built it all first, before MCP, before A2A, into LMOS, the platform that we built. But then these MCPs came up, and our stakeholders as well as our team started to ask, what are you gonna do now? So this is a valid question, but let's unpack a few things here. We went into production back in the first quarter of 2024 as an agentic platform. The first two agents went to production in the first quarter of 2024. They used to connect to telco APIs, which are really complex.
So there was no MCP. There was no A2A, but we could do it. The first principles don't change. That's the whole point, essentially. And then, when we started to build more such agents against more such APIs, the need for some sort of harmonization came in. What is the pattern that you tell an engineer to follow without reinventing the wheel? This was the beginning of what we refer to as our LMOS protocol, which was based on Web of Things, because LMOS was built on the fundamental construct that everything should be open and should not reinvent the wheel. So let's pick up the best protocols out there: without inventing a new protocol, how do you allow agents to collaborate? But at the same time, especially when models needed to connect to data, MCP was the first one which came out for that topic. And it immediately gained a lot of adoption because of the size and scale of Claude and Claude Desktop, because it could connect the models to the tools quickly. But the first principle of how you need to connect your data is not gonna change, because most of these enterprise APIs have been there for years. When GraphQL came in, a lot of people tried to build GraphQL wrappers on top. And now MCP comes in, and people are trying to retrofit MCP onto the traditional APIs. And then, on top of it, A2A. My suggestion is, before betting on any big protocol, it's pretty simple to connect your existing API to any of these LLM prompts.
Start there, experiment with it, and then, if it scales, bring in additional layers. And MCP, it looks like, is adopted by OpenAI and also Anthropic and Microsoft, and there are a lot of MCP servers, so it's good enough to bet on. But large scale asynchronous communication is not solved yet in MCP. And A2A, similarly: whether large scale agent collaboration is required or not is still a question. Because if you need large scale agent collaboration, you need large scale state management, a shared view of truth. Otherwise, two agents cannot collaborate. So A2A is still a shady area in my view. And most enterprise systems might not require every problem to be seen as agents. That's another thing. In microservices, we say don't start with microservices. Right? Start with the monolith.
So it's easier to build your particular department's use case as a single system. Connect your data with MCP. Or, if it's a legacy API, don't try to MCPify it. Keep it simple, the KISS principle: keep it simple, stupid. And then think about scaling it to other protocols.
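As a concrete illustration of that advice, here is a hedged sketch of connecting one existing API to an LLM with plain tool calling, no MCP or A2A layer in between, assuming the OpenAI Python SDK. The billing endpoint URL and its schema are hypothetical.

```python
import json

import requests
from openai import OpenAI

client = OpenAI()

# describe the API you already operate as a plain tool; no new protocol required
tools = [{
    "type": "function",
    "function": {
        "name": "get_invoice",
        "description": "Fetch a customer's latest invoice from the legacy billing API.",
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
}]

def get_invoice(customer_id: str) -> dict:
    # a direct call to the existing endpoint (hypothetical URL)
    return requests.get(f"https://billing.internal/invoices/{customer_id}").json()

messages = [{"role": "user", "content": "What does customer 42 owe?"}]
msg = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=tools
).choices[0].message
if msg.tool_calls:
    args = json.loads(msg.tool_calls[0].function.arguments)
    print(get_invoice(**args))
```

If this pattern scales for the use case, a protocol layer can be added later; if it never needs to, nothing was over built.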
[00:12:39] Tobias Macey:
The other interesting challenge when you're dealing with an organizational scale enablement system, particularly for something like agentic use cases, is, to your point, that you need to be able to connect it to the underlying data that the agent needs, either to operate against or to use as context. And that's a substantial challenge in its own right, because data in and of itself is challenging, but also because you're working with people who don't have the domain knowledge to understand all of the complexity that goes into the data prep, the ETL pipelining, how to manage the different chunking strategies, etcetera.
And so I'm wondering how you think about that element of exposing the existing organizational data and then streamlining the work of actually doing the additional preparatory work for being able to load that context into something like a vector database and standardizing on that core technology.
[00:13:39] Arun Joseph:
In fact, I mean, this clearly has been a missing piece in most enterprises. Enterprises used to have data pipelines, but mostly for transactional data. Transactional systems would emit the data, which goes into huge data pipelines, NiFi and the like, landing in warehouses. What has emerged right now is a need for unstructured natural language data to be parsed into semantic vectors, which can be queried and finally collapsed into an answer by the LLMs. So this is a new technical skill set that most organizations need to learn, but the first principles of how large scale data ingestion works still apply, which means the data still needs to go through phases and pipelines. For example, at Deutsche Telekom, when we wanted to ingest the corporate knowledge base and FAQs for customer support people, if you just ingested it as is, the search endpoints were not performing as we would have wanted. The accuracy rate was subpar.
So we had to create the ontology, cleaning up the domain objects and the domain topics to create the ontology of the knowledge for the customer support groups, and build that ontology into the pipeline. So it requires two skill sets. One is a technical skill set: what is a chunking strategy, what is an embedding model, what kinds of vectorization approaches can you use. The other is the domain knowledge: hey, my customer support is broken into the billing domain and the customer domain and contracts, I have Magenta TV, and all that. Both kinds of knowledge are essential, and they need to come hand in hand to create those pipelines. But at the same time, there are only a few things that you can tweak to bring the best answers, especially in RAG. One is definitely the vector database. The second is the pipeline itself: how you create the ontology and do the cleanup.
And the third is the actual search endpoint that you build, which should be able to rely not just on vectorization, but also on additional dimensions, to create a hybrid search approach. These are the levers which you have. And I think the vector side plays a huge role, in the sense that not many developers are going to create their own vector databases, for sure. The rest of the pieces they can tweak: what kind of search algorithm they need to put in, what kind of pipelines they need to build. So the choice of vectorization approach, and how that system would scale with large ingestion pipelines, is something which needs to be really thought through. I think at DT, we brought in Qdrant, which served the purpose really well given the operational simplicity that Qdrant baked in. And in terms of the vectorizations, Qdrant has this brilliant user interface which shows the entire embedding space, and the operational simplicity was unparalleled.
And this helped in streamlining, at least for the developers, to focus on the ingestion pipelines and not on the operational characteristics of maintaining the topology of the vector databases, because you would need multi tenancy and all that. This resulted in the pipeline that we referred to as Wurzel, which was developed by a couple of engineers in my team, including Thomas Weigel. Wurzel in German means root. So it was built as pipelines that are like roots that can go into the soil. The enterprise world is like soil, and the pipeline needs to go in and collect the nutrients, which is the data, which is in many different formats. So, essentially, it did not try to replace anything. Rather, a lot of frameworks were coming up, and there would be one or two capabilities within a framework which would be really good. What's the best embedding strategy, what's the best framework? How do you club together these capabilities and not bet on only one LlamaIndex or something like that? So this was Wurzel, which is a pipeline: it ingests the data, and it can land in the Qdrant cluster that we had. And then the search endpoints were built.
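This is not the actual Wurzel code, but a minimal sketch of the same pipes idea: small composable steps (chunk, embed, load) landing documents in a Qdrant collection, assuming the openai and qdrant-client packages. The collection name, chunk size, and ontology label are illustrative assumptions.

```python
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

oai = OpenAI()
qdrant = QdrantClient(url="http://localhost:6333")
# 1536 dims matches text-embedding-3-small; adjust for your embedding model
qdrant.recreate_collection("faq", vectors_config=VectorParams(size=1536, distance=Distance.COSINE))

def chunk(text: str, size: int = 800) -> list[str]:
    """Naive fixed-size chunking; a real pipeline would tune or replace this step."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunks: list[str]) -> list[list[float]]:
    res = oai.embeddings.create(model="text-embedding-3-small", input=chunks)
    return [d.embedding for d in res.data]

def load(doc_id: int, text: str, domain: str) -> None:
    """domain carries the ontology label (billing, contract, ...) into the payload."""
    chunks = chunk(text)
    points = [
        PointStruct(id=doc_id * 1000 + i, vector=v, payload={"text": c, "domain": domain})
        for i, (c, v) in enumerate(zip(chunks, embed(chunks)))
    ]
    qdrant.upsert(collection_name="faq", points=points)

load(1, "Your invoice is issued on the first of each month. ...", domain="billing")
```

The ontology work described above shows up here only as a payload label; the real value is deciding those labels with the people who know the domain.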
[00:18:12] Tobias Macey:
In terms of the embeddings and the additional context that you need to bring in, one of the common pieces of wisdom when you're building RAG systems is that you don't know what chunking strategy you want to use until after you've tried it and experimented a little bit. And so I'm wondering how you thought about either determining, this is the standard strategy we're going to use because it's good enough for most use cases, or how much control you wanted to give to the end builders of these agents to say, you know the content that you're working on, so you're going to set some parameters to determine which chunking approach to use. And just how to think about building that into a flexible system without making it flexible to the point of being useless.
[00:19:00] Arun Joseph:
Yeah. Absolutely. I think too much abstraction, it's a right balance that we need to think about, and that went into the system as well. Essentially, we were not building a platform right from the start; we were supposed to solve these use cases. So we were only measured by one metric: how accurate are the answers for the German customer service bot for the customer? And in order to achieve that, we did not start by thinking about what flexibility we need to have in the framework. We did the heavy lifting of figuring out what is the right chunking strategy. And even embedding models: we ended up having our own fine tuned embedding models, because it was German. So we started building our own embedding models back then, which proved to have better accuracy than the off the shelf models from OpenAI at the time. But once we started figuring this out, the next country came in, which, I believe rightly, was Croatia.
And then you needed to abstract away the best way for the Croatian knowledge ingestion pipeline to be created, such that the data scientists and the people who know Croatian, at the Croatian telecom group, can manage that pipeline. So we built Wurzel in such a manner that they can play around with multiple chunking strategies, not on a UI level, because these are small atomic units. It was built on DVC, the DVC pipelines. Right? So Wurzel was built on top of DVC, and you have something called the Wurzel steps, or the Wurzel roots.
One of the steps is a default chunking strategy step, which is, again, not exposed as a UI. It's a simple module, a Wurzel step, which is a Python program. And you can bring in your favorite library which has better utilities for the chunking strategies that you might want, import it into this unit, try it out, and then plug it into the rest of the pipeline, and it will work. So it's the UNIX pipes approach that we used to build the Wurzel pipeline. This flexibility was crucial if you want to expose it to other people who know the language, because otherwise you are limited by the engineering group, who might know only one language. That's the way we started to expand it to Hungarian and Croatian, so that the data scientists in those groups are not working with a black box with some UI dropdown chunking strategy; they can bring in their favorite library into the step and only change that. The rest of the pipeline remains the same.
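A hedged sketch of that pluggable step idea, not Wurzel's real interface: the chunking stage is just a Python callable with a fixed signature, so a local team can swap in its preferred library without touching the rest of the pipeline.

```python
from typing import Callable, Iterable

ChunkStep = Callable[[str], list[str]]  # the only contract a step must honor

def default_chunker(text: str, size: int = 800) -> list[str]:
    """Naive fixed-size default, the step teams are free to replace."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def run_pipeline(docs: Iterable[str], chunker: ChunkStep = default_chunker) -> list[str]:
    # every other stage stays identical; only the one step is swapped
    return [chunk for doc in docs for chunk in chunker(doc)]

# e.g. a language-aware replacement, sketched here as a plain sentence split;
# a Croatian team could import whatever splitter handles their language best
def sentence_chunker(text: str) -> list[str]:
    return [s.strip() for s in text.split(".") if s.strip()]

chunks = run_pipeline(["Prvi dokument. Druga rečenica."], chunker=sentence_chunker)
```

The design choice being illustrated is the narrow seam: the step boundary, not a UI, is where flexibility lives.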
[00:21:42] Tobias Macey:
So based on the work that you did at Deutsche Telekom, you ultimately created the LMOS open source project. And one of the perennial challenges of releasing something as open source that you first built for a particular organizational use case is that the design of the system as it's implemented in the business implicitly encodes a lot of the organizational patterns of that organization, which don't necessarily translate externally. And so I'm wondering how you approached that translation process of figuring out what are the pieces that are generally applicable, but not so general that they're useless, and then designing it in a way that it can be used as a foundational substrate that other people can build on top of without making it so complicated or confusing that they're never going to start down that path?
[00:22:34] Arun Joseph:
That's a very sharp question. It's like the Conway's law design of a framework. Right? So, a few things really worked out well in terms of reaching a just enough point in abstraction. For example, when we started in 2023, there were these frameworks; LangChain was a major one in 2023. We started picking LangChain, but almost all our transactional systems and APIs, the profile API, the billing API, all of that were built on the JVM stack. The second point is, almost the entire operational side, the distributed systems, was built on the Kubernetes CNCF stack. So you have Kubernetes, the observability, the Grafana, the Prometheus, and everything. Now comes a new framework which was invented somewhere else based on some other use cases. How do you fit this in with the people who actually know the data and know the domain, and what happens to all the client SDKs that have been built? That's the reason we went back to the drawing board in creating LMOS. And we did not start out creating LMOS by wanting to create a framework. The problem statement was quite simply this: we implemented something in LangChain, it took a couple of months, and after that, no one knew how to build it into a platform because it was so chaotic. And how does it actually fit in with the rest of the stack? So we went back to the drawing board and came to the realization: only if we provide the right amount of tooling to the people who know these APIs and domains, and let them build it, will this scale. Otherwise, you'll have a new team building something else, and then you need to ask for data from this team, and Jira tickets and this and that. Conway's law. Now, the second point is, you cannot dictate how the framework should be. What you can build is a way to shorten the loop of doing an experiment, let's say changing something in an agent and taking it to staging. For that, you need a robust pipeline so that, from the moment a developer changes some behavior in an agent, an ephemeral environment is immediately spun up to test it in isolation.
And this is all coming in from the distributed systems background, right, from the Kubernetes world. So this is how we approached it: not as a framework, but as how do we shorten the feedback loop for testing, because no one knows how to build these systems reliably. That resulted in the stack, and it's not a framework. LMOS is not a framework. LMOS has something called ARC, the agent reactor as we call it. Right? Like the Jarvis arc reactor. It was built on Kotlin so that we could build a DSL, with just enough surface for the engineers who know their APIs, who are on Java, to build agents, without having to figure out hundreds of new APIs and Spring AI and this and that. Then these agents need to live somewhere, which is lifecycle management. So we built the LMOS platform to deploy these agents with one Git push, for example. Right? It spins up an ephemeral environment. And the LMOS platform is entirely Kubernetes based, which means you're not reinventing the wheel. Agents were created as first class citizens there, so you could do kubectl get agents. And then the lifecycle management of an agent is taken care of. When you push an agent, you could say, I'm a billing agent, I can handle billing queries, and I can handle billing disputes.
I advertise this as an agent. I advertise it to the network when I am deployed into the Kubernetes platform. So you use the discovery mechanisms which are already proven in Kubernetes, right, through the Istios of the world, and bring that into the Kubernetes registry, the Istio registry. So you're not reinventing an A2A registry or something like that. It's all a stack which enterprises and operational teams are familiar with. And the LMOS stack should be universally applicable for distributed systems teams, without trying to do too much in ARC. Wurzel, for example, is similar. We knew exactly where we wanted to stop. If you try to build a UI with a dropdown chunking strategy and this and that, you immediately lose the flexibility of bringing in the best, you know, Python frameworks or stacks on the pipeline side. So we just stopped at that, which allows that flexibility, and picked DVC for the point in time recovery and all that. It also bets on large scale ingestion pipelines based on Kubernetes, etcetera. So it's universally applicable in enterprises, would be my answer.
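To illustrate what agents as first class Kubernetes citizens implies, here is a small sketch using the official Kubernetes Python client to list a hypothetical Agent custom resource and the capabilities it advertises. The group, version, plural, and field names are assumptions for illustration, not the actual LMOS CRD.

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside the cluster
api = client.CustomObjectsApi()

# the programmatic equivalent of `kubectl get agents`, given an Agent CRD
agents = api.list_cluster_custom_object(
    group="lmos.example.org",   # hypothetical API group
    version="v1alpha1",
    plural="agents",
)
for a in agents["items"]:
    caps = a.get("spec", {}).get("capabilities", [])
    print(a["metadata"]["name"], "advertises:", caps)  # e.g. billing queries, billing disputes
```

The point of the sketch is the reuse: discovery, registry, and lifecycle all ride on machinery operations teams already run.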
[00:27:18] Tobias Macey:
Interestingly, the LMOS acronym is language model operating system, and operating systems have been a ubiquitous concept in computing for decades now. I'm curious what you were thinking in terms of the naming, what you're trying to convey with that concept of operating systems in this language model context, and some of the core theories around operating system design that you brought into this framework to enable such a generalized substrate for agentic use cases?
[00:27:53] Arun Joseph:
Yeah, this is clearly one of the reasons I'm starting the startup itself. Because I was so fascinated by the idea when AI took off. Different people saw it as, oh, it's magic. Right? But I love thinking in terms of fundamental abstractions, whether in physics or biology or computational systems. And I used to wonder, well, I wasn't there when Linux was born, when computing was emerging and a new way of building programs was taking shape. When I first interacted with the language models, it felt like, now you have a new microprocessor, and this one is made up of some magical silicon. Instead of precise x86 instructions, you can give natural language instructions, and it's gonna chunk out some response into your registers. So now, suddenly, it flips.
Oh, this would mean you would need to build new programming and operating constructs from scratch for a new computational unit, which could be agents. That was the thought process. So that shift, from language models as microprocessors, resulted in thinking, let's build all the layers above, starting with how you interact with these models. Models are going to emit strings as tokens, and these strings and tokens need to control the program flow, which is a totally different programming paradigm. Right? And how do you build scheduling on top of it? For example, let's say there is a request coming in, and it needs to do some planning while, at the same time, the same program needs to give a response to a task. So you need a scheduler which optimizes the resources, which in this case is the language models, whether the unit is cost or time. All those constructs needed to be revisited. And how do you handle nondeterminism fundamentally in a program? So that resulted in this thought process of LMOS. And we thought, Linux had the Tux mascot, so let's bring in Sesame Street's Elmo, this being the next stack for agentic computing. This was the vision in 2023, like the Xerox PARC story, when personal computing took off and Xerox PARC came up with object oriented programming and all that. This was how most of the engineers joined the team as well. We were a couple of engineers passionate about building something great. And then: let's build the foundations for agentic computing, and it's called LMOS. Let's build agent communication protocols and agent computational units. Let's build the scheduler for interacting with language models, while at the same time solving the customer use cases for Deutsche Telekom. So this was the background and the storyline behind LMOS.
[00:30:51] Tobias Macey:
One of the other interesting challenges that has become exacerbated by the scale of capabilities of these language models is the high degree of variability in their pricing structures and the complete unpredictability of the number of input and output tokens that are used for the context. How do you think about reporting on and managing the costs and budgeting of these different agents to make sure that you're not going to bankrupt the company in the process of solving its problems?
[00:31:25] Arun Joseph:
Absolutely. So, essentially, this needs to be fundamentally looked at from a computational point of view. Most of the agent building process these days falls into the category of, let's use LLM invocations for every computation. Let's assume there is a flow: when a customer requests a refund, call the refund API, get the response, then do the account update API, etcetera. What is being observed today is that most people are using LLMs all the time for this invocation. When you take a step back and think about how computation works in nature, for energy conservation, we don't try to reinvent the same process using our brains all the time. Once it becomes deterministic, like the habit formation loop, you put it into a low energy execution phase. So for agents, too, this sort of paradigm needs to emerge, is the way I'm thinking. Instead of invoking LLMs all the time, if the LLM, or the agent, has figured out that 80% of the use cases can be solved by a deterministic flow, the agent itself constructs that deterministic flow. For the invocations that come in, it uses this deterministic flow and falls back to the model for the remaining 20%. But all of this requires rigorous instrumentation of costs. So, essentially, it's just like in an operating system. Right? For each process, you assign CPU and memory, and then you do the stamping and observation with the process IDs.
You would need to treat that unit, which in this case is the agent, with the same kind of resource allocation, and then monitor it, with massive observability platforms and intelligent decision making platforms keeping this in check. So, two things. It's like the OODA loop. Right? Observe: you need to keep observing what is going on, and then you need to decide what needs to be optimized. In this case, computation needs to be optimized. It doesn't make any sense to do LLM calls all the time. It is soon going to come crashing down, is what I bet.
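A minimal sketch of that habit formation idea, assuming the OpenAI Python SDK: requests whose intent matches a learned deterministic route are served cheaply, only the rest pay for an LLM call, and every call is stamped with its token usage per agent, like process accounting. The route table and the intent field are illustrative.

```python
from openai import OpenAI

client = OpenAI()

# learned fast paths: deterministic flows the agent has already figured out
ROUTES = {"refund": lambda req: f"refund issued for order {req['order_id']}"}
usage_by_agent: dict[str, int] = {}  # token stamping per agent

def handle(agent: str, req: dict) -> str:
    if req.get("intent") in ROUTES:                 # the ~80% case: low energy execution
        return ROUTES[req["intent"]](req)
    resp = client.chat.completions.create(          # the ~20% case: fall back to the model
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": req["text"]}],
    )
    usage_by_agent[agent] = usage_by_agent.get(agent, 0) + resp.usage.total_tokens
    return resp.choices[0].message.content
```

The observe and decide halves of the OODA loop would sit on top of usage_by_agent, promoting recurring LLM-solved flows into ROUTES over time.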
[00:33:46] Tobias Macey:
And then, in terms of the focus on openness and local control for businesses that want to capitalize on these agent capabilities, there's the ability to use local models for more of that cost control, or for more predictability in being able to freeze or pin to a specific model version so that you're not at the whim of whatever the model provider is going to do under the covers of their API. I'm curious how you see the work that you're doing complementing or overlapping with other projects in the ecosystem, such as Oumi, which is focused more on enabling organizations to build their own foundation models and control more of that element of the life cycle as well.
[00:34:33] Arun Joseph:
Yeah. So this is also one of the reasons I'm going on this entrepreneurial journey, betting on one of these key constructs. What we have observed in enterprises is, once a pattern has been set in, let's assume somebody has built with the best model available for some use cases, there is no incentive for it to be shifted later. In enterprises, typically, once something has been set, it's very difficult to shift. But the worrisome point is all the information in that feedback loop. For example, if you had control of fine tuned models, or were rigorously tracking the model outputs, that is a wealth of information you could use going forward to do fine tuning, maybe bringing down the computational cost and building better intelligence for your organization. That opportunity is missed if you keep betting on these large models. For that to happen, you need a platform or a layer which allows this mitigation strategy to be baked in. So when you call the completions API, instead of going directly to, let's say, OpenAI or Gemini or whatever it is, there should be a way in which this is mediated by a proxy layer, which allows different models to be quickly plugged in. And, essentially, as OpenAI recently has shown with the Responses API, which is a higher order API, there is a beautiful example where OpenAI shows that with three lines of code you write a call and say, add this toner pad to my shopping cart, against a Shopify MCP server whose tools are plugged into the Responses API. This is the only input that was given, plus the MCP server configuration. There is no additional code. It does the agentic orchestration: it calls the search product, add product, and checkout tools, all of that underneath the platform. So if organizations start to use these APIs, it is super simple. The simplicity of such an API, and the value it brings, is tremendous. You could stand up any number of use cases in a given day. But the problem is, if you don't have that mitigation strategy, and your organization is orchestrated through some black box API, you will miss out on a lot of information you could have preserved for the time when model tuning becomes much cheaper, and you will soon be tied to these large model companies, just consuming some black box higher order API. So the layer that I'm building is something like the Responses API. One of the components that we are building offers the same OpenAI-like structure, but it allows different models to be plugged in while being deployed on your premises. And this data is immediately accessible for you to fine tune your models. It's immediately traceable, with the same simplicity as OpenAI, where in four lines of code you can build a number of orchestrations.
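A hedged sketch of such a mediation layer, not Masaic's actual OpenResponses implementation: callers use one completions style function, a routing table decides which backend serves it, and every exchange is logged so the fine tuning data stays yours. The backend names, local server URL, and log path are assumptions.

```python
import json
import time

from openai import OpenAI

BACKENDS = {
    "default": OpenAI(),  # the hosted API you use today
    "local": OpenAI(base_url="http://localhost:8000/v1", api_key="unused"),  # e.g. a vLLM style server
}

def complete(messages: list[dict], route: str = "default", model: str = "gpt-4o-mini") -> str:
    resp = BACKENDS[route].chat.completions.create(model=model, messages=messages)
    answer = resp.choices[0].message.content
    with open("traces.jsonl", "a") as f:  # keep every exchange for future fine tuning
        f.write(json.dumps({"ts": time.time(), "route": route,
                            "messages": messages, "answer": answer}) + "\n")
    return answer
```

Because both backends speak the same API shape, switching a workload from the hosted model to a pinned local one is a one argument change, which is the lock-in mitigation being argued for.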
[00:37:47] Tobias Macey:
One of the other interesting elements of this new world of agentic capabilities is the broad applicability of the problem spaces that they can be applied to. And I'm wondering how you see the potential for something like LMOS to be employed in the context of a data team, to build their own agents to help manage some of the pipeline design and implementation, do some of the analytical queries that you want to expose to the business, or some of the other end user capabilities that are particularly interesting or innovative that you have either conceived of or seen in action?
[00:38:30] Arun Joseph:
Yeah. So, essentially, there was recently a meetup for the Eclipse software defined vehicle group, I think it was in Copenhagen, where we did a demonstration with LMOS around vehicles emitting telemetry data. The software defined vehicle group is a consortium of some of the major companies in Europe focusing on telemetry data emitted from vehicles. Software defined vehicles. We demonstrated a simple agent, built with LMOS ARC, the agent developer framework, the agent reactor framework, which could connect to this telemetry data. It was a time series database, and the query building was being done by the agent using OpenAI APIs. But the fun part is, you could ask questions like, how many of my vehicles are running low on fuel, among maybe a thousand vehicles, and all of that is returned in a few lines of code. So, essentially, what this actually means for data analytics teams is that it is all about querying.
Right? The query language could be SQL or Cypher; a number of approaches are there. But this acts as a layer on top, for them to quickly build these dynamic query creation agents, even though the underlying systems won't change. You don't have to change your Athena. You don't have to change your Redshift. You don't have to change your Snowflake. You could build an agent that constructs these queries, because language models are also good at understanding the existing query languages. And it can be done pretty easily, as we have seen so far.
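A minimal sketch of that pattern, assuming the OpenAI Python SDK and an in memory SQLite stand-in for the time series store: the agent's only job is to turn the question into a query against the system you already run. The schema and question are illustrative, and in production the generated SQL should be validated before execution.

```python
import sqlite3

from openai import OpenAI

client = OpenAI()

# stand-in for the telemetry store; swap for your Athena/Redshift/Snowflake driver
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE vehicle_telemetry (vehicle_id TEXT, fuel_pct REAL)")
db.executemany("INSERT INTO vehicle_telemetry VALUES (?, ?)",
               [("v1", 8.0), ("v2", 72.0), ("v3", 11.5)])

question = "How many of my vehicles are running low on fuel, below 15 percent?"
sql = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content":
        "Schema: vehicle_telemetry(vehicle_id TEXT, fuel_pct REAL). "
        f"Return only a raw SQL query, no markdown, answering: {question}"}],
).choices[0].message.content.strip()

print(db.execute(sql).fetchall())  # sketch only: validate generated SQL before running it for real
```

The underlying warehouse is untouched; only a thin query construction layer is added on top, which is the point made above.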
[00:40:15] Tobias Macey:
And in your experience of building these agentic frameworks and figuring out how to manage the creation and maintenance of the context corpus that enables these agents to work effectively, what are some of the most interesting or unexpected or challenging lessons that you've learned in the process?
[00:40:36] Arun Joseph:
Yes. Andrej Karpathy recently pointed out that all of information engineering is gonna converge into context engineering. Simple questions are easy to handle, but as the system grows in complexity, how do you pass the right context to the same agent, or to different agents? There are many approaches emerging. Nobody has fully figured out this pattern yet, because as it grows and scales, this is gonna become challenging. But one of the things is, if you have a platform or a layer, let's take an ecommerce organization, a company which champions commerce. If different departments in that company start to build different agents, then what you end up with is already today's spaghetti systems.
You would lose out on deriving the true value of AI, because the context is now segregated. The true value of AI in enterprises is going to emerge only from unifying at least the foundation for the context, so that other agents can be built on it. What that would mean is a big shift in enterprise architectures. The microservices world taught that small is good: small, nimble teams that can go and do whatever they want. But with AI, there is a problem, because if you don't have a unified view of the truth of your enterprise stored somewhere, then the agents that are being built will not be as effective as they could have been. Which brings in the need for a core context and model management platform where you connect your enterprise tools, so that your different departments and engineers can build easily without asking for permission.
Hey, can I use your department's data? Because what matters is the business outcome. Suddenly, you ask your orchestrator, I want to optimize my profits for the teenager segment, what suggestions exist? Now, if the platform has the tools, you know, a sales projections analysis tool, a buyer patterns tool, an inventory tool, and a market analysis tool, then with this question, the agentic orchestration immediately kicks in. It's able to come up with immediate results. So this layer is the essence. Building your organization for agentic orchestration, that is the definition of AI native, as I would call it, the architectural equivalent of an AI native. What is the definition of an AI native? How do you prepare your organization to be one? Models are gonna get better and cheaper, and new paradigms will emerge in programming. But one thing certainly is not gonna change, which is that you need this context building platform layer for the world which is going to come. And that is going to be the core of your AI native transformation journey.
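As a sketch of what such a shared layer could look like at its simplest, here is a hypothetical central tool registry that departments contribute to once and any orchestrator can discover from. The registry shape and tool names are assumptions, not a description of any existing platform.

```python
from typing import Callable

REGISTRY: dict[str, dict] = {}  # one shared registry instead of per-department silos

def register(name: str, description: str, fn: Callable) -> None:
    REGISTRY[name] = {"description": description, "fn": fn}

# each department contributes once; everyone can discover without asking permission
register("sales_projection", "Project sales by customer segment",
         lambda segment: {"segment": segment, "projected_units": 1200})
register("inventory_levels", "Current stock per SKU",
         lambda: {"sku-1": 40, "sku-2": 0})
register("buyer_patterns", "Purchase patterns by cohort",
         lambda cohort: {"cohort": cohort, "repeat_rate": 0.31})

def discover(keyword: str) -> list[str]:
    """What an orchestrator would call to assemble tools for a goal."""
    return [n for n, t in REGISTRY.items() if keyword in t["description"].lower()]

print(discover("sales"))  # ['sales_projection']
```

The question about teenager segment profits above is answerable only because all four tools live behind one discoverable surface rather than four team boundaries.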
[00:43:30] Tobias Macey:
And extending on your reference to the microservices architecture, I think we went through a similar problem where, as you go from the monolith to microservices, you still have to figure out where that state gets created and maintained and how it gets distributed. And that also leads to all of the complexity of reconstituting the relevant context in the warehouse context, where you pull the data out of all of these different microservice systems, and then you have to figure out what the linkages are, recreating some of the API based connections at the data layer to bring it all back together and make it semantically meaningful. And we're in a similar stage with these AI systems, where if you have that warehouse catalog, you can use it to a certain extent, but we need to figure out how to feed a lot of these LLM interactions back into it, to help maintain and grow and evolve the corpus without having it shard into different domains that then have to be reconstituted, if you even know that they exist in the first place.
[00:44:38] Arun Joseph:
Absolutely. Absolutely spot on. And that's what most people are seeing. From what I've seen, and some of the stories from other places, the spaghetti that we have seen in the previous world is only going to get multiplied. Everyone is building three agents on top of the existing microservice. Now the microservice is four units of computation, and no one knows the context anymore. My agent can only respond to customer address queries. Then you think about what protocol you should use, A2A, to connect to the order agent. And now you have scrum meetings, and it's insanity.
[00:45:22] Tobias Macey:
Alright. For people who are figuring out how to build their own enablement layer for these agentic use cases in their organization, what are the cases where you would say that LMOS is the wrong choice?
[00:45:37] Arun Joseph:
So LMOS has multiple components. LMOS itself has the agent framework per se, which is built around Kotlin. So if you're a non Kotlin, non JVM organization, it doesn't make sense to use that agent framework. The platform layer is still in an incubation phase. It's standard Kubernetes, and it will at least provide a lot of inspiration for you to pick up, irrespective of whether you run Python agents or something else. So for organizations which are not into the JVM and Kubernetes, or which are entirely based on Python or C#, it might not make sense. But there is still a lot to learn on how to manage the life cycle of agents. That's also the reason the learnings, and where this goes, extend beyond agent frameworks, and one of the components being built under the new startup is the Masaic OpenResponses component.
So, essentially, it does not restrict you from using any framework. As I described earlier, it's the complete Responses API, just like OpenAI's. Once this component is deployed, every agent framework that you can think of, the OpenAI Agents SDK, AutoGen, or Kotlin or whatever, should work, because it addresses this problem of context building for large scale enterprises and model switching. If the OpenAI developer platform were to be deployed on your premises, what would it be like? That's the way we are thinking. That's the essence of what you need to build in your organization, irrespective of whether you use this component or not.
[00:47:20] Tobias Macey:
And as you continue to contribute to LMOS and continue along your own venture that you're building, what are some of the new capabilities that you have planned? What are some of the ecosystem evolutions that you're paying particular attention to, or any of the capabilities that you're particularly excited to dig into building and growing?
[00:47:44] Arun Joseph:
Yeah. I think on the LMOS side, there are now two initiatives. One is LMOS, which is under the Eclipse Foundation, and we are listening to the industry on what is actually required, not trying to make it bloated with, you know, let's add this feature or that feature. At least the agent framework is very stable; the Kotlin stack is used in production. The LMOS platform we originally built so that you could deploy any agent, Python or otherwise, and convert them into agents as first class citizens in Kubernetes.
This we are rethinking, because if you try to fit everybody, it will never work. So we just wanna convert that into a minimalistic registry based on Kubernetes, which does external orchestration. So for agents on the JVM stack, LMOS should hold true today. I would say it's one of the best frameworks out there, in that it allows business and engineers to work on the same thing, which is very rare. With ARC, the business can define the use case: I want to define the billing dispute use case. The business writes it in the English language, and we are able to make that work, because if you just randomly write prompts, it will never work. You need a runtime for the English language, a meta language, which we created, referred to as ADL in ARC. So it works. The part which I'm most excited about is this OpenResponses component, which is being built by my cofounders, Jasbir and Amanth.
They are building this component to become, like I mentioned, that orchestration layer for models and the context building layer for the whole enterprise, so that your developers and your departments, and no one is gonna change the department structure, can still build agents on their favorite stack. The minimum guidance is: prepare yourself to change models quickly, and don't lose context. You should have the ability to connect to any number of tools, and don't bring in your own siloed tool registry and all that. That layer is the Masaic OpenResponses layer. And by the way, Masaic is the name of the company I'm building, and Masaic stands for multi agent systems, MAS. So this is the thing which I'm most looking forward to. It's also an open core model: open source at the core, then building on top. Yeah.
[00:50:17] Tobias Macey:
Are there any other aspects of the work that you're doing on LMOS or Masaic, or the overall space of enabling agentic use cases and all of the data requirements that go into it, that we didn't discuss yet that you'd like to cover before we close out the show?
[00:50:32] Arun Joseph:
Yeah. I think we went into the idea of orchestration. I would say everything revolves around the idea of orchestration from here on. And the idea of orchestration is simply feedback loops: feedback loops of experimentation, like in the real world, as in the Apollo model versus the SpaceX model. In the Apollo model, failure was not an option; the SpaceX model failed fast and learned. So orchestration is a programming, engineering, systems engineering view which is gonna transform organizations. And it's more than a programming construct. It's a mindset construct, a behavioral change inducing thing, which enterprises need to think through as well. If you try to retrofit your existing processes and your ways of working into using any framework and AI, it will not work. It should be used as a lever to become the SpaceXes of the world, so that you focus on only one thing, which is shortening the feedback loop, learning, and reapplying. That's the AI nativity part, I would rather say.
[00:51:38] Tobias Macey:
Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
[00:51:54] Arun Joseph:
I have my personal opinion on this. First of all, having seen large corporations, essentially, it's time to take things out rather than add more things. That is the way I would rather see it. Every tooling that has come in has resulted in: there is a transactional system, then there is a data pipeline, then there is some other team sitting somewhere building some new other tools. And when a business stakeholder asks, hey, what is going wrong? Now you're tracing Jira tickets between five teams. And every team would say, I have the greatest framework and tool. So I think it is time for a cleansing, and it's less about adding more and more about taking things out. For example, can there be new kinds of systems which can be queried both transactionally and operationally, because agents might be able to reconstruct the truth using computation as well? If I were to frame it as one statement: over engineering has crept into the data space as well. It's time to clean it up, and AI provides a good lever to do it.
[00:52:59] Tobias Macey:
Alright. Well, thank you very much for taking the time today to join me and share the work that you've been doing on building LMOS and the thought that you've put into how to enable the creation and maintenance of these agentic use cases at organizational scales. It's definitely a very interesting and complex and timely problem, so I appreciate the time and energy that you're putting into making it more tractable for everyone else. Thank you, Tobias. I really enjoyed the conversation as well. So thanks for the invite, and thanks for the thoughtful questions. Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. And the AI Engineering Podcast is your guide to the fast moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. Just to help other people find the show, please leave a review on Apple Podcasts and tell your friends and coworkers.
Introduction to Arun Joseph and His Work
Journey into Data and AI
Building Agentic Systems at Scale
Enterprise Data Management and Agentic Systems
Challenges in Data Contextualization
LMOS: From Enterprise to Open Source
The Vision Behind LMOS as an Operating System
Cost Management in Agentic Systems
Open Source and Local Control in AI
Agentic Systems in Data Teams
The Future of AI and Enterprise Architecture
Closing Thoughts and Future Directions