Mapping The Customer Journey For B2B Companies At Dreamdata - Episode 134


Gaining a complete view of the customer journey is especially difficult in B2B companies. This is due to the number of different individuals involved and the myriad ways that they interface with the business. Dreamdata integrates data from the multitude of platforms that are used by these organizations so that they can get a comprehensive view of their customer lifecycle. In this episode Ole Dallerup explains how Dreamdata was started, how their platform is architected, and the challenges inherent to data management in the B2B space. This conversation is a useful look into how data engineering and analytics can have a direct impact on the success of the business.

Machine learning is finding its way into every aspect of software engineering, making understanding it critical to future success. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype.

Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. The Data Engineering Podcast is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to and apply today! Make sure to use the code AISPRINGBOARD when you enroll.

Your data platform needs to be scalable, fault tolerant, and performant, which means that you need the same from your cloud provider. Linode has been powering production systems for over 17 years, and now they’ve launched a fully managed Kubernetes platform. With the combined power of the Kubernetes engine for flexible and scalable deployments, and features like dedicated CPU instances, GPU instances, and object storage you’ve got everything you need to build a bulletproof data pipeline. If you go to today you’ll even get a $100 credit to use on building your own cluster, or object storage, or reliable backups, or… And while you’re there don’t forget to thank them for being a long-time supporter of the Data Engineering Podcast!


  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • What are the pieces of advice that you wish you had received early in your career of data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I’m working with O’Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to to add your voice and share your hard-earned expertise.
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • You listen to this show because you love working with data and want to keep your skills up to date. Machine learning is finding its way into every aspect of the data landscape. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. The Data Engineering Podcast is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
  • Your host is Tobias Macey and today I’m interviewing Ole Dallerup about Dreamdata, a platform for simplifying data integration for B2B companies


  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by describing what you are building at Dreamdata?
    • What was your inspiration for starting a company and what keeps you motivated?
  • How do the data requirements differ between B2C and B2B companies?
  • What are the challenges that B2B companies face in gaining visibility across the lifecycle of their customers?
    • How does that lack of visibility impact the viability or growth potential of the business?
    • What are the factors that contribute to silos in visibility of customer activity within a business?
  • What are the data sources that you are dealing with to generate meaningful analytics for your customers?
  • What are some of the challenges that businesses face in either generating or collecting useful information about their customer interactions?
  • How is the technical platform of Dreamdata implemented and how has it evolved since you first began working on it?
  • What are some of the ways that you approach entity resolution across the different channels and data sources?
  • How do you reconcile the information collected from different sources that might use disparate data formats and representations?
  • What is the onboarding process for your customers to identify and integrate with all of their systems?
  • How do you approach the definition of the schema model for the database that your customers implement for storing their footprint?
    • Do you allow for customization by the customer?
    • Do you rely on a tool such as DBT for populating the table definitions and transformations from the source data?
  • How do you approach representation of the analysis and actionable insights to your customers so that they are able to accurately interpret the results?
  • How have your own experiences at Dreamdata influenced the areas that you invest in for the product?
  • What are some of the most interesting or surprising insights that you have been able to gain as a result of the unified view that you are building?
  • What are some of the most challenging, interesting, or unexpected lessons that you have learned from building and growing the technical and business elements of Dreamdata?
  • When might a user be better served by building their own pipelines or analysis for tracking their customer interactions?
  • What do you have planned for the future of Dreamdata?
  • What are some of the industry trends that you are keeping an eye on and what potential impacts to your business do you anticipate?

Contact Info

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

  • Thank you for listening! Don’t forget to check out our other show, Podcast.__init__ to learn about the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
  • Join the community in the new Zulip chat workspace at


The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Raw Transcript
Tobias Macey
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. What advice do you wish you had received early in your career of data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I'm working with O'Reilly Media on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise. When you're ready to build your next pipeline, or want to test out the projects you hear about on the show, you'll need somewhere to deploy it, so check out our friends at Linode. With 200 gigabit private networking, scalable shared block storage, a 40 gigabit public network, fast object storage, and a brand new managed Kubernetes platform, you've got everything you need to run a fast, reliable, and bulletproof data platform. And for your machine learning workloads, they've got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show. You listen to this show because you love working with data and want to keep your skills up to date. Machine learning is finding its way into every aspect of the data landscape, and Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course, every student is paired with a machine learning expert who provides unlimited one-to-one mentorship support throughout the program via video conferences. You'll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype.
Springboard offers a job guarantee, meaning that you don't have to pay for the program until you get a job in the space. The Data Engineering Podcast is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there's no obligation. Go to dataengineeringpodcast.com/springboard today and apply. Make sure to use the code AISPRINGBOARD when you enroll. Your host is Tobias Macey, and today I'm interviewing Ole Dallerup about Dreamdata, a platform for simplifying data integration for B2B companies. So Ole, can you start by introducing yourself?
Ole Dallerup
Yeah. Hey, thanks so much for having me. My name is Ole, the CTO and co-founder of Dreamdata. We build a platform to help B2B companies gather all the data from marketing, sales, and growth products into one platform, so they can get a unified view of how they are acquiring customers, and how they're acquiring the customers who are paying the most.
Tobias Macey
And do you remember how you first got involved in the area of data management?
Ole Dallerup
Sure. I thought a little bit about that, and actually I think it was while I was studying. I was studying computer science, and a lot of my friends were playing poker online. I also got into that, and we ended up being pretty good at it. One of the tools we used was called PokerTracker, which tracked all our online gambling and tracked our opponents. Maybe they gambled all night, and I would get up in the morning and look at the data: how did I play, how did I act in certain situations? Sometimes, of course, I acted very poorly, and sometimes, hopefully, also well. And then I would use the data to improve my skills.
Tobias Macey
And so now you're building the dream data platform to simplify the overall integration and visibility of data across the different touchpoints in b2b companies. I'm wondering if you can describe a bit more about what it is that you're building there and some of the ways that it's being used by your customers.
Ole Dallerup
Yes. So a couple of years back, two of the founders, we came from a company called Trustpilot. One of the problems we had there was that we were driving a lot of traffic to sign up as customers, and it was difficult for us to put any kind of number on what those signups were worth. They were often free signups, for a free business product, and it was hard for us to put any real value on them. When we asked sales, they described them as being of really zero value. And when we then went to marketing, they had a similar experience. They were buying expensive leads from Google Ads, Facebook ads, and so on, and they couldn't always justify all that spend either. That was a strange thing for us, because working with tech products we were used to tracking all user activities. We knew what all our customers were doing: how they came into the product the first time, how often they used it, what functionality they used. We had all that insight. So we found it strange that we couldn't have that same insight from the sales and acquisition point of view. And that's how the thing started. We started building this out, saying: okay, let's start tracking all the users and understanding where they are. Let's look at all the different commercial systems where we have user activities. At Trustpilot that was stuff like Zendesk for support tickets, Salesforce for sales activity, and marketing was using HubSpot. We probably had three or four other systems where we had a fraction of the journey, and then additionally the normal web tracking. But as a company grows, you suddenly end up having a lot of different websites with different languages, pages, and all kinds of stuff where you have to track the users. That ended up being relatively complex, and that's when we decided to start the company. And that's really the problem we still solve. We pull in data from a lot of different sources and join it together so that you have one uniform view of what the customers are doing and how they're interacting with your product. When I say customers, it often means customers or prospects. Then we can map a journey, and then we can apply more interesting analytics on it, which is often about trying to understand all these touchpoints at a more aggregated level, and understanding which of those touchpoints are actually driving your revenue. Is it how you acquired the customer, like the first touch, which is sometimes interesting from an ROI perspective? But there are a lot of things happening in most companies, and most of our customers have a very long sales journey from when they see the customer the first time until they're able to close a deal with them. It's not uncommon that it takes six months.
Tobias Macey
So the company was born out of the frustration of trying to gain visibility into all of the different interactions of customers and how that fed into the overall success of the business that you were in. But what is it that's keeping you motivated as you continue to build out and grow the capabilities of Dreamdata?
Ole Dallerup
I think partly it's actually giving that visibility. A lot of companies don't have a good picture of how things are going, and they don't have visibility into the details of how they are helping customers: what the customers are doing, how they're acquiring customers. For some companies that's not necessarily a problem at the beginning, because they define a sales channel, they succeed in that, and they repeat that logic and try to do it again and again. But as soon as something comes into the market that changes the profitability of that channel, they get into problems. And I think you're really seeing that with more and more companies. They used to go to, for example, Google Ads, and that used to be a very good business; they would just acquire customers that way. Then the price started increasing. Initially, for a lot of companies, and particularly B2B, that wasn't necessarily the biggest problem, because they could actually pay a lot for customers. Often the contracts here are from $10,000 up to several hundred thousand dollars per year, so whether they paid $100 or maybe $500 to acquire a customer didn't matter. But suddenly it stopped working, or they couldn't buy the demand, because now the price was not $500, it was several thousand dollars. So they started going out to other channels, like Facebook. Funnily enough, you see a lot of B2B companies actually advertising on Facebook, and some also manage to get it to work, but that's a different topic. They go out to other channels, and now the complexity becomes much larger, because now you're not just looking at one channel; you actually need to combine a lot of different sources. You might also be aggressive on emails. So I think that's a large portion of it. And then one thing that is important for me is also silos. I don't like them.
I find it problematic when companies put themselves in silos. Often you see marketing sitting in one silo, and they are tasked with getting more leads to sales: just get more leads, no matter what. And then sales sit in their silo and are tasked with getting revenue, and management is just telling sales to get more revenue. Then it doesn't take very long before sales tell marketing: hey, we're not getting revenue because you're coming with poor leads. I think that's a relatively poor way of communicating, because it doesn't help much. They have to work together. Marketing are good at certain things, sales are good at other things, and they need to work together on driving revenue. That's important. And if you are a SaaS product, maybe offering a free product, then I think the product department also needs to fit into that game. They are responsible for driving revenue, not for giving away your product.
Tobias Macey
And there are a number of people who might be familiar with B2C, or businesses that are selling to consumers directly. But what are some of the ways that the experience of those organizations differs from B2B, or businesses that are selling directly to other businesses? And how does that impact the overall complexity and capability of gaining a useful overview of the customer lifecycle?
Ole Dallerup
So often with B2C you have more data, which of course has certain challenges of scale. Today that's relatively easier than it used to be. But the good side of having a lot of data is that you can often do statistical models, and that's very interesting. You'll see large companies that have been doing this for a very long time. The use case I've always heard about is that McDonald's could predict their sales by looking at the weather, which is probably true to a large extent. So if you have this enormous amount of data and history, then you can actually predict a lot of things with just, well, it's not basic statistics, but statistics at least. When you move into the B2B world, it's often the case that you don't have enough data to do this kind of statistics, and then you need to lean a little bit more into the data and the detail you can get. The good part is that there are also fewer people, so you can actually start recording at a higher level of detail who you talked with and so on, which often happens in the CRM systems. So that's one of the advantages. And then there are a lot of small details: if you are a B2B company and are doing tracking, you can do a reverse lookup of the IP addresses and start understanding a little bit about who was actually visiting your website. Without anyone signing up and giving their email address, you can actually start guessing which companies they are from. So that's a few differences. And then from a tracking point of view, the biggest difference is that in the B2C world you track the user, and the user is also the one buying. In the B2B world you track a company, and there are a lot of users involved in actually making the deal. And the person buying at the end is usually not the decision maker, at least not in my experience.
At least when I was last running technology operations, I was often an important decision maker, but I was rarely the person who had the credit card. That was our system. So...
Tobias Macey
Yeah, that matters. And you mentioned, too, that there are often silos that occur between some of the different responsibilities of people who are interacting with the customers at different points in the interaction cycle. I'm wondering what you have found to be some of the contributing factors that give rise to those different silos, and the challenges that that poses in terms of being able to effectively map the journey of the customer through all those different interaction points to the point where they're actually paying you money.
Ole Dallerup
Yes. So I think I'll start here by talking a little bit about where I come from, because my background is engineering. When I started in software engineering, I was often told what to do by business people, and I got to learn that that's not always a good way of doing it. I think most good companies today are working in cross-functional teams. Engineering teams work side by side with the product managers and designers, and together they are responsible for delivering a product. That is important because the engineers know which technologies can actually make a difference, but the product managers and the designers are the ones with a good understanding of the users and the customers. That way we can build the greatest products. For me, this is a little bit the same. In the field we're working in right now, marketing and sales, in poor companies they are silos, and they blame each other. And that for sure happened in the old days with product management, or program management as we called it at that time, and engineering. They blamed each other and worked in silos, but they found a way to work together. I think we need to see marketing and sales do the same. But to do that, they need a way of measuring the results in the same way. Right now sales are often measured by bookings, or how much sales they do, which makes sense. That needs to stay the same, but now we need to share the revenue, and that's where data comes in. We work with companies and help them try to find a way where they can actually do some kind of sharing, so they can more easily work together, and so that marketing are not only responsible for getting more leads but are responsible for getting more revenue, because that's what matters. And so quantity is not necessarily a good thing here. Does that make sense?
Tobias Macey
Yeah, definitely. And you mentioned, too, that because of these different responsibilities, they might be tracking the information in different systems. So I'm wondering what you have found to be some of the common sets of information that you are tracking for the different roles, and the different types of source systems that you're trying to integrate with. And I'm curious how you're approaching the collection and cleaning of that data in order to be able to build useful insights for the businesses that you're working with.
Ole Dallerup
Yeah. It also took some time to figure that out. What we do is pull out the data using the APIs of the different systems and store it, at least at the end, inside our data warehouse. We use Google BigQuery as our data warehouse, which is super central to how we do things. I will add here that I used to use Amazon Redshift a lot, and Amazon Redshift would not be able to do what we do, at least not at a decent cost. I haven't used Snowflake so much, but you could, in theory, do this in Snowflake, I think. We pull in the data pretty raw, pretty much as it is. We of course have jobs that run and do this asynchronously, and we use a different set of projects to make this work, depending on what type of data it is. Often when we pull data from SaaS tools like Zendesk, Salesforce, HubSpot, those kinds of tools, we use an open source project called Singer (singer.io), which I'm sure a lot of the listeners are familiar with. If not, it's a great ETL tool to check out. It's kind of the open source version of Stitch Data. We use that to pull in some data. Once we have the raw data, we do transformations, so we clean up the data into a uniform model. That is, of course, a little bit more sophisticated, but the simple version is: in all CRM systems there's a contact. Sometimes they call it different things, but there's a contact, and the contact has an email address, a name, an ID, a creation date, and so on. So we find a couple of fields that we need, and we pull them out into a uniform form. We do the same with companies, or accounts, and with activity data and so on. Depending on what the system is, we pipe it into either activity data, contacts, or companies. Those are the primary objects we have.
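Editor's sketch (not from the episode): the "uniform model" idea above can be illustrated with a small Python example. The two source systems ("crm_a", "crm_b"), their raw field names, and the Contact fields are all hypothetical stand-ins chosen for illustration; this is not Dreamdata's schema or any vendor's actual API shape.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Dict, Optional


@dataclass(frozen=True)
class Contact:
    """Uniform contact record that every source system is mapped into."""
    source: str
    source_id: str
    email: Optional[str]
    name: Optional[str]
    created_at: Optional[datetime]


def _clean_email(value: Optional[str]) -> Optional[str]:
    """Trim and lowercase an email; return None when it is empty."""
    cleaned = (value or "").strip().lower()
    return cleaned or None


def _parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 timestamp; return None when it is missing."""
    if not value:
        return None
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def from_crm_a(raw: Dict[str, Any]) -> Contact:
    # Hypothetical shape: flat record with capitalized keys.
    return Contact(
        source="crm_a",
        source_id=str(raw["Id"]),
        email=_clean_email(raw.get("Email")),
        name=raw.get("Name"),
        created_at=_parse_ts(raw.get("CreatedDate")),
    )


def from_crm_b(raw: Dict[str, Any]) -> Contact:
    # Hypothetical shape: fields nested under "properties", name split in two.
    props = raw.get("properties", {})
    parts = [props.get("firstname"), props.get("lastname")]
    name = " ".join(p for p in parts if p) or None
    return Contact(
        source="crm_b",
        source_id=str(raw["vid"]),
        email=_clean_email(props.get("email")),
        name=name,
        created_at=_parse_ts(props.get("createdate")),
    )


# One mapper per source system; adding a system means adding one function.
MAPPERS: Dict[str, Callable[[Dict[str, Any]], Contact]] = {
    "crm_a": from_crm_a,
    "crm_b": from_crm_b,
}


def normalize(source: str, raw: Dict[str, Any]) -> Contact:
    """Map a raw record from a named source into the uniform Contact."""
    return MAPPERS[source](raw)
```

Keeping the raw records untouched and doing the mapping in a separate step, as described in the interview, means a mapper can be corrected and re-run without pulling the data again.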
Additionally, we need to get out revenue, which is probably the problem we haven't a hundred percent solved yet. Here we actually often have a custom script per customer that ensures that we take the revenue in the right way. Unfortunately, most systems that contain either bookings or revenue are different. They structure the data in different forms, or are customized per customer, which is not always easy for us. We're getting closer, but we're not fully there. And then we crunch the data: we build up data models and process the data. We use a tool called Dataform. People might be familiar with DBT; Dataform is kind of a competitor to that. If you like BigQuery a lot in particular, it's very interesting, but they support Snowflake and Redshift, and probably other databases as well. They can really help you build data models and build up the dependency graph, so that when you run your data model, instead of running a lot of scripts that all depend on each other, you just build up the graph and they ensure that the models are run in the right order, which makes at least that job relatively easy. Additionally, we pull in a lot of tracking data, so we built our own tracking pipeline. People are maybe familiar with a company called Segment (segment.com), which does customer data infrastructure, and I'm a huge fan of them. We have an integration to them to get the data in if customers are using Segment. Unfortunately, not enough customers are using Segment, so we built our own, and we use their open source analytics.js to build our own pipeline that we're piping the data into. We're using a lot of Cloud Dataflow. Dataflow is built on an Apache project called Apache Beam, which is also a very interesting tool to stream data in, do some kind of transformation, and then stream it down into whatever store you have, which for us is often BigQuery, but it could be anything else. It's very interesting when comparing: it's a product that is very close to tools like Spark, but the difference is that you get scaling out of the box. So today it's easy to have a lot of data coming in, and you can stream the data live into, for example, BigQuery, which is very useful.
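Editor's sketch (not from the episode): the dependency-graph idea described above, where models run in an order derived from what they read rather than from hand-sequenced scripts, can be shown with Python's standard-library graphlib (Python 3.9+). The model names below are hypothetical, and this illustrates only the ordering principle, not how Dataform or DBT are implemented.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical models; each entry lists the models it reads from.
MODEL_DEPENDENCIES: Dict[str, Set[str]] = {
    "raw_contacts": set(),
    "raw_companies": set(),
    "raw_activities": set(),
    "contacts": {"raw_contacts"},
    "companies": {"raw_companies"},
    "activities": {"raw_activities", "contacts"},
    "journeys": {"activities", "companies"},
}


def run_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every model comes after its dependencies.

    TopologicalSorter raises graphlib.CycleError if the models depend on
    each other in a loop, which surfaces a broken graph before anything runs.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for model in run_order(MODEL_DEPENDENCIES):
        print("run", model)
```

The exact order among independent models is not fixed; the only guarantee is that a model never runs before the models it depends on.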
Tobias Macey
And one of the challenges that exists beyond just being able to manage the data as it's coming in is ensuring that there are useful data elements being generated in the first place, to be able to identify all the different ways that you're interacting with the customers at the different points. What have you found to be some of the difficulties in working with your customers to ensure that they are using the systems that they say they are, and ensuring that all of their interactions are getting recorded so that they are able to get that effective visibility across the different paths that customers might take?
Ole Dallerup
Again, our experience is that the best way of getting a data cleanup project started successfully is to start using the data before it's clean. Not all of our customers, definitely not, but a large portion of our customers are coming to us because they actually acknowledge that fact. They want to clean up the data, but they also understand that to actually do that they need to start using the data for something, so that the people who need to input the data, or actually run the projects to clean it up or make it consistent, see a real outcome of doing it. In doing that, we help the customers set up the systems they use, and we get the data in. Then very often, with a few minutes, or at least not more than a few hours of work, we can spot the general problems. For B2B companies the general problems are typically stuff like: you're not recording this, you're not adding UTM parameters on your links, you have a lot of duplicated accounts or contacts. Those are typically the problems we see.
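Editor's sketch (not from the episode): two of the problems named above, links without UTM parameters and duplicated contacts, are easy to check for mechanically. The Python below is a minimal illustration; the set of "required" UTM parameters and the contact record shape (dictionaries with "id" and "email") are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import parse_qs, urlparse

# Assumption: these three are treated as the minimum useful UTM tags.
REQUIRED_UTM = ("utm_source", "utm_medium", "utm_campaign")


def missing_utm(url: str) -> List[str]:
    """Return the required UTM parameters that are absent or empty in a URL."""
    params = parse_qs(urlparse(url).query)  # blank values are dropped by default
    return [key for key in REQUIRED_UTM if key not in params]


def duplicate_contacts(contacts: Iterable[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group contact ids by normalized email and keep groups with more than one id."""
    ids_by_email: Dict[str, List[str]] = defaultdict(list)
    for contact in contacts:
        email = (contact.get("email") or "").strip().lower()
        if email:
            ids_by_email[email].append(contact["id"])
    return {email: ids for email, ids in ids_by_email.items() if len(ids) > 1}
```

Matching duplicates on a trimmed, lowercased email is the simplest possible rule; real CRM data usually needs looser matching on names and company domains as well.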
Tobias Macey
And then once you have the data onboarded to your platform, what are some of the approaches that you're using for being able to do something like entity recognition, entity resolution, or master data management, to determine what are the actual canonical business entities that all these different users and interactions map to, and to build that overall visualization for your customers as to what are the useful pieces of information that they're able to take some sort of action on?
Ole Dallerup
Yes. So that's a good question. I mean, So first, we have now all the data in. And so from our perspective, we start with the content. And we find all the context. And the context for us is typically an email address. But it could also, in theory, be a phone number. That's actually the ID sauce. And that's another benefit of being in the b2b world versus the b2c in the b2c, you will always say, hey, the user could change their email address. But in the b2b world, if you change your email resume typically means you change your job, which is actually often a final indication of this is another person now like technically, of course, it's not another person but but from a tracking point of view. It's a it's a good way of doing it. And then we find that person in all the systems, all the systems where the contact person, we find all the activity for each of these kinds of systems. And for that to be pretty straightforward because the system's up most of these systems are built so that Try to keep a good relation between the activity, for example, send this ticket that's typically very easy to link to who actually kind of who's the customer. And that makes sense, right? Because you can reply back to the customer. So often you have that information. And then the more hard part when we talk, Id resolution is typically tracking data. And to track the users, we set a cookie. And so we can track kind of users. Then when they identify themselves, which is typically they sign up to a form, they log into a product similar, we associate the user's email address with a cookie. And so now we can see also kind of for the past, what has that user done, and that we kind of build up together. So now we have activities on users, users, email addresses, the next step is to figure out which company does the user belong to. And so the first place we look is in the CRM system, which is typically the record of truth, at least for most companies. 
That is, for the relationship between a user and a company. I mean, that's great. Sometimes we can find the user there. And if we can't find the user there, then we'll take the next step. We see if the user has an email address that's actually a business email. Let's say it's a business email: then we'll take the domain of that email, the website, and try to find the company with that website. If we find that, that's great. We've succeeded, and now we've kind of fuzzy-linked the user to that company. And then the last resolution is that we do a reverse lookup of the IP address. So users have had some activity on the website, and either the user signed up with a Gmail account, so we can't associate them with a company, or we don't have an email address, so we simply don't know who the user is. We can reverse look up the IP address and then get a website, and again try to associate the user activity with a website. And then, additionally, sometimes if our customers are using enrichment tools, which could be Clearbit, for example, then we can sometimes do this a little bit better. I mean, Clearbit would sometimes be able to look up an email address and then actually put it on a specific company with maybe higher accuracy than us.
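The resolution order described above (cookie to email, then CRM, then business email domain, then reverse IP lookup) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dreamdata's implementation: the function names, the dictionary-shaped inputs, and the short list of free mail domains are all hypothetical, and the reverse IP lookup is stood in for by a plain dictionary.

```python
# Hypothetical sketch of the identity-resolution cascade described above.
# All names and data shapes are assumptions made for illustration.

FREE_MAIL_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}  # illustrative only


def stitch_cookies(events, identifications):
    """Attach an email to every tracked event whose cookie was later identified.

    events: list of dicts with at least a "cookie_id" key.
    identifications: list of dicts with "cookie_id" and "email"
    (for example, captured at form sign-up or product login).
    """
    cookie_to_email = {i["cookie_id"]: i["email"].lower() for i in identifications}
    return [{**e, "email": cookie_to_email.get(e["cookie_id"])} for e in events]


def resolve_company(email, ip, crm_contacts, companies_by_domain, ip_to_domain):
    """Return (company, method), trying the most trusted source first.

    crm_contacts: email -> company, taken from the CRM.
    companies_by_domain: website domain -> company.
    ip_to_domain: stand-in for a reverse IP lookup service.
    """
    if email:
        email = email.lower()
        # 1. The CRM is treated as the record of truth.
        if email in crm_contacts:
            return crm_contacts[email], "crm"
        # 2. A business email domain can be matched to a company website.
        domain = email.rsplit("@", 1)[-1]
        if domain not in FREE_MAIL_DOMAINS and domain in companies_by_domain:
            return companies_by_domain[domain], "email_domain"
    # 3. Fall back to the reverse IP lookup (no email, or a free mail account).
    domain = ip_to_domain.get(ip)
    if domain in companies_by_domain:
        return companies_by_domain[domain], "reverse_ip"
    return None, "unresolved"
```

Returning the method alongside the company keeps the weaker, fuzzy links distinguishable from CRM matches, so a later enrichment result could replace a reverse IP guess without touching the CRM-backed ones.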
Tobias Macey
And you mentioned that you're using Singer for a lot of the data ingestion, or optionally using something like Segment. I'm wondering what you have found to be some of the challenges in terms of being able to map to a useful, lowest common denominator data model, and the ways that you're approaching either dropping data that doesn't map to that model or being able to maintain extra information when it's available, so that you can optionally expose that to the end user who is trying to do deeper research on the datasets that you've compiled for them.
Ole Dallerup
I mean, this can be very difficult sometimes. I think the biggest challenge sometimes, well, not sometimes, it's always the biggest challenge, is that you have to actually understand the system pretty well to do this. So if you don't know Salesforce, it's pretty hard to do the mapping, even looking at the data, because there are always strange things happening. When you go into a system you don't know, there are a couple of things that are usually easy. Contacts and companies, that's usually easy. There's a name field, there's an email field, there's a created field. I mean, you usually would figure that out. But when you start looking at activity data, it always becomes more complicated, and often the systems are also exposing that data in a less natural way, and they're more unique. What I always recommend people do, if they work at a company where they need to analyze, let's say, Salesforce data, is to start by getting a login to Salesforce, so you can browse around and see the data in the view that the salespeople and marketing people are looking at. That helps a lot. That makes it a lot easier. And that's also what we often do. If we need to integrate with a new system that we don't necessarily know, then often we ask the customer, can we get access? If we get access to whatever it is, it just helps us a lot to see the data from that perspective when we build integrations.
Tobias Macey
And what does the onboarding process look like for customers who are starting to use your system, and the overall process for being able to integrate with their data sources and ensure that you're collecting all of the necessary information for the different ways that they're interacting with customers?
Ole Dallerup
So the first thing we do, say we've closed the account, the first thing we ask them to do is add the tracking script on their website. And the reason for this is quite simple. Most companies, and I find this scary actually, but it is unfortunately very true, most companies don't have very good tracking on their website. When you talk to most companies, you figure out that the tracking they have on their website is Google Analytics. And while Google Analytics is a great tool for looking at some basic web traffic, you can't get the data out at any fine-grained level on who was actually doing the activities. It's aggregated data, which makes it a really poor tool to store what is, for me, crucial business information. So that's the first thing we always ask customers to do. And also, before we start deep conversations around how the data looks, it depends a little bit on whether the customers are more or less technical. But if they are more technical, we'll just ask them to connect all the tools, which is just plugging them into our product and then typically doing OAuth authentication with whatever service, so that we get a token and we can start pulling the data. And then we'll set up the syncs. The initial syncs sometimes take a little bit of time. We just closed a customer with, I think, 5 million contacts, and that will probably take a few hours to synchronize. So that's the first thing we do. But very quickly, when we have the data, we set up a call with the customer to try to understand their process: how are you doing sales, what are your marketing processes, how early in the funnel do you start? We ask questions like, how long is the sales funnel?
Do we expect this to be very short or very long? Most of our customers have three, six, nine month sales journeys, which means that often you want to look at something that happened earlier. So we try to understand the stages they have, often described as marketing qualified leads or sales qualified leads, and we try to understand, okay, how do you map that out? What are the definitions for you? We talk about how the revenue is mapped. It's usual for systems like Salesforce to have this in, so you would look at the opportunities, and we try to understand how you are mapping opportunities. What is the amount, what does it actually say? Is it a subscription business or a transactional business? We try to understand all those kinds of things. We try to get a couple of base numbers in, for example, how much money did they spend on Google Ads last month, how much revenue did they have last month, and so on. So we have a couple of base numbers, so that when we get the data and have crunched it, we can compare and say, okay, it's in the ballpark at least. If it's the same numbers, we did it right. We have all the classical problems as well, right? Doing SQL, you can easily end up duplicating data and so on. So we are of course careful around that.
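The two checks mentioned here, comparing a crunched number against a base number supplied by the customer and watching for rows duplicated by a join, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the helper names, the 10% tolerance, and the toy opportunity and contact rows are invented and do not come from the episode.

```python
from collections import Counter


def within_ballpark(computed, reference, tolerance=0.10):
    """True when `computed` is within `tolerance` (relative) of `reference`."""
    if reference == 0:
        return computed == 0
    return abs(computed - reference) / abs(reference) <= tolerance


def duplicated_keys(rows, key):
    """Return {key value: row count} for keys that occur more than once."""
    counts = Counter(row[key] for row in rows)
    return {k: n for k, n in counts.items() if n > 1}


# Toy data (invented): two opportunities, one of which has two contacts.
opportunities = [{"opp_id": "o1", "amount": 1000}, {"opp_id": "o2", "amount": 500}]
contacts = [
    {"opp_id": "o1", "email": "a@example.com"},
    {"opp_id": "o1", "email": "b@example.com"},
    {"opp_id": "o2", "email": "c@example.com"},
]

# A naive join repeats opportunity o1 once per contact, inflating revenue.
joined = [{**o, **c} for o in opportunities for c in contacts if o["opp_id"] == c["opp_id"]]
inflated_revenue = sum(row["amount"] for row in joined)

# Deduplicating on the opportunity key before summing restores the true total.
true_revenue = sum({row["opp_id"]: row["amount"] for row in joined}.values())
```

With a customer-reported base number of 1500, `within_ballpark(inflated_revenue, 1500)` fails while `within_ballpark(true_revenue, 1500)` passes, and `duplicated_keys(joined, "opp_id")` points at the opportunity that was fanned out by the join.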
Tobias Macey
And then because of the fact that your customers each have their own specific ways of interacting with their customers, and specifics in terms of the data sources that they're trying to integrate with, how does that affect your overall approach to building out new features or your product roadmap to ensure that you're going to be able to fulfill the needs of your customers as you bring them on, or the market verticals that you target in terms of trying to gain your own customers, to ensure that you'll be able to work with them effectively?
Ole Dallerup
Yeah, I mean, this is always difficult. I think building products is hard sometimes, because you can't act as a consultant and build customized solutions. On the other hand, you can't generalize so much that it's not useful for the customers who actually have more specific needs. We work every day to find that balance. If we stay with the data side of things, it's actually mostly pretty straightforward. We are able to map all the data pretty generically, only applying some configurations, and the only exception to that is what I mentioned earlier as well: revenue, where we have a small query that normalizes the data per customer. That's actually the only custom data thing we have per customer. When we talk about the user interface and how we analyze the data, it's all the same. But I think we are challenged sometimes when certain customers want to look at a specific number. We recently had a conversation with one of our customers where we showed a number that was supposed to be an investment number on their ads, and they were advocating a different number. Those kinds of things are always hard conversations. As such, the numbers would represent the same thing; they were just using a slightly different formula. And that's hard sometimes. We try in general to go with what is the norm in the market, but sometimes we are also entering a new way of looking at data, and then it's a little bit harder. We try to talk to our customers all the time, I mean, as often as we can, and try to understand them and then pick the best solutions across our customers, right? And then sometimes one of our customers is not super happy with that solution. But that's how it has to be.
Tobias Macey
And in terms of being able to build out reports and visualizations of the information that you're collecting and the analysis that you're providing, what have you found to be some of the useful strategies? And what are the challenges in terms of being able to make sure that those reports are actionable and easy to interpret for your end users?
Ole Dallerup
To talk with your customers is, unfortunately, the best. I mean, if you have a product where you can track users and behavior, where you have enough data to look at that, then definitely do that. But what we do is talk with customers all the time, try to understand what they're trying to do, and ask them to go through the report and see what they get out of it. That gives us a lot of insights, and then we can often correct the dashboards.
Tobias Macey
In terms of your experience of using Dreamdata for your own business and being able to map the journeys of your customers, how has that influenced the overall direction or product roadmap for the business that you're building?
Ole Dallerup
So far, not so much. I think our head of sales is pushing a lot to make changes. He has a lot of things that he would like to see in the product that would be helpful for him. We're a little bit careful about that, because partly we are, of course, trying to solve our own needs, but mostly we are trying to solve our customers' needs. So, so far, we are careful, but we use our own product a lot. There is, I think, primarily one thing we use a lot. We are building a customer journey tool, and, at least for those who listened carefully, we have a really good mapping of all the companies that have been on our website. So we can actually see who's been on our website, both people who are anonymous and companies that have identified themselves. And that's very interesting sometimes. Say we had a sales conversation with a company a few months ago, and we stopped the conversation. They were not ready to buy, they were not interested, whatnot; there are many reasons not to buy. But then when you see that they come back to our website and have a couple of visits, it may be time to call them again. So that's one of the ways we use it. And then we of course use it to keep track of our paid media and ensure that we don't overspend compared to how much we make.
Tobias Macey
And what are some of the most interesting or surprising insights that you've been able to gain as a result of being able to view the data and the analytics that you're compiling, either for yourself or when working with some of your customers?
Ole Dallerup
So the insights are really nice. To give a kind of common example: we had a customer, and we saw that if we split the groups of customers into two buckets, those who had more than one session before the first signup, and those who had only one session before they signed up, meaning they signed up in the first session on the website, then we found for that customer that the first bucket, those who had more than one session before they signed up, were five times more likely to actually end up as customers. So that's a type of data to see. We can help customers really understand which paid media actually work. They do a lot of advertising, spending several hundred thousand dollars a month on showing ads. But do all these ads really bring any money back? The truth for most companies is that the ads are very useful and very important, but there is a percentage of them that they could just close down without losing any revenue. And then we in particular help companies understand which content pieces are driving revenue. One company we helped was doing some partnerships with another company, not a competitor, but more like a company related to what they did. They wrote a couple of articles together. As such, they were actually writing the articles mostly to have the partnership, which also helped them in other cases, but it turned out that those articles were actually driving a lot of revenue. They were driving so much revenue that they could actually pay for that content. So with that company, they should talk with them and try to get them to do more of this, and it helped them realize that maybe this is the type of content they need to produce more of. But they had other content pieces that were driving a lot of traffic, and in that sense they looked good, but when they looked at how much revenue they got out of them, they didn't make any money. This is often what we see in the B2B world. There is, of course, a correlation between traffic and revenue.
But there may be other things that impact your revenue more than traffic. It's interesting to try to find those pieces, and we can help with that.
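The two-bucket comparison described above is simple to reproduce once each signup carries a session count and an outcome. The following Python sketch is an illustration only: the field names `sessions_until_signup` and `became_customer` are hypothetical, the session count is assumed to include the session in which the signup happened, and any data fed to it here would be synthetic rather than the customer's actual numbers.

```python
def conversion_by_bucket(signups):
    """Compare conversion rates for single-session and multi-session signups.

    signups: iterable of dicts with
      "sessions_until_signup": int, counting the signup session itself
      "became_customer": bool
    Returns {bucket name: conversion rate}.
    """
    totals = {"single_session": [0, 0], "multi_session": [0, 0]}  # [signups, customers]
    for s in signups:
        bucket = "multi_session" if s["sessions_until_signup"] > 1 else "single_session"
        totals[bucket][0] += 1
        totals[bucket][1] += int(s["became_customer"])
    return {b: (won / n if n else 0.0) for b, (n, won) in totals.items()}


def lift(rates):
    """How many times more likely multi-session signups are to convert."""
    if rates["single_session"] == 0:
        return float("inf") if rates["multi_session"] > 0 else 0.0
    return rates["multi_session"] / rates["single_session"]
```

A lift of about 5 from `lift(conversion_by_bucket(signups))` would correspond to the "five times more likely" observation. With small buckets the ratio is noisy, so the bucket sizes are worth reporting next to it.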
Tobias Macey
And in terms of your experience of building and growing both the technical and business elements of Dreamdata, what have you found to be some of the most challenging or interesting or unexpected lessons that you've learned?
Ole Dallerup
Hmm, I think it's always hard to grow and build a team. And I think maybe at Dreamdata I haven't had that many lessons on building the team. My lessons are probably more from the past than from here. I mean, I'm an advocate of hiring people you trust. And if you don't trust people, then let them go. That's not good. If you don't trust them anymore, then you shouldn't have them on your team. Whether that's fair or not fair, unfortunately that's how it is. But if you have a team you trust, you can also give them a lot of responsibility and get them to fix things, and that just builds a good culture. When we talk about Dreamdata, I think one of my biggest challenges was the commercial side of things. I definitely had to learn to go into sales conversations and have sales conversations, avoid talking too technically, and find the right balance between talking code and talking more about the value of the product. I think I'm partly still struggling with that. And partly, I'm also struggling with finding it enjoyable. I do enjoy building data and writing code a lot. And sometimes other parts of the job dictate that I have to be in other places, which is sometimes hard for me.
Tobias Macey
One of the pieces that you were mentioning before too about the content being one of the strongest drivers of revenue in a particular case, I'm wondering how the overall evolution of the marketing landscape and different types of media or content distribution, how that impacts your overall approach to building out your platform, as well as some of the ways that you're approaching trying to grow your own business or what you found to be some of the most useful mediums or channels for being able to grow revenue or grow the audience.
Ole Dallerup
So I think what works well here is very specific from company to company. I would generally say that in the B2B world, most companies are trying to produce that content on their own website. Sometimes it's of course a good idea, for search engine optimization and so on, to have content elsewhere, but generally that is, I think, the consensus right now. The consensus is that you want to own your own stuff. So getting the content out on a lot of other platforms is not something we see a crazy amount of companies doing. So far, it's not really impacting us. But it's very different. Some companies are doing a lot of content, and they are betting relatively large amounts that they can write content that will drive traffic and awareness. Other companies are not doing that at all, and they don't even believe that that's a possibility for them. So they come in both kinds. But I can talk about ourselves. We're doing ads, as probably most companies are doing. When we do Google Ads, we get relevant traffic in, but not super relevant. When we do Facebook traffic, I would say we are more or less retargeting people, and that works. But when we're looking at acquiring new customers, I don't think Facebook works particularly well for us. I should say that we're still early and we don't have that much data to conclude yet. But if I had to conclude right now, I would definitely close it down. What works well for us is Capterra, the review platform for software. That works well for us. We see a lot of companies also use G2 Crowd, or G2 as I think they're just called now. With the data we have, it seems that it also works quite well. So if you're in that space, I would definitely look towards that, if you can do something there.
I think podcasts and webinars have also worked pretty well for us to get some awareness. Content has also worked really well for us. We managed to produce some content that actually drew a relatively large amount of traffic, for us at least.
Tobias Macey
And are there any other areas of the work that you're doing at Dreamdata, or the overall space of B2B sales and revenue tracking, or any of the other challenges that you're facing in the data landscape that we didn't discuss that you'd like to cover before we close out the show?
Ole Dallerup
I think we didn't talk a lot about machine learning and those kinds of things. And to be honest, we don't do a lot of that right now, but we'll start testing that pretty heavily. We're looking at a lot of stuff like Markov chains, to see if we can use that for attribution algorithms and so on, which is very interesting. When doing this, I'm always concerned about the amount of data that's required to actually get some value out of it. But I do look forward to when we can actually get some time to play with that.
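For readers unfamiliar with Markov chain attribution, the common "removal effect" formulation can be sketched as follows. This is a generic textbook-style illustration, not something Dreamdata is confirmed to use: customer journeys are treated as paths through channels ending in conversion or not, and each channel is credited in proportion to how much the overall conversion probability drops when that channel is removed. All names, the state labels, and the input shape are assumptions.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(paths):
    """Estimate first-order transition probabilities from observed journeys.

    paths: list of (list of channel names, converted: bool).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for channels, converted in paths:
        states = [START, *channels, CONV if converted else NULL]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()} for a, nxt in counts.items()}


def conversion_prob(probs, removed=None, tol=1e-12, max_iter=10_000):
    """Probability of reaching CONV from START, found by fixed-point iteration.

    If `removed` is given, every visit to that channel counts as a lost journey.
    """
    value = defaultdict(float)  # probability of eventually converting, per state
    value[CONV] = 1.0
    for _ in range(max_iter):
        delta = 0.0
        for state, nxt in probs.items():
            if state == removed:
                continue  # a removed channel never converts, so it stays at 0
            new = sum(p * value[t] for t, p in nxt.items() if t != removed)
            delta = max(delta, abs(new - value[state]))
            value[state] = new
        if delta < tol:
            break
    return value[START]


def markov_attribution(paths):
    """Share of credit per channel, proportional to its removal effect."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    if base == 0:
        return {}
    channels = sorted(s for s in probs if s != START)
    effects = {c: 1 - conversion_prob(probs, removed=c) / base for c in channels}
    total = sum(effects.values())
    return {c: e / total for c, e in effects.items()} if total else {}
```

On four toy journeys, (ads, content, converted), (ads, lost), (content, converted), and (ads, content, lost), the shares work out to 40% for ads and 60% for content. The concern about data volume raised above applies directly: every transition probability is estimated from counts, so rarely seen channels get unstable credit.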
Tobias Macey
Well, for anybody who wants to get in touch with you or follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as a final question, I would just like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
Ole Dallerup
I think, actually... I mean, I have to say this, and it's not necessarily what Dreamdata does, but it's part of it at least: I'd like a place where I can get all my data in and actually have it available, so that both me and my analytics team and so on can query the data and get something out of it without having to ask a lot of people for help. I think everyone who works in large organizations has felt that pain, that to get data from here or there they have to go to some person to get it out. Well-structured data lakes, I think that's what I require. And I'd argue that we built that at Dreamdata, at least for the commercial and revenue operations side of things. But in general, I would like that.
Tobias Macey
All right. Well, thank you very much for taking the time today to join me and discuss the work that you're doing at Dreamdata. It's definitely a very interesting business and an interesting problem domain that you're working in, so I'm excited to see where it goes for you. So thank you again for your time, and I hope you enjoy the rest of the day.
Ole Dallerup
Thank you for having me.
Tobias Macey
Thank you for listening! Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used. Visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it! Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
