Data Integration

Strategies And Tactics For A Successful Master Data Management Implementation - Episode 301

The most complicated part of data engineering is the effort involved in making the raw data fit into the narrative of the business. Master Data Management (MDM) is the process of building consensus around what the information actually means in the context of the business and then shaping the data to match those semantics. In this episode Malcolm Hawker shares his years of experience working in this domain to explore the combination of technical and social skills that are necessary to make an MDM project successful both at the outset and over the long term.

Read More

Leading The Charge For The ELT Data Integration Pattern For Cloud Data Warehouses At Matillion - Episode 286

The predominant pattern for data integration in the cloud has become extract, load, and then transform or ELT. Matillion was an early innovator of that approach and in this episode CTO Ed Thompson explains how they have evolved the platform to keep pace with the rapidly changing ecosystem. He describes how the platform is architected, the challenges related to selling cloud technologies into enterprise organizations, and how you can adopt Matillion for your own workflows to reduce the maintenance burden of data integration workflows.

Read More

Build Your Own End To End Customer Data Platform With Rudderstack - Episode 263

Collecting, integrating, and activating data are all challenging activities. When that data pertains to your customers it can become even more complex. To simplify the work of managing the full flow of your customer data and keep you in full control the team at Rudderstack created their eponymous open source platform that allows you to work with first and third party data, as well as build and manage reverse ETL workflows. In this episode CEO and founder Soumyadeb Mitra explains how Rudderstack compares to the various other tools and platforms that share some overlap, how to set it up for your own data needs, and how it is architected to scale to meet demand.

Read More

Open Source Reverse ETL For Everyone With Grouparoo - Episode 254

Reverse ETL is a product category that evolved from the landscape of customer data platforms with a number of companies offering their own implementation of it. While struggling with the work of automating data integration workflows with marketing, sales, and support tools Brian Leonard accidentally discovered this need himself and turned it into the open source framework Grouparoo. In this episode he explains why he decided to turn these efforts into an open core business, how the platform is implemented, and the benefits of having an open source contender in the landscape of operational analytics products.

Read More

Creating A Unified Experience For The Modern Data Stack At Mozart Data - Episode 242

The modern data stack has been gaining a lot of attention recently with a rapidly growing set of managed services for different stages of the data lifecycle. With all of the available options it is possible to run a scalable, production grade data platform with a small team, but there are still sharp edges and integration challenges to work through. Peter Fishman and Dan Silberman experienced these difficulties firsthand and created Mozart Data to provide a single, easy to use option for getting started with the modern data stack. In this episode they explain how they designed a user experience to make working with data more accessibly by organizations without a data team, while allowing for more advanced users to build out more complex workflows. They also share their thoughts on the modern data ecosystem and how it improves the availability of analytics for companies of all sizes.

Read More

Exploring Processing Patterns For Streaming Data Integration In Your Data Lake - Episode 240

One of the perennial challenges posed by data lakes is how to keep them up to date as new data is collected. With the improvements in streaming engines it is now possible to perform all of your data integration in near real time, but it can be challenging to understand the proper processing patterns to make that performant. In this episode Ori Rafael shares his experiences from Upsolver and building scalable stream processing for integrating and analyzing data, and what the tradeoffs are when coming from a batch oriented mindset.

Read More

Exploring The Evolution And Adoption of Customer Data Platforms and Reverse ETL - Episode 235

The precursor to widespread adoption of cloud data warehouses was the creation of customer data platforms. Acting as a centralized repository of information about how your customers interact with your organization they drove a wave of analytics about how to improve products based on actual usage data. A natural outgrowth of that capability is the more recent growth of reverse ETL systems that use those analytics to feed back into the operational systems used to engage with the customer. In this episode Tejas Manohar and Rachel Bradley-Haas share the story of their own careers and experiences coinciding with these trends. They also discuss the current state of the market for these technological patterns and how to take advantage of them in your own work.

Read More

Completing The Feedback Loop Of Data Through Operational Analytics With Census - Episode 231

The focus of the past few years has been to consolidate all of the organization’s data into a cloud data warehouse. As a result there have been a number of trends in data that take advantage of the warehouse as a single focal point. Among those trends is the advent of operational analytics, which completes the cycle of data from collection, through analysis, to driving further action. In this episode Boris Jabes, CEO of Census, explains how the work of synchronizing cleaned and consolidated data about your customers back into the systems that you use to interact with those customers allows for a powerful feedback loop that has been missing in data systems until now. He also discusses how Census makes that synchronization easy to manage, how it fits with the growth of data quality tooling, and how you can start using it today.

Read More

Delivering Your Personal Data Cloud With Prifina - Episode 225

The promise of online services is that they will make your life easier in exchange for collecting data about you. The reality is that they use more information than you realize for purposes that are not what you intended. There have been many attempts to harness all of the data that you generate for gaining useful insights about yourself, but they are generally difficult to set up and manage or require software development experience. The team at Prifina have built a platform that allows users to create their own personal data cloud and install applications built by developers that power useful experiences while keeping you in full control. In this episode Markus Lampinen shares the goals and vision of the company, the technical aspects of making it a reality, and the future vision for how services can be designed to respect user’s privacy while still providing compelling experiences.

Read More

Do Away With Data Integration Through A Dataware Architecture With Cinchy - Episode 216

The reason that so much time and energy is spent on data integration is because of how our applications are designed. By making the software be the owner of the data that it generates, we have to go through the trouble of extracting the information to then be used elsewhere. The team at Cinchy are working to bring about a new paradigm of software architecture that puts the data as the central element. In this episode Dan DeMers, Cinchy’s CEO, explains how their concept of a “Dataware” platform eliminates the need for costly and error prone integration processes and the benefits that it can provide for transactional and analytical application design. This is a fascinating and unconventional approach to working with data, so definitely give this a listen to expand your thinking about how to build your systems.

Read More