Hightouch

Streaming Data Pipelines Made SQL With Decodable - Episode 233

Streaming data systems have been growing more capable and flexible over the past few years. Despite this, it is still challenging to build reliable pipelines for stream processing. In this episode Eric Sammer discusses the shortcomings of the current set of streaming engines and how they force engineers to work at an extremely low level of abstraction. He also explains why he started Decodable to address that limitation and the work that he and his team have done to let data engineers build streaming pipelines entirely in SQL.

Read More

Data Exploration For Business Users Powered By Analytics Engineering With Lightdash - Episode 232

The market for business intelligence has been going through an evolutionary shift in recent years. One of the driving forces for that change has been the rise of analytics engineering powered by dbt. Lightdash has fully embraced that shift by building an entire open source business intelligence framework that is powered by dbt models. In this episode Oliver Laslett describes why dashboards aren’t sufficient for business analytics, how Lightdash promotes the work that you are already doing in your data warehouse modeling with dbt, and how they are focusing on bridging the divide between data teams and business teams and the requirements that they have for data workflows.

Read More

Designing And Building Data Platforms As A Product - Episode 218

The term “data platform” gets thrown around a lot, but have you stopped to think about what it actually means for you and your organization? In this episode Lior Gavish, Lior Solomon, and Atul Gupte share their view of what it means to have a data platform, discuss their experiences building them at various companies, and provide advice on how to treat them like a software product. This is a valuable conversation about how to approach the work of selecting the tools that you use to power your data systems and considerations for how they can be woven together for a unified experience across your various stakeholders.

Read More

Do Away With Data Integration Through A Dataware Architecture With Cinchy - Episode 216

The reason that so much time and energy is spent on data integration is because of how our applications are designed. By making the software be the owner of the data that it generates, we have to go through the trouble of extracting the information to then be used elsewhere. The team at Cinchy are working to bring about a new paradigm of software architecture that puts the data as the central element. In this episode Dan DeMers, Cinchy’s CEO, explains how their concept of a “Dataware” platform eliminates the need for costly and error prone integration processes and the benefits that it can provide for transactional and analytical application design. This is a fascinating and unconventional approach to working with data, so definitely give this a listen to expand your thinking about how to build your systems.

Read More

Let Your Analysts Build A Data Lakehouse With Cuelake - Episode 214

Data lakes have been gaining popularity alongside an increase in their sophistication and usability. Despite improvements in performance and data architecture they still require significant knowledge and experience to deploy and manage. In this episode Vikrant Dubey discusses his work on the Cuelake project which allows data analysts to build a lakehouse with SQL queries. By building on top of Zeppelin, Spark, and Iceberg he and his team at Cuebook have built an autoscaled cloud native system that abstracts the underlying complexity.

Read More

Prepare Your Unstructured Data For Machine Learning And Computer Vision Without The Toil Using Activeloop - Episode 212

The vast majority of data tools and platforms that you hear about are designed for working with structured, text-based data. What do you do when you need to manage unstructured information, or build a computer vision model? Activeloop was created for exactly that purpose. In this episode Davit Buniatyan, founder and CEO of Activeloop, explains why he is spending his time and energy on building a platform to simplify the work of getting your unstructured data ready for machine learning. He discusses the inefficiencies that teams run into from having to reprocess data multiple times, his work on the open source Hub library to solve this problem for everyone, and his thoughts on the vast potential that exists for using computer vision to solve hard and meaningful problems.

Read More

Data Discovery From Dashboards To Databases With Castor - Episode 210

Every organization needs to be able to use data to answer questions about their business. The trouble is that the data is usually spread across a wide and shifting array of systems, from databases to dashboards. The other challenge is that even if you do find the information you are seeking, there might not be enough context available to determine how to use it or what it means. Castor is building a data discovery platform aimed at solving this problem, allowing you to search for and document details about everything from a database column to a business intelligence dashboard. In this episode CTO Amaury Dumoulin shares his perspective on the complexity of letting everyone in the company find answers to their questions and how Castor is designed to help.

Read More

Bringing The Metrics Layer To The Masses With Transform - Episode 206

Collecting and cleaning data is only useful if someone can make sense of it afterward. The latest evolution in the data ecosystem is the introduction of a dedicated metrics layer to help address the challenge of adding context and semantics to raw information. In this episode Nick Handel shares the story behind Transform, a new platform that provides a managed metrics layer for your data platform. He explains the challenges that occur when metrics are maintained across a variety of systems, the benefits of unifying them in a common access layer, and the potential that it unlocks for everyone in the business to confidently answer questions with data.

Read More