Data Warehouses

Declarative Machine Learning Without The Operational Overhead Using Continual - Episode 222

Building, scaling, and maintaining the operational components of a machine learning workflow are all hard problems. Add the work of creating the model itself, and it’s not surprising that a majority of companies that could greatly benefit from machine learning have yet to either put it into production or see the value. Tristan Zajonc recognized the complexity that acts as a barrier to adoption and created the Continual platform in response. In this episode he shares his perspective on the benefits of declarative machine learning workflows as a means of accelerating adoption in businesses that don’t have the time, money, or ambition to build everything from scratch. He also discusses the technical underpinnings of what he is building and how using the data warehouse as a shared resource drastically shortens the time required to see value. This is a fascinating episode and Tristan’s work at Continual is likely to be the catalyst for a new stage in the machine learning community.

Bring Order To The Chaos Of Your Unstructured Data Assets With Unstruk - Episode 196

Working with unstructured data has typically been a motivation for a data lake. The challenge is imposing enough order on the platform to make it useful. Kirk Marple has spent years working with data systems and the media industry, which inspired him to build a platform for automatically organizing your unstructured assets to make them more valuable. In this episode he shares the goals of the Unstruk Data Warehouse, how it is architected to extract asset metadata and build a searchable knowledge graph from the information, and the myriad ways that the system can be used. If you are wondering how to deal with all of the information that doesn’t fit in your databases or data warehouses, then this episode is for you.

Paving The Road For Fast Analytics On Distributed Clouds With The Yellowbrick Data Warehouse - Episode 190

The data warehouse has become the focal point of the modern data platform. With increased usage of data across businesses, and a diversity of locations and environments where data needs to be managed, the warehouse engine needs to be fast and easy to manage. Yellowbrick is a data warehouse platform that was built from the ground up for speed, and can work across clouds and all the way to the edge. In this episode CTO Mark Cusack explains how the engine is architected, the benefits that speed and predictable pricing have for the organization, and how you can simplify your platform by putting the warehouse close to the data, instead of the other way around.

Building Your Data Warehouse On Top Of PostgreSQL - Episode 186

There is a lot of attention on the database market and cloud data warehouses. While they provide a measure of convenience, they also require you to sacrifice a certain amount of control over your data. If you want to build a warehouse that gives you both control and flexibility then you might consider building on top of the venerable PostgreSQL project. In this episode Thomas Richter and Joshua Drake share their advice on how to build a production ready data warehouse with Postgres.

How Shopify Is Building Their Production Data Warehouse Using DBT - Episode 171

With all of the tools and services available for building a data platform it can be difficult to separate the signal from the noise. One of the best ways to get a true understanding of how a technology works in practice is to hear from people who are running it in production. In this episode Zeeshan Qureshi and Michelle Ark share their experiences using DBT to manage the data warehouse for Shopify. They explain how they structured the project to allow for multiple teams to collaborate in a scalable manner, the additional tooling that they added to address the edge cases that they have run into, and the optimizations that they baked into their continuous integration process to provide fast feedback and reduce costs. This is a great conversation about the lessons learned from real-world use of a specific technology and how well it lives up to its promises.

Using Your Data Warehouse As The Source Of Truth For Customer Data With Hightouch - Episode 168

The data warehouse has become the central component of the modern data stack. Building on this pattern, the team at Hightouch have created a platform that synchronizes information about your customers out to third party systems for use by marketing and sales teams. In this episode Tejas Manohar explains the benefits of sourcing customer data from one location for all of your organization to use, the technical challenges of synchronizing the data to external systems with varying APIs, and the workflow for enabling self-service access to your customer data by your marketing teams. This is an interesting conversation about the importance of the data warehouse and how it can be used beyond just internal analytics.

Building A Better Data Warehouse For The Cloud At Firebolt - Episode 148

Data warehouse technology has been around for decades and has gone through several generational shifts in that time. The current trends in data warehousing are oriented around cloud native architectures that take advantage of dynamic scaling and the separation of compute and storage. Firebolt is taking that a step further with a core focus on speed and interactivity. In this episode CEO and founder Eldad Farkash explains how the Firebolt platform is architected for high throughput, their simple and transparent pricing model to encourage widespread use, and the use cases that it unlocks through interactive query speeds.

Data Modeling That Evolves With Your Business Using Data Vault - Episode 119

Designing the structure for your data warehouse is a complex and challenging process. As businesses deal with a growing number of sources and types of information that they need to integrate, they need a data modeling strategy that provides them with flexibility and speed. Data Vault is an approach that allows for evolving a data model in place without requiring destructive transformations and massive up-front design to answer valuable questions. In this episode Kent Graziano shares his journey with data vault, describes how it allows for an agile approach to data warehousing, and explains the core principles of how to use it. If you’re struggling with unwieldy dimensional models, slow-moving projects, or challenges integrating new data sources then listen in on this conversation and then give data vault a try for yourself.

SnowflakeDB: The Data Warehouse Built For The Cloud - Episode 110

Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column oriented storage engines, to the current generation of cloud-native analytical engines. SnowflakeDB has been leading the charge to take advantage of cloud services that simplify the separation of compute and storage. In this episode Kent Graziano, chief technical evangelist for SnowflakeDB, explains how it is differentiated from other managed platforms and traditional data warehouse engines, the features that allow you to scale your usage dynamically, and how it allows for a shift in your workflow from ETL to ELT. If you are evaluating your options for building or migrating a data platform, then this is definitely worth a listen.

Building A Real Time Event Data Warehouse For Sentry - Episode 108

The team at Sentry has built a platform for anyone in the world to send software errors and events. As they scaled the volume of customers and data they began running into the limitations of their initial architecture. To address the needs of their business and continue to improve their capabilities they settled on Clickhouse as the new storage and query layer to power their business. In this episode James Cunningham and Ted Kaemming describe the process of rearchitecting a production system, what they learned in the process, and some useful tips for anyone else evaluating Clickhouse.
