Latest Episodes

SnowflakeDB: The Data Warehouse Built For The Cloud - Episode 110

Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column oriented storage engines, to the current generation of cloud-native analytical engines. SnowflakeDB has been leading the charge to take advantage of cloud services that simplify the separation of compute and storage. In this episode Kent Graziano, chief technical evangelist for SnowflakeDB, explains how it is differentiated from other managed platforms and traditional data warehouse engines, the features that allow you to scale your usage dynamically, and how it allows for a shift in...

Play Episode

Organizing And Empowering Data Engineers At Citadel - Episode 109

The financial industry has long been driven by data, requiring a mature and robust capacity for discovering and integrating valuable sources of information. Citadel is no exception, and in this episode Michael Watson and Robert Krzyzanowski share their experiences managing and leading the data engineering teams that power the business. They shared helpful insights into some of the challenges associated with working in a regulated industry, organizing teams to deliver value rapidly and reliably, and how they approach career development for data engineers. This was a great conversation for...

Play Episode

Building A Real Time Event Data Warehouse For Sentry - Episode 108

The team at Sentry has built a platform for anyone in the world to send software errors and events. As they scaled the volume of customers and data they began running into the limitations of their initial architecture. To address the needs of their business and continue to improve their capabilities they settled on Clickhouse as the new storage and query layer to power their business. In this episode James Cunningham and Ted Kaemming describe the process of rearchitecting a production system, what they learned in the process, and...

Play Episode

Escaping Analysis Paralysis For Your Data Platform With Data Virtualization - Episode 107

With the constant evolution of technology for data management it can seem impossible to make an informed decision about whether to build a data warehouse, or a data lake, or just leave your data wherever it currently rests. What's worse is that any time you have to migrate to a new architecture, all of your analytical code has to change too. Thankfully it's possible to add an abstraction layer to eliminate the churn in your client code, allowing you to evolve your data platform without disrupting your downstream data...

Play Episode

Designing For Data Protection - Episode 106

The practice of data management is one that requires technical acumen, but there are also many policy and regulatory issues that inform and influence the design of our systems. With the introduction of legal frameworks such as the EU GDPR and California's CCPA it is necessary to consider how to implement data protectino and data privacy principles in the technical and policy controls that govern our data platforms. In this episode Karen Heaton and Mark Sherwood-Edwards share their experience and expertise in helping organizations achieve compliance. Even if you...

Play Episode

Automating Your Production Dataflows On Spark - Episode 105

As data engineers the health of our pipelines is our highest priority. Unfortunately, there are countless ways that our dataflows can break or degrade that have nothing to do with the business logic or data transformations that we write and maintain. Sean Knapp founded Ascend to address the operational challenges of running a production grade and scalable Spark infrastructure, allowing data engineers to focus on the problems that power their business. In this episode he explains the technical implementation of the Ascend platform, the challenges that he has faced...

Play Episode

Build Maintainable And Testable Data Applications With Dagster - Episode 104

Despite the fact that businesses have relied on useful and accurate data to succeed for decades now, the state of the art for obtaining and maintaining that information still leaves much to be desired. In an effort to create a better abstraction for building data applications Nick Schrock created Dagster. In this episode he explains his motivation for creating a product for data management, how the programming model simplifies the work of building testable and maintainable pipelines, and his vision for the future of data programming. If you are...

Play Episode

Data Orchestration For Hybrid Cloud Analytics - Episode 103

The scale and complexity of the systems that we build to satisfy business requirements is increasing as the available tools become more sophisticated. In order to bridge the gap between legacy infrastructure and evolving use cases it is necessary to create a unifying set of components. In this episode Dipti Borkar explains how the emerging category of data orchestration tools fills this need, some of the existing projects that fit in this space, and some of the ways that they can work together to simplify projects such as cloud...

Play Episode

Keeping Your Data Warehouse In Order - Episode 102

Managing a data warehouse can be challenging, especially when trying to maintain a common set of patterns. Dataform is a platform that helps you apply engineering principles to your data transformations and table definitions, including unit testing SQL scripts, defining repeatable pipelines, and adding metadata to your warehouse to improve your team's communication. In this episode CTO and co-founder of Dataform Lewis Hemens joins the show to explain his motivation for creating the platform and company, how it works under the covers, and how you can start using it...

Play Episode

Fast Analytics On Semi-Structured And Structured Data In The Cloud - Episode 101

The process of exposing your data through a SQL interface has many possible pathways, each with their own complications and tradeoffs. One of the recent options is Rockset, a serverless platform for fast SQL analytics on semi-structured and structured data. In this episode CEO Venkat Venkataramani and SVP of Product Shruti Bhat explain the origins of Rockset, how it is architected to allow for fast and flexible SQL analytics on your data, and how their serverless platform can save you the time and effort of implementing portions of your...

Play Episode

Join The Mailing List