Data Architecture

An Exploration Of What Data Automation Can Provide To Data Engineers And Ascend's Journey To Make It A Reality - Episode 320

The dream of every engineer is to automate all of their tasks. For data engineers, this is a monumental undertaking. Orchestration engines are one step in that direction, but they are not a complete solution. In this episode Sean Knapp shares his views on what constitutes proper automation and the work that he and his team at Ascend are doing to help make it a reality.

An Exploration Of The Expectations, Ecosystem, and Realities Of Real-Time Data Applications - Episode 317

Data has permeated every aspect of our lives and the products that we interact with. As a result, end users and customers have come to expect interactions and updates with services and analytics to be fast and up to date. In this episode Shruti Bhat gives her view on the state of the ecosystem for real-time data and the work that she and her team at Rockset are doing to make it easier for engineers to build those experiences.

Re-Bundling The Data Stack With Data Orchestration And Software Defined Assets Using Dagster - Episode 309

The current stage of evolution in the data management ecosystem has resulted in domain and use case specific orchestration capabilities being incorporated into various tools. This complicates the work involved in making end-to-end workflows visible and integrated. Dagster has invested in bringing insights about external tools’ dependency graphs into one place through its “software defined assets” functionality. In this episode Nick Schrock discusses the importance of orchestration and a central location for managing data systems, the road to Dagster’s 1.0 release, and the new features coming with Dagster Cloud’s general availability.

Charting the Path of Riskified's Data Platform Journey - Episode 306

Building a data platform is a journey, not a destination. Beyond the work of assembling a set of technologies and building integrations across them, there is also the work of growing and organizing a team that can support and benefit from that platform. In this episode Inbar Yogev and Lior Winner share the journey that they and their teams at Riskified have been on for their data platform. They also discuss how they have established a guild system for training and supporting data professionals in the organization.

Maintain Your Data Engineers' Sanity By Embracing Automation - Episode 305

Building and maintaining reliable data assets is the prime directive for data engineers. While it is easy to say, it is endlessly complex to implement, requiring data professionals to be experts in a wide range of disparate topics while designing and implementing complex topologies of information workflows. In order to make this a tractable problem it is essential that engineers embrace automation at every opportunity. In this episode Chris Riccomini shares his experiences building and scaling data operations at WePay and LinkedIn, as well as the lessons he has learned working with other teams as they automated their own systems.

The View From The Lakehouse Of Architectural Patterns For Your Data Platform - Episode 304

The ecosystem for data tools has been going through rapid and constant evolution over the past several years. These technological shifts have brought about corresponding changes in data and platform architectures for managing data and analytical workflows. In this episode Colleen Tartow shares her insights into the motivating factors and benefits of the most prominent patterns in the popular narrative: data mesh and the modern data stack. She also discusses her views on the role of the data lakehouse as a building block for these architectures and the ongoing influence that it will have as the technology matures.

Strategies And Tactics For A Successful Master Data Management Implementation - Episode 301

The most complicated part of data engineering is the effort involved in making the raw data fit into the narrative of the business. Master Data Management (MDM) is the process of building consensus around what the information actually means in the context of the business and then shaping the data to match those semantics. In this episode Malcolm Hawker shares his years of experience working in this domain to explore the combination of technical and social skills that are necessary to make an MDM project successful both at the outset and over the long term.

Designing And Deploying IoT Analytics For Industrial Applications At Vopak - Episode 290

Industrial applications are one of the primary adopters of Internet of Things (IoT) technologies, with business critical operations being informed by data collected across a fleet of sensors. Vopak is a business that manages storage and distribution of a variety of liquids that are critical to the modern world, and they have recently launched a new platform to gain more utility from their industrial sensors. In this episode Mário Pereira shares the system design that he and his team have developed for collecting and analyzing sensor data, and how they have split the data processing and business logic responsibilities between physical terminals and edge locations on one hand and centralized storage and compute on the other.

Insights And Advice On Building A Data Lake Platform From Someone Who Learned The Hard Way - Episode 289

Designing a data platform is a complex and iterative undertaking that requires accounting for many conflicting needs. Designing a platform that relies on a data lake as its central architectural tenet adds further layers of difficulty. Srivatsan Sridharan has had the opportunity to design, build, and run data lake platforms for both Yelp and Robinhood, with many valuable lessons learned from each experience. In this episode he shares his insights and advice on how to approach such an undertaking in your own organization.
