This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Support the show!Listen in your favorite app:
FountainHere are shows you might like
Every business collects data in some fashion, but sometimes the true value of the collected information only comes when it is combined with other data sources. Data trusts are a legal framework for allowing businesses to collaboratively pool their data. This allows the members of the trust…
Every business collects data in some…
03 February 2020 | 00:56:53
Data pipelines are complicated and business critical pieces of technical infrastructure. Unfortunately they are also complex and difficult to test, leading to a significant amount of technical debt which contributes to slower iteration cycles. In this episode James Campbell describes how he…
Data pipelines are complicated and…
27 January 2020 | 00:46:31
Building a reliable data platform is a neverending task. Even if you have a process that works for you and your business there can be unexpected events that require a change in your platform architecture. In this episode the head of data for Mayvenn shares their experience migrating an…
Building a reliable data platform is a…
20 January 2020 | 00:39:00
Summary The modern era of software development is identified by ubiquitous access to elastic infrastructure for computation and easy automation of deployment. This has led to a class of applications that can quickly scale to serve users worldwide. This requires a new class of data storage which can accomodate that demand without having to…
Summary The modern era of software development is identified by ubiquitous access to elastic…
13 January 2020 | 01:01:17
Databases are useful for inspecting the current state of your application, but inspecting the history of that data can get messy without a way to track changes as they happen. Debezium is an open source platform for reliable change data capture that you can use to build supplemental systems…
Databases are useful for inspecting the…
06 January 2020 | 00:53:01
DataDog is one of the most successful companies in the space of metrics and monitoring for servers and cloud infrastructure. In order to support their customers, they need to capture, process, and analyze massive amounts of timeseries data with a high degree of uptime and reliability. Vadim…
DataDog is one of the most successful…
30 December 2019 | 00:45:55
Transactional databases used in applications are optimized for fast reads and writes with relatively simple queries on a small number of records. Data warehouses are optimized for batched writes and complex analytical queries. Between those use cases there are varying levels of support for…
Transactional databases used in…
23 December 2019 | 00:48:07
Building clean datasets with reliable and reproducible ingestion pipelines is completely useless if it’s not possible to find them and understand their provenance. The solution to discoverability and tracking of data lineage is to incorporate a metadata repository into your data…
Building clean datasets with reliable and…
16 December 2019 | 01:01:53
Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column oriented storage engines, to the current generation of cloud-native analytical engines. SnowflakeDB has been leading the charge to take advantage of cloud services that…
Data warehouses have gone through many…
09 December 2019 | 00:58:57
The financial industry has long been driven by data, requiring a mature and robust capacity for discovering and integrating valuable sources of information. Citadel is no exception, and in this episode Michael Watson and Robert Krzyzanowski share their experiences managing and leading the…
The financial industry has long been…
03 December 2019 | 00:45:50