This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Here are shows you might like
Every organization needs to be able to use data to answer questions about their business. The trouble is that the data is usually spread across a wide and shifting array of systems, from databases to dashboards. The other challenge is that even if you do find the information you…
07 August 2021 | 00:52:47
Data lake architectures have largely been biased toward batch processing workflows due to the volume of data that they are designed for. With more real-time requirements and the increasing use of streaming data there has been a struggle to merge fast, incremental updates with large,…
03 August 2021 | 01:09:37
Companies of all sizes and industries are trying to use the data that they and their customers generate to survive and thrive in the modern economy. As a result, they are relying on a constantly growing number of data sources being accessed by an increasingly varied set of users. In order…
31 July 2021 | 00:51:23
Everyone expects data to be transmitted, processed, and updated instantly as more and more products integrate streaming data. The technology to make that possible has been around for a number of years, but the barriers to adoption have still been high due to the level of technical…
28 July 2021 | 01:00:13
Collecting and cleaning data is only useful if someone can make sense of it afterward. The latest evolution in the data ecosystem is the introduction of a dedicated metrics layer to help address the challenge of adding context and semantics to raw information. In this episode Nick Handel…
23 July 2021 | 01:01:17
Data quality is a concern that has been gaining attention alongside the rising importance of analytics for business success. Many solutions rely on hand-coded rules for catching known bugs, or statistical analysis of records to detect anomalies retroactively. While those are useful tools,…
20 July 2021 | 01:01:07
There is a wealth of tools and systems available for processing data, but the user experience of integrating them and building workflows is still lacking. This is particularly important in large and complex organizations where domain knowledge and context are paramount and there may not be…
16 July 2021 | 01:12:35
We have been building platforms and workflows to store, process, and analyze data since the earliest days of computing. Over that time there have been countless architectures, patterns, and "best practices" to make that task manageable. With the growing popularity of cloud…
13 July 2021 | 00:49:02
Every data project, whether it’s analytics, machine learning, or AI, starts with the work of data cleaning. This is a critical step and benefits from being accessible to the domain experts. Trifacta is a platform for managing your data engineering workflow to make curating, cleaning,…
09 July 2021 | 01:07:13
At the core of every data pipeline is a workflow manager (or several). Deploying, managing, and scaling that orchestration can consume a large fraction of a data team’s energy, so it is important to pick something that provides the power and flexibility that you need. SaaSGlue is a…
05 July 2021 | 00:55:31