This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Support the show!Listen in your favorite app:
FountainHere are shows you might like
The software applications that we build for our businesses are a rich source of data, but accessing and extracting that data is often a slow and error-prone process. Rookout has built a platform to separate the data collection process from the lifecycle of your code. In this episode, CTO…
The software applications that we build…
14 April 2020 | 00:26:00
Knowledge graphs are a data resource that can answer questions beyond the scope of traditional data analytics. By organizing and storing data to emphasize the relationship between entities, we can discover the complex connections between multiple sources of information. In this episode John…
Knowledge graphs are a data resource that…
07 April 2020 | 00:45:20
Building and maintaining a system that integrates and analyzes all of the data for your organization is a complex endeavor. Operating on a shoe-string budget makes it even more challenging. In this episode Tyler Colby shares his experiences working as a data professional in the non-profit…
Building and maintaining a system that…
30 March 2020 | 00:44:36
There are a number of platforms available for object storage, including self-managed open source projects. But what goes on behind the scenes of the companies that run these systems at scale so you don’t have to? In this episode Will Smith shares the journey that he and his team at…
There are a number of platforms available…
23 March 2020 | 00:35:53
CouchDB is a distributed document database built for scale and ease of operation. With a built-in synchronization protocol and a HTTP interface it has become popular as a backend for web and mobile applications. Created 15 years ago, it has accrued some technical debt which is being…
CouchDB is a distributed document…
17 March 2020 | 00:55:25
Data governance is a complex endeavor, but scaling it to meet the needs of a complex or globally distributed organization requires a well considered and coherent strategy. In this episode Tim Ward describes an architecture that he has used successfully with multiple organizations to scale…
Data governance is a complex endeavor,…
09 March 2020 | 00:54:08
Building applications on top of unbounded event streams is a complex endeavor, requiring careful integration of multiple disparate systems that were engineered in isolation. The ksqlDB project was created to address this state of affairs by building a unified layer on top of the Kafka…
Building applications on top of unbounded…
02 March 2020 | 00:43:36
Misaligned priorities across business units can lead to tensions that drive members of the organization to build data and analytics projects without the guidance or support of engineering or IT staff. The availability of cloud platforms and managed services makes this a viable option, but…
Misaligned priorities across business…
25 February 2020 | 00:46:09
One of the biggest challenges in building reliable platforms for processing event pipelines is managing the underlying infrastructure. At Snowplow Analytics the complexity is compounded by the need to manage multiple instances of their platform across customer environments. In this episode…
One of the biggest challenges in building…
18 February 2020 | 00:49:01
Designing the structure for your data warehouse is a complex and challenging process. As businesses deal with a growing number of sources and types of information that they need to integrate, they need a data modeling strategy that provides them with flexibility and speed. Data Vault is an…
Designing the structure for your data…
09 February 2020 | 01:06:22