Building A New Foundation For CouchDB - Episode 124

Summary

CouchDB is a distributed document database built for scale and ease of operation. With a built-in synchronization protocol and an HTTP interface it has become popular as a backend for web and mobile applications. Created 15 years ago, it has accrued some technical debt which is being addressed with a refactored architecture based on FoundationDB. In this episode Adam Kocoloski shares the history of the project, how it works under the hood, and how the new design will improve the project for our new era of computation. This was an interesting conversation about the challenges of maintaining a large and mission critical project and the work being done to evolve it.

Many data engineers say the most frustrating part of their job is spending too much time maintaining and monitoring their data pipeline. Snowplow works with data-informed businesses to set up a real-time event data pipeline, taking care of installation, upgrades, autoscaling, and ongoing maintenance so you can focus on the data.

Snowplow runs in your own cloud account giving you complete control and flexibility over how your data is collected and processed. Best of all, Snowplow is built on top of open source technology which means you have visibility into every stage of your pipeline, with zero vendor lock in.

At Snowplow, we know how important it is for data engineers to deliver high-quality data across the organization. That’s why the Snowplow pipeline is designed to deliver complete, rich and accurate data into your data warehouse of choice. Your data analysts define the data structure that works best for your teams, and we enforce it end-to-end so your data is ready to use.

Get in touch with our team to find out how Snowplow can accelerate your analytics. Go to dataengineeringpodcast.com/snowplow. Set up a demo and mention you’re a listener for a special offer!


Enabling real-time analytics is a huge task. Without a data warehouse that outperforms the demands of your customers at a fraction of cost and time, this big task can also prove challenging. But it doesn’t have to be tiring or difficult with ClickHouse — an open-source analytical database that deploys and scales wherever and whenever you want it to and turns data into actionable revenue. And Altinity is the leading ClickHouse software and service provider on a mission to help data engineers and DevOps managers. Go to dataengineeringpodcast.com/altinity to find out how with a free consultation.


Do you want to try out some of the tools and applications that you heard about on the Data Engineering Podcast? Do you have some ETL jobs that need somewhere to run? Check out Linode at dataengineeringpodcast.com/linode or use the code dataengineering2019 and get a $20 credit (that’s 4 months free!) to try out their fast and reliable Linux virtual servers. They’ve got lightning fast networking and SSD servers with plenty of power and storage to run whatever you want to experiment on.


Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, a 40Gbit public network, fast object storage, and a brand new managed Kubernetes platform, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. And for your machine learning workloads, they’ve got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • Are you spending too much time maintaining your data pipeline? Snowplow empowers your business with a real-time event data pipeline running in your own cloud account without the hassle of maintenance. Snowplow takes care of everything from installing your pipeline in a couple of hours to upgrading and autoscaling so you can focus on your exciting data projects. Your team will get the most complete, accurate and ready-to-use behavioral web and mobile data, delivered into your data warehouse, data lake and real-time streams. Go to dataengineeringpodcast.com/snowplow today to find out why more than 600,000 websites run Snowplow. Set up a demo and mention you’re a listener for a special offer!
  • Setting up and managing a data warehouse for your business analytics is a huge task. Integrating real-time data makes it even more challenging, but the insights you obtain can make or break your business growth. You deserve a data warehouse engine that outperforms the demands of your customers and simplifies your operations at a fraction of the time and cost that you might expect. You deserve ClickHouse, the open-source analytical database that deploys and scales wherever and whenever you want it to and turns data into actionable insights. And Altinity, the leading software and service provider for ClickHouse, is on a mission to help data engineers and DevOps managers tame their operational analytics. Go to dataengineeringpodcast.com/altinity for a free consultation to find out how they can help you today.
  • You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media, Corinium Global Intelligence, ODSC, and Data Council. Upcoming events include the Software Architecture Conference in NYC, Strata Data in San Jose, and PyCon US in Pittsburgh. Go to dataengineeringpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today.
  • Your host is Tobias Macey and today I’m interviewing Adam Kocoloski about CouchDB and the work being done to migrate the storage layer to FoundationDB

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by describing what CouchDB is?
    • How did you get involved in the CouchDB project and what is your current role in the community?
  • What are the use cases that it is well suited for?
  • Can you share some of the history of CouchDB and its role in the NoSQL movement?
  • How is CouchDB currently architected and how has it evolved since it was first introduced?
  • What have been the benefits and challenges of Erlang as the runtime for CouchDB?
  • How is the current storage engine implemented and what are its shortcomings?
  • What problems are you trying to solve by replatforming on a new storage layer?
    • What were the selection criteria for the new storage engine and how did you structure the decision making process?
    • What was the motivation for choosing FoundationDB as opposed to other options such as RocksDB, LevelDB, etc.?
  • How is the adoption of FoundationDB going to impact the overall architecture and implementation of CouchDB?
  • How will the use of FoundationDB impact the way that the current capabilities are implemented, such as data replication?
  • What will the migration path be for people running an existing installation?
  • What are some of the biggest challenges that you are facing in rearchitecting the codebase?
  • What new capabilities will the FoundationDB storage layer enable?
  • What are some of the most interesting/unexpected/innovative ways that you have seen CouchDB used?
    • What new capabilities or use cases do you anticipate once this migration is complete?
  • What are some of the most interesting/unexpected/challenging lessons that you have learned while working with the CouchDB project and community?
  • What is in store for the future of CouchDB?

Contact Info

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Raw Transcript
Tobias Macey
0:00:13
Hello, and welcome to the Data Engineering Podcast, the show about modern data management. When you're ready to build your next pipeline or want to test out the projects you hear about on the show, you'll need somewhere to deploy them, so check out our friends at Linode. With 200 gigabit private networking, scalable shared block storage, a 40 gigabit public network, fast object storage, and a brand new managed Kubernetes platform, you've got everything you need to run a fast, reliable, and bulletproof data platform. And for your machine learning workloads, they've got dedicated CPU and GPU instances. Go to dataengineeringpodcast.com/linode, that's L-I-N-O-D-E, today to get a $20 credit and launch a new server in under a minute. And don't forget to thank them for their continued support of this show. Setting up and managing a data warehouse for your business analytics is a huge task. Integrating real-time data makes it even more challenging, but the insights you obtain could make or break your business growth. You deserve a data warehouse engine that outperforms the demands of your customers and simplifies your operations at a fraction of the time and cost you might expect. You deserve ClickHouse, the open-source analytical database that deploys and scales wherever and whenever you want it to and turns data into actionable insights. And Altinity, the leading software and service provider for ClickHouse, is on a mission to help data engineers and DevOps managers tame their operational analytics. Go to dataengineeringpodcast.com/altinity, that's A-L-T-I-N-I-T-Y, for a free consultation to find out how they can help you today. Are you spending too much time maintaining your data pipeline? Snowplow empowers your business with a real-time event data pipeline running in your own cloud account without the hassle of maintenance. Snowplow takes care of everything from installing your pipeline in a couple of hours to upgrading and autoscaling so you can focus on your exciting data projects. Your team will get the most complete, accurate, and ready-to-use behavioral web and mobile data, delivered into your data warehouse, data lake, and real-time data streams. Go to dataengineeringpodcast.com/snowplow today to find out why more than 600,000 websites run Snowplow. Set up a demo and mention you're a listener for a special offer. Your host is Tobias Macey, and today I'm interviewing Adam Kocoloski about CouchDB and the work being done to migrate the storage layer to FoundationDB. So Adam, can you start by introducing yourself?
Adam Kocoloski
0:02:31
Sure, Tobias. Thanks for having me on. My name is Adam Kocoloski. I work at IBM and I'm one of the members of the project management committee for Apache CouchDB.
Tobias Macey
0:02:39
And do you remember how you first got involved in the area of data management?
Adam Kocoloski
0:02:42
Yeah, I was really on the sort of practitioner side more than the vendor side. In graduate school, I was doing work in experimental particle physics, and that is a very data intensive exercise. You're looking for needles in very large haystacks, collecting tons of telemetry from these very large detectors about collisions between fundamental particles that are happening millions of times a second, and looking for really rare signals. And, you know, the practicalities of that involve a lot of the kind of analytics that will be familiar to the folks who do a lot of work in Spark and things of that nature. And that was really my first introduction to data management. The database side of it came from kind of managing all the metadata about those data sets and the compute jobs that were processing them. And that's when I kind of got deeper into the way that people managed databases across universities, across national laboratories, across different regions of the globe, which is what got me interested in projects like CouchDB.
Tobias Macey
0:03:40
And my understanding of things like particle physics is that part of the challenge of managing the volume of data is that you have to understand what data to actually even keep, because it's coming at you so fast you can't actually store everything, and so you have to have useful heuristics for knowing what to throw away.
Adam Kocoloski
0:03:58
That's exactly right. That's one of the more contentious parts of any particle physics collaboration, when you are making those online triggering decisions. And there's several levels to it; some of that triggering decision work has to happen in FPGAs and things that respond incredibly quickly. And then you go to the next level that has a little bit more time to make a decision, maybe it can run on a commodity Linux host and do some processing. And then, you know, you ultimately get to a level where you say, yep, we're going to accept this event, we're going to write it to tape, we're gonna write it to disk. And that's when all the reconstruction efforts go in to see how well your online triggering actually captured events of interest.
Tobias Macey
0:04:31
And you said that the challenges of being able to handle sharing of these datasets across different universities and organizations is part of what led you to CouchDB. So I'm wondering if you can just start by giving a bit of an introduction to how you got involved in the community and your current role.
Adam Kocoloski
0:04:50
Sure. I got involved in CouchDB because some of my colleagues from that physics research group and I decided we should take a stab at starting a company together. We saw a lot of the new innovations that were happening in the world of data management, this was around 2007, when you had papers being published by the big web companies taking non-traditional approaches to data management at scale, and we thought to ourselves, well, we were certainly embarking on some non-traditional applications of data management at scale in our own work, and we could foresee that other companies beyond the, you know, large-scale web companies would be interested in this kind of approach as they became more data driven themselves. And so when we thought about trying to start a company, we zeroed in on this world of scalable distributed databases and said, okay, this seems like a place where we can really make an impact and where there doesn't seem to already be a, you know, a leader in the market. And we did a little bit of a review of projects that were out in the open source world to see if there were things that we ought to extend rather than starting from scratch, and CouchDB at the time seemed to have a lot of the right mentality in its approach to data distribution, its approach to, you know, empowering internet web applications, and its approach to scalability. It just seemed really well aligned with what we wanted to do with our company.
Tobias Macey
0:06:11
And so if you can now give a bit more detail on the CouchDB project itself and a bit about what it is and some of the main use cases that it's well suited for and some of the ways that people are using it.
Adam Kocoloski
0:06:23
Absolutely. CouchDB is a project maintained by the Apache Software Foundation and has been for the past decade. It's imminently issuing its third major release, the 3.0 release, which I think is a culmination of several years' worth of work optimizing, hardening, and really, you know, rounding out the support for some of the key use cases it has served really well over the past several years. We find CouchDB to be an excellent general-purpose database for building web and mobile applications. You know, in many of these projects, 80% of the functionality can be covered by a large variety of databases, and CouchDB is certainly one of those databases. So we don't see reasons, you know, to disqualify it from many types of, you know, sort of web applications that are being built today. Where it has a particular strength, though, is in its mechanisms for active-active data replication across a wide variety of topologies: systems that might exist in an on-premises data center and another one in a cloud deployment, across different clouds, systems that want to maintain a local, offline, you know, editable repository on a mobile device or tablet of some kind. Those kinds of situations, where the data lives in multiple places and needs to be synchronized, are ones that CouchDB finds itself especially well suited for.
Tobias Macey
0:07:49
Yeah, and I know that there are also projects like PouchDB and CouchDB Lite for being able to use the same type of interface and take advantage of the synchronization capability in things like browsers or on mobile devices. And so that was part of what I was initially attracted to when I first found out about CouchDB: not having to deal with writing my own synchronization primitives to be able to take advantage of that same capability.
Adam Kocoloski
0:08:17
That's exactly right. Because that replication protocol is something that is, you know, running as JSON documents over an HTTP interface, the sort of bar to implementing a client that speaks that protocol is lower than you might imagine. And that's where the PouchDB project got up and running, and I think it provides a very nice option for folks who are running in the browser, or running in sort of a browser-like environment on a mobile device. And, you know, I think that speaks to the power of, like, open standards, open APIs, and internet protocols for fostering innovation at that level.
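As a rough illustration of how approachable that HTTP-based protocol is from the outside, here is a minimal sketch that triggers a one-shot replication through CouchDB's /_replicate endpoint using plain JSON over HTTP; the hostnames, credentials, and database names are hypothetical placeholders, not anything referenced in the episode.

```python
# Minimal sketch: kick off a one-shot replication purely over CouchDB's HTTP API.
# All URLs, credentials, and database names below are hypothetical placeholders.
import requests

SOURCE = "https://user:secret@couch-a.example.com:5984/appdata"
TARGET = "https://user:secret@couch-b.example.com:5984/appdata"

resp = requests.post(
    "https://user:secret@couch-b.example.com:5984/_replicate",
    json={"source": SOURCE, "target": TARGET, "create_target": True},
)
resp.raise_for_status()
print(resp.json())  # replication history and a checkpointed session id on success
```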
Tobias Macey
0:08:52
Yeah, it's definitely interesting how having a standard interface for a given application or particular use case frees people up to be able to spend their energy on much more valuable enterprises, and the types of projects and products that come out as a result of that. In some cases, having a standardized API can be seen as a bit of a constraint that might be limiting in the capabilities that you have, but it's amazing the types of innovation that people can come up with even given those types of constraints.
Adam Kocoloski
0:09:25
Yeah, I'm encouraged to hear you, you know, sort of point out the challenges of implementing one's own synchronization protocol. When you really get into the details of active-active synchronization of two different repositories that are being edited simultaneously, it becomes a pretty fun project pretty quickly. And, you know, I would say there's still a ton of active research in the field, looking at, you know, data structures that can intelligently merge edits from different environments, data structures that can intelligently track lineage between these different environments. So it's, you know, I guess the phrase can of worms comes to mind, right?
Tobias Macey
0:10:01
Yeah, definitely. Yeah, the CRDT, or conflict-free replicated, or
Unknown
0:10:08
I forget exactly what it is. Yeah.
Tobias Macey
0:10:11
Yes, exactly. That's definitely an interesting area of research as well, and the types of use cases that that enables. But as you said, it's challenging, and that's part of why it's not being used quite so ubiquitously as one might think. And in terms of the history of CouchDB, you mentioned that you first started looking at it in around the 2007 timeframe, and it was adopted by Apache in 2010. And that was in sort of the peak of the NoSQL movement, where people were trying to figure out different ways of having web scale or, you know, massively distributed data storage layers, and some of the aspects of things like ACID and transactions that they were willing to forego in order to be able to handle these scaling capabilities. But then in recent years, people have started going the other way, in the direction of moving back towards relational and transactional interfaces, because of different breakthroughs that have been made in some of the architecture and compute environments. And so I'm wondering if you can just talk through a bit of the history and the role that CouchDB played in that NoSQL movement and some of the ways that it has maintained relevancy and grown over the past several years.
Adam Kocoloski
0:11:21
I think that's an excellent summary of the past 10 years in the world of distributed databases. You know, we sacrificed a lot in those early years in order to quickly achieve, you know, new heights in scalability, as it were, and over the past decade we have progressively tried to recover some of those richer isolation semantics and transactional capabilities that truly are pretty powerful primitives for application developers to rely upon. In the first versions of CouchDB, we had a system, like the 1.x release, right, that would run on a single server. It intentionally did not support, like, atomicity across updates to multiple documents, because we knew that would be a particularly thorny problem to tackle as we got towards native clustering in the 2.0 release. But it had a model that, you know, was providing good isolation between updates to individual documents on a single deployment. In the next major release, the 2.0 release, and this is something that continues in 3.0, we adopted an eventually consistent system for replicating, you know, documents in databases across shards, and used the same basic revision tracking mechanisms that are implemented in CouchDB's support for replication across instances as the way that we would converge on one view of the database in a CouchDB cluster. And, you know, that reuse had plenty of benefits. It meant that the protocol we used, and frankly the implementation that we used, was very battle tested. But it also had its downsides. You know, the fact that these systems weren't executing any consensus protocol, they weren't providing any sort of rollback mechanism, meant that anytime you were concurrently issuing edits to a single document from multiple writers, you ran the possibility that these folks could race, and that both could be accepted by different replicas within the cluster. So we certainly lived through a number of the downsides of sacrificing traditional levels of database isolation in the name of scalability. And I think, you know, over the years our users have understood the kinds of patterns that can be employed in application design in order to avoid that kind of contention. Nevertheless, the fact that they have to employ those kinds of patterns limits the ways in which they can design their applications to make the best use of the database. So I think we've, you know, we've had a front seat, I guess, at that movie of how, you know, isolation and scalability have been in tension with one another in the world of distributed databases over the past decade. And I think we are, as a result, deeply appreciative of the power that, you know, strong consistency and transactional support can provide application developers.
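As a concrete illustration of that revision-tracking behavior, here is a minimal sketch against a hypothetical local CouchDB node: a write carrying a stale revision is rejected with a 409 on a single node, whereas the same race across separate replicas or replication peers is preserved as conflicting revisions that the application can list and resolve. The host, credentials, and document names are assumptions for the example.

```python
# Sketch of CouchDB's optimistic concurrency: every update must name the
# revision it is based on. Host, credentials, and doc ids are hypothetical.
import requests

DB = "http://admin:secret@localhost:5984/demo"

created = requests.put(f"{DB}/invoice-1", json={"total": 10}).json()
rev = created["rev"]

first = requests.put(f"{DB}/invoice-1", json={"total": 11, "_rev": rev})
second = requests.put(f"{DB}/invoice-1", json={"total": 12, "_rev": rev})
print(first.status_code, second.status_code)  # 201 then 409: the second writer raced and lost

# When the race happens on different replicas or peers instead, both edits are
# kept; ?conflicts=true exposes the losing revisions for the app to resolve.
doc = requests.get(f"{DB}/invoice-1", params={"conflicts": "true"}).json()
print(doc.get("_conflicts", []))
```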
Tobias Macey
0:14:16
And this is probably getting a little deep in the weeds, but my understanding of the way that CouchDB was architected, and the use cases that it was optimizing for, was to prioritize write throughput. And one of the ways that it did that was by forgoing indexing on write and requiring those indexes to be built on read requests, which could sometimes lead to high latency in those read requests. So I'm curious what the current state of affairs is, and some of the trade-offs that have been accepted in the name of high write throughput.
Adam Kocoloski
0:14:50
Yeah, you've done your homework there for sure. CouchDB's materialized views have always been indexed on read, and that is very much a performance optimization. Because those indexes are defined in, you know, JavaScript functions that the user would upload, we needed an environment that could pipeline the indexing requests through a, you know, a configured JavaScript process, and firing up one of those for every single concurrent write was something that was harder to deliver high throughput out of. The trade-off being that those indexes were, you know, potentially inconsistent with one another; each copy of the data, like each copy of a database shard, would be indexing independently. And that introduced some additional, you know, potential for inconsistencies in the observability there, like, you might hit one copy of a secondary index on one request and another copy on another request. And guaranteeing that those things had sort of a view that was progressing through time was an extra challenge for us in the clustering layer.
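For readers unfamiliar with that model, the sketch below shows the general shape of it: a user-supplied JavaScript map function is uploaded inside a design document, and the index it defines is only built when the view is read. The server URL, database name, and function are hypothetical examples, not taken from the episode.

```python
# Sketch: define a materialized view via a JavaScript map function and query it.
# The instance URL, database name, and view are hypothetical examples.
import requests

DB = "http://admin:secret@localhost:5984/blog"

design = {
    "views": {
        "posts_by_author": {
            # The map function is evaluated by a separate JavaScript process
            # when the view is (re)built on read, not on every write.
            "map": "function (doc) { if (doc.type === 'post') { emit(doc.author, 1); } }",
            "reduce": "_sum",
        }
    }
}
requests.put(f"{DB}/_design/posts", json=design)

# This first query pays the indexing cost for any documents written since the
# view was last updated.
rows = requests.get(
    f"{DB}/_design/posts/_view/posts_by_author", params={"group": "true"}
).json()["rows"]
print(rows)
```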
Tobias Macey
0:15:47
And I'm wondering if you can talk a bit more about the current architecture and implementation of CouchDB and some of the ways that it has evolved in those major revisions and the upcoming 3.0 release.
Adam Kocoloski
0:16:00
Yeah, we can dig into some of those things. The 3.0 release continues to bring with it the same fundamental architecture that the 2.0 release had, right. So we still have databases that are split into shards, each shard is replicated, documents are accepted by the cluster when a majority of the copies of those shards accept the write, and indexes are built on read. What 3.0 brings with it is, you know, a lot of things that ease the administration of a CouchDB cluster, by having auto-indexing daemons running in the background, by having automatic daemons that are, you know, vacuuming or compacting the database files based on, you know, analysis of the workload. It brings with it a new, you know, full-text search integration that gives people additional flexibility in defining their secondary indexes. It also brings with it one scalability improvement, which is essentially support for compound primary keys, where the user has a little bit of additional control over colocation of documents. The way we have always answered view index queries in CouchDB has been a scatter-gather mechanism; each of the replicas of a database shard builds its own copy of the index, and when you go to query that index, we have to go ask each of the shards what its contribution to that query is, because we don't know a priori which shards host relevant data for that query. What the new partitioning feature allows is for a user to say, every document sharing this prefix, a device ID, for example, or a user account ID in the case of, you know, a SaaS application, should be colocated together. And any query that specifies that first portion of the key can be satisfied just by that one shard, rather than having to do the scatter-gather. So this is a much more scalable approach to indexing, and it's one that we find meets the needs of a lot of the different use cases for view queries in CouchDB. So that's a nice improvement.
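As an illustration of that partitioning feature in the 3.x HTTP API, here is a minimal sketch: the database is created as partitioned, every document id carries a "partition:docid" prefix, and a query scoped to one partition can be answered by a single shard instead of the scatter-gather. The server URL, database, and ids are hypothetical.

```python
# Sketch of CouchDB 3.x partitioned databases; URLs and ids are hypothetical.
import requests

SERVER = "http://admin:secret@localhost:5984"

# Every document id in a partitioned database is "<partition key>:<doc id>".
requests.put(f"{SERVER}/telemetry", params={"partitioned": "true"})
requests.put(
    f"{SERVER}/telemetry/device-42:2020-01-31T12:00:00Z",
    json={"temp_c": 21.4, "device": "device-42"},
)

# A partition-scoped request is served from the single shard that owns
# "device-42", instead of asking every shard for its contribution.
rows = requests.get(
    f"{SERVER}/telemetry/_partition/device-42/_all_docs",
    params={"include_docs": "true"},
).json()["rows"]
print(rows)
```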
Tobias Macey
0:18:01
And CouchDB being a document-oriented database, I've always been a little intrigued by the data modeling requirements of those types of storage engines, where at face value, when you're first starting off doing the Hello World tutorial, you say, oh, of course, documents are easy, I just have everything in one record. But then as you start to scale and want to try to do more complex analyses, or try to figure out joins, or how to be able to compose different records together, it starts to become much more challenging, and you have to get much more detailed and figure out how you're going to handle generating and then using these documents up front, rather than what is proposed as the sweet spot of these, where you can just throw documents in and then figure out after the fact what you're going to do with them. And I'm curious what your experience has been in that regard.
Adam Kocoloski
0:18:52
Yeah, I think that's a great perspective. I would say that the view engine has been a powerful assist to our users in that regard, because it gives you, like, the ability to define an arbitrary JavaScript function that's executed over these different documents. If you end up with an evolving schema, it's possible to address those things not by a large-scale data migration, but by some, you know, some extra handling in the view code. We've certainly had users do that at scale. It's also the kind of thing that can do some simplistic kinds of joins, by picking out, you know, the related attributes of different documents that, you know, sort of represent their own classes of objects, and pulling them together into one view. Then, you know, you can issue a range request against that view to get a blog post and all of its comments in one query to the database. So the flexibility of the view system has been something that has given people the ability to recover from, you know, changes in the data model over time, right, without a large-scale migration. And it's also given them, you know, kind of the escape hatch to be able to cover sort of unexpected requirements in the application layer, you know, without making fundamental changes to the model. But I would say, you know, a lot of our users have found that the trade-off of being able to get to market faster has been a worthwhile one. For many of these people, like, the time to market has been, you know, of the utmost importance. We had a lot of gaming customers for quite some time where, you know, they had a very predictable measurement of the revenue associated with, you know, launching additional games on the App Store and sort of turning the marketing machinery on. And I can clearly remember conversations with them where they said, look, we know an evolution of the data model is really the right approach here, but when we do the math, it doesn't make financial sense for us to delay our release timeline in order to go through that remodeling of the data, so we'd rather just scale this thing out a little bit further and see how much longer we can make this last.
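The kind of simplistic join described above is usually done by collating related documents under a compound key; a hedged sketch of that pattern follows, with a hypothetical blog database and field names.

```python
# Sketch of the classic "join" pattern: posts and their comments emit compound
# keys so one range read returns them together. Names are hypothetical.
import json
import requests

DB = "http://admin:secret@localhost:5984/blog"

design = {
    "views": {
        "post_with_comments": {
            "map": (
                "function (doc) {"
                "  if (doc.type === 'post') { emit([doc._id, 0], null); }"
                "  if (doc.type === 'comment') { emit([doc.post_id, 1], null); }"
                "}"
            )
        }
    }
}
requests.put(f"{DB}/_design/join", json=design)

# One range request: the post sorts first ([id, 0]) followed by its comments ([id, 1]).
rows = requests.get(
    f"{DB}/_design/join/_view/post_with_comments",
    params={
        "startkey": json.dumps(["post-123"]),
        "endkey": json.dumps(["post-123", 2]),
        "include_docs": "true",
    },
).json()["rows"]
print([r["doc"] for r in rows])
```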
Tobias Macey
0:20:58
Yeah, the space-time trade-off. And another implementation detail of CouchDB that I'm intrigued by is the fact that it's written, at least primarily, in Erlang, which I also know the Riak engine was written in, and a few other layers such as RabbitMQ. And I'm curious what you have found to be both the benefits and the challenges of that being the runtime for CouchDB.
Adam Kocoloski
0:21:22
Yeah, maybe challenges first. Erlang is not particularly well suited to high-throughput number crunching. You know, there are a lot of things that can be computationally intensive if done in pure Erlang, and so we've had to be fairly careful about what kinds of processing we do, and be judicious about pushing things down into C code, you know, as needed in order to hit our throughput and latency targets. That's been the main downside. I guess the other downside is, like, the pool of developers from which one can hire is not as large as it might be in other languages. On the flip side, the people who know it tend to know it pretty darn well, and so, you know, there's a fairly high cost of developers in that community. Other benefits that we've seen: it's excellent for building cloud services, for building web services. There's a great degree of isolation, you know; if a sort of rogue client process comes in and makes some crazy request, it can take down that TCP connection, but it's unlikely to have, you know, a bigger blast radius taking out other parts of the stack. You know, that's something that has been a fundamental design principle of the Erlang system from its days powering telecommunications, which it still does today. The other thing that I think has been a real boon for us is the operational visibility into the system. You know, the fact that you can kind of poke around in the running VM and gather a whole bunch of interesting diagnostics, and, frankly, if you're feeling especially adventuresome, make changes to the running VM, is something that is pretty darn useful for chasing down the kinds of bugs that are, you know, heisenbugs, right. They don't seem to lend themselves to an easily reproducible test case, but they crop up in these kind of strange ways in production. The Erlang VM, in my experience, is quite a bit more amenable to that kind of operational visibility than other runtime systems that I've been working with.
Tobias Macey
0:23:10
Yeah, I've heard some pretty remarkable statistics in terms of the reliability that you can get out of Erlang, such as I believe there was one company that's managed to have a system that had something like five minutes of downtime over the course of maybe a dozen years, which is pretty crazy.
Adam Kocoloski
0:23:25
It's, you know, it's fascinating. It is possible to design those kinds of systems and do things like, you know, upgrade the code of a running process without actually restarting the lightweight Erlang process inside the virtual machine, and we've done many of those kinds of things over the years. There's a little bit of tension with all of the world of container-based cloud native programming and Kubernetes, where, you know, Kubernetes and its ilk have their own opinions about how one ought to upgrade running services. And, you know, if you're really aspiring to upgrade the running code of a process without hanging up the connection, Erlang has tremendous tools in that space. But they don't necessarily lend themselves toward that kind of hands-off, declarative notion of, let me go now upgrade this deployment in my Kubernetes cluster.
Tobias Macey
0:24:12
And so that brings us a bit into the replatforming work that you're doing with FoundationDB to replace the storage layer. And before we get too far into that, I'm wondering if you can give a bit of detail on the current way that CouchDB handles storage and some of the benefits that you're hoping to achieve with this replatforming?
Adam Kocoloski
0:24:35
Sure. So let's talk about the current storage, and this is a storage engine that is entirely of our own invention. It is a fully copy-on-write storage model; changes to, you know, a B-tree index in one of our files involve rewriting the entire path from that leaf node up to the root and appending all of that to the end of the file. This is a very robust design. It always leaves the database in a consistent state; we don't have to do anything like replay redo logs after an unclean shutdown. You know, you can kill -9 a CouchDB process and start it back up, and it will automatically seek from the end of the file to find the last consistent snapshot of that file and use that. It was beautiful for operations, right; it very much simplifies a lot of the recovery processes that one would otherwise have to undertake in a more traditional database design. But that's just a description of the storage engine used for one replica of one shard file, and then you kind of climb up into the level of the eventually consistent clustering architecture, and, you know, that's a whole other ball of wax. So when we were looking at FoundationDB, our interest was not primarily in what you would think of as the storage engine, you know, RocksDB, or a LevelDB, or a SQLite, or, you know, CouchDB's trees. We didn't want to regress on that front; we thought it was really important to have a bulletproof, reliable, well-tested storage engine. Our interest, though, was in how do we provide, you know, serializable isolation at that storage layer while also getting horizontal scalability. And I feel like that's the problem that FoundationDB as a project has just basically spent all of its time trying to tackle: let me provide a basic key-value interface, and let me provide strict serializable isolation over updates to those keys, and let me do it in a way that allows for horizontal scalability across a cluster of machines. Those primitives are things that we looked at from a CouchDB perspective and said, wow, if we really had that underpinning CouchDB, we could deliver richer APIs to our users, we could improve the scalability of some of our operations, and, you know, we could sort of refocus our efforts on more of the things that, you know, CouchDB does that are truly, like, uniquely differentiating, like the replication layer.
Tobias Macey
0:26:48
And in terms of this replatforming, I'm curious how you're planning to handle the sort of redistribution of features and capabilities within CouchDB, where it seems that, in terms of the data replication, that can be relegated to FoundationDB. But I also know that CouchDB has out-of-the-box support for change capture feeds, and I'm wondering just sort of what the overall rearchitecting will look like, and where the capabilities will end up lying between FoundationDB and what role the CouchDB front end will serve?
Adam Kocoloski
0:27:25
Yeah, it's a great question. So our view here is that you can look at it as a set of layers, and that's how FoundationDB often talks about consumers of its, you know, its data store. At the lowest layer, FoundationDB provides the durability; all of the data that CouchDB is storing is stored in a FoundationDB cluster, but that FoundationDB cluster is not directly exposed to consumers of CouchDB. On the top, CouchDB is providing, you know, the familiar JSON HTTP interface, the materialized view indexes, other types of secondary indexes, search indexes, and so on and so forth. Those are all being written into FoundationDB. As the needs of an application using CouchDB grow, right, as its throughput requirements grow, that FoundationDB cluster can horizontally scale to accommodate more data and to serve more throughput. And the CouchDB nodes on top can horizontally scale because they're stateless; they're essentially an application layer over top of FoundationDB at this point. So we have good scalability stories for both the storage and the compute, if you like, providing the CouchDB user interfaces. But all that said, that's one instance, one deployment of CouchDB. It's one endpoint that a user would interact with in their own application. If now you're saying, well, I'd like to have one of these instances running in my on-prem data center, or I'd like to have one running in US East and one running in US West, we view those as separate deployments of CouchDB, each with its own FoundationDB environment under the hood, and CouchDB takes care of synchronizing the data between those different environments.
Tobias Macey
0:28:51
And so it seems that this is going to impact the operational characteristics of running a CouchDB cluster as well, and some of the ways that users will need to think about how to actually deploy and maintain their systems. And I'm curious, in the sort of current state of how you've been working through and experimenting with this, how the operational characteristics have improved, and any new edge cases or new considerations that operators need to be thinking about as they're designing these deployments?
Adam Kocoloski
0:29:24
Ah, yeah, that's a great topic. And it's one that, when we at IBM first proposed this direction for the future of the CouchDB project, we certainly got some questions on that front that said, it's all well and good for IBM to say, hey, we're going to run FoundationDB, you have the skills to go, you know, put a team around that and make sure you can do it. What about the users who are running their own CouchDB? How are we going to make sure that they're comfortable with the administration of this project? And I guess what I would say is a couple of things. One, FoundationDB is some of the best tested software I've ever seen. And that testing is not just about sort of the static, you know, details of unit and integration and system testing; it's also introducing and injecting all kinds of random faults into running clusters and ensuring that the system ultimately, you know, never sacrifices its ACID semantics and gets itself back up into a running state. It does make some different trade-offs, you know; it will take itself offline if there is a chance of, you know, data corruption, and, you know, it makes that particular trade-off in the CAP triangle. But we, you know, we looked at it not just from the perspective of the functional capabilities that it was providing in terms of key-value transactions, but also in terms of whether the project had the same commitment to, you know, data safety, data protection, data durability that users have come to expect from CouchDB. And I think there's a really good match there. And, you know, I would say that our focus so far has been on the development work to bring over all of the capabilities of CouchDB on this, you know, like, architectural foundation of FoundationDB. But we recognize that, you know, in the run-up to the 4.0 release, one of the things we're going to have to do is make sure that there's a familiar set of operational tooling for our existing users, so that even though they're running a different distributed storage engine under the hood, you know, they're able to sort of manage it in a way that makes sense to them. So yeah, I mean, it's clearly part of what we're thinking about in terms of the must-haves for the 4.0 release.
Tobias Macey
0:31:22
And another challenge that I'm curious how you're handling is the migration path for people who have an existing installation. They've already got possibly sizable amounts of data in their existing clusters, and they need to be able to move that to this new deployment with FoundationDB as the layer that's actually handling that storage.
Adam Kocoloski
0:31:42
That's right. So as a project, we have not typically tried too hard to do in-place major version migrations. We didn't have that from 1.0 to 2.0. We do anticipate that being something people can do from 2.0 to 3.0. But we think, you know, as a posture, saying that jumping from 3.0 to 4.0 requires a data migration is something that I think people will find acceptable. We do have plans in place to optimize the data replication capabilities specifically for this cutover from 3.0 to 4.0.
Tobias Macey
0:32:16
And I think that's a smart allocation of resources on our part, because anything we do that makes data replication faster, more resource efficient, more reliable, is something that helps out with a whole bunch of use cases, not just the 3.0-to-4.0 data migration. And so for somebody who is moving, would it just be a matter of deploying a new set of instances, joining them to a cluster running 3.0, and then letting all of the data replicate, and then phasing out the 3.0 cluster as you hit certain points of the different shards being replicated?
Adam Kocoloski
0:32:48
Yeah, we would look at it not so much as joining to the existing cluster, but just standing up a cluster next door and setting up a replication job to sync the data from the old cluster to the new cluster, and then having a load balancer over top to repoint the endpoints to the new cluster.
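That cutover might look something like the following sketch: a persistent, continuous replication document on the new deployment pulls from the old cluster until traffic is repointed. The cluster URLs, credentials, and database names are hypothetical placeholders, and the monitoring call at the end is just one way to watch the job.

```python
# Sketch of an old-to-new-cluster migration using a persistent _replicator doc.
# Cluster URLs, credentials, and database names are hypothetical placeholders.
import requests

OLD_DB = "https://user:secret@couch3.example.com:5984/orders"
NEW = "https://user:secret@couch4.example.com:5984"

requests.put(
    f"{NEW}/_replicator/migrate-orders",
    json={
        "source": OLD_DB,
        "target": f"{NEW}/orders",
        "create_target": True,
        "continuous": True,
    },
)

# Monitor progress; once caught up, repoint the load balancer at the new
# cluster and delete the replication document.
print(requests.get(f"{NEW}/_scheduler/docs/_replicator/migrate-orders").json())
```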
Tobias Macey
0:33:00
And then as far as the actual work of migrating the codebase and integrating with FoundationDB and figuring out what the separation of concerns are, what have been some of the biggest challenges and some of the most interesting experiences that you and the other developers have had?
Adam Kocoloski
0:33:18
If I go with challenges first, one of the things FoundationDB has done a good job of is being very explicit about the kinds of limits that are in place, the kinds of key sizes, value sizes, transaction durations, and so on that are supported by the engine. In CouchDB, we've been a little bit more lax on that sort of thing. And so it has forced us to kind of come face to face with some of our relaxed postures around, well, how big of a document could you store in the database? Or how long could that request actually last? And, you know, it's frankly been, I think, a conversation that we needed to have as a community, because at some point you just end up in a situation where you sort of say, well, I guess we didn't explicitly say you couldn't do this, you know, 10 gigabyte document, but as a practical matter, it's not going to be a very good experience for you. Now we're getting much more explicit about the kinds of limits that, you know, are the sort of safe operating envelope for users of CouchDB. But it's never pleasant to introduce restrictions where there weren't any before, right, because inevitably you end up breaking some edge case, you know, in the user base. Um, so that's been, you know, just something that's occupied a lot of time and energy on our part, as we, you know, go through the audits to see, like, okay, where can we put this limit, and who can we work with to make sure that we, you know, bring all of our users forward as we get ourselves onto sort of a stronger footing overall. I think on the benefit side, I would say, one, the FoundationDB user community and developer community is incredibly sort of talented and helpful. You know, we've been amazed every time we reach out with any sort of design question, we get this, like, three-paragraph answer that anticipates the other thing we hadn't yet thought of that we were going to run into, you know, the next time we came back to them. So that's been super, super helpful. And I think as people have gotten deeper into the implementation of the CouchDB functionality on FoundationDB, I think there's just been a growing awareness of and appreciation for all of the powers of transactions inside CouchDB. People are starting to come up with new ideas for ways that, you know, that underlying capability can make our lives better as database developers.
Tobias Macey
0:35:29
And in terms of once this migration is complete, and people are running the 4.0 release of CouchDB with the underlying FoundationDB engine, what are some of the new use cases that this might enable, or some new considerations that people designing their applications on top of CouchDB will need to think about, or any thoughts in terms of exposing the underlying FoundationDB engine as another interface for being able to interact with the data stored within CouchDB?
Adam Kocoloski
0:36:03
That is an excellent topic. So we have, as a community, said, look, our goal here first and foremost is to try to look at all of the different parts of the existing API and make sure that this is a relatively non-disruptive upgrade for our existing user base. That being said, it's incredibly tempting to look at some of the things that we can't do today in the CouchDB API that we might be able to do in the future with the FoundationDB storage layer. And, you know, we have started having some of those conversations in the community that say, what could we do from a transactional perspective, you know, now that we have this underlying, strictly serializable storage layer that scales across multiple machines, what could we do that, you know, we don't do today? There's nothing committed on that front right now, but I think more and more people are appreciating that, like, there's a really interesting opportunity there. You also talked about actually exposing the FoundationDB API. And I think that's a place where a lot of people go, they kind of look at it and ask, is this like a multi-model kind of database, where I can use the document API on the one side and a graph API on the other side? I think what you find is that anytime you're working with FoundationDB, when you're building a layer on top, if your goal is to build the best document layer possible, you're going to make a number of design decisions that will be in tension with something that says, hey, I'm trying to build a multi-model database, therefore I'm going to create this completely generic data model under the hood that allows it to be accessed from different types of query languages and different types of APIs. So I've looked at it and said, you know, I think the right way to think about this is FoundationDB as an internal implementation detail, as a tool in our tool belt, under the hood, that allows us to deliver a great document database. And if over time someone else should say, hey, I want to build a great graph layer over top of FoundationDB, that makes the FoundationDB project better. But I'm not a huge proponent of thinking that, like, FoundationDB is the right level of abstraction to allow people to have a common data model that can be accessed by different types of APIs and different types of query languages. I guess the last piece you said was, well, what about the FoundationDB API directly? And that was an interesting one, too. If you look at FDB today, it's got this kind of very tight coupling between the client layer and the server layer. That is, you know, a very different experience than, say, using CouchDB's REST API. The clients need to be kind of participating in the upgrade of the server, and they kind of need to be very cognizant of the version of the server that's running, and there's all kinds of details that go into that. There is some talk in that world around introducing a more stable API, a gRPC API or something of that nature, that will allow for that kind of more direct exposure of FoundationDB. But that's not something that we as a project are really looking at. I think our view in CouchDB is that CouchDB is the kind of database that can be exposed directly as a cloud service, that can be exposed, you know, powering web applications, that has the kind of security model, the kind of access control model, that people need to have there. And FDB, frankly, just works at a lower level than that.
Tobias Macey
0:39:07
And one of the interesting things about using FoundationDB as the storage layer for CouchDB is the fact that, because it has this capability of implementing different types of data storage engines or databases on top of FoundationDB, you could have a common cluster of the FoundationDB engine with multiple different use cases on top of it. And I'm wondering what you have seen as the viability of that, or if it would just cause too much conflict in terms of the operational characteristics and what the different use cases are trying to optimize for in terms of compute versus memory versus network, etc.
Adam Kocoloski
0:39:51
Yeah, that's something that we've turned over in our heads a couple of times here. I would say we have seen users of FoundationDB at scale embrace a single FoundationDB cluster for a number of use cases. But those are situations where it's kind of one development team, one SRE team, that's cognizant of the different use cases, and they look at it as a, you know, a scalability improvement and an efficiency improvement for them to have one storage layer powering these different microservices with different data models. When you get a little further afield, and the workloads really are very distinct from one another, they're not microservices powering one solution, but they really are, like, totally different databases designed for totally different user communities, that's the point where I would probably caution and say that the benefits of consolidating the storage layer onto a single instance of FoundationDB probably are not worth the potential for, you know, clashes and mismatches, like impedance mismatches, in terms of the resource requirements of those different databases. Nevertheless, I think, as, you know, an organization thinks about developing an expertise in the FoundationDB layer, I think there's still a ton of benefit for an organization that says, hey, we'd like to run the document layer of FoundationDB, the CouchDB layer, and the graph layer; we have different users that want to do that. And we see benefits in an organization saying, we've got one operational team that runs the FoundationDB system reliably and at scale. I think that's something that makes a ton of sense, because I do think that these different data models can still be translated at that level of abstraction of FoundationDB into a set of, you know, a transactional throughput, a key-value read throughput, a write throughput, and an amount of data stored. And, you know, that kind of translation into those more low-level resources is one that I think provides a lot of operational benefits and a lot of capacity planning benefits for an organization. So I guess that's a long-winded way of saying I would stop short of running multiple disparate workloads on the same FoundationDB cluster, but there are still a ton of benefits from different workloads all compiling down, as it were, to the common abstraction layer of FoundationDB itself.
Tobias Macey
0:42:04
And another interesting thing to explore is, with CouchDB, there are different use cases that are obviously well served by the way it's designed, but what are some of the cases where it's the wrong choice, where it might at first glance look like it's useful because it's a document engine and you want to be able to store documents, but you would actually be better served using something else, whether it's a different document database or going with more of a relational use case or some other type of data model?
Adam Kocoloski
0:42:33
I think the number one that we see today is cases where you have very short-lived data. You know, CouchDB spends a decent amount of energy on ensuring that data written into a database can be replicated to any peer at any point in time, and so there's a fairly long-lived set of metadata around each record that you put into CouchDB. And if you're intending to use it as, you know, a place where the record only has to live for a short period of time and then you want to completely purge it, that will very much be at odds with all of the metadata tracking that says, well, you know, we have to prepare for the case that some replication peer comes in a month later, and we want to make sure that they know that this document existed. These tombstones that we keep around, we keep around forever today. And I think, you know, as a community we can debate the merits of that, and maybe we enhance things down the line to allow people to configure databases that are optimized for that short-lived use case and don't care about multi-region replication of that particular class of data. But as it stands today, if you have short-lived data, CouchDB is an expensive proposition for that use case. When it comes to, you know, sort of the relational side of things, I think that detailed ad hoc analytics are something that CouchDB is not particularly well suited for; it's true of other document databases in general as well. I think it's well suited to delivering low-latency responses to a set of queries that power a web application. It's not so well suited for ad hoc exploration, where you need all of the data from a certain set of fields and then you need to go kind of compute a bunch of aggregations on the fly and, you know, merge that in with another data set. The relational model shines at being able to adequately serve unanticipated queries, right; document databases don't do as well on that front.
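To make the tombstone point concrete, the sketch below deletes a document and shows that a deletion stub survives and keeps flowing through the changes feed, which is what lets a future replication peer learn the document ever existed; the instance, credentials, and ids are hypothetical.

```python
# Sketch: a deleted document leaves a tombstone that replication peers can see.
# Host, credentials, and ids are hypothetical.
import requests

DB = "http://admin:secret@localhost:5984/sessions"

doc = requests.put(f"{DB}/session-1", json={"user": "alice"}).json()
requests.delete(f"{DB}/session-1", params={"rev": doc["rev"]})

# The stub still appears in the changes feed, flagged as deleted, so a peer
# that replicates much later still learns about the deletion.
changes = requests.get(f"{DB}/_changes").json()["results"]
print([c for c in changes if c.get("deleted")])
```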
Tobias Macey
0:44:18
And what are some of the most interesting or unexpected or innovative ways that you've seen CouchDB used?
Adam Kocoloski
0:44:24
I think we're always heartened by the kinds of use cases that involve bringing workers out into the field. Like, we've had insurance companies that had CouchDB-powered devices to go out and perform, you know, claims analysis in the aftermath of a natural disaster. You know, one of our collaborators in the community built an application for healthcare workers responding to the Ebola crisis, where you could go through a day without any kind of connectivity, but you wanted to enable accurate record keeping as you were going out on tour, but you still needed to roll that back up into a complete view of the state of, you know, the response, so that you can make the right decisions about where to allocate resources going forward. Those kinds of use cases have been an awesome fit for CouchDB. And it's just really, I guess, it's not unexpected, I mean, that's the kind of thing we anticipated, but it's always great to see people kind of choose the right technology for the job and tackle something that is quite far afield from kind of us sitting in the major tech hubs of, you know, developed countries and building out more enterprise, you know, SaaS applications. So those kinds of things have always been really heartening for us to see. And, you know, there's turned into something of a trend there. I think we're also now seeing quite a lot of uptake in, you know, retail environments, where they're starting to look at ways to have a more distributed view of the datasets at their different, you know, warehouses, shipping centers, points of presence, retail environments, and they're starting to look at the kind of replication mesh that you can get from something like CouchDB and saying that that's actually probably a really reasonable fit for us, and has, you know, material improvements for, like, the availability of our system, for the independence of the stores. It's just, you know, our world in retail is distributed, and why shouldn't our database be distributed?
Tobias Macey
0:46:07
And one other thing that I was just thinking about: we discussed a little bit about the changes in the operational characteristics once this shift to FoundationDB is complete. But are there going to be any changes in the client interface, where there will be some exposure of things like the transactionality of the underlying data? Or do you anticipate that the user-facing interface is going to remain largely unchanged?
Adam Kocoloski
0:46:33
Our goal has been to make sure that this work in 4.0 is, you know, an upgrade of the experience for our users. So our first focus is making sure we don't break existing user applications and that we give everybody a path forward. We are super excited about exposing some more of those rich semantics up through the CouchDB API, and we've started those discussions about what it would look like to expose transactions in a database that is fundamentally asking you to make an HTTP request for every interaction with the data store. So I look at that as a pretty interesting design opportunity, frankly, and I think there's a lot that we can do there that gives people transactional semantics in a cloud native world without, you know, sort of breaking it all down into SQL. But we kind of have to keep our focus, and I think our goal right now is to use FoundationDB to eliminate the eventual consistency of a cluster and to improve the scalability of our systems, with an eye towards surfacing, in future releases, more new semantics that take advantage of the transactional capabilities that we'll have under the hood.
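To illustrate the design tension being described, here is a minimal sketch of how writes look against today's API, where every interaction is an independent, stateless HTTP request. The database names and documents are hypothetical, and _bulk_docs is shown only as the existing batching mechanism in CouchDB 2.x/3.x, not as a transaction.

```python
# Minimal sketch (my illustration, not from the episode): with today's API
# there is no way to group the two related writes below into one atomic unit.
# Assumes a hypothetical CouchDB at localhost:5984 with the databases created.
import requests

COUCH = "http://admin:admin@localhost:5984"

# Two logically related writes are two independent requests: if the second
# one fails, the first has already been committed.
requests.put(f"{COUCH}/orders/order-42", json={"status": "placed"})
requests.put(f"{COUCH}/inventory/sku-7", json={"reserved": 1})

# _bulk_docs packs several writes into a single HTTP request, but it is a
# batching convenience, not a transaction: each document succeeds or fails on
# its own. Exposing real transactional semantics over HTTP is the open design
# question discussed above.
resp = requests.post(f"{COUCH}/orders/_bulk_docs",
                     json={"docs": [{"_id": "order-43", "status": "placed"},
                                    {"_id": "order-44", "status": "placed"}]})
print(resp.json())  # a per-document list of results
```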
Tobias Macey
0:47:35
And what have been some of the most interesting or challenging lessons that you have learned in the process of working with the CouchDB project and community and being involved with it for so long?
Adam Kocoloski
0:47:46
I would say that over the past decade, I've just gained an increasing appreciation for all of the realities of bringing a database to production as a user. You know, writing the actual code that manages the transactions is just a small part of the overall picture. We've put a lot more effort in recent years into making sure our users understand how to deploy and manage and monitor the database, and we've just been a lot more explicit about the way it works and the way it can be used. I would say my experience on that front has led me to place more and more of a premium on documenting our design decisions, ensuring that we have full documentation of what we're doing when we're doing it. Because as our user base has grown and matured, we've gained a greater appreciation for how much we can make their lives better up front with all of those sort of non-code assets that go along with the evolution of the project. And I think that's actually a really nice echo of some of the things that go on in the Apache Software Foundation. People sometimes think of committers to an open source project as the folks who are fundamentally changing all the core data structures on disk, but in fact we invite committers of all shapes and sizes, a lot of people who don't program in Erlang, and they make contributions in all the other facets of the project that make it something that's more consumable and more reliable and more approachable for our users. And I think as you look at the project's evolution over time, more and more of our investment has gone into all of those other surrounding assets that make it a more well-rounded, more approachable project as a whole.
Tobias Macey
0:49:34
And circling back, one other thing I forgot to ask about: because this is such a drastic change in the implementation and architecture of CouchDB, there's definitely a lot of opportunity for introducing new bugs or regressions. And I'm curious what your approach has been to handling the testing and validation of the functionality of CouchDB to ensure that you don't have too many breaking changes.
Adam Kocoloski
0:50:01
Yeah, I mean, I don't know that we have any sort of special sauce there. We've got a robust test suite that we've maintained over time, and because we're committing to largely preserving the existing API, the existing test suite is something that we expect to be able to run unadulterated. We've also got a good performance engineering team at IBM who's looking at all of the other kinds of scalability issues and chasing down any cases where the new system may be a regression from the old one in some particular aspect. The Erlang community has done a lot of work in things like property-based testing, you know, kind of really sophisticated types of fuzzing to introduce all kinds of unexpected inputs and see how the system responds, and we have some people who have done some work on that front as well. Usually it's about testing internal data structures more than testing external APIs, because it is quite expensive to generate this kind of huge space of test cases programmatically, but I feel like there's maybe some opportunity for that as well. And I guess we also just have the benefit of having, within IBM, a large and diverse user base of people who are friendlies and are willing to work with us on this journey because they appreciate the benefits that we're bringing to bear. And so we've been able to lean on them early and often to help suss out where anything might be happening here that's unexpected.
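As a rough illustration of the property-based testing idea, here is a minimal sketch in Python using the hypothesis library rather than the Erlang tooling (PropEr/QuickCheck-style) that the community actually uses. The property and the document generator are invented for illustration and are not taken from CouchDB's test suite.

```python
# Minimal sketch of property-based testing: the framework generates hundreds
# of unexpected, automatically-shrunk inputs, which is the "sophisticated
# fuzzing" flavour described above. The example property is that any
# JSON-shaped document survives an encode/decode round trip.
import json

from hypothesis import given, strategies as st

# Generate arbitrary JSON-like documents: nested lists/dicts of simple values.
json_values = st.recursive(
    st.none() | st.booleans() | st.integers()
    | st.floats(allow_nan=False) | st.text(),
    lambda children: st.lists(children) | st.dictionaries(st.text(), children),
    max_leaves=20,
)

@given(json_values)
def test_document_round_trip(doc):
    # If any generated document breaks the invariant, hypothesis reports a
    # minimal failing example instead of a hand-written test case.
    assert json.loads(json.dumps(doc)) == doc

if __name__ == "__main__":
    test_document_round_trip()  # hypothesis drives the generated cases
    print("property held for all generated documents")
```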
Tobias Macey
0:51:19
Are there any other aspects of the CouchDB project and its ecosystem and community, or FoundationDB, or the work that you're doing to re-platform onto the FoundationDB engine, that we didn't discuss that you'd like to cover before we close out the show?
Adam Kocoloski
0:51:34
No, I think we've covered a whole bunch of really interesting geeky aspects of building databases. I think that this approach of kind of a separation of concerns, and the API that FoundationDB offers, is an incredibly powerful tool in the data service developer's toolkit, and I'm excited to see more people adopt that line of thinking. I think, as a database community as a whole, we'd be better off for it. So frankly, I'm just, you know, jazzed about the project and where it's headed. Thanks.
Tobias Macey
0:52:00
Well, for anybody who wants to follow along with the work that you're doing or get in touch, I'll have you add your preferred contact information to the show notes. And as a final question, I would just like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.
Adam Kocoloski
0:52:16
That is an excellent question. When I think of data management a little more broadly, I think that we've got a ton of people who are struggling to gain as much insight as they can from the data that they've collected, and a lot of that comes down to the availability of good data catalogs that are integrated into the entire workflow. Like, we've got all these clients who have databases scattered across the organization, and they know that in order to build a machine learning model off the data they're collecting from their application, they're going to need to do a whole bunch of data engineering, right? Data curation and transformation to get it into a shape where it's ready for the training of the model. At every step in that journey, you're creating more data, and you don't always understand how those datasets relate to one another; you don't always understand, when there's some correction in the source data, what types of downstream effects that has. And so I feel like we spend an inordinate amount of time and energy manually curating all of these data sets because of the lack of capture of lineage information and how they relate to one another. And it's a super hard problem, unless you're operating at a scale where you have ruthless rules about how data gets processed, because there's no other way to do it than to use the one standard tool. Most of our organizations aren't like that; people use whatever tools they want, but you end up with this explosion of datasets and very little idea about how they all relate to one another. So when I see a gap in tooling or technology, for me it comes down to this metadata management at scale. It just becomes a huge problem for everybody.
Tobias Macey
0:53:52
Yeah, that's definitely a common theme that has come up a lot of times in my conversations, and increasingly frequently in recent episodes and over the past year or so. Well, thank you very much for taking the time to join me and share your experience of the work that you've done with CouchDB and the work that's ongoing to re-platform it onto FoundationDB. It's definitely an interesting database and product, and it's an interesting undertaking for something that has been in production for so long. So I appreciate you taking the time to share your insight and experience, and I hope you enjoy the rest of your day.
Adam Kocoloski
0:54:31
Thank you, Tobias, you as well.
Tobias Macey
0:54:38
Thank you for listening. Don't forget to check out our other show, Podcast.__init__ at pythonpodcast.com, to learn about the Python language, its community, and the innovative ways it is being used. And visit the site at dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it. Email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on iTunes and tell your friends and coworkers.