Yet another product blog that doesn't manage to prominently link to the product.
If you want to make my life a tiny bit easier:
- blog.<project>.org needs a link to <project>.org because half of your visitors want to read the "it" before they read the "news".
- github.com/<you>/<Transmogrifier>For<ThatProject> should have a link to <ThatProject> in the description or the readme's first paragraph. I often come across interesting plugins without knowing anything about the projects they are for.
(and yes, I know there are actual problems in the world. But these are easy to solve)
So true.
Half of the time when I find myself on an engineering blog for a startup, I am curious about their product but there is no link to their main page. Clicking on the logo redirects me to blog.<startup>.com
As a low-profile developer who's been using CouchDB for a long time, I wrote up some quick personal opinions a few weeks ago on what CouchDB has become: http://fiatjaf.on.flowi.es/about-couchdb/
Early CouchDB contributor here. I agree with much of your write up, but I think the picture for apps built around replication is brighter than it seems. PouchDB has crazy momentum. Couchbase Mobile is getting baked into the next generation of infrastructure at places like GE and big airlines.
If you want filtered replication, we designed Couchbase Sync Gateway because we thought db-per-user was too heavy. What's fun is to think about the options for mix-and-match across the stack.
That's what keeps me liking the thing. My point was just that it could have been so much better (not that I know HOW that would be, but it was the general idea of everyone involved in the past, it seems).
It would be cool to read more about where "replication apps" are being used and how.
I'm sad that the Couchapp thing never took off or is getting de-emphasized. That was one of the really brain-bendy ideas about Couch that I loved, that these DB-side applications could also replicate to other people.
I'm only just catching up with Couch and finding the db-per-user stuff. Do you know of anyone shipping a Couch instance inside thick clients, like a traditional desktop application?
I guess CouchDB is not the easiest thing to install on a non-technical person's computer, but PouchDB, on the other hand, is as easy as it can get.
I don't understand how projects like remotestorage, for example, aren't using Pouch and Couch and instead try to develop their own replication protocol without success for many years.
My thoughts exactly. I'm not sure who CouchDB is supposed to be for anymore. The original developers who fell in love with it are moving on. Who is the audience for 2.0? Who does IBM hope to reach with Cloudant?
BTW, you are spot on about continuous replication. Try using continuous replication on Cloudant and you will end up with a fat bill.
There are a lot of use cases, particularly around multi-region/DC master-master replication, where CouchDB/Cloudant is a great fit. IBM also offers MongoDB, RethinkDB, Postgres as managed services and the MVCC/conflict model in Couch is still a standout feature. The other use case I see a lot of is offline-capable apps backed by Couch replication-compatible datastores (PouchDB, Cloudant Sync, Couchbase Lite).
Cloudant recently changed its pricing structure [1] (via IBM Bluemix) which should significantly reduce the cost of continuous replication - that may be worth a look if the current model is problematic.
One thing that keeps me on CouchDB is Cloudant's pricing. It just destroys most of the competition. For example, the only RethinkDB hosting I know of is Compose and their pricing is $22/gb. Cloudant is $1/gb.
Some environments push you toward using it, in the case of IBM for example. Their Bluemix offering is very powerful and fast, they have Cloudant running all over that thing. It becomes the best and fastest way to store files.
That doesn't mean it's the best technology but it got me using it. It has been an enjoyable enough experience, not a lot of complaints. A better querying interface and the addition of an update operation would go a long way but those both seem to be included with this release.
> The replication protocol, which supports multi-master, has changed little
Basically true, though with many interoperating, independent datastores it's a tricky thing to evolve. 2.0 adds an additional endpoint, _bulk_get which can significantly reduce the number of requests when CouchDB is paired with an on-device database such as PouchDB, Cloudant Sync or Couchbase Lite (the endpoint was inspired by the same feature in Couchbase). The CouchDB replicator itself has had many performance and stability improvements [1] and continues to be a significant focus for active development.
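For anyone who hasn't looked at _bulk_get yet, here is a rough sketch of the request body a client might POST to /{db}/_bulk_get in order to fetch several document revisions in one round trip. The document IDs and revision strings are made up for illustration, and the helper name is my own, not something from CouchDB or PouchDB.

```python
import json


def build_bulk_get_body(doc_revs):
    """Build a JSON-serializable body for POST /{db}/_bulk_get.

    doc_revs is an iterable of (doc_id, rev) pairs. A rev of None
    means no specific revision is requested for that document.
    """
    docs = []
    for doc_id, rev in doc_revs:
        entry = {"id": doc_id}
        if rev is not None:
            entry["rev"] = rev
        docs.append(entry)
    return {"docs": docs}


# Hypothetical IDs and revisions, purely for illustration.
wanted = [("user:alice", "2-abc"), ("user:bob", None)]
bulk_get_body = build_bulk_get_body(wanted)
print(json.dumps(bulk_get_body))
```

The point is only that the whole batch travels in a single request, instead of one request per document.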
Also, CouchDB 2.0 introduces internal cluster replication using distributed Erlang. If you currently use CouchDB replication to bi-directionally replicate between machines on the same network for HA, replacing those with a CouchDB 2.0 cluster should be a big win.
> In other words: everybody seem to be looking at CouchDB as just a very poor and limited MongoDB.
Query doesn't pretend to be MongoDB-compatible - it provides a syntax that should be familiar to MongoDB users and more query flexibility than views allow. I think Query still has a fair way to go - this is the first release - but it's a move in the right direction.
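To give a feel for the syntax, here is a minimal sketch of a Query (Mango) request against the _find endpoint, built with Python's standard library. The database name "blog" and the field names are assumptions for the example, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_find_request(base_url, db, selector, fields=None, limit=25):
    """Construct (but do not send) a POST request to /{db}/_find."""
    query = {"selector": selector, "limit": limit}
    if fields:
        query["fields"] = fields
    data = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}/_find",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical database and fields: posts written after 2010.
find_request = build_find_request(
    "http://127.0.0.1:5984",
    "blog",
    {"type": "post", "year": {"$gt": 2010}},
    fields=["_id", "title"],
)
print(find_request.get_method(), find_request.full_url)
print(find_request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would actually run it against a live server; the selector is the part that will look familiar to MongoDB users.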
As to whether CouchDB is viewed as "a very poor and limited MongoDB", they are very different databases. CouchDB is a good choice if you want a rock-solid JSON datastore which comfortably scales up to multiple TBs / many machines, with multi-master replication over unreliable networks. Query support, as you say, is not as rich as some other databases, so if that's more important to you, there are probably better options.
> Filtered replication was implemented, but it is slow to the point that no one recommends that you use them.
The new _selector filter [2] in CouchDB 2.0 offers a significant performance improvement for filtered _changes. It should be a small change for replicators such as PouchDB to take advantage of this.
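As a sketch of what using the _selector filter looks like from a client, the helper below builds the URL and JSON body for a filtered _changes request. The database name and selector are hypothetical, and the helper itself is just illustration, not part of any replicator.

```python
import json
from urllib.parse import urlencode


def build_selector_changes_request(base_url, db, selector, since="0"):
    """Return (url, body) for POST /{db}/_changes?filter=_selector."""
    query = urlencode({"filter": "_selector", "since": since})
    url = f"{base_url}/{db}/_changes?{query}"
    body = json.dumps({"selector": selector}).encode("utf-8")
    return url, body


# Hypothetical: only follow changes to one user's documents.
changes_url, changes_body = build_selector_changes_request(
    "http://127.0.0.1:5984", "mydb", {"owner": "alice"}
)
print(changes_url)
print(changes_body.decode("utf-8"))
```

The selector is declarative JSON, so the server does not need to run a JavaScript filter function for every change, which is where the speedup over classic filtered replication comes from.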
> About Couchapps, the special database features that powered them in the first place were left aside
I don't speak for the project, but it seems there has been much debate about this in the CouchDB community and the conclusion was that there are better solutions to most Couchapp-shaped problems than running application logic in the database. The features that combine to enable Couchapps haven't gone away and will benefit from the general improvements in 2.0, but they haven't been explicitly developed.
Yes. Eventually consistent doesn't mean inconsistent; it means the system becomes consistent eventually. Under a network partition (the P of the CAP theorem), things should converge after the network recovers. This is precisely the type of semantics that Aphyr wrote Jepsen for. See his testing of Cassandra (eventually consistent), or the CockroachDB team (eventually consistent) or Riak (eventually consistent) running Jepsen to prove things work as expected.
On their site they do say however: "And we care a lot about your data. CouchDB has a fault-tolerant storage engine that puts the safety of your data first."
So, yeah, I think even if it is eventually consistent, and not immediately consistent, they want you to feel they have taken care of your data when designing the thing, so testing it seems prudent.
As a rule, yes. I'm not at all an expert on (or even all that familiar with) CouchDB, but if you make any guarantees at all about your distributed storage engine, then those guarantees are somehow testable.
Congrats to the CouchDB team! I remember playing around with writing a NodeJS driver for it all the way back in 2011. Sadly, I decided to use and make a driver for MongoDB instead simply because it was easier at the time. Despite that choice, I have always admired CouchDB for pushing the frontiers on stuff like Offline-First with PouchDB way ahead of its time (and also what inspired me a little bit for our Open Source Firebase alternative http://gun.js.org/ ).
I am really excited to see the announcement of the Mango query language, since querying Couch was one of the less easy things to do back then. I'm also very excited to hear about performance improvements, as I've been tracking various systems' performance while working on our own (Mongo with WiredTiger, Cassandra, Redis, and even Chrome's V8 engine as we have built towards 30M+ ops/sec, see https://github.com/amark/gun/wiki/100000-ops-sec-in-IE6-on-2... ). However, clicking through on the performance links didn't lead to any numbers or benchmarks. I would love to see that!
Really happy that Couch is on the homepage of Hacker News. I really feel like they made lots of correct decisions that got passed over in the NoSQL craze, and lately it has not been receiving the attention it should compared to (in my biased opinion) less favorable but hyped-up Master-Slave systems. People should really check out Couch's Master-Master replication!
Thank you both. I assumed that the messages indicated critical error which would mean that CouchDB was not in a running state but indeed the system is running and the admin interface is reachable at 127.0.0.1:5984/_utils/ where the setup can be completed.
No worries. Deciding it was time to play with CouchDB once again, I just spent the last few hours moving a little side-project I'm working on over from PostgreSQL to CouchDB 2.0. I liked CouchDB last time I tried it but then forgot about it for a while, and aside from that little bit of confusion, CouchDB 2.0 has held up to your motto -- "time to relax!".
Well, "moving over" might be a slight exaggeration seeing as how my little project is in such early inception that all I had prior to switching it over to CouchDB 2.0 was a few SQL-files defining the schema of various tables, as well as a couple of other SQL-files with queries to run and a bit of sample data. But anyway, the point still stands that said little project is now using CouchDB.
A few things I wanted out of CouchDB years back:
- faster bulk indexing
- space reduction, I think a simple couchdb to psql json was 1/10th of size
- ES6 or even ES5 - * it's been a few years since I last looked but I remember you had to tread carefully - Object.keys maybe was one?
When I started with CouchDB it was the wrong choice for so many reasons: the client had <30gb of data, couchdb was cooler than node.js, and I was frustrated with SQL Server. In hindsight, sticking with SQL Server or Postgresql would have been better - older/wiser today.
But I think the "logging" case is hard to argue vs. "manually" grepping (or ag-ing using the silver searcher) over in-place log files and aggregating/rendering dashboards via static files.
I used it to store user-uploaded images in a photo contest website years ago. It was a happy medium between storing images directly on the filesystem or as a blob/binary in a relational database. The image requests were AJAX, so I could include the client's screen dimensions in the request. My app would resize on the fly if necessary, store the resized image in CouchDB, and redirect the image request to CouchDB, which was publicly readable.
Another good use case I had was for JSONP request for an auto-complete input field on a webpage. Again, the database was publicly readable.
I have also used it to aggregate data for graphs. The data changed daily, so the ability to cache the results of a view until the data changed was nice. But the first request of data each day still took a while. I don't think I got a performance boost in this case, but I did get free caching.
All my other uses of CouchDB were mostly for fun and could've been implemented in traditional SQL.
Not using Couch directly at the moment, but have several projects I'm working on that are using Pouch as an easy to sync, offline-first data store. (These are largely targeting Cordova/Electron.)
That was a great book that got me into CouchDB. I think the original authors of that book, primarily Damien Katz and J. Chris Anderson, left for Couchbase. Today's CouchDB has different motives than the one that book was written for. Couch Apps are no longer the killer feature.
I don't "need" anything in a book, but it sure was handy to have a full, detailed breakdown with a lot of pictures when I read the first edition, without first having to install some software (CouchDB) that, at the time I bought the book, I did not know if I had an actual use for, and whose interface I did not want to spend time finding my way through just to find out.
The admin interface of 2.0 will probably be familiar since I've already learned my way around older releases. Nonetheless, if a second edition of the book covers 2.0, then the admin interface material from the first edition will obviously need to be updated.
So when I buy the next edition of the book, I will probably just skim through the admin interface stuff, but not all readers will have used a previous version of CouchDB, so to some of them, having that covered and up to date will be valuable like it was to me the first time I wanted to know about CouchDB.
So if you think I'm saying that I need the book to learn the admin interface, I think you did not understand what I meant, because that is not what I meant. I meant that I want to learn about mango and clustering, and that I also expect the next edition of the book to have the parts about the admin interface updated.
Ok, I don't think you meant you need the book just for the admin interface, I just made a jerky comment.
I was, however, under the impression that you actually needed a book for a lot of things and I didn't like that feeling.
However, now that you explained, I understand how good a good book can be. I also read that CouchDB book almost entirely before going on to use the database, before thinking about it, or having a need for it, and that did wonders for me. I had forgotten about this fact of my life. Thank you for reminding me.
It’s MongoDB inspired obviously, but at the time MongoDB asked IBM/Cloudant to not call it that. They settled on Mango. The Cloudant product then became “Cloudant Query”, so CouchDB easily inherited the Mango nickname. We like it :)
CouchDB 2.0 is a direct descendant of CouchDB 1.x. The API is 99% compatible, same community working on it, you can replicate between them and so on.
I haven't used Couchbase, but I understand that besides the "Couch" prefix and CouchDB's original author having worked there for a few years, it doesn't have much in common with the CouchDB project.
Well Couchbase Lite is largely compatible as far as I can attest. Not sure how it is with Couchbase proper, but I was under the assumption that it is API compatible to CBL.
I remember when CouchDB was initially released. It was one of the most promising and innovative DBs around. It's a shame it didn't live up to the hype.
I wouldn't say that it didn't live up to the hype. I think it just has a steeper learning curve than a lot of other NoSQL databases and is often overlooked.
MongoDB makes a lot of sense to folks already familiar with relational databases. Collections are like tables, documents are like rows, and it's easy to use document IDs to establish relations and perform queries like you would in a relational database. There are a lot of problems with using it like a RDBMS, but still a developer with zero knowledge of MongoDB can become productive with it very quickly.
When learning CouchDB, the first WTF moment is when you realize there are no tables and that all the documents regardless of the type of data they contain go in the same place. Eventually you learn that you can add a type field to the documents to distinguish between them, which does feel hackish. The next WTF moment is when you want to query the data and realize you have to use map-reduce to do what would seem trivial in any other database. The early version of the admin tool definitely didn't make this easy since it required writing a JavaScript function and escaping it so that it could be stored as a single-line JSON string. It also didn't help that the map-reduce code was stored in special documents that used magical document IDs to distinguish them from other documents, which again feels hackish but makes sense eventually.
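To make the type-field and design-document points concrete, here is a small sketch of what such a "special document" can look like when built from Python. The design document name, view name, and document fields are invented for the example; the JavaScript map function is simply carried as a string, and json.dumps takes care of the escaping that the early admin tool made you do by hand.

```python
import json

# JavaScript map function, stored in the design document as a plain string.
# It only emits documents whose hypothetical "type" field is "post".
MAP_POSTS_BY_AUTHOR = """
function (doc) {
  if (doc.type === "post") {
    emit(doc.author, null);
  }
}
""".strip()

# Design documents are ordinary documents whose _id starts with "_design/".
design_doc = {
    "_id": "_design/posts",
    "views": {
        "by_author": {"map": MAP_POSTS_BY_AUTHOR},
    },
}

# Serializing escapes the newlines and quotes inside the function source.
encoded_design_doc = json.dumps(design_doc)
print(encoded_design_doc)
```

Once a document like this is saved, the view would be queried under /{db}/_design/posts/_view/by_author.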
That said, I am a big fan of CouchDB, and I hope that with the query language and new UI in 2.0 that CouchDB will earn the respect it deserves.
Yeah, MongoDB kind of dumbed down the whole 'rethinking db' attempt.
Compared to how much attention and how many developers other, much simpler DBs got, CouchDB definitely didn't get as much as it should and could have.
I am looking forward to seeing new features for sure.
Hype or not, I inherited a project that had CouchDB baked into the bones. While I found its disk space usage and its time to reindex large databases frustrating, I did appreciate REST-like queries, how it just saved whatever you wanted and how the map/reduce queries could actually be quite powerful for ETL and reporting (though it took a good while to index every time we had a change). We abandoned replication early on as just too chatty over high latency networks.
I love the CouchApp idea because it's a natural extension of "save whatever you want." As long as we're saving json, why not save static files too? It lets you prototype quickly and, in our case, let us develop and test out specific features for clients by quarantining risky crap code on the fly into one database. I like to think we're beyond that stage, but at the time it was invaluable to keep multiple versions of our app running concurrently.
I have 3 solid years of experience working with CouchDB and in the end I find myself longing for PostgreSQL with one jsonb column for the unknown or "volatile" attributes. Our data was semi-relational, as I would argue most data is. Meaning there were honest to goodness has-many or belongs-to type queries that could have been simpler and easier to maintain outside the application code.
I know that's just a style, some people would use key-value stores for everything with no validation on anything. If you're not that far gone and you still like freedom then CouchDB might be for you. As for me, I wanted an error from my database if my application code asked for something or tried to store something invalid. But for rapid prototyping for us I don't think we could have gotten a better database.
> As for me, I wanted an error from my database if my application code asked for something or tried to store something invalid
It is not enough for actually relational data, but surely you know that CouchDB has validate_doc_update functions specifically designed to check user rights and document structure?
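For readers who haven't seen one, here is a minimal sketch of a design document carrying a validate_doc_update function, built from Python. The design document name and the rule it enforces (every document must have a string "type" field) are assumptions for illustration; the function is JavaScript held in a string and is run by CouchDB on each write to the database.

```python
import json

# JavaScript validation function; throwing {forbidden: ...} rejects the write.
VALIDATE_TYPE_FIELD = """
function (newDoc, oldDoc, userCtx, secObj) {
  if (newDoc._deleted) {
    return;
  }
  if (typeof newDoc.type !== "string") {
    throw({forbidden: "every document needs a string 'type' field"});
  }
}
""".strip()

# Hypothetical design document name.
validation_design_doc = {
    "_id": "_design/validation",
    "validate_doc_update": VALIDATE_TYPE_FIELD,
}

print(json.dumps(validation_design_doc, indent=2))
```

With something like this in place, an invalid write gets rejected with an error from the database rather than being stored silently, which is close to what the parent comment was asking for, at least for per-document checks.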
> , it was one of the most promising and innovative DB's around. It's a shame it didn't live up to the hype.
It promised master-to-master replication, and it has that. It doesn't lose your data (actually fsyncs, yay!). It has a helpful integrated web interface and an HTTP + JSON interface.
I don't know, I've shipped lots of products on it. I'd say it lived up to the hype pretty well...