Difference between all_nodes and cluster_nodes in CouchDB 2.x cluster membership

couchdb

The _membership endpoint in CouchDB (http://user:pass@domain:5985/_membership) returns the following JSON:

{
  "all_nodes": [
    "couchdb@n1.domain.me",
    "couchdb@n2.domain.me",
    "couchdb@n3.domain.me"
  ],
  "cluster_nodes": [
    "couchdb@n1.domain.me",
    "couchdb@n2.domain.me",
    "couchdb@n3.domain.me"
  ]
}

When would all_nodes NOT be the same as cluster_nodes?

I managed to set up a cluster incorrectly when building CouchDB from source – there were 3 nodes in 'all_nodes', but none in 'cluster_nodes'.

An answer points to the Server Fault question "BigCouch all_nodes vs cluster_nodes", where the author, using BigCouch, found that _membership returned a list of cluster_nodes but nothing in all_nodes. That is similar to my question, although it is the opposite of what I saw, where only all_nodes had values.

I don't understand the answer:

I just heard from Robert Newson on IRC that BigCouch nodes are connected lazily.

Because:

  1. I thought I read somewhere in the CouchDB documentation that nodes need to be online at the time they are added to a cluster; and
  2. This is at best one reason why there may be differences in the all_nodes and cluster_nodes results.

I'm effectively asking why there are 2 fields for what seems to be the same thing in CouchDB 2.x? My (very limited) experience is that a cluster that shows nodes in the all_nodes field but not in the cluster_nodes field doesn't work.

It's either not an answer to this question, or so far above my level that I don't understand it.

Best Answer

Your question has possibly been answered (in part) in the Server Fault Q & A:

BigCouch all_nodes vs cluster_nodes

The official documentation states that when you add a node to a CouchDB cluster, querying _membership returns a list of all_nodes and cluster_nodes. It gives no indication that the two lists may differ, apart from this remark:

to see the name of the node and all the nodes it knows about and are connected too.

...which slightly implies that there may be differences.

The linked Server Fault answer refers to Robert Newson, who is a developer of CouchDB:

I just heard from Robert Newson on IRC that BigCouch nodes are connected lazily.

I guess that is quite an authoritative answer in that case.

When you request the _membership page, CouchDB might have connected all the nodes, or it might not have. This would explain why you noted differences in the all_nodes and cluster_nodes listings.


In your specific case, where all_nodes contains more entries than cluster_nodes, the cause is explained in the official documentation:

  • all_nodes are all the nodes thats this node knows about.
  • cluster_nodes are the nodes that are connected to this node.

You had a situation where the node you were querying was aware of other nodes (all_nodes), but none of them were part of your cluster (cluster_nodes).
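
To spot such a mismatch quickly, the two lists can be compared programmatically. The following is only a minimal sketch, assuming Python's standard library and a _membership response shaped like the one in the question. The function names fetch_membership and membership_diff are hypothetical helpers, not part of CouchDB, and the host and credentials are the placeholders from the question.

```python
import base64
import json
import urllib.request


def fetch_membership(base_url, user, password):
    """GET <base_url>/_membership with HTTP basic auth and return the parsed JSON."""
    request = urllib.request.Request(base_url.rstrip("/") + "/_membership")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def membership_diff(membership):
    """Return the nodes that appear in only one of the two lists."""
    all_nodes = set(membership.get("all_nodes", []))
    cluster_nodes = set(membership.get("cluster_nodes", []))
    return {
        "only_in_all_nodes": sorted(all_nodes - cluster_nodes),
        "only_in_cluster_nodes": sorted(cluster_nodes - all_nodes),
    }


# The situation described in the question: three nodes in all_nodes,
# none in cluster_nodes.
sample = {
    "all_nodes": [
        "couchdb@n1.domain.me",
        "couchdb@n2.domain.me",
        "couchdb@n3.domain.me",
    ],
    "cluster_nodes": [],
}
print(membership_diff(sample))
```

Against a live node, the same comparison would be membership_diff(fetch_membership("http://domain:5985", "user", "pass")). Two empty lists in the result mean both fields agree. Any entry under only_in_all_nodes corresponds to the situation described above.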


It seems that only the developers or the community on GitHub could answer your question completely. Consider asking there, or opening an issue on the apache/couchdb GitHub page.