Which allows for faster queries in a graph database: a node with multiple properties, or a node that points to multiple nodes?

database-design database-recommendation graph-dbms neo4j

I just started using Neo4j and I don't know when it'd be best to add properties to a node vs when to create more nodes. I'd assume for simple things like x and y coordinate data, it's better to use node properties. Then for stuff like user information it's better to use additional nodes.

Is there a threshold where it becomes more efficient to use one or the other? For instance, if none of the properties in question have any children, is it always better to use node properties rather than additional nodes? Is there some cutoff, such as 10 or more properties, beyond which it's better to use multiple nodes?

Here's the specific example I'm working with, in case it helps anyone understand my question. Users post blocks, which in turn have a type, an x coordinate, a y coordinate, a destination URL, and a number of upvotes. The left graph uses a single node for the block with multiple properties, and the right graph uses a new node for each property.
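To make the two layouts concrete, here is a rough sketch of them as plain Python dictionaries. This is only a toy stand-in, not how Neo4j stores anything; the relationship names (`HAS_TYPE`, `HAS_X`, and so on) and all of the values are made up for illustration.

```python
# Toy in-memory stand-ins for the two layouts; not Neo4j's storage format.
# All names and values below are hypothetical.

# Layout 1: one Block node carrying every value as a property.
block_with_properties = {
    "label": "Block",
    "properties": {
        "type": "link",
        "x": 10,
        "y": 20,
        "url": "https://example.com",
        "upvotes": 3,
    },
}

# Layout 2: a bare Block node with one relationship per value,
# each pointing at a separate node that holds a single value.
block_with_nodes = {
    "label": "Block",
    "properties": {},
    "relationships": {
        "HAS_TYPE": {"label": "Type", "properties": {"value": "link"}},
        "HAS_X": {"label": "X", "properties": {"value": 10}},
        "HAS_Y": {"label": "Y", "properties": {"value": 20}},
        "HAS_URL": {"label": "Url", "properties": {"value": "https://example.com"}},
        "HAS_UPVOTES": {"label": "Upvotes", "properties": {"value": 3}},
    },
}


def read_from_properties(block, key):
    """One step: look the value up on the node itself."""
    return block["properties"][key]


def read_from_nodes(block, rel_type):
    """Two steps: follow the relationship, then read the target node's value."""
    target = block["relationships"][rel_type]
    return target["properties"]["value"]


x_inline = read_from_properties(block_with_properties, "x")
x_via_node = read_from_nodes(block_with_nodes, "HAS_X")
```

Both reads return the same value; the second layout just takes an extra hop to get there.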

Best Answer

With the given information, I find it difficult to give you a simple answer. As you have discovered, data can be hierarchical with many different logical groupings that could work.

Here is a checklist of questions you can work through to get a better answer for yourself:

  1. What is the full set of queries you will want to run? (Among your schema options, which ones minimize the number of times you go to disk to retrieve data for that set of queries?)
  2. How will your data set scale? (Among your schema choices, which ones scale gracefully, minimizing the number of fat nodes or other degenerate structures that will not scale well?)
  3. What kind of churn do you expect? How much are you willing to slow writes in order to accelerate reads?
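As a rough illustration of the first two questions, the sketch below builds both candidate schemas in memory and counts how many nodes one query ("all fields of every block a user posted") has to touch in each. It is a toy cost model under my own assumptions, not a description of Neo4j's storage engine: the `Node` class, the helper names, and the sample values are all hypothetical. On a real database you would measure instead, for example by prefixing the Cypher query with `PROFILE` and comparing the reported db hits.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)
    out: list = field(default_factory=list)  # outgoing neighbour nodes


# Hypothetical sample values for one block.
FIELDS = {"type": "link", "x": 10, "y": 20, "url": "https://example.com", "upvotes": 3}


def build_property_schema(n_blocks):
    """User -> Block, with every field stored as a property on the Block."""
    user = Node("User")
    user.out = [Node("Block", dict(FIELDS)) for _ in range(n_blocks)]
    return user


def build_node_schema(n_blocks):
    """User -> Block -> one extra node per field."""
    user = Node("User")
    for _ in range(n_blocks):
        block = Node("Block")
        block.out = [Node(name.capitalize(), {"value": value}) for name, value in FIELDS.items()]
        user.out.append(block)
    return user


def count_node_visits(start, depth):
    """Count nodes touched when expanding `depth` hops out from `start`."""
    visited = 1
    if depth > 0:
        for neighbour in start.out:
            visited += count_node_visits(neighbour, depth - 1)
    return visited


# The property schema answers the query in 1 hop (User -> Block);
# the node schema needs 2 hops (User -> Block -> value node).
visits_properties = count_node_visits(build_property_schema(100), depth=1)
visits_nodes = count_node_visits(build_node_schema(100), depth=2)
```

In this toy model, 100 blocks cost 101 node visits with properties and 601 with one node per field, and the gap grows linearly with the number of blocks. That says nothing about write cost or about queries that filter on a shared value, which is where the third question comes in.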

I hope this helps.