Ultimately, it depends on the architecture that your machine has.
(Background) Nodes can only store data in their properties, which are held in a key-value store.
The value in each property is limited to Java primitives (ints, floats, etc.), strings, and arrays of primitives/strings.
Therefore, the maximum amount of data a particular property can hold is limited by the maximum size of a string or the maximum size of an array of strings (per node). For 32-bit machines this limit is 4 GB, though in practice it may be closer to 2-3 GB.
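As a quick sketch of what those allowed value types look like in practice (the label and property names here are made up for illustration):

```cypher
// Hypothetical node illustrating the allowed property value types:
// Java primitives, strings, and arrays of primitives/strings.
CREATE (n:Example {
  count:  42,               // integer primitive
  ratio:  0.5,              // float primitive
  active: true,             // boolean primitive
  title:  'a string value',
  tags:   ['a', 'b', 'c'],  // array of strings
  scores: [1, 2, 3]         // array of integers
})
```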
(Also, having said this, there was previously a bug that limited string size to 1 MB; I expect this has since been resolved.)
Of course, this raises the question of whether multiple properties could store more than 4 GB per node. Since the properties list is essentially a key-value store, I would expect the maximum size to be limited only by disk space and key selection; however, I can't find anything to confirm or deny this.
That doesn't definitively answer your question, but from what I understand you should be able to store large amounts of data per node (up to available disk space).
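The multiple-properties idea could be sketched like this (the `chunk_*` keys are hypothetical, and whether this actually gets around the per-property limit is exactly the open question above):

```cypher
// Hypothetical: spreading a large payload across several properties
// of a single node, one chunk per property key.
MERGE (d:Document {id: 'doc-1'})
SET d.chunk_0 = '...first part of the data...',
    d.chunk_1 = '...second part of the data...',
    d.chunk_2 = '...third part of the data...'
```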
How can I model this in a graph database like Neo4j?
Modeling a Neo4j graph from relational data is quite simple:
- Decide your vertices (nodes, objects) and edges (relationships).
- Convert the relational data to Cypher, declaring all items and all relationships explicitly.
Note: a mapping from relational to graph may take only selected entities from the relational model, and a single table row can explode into multiple vertices and multiple edges.
Is this the right way to structure this data in a graph database?
Yes, it looks OK. Assuming that file, author, company, user, and image are nodes, and date is only an attribute, this
file: 11425646.pdf
author: bob
company: abc co
date: 1/1/2011
mentioned_users: [alice,sue,mike,sally]
images: [1958.jpg,535.jpg,35735.jpg]
should convert to this
MERGE (f :File {name:'11425646.pdf', date:'1/1/2011'})
MERGE (a :Author {name:'bob'})
MERGE (c :Company {name:'abc co'})
MERGE (u1 :User {name:'alice'})
MERGE (u2 :User {name:'sue'})
MERGE (u3 :User {name:'mike'})
MERGE (u4 :User {name:'sally'})
MERGE (i1 :Image {name:'1958.jpg'})
MERGE (i2 :Image {name:'535.jpg'})
MERGE (i3 :Image {name:'35735.jpg'})
MERGE (f)-[:WRITTEN_BY]->(a)
MERGE (f)-[:FROM_COMPANY]->(c)
MERGE (f)-[:MENTIONS]->(u1)
MERGE (f)-[:MENTIONS]->(u2)
MERGE (f)-[:MENTIONS]->(u3)
MERGE (f)-[:MENTIONS]->(u4)
MERGE (f)-[:HAS_IMAGE]->(i1)
MERGE (f)-[:HAS_IMAGE]->(i2)
MERGE (f)-[:HAS_IMAGE]->(i3)
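Once the data is loaded with the MERGE statements above, you can query it by traversing the relationships; for example (using the labels and property names defined above):

```cypher
// Find every user mentioned in a given file,
// together with its author and company.
MATCH (f:File {name: '11425646.pdf'})
MATCH (f)-[:WRITTEN_BY]->(a:Author)
MATCH (f)-[:FROM_COMPANY]->(c:Company)
MATCH (f)-[:MENTIONS]->(u:User)
RETURN f.name, a.name, c.name, collect(u.name) AS mentioned_users
```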
Useful links: data modeling guide and Cypher reference
Best Answer
I contacted uk@neo4j.com to ask for clarification and the answer they gave is that:
The release announcement for 3.0 mentions a 34bn limit that no longer applies.
So my guess is that 'Dynamic pointer compression' is an Enterprise Edition feature.