I am just getting started with Accumulo and NoSQL databases and I am looking for some discussion on table design. I understand the key-value structure shown in the manual. However, if I am trying to recreate a relational database, I am not sure how relationships work. Can someone explain, to some degree, how to set up a "Hello World" database (e.g., a manager-employee database)? I want to use a key-value implementation.
Accumulo table design methodology
database-design nosql
Best Answer
The first thing you should realize is that Accumulo is not a relational database. If you want to make it work like one, you need to create that functionality on top.
One way to do this would be to create rows that look like rows in a relational table, complete with foreign keys that you link together in the code of your application (an Accumulo client). Your client code must enforce the cascading updates/deletes and the other maintenance aspects of relations you may have come to expect from a relational database. Accumulo just stores and sorts Key/Value pairs for you. You decide the semantics of these Key/Value pairs.
Examples:
Note: For illustration purposes, Accumulo Key/Value pairs will be represented as:
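One possible way to write such pairs down, as a rough sketch rather than anything taken from the Accumulo client API: model each entry as a tuple of row ID, column family, column qualifier, and value. The `Entry` type, the `show` helper, and all sample values are made up for illustration; real Accumulo keys also carry a column visibility and a timestamp, which are left out here.

```python
from collections import namedtuple

# Hypothetical stand-in for an Accumulo entry; not the Accumulo client API.
# Real keys also carry a column visibility and a timestamp, omitted here.
Entry = namedtuple("Entry", ["row", "family", "qualifier", "value"])


def show(entry):
    """Render one entry as 'row family:qualifier value'."""
    return f"{entry.row} {entry.family}:{entry.qualifier} {entry.value}"


# Made-up sample data. Accumulo keeps entries sorted by key; sorted()
# roughly imitates that because tuples compare field by field.
entries = sorted([
    Entry("emp002", "info", "name", "Bob"),
    Entry("emp001", "info", "name", "Alice"),
    Entry("emp001", "info", "managerID", "mgr001"),
])

lines = [show(e) for e in entries]
for line in lines:
    print(line)
```

Because the entries are sorted, everything belonging to one row ID ends up next to each other, which is what makes row scans cheap.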
Employee table:
Manager table:
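A minimal sketch of a two-table layout of this kind, with made-up IDs and names. Each table is modelled as a plain Python dict keyed by (row, column family, column qualifier) rather than through the Accumulo client API; the column names `info`, `manages`, `managerID`, and `employeeID`, and the helpers `scan_row`, `managed_by`, and `manager_name`, are assumptions chosen for illustration.

```python
# Each table maps (row, column family, column qualifier) -> value.
# In-memory model only, not the Accumulo client API; all IDs, column
# names, and values are made up.
employee = {
    ("emp001", "info", "name"): "Alice",
    ("emp001", "info", "managerID"): "mgr001",
    ("emp002", "info", "name"): "Bob",
    ("emp002", "info", "managerID"): "mgr001",
    ("emp003", "info", "name"): "Carol",
}

manager = {
    ("mgr001", "info", "employeeID"): "emp003",  # the manager's own employee record
    ("mgr001", "manages", "emp001"): "",         # foreign keys to managed employees
    ("mgr001", "manages", "emp002"): "",
}


def scan_row(table, row):
    """Return the entries of one row in sorted key order, like a row scan."""
    return sorted((k, v) for k, v in table.items() if k[0] == row)


def managed_by(manager_id):
    """Manager table gives employee IDs; each name needs an Employee lookup."""
    ids = [k[2] for k, _ in scan_row(manager, manager_id) if k[1] == "manages"]
    return [employee[(emp_id, "info", "name")] for emp_id in ids]


def manager_name(emp_id):
    """Employee -> ManagerID -> manager's EmployeeID -> Employee name."""
    manager_id = employee[(emp_id, "info", "managerID")]
    manager_emp_id = manager[(manager_id, "info", "employeeID")]
    return employee[(manager_emp_id, "info", "name")]


print(managed_by("mgr001"))
print(manager_name("emp001"))
```

Nothing in this model keeps the two directions in sync; in a real deployment that bookkeeping would fall to the client code.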
See how I've established foreign keys in both directions? Scanning the `Manager` table, you can quickly see which employees are managed by whom, but you'll have to look up each `EmployeeID` in the `Employee` table to see their information. Scanning the `Employee` table, you can quickly see information about a particular employee, including the ID of their manager, but you'll have to look up the manager's `EmployeeID` in the `Manager` table, and then look that up in the `Employee` table if you want the name of their manager.
This could be drastically simplified by combining the two tables into a single table and avoiding a separate `ManagerID` by using `EmployeeID` for both purposes. Something like:
One Employee table:
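As a rough sketch of the single-table idea, again using an in-memory dict in place of a real Accumulo table: the manager reference is simply another `EmployeeID`, so one lookup in the same table resolves it. The column names and the helpers `manager_name` and `reports_of` are illustrative assumptions, and the data is made up.

```python
# One table keyed by (row, column family, column qualifier).
# In-memory model only, not the Accumulo client API; data is made up.
employee = {
    ("emp001", "info", "name"): "Alice",
    ("emp001", "info", "managerID"): "emp003",  # an EmployeeID, no separate ManagerID
    ("emp002", "info", "name"): "Bob",
    ("emp002", "info", "managerID"): "emp003",
    ("emp003", "info", "name"): "Carol",
    ("emp003", "manages", "emp001"): "",        # reverse direction, same table
    ("emp003", "manages", "emp002"): "",
}


def manager_name(emp_id):
    """Resolve the manager's name with one extra lookup in the same table."""
    manager_id = employee.get((emp_id, "info", "managerID"))
    if manager_id is None:
        return None  # no manager recorded for this employee
    return employee[(manager_id, "info", "name")]


def reports_of(emp_id):
    """List the EmployeeIDs stored under this row's 'manages' family."""
    return sorted(
        qualifier
        for (row, family, qualifier) in employee
        if row == emp_id and family == "manages"
    )


print(manager_name("emp001"))
print(reports_of("emp003"))
```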
However, depending on your application, you may find that in a NoSQL database, you are better off flattening and duplicating your data, rather than normalizing with relations.
Flattened Employee table:
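A hedged sketch of what flattening could look like, using the same kind of in-memory dict model as a stand-in for an Accumulo table. Here the manager's name is copied into each employee's row under a hypothetical `manager` column family, so reading one row answers "who is my manager?" without a second lookup. The `rename` helper and all data are made up; it shows the price of duplication, namely that the client has to update every copy.

```python
# Flattened table keyed by (row, column family, column qualifier).
# In-memory model only, not the Accumulo client API; data is made up.
employee = {
    ("emp001", "info", "name"): "Alice",
    ("emp001", "manager", "id"): "emp003",
    ("emp001", "manager", "name"): "Carol",  # duplicated from emp003's row
    ("emp002", "info", "name"): "Bob",
    ("emp002", "manager", "id"): "emp003",
    ("emp002", "manager", "name"): "Carol",  # duplicated from emp003's row
    ("emp003", "info", "name"): "Carol",
}


def rename(emp_id, new_name):
    """Change a name and every duplicated copy of it; the client's job."""
    employee[(emp_id, "info", "name")] = new_name
    # Iterate over a snapshot so the dict can be updated inside the loop.
    for (row, family, qualifier), value in list(employee.items()):
        if family == "manager" and qualifier == "id" and value == emp_id:
            employee[(row, "manager", "name")] = new_name


# Reading one row is enough to get the manager's name.
print(employee[("emp001", "manager", "name")])

rename("emp003", "Caroline")
print(employee[("emp002", "manager", "name")])
```

Whether this trade is worth it depends on how often names change compared with how often they are read.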