HBase – Support for Spatial Functionality

hadoop hbase spatial

I see mentions of spatial functionality in HBase, for example "HBaseSpatial: A Scalable Spatial Data Storage Based on HBase".

What spatial functionality does HBase support and where is this documented?

Best Answer

HBase is simply a non-relational database that runs on top of HDFS, the distributed file system that ships with Hadoop. Hadoop itself also provides the MapReduce processing framework.

Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.

So the breakdown is like this:

  • HDFS is a distributed file system that is well suited for the storage of large files. Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files.
  • HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables. This can sometimes be a point of conceptual confusion. HBase internally puts your data in indexed "StoreFiles" that exist on HDFS for high-speed lookups.
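To make the "indexed files for fast lookups" idea concrete, here is a minimal toy sketch in Python. It is not HBase's actual file format or API; the class `ToySortedStore` and its methods are hypothetical names made up for illustration. It only shows why keeping rows sorted by key lets you find one record, or a contiguous key range, without reading the whole file.

```python
import bisect


class ToySortedStore:
    """Toy stand-in for one immutable, key-sorted file (NOT HBase's real format)."""

    def __init__(self, rows):
        # Sort once at write time; byte-string keys compare lexicographically.
        items = sorted(rows.items())
        self._keys = [k for k, _ in items]
        self._values = [v for _, v in items]

    def get(self, key):
        # Binary search instead of a full scan.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start, stop):
        # Return all rows with start <= key < stop, in key order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


store = ToySortedStore({b"row3": b"c", b"row1": b"a", b"row2": b"b"})
print(store.get(b"row2"))
print(store.scan(b"row1", b"row3"))
```

Note that the only thing this kind of index understands is the byte order of the key, which is why nothing spatial comes for free.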

Moreover, HBase has no typed columns, let alone spatial ones; every value is an opaque byte array:

HBase supports a "bytes-in/bytes-out" interface via Put and Result, so anything that can be converted to an array of bytes can be stored as a value. Input could be strings, numbers, complex objects, or even images, as long as they can be rendered as bytes.
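In practice that means any spatial encoding is your application's job. A minimal sketch in plain Python, assuming a made-up layout of two big-endian doubles (longitude, then latitude); the function names and the layout are my own illustration, not anything HBase defines:

```python
import struct

# Hypothetical layout: two big-endian 64-bit floats, longitude then latitude.
POINT_FORMAT = ">dd"


def encode_point(lon, lat):
    """Serialize a point to the raw bytes you would hand over as a cell value."""
    return struct.pack(POINT_FORMAT, lon, lat)


def decode_point(raw):
    """Reverse of encode_point; the store cannot do this for you."""
    lon, lat = struct.unpack(POINT_FORMAT, raw)
    return lon, lat


value = encode_point(-122.4194, 37.7749)
print(len(value))  # 16
print(decode_point(value))
```

The database would only ever see those 16 bytes. It has no idea they are a coordinate, so it cannot answer "what is within 5 km of this point" on its own.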

So if you want to store and query GIS objects, you would simply bypass HBase and use the underlying Hadoop MapReduce framework with SpatialHadoop.


HBaseSpatial appears to be just a dead research project, and there are a great many of those. It's quite possible the source code was never published. The only entry on GitHub for it is here. That's from the author of the paper too, Ningyu Zhang.