Data Versioning – Is It Used in DBMS Other Than Spatial?

data-versioningspatial

ESRI's spatial database management systems, called geodatabases (more), use a technology called versioning.

A version represents a snapshot in time of the entire geodatabase and
contains all the datasets in the geodatabase.

Versions are not separate copies of the geodatabase. Instead, versions
and the transactions that take place within them are tracked in system
tables. This isolates a user's work across multiple edit sessions,
allowing users to edit without locking features in the production
version or immediately impacting other users and without having to
make copies of the data.

state tree
http://help.arcgis.com/en/geodatabase/10.0/sdk/arcsde/concepts/versioning/basicprinciples/state.htm

When you register a dataset (a feature class, feature dataset, or
table) as versioned, two delta tables are created: the A (or adds)
table, which records insertions and updates, and the D (or deletes)
table, which stores deletions. Each time you update or delete a record
in the dataset, rows are added to one or both of these tables. A
versioned dataset, therefore, consists of the original table (referred
to as the base or business table) plus any changes in the delta
tables. The geodatabase keeps track of which version you were
connected to when you made the edits that populated the delta tables.
When you query or display a dataset in a version, ArcGIS assembles the
relevant rows from the original table and the delta tables to present
a seamless view of the data for that version.

enter image description here

To be honest, I find the documentation to be rather vague; it doesn't tell me much about how the technology actually works, or what part of traditional database theory it is based on.

I don't imagine many DBA SE community members would have experience with ESRI's versioning technology. So I won't ask something like 'how does it work?'.

Instead, I'm wondering, are there any technologies in the non-spatial database world that are similar to ESRI versioning?

Best Answer

Oracle has implemented MMVC almost from its inception. Its main purpose is to reduce contention during updates: make it possible for readers and writers to not block each other. But that is not the same as the multi-versioning offered by ESRI. The traditional database transactions are short transactions: they only last a short time: sub-second, seconds ... maybe minutes or even hours for some batch processes.

The versioning of the kind ESRI offers is for long transactions. Here a transaction can last days, weeks, months ... A database could in theory do this using short transactions: just do not commit, and hope your sessions stays up for days or months. Clearly an impossible guarantee.

So Oracle has implemented its own long-transactions/multi-versioning mechanism. It is calle Oracle Workspace Manager (OWM). See http://docs.oracle.com/database/122/ADWSM/ for details.

In OWM, users work in workspaces (what ESRI calls a version). Workspaces are collaborative: multiple users can work in the same workspace. Updates done in a workspace are only visible to users in that workspace. Changes in a workspace are applied to the parent workspace via a merge operation. A child workspace can start seeing the changes made in its parent via a refresh operation. OWM includes all the mechanics for conflict detection and resolution, long-term locking, change detection, history, "what-if" scenarios ...

Like in ESRI's versions, OWM's workspaces form a hierarchy of nested workspaces, the top one being the LIVE (or "as-built" workspace).

OWM is fully transparent to applications: they interact with tables and views as usual. Referential integrity is maintained within the context of each workspace.

Note that OWM is not specifically related to GIS: the multi-versioned tables can be of any kind - not just "spatial" tables. However, it is used by a number of GIS tools that need a safe long transaction mechanism: Autodesk Map, Bentley Maps, Geomedia Transaction Manager are examples. A number of customers also use it outside of any GIS.

Conceptually it is very much like ESRI's approach: fundamental operations are the same (but may use a different terminology). The implementation is different however: where ESRI keeps changes in separate tables (the "A" and "D" tables), Oracle Workspace Manager keeps them all in the main table(s). Views make the use transparent.

So yes, multi-versioning / long transactions are definitely used in databases in other contexts that spatial databases.