Why does the Tuple Mover in C-Store consider only rows older than LWM

columnstorevertica

In the Tuple Mover section of the C-Store paper by Michael Stonebraker (link: http://db.csail.mit.edu/projects/cstore/vldb.pdf) the following is described:

MOP (merge out process) will find all records in the chosen WS segment with an insertion time at or before the LWM (low water mark; a timestamp order/epoch value) […] The most recent insertion time of a record in RS’ becomes the segment’s new t_lastmove and is always less than or equal to the LWM. […] Hence, LWM “chases” HWM (high water mark), and the delta between them is chosen to mediate between the needs of users who want historical access and the WS space constraints.

I could not understand, when moving records from WS (write optimized storage) to RS (read optimized storage), why the tuple mover considers only records older than LWM? Doesn't this mean that all the rows that were inserted in the system after LWM would only be in WS? In a system, with small LWM, i.e, in a system where old historical queries are supported, this may mean that much of the records would be in WS only and we would miss out all the optimizations provided by the read optimized storage.

Am I missing something?

Best Answer

Given that the referenced paper is 10 years old, I would recommend looking at a The Vertica Analytic Database: C-Store 7 Years Later since Vertica has more automatic epoch advancement mechanisms.

For reference, the acronyms used now are:

  • WOS - Write Optimized Store
  • ROS - Read Optimized Store
  • AHM - Ancient History Marker (Low Water Mark)
  • LGE - Last Good Epoch

A quick overview of how epoch's work in Vertica:

I could not understand, when moving records from WS (write optimized storage) to RS (read optimized storage), why the tuple mover considers only records older than LWM?

Vertica will automatically advance the epoch as a background process. In the example below, once data is committed, it will belong to the current epoch.

-- Get the current epoch
dbadmin=> SELECT CURRENT_EPOCH FROM system;
 CURRENT_EPOCH
---------------
           238
(1 row)

-- Insert a row into the table without committing (WOS)
dbadmin=> INSERT INTO tbl (a) VALUES (1);
 OUTPUT
--------
      1
(1 row)

-- Get the epoch for the row
dbadmin=> SELECT a, epoch FROM tbl;
 a | epoch
---+-------
 1 |
(1 row)

-- Commit the insert
dbadmin=> COMMIT;
COMMIT

-- Get the epoch for the row
dbadmin=> SELECT a, epoch FROM tbl;
 a | epoch
---+-------
 1 |   238
(1 row)

Doesn't this mean that all the rows that were inserted in the system after LWM would only be in WS?

It does not. WOS is just a temporary storage location until the data gets moved to ROS. The epoch is just a way to manage transactions.