So i was reading MySQL innoDB, which apparently stores table data by default on a clustered Index (b+tree) based on primary key, and the tuples are in the leafs of that b+tree
https://blog.jcole.us/2013/01/10/btree-index-structures-in-innodb/
i was wondering do all the famous Relational DBMSes like PostgreSQL, Oracle, SQL server, SQLite, store it the same way? by making a clustered Index (b+tree) based on primary key and store data on the leafs of tree?
and does Clustered index on primary key in databases basically mean storing the table based on primary key on a B+tree table?
(sorry if this is a general question but i cant make 4-5 seperate questions asking the same for each of the databases i mentioned, if you only know about one or few of them please tell)
also if its a database does not use clustered by default, can you explain how does it structure the files, for example do they make a b+tree on primary key which the leafs point to some sort of address inside a heap or…?
EDIT :
so far we got the answers for all of them except SQLite, if anyone have any info on how tables are actually stored in SQLite by default please do tell.
Best Answer
No, not all. Lets take them one by one:
MySQL. MySQL has several "engines" and depending on what engine a table is defined to use, the storage is:
SQL Server: Yes, the default behaviour for tables that have a PK is to be clustered on the PK. This can be overridden though by declaring that the PK is
NONCLUSTERED
. You can also define another index (not the PK) to be the clustered index of the table. If the PK is deined asNONCLUSTERED
and none of the indexes is defined asCLUSTERED
, then the table is a heap. In recent versions, a 3rd option was added (besides clustered index and heap): columnstore which is a different way of organizing the table data.PostgreSQL: No. All the tables are heaps, period. You can create additional indexes of various types (btree, hash, gin, gist, brin, etc) but the table data are stored in a heap.
Oracle: No. By default the tables are heaps unless created as Index Organized Tables (Oracle's term for clustered indexes).
Additional clarifications about table data organization:
Clustered Index: There is a btee+ index - based on some column(s) - and the leafs contain the table's data.
Heap: There is no btree index for the heap (other indexes like the PK may still use a btree). The data are stored as unordered lists of records. When a new row is inserted, it is usually added in a disk page with available space. Rows are referenced by some special
Page/RowID
(the details are implementation dependend and surely differ from DBMS to DBMS), so other indexes will include this reference.Columnstore: Types of structure where data are stored as columns rather than rows.
Some more details can be found in the documentation of each DBMS and in: - Wikipedia: Database Storage Structures - Wikipedia: Columnstore databases