Mysql – What does having the primary key as the last column in a composite secondary index in an InnoDB table do

clustered-indexinnodbMySQLprimary-key

Say I have a 1-to-N relationship (person_id, pet_id). I have a table where pet_id is the primary key.

I understand that an InnoDB secondary index is essentially a B-tree where the values are the corresponding primary key values for the row.

Now, suppose one person can have thousands of pets and I often want a person's pets in order of pet_id. Then it would matter if records in the secondary index are sorted by (person_id, pet_id) or just person_id with the pet_id's for that person_id being unsorted. Guessing the later.

So, if person_id is non-unique, are records physically sorted by (person_id, pet_id) or JUST pet_id?

Thanks

Best Answer

No. If your table has the InnoDB engine and the PRIMARY KEY is (pet_id), then defining a secondary index as (person_id) or (person_id, pet_id) makes no difference.

The index includes the pet_id column as well so values are sorted as (person_id, pet_id) in both cases.

A query like the one you have:

SELECT pet_id FROM yourtable 
WHERE person_id = 127 
ORDER BY pet_id ;

will need to access only the index to get the values and even more, it won't need to do any sort, as the pet_id values are already sorted in the index. You can verify this by looking at the execution plans (EXPLAIN):


First, we try with a MyISAM table:

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id)
 ) ENGINE = myisam ;

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;

mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

Notice the filesort!

Now, MyISAM with composite index:

 DROP TABLE IF EXISTS pets ;

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id, pet_id)            -- composite index
 ) ENGINE = myisam ;

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;


mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using index
1 row in set (0.00 sec)

Filesort is gone, as expected.


Now lets try the same with InnoDB engine:

 DROP TABLE IF EXISTS pets ;

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id)            -- simple index
 ) ENGINE = innodb ;                      -- InnoDB engine

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;

mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using index
1 row in set (0.00 sec)

No filesort either! Even though the index does not explicitly have the pet_id column, the values are there and sorted. You can check that if you define the index with (person_id, pet_id), the EXPLAIN is identical.

Lets actually do it, with InnoDB and the composite index:

 DROP TABLE IF EXISTS pets ;

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id, pet_id)    -- composite index
 ) ENGINE = innodb ;                      -- InnoDB engine

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;

mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using index
1 row in set (0.00 sec)

Identical plans with the previous case.


To be 100% sure, I also run the last 2 cases (InnoDB engine, with single and composite indexes) enabling the file_per_table setting and adding a few thousands rows in the table:

DROP TABLE IF EXISTS ... ;
CREATE TABLE ... ;

mysql> INSERT INTO pets (person_id) 
       VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;
Query OK, 12 rows affected (0.00 sec)
Records: 12  Duplicates: 0  Warnings: 0

mysql> INSERT INTO pets (person_id) 
       VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3),(127) ;
Query OK, 13 rows affected (0.00 sec)
Records: 13  Duplicates: 0  Warnings: 0

mysql> INSERT INTO pets (person_id) 
       VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3),(127) ;
Query OK, 13 rows affected (0.00 sec)
Records: 13  Duplicates: 0  Warnings: 0

mysql> INSERT INTO pets (person_id) 
       SELECT a.person_id+b.person_id-1 
       FROM pets a CROSS JOIN pets b CROSS JOIN pets c ;
Query OK, 54872 rows affected (0.47 sec)
Records: 54872  Duplicates: 0  Warnings: 0

In both cases, checking the actual file sizes, yields identical results:

ypercube@apollo:~$ sudo ls -la /var/lib/mysql/x/ | grep pets
-rw-rw----  1 mysql mysql     8604 Apr 21 07:25 pets.frm
-rw-rw----  1 mysql mysql 11534336 Apr 21 07:25 pets.ibd