Mysql – Primary Index VS Unique Index Performance

database-designindexMySQLperformanceprimary-key

I have the following table:

CREATE TABLE `actions` (
  `user_id_from` int(11) unsigned NOT NULL,
  `user_id_to` int(11) unsigned NOT NULL,
  `action_type` int(11) NOT NULL,
  `counter` smallint(11) unsigned NOT NULL,
  `updated` datetime NOT NULL,
  PRIMARY KEY (`user_id_from`,`user_id_to`,`action_type`),
  KEY `user_id_to` (`user_id_to`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Currently it has around 1.3 Billion Rows, Data size is 109GB and Index Size is 44.5GB.

The INSERT queries on this table come in chunks of ~1k rows all with the same user_id_to. Most of the SELECTS have user_id_to in the WHERE condition, user_id_from is also used but less frequently, and results are mostly sorted by counter DESC and grouped by user_id_from.

Currently both INSERTs and SELECTs are running much slower than I'd expect. I have a theory a possible improvement will be to replace between the two indexes and making the new indexes:

  PRIMARY KEY (,`user_id_to`, `user_id_from`,`action_type`),
  KEY `user_id_from` (`user_id_from`)

The hope is that all writes will happen in the same area on the disk, and index changes therefore will also be minimal, and same thing for fetching the data when having one specific user_id_to as a WHERE condition, all data will be in the same place on disk, so it will run much faster.

Does my theory hold water? Any other suggestion?

UPDATE: My Queries:

SELECT COUNT( DISTINCT user_id_from) FROM actions WHERE user_id_to=:user_id


SELECT locations_users.*, location_country, countryCode, location_state, location_city
                    FROM actions INNER JOIN locations_users ON actions.user_id_from = locations_users.user_id
                        INNER JOIN countries ON locations_users.countryId = countries.countryId
                        LEFT JOIN states ON locations_users.stateId = states.stateId
                        LEFT JOIN cities ON locations_users.cityId = cities.cityId
                    WHERE actions.user_id_to = :user_id


SELECT actions.counter, actions.user_id_from, actions.action_type 
            FROM actions 
            LEFT JOIN users ON actions.user_id_from=users.user_id
            WHERE actions.user_id_to=:user_id AND users.something>50  
            ORDER BY actions.counter DESC LIMIT 40

Best Answer

Indexes with the columns in the order given:

actions: INDEX(user_id_to, user_id_from) -- 'covers' first query
actions: INDEX(user_id_to, counter) -- may subsume ORDER BY and LIMIT in 3nd query

Either index will be useful for 2nd query. DROP KEY user_id_to (user_id_to) as redundant when you add my two suggestions.

For various reasons, it may be better to change to PRIMARY KEY(user_id_to, user_id_from, action_type). This would

  • replace the first index suggested above
  • mostly eliminate the need for the second index above, thereby speeding up inserts (by having fewer indexes)
  • If you need user_id_from in the WHERE, but without user_id_to, then you would need some index starting with user_id_from. (I don't yet see a SELECT that begs for such.)

In summary (for actions indexes):

PRIMARY KEY(user_id_to, user_id_from, action_type)
INDEX(user_id_from, ...) -- if needed.

"action_type" -- Do you really need 4 bytes for it? How many distinct values are there? If very few, consider an ENUM, otherwise a TINYINT UNSIGNED or SMALLINT UNSIGNED (1,1,2 bytes, respectively). Shrinking 4 bytes to 1 would save 8GB (3 bytes * 2 copies (data+secondary key) * 1.3B rows).

There is a perfectly nice 2-byte countryCode; don't use a separate Id:

countryCode CHAR(2) CHARACTER SET ascii

How many "locations" are there? If not millions, I would not bother normalizing city, state, country. If there are a lot of locations rows in each city, then consider normalizing only city, then have the state and country spelled out in the city table. (I oppose "over-normalizing".)

If users has PRIMARY KEY(user_id), I have no further suggestion for indexing that table.

The 3rd query says LEFT JOIN users and users.something.... You probably did not want LEFT.