MySQL query very slow with multiple concurrent users in a short time

MySQLperformancequery-performance

I have a site with a MySQL 5.5 database where most of the traffic happens at a fixed time every single day when thousands of users are loading the site in a period of 2-3 hours.

My slow query log (tracking >10 sec) gets hundreds (or even a thousand) of log entries on a daily basis, where the queries can be even up to 150 seconds long. In these cases the site is obviously very unusable for some of the users.

In 90% of the cases the problem is a query for fetching items which are shared to the user (but not created by user). Items can be shared either globally, per user, per group or not at all.

The problem happens only during that traffic peak. At slower times of the day the query usually only takes around 60 milliseconds.

Tables

CREATE TABLE `users` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `locationId` int(10) unsigned NOT NULL,
    -- *snip*
    PRIMARY KEY (`id`),
    KEY `fk_users1_idx` (`locationId`),
    CONSTRAINT `fk_users1` FOREIGN KEY (`locationId`) REFERENCES `locations` (`id`) ON UPDATE CASCADE
);

CREATE TABLE `items` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `ownerUserId` int(10) unsigned NOT NULL,
    `type` varchar(45) DEFAULT NULL,
    `shareToEveryone` tinyint(1) unsigned NOT NULL DEFAULT '0',
    -- *snip*
    PRIMARY KEY (`id`),
    KEY `fk_items1_idx` (`ownerUserId`),
    KEY `index_type` (`type`),
    CONSTRAINT `fk_items1` FOREIGN KEY (`ownerUserId`) REFERENCES `users` (`id`) ON UPDATE CASCADE
);

CREATE TABLE `itemgroupshares` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `itemId` int(10) unsigned NOT NULL,
    `objectId` varchar(255) NOT NULL,
    PRIMARY KEY (`id`),
    KEY `itemId` (`itemId`,`objectId`),
    CONSTRAINT `ItemGroupShares_ibfk_3` FOREIGN KEY (`itemId`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE `itemusershares` (
    `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `itemId` int(10) unsigned NOT NULL,
    `objectId` varchar(255) NOT NULL,
    PRIMARY KEY (`id`),
    KEY `itemId` (`itemId`,`objectId`),
    CONSTRAINT `ItemUserShares_ibfk_1` FOREIGN KEY (`itemId`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
);

users table has 15k rows, items has 5k rows, itemgroupshares has 1k rows and itemusershares has 3k rows.

Sizes:

  • users: data 4.5MB, index 3.4MB
  • items: data 1.5MB, index 448KB
  • igs: data 96KB, index 64KB
  • ius: data 352KB, index 224KB

Slow query

SELECT i.*
FROM items i
JOIN users u ON u.id = i.ownerUserId
LEFT JOIN itemgroupshares igs ON igs.itemId = i.id
LEFT JOIN itemusershares ius ON ius.itemId = i.id
WHERE
    u.id != :userId
    AND
    u.locationId = :locationId
    AND
    i.type = 'CUSTOM'
    AND
    (
        i.shareToEveryone = 1
        OR
        ius.objectId = :userGuid
        OR
        igs.objectId IN (:groupGuid1, :groupGuid2, :groupGuidN)
    )
GROUP BY i.id
ORDER BY i.id

:userId is current user's id, :locationId is current user's location id, :userGuid is current user's guid and :groupGuid# are current user's group guids.

Other details

Guids are obtained (and saved in session) on login from 3rd party API. Users can have up to 20 groupGuids so the IN-clause can get quite long.

EXPLAIN for an example user with 20 group guids returns this:

enter image description here

I am not very good at reading the result of that, but I think the problem lies with that 1072 rows and using temporary + filesort.

Is there any way to optimize this query?

  1. I have tried selecting only a subset of columns, it doesn't improve the speed. I still should do it.
  2. Indexes on items: ownerUserId, type; itemgroupshares: itemId+objectId (composite to maintain uniqueness), itemusershares: itemId+objectId (same as previous).
  3. I could do ordering in application. Removing the ORDER BY changes the EXPLAIN slightly, but it still has the first entry with 1072 rows etc.
  4. I have tried thinking about how to reduce the where clause without returning wrong results but haven't come up with anything yet.

New EXPLAIN after trying out Rick James' suggestions (EXISTS):

enter image description here

And the improved query so far:

SELECT i.onlyRequiredColumnsHere 
FROM items i
JOIN users u ON u.id = i.ownerUserId
WHERE u.id != :userId
    AND u.locationId = :locationId
    AND i.type = 'custom'
    AND (
        i.shareToEveryone = 1
        OR EXISTS(SELECT 1 FROM itemusershares ius WHERE ius.itemId = i.id AND ius.objectId = :userGuid)
        OR EXISTS(SELECT 1 FROM itemgroupshares igs WHERE igs.itemId = i.id AND igs.objectId IN (:groupGuid1, :groupGuid2, :groupGuidN))
        )

The suggested indexes on users and items didn't seem to change anything, and INDEX(itemId, objectId) is already on igs and ius.

I also updated MySQL server from 5.5 to 8.0.

Best Answer

(just reformatting)

SELECT  i.*
    FROM  items i
    JOIN  users u  ON u.id = i.ownerUserId
    LEFT JOIN  itemgroupshares igs  ON igs.itemId = i.id
    LEFT JOIN  itemusershares ius  ON ius.itemId = i.id
    WHERE  u.id != :userId
      AND  u.locationId = :locationId
      AND  i.type = 'CUSTOM'
      AND  ( i.shareToEveryone = 1
              OR  ius.objectId = :userGuid
              OR  igs.objectId IN (:groupGuid1, :groupGuid2, :groupGuidN) 
           )
    GROUP BY  i.id
    ORDER BY  i.id

===

Instead of LEFT JOIN ius ON .. ... OR ugs..., do

    OR EXISTS( SELECT 1 FROM ius WHERE ... )

This will (1) look into ius only after failing the previous test(s) in the OR, and (2) will do a faster "semi-join". (Ditto for igs.)

Text columns that you don't need hinder performance when doing SELECT *.

The Optimizer might like these indexes:

users:  INDEX(locationId, id)  -- covering and composite
items:  INDEX(type, ownerUserId)  -- composite  (and DROP KEY(type))
igs:    INDEX(itemId, objectId)  -- covering and composite
ius:    INDEX(itemId, objectId)  -- covering and composite

(If you are using InnoDB, you effectively have (locationId, id).)

If igs and ius are many-to-many mapping tables, see advice here: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table

How big is the data? UUIDs/GUIDs have a preformance problem when the tables are much bigger than RAM. I have softened the impact of that some by (1) decreasing how often those tables are accessed and (2) making them "covering" (aka "Using index")