MySQL Queue – How to Merge Multiple Queues Efficiently

mariadbMySQLqueueselect

Consider the following table:

CREATE TABLE `multiqueue` (
    `ID` BIGINT(20) NOT NULL AUTO_INCREMENT,
    `CustomerID` BIGINT(20) NOT NULL,
    `Volume` INT(11) NOT NULL,
    `Content` MEDIUMTEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
    `PublishedTS` DATETIME NULL DEFAULT NULL,
    PRIMARY KEY (`ID`) USING BTREE
)
COLLATE='utf8mb4_unicode_ci'
ENGINE=InnoDB
;

This table serves as a multi-queue, meaning that it aggregates the queues of requests coming from multiple customers (denoted by CustomerID), each request having a certain Volume of work.

How to write a query that will select top N rows from the table, interleaving the rows from different customers?

If customer 1 sends 100 requests, each of volume 1000, then customer 2 sends 20 requests, each of volume 300, I would like that the query doesn't force any of the customers to starve on responses while my program is busy handling the requests of another customer. It should take 1 request of customer 1 and 3-4 requests of customer 2 in the first fetch, process them, then take 1 more request of customer 1 and 3-4 requests of customer 2, and so on.

What I have tried so far:

SET @runtot := 0;
SELECT q1.id1, q1.customerId1, q1.volume1, q1.content1, (@runtot := @runtot + q1.volume1) AS rt
FROM (
  SELECT ID AS id1, CustomerID AS customerId1, Content AS content1
  FROM multiqueue
  ORDER BY id1
) AS q1
WHERE @runtot < 2000

As described here, the code above limits the number of items selected with a running total of some field in the rows selected. In the scenario above, customer 2 would starve when the above query is in use.

The database in use is MariaDB (version 10.4.13, but I can upgrade to the latest if needed), though a solution for MySQL should also work.

Best Answer

For recent versions, you can use window functions:

SELECT ID, CustomerID, Volume, Content, runtot
FROM ( 
    SELECT ID
         , CustomerID
         , Content
         , sum(Volume) over (
                partition by CustomerID 
                order by ID
           ) as runtot
    FROM multiqueue
) AS q1 
WHERE runtot < 2000;

EDIT:

If you want at least 1 row for each customer, you can add another window function ROW_NUMBER() that enumerates the result, and use that in your select:

SELECT ID, CustomerID, Volume, Content, runtot
FROM ( 
    SELECT ID
         , CustomerID
         , Content
         , sum(Volume) over (
                partition by CustomerID 
                order by ID
           ) as runtot
         , row_number() over (
                partition by CustomerID 
                order by ID
           ) as rn

    FROM multiqueue
) AS q1 
WHERE runtot < 2000
   OR rn = 1;