Consider the following table:
CREATE TABLE `multiqueue` (
`ID` BIGINT(20) NOT NULL AUTO_INCREMENT,
`CustomerID` BIGINT(20) NOT NULL,
`Volume` INT(11) NOT NULL,
`Content` MEDIUMTEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
`PublishedTS` DATETIME NULL DEFAULT NULL,
PRIMARY KEY (`ID`) USING BTREE
)
COLLATE='utf8mb4_unicode_ci'
ENGINE=InnoDB
;
This table serves as a multi-queue, meaning that it aggregates the queues of requests coming from multiple customers (denoted by CustomerID
), each request having a certain Volume
of work.
How to write a query that will select top N
rows from the table, interleaving the rows from different customers?
If customer 1 sends 100 requests, each of volume 1000, then customer 2 sends 20 requests, each of volume 300, I would like that the query doesn't force any of the customers to starve on responses while my program is busy handling the requests of another customer. It should take 1 request of customer 1 and 3-4 requests of customer 2 in the first fetch, process them, then take 1 more request of customer 1 and 3-4 requests of customer 2, and so on.
What I have tried so far:
SET @runtot := 0;
SELECT q1.id1, q1.customerId1, q1.volume1, q1.content1, (@runtot := @runtot + q1.volume1) AS rt
FROM (
SELECT ID AS id1, CustomerID AS customerId1, Content AS content1
FROM multiqueue
ORDER BY id1
) AS q1
WHERE @runtot < 2000
As described here, the code above limits the number of items selected with a running total of some field in the rows selected. In the scenario above, customer 2 would starve when the above query is in use.
The database in use is MariaDB (version 10.4.13, but I can upgrade to the latest if needed), though a solution for MySQL should also work.
Best Answer
For recent versions, you can use window functions:
EDIT:
If you want at least 1 row for each customer, you can add another window function
ROW_NUMBER()
that enumerates the result, and use that in your select: