Mysql – UPDATE random rows

mysql-5.7randomupdate

I'm working on creating a more efficient way to UPDATE random rows in a ticket allocation website we have.

Currently, we pre populate the tickets table with ticket numbers in subsequent order, i.e 1,2,4 etc. Each ticket number being a row.

When a customer then places their order we use the following SQL query to reserve their random tickets.

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL 
ORDER BY RAND()
LIMIT n

To begin with the query above worked fine but as the table has grown we're now experiencing performance issues.

The concept I have in mind is to change the population of the tickets table to populate the tickets table in a random order and then use the primary ID key in the tickets table instead to order by like so:

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL 
ORDER BY id
LIMIT n

My question is, how much more efficient is this method compared to the RAND() function or is there any better ways of doing what I'm trying to achieve?

UPDATE

Because the ticket numbers are already pre populated in the tickets table at random when a competition is launched, i'm thinking the below could be a better solution?

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = '12102'
AND order_item_id IS NULL
AND id IN (SELECT id
         FROM tickets   
         WHERE lottery_id = '12102'
         AND order_item_id IS NULL
)
ORDER BY id
LIMIT 3

Example of tickets table after tickets have been pre populated.

id ticket_number order_item_id
163895 1 NULL
163896 500 NULL
163897 1221 NULL
163898 20 NULL
163899 403 NULL
163900 8650 NULL
163901 2456 NULL

Ideally, it would be good it the first person who ordered 1 ticket got ticket number 1. And then if the next customer orders 2 tickets, they will get 500 and 1221

ANOTHER UPDATE

After some experimenting i found this query to work quite well but would welcome some thoughts!

UPDATE tickets
SET order_item_id = '120'
WHERE lottery_id = '12094'
AND order_item_id IS NULL
AND id >= (SELECT RAND())
LIMIT 3

Best Answer

Assuming you have an index on the id field, ordering by it should definitely be significantly faster than ordering by RAND(), the only thing is by randomly generating the id you'll likely end up with a lot of index fragmentation which could also affect performance depending on the size of your table.

Why not just populate your tickets table in a normal order, and generate the N number of random numbers you want in a separate statement first, then your UPDATE query becomes:

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL
AND id IN (RandomNumber1, RandomNumber2, RandomNumber3)

That would likely be one of the fastest ways to do it, by decoupling the random number generation first, and then using those numbers as part of your WHERE predicate (again assuming id is indexed) so index seeks can occur.

Not sure what database system you're using (you should tag it and the version) but in most modern RDBMS, you can even make the random number generation it's own subquery to your WHERE clause predicate. Here's an example with a user-defined function (again this will depend on your database system) that returns a table with N number of random numbers in it:

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL
AND id IN (SELECT RandomNumber FROM fn_GetRandomNumbersTable(10)) -- Returns a table of 10 random numbers

As far as generating the actual random numbers, again that'll depend on your specific database system for what capabilities it has to do so. One way in SQL (though there's no reason this couldn't be done in procedural code of the consuming application) is to use a loop and fill a temp table with the numbers. While SQL is best at relational logic, it can handle certain procedural tasks pretty well too such as just running the RAND() function N number of times (assuming N is a reasonable number, e.g. less than 1 billion). So an example of what that could look like is this (take this as pseudo-code since I don't know what system you're on, but this is T-SQL):

CREATE TABLE #RandomNumbers (RandomNumber INT)
DECLARE @LoopCounter = 0

WHILE (@LoopCounter < N) -- Fill in the number you want for N
BEGIN
    INSERT INTO #RandomNumbers
    SELECT RAND()

    SET @LoopCounter = @LoopCounter + 1
END

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL
AND id IN (SELECT RandomNumber FROM #RandomNumbers)