I'm working on creating a more efficient way to UPDATE random rows in a ticket allocation website we have.
Currently, we pre populate the tickets table with ticket numbers in subsequent order, i.e 1,2,4 etc. Each ticket number being a row.
When a customer then places their order we use the following SQL query to reserve their random tickets.
UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X
AND order_item_id IS NULL
ORDER BY RAND()
LIMIT n
To begin with the query above worked fine but as the table has grown we're now experiencing performance issues.
The concept I have in mind is to change the population of the tickets table to populate the tickets table in a random order and then use the primary ID key in the tickets table instead to order by like so:
UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X
AND order_item_id IS NULL
ORDER BY id
LIMIT n
My question is, how much more efficient is this method compared to the RAND() function or is there any better ways of doing what I'm trying to achieve?
UPDATE
Because the ticket numbers are already pre populated in the tickets table at random when a competition is launched, i'm thinking the below could be a better solution?
UPDATE tickets
SET order_item_id = X
WHERE lottery_id = '12102'
AND order_item_id IS NULL
AND id IN (SELECT id
FROM tickets
WHERE lottery_id = '12102'
AND order_item_id IS NULL
)
ORDER BY id
LIMIT 3
Example of tickets table after tickets have been pre populated.
id | ticket_number | order_item_id |
---|---|---|
163895 | 1 | NULL |
163896 | 500 | NULL |
163897 | 1221 | NULL |
163898 | 20 | NULL |
163899 | 403 | NULL |
163900 | 8650 | NULL |
163901 | 2456 | NULL |
Ideally, it would be good it the first person who ordered 1 ticket got ticket number 1. And then if the next customer orders 2 tickets, they will get 500 and 1221
ANOTHER UPDATE
After some experimenting i found this query to work quite well but would welcome some thoughts!
UPDATE tickets
SET order_item_id = '120'
WHERE lottery_id = '12094'
AND order_item_id IS NULL
AND id >= (SELECT RAND())
LIMIT 3
Best Answer
Assuming you have an index on the
id
field, ordering by it should definitely be significantly faster than ordering byRAND()
, the only thing is by randomly generating theid
you'll likely end up with a lot of index fragmentation which could also affect performance depending on the size of your table.Why not just populate your tickets table in a normal order, and generate the N number of random numbers you want in a separate statement first, then your
UPDATE
query becomes:That would likely be one of the fastest ways to do it, by decoupling the random number generation first, and then using those numbers as part of your
WHERE
predicate (again assumingid
is indexed) so index seeks can occur.Not sure what database system you're using (you should tag it and the version) but in most modern RDBMS, you can even make the random number generation it's own subquery to your
WHERE
clause predicate. Here's an example with a user-defined function (again this will depend on your database system) that returns a table with N number of random numbers in it:As far as generating the actual random numbers, again that'll depend on your specific database system for what capabilities it has to do so. One way in SQL (though there's no reason this couldn't be done in procedural code of the consuming application) is to use a loop and fill a temp table with the numbers. While SQL is best at relational logic, it can handle certain procedural tasks pretty well too such as just running the
RAND()
function N number of times (assuming N is a reasonable number, e.g. less than 1 billion). So an example of what that could look like is this (take this as pseudo-code since I don't know what system you're on, but this is T-SQL):