Mysql – UPDATE random rows

mysql-5.7randomupdate

I'm working on creating a more efficient way to UPDATE random rows in a ticket allocation website we have.

Currently, we pre populate the tickets table with ticket numbers in subsequent order, i.e 1,2,4 etc. Each ticket number being a row.

When a customer then places their order we use the following SQL query to reserve their random tickets.

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL 
ORDER BY RAND()
LIMIT n

To begin with the query above worked fine but as the table has grown we're now experiencing performance issues.

The concept I have in mind is to change the population of the tickets table to populate the tickets table in a random order and then use the primary ID key in the tickets table instead to order by like so:

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL 
ORDER BY id
LIMIT n

My question is, how much more efficient is this method compared to the RAND() function or is there any better ways of doing what I'm trying to achieve?

UPDATE

Because the ticket numbers are already pre populated in the tickets table at random when a competition is launched, i'm thinking the below could be a better solution?

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = '12102'
AND order_item_id IS NULL
AND id IN (SELECT id
         FROM tickets   
         WHERE lottery_id = '12102'
         AND order_item_id IS NULL
)
ORDER BY id
LIMIT 3

Example of tickets table after tickets have been pre populated.

id	ticket_number	order_item_id
163895	1	NULL
163896	500	NULL
163897	1221	NULL
163898	20	NULL
163899	403	NULL
163900	8650	NULL
163901	2456	NULL

Ideally, it would be good it the first person who ordered 1 ticket got ticket number 1. And then if the next customer orders 2 tickets, they will get 500 and 1221

ANOTHER UPDATE

After some experimenting i found this query to work quite well but would welcome some thoughts!

UPDATE tickets
SET order_item_id = '120'
WHERE lottery_id = '12094'
AND order_item_id IS NULL
AND id >= (SELECT RAND())
LIMIT 3

Best Answer

Assuming you have an index on the id field, ordering by it should definitely be significantly faster than ordering by RAND(), the only thing is by randomly generating the id you'll likely end up with a lot of index fragmentation which could also affect performance depending on the size of your table.

Why not just populate your tickets table in a normal order, and generate the N number of random numbers you want in a separate statement first, then your UPDATE query becomes:

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL
AND id IN (RandomNumber1, RandomNumber2, RandomNumber3)

That would likely be one of the fastest ways to do it, by decoupling the random number generation first, and then using those numbers as part of your WHERE predicate (again assuming id is indexed) so index seeks can occur.

Not sure what database system you're using (you should tag it and the version) but in most modern RDBMS, you can even make the random number generation it's own subquery to your WHERE clause predicate. Here's an example with a user-defined function (again this will depend on your database system) that returns a table with N number of random numbers in it:

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL
AND id IN (SELECT RandomNumber FROM fn_GetRandomNumbersTable(10)) -- Returns a table of 10 random numbers

As far as generating the actual random numbers, again that'll depend on your specific database system for what capabilities it has to do so. One way in SQL (though there's no reason this couldn't be done in procedural code of the consuming application) is to use a loop and fill a temp table with the numbers. While SQL is best at relational logic, it can handle certain procedural tasks pretty well too such as just running the RAND() function N number of times (assuming N is a reasonable number, e.g. less than 1 billion). So an example of what that could look like is this (take this as pseudo-code since I don't know what system you're on, but this is T-SQL):

CREATE TABLE #RandomNumbers (RandomNumber INT)
DECLARE @LoopCounter = 0

WHILE (@LoopCounter < N) -- Fill in the number you want for N
BEGIN
    INSERT INTO #RandomNumbers
    SELECT RAND()

    SET @LoopCounter = @LoopCounter + 1
END

UPDATE tickets
SET order_item_id = X
WHERE lottery_id = X 
AND order_item_id IS NULL
AND id IN (SELECT RandomNumber FROM #RandomNumbers)

OPTION #1 : INSERT INTO ... ON DUPLICATE KEY UPDATE

INSERT INTO newtable
SELECT * FROM oldtable
ORDER BY tmstamp_value
ON DUPLICATE KEY UPDATE email=VALUES(email), address=VALUES(address) ... ;

OPTION #2 : REPLACE INTO

REPLACE INTO newtable
SELECT * FROM oldtable
ORDER BY tmstamp_value;

It mechanically performs a DELETE and an INSERT.

CAVEAT

Notice I applied the ORDER BY against the old table. This should help get the latest data for a give LastName, FirstName.

You should also add a UNIQUE INDEX on the new table as follows:

ALTER TABLE newtable ADD UNIQUE INDEX name_ndx (lastname,firstname);

UPDATE 2013-04-08 17:13 EDT

If you are doing updates, there are two ways to do this:

1. UPDATE JOIN

UPDATE
(SELECT lastname,firstname,phone,email
FROM oldtable ORDER BY lastname,firstname) A
LEFT JOIN newtable B USING (lastname,firstname)
SET B.phone=A.phone,B.email=A.email;

2. UPDATE JOIN with GROUP BY

UPDATE
(
    SELECT BB.id,BB.phone,BB.email FROM 
    (SELECT lastname,firstname,MAX(id) id FROM oldtable GROUP BY lastname,firstname) AA
    INNER JOIN oldtable BB USING (lastname,firstname)
) A INNER JOIN newtable B USING (id)
SET B.phone=A.phone,B.email=A.email;

Give it a Try !!!

Both options will take full advantage of the UNIQUE index on name.

Mysql – How to limit the range of an insert-update to only look at certain rows/ rows with field value: X

Unfortunately, MySQL does not allow for filtered indexes (otherwise you can just filter on open tickets only). The next best thing is to do something similar to the two table approach you describe: Partitioned tables.