Mysql – How to make MySQL auto increment in random order

auto-incrementMySQLorder-byrandom

I have a table that contains several keys into other tables (where each key is comprised of multiple columns). I would like to create a new column for each key that would be an integer such that value i represents the ith occurrence of that key (but I want that ordering to be random). I thought perhaps some sort of auto increment would work, but I have multiple keys and thus would need multiple columns (I believe only one auto increment is permitted).

The solution I came up with doesn't seem to be working. First I created a new table to store the columns I am interested in, plus one extra column, rnd. I would use rnd later as a way to to ORDER BY rand() in a context where I need an actual column and not a function. Here's how I accomplished the first bit:

SET @rnd = 0;
INSERT INTO new_table SELECT col1, ..., colN, @rnd := @rnd+1 FROM original_table ORDER BY rand();

Next I would add an auto increment based upon the key, and I would make use of the rnd column I just created. The ALTER TABLE...ORDER BY does not permit ordering by a function (like rand()), so that's why I needed to create the rnd column first:

ALTER TABLE new_table ADD first_index INT UNSIGNED NOT NULL AUTO_INCREMENT, ADD KEY(col1, col2, first_index), ORDER BY rnd;

I would then remove the auto increment status and repeat for each key:

ALTER TABLE new_table MODIFY COLUMN first_index INT UNSIGNED NOT NULL;

At first this seemed to do the trick. However, upon closer inspection, it appears as though the random ordering was not respected. The index columns appeared to be created in the same order that the original table happened to use. I'm rather frustrated as I went through several very awkward steps to ensure a random ordering, and that did not even work. Why didn't my solution work, and is there a more elegant way to do this?

EDIT: It was apparently not clear what I was looking for, so I'll try a simple example:

Col1   Col2   <other columns>
1997   A
1997   B
2001   B
1997   A
2001   B
1997   A
2001   A
1997   B

The above ordering reflects the original ordering which is neither truly predictable nor truly random. After adding my rnd column and sorting by it I would have something like this:

Col1   Col2   <other columns>   rnd
1997   B                        1
2001   A                        2
1997   B                        3
2001   B                        4
1997   A                        5
1997   A                        6
1997   A                        7
2001   B                        8

I would then get the first index like this:

Col1   Col2   <other columns>   rnd    first_index
1997   B                        1      1
2001   A                        2      1
1997   B                        3      2
2001   B                        4      1
1997   A                        5      1
1997   A                        6      2
1997   A                        7      3
2001   B                        8      2

So 1997 with B appeared 2 times so those values are numbered from 1-2, whereas 1997 with A appeared 3 times so those values rows are numbered 1-3. I used the ORDER BY rnd so that it was random as to which row of 1997-A got assigned index 1, which one got 2, and which one got 3 (but it would be deterministic after that initial assignment of course).

EDIT: I'll add another example since my goal seems unclear. The MySQL docs show a method to get almost there:

CREATE TABLE animals (
    grp ENUM('fish','mammal','bird') NOT NULL,
    id MEDIUMINT NOT NULL AUTO_INCREMENT,
    name CHAR(30) NOT NULL,
    PRIMARY KEY (grp,id)
) ENGINE=MyISAM;

INSERT INTO animals (grp,name) VALUES
    ('mammal','dog'),('mammal','cat'),
    ('bird','penguin'),('fish','lax'),('mammal','whale'),
    ('bird','ostrich');

SELECT * FROM animals ORDER BY grp,id;

This creates a table which, although not what I want, gets close:

+--------+----+---------+
| grp    | id | name    |
+--------+----+---------+
| fish   |  1 | lax     |
| mammal |  1 | dog     |
| mammal |  2 | cat     |
| mammal |  3 | whale   |
| bird   |  1 | penguin |
| bird   |  2 | ostrich |
+--------+----+---------+

This close, but I think there are at least two reasons why this doesn't work for me. First of all, I need to do it on an existing table, not on one that I'm creating. Secondly I don't want dog to show up first in the mammal group just because it was first in the underlying table. I want to, one time, randomize that order. From then on the ordering will be deterministic, but I want it to be random first.

If you are wondering about my motivation for wanting to do this, here is a question explaining what I'm trying to do with this. Also note that the solution should be in MySQL.

Best Answer

Here's what I would do. I hope I did understand your question correctly. It's a workaround and perhaps not very elegant:

  • create a random field, and then an AUTO_INCREMENT field ("ID") as the single field of a primary key, order by RND etc
  • create a temporary table with MIN(ID) for each combination of Col1/2 (first, but randomly):

SELECT
  MIN(ID) AS MIN_ID, Col1, Col2 
FROM 
  table 
GROUP BY 
 Col1, Col2
  • add a second INT field to be the first_index
  • update the table and set first_index to be the difference between the minimum value and the running value of the autoincremented field:

UPDATE 
  table t, temp_table tmp
SET 
  t.first_index = (t.ID - tmp.MIN_ID) + 1
WHERE 
  t.Col1=tmp.Col1 AND t.Col2=tmp.Col2 
  • optionally remove the random and ID field, and add a new PK with the three fields Col1/Col2/first_index (auto incremented) for following inserts