Here's what I would do. I hope I did understand your question correctly. It's a workaround and perhaps not very elegant:
- create a random field, and then an AUTO_INCREMENT field ("ID") as the single field of a primary key, order by RND etc
- create a temporary table with MIN(ID) for each combination of Col1/2 (first, but randomly):
SELECT
MIN(ID) AS MIN_ID, Col1, Col2
FROM
table
GROUP BY
Col1, Col2
- add a second INT field to be the first_index
- update the table and set first_index to be the difference between the minimum value and the running value of the autoincremented field:
UPDATE
table t, temp_table tmp
SET
t.first_index = (t.ID - tmp.MIN_ID) + 1
WHERE
t.Col1=tmp.Col1 AND t.Col2=tmp.Col2
- optionally remove the random and ID field, and add a new PK with the three fields Col1/Col2/first_index (auto incremented) for following inserts
What is interesting about this situation can be solved using the MyISAM storage.
I answedred a question like this back in April 2012 : How can you have two auto-incremental columns in one table? You need to create one table whose sole purpose is the create sequences of work order for each site
CREATE TABLE site_workorder_seq
(
SiteID int not null,
SiteWorkorderNum int not null auto_increment,
PRIMARY KEY (SiteID,SiteWorkorderNum)
) ENGINE=MyISAM;
Here is a sample loading into this table:
mysql> DROP DATABASE david;
Query OK, 1 row affected (0.01 sec)
mysql> CREATE DATABASE david;
Query OK, 1 row affected (0.00 sec)
mysql> USE david
Database changed
mysql> CREATE TABLE site_workorder_seq
-> (
-> SiteID int not null,
-> SiteWorkorderNum int not null auto_increment,
-> PRIMARY KEY (SiteID,SiteWorkorderNum)
-> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO site_workorder_seq (SiteID) VALUES
-> (1),(1),(2),(3),(3),(3),(3),(4),(4),(4),
-> (5),(5),(4),(2),(2),(2);
Query OK, 16 rows affected (0.00 sec)
Records: 16 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM site_workorder_seq;
+--------+------------------+
| SiteID | SiteWorkorderNum |
+--------+------------------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 2 | 4 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 4 | 1 |
| 4 | 2 |
| 4 | 3 |
| 4 | 4 |
| 5 | 1 |
| 5 | 2 |
+--------+------------------+
16 rows in set (0.00 sec)
mysql>
Let's look at the the last WorkorderNum from each site
mysql> SELECT SiteID,MAX(SiteWorkorderNum) SiteWorkorderNum
-> FROM site_workorder_seq GROUP BY SiteID;
+--------+------------------+
| SiteID | SiteWorkorderNum |
+--------+------------------+
| 1 | 2 |
| 2 | 4 |
| 3 | 4 |
| 4 | 4 |
| 5 | 2 |
+--------+------------------+
5 rows in set (0.05 sec)
mysql>
Now, suppose you want to get the next SiteWorkorderNum for SiteID 3. You could do this:
INSERT INTO site_workorder_seq (SiteID) VALUES (3);
SELECT MAX(SiteWorkorderNum) INTO @nextworkordernum
FROM site_workorder_seq WHERE SiteID=3;
SELECT @nextworkordernum;
Let's run this and see what happens
mysql> INSERT INTO site_workorder_seq (SiteID) VALUES (3);
Query OK, 1 row affected (0.00 sec)
mysql> SELECT MAX(SiteWorkorderNum) INTO @nextworkordernum
-> FROM site_workorder_seq WHERE SiteID=3;
Query OK, 1 row affected (0.00 sec)
mysql> SELECT @nextworkordernum;
+-------------------+
| @nextworkordernum |
+-------------------+
| 5 |
+-------------------+
1 row in set (0.03 sec)
mysql> SELECT * FROM site_workorder_seq;
+--------+------------------+
| SiteID | SiteWorkorderNum |
+--------+------------------+
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 2 | 4 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 3 | 5 |
| 4 | 1 |
| 4 | 2 |
| 4 | 3 |
| 4 | 4 |
| 5 | 1 |
| 5 | 2 |
+--------+------------------+
17 rows in set (0.00 sec)
mysql> SELECT SiteID,MAX(SiteWorkorderNum) SiteWorkorderNum
-> FROM site_workorder_seq GROUP BY SiteID;
+--------+------------------+
| SiteID | SiteWorkorderNum |
+--------+------------------+
| 1 | 2 |
| 2 | 4 |
| 3 | 5 |
| 4 | 4 |
| 5 | 2 |
+--------+------------------+
5 rows in set (0.00 sec)
mysql>
As long as you use this one MyISAM apart from all InnoDB tables, you can generate workordernums per site to your hearts content.
Best Answer
You are absolutely right that a surrogate key (ID or otherwise) won't solve your root problem which is more fundamental than the database schema - "How do you tell speakers apart in the real world?" That's a business question, and full name may or may not be good enough. You should ask yourself - If I have 2 speakers named John Doe, how do I tell them apart in the real world, and that will be your answer to the key.
Common attributes used to identify people in these kind of data universes are Email address, User name / Alias (login to your system), SSN, perhaps even name + address may be unique enough for your business needs - what are the odds of 2 people with the exact same name residing at the same address? Theoretically, you should accommodate every possible case, but in practice this would be much better than having a surrogate key, which will guarantee duplicates as you correctly observed, and not having any way to tell the entities in the database apart. Using an ID will lead to much worse data consistency problems later on when the 2 John Doe start showing up in the wrong conferences :-)
HTH