Assuming you have some tables for Persons, Animals:
CREATE TABLE Person
( PersonID INT UNSIGNED NOT NULL AUTO_INCREMENT
, PersonName VARCHAR(255) NOT NULL
, CONSTRAINT Person_PK
PRIMARY KEY (PersonID)
, CONSTRAINT PersonName_UQ
UNIQUE (PersonName)
) ;
CREATE TABLE Animal
( AnimalID INT UNSIGNED NOT NULL AUTO_INCREMENT
, AnimalName VARCHAR(255) NOT NULL
, CONSTRAINT Animal_PK
PRIMARY KEY (AnimalID)
, CONSTRAINT AnimalName_UQ
UNIQUE (AnimalName)
) ;
and results:
CREATE TABLE Result
( RaceID INT UNSIGNED NOT NULL
, Position INT UNSIGNED NOT NULL
, PersonID INT UNSIGNED NOT NULL
, AnimalID INT UNSIGNED NOT NULL
, Errors INT UNSIGNED NOT NULL DEFAULT 0
, CompletionTime TIME NULL DEFAULT NULL
, CONSTRAINT Result_PK
PRIMARY KEY (RaceID, Position)
, CONSTRAINT Race_Person_UQ -- assuming a Person cannot enter
UNIQUE (RaceID, PersonID) -- a race twice
, CONSTRAINT Race_Animal_UQ -- assuming an Animal cannot enter
UNIQUE (RaceID, AnimalID) -- a race twice
, INDEX PersonID_IX (PersonID) -- indexes for the Foreign Key
, INDEX AnimalID_IX (AnimalID) -- constraints:
, CONSTRAINT Person_Result_FK
FOREIGN KEY (PersonID)
REFERENCES Person (PersonID)
, CONSTRAINT Animal_Result_FK
FOREIGN KEY (AnimalID)
REFERENCES Animal (AnimalID)
) ;
I suggest you first bulk load the data into a staging table in MySQL (possibly with LOAD DATA from .txt or .csv files), supplying race IDs. (If you can't supply race IDs but you have race names, adjust the tables accordingly.) You should have a Race table as well. The following is just a sample procedure:
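As a rough sketch of that Race table, in the same style as the other DDL above (the RaceName column is an assumption; adjust to whatever race data you actually have):

```sql
CREATE TABLE Race
( RaceID INT UNSIGNED NOT NULL AUTO_INCREMENT
, RaceName VARCHAR(255) NOT NULL            -- assumed column; adjust to your data
, CONSTRAINT Race_PK
    PRIMARY KEY (RaceID)
, CONSTRAINT RaceName_UQ
    UNIQUE (RaceName)
) ;
```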
CREATE TABLE BulkData
( RaceID INT UNSIGNED NOT NULL
, Position INT UNSIGNED NOT NULL
, PersonName VARCHAR(255) NOT NULL
, AnimalName VARCHAR(255) NOT NULL
, Errors INT UNSIGNED NOT NULL DEFAULT 0 -- adjust datatypes according
, CompletionTime TIME NULL DEFAULT NULL -- to your data
) ;
LOAD DATA INFILE '/results.txt'
INTO TABLE BulkData
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n' ;
Then you can manipulate the rows and insert them into the two or three target tables. For Person:
INSERT INTO Person
(PersonName)
SELECT DISTINCT
b.PersonName
FROM
BulkData AS b
WHERE NOT EXISTS
( SELECT 1
FROM Person AS p
WHERE p.PersonName = b.PersonName
) ;
Similarly for Animal:
INSERT INTO Animal
(AnimalName)
SELECT DISTINCT
b.AnimalName
FROM
BulkData AS b
WHERE NOT EXISTS
( SELECT 1
FROM Animal AS a
WHERE a.AnimalName = b.AnimalName
) ;
And then into Result:
INSERT INTO Result
(RaceID, Position, PersonID, AnimalID, Errors, CompletionTime)
SELECT
b.RaceID, b.Position, p.PersonID, a.AnimalID, b.Errors, b.CompletionTime
FROM
BulkData AS b
JOIN
Person AS p ON p.PersonName = b.PersonName
JOIN
Animal AS a ON a.AnimalName = b.AnimalName
WHERE NOT EXISTS
( SELECT 1
FROM Result AS r
WHERE r.RaceID = b.RaceID
AND r.Position = b.Position
) ;
If the import results are satisfactory, you can empty the BulkData table and repeat the procedure with more files. The NOT EXISTS conditions prevent duplicates even if you try to load the same data twice.
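Emptying the staging table between loads is a single statement; TRUNCATE is faster than DELETE for clearing the whole table:

```sql
TRUNCATE TABLE BulkData ;
```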
Not going to go through what you're trying to accomplish, but for your update, you're using a TOP without an ORDER BY. While SQL Server doesn't guarantee ordering for such queries, it doesn't just randomly choose, either. If you want to force a random row to be returned, you'll need to order by a row-level random number. For that, you can use ORDER BY ABS(CHECKSUM(NEWID())).
Try it out against sys.objects to see the behavior of explicitly ordering randomly.
SELECT TOP 1 name
FROM sys.objects
ORDER BY ABS( CHECKSUM( NEWID() ) );
And good luck with whatever it is you're doing.
Edit:
@jean points out in the comments that just NEWID() is all that is necessary, which is 100% correct; I just muscle-memory the ABS(CHECKSUM()) form since I tend to need row-level randomization with a % operator.
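For reference, the same test with the shorter form is simply:

```sql
SELECT TOP 1 name
FROM sys.objects
ORDER BY NEWID();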
Edit 2:
So something like this, then?
UPDATE t
SET CategoryID = nt.CategoryID
FROM dbo.Topic t
INNER JOIN ( SELECT t.ID, ucl.CategoryID,
Ordinal = ROW_NUMBER() OVER (
PARTITION BY t.ID
ORDER BY NEWID() )
FROM dbo.Topic t
INNER JOIN dbo.UserCategoryLink ucl
ON t.UserID = ucl.UserID ) nt
ON t.ID = nt.ID
WHERE nt.Ordinal = 1;
Best Answer
There are two options:
OPTION #1 : INSERT INTO ... ON DUPLICATE KEY UPDATE
OPTION #2 : REPLACE INTO
The latter mechanically performs a DELETE and an INSERT.
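As a sketch against a hypothetical table with a UNIQUE key on name (the table and column names here are assumptions), the two options look like:

```sql
-- OPTION #1: keeps the existing row, updating it in place on a key collision
INSERT INTO mytable (name, score)
VALUES ('Alice', 10)
ON DUPLICATE KEY UPDATE score = VALUES(score);

-- OPTION #2: deletes the conflicting row, then inserts a fresh one
REPLACE INTO mytable (name, score)
VALUES ('Alice', 10);
```

Note that REPLACE INTO, because it deletes and re-inserts, fires DELETE triggers and assigns a new AUTO_INCREMENT value, whereas ON DUPLICATE KEY UPDATE preserves the existing row's identity.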
CAVEAT
Notice I applied the ORDER BY against the old table. This should help get the latest data for a given LastName, FirstName.
You should also add a UNIQUE INDEX on the new table.
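Based on the LastName, FirstName pair mentioned above, that index might look like this (the table name is a placeholder):

```sql
ALTER TABLE newtable
    ADD UNIQUE INDEX name_uq (LastName, FirstName);
```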
UPDATE 2013-04-08 17:13 EDT
If you are doing updates, there are two ways to do this:
1. UPDATE JOIN
2. UPDATE JOIN with GROUP BY
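As hedged sketches of those two forms (all table and column names are assumptions, not from your schema):

```sql
-- 1. UPDATE JOIN: copy values across matching rows directly
UPDATE newtable nt
INNER JOIN oldtable ot
    ON ot.name = nt.name
SET nt.score = ot.score;

-- 2. UPDATE JOIN with GROUP BY: aggregate in a derived table first,
--    so each name contributes exactly one row to the join
UPDATE newtable nt
INNER JOIN
    ( SELECT name, MAX(score) AS score
      FROM oldtable
      GROUP BY name
    ) x
    ON x.name = nt.name
SET nt.score = x.score;
```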
Give it a Try !!!
Both options will take full advantage of the UNIQUE index on name.