This would depend on the table's layout.
Suppose you have the following table
CREATE TABLE `mydata` (
`A` varchar(19) NOT NULL,
`B` varchar(9) NOT NULL,
KEY `A` (`A`),
KEY `B` (`B`)
) ENGINE=InnoDB;
Before you insert 1000 rows into mydata, you could do preload them into another table called mynewdata like this:
CREATE TABLE mynewdata LIKE mydata;
CREATE TABLE mynewdups LIKE mydata;
INSERT INTO mynewdata ... ;
INSERT INTO mynewdups SELECT * FROM mynewdata;
Next delete all rows in mynewdata that matches A or B in mydata
DELETE T1.* FROM mynewdata T1 INNER JOIN mydata T2 ON T1.A=T2.A OR T1.B=T2.B;
What's left in mydata are rows that do not have A or B matching
What about the rows that matched? Run this
DELETE T1.* FROM mynewdups T1 LEFT JOIN mydata T2
ON T1.A=T2.A OR T1.B=T2.B
WHERE T2.A IS NOT NULL;
What's left in mynewdata is data to import
What's left in mynewdups is data that had a dup key in mydata
Give it a Try !!!
Here is something very quick and dirty
Get the minimum datetime for each personalid, status, ride_taken
SELECT B.* FROM
(SELECT personalid,status,ride_taken,MIN(swipe_time) swipe_time
FROM swipes GROUP BY personalid,status,ride_taken) A
INNER JOIN swipes B USING (personalid,status,ride_taken,swipe_time);
First let's load your sample data:
mysql> drop database if exists sdujan;
Query OK, 1 row affected (0.00 sec)
mysql> create database sdujan;
Query OK, 1 row affected (0.00 sec)
mysql> use sdujan;
Database changed
mysql> CREATE TABLE `swipes` (
-> `id` int(11) NOT NULL AUTO_INCREMENT,
-> `personalid` varchar(20) NOT NULL,
-> `swipe_time` datetime NOT NULL,
-> `status` tinyint(1) NOT NULL COMMENT '1=registered,0=not',
-> `ride_taken` tinyint(1) NOT NULL DEFAULT '0',
-> `source` tinyint(4) NOT NULL DEFAULT '0',
-> PRIMARY KEY (`id`),
-> UNIQUE KEY `my_key` (`personalid`,`swipe_time`,`status`,`ride_taken`)
-> ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.09 sec)
mysql> insert into swipes values
-> (1661,'C08877547','2012-10-21 01:12:08',1,1,3),
-> (1662,'C09364782','2012-10-21 01:23:38',1,1,3),
-> (1663,'C09364782','2012-10-21 01:23:48',1,1,3),
-> (1664,'D09490557','2012-10-21 01:24:39',1,1,3);
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from swipes;
+------+------------+---------------------+--------+------------+--------+
| id | personalid | swipe_time | status | ride_taken | source |
+------+------------+---------------------+--------+------------+--------+
| 1661 | C08877547 | 2012-10-21 01:12:08 | 1 | 1 | 3 |
| 1662 | C09364782 | 2012-10-21 01:23:38 | 1 | 1 | 3 |
| 1663 | C09364782 | 2012-10-21 01:23:48 | 1 | 1 | 3 |
| 1664 | D09490557 | 2012-10-21 01:24:39 | 1 | 1 | 3 |
+------+------------+---------------------+--------+------------+--------+
4 rows in set (0.00 sec)
mysql>
Here is the output of my proposed query
mysql> SELECT B.* FROM
-> (SELECT personalid,status,ride_taken,MIN(swipe_time) swipe_time
-> FROM swipes GROUP BY personalid,status,ride_taken) A
-> INNER JOIN swipes B USING (personalid,status,ride_taken,swipe_time);
+------+------------+---------------------+--------+------------+--------+
| id | personalid | swipe_time | status | ride_taken | source |
+------+------------+---------------------+--------+------------+--------+
| 1661 | C08877547 | 2012-10-21 01:12:08 | 1 | 1 | 3 |
| 1662 | C09364782 | 2012-10-21 01:23:38 | 1 | 1 | 3 |
| 1664 | D09490557 | 2012-10-21 01:24:39 | 1 | 1 | 3 |
+------+------------+---------------------+--------+------------+--------+
3 rows in set (0.00 sec)
mysql>
Give it a Try !!!
SUGGESTION #1
To accommodate the query better, add this index
ALTER TABLE swipes ADD UNIQUE KEY `my_key2`
(`personalid`,`status`,`ride_taken`,`swipe_time`);
SUGGESTION #2 (Optional)
To accommodate more than 255 swipe stations, perhaps source
should be smallint unsigned
SUGGESTION #3 (Optional)
Some Transit Systems allow multiple swipes (up to 4 for PATH TransHudson) in one station. You could give some additional thought on this should you have to allow multiple swipes.
UPDATE 2012-10-24 17:30 EDT
If you are trying to limit within a time range, I have another viewpoint for you
Here is the subquery within my answer:
mysql> SELECT
-> personalid,status,ride_taken,MIN(swipe_time) swipe_time
-> FROM
-> swipes GROUP BY personalid,status,ride_taken
-> ;
+------------+--------+------------+---------------------+
| personalid | status | ride_taken | swipe_time |
+------------+--------+------------+---------------------+
| C08877547 | 1 | 1 | 2012-10-21 01:12:08 |
| C09364782 | 1 | 1 | 2012-10-21 01:23:38 |
| D09490557 | 1 | 1 | 2012-10-21 01:24:39 |
+------------+--------+------------+---------------------+
3 rows in set (0.00 sec)
mysql>
Let's say you want to limit to the Latest Entry Per 10 seconds. You will first have to generate a ten-second interval boundary like this:
mysql> SELECT personalid,status,ride_taken,swipe_time,
-> (swipe_time - INTERVAL MOD(UNIX_TIMESTAMP(swipe_time),10) SECOND) ten_sec_in
-> FROM swipes;
+------------+--------+------------+---------------------+---------------------+
| personalid | status | ride_taken | swipe_time | ten_sec_int |
+------------+--------+------------+---------------------+---------------------+
| C08877547 | 1 | 1 | 2012-10-21 01:12:08 | 2012-10-21 01:12:00 |
| C09364782 | 1 | 1 | 2012-10-21 01:23:38 | 2012-10-21 01:23:30 |
| C09364782 | 1 | 1 | 2012-10-21 01:23:48 | 2012-10-21 01:23:40 |
| D09490557 | 1 | 1 | 2012-10-21 01:24:39 | 2012-10-21 01:24:30 |
+------------+--------+------------+---------------------+---------------------+
4 rows in set (0.00 sec)
mysql>
Perhaps the thing to do is introduce the ten_second boundary into the table
ALTER TABLE swipes ADD swipe_int DATETIME;
UPDATE swipes SET
swipe_int = swipe_time - INTERVAL MOD(UNIX_TIMESTAMP(swipe_time),10) SECOND;
ALTER TABLE swipes ADD UNIQUE (personalid,swipe_int);
I ran this
mysql> ALTER TABLE swipes ADD swipe_int DATETIME;
Query OK, 4 rows affected (0.05 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> UPDATE swipes SET
-> swipe_int = swipe_time - INTERVAL MOD(UNIX_TIMESTAMP(swipe_time),10) SECOND;
Query OK, 4 rows affected (0.00 sec)
Rows matched: 4 Changed: 4 Warnings: 0
mysql> ALTER TABLE swipes ADD UNIQUE personalid_swipe_int_ndx (personalid,swipe_int);
Query OK, 4 rows affected (0.05 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from swipes;
+------+------------+---------------------+--------+------------+--------+---------------------+
| id | personalid | swipe_time | status | ride_taken | source | swipe_int |
+------+------------+---------------------+--------+------------+--------+---------------------+
| 1661 | C08877547 | 2012-10-21 01:12:08 | 1 | 1 | 3 | 2012-10-21 01:12:00 |
| 1662 | C09364782 | 2012-10-21 01:23:38 | 1 | 1 | 3 | 2012-10-21 01:23:30 |
| 1663 | C09364782 | 2012-10-21 01:23:48 | 1 | 1 | 3 | 2012-10-21 01:23:40 |
| 1664 | D09490557 | 2012-10-21 01:24:39 | 1 | 1 | 3 | 2012-10-21 01:24:30 |
+------+------------+---------------------+--------+------------+--------+---------------------+
4 rows in set (0.00 sec)
mysql> show create table swipes\G
*************************** 1. row ***************************
Table: swipes
Create Table: CREATE TABLE `swipes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`personalid` varchar(20) NOT NULL,
`swipe_time` datetime NOT NULL,
`status` tinyint(1) NOT NULL COMMENT '1=registered,0=not',
`ride_taken` tinyint(1) NOT NULL DEFAULT '0',
`source` tinyint(4) NOT NULL DEFAULT '0',
`swipe_int` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `my_key` (`personalid`,`swipe_time`,`status`,`ride_taken`),
UNIQUE KEY `personalid_swipe_int_ndx` (`personalid`,`swipe_int`)
) ENGINE=MyISAM AUTO_INCREMENT=1665 DEFAULT CHARSET=utf8
1 row in set (0.01 sec)
mysql>
Once you do that, going forward, any INSERT within the same 10-second interval gets immediately rejected. That way, you do not have to code it. It will be part of the table design.
Naturally, you will need a trigger to populate swipe_int
. Here is the trigger:
mysql> DELIMITER $$
mysql> CREATE TRIGGER swipes_bi BEFORE INSERT ON swipes
-> FOR EACH ROW
-> BEGIN
-> SET NEW.swipe_int = (NEW.swipe_time -
-> INTERVAL MOD(UNIX_TIMESTAMP(NEW.swipe_time),10) SECOND);
-> END $$
Query OK, 0 rows affected (0.11 sec)
mysql> DELIMITER ;
mysql>
Best Answer
fiddle