Whatever you are trying to do can easily be done by
USING ON DELETE CASCADE AND ON UPDATE CASCADE(IF Required)
While creating the foreign keys on your table.
For your reference have a look at FOREIGN KEY Constraints.
You have not given much of a table description. Notwithstanding, I will try to answer your question.
For this example, let's use this table:
CREATE TABLE mytable
(
id INT NOT NULL AUTO_INCREMENT,
email VARCHAR(255),
PRIMARY KEY (id)
);
First, try counting all the IDs that have .ac.
in the email column.
SELECT COUNT(1) INTO @CompleteCount FROM mytable;
SELECT COUNT(1) INTO @DeleteCount FROM mytable WHERE LOCATE('.ac.',email) > 0;
There are two approaches to doing a DELETE:
APPROACH #1 : DELETE JOIN
If @DeleteCount < 5% of @CompleteCount, use a list of IDs to DELETE from mytable
:
CREATE TABLE mytable_delete_id
SELECT id FROM mytable WHERE LOCATE('.ac.',email) > 0;
ALTER TABLE mytable_delete_id ADD PRIMARY KEY (id);
DELETE B.* FROM mytable_delete_id A INNER JOIN mytable B USING (id);
DROP TABLE mytable_delete_id;
This helps take the guess work out of which rows to DELETE by using a SELECT to get the IDs. The JOIN takes care of the DELETE.
APPROACH #2 : CREATE A NEW TABLE
If you have the following conditions:
- @DeleteCount > 30% of @CompleteCount
- your table does not have foreign keys/constraints
you could just create a new table and filter out the rows with .ac.
in the email column:
CREATE TABLE mytable_new LIKE mytable;
#insert rows which does not contain '.ac.' from mytable
INSERT INTO mytable_new SELECT * FROM mytable WHERE LOCATE('.ac.',email) = 0;
ALTER TABLE mytable RENAME mytable_old;
ALTER TABLE mytable_new RENAME mytable;
DROP TABLE mytable_old;
EPILOGUE
You can do either of these methods without checking table counta ahead of time. They will both work. If you are concerned that .ac.
is in a huge number of rows, go with APPROACH #2. If there are not that many occurrences, go with APPROACH #1.
Best Answer
This would depend on which one of the repeat you want to keep. For this example, I will dedup the table based on url and title and keeping the first occurrence.
First thing to do: run these queries
This will show total table count and how all (url,title) combinations that repeat twice and more than twice with the total for each at the bottom. If the sum of the repeats is more than 5% of the total:
This method made be very slow if you do not have an index on (url,title)
Otherwise, run this
This method may be better since it only operates on the table id.
Please look over both methods. Try testing it on test databases with copies of the data before running anything.
If you want to keep the last occurrence of duplicates, replace
MIN(id)
withMAX(id)
.Give it a Try !!!