Today I had a client with a database that has a table named companies that had duplicate email addresses in it that needed to be removed. Here is the SQL query I used to delete the doubles. I will explain below.
DELETE e1 FROM companies e1,companies e2 WHERE e1.email = e2.email AND e1.id > e2.id
Okay, first, I did this from PHPmyadmin, you may shoose to do it from a PHP script if you do not have access to PHPmyadmin. All you have to do is edit the query to look like this:
//include your database connection first:
mysql_query(“DELETE e1 FROM companies e1,companies e2 WHERE e1.email = e2.email AND e1.id > e2.id”) or die(mysql_error());
Either way it does the same thing. The e1 and e2 you see are simply temporary variables used to perform the actions needed to delete all rows with emails that are the same as another row’s email while keeping the one with the lowest ID number.
When I ran the query from PHPMyAdmin, on a table with 28k rows, i took several minutes to run, so this may cause problems if you run it from a PHP script and you have a table that is as large or larger than mine was.