[关闭]
@Cesar 2015-11-27T10:29:26.000000Z 字数 1150 阅读 3419

SQL删除重复数据,只保留一行

sql 学习


在sql的使用中,我们总是碰到需要删除重复数据的情况,但是又不能全部删除完,必须要保留至少一个重复的数据。重复的记录根据两个字段a2,a3判断(实际使用中可以拓展为多个)

eg:表A

a1 a2 a3
1 1 1
2 1 2
3 2 2
4 2 2
5 3 3
6 2 2

在上述的表中第三行和第四行重复,我们要选择一行删除,流程如下:
1. 选择重复的行:

  1. select *,count(*)
  2. from A group by a2,a3
  3. having count(*)>1;

结果如下:

a1 a2 a3 count(*)
3 2 2 3

2. 使用in来找到我们想要的ID

  1. SELECT *
  2. FROM A
  3. WHERE (a2,a3) IN
  4. (SELECT A.`a2`,A.`a3`
  5. FROM A
  6. GROUP BY A.`a2`,A.`a3`
  7. HAVING COUNT(*)>1)

得到的结果如下:

a1 a2 a3
3 2 2
4 2 2
6 2 2

那么后面就很好办了:

3.选出要删除的值:

  1. SELECT *
  2. FROM A
  3. WHERE (a2, a3) IN
  4. (SELECT `a2`,`a3`
  5. FROM A
  6. GROUP BY A.`a2`,A.`a3`
  7. HAVING COUNT(*) > 1)
  8. AND a1 NOT IN
  9. (SELECT MIN(a1)
  10. FROM A
  11. GROUP BY A.`a2`,A.`a3`
  12. HAVING COUNT(*) > 1) ;

结果是保留a1最小的值,其他选项全部选出,
请注意此时并不是将Select 改为delete就可以了,如果你直接这样子改的话,会报如下错误:

You can't specify target table 'A' for update in FROM clause

该错误提示你,不能先select出同一表中的某些值,再update这个表(在同一语句中)。所以要稍微修改一下。
4. 删除值
sql语句如下:

  1. //创建中间表
  2. CREATE TABLE F(a1 INTEGER,a2 INTEGER,a3 INTEGER);
  3. //将要删除的数据插入中间表
  4. INSERT INTO F (
  5. SELECT *
  6. FROM A
  7. WHERE (a2, a3) IN (SELECT `a2`,`a3`
  8. FROM A GROUP BY A.`a2`,A.`a3`
  9. HAVING COUNT(*) > 1)
  10. AND a1 NOT IN
  11. (SELECT MIN(a1) FROM A
  12. GROUP BY A.`a2`,A.`a3`
  13. HAVING COUNT(*) > 1)) ;
  14. //删除中间表
  15. DELETE FROM A WHERE a1 IN (SELECT a1 FROM F);
  16. SELECT *FROM A;

结果如下:

a1 a2 a3
1 1 1
2 1 2
3 2 2
5 3 3

完毕

注:如果说不用保留一行数据的话那么就简单多了,只需要一个很简单的sql语句:

  1. DELETE FROM A WHERE (a2,a3) IN (SELECT a2,a3 FROM A GROUP BY a2,a3 HAVING COUNT(*)>1)
添加新批注
在作者公开此批注前,只有你和作者可见。
回复批注