My table has duplicate rows. How can I delete them based on the value of a single column?
For example:
uniqueid, col2, col3 ...
1, john, simpson
2, sally, roberts
1, johnny, simpson

Delete any duplicate uniqueIds to get:

1, John, Simpson
2, Sally, Roberts
You can DELETE from a CTE:
-- [Table] is bracketed because TABLE is a reserved word in T-SQL
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
    FROM [Table]
)
DELETE FROM cte
WHERE RowRank > 1;
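The ROW_NUMBER() function assigns a number to each row. PARTITION BY restarts the numbering for each group; in this case, each value of uniqueid starts at 1 and counts up from there. ORDER BY determines the order in which the numbers are assigned. Since numbering starts at 1 for each uniqueid, any record with a ROW_NUMBER() greater than 1 has a duplicate uniqueid.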
To see how the ROW_NUMBER() function works, just try it:
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
FROM [Table]
ORDER BY uniqueid;
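To make the numbering concrete, here is a minimal sketch that loads the three sample rows from the question into a temporary table and runs the same query. The table name #People and the column types are assumptions made for illustration; they are not part of the original question.

```sql
-- Hypothetical temp table mirroring the sample rows from the question.
CREATE TABLE #People (uniqueid INT, col2 VARCHAR(50), col3 VARCHAR(50));

INSERT INTO #People (uniqueid, col2, col3)
VALUES (1, 'john',   'simpson'),
       (2, 'sally',  'roberts'),
       (1, 'johnny', 'simpson');

-- Number the rows within each uniqueid, ordered by col2.
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2) AS RowRank
FROM #People
ORDER BY uniqueid, RowRank;

-- uniqueid  col2    col3     RowRank
-- 1         john    simpson  1
-- 1         johnny  simpson  2   <- RowRank > 1, so the DELETE would remove this row
-- 2         sally   roberts  1

DROP TABLE #People;
```

Only the second row for uniqueid 1 gets a RowRank above 1, which is why the DELETE leaves exactly one row per uniqueid.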
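You can adjust the logic of the ROW_NUMBER() function to control which records are kept and which are deleted.
For example, perhaps you want to do this in multiple steps, first deleting records that have the same last name but a different first name. In that case you can add the last name to the PARTITION BY: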
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY uniqueid, col3 ORDER BY col2) AS RowRank
    FROM [Table]
)
DELETE FROM cte
WHERE RowRank > 1;
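The ORDER BY is the other knob: it decides which row in each group gets number 1 and therefore survives. As a rough sketch, assuming you would rather keep the row whose col2 sorts last, you can reverse the sort direction. The transaction wrapper is an addition of mine, not part of the original answer; it lets you inspect the result before making it permanent.

```sql
BEGIN TRANSACTION;

WITH cte AS (
    SELECT *,
           -- DESC flips the order, so the last col2 in each group gets RowRank 1
           ROW_NUMBER() OVER (PARTITION BY uniqueid ORDER BY col2 DESC) AS RowRank
    FROM [Table]
)
DELETE FROM cte
WHERE RowRank > 1;

-- Check what is left before deciding.
SELECT * FROM [Table] ORDER BY uniqueid;

-- Undo the delete; switch to COMMIT TRANSACTION once the result looks right.
ROLLBACK TRANSACTION;
```

With the sample rows from the question, this variant would keep johnny rather than john for uniqueid 1.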