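I have a table, votes, with the following columns: voter, election_year, election_type, party. I need to delete all rows that are duplicates on the combination of voter and election_year, and I'm having trouble figuring out how to do this.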
I tried the following:
WITH CTE AS (
    SELECT voter,
           election_year,
           ROW_NUMBER() OVER (PARTITION BY voter, election_year ORDER BY voter) AS RN
    FROM votes
)
DELETE FROM CTE WHERE RN > 1
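This is based on another StackOverflow answer, but it appears to be specific to SQL Server. I have seen ways to do this using a unique ID, but this particular table doesn't have that luxury. How can I adapt the script above to delete the duplicates I need removed? Thanks!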
EDIT: As requested, here is the table definition with some sample data:
CREATE TABLE public.votes (
    voter         varchar(10),
    election_year smallint,
    election_type varchar(2),
    party         varchar(3)
);

INSERT INTO votes (voter, election_year, election_type, party) VALUES
    ('2435871347', 2018, 'PO', 'EV'),
    ('2435871347', 2018, 'RU', 'EV'),
    ('2435871347', 2018, 'GE', 'EV'),
    ('2435871347', 2016, 'PO', 'EV'),
    ('2435871347', 2016, 'GE', 'EV'),
    ('10215121/8', 2016, 'GE', 'ED');
Deleting from or updating a CTE is not valid in Postgres; see the accepted answer to "PostgreSQL with-delete 'relation does not exist'".
Since you don't have a primary key, you can (ab)use the ctid pseudo-column to identify the rows to delete.
WITH cte AS (
    SELECT ctid,
           row_number() OVER (PARTITION BY voter, election_year ORDER BY voter) rn
    FROM votes
)
DELETE FROM votes
USING cte
WHERE cte.rn > 1
  AND cte.ctid = votes.ctid;
db<>fiddle
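Before running the DELETE, it may help to preview which rows it would remove. The following is only a sketch: it reuses the same ctid join but selects instead of deleting. It also orders each group by election_type, which is an assumption of mine and not part of the answer above. Because ORDER BY voter sorts on a column that is constant within each partition, the row that survives is otherwise arbitrary.

```sql
-- Preview the rows a ctid-based DELETE would remove; nothing is modified.
-- Assumption: ORDER BY election_type decides which row of each
-- (voter, election_year) group is kept. Replace it with your own rule.
WITH cte AS (
    SELECT ctid,
           row_number() OVER (PARTITION BY voter, election_year
                              ORDER BY election_type) AS rn
    FROM votes
)
SELECT v.*
FROM votes v
JOIN cte ON cte.ctid = v.ctid
WHERE cte.rn > 1;
```

With the sample data from the question, this should list three rows: two from the 2018 group and one from the 2016 group of voter 2435871347.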
You may also want to consider introducing a primary key.
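As one possible way to do that, assuming the combination of voter and election_year is meant to be unique once the duplicates are gone, a composite primary key could be added. The constraint name votes_pkey here is only illustrative.

```sql
-- Run only after the duplicates have been deleted; otherwise this fails
-- with a unique violation. Adding a primary key also marks both columns
-- NOT NULL, so it fails as well if either column contains NULLs.
ALTER TABLE votes
    ADD CONSTRAINT votes_pkey PRIMARY KEY (voter, election_year);
```

If several rows per voter and year should be allowed after all, a surrogate identity column would be the alternative; which one fits depends on what the table is meant to model.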