Excluding overlapping periods in a time aggregate function

sql

I have a table in which each row holds a start date and an end date:

DROP TABLE IF EXISTS temp_period;

CREATE TABLE public.temp_period
(
  id integer NOT NULL,
  "startDate" date,
  "endDate" date
);

INSERT INTO temp_period(id,"startDate","endDate") VALUES(1,'2010-01-01','2010-03-31');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(2,'2013-05-17','2013-07-18');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(3,'2010-02-15','2010-05-31');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(7,'2014-01-01','2014-12-31');
INSERT INTO temp_period(id,"startDate","endDate") VALUES(56,'2014-03-31','2014-06-30');

Now I want to know the total duration of all periods stored there. I need the time only as a single interval. That part is easy:

SELECT sum(age("endDate","startDate")) FROM temp_period;

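The problem, however, is that these periods do overlap. I want to eliminate all overlapping portions so that I get the total time covered by at least one record in the table.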
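To see which rows overlap and by how much, a self-join on range overlap can help. This is only a sketch: it assumes the end date is exclusive (the default `[)` bounds of `daterange`), and the aliases `a` and `b` and the column names `first_id`, `second_id` and `shared` are made up for illustration.

```sql
-- List every pair of overlapping periods once, with the stretch they share.
-- Assumption: "endDate" is exclusive, as with daterange's default '[)' bounds.
SELECT a.id AS first_id,
       b.id AS second_id,
       daterange(a."startDate", a."endDate")
          * daterange(b."startDate", b."endDate") AS shared   -- range intersection
FROM temp_period AS a
   JOIN temp_period AS b
      ON a.id < b.id                                          -- report each pair only once
     AND daterange(a."startDate", a."endDate")
         && daterange(b."startDate", b."endDate");            -- && means "overlaps"
```

With the sample rows, this pairs id 1 with id 3 and id 7 with id 56; those shared stretches are what the plain sum counts twice.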

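As you can see, there are quite large gaps between the periods, so passing the smallest start date and the latest end date to the age function won't do the trick. I did think about doing that and then subtracting the total length of the gaps, but no elegant way to do so came to mind.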
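The subtract-the-gaps idea could be sketched roughly as follows, counting whole days rather than an interval. The CTE names `ordered` and `gaps` and the column names `prev_end`, `gap_days` and `covered_days` are hypothetical, and the sketch assumes no NULL dates.

```sql
WITH ordered AS (
   /* for each period, the latest end date among all periods sorted before it */
   SELECT "startDate",
          max("endDate") OVER (
             ORDER BY "startDate", "endDate"
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
          ) AS prev_end
   FROM temp_period
),
gaps AS (
   /* a gap exists wherever a period starts after everything before it has ended */
   SELECT "startDate" - prev_end AS gap_days
   FROM ordered
   WHERE "startDate" > prev_end      -- also drops the first row, whose prev_end is NULL
)
/* whole span in days, minus the days that no period covers */
SELECT (SELECT max("endDate") - min("startDate") FROM temp_period)
       - coalesce((SELECT sum(gap_days) FROM gaps), 0) AS covered_days;
```

For the sample rows the two gaps are 1082 and 167 days out of a 1825-day span, which leaves 576 covered days.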

I use PostgreSQL 9.6.


2021-07-01

1 answer

How about this:

WITH
   /* get all time points where something changes */
   points AS (
       SELECT "startDate" AS p
       FROM temp_period
       UNION SELECT "endDate"
       FROM temp_period
   ),
   /*
    * Get all date ranges between these time points.
    * The first time range will start with NULL,
    * but that will be excluded in the next CTE anyway.
    */
   inter AS (
      SELECT daterange(
                lag(p) OVER (ORDER BY p),
                p
             ) i
      FROM points
   ),
   /*
    * Get all date ranges that are contained
    * in at least one of the intervals.
    */
   overlap AS (
      SELECT DISTINCT i
      FROM inter
         CROSS JOIN temp_period
      WHERE i <@ daterange("startDate", "endDate")
   )
/* sum the lengths of the date ranges */
SELECT sum(age(upper(i), lower(i)))
FROM overlap;

For your data, it returns a single row holding one interval: the total time covered by at least one period.
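Since `age` reports months and days separately, the summed interval can mix units (months plus a day count larger than a month). If a plain day count is preferred, the final step could subtract the range bounds instead. The following is a sketch under the same assumptions as the query above, not a drop-in replacement; the CTE name `covered` and the output column names are invented here.

```sql
WITH points AS (
   /* every date at which coverage can change */
   SELECT "startDate" AS p FROM temp_period
   UNION
   SELECT "endDate" FROM temp_period
),
inter AS (
   /* consecutive date ranges between those points; the first has no lower bound */
   SELECT daterange(lag(p) OVER (ORDER BY p), p) AS i
   FROM points
),
covered AS (
   /* keep only the ranges that lie inside at least one stored period */
   SELECT i
   FROM inter
   WHERE EXISTS (
            SELECT 1
            FROM temp_period AS t
            WHERE i <@ daterange(t."startDate", t."endDate")
         )
)
/* date - date yields an integer number of days */
SELECT sum(upper(i) - lower(i)) AS covered_days,
       make_interval(days => sum(upper(i) - lower(i))::int) AS covered_interval
FROM covered;
```

For the sample rows, `covered_days` comes to 576 (150 + 62 + 364 days for the three covered stretches).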