In my database I have a Reservation table with three columns: Initial Day, Last Day, and House Id.
I want to count the total number of days while ignoring days that are duplicated, for example:
    +-------------+------------+------------+
    |             |  Results   |            |
    +-------------+------------+------------+
    | House Id    | InitialDay | LastDay    |
    +-------------+------------+------------+
    | 1           | 2017-09-18 | 2017-09-20 |
    | 1           | 2017-09-18 | 2017-09-22 |
    | 19          | 2017-09-18 | 2017-09-22 |
    | 20          | 2017-09-18 | 2017-09-22 |
    +-------------+------------+------------+
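Notice that House Id 1 has two rows, each with its own dates, but the first row falls entirely within the date range of the second. The total for that house should be 5 days, because the first row should not be counted: its days are already covered by the second row.
This happens because each house has two rooms, so different people can stay in the same house on the same dates.
My question is: how can I ignore these cases and count only the actual days on which the house was occupied?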
On SQL Server 2012 or later, you can use LAG() to get the last day of the previous reservation and adjust the initial day accordingly:
    with ReservationAdjusted as (
        select *,
               lag(LastDay) over (partition by HouseId order by InitialDay, LastDay) as PreviousLast
        from Reservation
    )
    select HouseId,
           sum(case
                   when PreviousLast > LastDay then 0  -- fully contained in the previous reservation
                   when PreviousLast >= InitialDay then datediff(day, PreviousLast, LastDay)  -- overlap
                   else datediff(day, InitialDay, LastDay) + 1  -- no overlap
               end) as Days
    from ReservationAdjusted
    group by HouseId;
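With the rows of each house ordered by InitialDay, LastDay, the cases are:
- The reservation is fully contained in the previous one (PreviousLast > LastDay): it adds 0 days.
- The reservation overlaps the previous one (PreviousLast >= InitialDay): only the days after PreviousLast are counted, that is datediff(day, PreviousLast, LastDay).
- There is no overlap: the whole reservation is counted, that is datediff(day, InitialDay, LastDay) + 1, where the + 1 includes the initial day.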
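Note that no extra condition is needed for the first reservation of each HouseId: LAG() returns NULL by default when there is no previous row, and a comparison with NULL never evaluates to true, so that row falls through to the else branch and is counted in full.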
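One caveat worth checking against your own data: LAG() only looks at the immediately preceding row. When that row is itself nested inside an earlier, longer reservation, PreviousLast can be earlier than a last day that has already been counted, and some days may be counted twice. For example, for reservations 2017-09-18 to 2017-09-27, 2017-09-24 to 2017-09-26 and 2017-09-25 to 2017-09-28 in one house, the LAG() query counts 12 days although only 11 distinct days are occupied. The following is a minimal sketch of one possible variant, not part of the original answer, that compares each row against the latest LastDay seen so far instead. The names ReservationRunning and RunningLast are made up for illustration; the table and column names are the ones used in the LAG() query.

```sql
with ReservationRunning as (
    select *,
           -- latest LastDay among all earlier rows of the same house;
           -- the frame is empty for the first row, so max() yields NULL there
           max(LastDay) over (
               partition by HouseId
               order by InitialDay, LastDay
               rows between unbounded preceding and 1 preceding
           ) as RunningLast
    from Reservation
)
select HouseId,
       sum(case
               -- every day of this row is already covered by earlier rows
               when RunningLast >= LastDay then 0
               -- partial overlap: count only the days after RunningLast
               when RunningLast >= InitialDay then datediff(day, RunningLast, LastDay)
               -- no overlap, or first row of the house (RunningLast is NULL)
               else datediff(day, InitialDay, LastDay) + 1
           end) as Days
from ReservationRunning
group by HouseId;
```

On the sample data this variant returns the same totals as the LAG() version (5, 12 and 9), because none of those rows hit the nested case described here.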
Sample input and output:
| HouseId | InitialDay | LastDay    |
|---------|------------|------------|
| 1       | 2017-09-18 | 2017-09-20 |
| 1       | 2017-09-18 | 2017-09-22 |
| 1       | 2017-09-21 | 2017-09-22 |
| 19      | 2017-09-18 | 2017-09-27 |
| 19      | 2017-09-24 | 2017-09-26 |
| 19      | 2017-09-29 | 2017-09-30 |
| 20      | 2017-09-19 | 2017-09-22 |
| 20      | 2017-09-22 | 2017-09-26 |
| 20      | 2017-09-24 | 2017-09-27 |

| HouseId | Days |
|---------|------|
| 1       | 5    |
| 19      | 12   |
| 20      | 9    |
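To reproduce the result, the sample input can be loaded with a script along these lines. The column types (int and date) are assumptions, since the question does not state them, and the column names follow the LAG() query rather than the spaced names used in the question.

```sql
-- Assumed schema: the types are a guess, adjust them to match the real table.
create table Reservation (
    HouseId    int  not null,
    InitialDay date not null,
    LastDay    date not null
);

-- Sample rows copied from the input table.
insert into Reservation (HouseId, InitialDay, LastDay) values
    (1,  '2017-09-18', '2017-09-20'),
    (1,  '2017-09-18', '2017-09-22'),
    (1,  '2017-09-21', '2017-09-22'),
    (19, '2017-09-18', '2017-09-27'),
    (19, '2017-09-24', '2017-09-26'),
    (19, '2017-09-29', '2017-09-30'),
    (20, '2017-09-19', '2017-09-22'),
    (20, '2017-09-22', '2017-09-26'),
    (20, '2017-09-24', '2017-09-27');
```

Running the LAG() query against this table should give the Days values listed in the sample output.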