admin

SQL:自首次出现某些值以来的行数:继续计数

sql

在SQL Server中,我试图计算自过去5天首次观察到与今天相同的天气(今天假设是2018年8月6日)以来的天数。每个镇。

数据如下:

+---------+---------+--------+--------+--------+
| Date    | Toronto | Cairo  | Zagreb | Ankara |
+---------+---------+--------+--------+--------+
| 1.08.18 | Rain    | Sun    | Clouds | Sun    |
| 2.08.18 | Sun     | Sun    | Clouds | Sun    |
| 3.08.18 | Rain    | Sun    | Clouds | Rain   |
| 4.08.18 | Clouds  | Sun    | Clouds | Clouds |
| 5.08.18 | Rain    | Clouds | Rain   | Rain   |
| 6.08.18 | Rain    | Sun    | Sun    | Sun    |
+---------+---------+--------+--------+--------+

这需要执行得很好,但到目前为止,我只想针对每个镇进行单个查询(并且将会有数十个镇,而不仅仅是四个镇)。这行得通,但不会扩展。

这是多伦多的那个…

SELECT 
    DATEDIFF(DAY, MIN([Date]), GETDATE()) + 1 
FROM
    (SELECT TOP 5 * 
     FROM Weather 
     WHERE [Date] <= GETDATE()
     ORDER BY [Date] DESC) a
WHERE 
    Toronto = (SELECT TOP 1 Toronto 
               FROM Weather
               WHERE DataDate = GETDATE())

…正确返回4,因为今天有雨,而过去5天内第一次下雨是8月3日。

但是我想要返回的是一个像这样的表:

+---------+-------+--------+--------+
| Toronto | Cairo | Zagreb | Ankara |
+---------+-------+--------+--------+
| 4       | 5     | 1      | 5      |
+---------+-------+--------+--------+

此代码由@Used_By_Already从接受的答案中进行了稍微修改:

CREATE TABLE mytable(
   Date    date  NOT NULL
  ,Toronto VARCHAR(9) NOT NULL
  ,Cairo   VARCHAR(9) NOT NULL
  ,Zagreb  VARCHAR(9) NOT NULL
  ,Ankara  VARCHAR(9) NOT NULL
);

INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180801','Rain','Sun','Clouds','Sun');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180802','Sun','Sun','Clouds','Sun');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180803','Rain','Sun','Clouds','Rain');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180804','Clouds','Sun','Clouds','Clouds');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180805','Rain','Clouds','Rain','Rain');
INSERT INTO mytable(Date,Toronto,Cairo,Zagreb,Ankara) VALUES ('20180806','Rain','Sun','Sun','Sun');

with cte as (
        select
              date, city, weather
        FROM (
              SELECT * from mytable
             ) AS cp
        UNPIVOT (
                  Weather FOR City IN (Toronto, Cairo, Zagreb, Ankara)
            ) AS up
    )

select 
        date, city, weather, datediff(day,ca.prior,cte.date)+1 as daysPresent
from cte
cross apply (
    select min(prev.date) as prior
    from cte as prev 
    where prev.city = cte.city
    and prev.date between dateadd(day,-4,cte.date) and dateadd(day,0,cte.date)
    and prev.weather = cte.weather
    ) ca

order by city,date

但是,我现在要尝试的是即使在有问题的过去五天之后,也要继续计算“
daysPresent”。这意味着输出样本中最后一个标记的行应显示为6。逻辑是,如果它们之间的间隔少于5天,则以它们之间的天数增加前一个数字。如果过去5天没有相同的天气,请返回1。

我尝试了LEAD和LAG,但无法使其正常工作。它是否是在其上添加另一层的正确方法,还是该查询需要看起来完全不同?

我很困惑。


阅读 165

收藏
2021-07-01

共1个答案

admin

您的数据结构存在重大问题。值应该 在行中 ,而不是 列中 。因此,从以下内容开始:

select d.dte, v.*from data d cross apply
     (values ('Toronto', Toronto), ('Cairo', Cairo), . . .
     ) v(city, val)
where d.date >= dateadd(day, -5, getdate());

从那里,我们可以使用window函数first_value()(或last_value())来获取最新的读数。其余的只是按城市汇总:

with d as (
      select d.dte, v.*,
             first_value(v.val) over (partition by v.city order by d.dte desc) as last_val
      from data d cross apply
           (values ('Toronto', Toronto), ('Cairo', Cairo), . . .
           ) v(city, val)
      where d.date >= dateadd(day, -5, getdate())
     )
select city, datediff(day, min(dte), getdate()) + 1
from d
where val = last_val
group by city;

这样可以以行而不是列的形式为您提供所需的信息。您可以根据需要重新旋转。但我建议您将数据与城市数据保留在不同的行中。

2021-07-01