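Soon I will have to prepare a list of item prices over a range of days. The granularity is 1 day; on days when an item has sales, I average the prices to get that day's average price. Some days have no sales, and I am content to use an adequate approximation: pull the previous and next sale and, for each day between them, let the price move linearly from one to the other.
Imagine the raw data is: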
    Item   Date        Price
    Bread  2000-01-01  10
    Bread  2000-01-02  9.5
    Bread  2000-01-04  9.1
    Sugar  2000-01-01  100
    Sugar  2000-01-11  150
I can get as far as this:
    Item   Date        Price
    Bread  2000-01-01  10
    Bread  2000-01-02  9.5
    Bread  2000-01-03  NULL
    Bread  2000-01-04  9.1
    Sugar  2000-01-01  100
    Sugar  2000-01-02  NULL
    Sugar  2000-01-03  NULL
    Sugar  2000-01-04  NULL
    Sugar  2000-01-05  NULL
    Sugar  2000-01-06  NULL
    Sugar  2000-01-07  NULL
    Sugar  2000-01-08  NULL
    Sugar  2000-01-09  NULL
    Sugar  2000-01-10  NULL
    Sugar  2000-01-11  150
Where I want to get to is:
    Item   Date        Price
    Bread  2000-01-01  10
    Bread  2000-01-02  9.5
    Bread  2000-01-03  9.3    --being 9.5 + ((9.1 - 9.5) / 2) * 1
    Bread  2000-01-04  9.1
    Sugar  2000-01-01  100
    Sugar  2000-01-02  105    --being 100 + ((150 - 100) / 10) * 1
    Sugar  2000-01-03  110    --being 100 + ((150 - 100) / 10) * 2
    Sugar  2000-01-04  115
    Sugar  2000-01-05  120
    Sugar  2000-01-06  125
    Sugar  2000-01-07  130
    Sugar  2000-01-08  135
    Sugar  2000-01-09  140
    Sugar  2000-01-10  145    --being 100 + ((150 - 100) / 10) * 9
    Sugar  2000-01-11  150
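What have I tried so far? Thinking only; I was planning on doing something like:
However, I am wondering whether there is a simpler way, because I have millions of item-days and this does not seem very efficient.
I have found plenty of examples of this kind of problem in which the data from the last or next row is smeared as-is to fill the gaps, but I do not recall seeing one that attempts some kind of transition between the two. Perhaps the technique could be applied twice, with a forward smear that copies the most recent value and a backward smear alongside it: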
    Item   Date        DateFwd     DateBak     PriceF  PriceB
    Bread  2000-01-01  2000-01-01  2000-01-01  10      10
    Bread  2000-01-02  2000-01-02  2000-01-02  9.5     9.5
    Bread  2000-01-03  2000-01-02  2000-01-04  9.5     9.1
    Bread  2000-01-04  2000-01-04  2000-01-04  9.1     9.1
    Sugar  2000-01-01  2000-01-01  2000-01-01  100     100
    Sugar  2000-01-02  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-03  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-04  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-05  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-06  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-07  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-08  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-09  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-10  2000-01-01  2000-01-11  100     150
    Sugar  2000-01-11  2000-01-11  2000-01-11  150     150
Might these provide the data needed for the following formula?

    (preceding_price + ((next_price - preceding_price)/gap_distance) * gap_progress)
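As a quick check of that formula against the desired output, here it is written out with two of the sample rows plugged in. The symbols are shorthand introduced only for this illustration: $P_{\text{prev}}$ and $P_{\text{next}}$ stand for preceding_price and next_price, and $d_{\text{prev}}$, $d_{\text{next}}$ for the dates of those two sales, so that gap_distance is $d_{\text{next}} - d_{\text{prev}}$ and gap_progress is $d - d_{\text{prev}}$ (both in days).

```latex
\[
P(d) = P_{\text{prev}}
     + \frac{P_{\text{next}} - P_{\text{prev}}}{d_{\text{next}} - d_{\text{prev}}}
       \,\bigl(d - d_{\text{prev}}\bigr)
\]

% Sugar on 2000-01-03: previous sale 2000-01-01 at 100, next sale 2000-01-11 at 150
\[
P = 100 + \frac{150 - 100}{10} \times 2 = 110
\]

% Bread on 2000-01-03: previous sale 2000-01-02 at 9.5, next sale 2000-01-04 at 9.1
\[
P = 9.5 + \frac{9.1 - 9.5}{2} \times 1 = 9.3
\]
```

Both values agree with the target table, and on a day that has a sale the progress term is zero, so the formula returns the recorded price.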
Here is the DDL for the data I know I can get to (the raw data joined to a calendar table):
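    CREATE TABLE Data ([I] varchar(5), [D] date, [P] DECIMAL(10,5));

    INSERT Data ([I], [D], [P]) VALUES
        ('Bread', '2000-01-01', 10),
        ('Bread', '2000-01-02', 9.5),
        ('Bread', '2000-01-04', 9.1),
        ('Sugar', '2000-01-01', 100),
        ('Sugar', '2000-01-11', 150);

    CREATE TABLE Cal ([D] DATE);

    INSERT Cal VALUES
        ('2000-01-01'), ('2000-01-02'), ('2000-01-03'), ('2000-01-04'),
        ('2000-01-05'), ('2000-01-06'), ('2000-01-07'), ('2000-01-08'),
        ('2000-01-09'), ('2000-01-10'), ('2000-01-11');

    -- Pad each item to one row per calendar day between its first and last sale.
    -- (A plain cal LEFT JOIN data ON c.d = d.d leaves the item NULL on gap days.)
    SELECT i.I AS [item], c.D AS [date], d.P AS [price]
    FROM (SELECT I, MIN(D) AS MinD, MAX(D) AS MaxD FROM Data GROUP BY I) i
    JOIN Cal c ON c.D BETWEEN i.MinD AND i.MaxD
    LEFT JOIN Data d ON d.I = i.I AND d.D = c.D
    ORDER BY i.I, c.D;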
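For what it is worth, the forward/backward smear described earlier could be sketched against these tables roughly as follows. This is only an illustration under a few assumptions: SQL Server 2012 or later (for window frames), at most one row per item per day in Data, and the CTE names Padded and Smeared, which are made up here. It uses a windowed MAX/MIN over the dates that have a price, rather than LAST_VALUE, to find the nearest sale on each side.

```sql
-- Sketch only: assumes the Data and Cal tables from the DDL above.
WITH Padded AS (
    -- One row per item per calendar day between the item's first and last sale
    SELECT i.I, c.D, d.P
    FROM (SELECT I, MIN(D) AS MinD, MAX(D) AS MaxD FROM Data GROUP BY I) AS i
    JOIN Cal AS c ON c.D BETWEEN i.MinD AND i.MaxD
    LEFT JOIN Data AS d ON d.I = i.I AND d.D = c.D
),
Smeared AS (
    SELECT I, D, P,
           -- Forward smear: latest sale date at or before this day
           MAX(CASE WHEN P IS NOT NULL THEN D END)
               OVER (PARTITION BY I ORDER BY D
                     ROWS UNBOUNDED PRECEDING) AS DateFwd,
           -- Backward smear: earliest sale date at or after this day
           MIN(CASE WHEN P IS NOT NULL THEN D END)
               OVER (PARTITION BY I ORDER BY D
                     ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS DateBak
    FROM Padded
)
SELECT s.I AS Item, s.D AS [Date], s.DateFwd, s.DateBak,
       f.P AS PriceF,   -- price at the preceding sale
       b.P AS PriceB    -- price at the following sale
FROM Smeared AS s
JOIN Data AS f ON f.I = s.I AND f.D = s.DateFwd
JOIN Data AS b ON b.I = s.I AND b.D = s.DateBak
ORDER BY s.I, s.D;
```

From there the interpolated price would be PriceF + (PriceB - PriceF) / NULLIF(DATEDIFF(day, DateFwd, DateBak), 0) * DATEDIFF(day, DateFwd, [Date]), falling back to the recorded price on days where the two dates coincide. Whether this performs acceptably on millions of item-days is untested.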
You can easily generate the missing gap rows and their prices in one go.
So I start with your raw data:
    CREATE TABLE t ([I] varchar(5), [D] date, [P] DECIMAL(10,2));

    INSERT INTO t ([I], [D], [P]) VALUES
        ('Bread', '2000-01-01 00:00:00', '10'),
        ('Bread', '2000-01-02 00:00:00', '9.5'),
        ('Bread', '2000-01-04 00:00:00', '9.1'),
        ('Sugar', '2000-01-01 00:00:00', '100'),
        ('Sugar', '2000-01-11 00:00:00', '150');

    ;with
    -- number is a tally table. here i use recursive cte to generate 100 numbers
    number as (
        select n = 0
        union all
        select n = n + 1 from number where n < 99
    ),
    -- a cte to get the Price of next date and also day diff
    cte as (
        select *,
               nextP = lead(P) over (partition by I order by D),
               cnt   = datediff(day, D, lead(D) over (partition by I order by D)) - 1
        from t
    )
    select I,
           D = dateadd(day, n, D),
           P = coalesce(c.P + (c.nextP - c.P) / (cnt + 1) * n, c.P)
    from cte c
    cross join number n
    where n.n <= isnull(c.cnt, 0);

    drop table t;
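One thing to watch: the recursive tally above yields only 0 to 99, so a gap longer than 100 days would come out with its later days missing, and SQL Server's default recursion limit of 100 levels means simply raising the bound also needs an OPTION (MAXRECURSION ...) hint. If longer gaps are possible, a non-recursive tally is one alternative. The sketch below is an assumption on my part rather than part of the query above; the `digits` CTE name and the 0 to 9999 range are arbitrary choices for illustration.

```sql
-- Sketch only: a non-recursive tally of 0..9999, built by cross-joining ten digits.
WITH digits AS (
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
),
number AS (
    -- Each combination of four digits forms one number: ones, tens, hundreds, thousands
    SELECT n = a.d + 10 * b.d + 100 * c.d + 1000 * e.d
    FROM digits AS a
    CROSS JOIN digits AS b
    CROSS JOIN digits AS c
    CROSS JOIN digits AS e
)
SELECT COUNT(*) AS cnt, MIN(n) AS lo, MAX(n) AS hi
FROM number;
```

The `digits` and `number` definitions could stand in for the recursive `number` CTE in the query above, with the rest of that query unchanged.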