tl; dr:我想在Redshift中生成一个日期表,以使报表更易于生成。最好不需要在Redshift中已经有大表,而需要上传一个csv文件。
长版:我正在处理一份报告,其中我必须平均一周中每天创建的新项目。日期范围可能跨越数月或更长时间,因此可能有5个星期一,但只有4个星期日,这可能会使数学有些棘手。另外,我不能保证每天都有一个实例,尤其是一旦用户开始切片数据时。其中,这是BI工具的绊脚石。
解决此问题的最佳方法很可能是日期表。但是,大多数有关日期表的教程都使用Redshift不可用或不完全支持的SQL命令(我在看着您, generate_series )。
有没有一种简单的方法可以在Redshift中生成日期表?
我尝试使用的代码:(基于这个也不起作用的建议:http : //elliot.land/post/building-a-date-dimension- table-in-redshift)
CREATE TABLE facts.dates ( "date_id" INTEGER NOT NULL PRIMARY KEY, -- DATE "full_date" DATE NOT NULL, -- YEAR "year_number" SMALLINT NOT NULL, "year_week_number" SMALLINT NOT NULL, "year_day_number" SMALLINT NOT NULL, -- QUARTER "qtr_number" SMALLINT NOT NULL, -- MONTH "month_number" SMALLINT NOT NULL, "month_name" CHAR(9) NOT NULL, "month_day_number" SMALLINT NOT NULL, -- WEEK "week_day_number" SMALLINT NOT NULL, -- DAY "day_name" CHAR(9) NOT NULL, "day_is_weekday" SMALLINT NOT NULL, "day_is_last_of_month" SMALLINT NOT NULL ) DISTSTYLE ALL SORTKEY (date_id) ; INSERT INTO facts.dates ( "date_id" ,"full_date" ,"year_number" ,"year_week_number" ,"year_day_number" -- QUARTER ,"qtr_number" -- MONTH ,"month_number" ,"month_name" ,"month_day_number" -- WEEK ,"week_day_number" -- DAY ,"day_name" ,"day_is_weekday" ,"day_is_last_of_month" ) SELECT cast(seq + 1 AS INTEGER) AS date_id, -- DATE datum AS full_date, -- YEAR cast(extract(YEAR FROM datum) AS SMALLINT) AS year_number, cast(extract(WEEK FROM datum) AS SMALLINT) AS year_week_number, cast(extract(DOY FROM datum) AS SMALLINT) AS year_day_number, -- QUARTER cast(to_char(datum, 'Q') AS SMALLINT) AS qtr_number, -- MONTH cast(extract(MONTH FROM datum) AS SMALLINT) AS month_number, to_char(datum, 'Month') AS month_name, cast(extract(DAY FROM datum) AS SMALLINT) AS month_day_number, -- WEEK cast(to_char(datum, 'D') AS SMALLINT) AS week_day_number, -- DAY to_char(datum, 'Day') AS day_name, CASE WHEN to_char(datum, 'D') IN ('1', '7') THEN 0 ELSE 1 END AS day_is_weekday, CASE WHEN extract(DAY FROM (datum + (1 - extract(DAY FROM datum)) :: INTEGER + INTERVAL '1' MONTH) :: DATE - INTERVAL '1' DAY) = extract(DAY FROM datum) THEN 1 ELSE 0 END AS day_is_last_of_month FROM -- Generate days for 81 years starting from 2000. ( SELECT '2000-01-01' :: DATE + generate_series AS datum, generate_series AS seq FROM generate_series(0,81 * 365 + 20,1) ) DQ ORDER BY 1;
哪个抛出这个错误
[Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.; 1 statement failed.
…因为我假设在Redshift中同一命令中不允许INSERT和 generate_series
在问这个问题时,我知道了。哎呀。
我从“事实”模式开始。
CREATE SCHEMA facts;
运行以下命令以启动数字表:
create table facts.numbers ( number int PRIMARY KEY ) ;
使用它来生成您的号码列表。我用了一百万开始
SELECT ',(' || generate_series(0,1000000,1) || ')' ;
然后将您的结果中的数字复制粘贴到下面的查询中,位于VALUES之后:
INSERT INTO facts.numbers VALUES (0) ,(1) ,(2) ,(3) ,(4) ,(5) ,(6) ,(7) ,(8) ,(9) -- etc
^确保从复制粘贴的数字列表中删除前导逗号
一旦有了数字表,就可以生成一个日期表(同样,从elliot land http://elliot.land/post/building-a-date- dimension-table-in-redshift窃取代码):
CREATE TABLE facts.dates ( "date_id" INTEGER NOT NULL PRIMARY KEY, -- DATE "full_date" DATE NOT NULL, -- YEAR "year_number" SMALLINT NOT NULL, "year_week_number" SMALLINT NOT NULL, "year_day_number" SMALLINT NOT NULL, -- QUARTER "qtr_number" SMALLINT NOT NULL, -- MONTH "month_number" SMALLINT NOT NULL, "month_name" CHAR(9) NOT NULL, "month_day_number" SMALLINT NOT NULL, -- WEEK "week_day_number" SMALLINT NOT NULL, -- DAY "day_name" CHAR(9) NOT NULL, "day_is_weekday" SMALLINT NOT NULL, "day_is_last_of_month" SMALLINT NOT NULL ) DISTSTYLE ALL SORTKEY (date_id) ; INSERT INTO facts.dates ( "date_id" ,"full_date" ,"year_number" ,"year_week_number" ,"year_day_number" -- QUARTER ,"qtr_number" -- MONTH ,"month_number" ,"month_name" ,"month_day_number" -- WEEK ,"week_day_number" -- DAY ,"day_name" ,"day_is_weekday" ,"day_is_last_of_month" ) SELECT cast(seq + 1 AS INTEGER) AS date_id, -- DATE datum AS full_date, -- YEAR cast(extract(YEAR FROM datum) AS SMALLINT) AS year_number, cast(extract(WEEK FROM datum) AS SMALLINT) AS year_week_number, cast(extract(DOY FROM datum) AS SMALLINT) AS year_day_number, -- QUARTER cast(to_char(datum, 'Q') AS SMALLINT) AS qtr_number, -- MONTH cast(extract(MONTH FROM datum) AS SMALLINT) AS month_number, to_char(datum, 'Month') AS month_name, cast(extract(DAY FROM datum) AS SMALLINT) AS month_day_number, -- WEEK cast(to_char(datum, 'D') AS SMALLINT) AS week_day_number, -- DAY to_char(datum, 'Day') AS day_name, CASE WHEN to_char(datum, 'D') IN ('1', '7') THEN 0 ELSE 1 END AS day_is_weekday, CASE WHEN extract(DAY FROM (datum + (1 - extract(DAY FROM datum)) :: INTEGER + INTERVAL '1' MONTH) :: DATE - INTERVAL '1' DAY) = extract(DAY FROM datum) THEN 1 ELSE 0 END AS day_is_last_of_month FROM -- Generate days for 81 years starting from 2000. ( SELECT '2000-01-01' :: DATE + number AS datum, number AS seq FROM facts.numbers WHERE number between 0 and 81 * 365 + 20 ) DQ ORDER BY 1;
^请务必在所需日期范围的末尾设置数字