如果我没有记错的话,计算当天活跃用户应该很简单。只需将今天和x天退回(7天有效将是6天退回),然后计算不同的ID。我有一个为期2天的活跃用户的以下查询:
WITH allTables AS ( SELECT CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app, event.date, user_dim.app_info.app_instance_id as users FROM `dataset.app_events_intraday_20170407` CROSS JOIN UNNEST(event_dim) AS event UNION ALL SELECT CONCAT(user_dim.app_info.app_id, ':', user_dim.app_info.app_platform) AS app, event.date, user_dim.app_info.app_instance_id as users FROM `dataset.app_events_20170406` CROSS JOIN UNNEST(event_dim) AS event ) SELECT COUNT(DISTINCT(users)) AS unique, COUNT(users) as total FROM allTables
这是为期2天的活动,但对于7天或30天,我只需要合并所有这些表即可。这是正确的还是需要修改?
而不是使用UNION ALL,您应该尝试使用 Querying Multiple Tables Using a Wildcard Table
Querying Multiple Tables Using a Wildcard Table
尝试以下类似的方法
#standardSQL WITH allTables AS ( SELECT CONCAT(user_dim.app_info.app_instance_id, ':', user_dim.app_info.app_platform) AS app, event.date, user_dim.app_info.app_instance_id AS users FROM `dataset.app_events_intraday_*`, UNNEST(event_dim) AS event WHERE _TABLE_SUFFIX BETWEEN '20170401' AND '20170407' UNION ALL SELECT CONCAT(user_dim.app_info.app_instance_id, ':', user_dim.app_info.app_platform) AS app, event.date, user_dim.app_info.app_instance_id AS users FROM `dataset.app_events_*`, UNNEST(event_dim) AS event WHERE _TABLE_SUFFIX BETWEEN '20170401' AND '20170407' ) SELECT COUNT(DISTINCT(users)) AS unique, COUNT(users) AS total FROM allTables
您可以在下面的forWHERE子句中使用使其更通用
WHERE
WHERE _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 6 DAY)) AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
另请注意:我改变app_id在user_dim.app_info.app_id给app_instance_id,我认为这是在你的身边错字- 但我可能是错的
app_id
user_dim.app_info.app_id
app_instance_id