Amazon Redshift mechanism for aggregating a column into a string [duplicate]

一尘不染

sql

I have a dataset stored in a table.

id  |   attribute
-----------------
1   |   a
2   |   b
2   |   a
2   |   a
3   |   c

Desired output:

attribute|  num
-------------------
a        |  1
b,a      |  1
c        |  1

In MySQL, I would use:

select attribute, count(*) num
from
   (select id, group_concat(distinct attribute) attribute from dataset group by id) as subquery
group by attribute;

I am not sure this can be done in Redshift, because it does not support group_concat or any of the PostgreSQL group aggregate functions such as array_agg() or string_agg().

A workable alternative would be a way to select one random attribute from each group instead of using group_concat. How could this be done in Redshift?

2021-05-16

1 answer

一尘不染

Inspired by Masashi's answer, this solution is simpler and selects one random element from each group in Redshift.

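-- Pick one random attribute per id.
SELECT id, attribute
FROM (SELECT id,
             FIRST_VALUE(attribute) OVER (
                 PARTITION BY id ORDER BY random()
                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
             ) AS attribute
      FROM dataset) AS picked
-- Every row in a partition gets the same value, so GROUP BY collapses them.
GROUP BY id, attribute
ORDER BY id;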
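
To get from the random pick back to the attribute/num shape asked for in the question, the query above can be wrapped in an outer count. This is only a sketch under the question's assumptions (a table named dataset with columns id and attribute); the subquery alias picked is a name introduced here for illustration, and SELECT DISTINCT is swapped in for the GROUP BY deduplication.

```sql
-- Sketch: count how many ids ended up with each randomly picked attribute.
SELECT attribute, COUNT(*) AS num
FROM (
    -- One row per id: all rows of a partition share the same FIRST_VALUE,
    -- so DISTINCT collapses them to a single (id, attribute) pair.
    SELECT DISTINCT
           id,
           FIRST_VALUE(attribute) OVER (
               PARTITION BY id
               ORDER BY random()
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS attribute
    FROM dataset
) AS picked
GROUP BY attribute
ORDER BY attribute;
```

Because the pick is random, the result is not the same as the group_concat output: for the sample data, id 2 contributes either a or b, so a run returns either a = 2, c = 1 or a = 1, b = 1, c = 1.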

2021-05-16