如何从MySql中存储为逗号分隔值的数据中选择不同值的计数?最后,我将使用PHP从MySql输出数据。
里面有每个帖子的标签。因此,最后,我试图输出数据,就像stackoverflow处理其标签的方式一样,如下所示:
tag-name x 5
这就是表中数据的样子(对内容感到抱歉,但这是一个食谱网站)。
"postId" "tags" "category-code" "1" "pho,pork" "1" "2" "fried-rice,chicken" "1" "3" "fried-rice,pork" "1" "4" "chicken-calzone,chicken" "1" "5" "fettuccine,chicken" "1" "6" "spaghetti,chicken" "1" "7" "spaghetti,chorizo" "1" "8" "spaghetti,meat-balls" "1" "9" "miso-soup" "1" "10" "chanko-nabe" "1" "11" "chicken-manchurian,chicken,manchurain" "1" "12" "pork-manchurian,pork,manchurain" "1" "13" "sweet-and-sour-pork,pork" "1" "14" "peking-duck,duck" "1"
输出量
chicken 5 // occurs 5 time in the data above pork 4 // occurs 4 time in the data above spaghetti 3 // an so on fried-rice 2 manchurian 2 pho 1 chicken-calzone 1 fettuccine 1 chorizo 1 meat-balls 1 miso-soup 1 chanko-nabe 1 chicken-manchurian 1 pork-manchurian 1 sweet-n-sour-pork 1 peking-duck 1 duck 1
我正在尝试select count of all distinct values in there,但是由于它是用逗号分隔的数据,因此似乎无法执行此操作。select distinct不管用。
select count of all distinct values in there
select distinct
您能在mysql中还是使用php来获得类似于我已经完成的输出的好方法吗?
我真的不知道如何在不创建包含数字的表的情况下将逗号分隔的值的水平列表转换为行列表,而该表包含的数字与您可能有的逗号分隔的值一样多。如果可以创建此表,这是我的答案:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) AS one_tag, COUNT(*) AS cnt FROM ( SELECT GROUP_CONCAT(tags separator ',') AS all_tags, LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags FROM test ) t JOIN numbers n ON n.num <= t.count_tags GROUP BY one_tag ORDER BY cnt DESC;
返回值:
+---------------------+-----+ | one_tag | cnt | +---------------------+-----+ | chicken | 5 | | pork | 4 | | spaghetti | 3 | | fried-rice | 2 | | manchurain | 2 | | pho | 1 | | chicken-calzone | 1 | | fettuccine | 1 | | chorizo | 1 | | meat-balls | 1 | | miso-soup | 1 | | chanko-nabe | 1 | | chicken-manchurian | 1 | | pork-manchurian | 1 | | sweet-and-sour-pork | 1 | | peking-duck | 1 | | duck | 1 | +---------------------+-----+ 17 rows in set (0.01 sec)
让我们构建您的架构:
CREATE TABLE test ( id INT PRIMARY KEY, tags VARCHAR(255) ); INSERT INTO test VALUES ("1", "pho,pork"), ("2", "fried-rice,chicken"), ("3", "fried-rice,pork"), ("4", "chicken-calzone,chicken"), ("5", "fettuccine,chicken"), ("6", "spaghetti,chicken"), ("7", "spaghetti,chorizo"), ("8", "spaghetti,meat-balls"), ("9", "miso-soup"), ("10", "chanko-nabe"), ("11", "chicken-manchurian,chicken,manchurain"), ("12", "pork-manchurian,pork,manchurain"), ("13", "sweet-and-sour-pork,pork"), ("14", "peking-duck,duck");
我们将在一行中处理所有标签,因此我们GROUP_CONCAT可以完成以下工作:
GROUP_CONCAT
SELECT GROUP_CONCAT(tags SEPARATOR ',') FROM test;
返回所有用逗号分隔的标签:
河粉,猪肉,大米,鸡,大米,猪肉,鸡肉,意大利细面条,鸡肉,意大利面,鸡肉,意大利面,香肠,意大利面,肉丸子,味增汤,日本火锅,鸡肉满洲,鸡肉,满洲,猪肉满洲,猪肉,满洲,糖醋猪肉,猪肉,北京烤鸭,鸭
要计算所有标签,我们获得标签完整列表的长度,并在将空号替换为之后删除标签完整列表的长度,。我们加1,因为分隔符在两个值之间。
,
SELECT LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags FROM test;
+------------+ | count_tags | +------------+ | 28 | +------------+ 1 row in set (0.00 sec)
我们使用SUBSTRING_INDEX函数来获得
SUBSTRING_INDEX
-- returns the string until the 2nd delimiter\'s occurrence from left to right: a,b SELECT SUBSTRING_INDEX('a,b,c', ',', 2); -- return the string until the 1st delimiter, from right to left: c SELECT SUBSTRING_INDEX('a,b,c', ',', -1); -- we need both to get: b (with 2 being the tag number) SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c', ',', 2), ',', -1);
通过这种逻辑,要在列表中获得第3个标记,我们使用:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) FROM test;
+-------------------------------------------------------------------------------------+ | SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(tags SEPARATOR ','), ',', 3), ',', -1) | +-------------------------------------------------------------------------------------+ | fried-rice | +-------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
我的想法有些棘手:
因此,我们将创建一个表格,其中包含从1到列表中可能具有的最大标签数的所有数字。如果可以有1M个值,请创建1M到1,000,000之间的1M条目。对于100个标签,这将是:
CREATE TABLE numbers ( num INT PRIMARY KEY ); INSERT INTO numbers VALUES ( 1 ), ( 2 ), ( 3 ), ( 4 ), ( 5 ), ( 6 ), ( 7 ), ( 8 ), ( 9 ), ( 10 ), ( 11 ), ( 12 ), ( 13 ), ( 14 ), ( 15 ), ( 16 ), ( 17 ), ( 18 ), ( 19 ), ( 20 ), ( 21 ), ( 22 ), ( 23 ), ( 24 ), ( 25 ), ( 26 ), ( 27 ), ( 28 ), ( 29 ), ( 30 ), ( 31 ), ( 32 ), ( 33 ), ( 34 ), ( 35 ), ( 36 ), ( 37 ), ( 38 ), ( 39 ), ( 40 ), ( 41 ), ( 42 ), ( 43 ), ( 44 ), ( 45 ), ( 46 ), ( 47 ), ( 48 ), ( 49 ), ( 50 ), ( 51 ), ( 52 ), ( 53 ), ( 54 ), ( 55 ), ( 56 ), ( 57 ), ( 58 ), ( 59 ), ( 60 ), ( 61 ), ( 62 ), ( 63 ), ( 64 ), ( 65 ), ( 66 ), ( 67 ), ( 68 ), ( 69 ), ( 70 ), ( 71 ), ( 72 ), ( 73 ), ( 74 ), ( 75 ), ( 76 ), ( 77 ), ( 78 ), ( 79 ), ( 80 ), ( 81 ), ( 82 ), ( 83 ), ( 84 ), ( 85 ), ( 86 ), ( 87 ), ( 88 ), ( 89 ), ( 90 ), ( 91 ), ( 92 ), ( 93 ), ( 94 ), ( 95 ), ( 96 ), ( 97 ), ( 98 ), ( 99 ), ( 100 );
现在,我们使用以下查询获得numth(num为的行number):
num
number
SELECT n.num, SUBSTRING_INDEX(SUBSTRING_INDEX(all_tags, ',', num), ',', -1) as one_tag FROM ( SELECT GROUP_CONCAT(tags SEPARATOR ',') AS all_tags, LENGTH(GROUP_CONCAT(tags SEPARATOR ',')) - LENGTH(REPLACE(GROUP_CONCAT(tags SEPARATOR ','), ',', '')) + 1 AS count_tags FROM test ) t JOIN numbers n ON n.num <= t.count_tags
+-----+---------------------+ | num | one_tag | +-----+---------------------+ | 1 | pho | | 2 | pork | | 3 | fried-rice | | 4 | chicken | | 5 | fried-rice | | 6 | pork | | 7 | chicken-calzone | | 8 | chicken | | 9 | fettuccine | | 10 | chicken | | 11 | spaghetti | | 12 | chicken | | 13 | spaghetti | | 14 | chorizo | | 15 | spaghetti | | 16 | meat-balls | | 17 | miso-soup | | 18 | chanko-nabe | | 19 | chicken-manchurian | | 20 | chicken | | 21 | manchurain | | 22 | pork-manchurian | | 23 | pork | | 24 | manchurain | | 25 | sweet-and-sour-pork | | 26 | pork | | 27 | peking-duck | | 28 | duck | +-----+---------------------+ 28 rows in set (0.01 sec)
现在,有了 经典 行,我们就可以轻松计算每个标签的出现次数。
请参阅此答案的顶部以查看请求。