我有两张桌子。帖子和回复。将帖子视为博客条目,而将回复视为评论。
我想显示X个帖子,然后显示每个帖子的最新三个评论。
我的回复有一个外键“ post_id”,它与每个帖子的“ id”匹配。
我正在尝试创建一个包含以下内容的主页
发表-回复-回复-回复
发表回复
以此类推。我可以通过在模板中使用for循环并丢弃不需要的答复来完成此操作,但是我讨厌从不会使用的数据库中获取数据。有任何想法吗?
这实际上是一个非常有趣的问题。
哈哈无视我,我吸
编辑时:此答案有效,但在MySQL上,当父行数少至100时,它变得非常繁琐。 但是,请参见下面的性能修复。
显然,您可以为每个帖子运行一次该查询:select * from comments where id = $id limit 3这会产生大量开销,因为最终每个帖子只能执行一个数据库查询,即可怕的 N + 1查询 。
select * from comments where id = $id limit 3
如果您想一次获取所有帖子(或某个带有子集的子集),那么以下操作将 令人惊讶 。它假定评论的id单调增加(因为不能保证日期时间是唯一的),但是允许在帖子之间插入评论id。
由于auto_increment id列是单调递增的,因此如果comment具有id,则一切就绪。
首先,创建此视图。在视图中,我调用postparent和comment child:
parent
child
create view parent_top_3_children as select a.*, (select max(id) from child where parent_id = a.id) as maxid, (select max(id) from child where id < maxid and parent_id = a.id) as maxidm1, (select max(id) from child where id < maxidm1 and parent_id = a.id) as maxidm2 from parent a;
maxidm1只是“最大ID减去1”;maxidm2,“最大ID减去2”-即, 特定父 ID 内 的第二和第三最大子ID 。
maxidm1
maxidm2
然后将视图加入到注释中所需的任何内容(我将其称为text):
text
select a.*, b.text as latest_comment, c.text as second_latest_comment, d.text as third_latest_comment from parent_top_3_children a left outer join child b on (b.id = a.maxid) left outer join child c on (c.id = a.maxidm1) left outer join child d on (c.id = a.maxidm2);
当然,您可以在其中添加任何where子句,以限制帖子数:where a.category = 'foo'或其他。
where a.category = 'foo'
这是我的桌子的样子:
mysql> select * from parent; +----+------+------+------+ | id | a | b | c | +----+------+------+------+ | 1 | 1 | 1 | NULL | | 2 | 2 | 2 | NULL | | 3 | 3 | 3 | NULL | +----+------+------+------+ 3 rows in set (0.00 sec)
和一部分孩子。父母1的孩子没有:
mysql> select * from child; +----+-----------+------+------+------+------+ | id | parent_id | a | b | c | d | +----+-----------+------+------+------+------+ . . . . | 18 | 3 | NULL | NULL | NULL | NULL | | 19 | 2 | NULL | NULL | NULL | NULL | | 20 | 2 | NULL | NULL | NULL | NULL | | 21 | 3 | NULL | NULL | NULL | NULL | | 22 | 2 | NULL | NULL | NULL | NULL | | 23 | 2 | NULL | NULL | NULL | NULL | | 24 | 3 | NULL | NULL | NULL | NULL | | 25 | 2 | NULL | NULL | NULL | NULL | +----+-----------+------+------+------+------+ 24 rows in set (0.00 sec)
该视图为我们提供了这一点:
mysql> select * from parent_top_3; +----+------+------+------+-------+---------+---------+ | id | a | b | c | maxid | maxidm1 | maxidm2 | +----+------+------+------+-------+---------+---------+ | 1 | 1 | 1 | NULL | NULL | NULL | NULL | | 2 | 2 | 2 | NULL | 25 | 23 | 22 | | 3 | 3 | 3 | NULL | 24 | 21 | 18 | +----+------+------+------+-------+---------+---------+ 3 rows in set (0.21 sec)
该视图的解释计划仅是毛茸茸的:
mysql> explain select * from parent_top_3; +----+--------------------+------------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+------------+------+---------------+------+---------+------+------+-------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | | | 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 3 | | | 5 | DEPENDENT SUBQUERY | child | ALL | PRIMARY | NULL | NULL | NULL | 24 | Using where | | 4 | DEPENDENT SUBQUERY | child | ALL | PRIMARY | NULL | NULL | NULL | 24 | Using where | | 3 | DEPENDENT SUBQUERY | child | ALL | NULL | NULL | NULL | NULL | 24 | Using where | +----+--------------------+------------+------+---------------+------+---------+------+------+-------------+
但是,如果我们为parent_fks添加一个索引,它将得到更好的效果:
mysql> create index pid on child(parent_id); mysql> explain select * from parent_top_3; +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | | | 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 3 | | | 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 2 | Using where | | 4 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 2 | Using where | | 3 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 2 | Using where | +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ 5 rows in set (0.04 sec)
如上所述, 即使我们使用其主键索引到父级 ,当父级行数很少为100时,这种情况也开始崩溃:
mysql> select * from parent_top_3 where id < 10; +----+------+------+------+-------+---------+---------+ | id | a | b | c | maxid | maxidm1 | maxidm2 | +----+------+------+------+-------+---------+---------+ | 1 | 1 | 1 | NULL | NULL | NULL | NULL | | 2 | 2 | 2 | NULL | 25 | 23 | 22 | | 3 | 3 | 3 | NULL | 24 | 21 | 18 | | 4 | NULL | 1 | NULL | 65 | 64 | 63 | | 5 | NULL | 2 | NULL | 73 | 72 | 71 | | 6 | NULL | 3 | NULL | 113 | 112 | 111 | | 7 | NULL | 1 | NULL | 209 | 208 | 207 | | 8 | NULL | 2 | NULL | 401 | 400 | 399 | | 9 | NULL | 3 | NULL | 785 | 784 | 783 | +----+------+------+------+-------+---------+---------+ 9 rows in set (1 min 3.11 sec)
(请注意,我故意在速度较慢的计算机上进行测试,并将数据保存在速度较慢的闪存盘上。)
这是解释,正好寻找一个id(和第一个ID):
mysql> explain select * from parent_top_3 where id = 1; +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 1000 | Using where | | 2 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 1000 | | | 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 179 | Using where | | 4 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | util.a.id | 179 | Using where | | 3 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 179 | Using where | +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ 5 rows in set (56.01 sec)
即使在我的慢速机器上,连续超过56秒也无法接受两个数量级。
那么我们可以保存此查询吗?它 有效 ,但速度太慢。
这是修改后的查询的说明计划。看起来糟坏了:
mysql> explain select * from parent_top_3a where id = 1; +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 100 | Using where | | 2 | DERIVED | <derived4> | ALL | NULL | NULL | NULL | NULL | 100 | | | 4 | DERIVED | <derived6> | ALL | NULL | NULL | NULL | NULL | 100 | | | 6 | DERIVED | a | ALL | NULL | NULL | NULL | NULL | 100 | | | 7 | DEPENDENT SUBQUERY | child | ref | pid | pid | 5 | util.a.id | 179 | Using where | | 5 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | a.id | 179 | Using where | | 3 | DEPENDENT SUBQUERY | child | ref | PRIMARY,pid | pid | 5 | a.id | 179 | Using where | +----+--------------------+------------+------+---------------+------+---------+-----------+------+-------------+ 7 rows in set (0.05 sec)
但是,它能以每秒1/20秒的速度完成 三个 数量级的运算!
我们如何到达更快的parent_top_3a?我们创建 三个 视图,每个视图都依赖于前一个视图:
create view parent_top_1 as select a.*, (select max(id) from child where parent_id = a.id) as maxid from parent a; create view parent_top_2 as select a.*, (select max(id) from child where parent_id = a.id and id < a.maxid) as maxidm1 from parent_top_1 a; create view parent_top_3a as select a.*, (select max(id) from child where parent_id = a.id and id < a.maxidm1) as maxidm2 from parent_top_2 a;
这不仅可以更快地工作,而且在MySQL以外的RDBMS上是合法的。
让我们将父行的数量增加到12800,子行的数量增加到1536(大多数博客帖子没有评论,对吗?))
mysql> select * from parent_top_3a where id >= 20 and id < 40; +----+------+------+------+-------+---------+---------+ | id | a | b | c | maxid | maxidm1 | maxidm2 | +----+------+------+------+-------+---------+---------+ | 39 | NULL | 2 | NULL | NULL | NULL | NULL | | 38 | NULL | 1 | NULL | NULL | NULL | NULL | | 37 | NULL | 3 | NULL | NULL | NULL | NULL | | 36 | NULL | 2 | NULL | NULL | NULL | NULL | | 35 | NULL | 1 | NULL | NULL | NULL | NULL | | 34 | NULL | 3 | NULL | NULL | NULL | NULL | | 33 | NULL | 2 | NULL | NULL | NULL | NULL | | 32 | NULL | 1 | NULL | NULL | NULL | NULL | | 31 | NULL | 3 | NULL | NULL | NULL | NULL | | 30 | NULL | 2 | NULL | 1537 | 1536 | 1535 | | 29 | NULL | 1 | NULL | 1529 | 1528 | 1527 | | 28 | NULL | 3 | NULL | 1513 | 1512 | 1511 | | 27 | NULL | 2 | NULL | 1505 | 1504 | 1503 | | 26 | NULL | 1 | NULL | 1481 | 1480 | 1479 | | 25 | NULL | 3 | NULL | 1457 | 1456 | 1455 | | 24 | NULL | 2 | NULL | 1425 | 1424 | 1423 | | 23 | NULL | 1 | NULL | 1377 | 1376 | 1375 | | 22 | NULL | 3 | NULL | 1329 | 1328 | 1327 | | 21 | NULL | 2 | NULL | 1281 | 1280 | 1279 | | 20 | NULL | 1 | NULL | 1225 | 1224 | 1223 | +----+------+------+------+-------+---------+---------+ 20 rows in set (1.01 sec)
请注意,这些计时是针对MyIsam表的;我将它留给其他人在Innodb上进行计时。
但是使用Postgresql,在相似但不相同的数据集上,我们对where涉及parent的列的谓词获得了类似的计时:
where
postgres=# select (select count(*) from parent) as parent_count, (select count(*) from child) as child_count; parent_count | child_count --------------+------------- 12289 | 1536 postgres=# select * from parent_top_3a where id >= 20 and id < 40; id | a | b | c | maxid | maxidm1 | maxidm2 ----+---+----+---+-------+---------+--------- 20 | | 18 | | 1464 | 1462 | 1461 21 | | 88 | | 1463 | 1460 | 1457 22 | | 72 | | 1488 | 1486 | 1485 23 | | 13 | | 1512 | 1510 | 1509 24 | | 49 | | 1560 | 1558 | 1557 25 | | 92 | | 1559 | 1556 | 1553 26 | | 45 | | 1584 | 1582 | 1581 27 | | 37 | | 1608 | 1606 | 1605 28 | | 96 | | 1607 | 1604 | 1601 29 | | 90 | | 1632 | 1630 | 1629 30 | | 53 | | 1631 | 1628 | 1625 31 | | 57 | | | | 32 | | 64 | | | | 33 | | 79 | | | | 34 | | 37 | | | | 35 | | 60 | | | | 36 | | 75 | | | | 37 | | 34 | | | | 38 | | 87 | | | | 39 | | 43 | | | | (20 rows) Time: 91.139 ms