With the Django ORM, is it possible to do something like queryset.objects.annotate(Count('queryset_objects', gte=VALUE))? Catch my drift?
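Here is a quick example to illustrate what a possible answer would need to handle:
In a Django website, content creators submit articles, and regular users view (i.e. read) those articles. An article can either be published (i.e. available for everyone to read) or kept in draft mode. The models describing these requirements are: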
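    from django.contrib.auth.models import User
    from django.db import models


    class Article(models.Model):
        # on_delete is a required argument from Django 2.0 onward
        author = models.ForeignKey(User, on_delete=models.CASCADE)
        published = models.BooleanField(default=False)


    class Readership(models.Model):
        reader = models.ForeignKey(User, on_delete=models.CASCADE)
        which_article = models.ForeignKey(Article, on_delete=models.CASCADE)
        what_time = models.DateTimeField(auto_now_add=True)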
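My question is: how can I get all published articles, sorted by unique readership over the last 30 minutes? In other words, I want to count how many distinct (unique) views each published article received in the last half hour, and then produce a list of articles sorted by those distinct views.
I tried: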
    date = datetime.now() - timedelta(minutes=30)
    articles = Article.objects.filter(published=True).extra(select={
        "views": """
            SELECT COUNT(*)
            FROM myapp_readership
            JOIN myapp_article ON myapp_readership.which_article_id = myapp_article.id
            WHERE myapp_readership.reader_id = myapp_user.id
            AND myapp_readership.what_time > %s
        """ % date,
    }).order_by("-views")
This produces the error: syntax error at or near "01" (where "01" comes from the datetime object inside extra). That is not much to go on.
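One likely cause of that error can be sketched outside Django with the standard-library sqlite3 module: formatting the datetime into the SQL text with % pastes it in without quotes, so the database sees loose tokens rather than a timestamp literal. The `readership` table and the helper names `run_broken` and `run_bound` are made up for this illustration, and SQLite words its error differently from PostgreSQL.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readership (reader_id INTEGER, what_time TEXT)")

# Fixed "now" so the example is reproducible (hypothetical data).
now = datetime(2015, 11, 18, 11, 34, 0)
conn.execute("INSERT INTO readership VALUES (1, ?)", (str(now - timedelta(minutes=5)),))
conn.execute("INSERT INTO readership VALUES (2, ?)", (str(now - timedelta(hours=2)),))

threshold = now - timedelta(minutes=30)

# String formatting: the datetime ends up in the SQL text unquoted.
broken_sql = "SELECT COUNT(*) FROM readership WHERE what_time > %s" % threshold


def run_broken():
    """Return the database error message, or None if the query ran."""
    try:
        conn.execute(broken_sql)
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None


def run_bound():
    """Same query, but the value is passed as a bound parameter."""
    row = conn.execute(
        "SELECT COUNT(*) FROM readership WHERE what_time > ?",
        (str(threshold),),
    ).fetchone()
    return row[0]


print(broken_sql)
print(run_broken())
print(run_bound())
```

In Django, extra() accepts a select_params argument for the same purpose, so the value never has to be formatted into the SQL string by hand.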
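Use conditional aggregation:

    from datetime import timedelta

    from django.db.models import Case, Count, IntegerField, When
    from django.utils import timezone

    # Only views newer than this moment count as recent.
    threshold = timezone.now() - timedelta(minutes=30)

    Article.objects.annotate(
        numviews=Count(Case(
            When(readership__what_time__gt=threshold, then=1),
            output_field=IntegerField(),
        ))
    )

Explanation: the normal query over your articles is annotated with a numviews field. That field is built as a CASE / WHEN expression wrapped in Count. It returns 1 for readership rows that match the condition and NULL for rows that do not. Count ignores NULLs and counts only actual values.
Articles that have not been viewed recently get zero, and you can use the numviews field for sorting and filtering.
The query behind this on PostgreSQL will be:

    SELECT
        "app_article"."id",
        "app_article"."author_id",
        "app_article"."published",
        COUNT(
            CASE WHEN "app_readership"."what_time" > '2015-11-18 11:04:00.000000+01:00' THEN 1
            ELSE NULL END
        ) AS "numviews"
    FROM "app_article"
    LEFT OUTER JOIN "app_readership"
        ON ("app_article"."id" = "app_readership"."which_article_id")
    GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"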
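To see why COUNT over a CASE expression yields zero instead of dropping unviewed articles, here is a small sketch that runs the same SQL shape against an in-memory SQLite database. The simplified table names, the sample rows, and the extra allviews column are assumptions made for the illustration; they are not part of the schema Django would create.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY);
    CREATE TABLE readership (
        which_article_id INTEGER,
        reader_id INTEGER,
        what_time TEXT
    );
    INSERT INTO article VALUES (1), (2), (3);
    INSERT INTO readership VALUES
        (1, 10, '2015-11-18 11:20:00'),
        (1, 11, '2015-11-18 09:00:00'),
        (2, 10, '2015-11-18 08:00:00');
""")

threshold = "2015-11-18 11:04:00"

# numviews counts only rows where the CASE produced a non-NULL value;
# allviews counts every joined readership row, for comparison.
rows = conn.execute(
    """
    SELECT a.id,
           COUNT(CASE WHEN r.what_time > ? THEN 1 ELSE NULL END) AS numviews,
           COUNT(r.reader_id) AS allviews
    FROM article a
    LEFT OUTER JOIN readership r ON a.id = r.which_article_id
    GROUP BY a.id
    ORDER BY a.id
    """,
    (threshold,),
).fetchall()

for article_id, numviews, allviews in rows:
    print(article_id, numviews, allviews)
```

Article 3 has no readership rows at all, yet the LEFT OUTER JOIN keeps it in the result with a count of zero.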
If we want to count only unique readers, we can add distinct to Count and make the When clause return the value we want to be distinct on.

    from django.db.models import Case, CharField, Count, F, When

    # threshold is the datetime 30 minutes ago; only newer views are counted.
    Article.objects.annotate(
        numviews=Count(Case(
            # `readership__reader_id` works here too; it makes no difference
            When(readership__what_time__gt=threshold, then=F('readership__reader')),
            output_field=CharField(),
        ), distinct=True)
    )

This will produce:

    SELECT
        "app_article"."id",
        "app_article"."author_id",
        "app_article"."published",
        COUNT(
            DISTINCT CASE WHEN "app_readership"."what_time" > '2015-11-18 11:04:00.000000+01:00' THEN "app_readership"."reader_id"
            ELSE NULL END
        ) AS "numviews"
    FROM "app_article"
    LEFT OUTER JOIN "app_readership"
        ON ("app_article"."id" = "app_readership"."which_article_id")
    GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
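To connect this back to the original question (published articles only, sorted by unique recent readers), the following sketch adds a WHERE clause and an ORDER BY to the distinct query and runs it on an in-memory SQLite database. Table names, sample rows, and the `ranking` variable are assumptions for the illustration. On the ORM side this should correspond to chaining .filter(published=True) before the annotation and .order_by('-numviews') after it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, published INTEGER);
    CREATE TABLE readership (
        which_article_id INTEGER,
        reader_id INTEGER,
        what_time TEXT
    );
    INSERT INTO article VALUES (1, 1), (2, 1), (3, 0), (4, 1);
    INSERT INTO readership VALUES
        (1, 10, '2015-11-18 11:20:00'),
        (1, 10, '2015-11-18 11:25:00'),
        (1, 11, '2015-11-18 11:30:00'),
        (1, 12, '2015-11-18 09:00:00'),
        (2, 10, '2015-11-18 11:10:00'),
        (2, 10, '2015-11-18 11:15:00'),
        (3, 10, '2015-11-18 11:20:00'),
        (3, 11, '2015-11-18 11:21:00'),
        (3, 12, '2015-11-18 11:22:00');
""")

threshold = "2015-11-18 11:04:00"

# DISTINCT makes repeat visits by the same reader count once;
# the WHERE clause drops drafts; ORDER BY ranks by unique recent readers.
ranking = conn.execute(
    """
    SELECT a.id,
           COUNT(DISTINCT CASE WHEN r.what_time > ? THEN r.reader_id
                               ELSE NULL END) AS numviews
    FROM article a
    LEFT OUTER JOIN readership r ON a.id = r.which_article_id
    WHERE a.published = 1
    GROUP BY a.id
    ORDER BY numviews DESC, a.id
    """,
    (threshold,),
).fetchall()

for article_id, numviews in ranking:
    print(article_id, numviews)
```

Article 3 is a draft and is excluded despite having three recent readers, while article 4 is published but unread and appears last with zero.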
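If conditional expressions are not available in your Django version (they were added in Django 1.8), you can simply use raw to execute the SQL statement that newer versions generate. Apparently there is no simple, optimized way to query this data without raw (even with extra, injecting the required JOIN clause is problematic).

    # threshold is the datetime 30 minutes ago, passed as a bound parameter
    # rather than formatted into the SQL string.
    Article.objects.raw(
        '''
        SELECT
            "app_article"."id",
            "app_article"."author_id",
            "app_article"."published",
            COUNT(
                DISTINCT CASE WHEN "app_readership"."what_time" > %s THEN "app_readership"."reader_id"
                ELSE NULL END
            ) AS "numviews"
        FROM "app_article"
        LEFT OUTER JOIN "app_readership"
            ON ("app_article"."id" = "app_readership"."which_article_id")
        GROUP BY "app_article"."id", "app_article"."author_id", "app_article"."published"
        ''',
        [threshold],
    )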