summaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
authorTim Graham <timograham@gmail.com>2015-10-29 18:43:53 -0400
committerTim Graham <timograham@gmail.com>2015-11-12 13:22:13 -0500
commit16865782d7d92c863bcb62e2e91400dd8af79c5b (patch)
tree645b48ba8890fb278be62a7c2eceff63fcedef5b /docs
parent5bb40e7608c45b35af5868d190a44462ec88d2c0 (diff)
[1.9.x] Fixed #10045 -- Corrected docs about .annotate()/.filter() ordering.
Thanks Josh, Anssi, and Carl for reviews and advice. Backport of 91a431f48c1fc5ecc9a837e8071a0062d31b490f from master
Diffstat (limited to 'docs')
-rw-r--r--docs/topics/db/aggregation.txt76
1 files changed, 58 insertions, 18 deletions
diff --git a/docs/topics/db/aggregation.txt b/docs/topics/db/aggregation.txt
index 65a1eb7ee6..f34648124e 100644
--- a/docs/topics/db/aggregation.txt
+++ b/docs/topics/db/aggregation.txt
@@ -184,6 +184,8 @@ of the ``annotate()`` clause is a ``QuerySet``; this ``QuerySet`` can be
modified using any other ``QuerySet`` operation, including ``filter()``,
``order_by()``, or even additional calls to ``annotate()``.
+.. _combining-multiple-aggregations:
+
Combining multiple aggregations
-------------------------------
@@ -340,29 +342,67 @@ Order of ``annotate()`` and ``filter()`` clauses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When developing a complex query that involves both ``annotate()`` and
-``filter()`` clauses, particular attention should be paid to the order
-in which the clauses are applied to the ``QuerySet``.
+``filter()`` clauses, pay particular attention to the order in which the
+clauses are applied to the ``QuerySet``.
+
+When an ``annotate()`` clause is applied to a query, the annotation is computed
+over the state of the query up to the point where the annotation is requested.
+The practical implication of this is that ``filter()`` and ``annotate()`` are
+not commutative operations.
+
+Given:
+
+* Publisher A has two books with ratings 4 and 5.
+* Publisher B has two books with ratings 1 and 4.
+* Publisher C has one book with rating 1.
+
+Here's an example with the ``Count`` aggregate::
+
+ >>> a, b = Publisher.objects.annotate(num_books=Count('book', distinct=True)).filter(book__rating__gt=3.0)
+ >>> a, a.num_books
+ (<Publisher: A>, 2)
+ >>> b, b.num_books
+ (<Publisher: B>, 2)
+
+ >>> a, b = Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book'))
+ >>> a, a.num_books
+ (<Publisher: A>, 2)
+ >>> b, b.num_books
+ (<Publisher: B>, 1)
+
+Both queries return a list of publishers that have at least one book with a
+rating exceeding 3.0, hence publisher C is excluded.
+
+In the first query, the annotation precedes the filter, so the filter has no
+effect on the annotation. ``distinct=True`` is required to avoid a
+:ref:`cross-join bug <combining-multiple-aggregations>`.
+
+The second query counts the number of books that have a rating exceeding 3.0
+for each publisher. The filter precedes the annotation, so the filter
+constrains the objects considered when calculating the annotation.
-When an ``annotate()`` clause is applied to a query, the annotation is
-computed over the state of the query up to the point where the annotation
-is requested. The practical implication of this is that ``filter()`` and
-``annotate()`` are not commutative operations -- that is, there is a
-difference between the query::
+Here's another example with the ``Avg`` aggregate::
- >>> Publisher.objects.annotate(num_books=Count('book')).filter(book__rating__gt=3.0)
+ >>> a, b = Publisher.objects.annotate(avg_rating=Avg('book__rating')).filter(book__rating__gt=3.0)
+ >>> a, a.avg_rating
+ (<Publisher: A>, 4.5) # (5+4)/2
+ >>> b, b.avg_rating
+ (<Publisher: B>, 2.5) # (1+4)/2
-and the query::
+ >>> a, b = Publisher.objects.filter(book__rating__gt=3.0).annotate(avg_rating=Avg('book__rating'))
+ >>> a, a.avg_rating
+ (<Publisher: A>, 4.5) # (5+4)/2
+ >>> b, b.avg_rating
+ (<Publisher: B>, 4.0) # 4/1 (book with rating 1 excluded)
- >>> Publisher.objects.filter(book__rating__gt=3.0).annotate(num_books=Count('book'))
+The first query asks for the average rating of all a publisher's books for
+publisher's that have at least one book with a rating exceeding 3.0. The second
+query asks for the average of a publisher's book's ratings for only those
+ratings exceeding 3.0.
-Both queries will return a list of publishers that have at least one good
-book (i.e., a book with a rating exceeding 3.0). However, the annotation in
-the first query will provide the total number of all books published by the
-publisher; the second query will only include good books in the annotated
-count. In the first query, the annotation precedes the filter, so the
-filter has no effect on the annotation. In the second query, the filter
-precedes the annotation, and as a result, the filter constrains the objects
-considered when calculating the annotation.
+It's difficult to intuit how the ORM will translate complex querysets into SQL
+queries so when in doubt, inspect the SQL with ``str(queryset.query)`` and
+write plenty of tests.
``order_by()``
--------------