Use Status.group instead of Status.distinct in HashtagQueryService (#14662)
author Akihiko Odaki <nekomanma@pixiv.co.jp>
Tue, 25 Aug 2020 11:39:35 +0000 (20:39 +0900)
committer GitHub <noreply@github.com>
Tue, 25 Aug 2020 11:39:35 +0000 (13:39 +0200)
The DISTINCT clause removes duplicate records by comparing all of the
selected attributes. In practice, duplicates could be removed by looking
only at statuses.id, but the clause confuses the query planner and results
in poor performance.

The behavior is also problematic if the scope produced by HashtagQueryService
is used to query columns without id (using the pluck method, for example). The
scope is expected to contain unique statuses, but the uniqueness will be
evaluated on some arbitrary columns other than id.

The GROUP BY clause resolves those problems by explicitly specifying the
column to take into account when distinguishing records.
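
As a rough illustration of the difference, the sketch below uses plain Ruby
arrays in place of SQL. The rows, the Row struct, and both helper methods are
hypothetical and not part of this commit; they only mimic how DISTINCT and
GROUP BY id would treat rows duplicated by a join.

```ruby
# Hypothetical rows, as a join of statuses with media_attachments might
# return them: status 1 has two attachments, so it appears twice.
Row = Struct.new(:id, :account_id)

joined = [
  Row.new(1, 10), # status 1, first attachment
  Row.new(1, 10), # status 1, second attachment
  Row.new(2, 10), # status 2, same account as status 1
  Row.new(3, 20)
]

# Mimics SELECT DISTINCT <columns>: deduplicates on the selected columns only.
def select_distinct(rows, *columns)
  rows.map { |row| columns.map { |column| row[column] } }.uniq
end

# Mimics SELECT <columns> ... GROUP BY id: one row per id, whatever is selected.
def select_grouped_by_id(rows, *columns)
  rows.group_by(&:id).values.map do |group|
    columns.map { |column| group.first[column] }
  end
end

distinct_ids      = select_distinct(joined, :id).flatten
distinct_accounts = select_distinct(joined, :account_id).flatten
grouped_accounts  = select_grouped_by_id(joined, :account_id).flatten

p distinct_ids      # one entry per status
p distinct_accounts # statuses 1 and 2 collapse because they share an account
p grouped_accounts  # still one entry per status
```

In this sketch, DISTINCT over account_id yields two values for three statuses,
while grouping by id keeps one value per status.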

The workaround for the DISTINCT clause problem in
Api::V1::Timelines::TagController is no longer necessary and has been removed.

app/controllers/api/v1/timelines/tag_controller.rb
app/services/hashtag_query_service.rb

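diff --git a/app/controllers/api/v1/timelines/tag_controller.rb b/app/controllers/api/v1/timelines/tag_controller.rb
index 2d6ad5a80c5bc3fc09ec8020cf112027cdf3c6ab..62f34d3f74c2ea58db2829753482993e82f0384e 100644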
@@ -33,9 +33,7 @@ class Api::V1::Timelines::TagController < Api::BaseController
       )
 
       if truthy_param?(:only_media)
-        # `SELECT DISTINCT id, updated_at` is too slow, so pluck ids at first, and then select id, updated_at with ids.
-        status_ids = statuses.joins(:media_attachments).distinct(:id).pluck(:id)
-        statuses.where(id: status_ids)
+        statuses.joins(:media_attachments)
       else
         statuses
       end
diff --git a/app/services/hashtag_query_service.rb b/app/services/hashtag_query_service.rb
index 196de0639205b20596ea5968210234ab390c5b08..0bdf60221034b2e1b3724839fbe67e64249dbe04 100644
@@ -8,7 +8,7 @@ class HashtagQueryService < BaseService
     all  = tags_for(params[:all])
     none = tags_for(params[:none])
 
-    Status.distinct
+    Status.group(:id)
           .as_tag_timeline(tags, account, local)
           .tagged_with_all(all)
           .tagged_with_none(none)