Thursday, January 22, 2009

AR constructs chained named scope queries in the order opposite what you'd expect

Here's a good one. Consider this chain of named scopes:

Conversation.in_the_last(1.day).
with_question.
not_error.
not_canceled.
not_flagged.
not_answered.
not_involving(user)


You'd expect the created_at restriction to come first in the resulting SELECT, wouldn't you? Nope:

SELECT * FROM `conversations` WHERE (((((((NOT EXISTS (SELECT chs3.id FROM channels chs3 WHERE conversations.id = chs3.conversation_id AND chs3.user_id = 176)) AND (NOT EXISTS (SELECT chs2.id FROM channels chs2 WHERE conversations.id = chs2.conversation_id AND chs2.answer_request_response = 'answer'))) AND (NOT EXISTS (SELECT chs1.id FROM channels chs1 WHERE conversations.id = chs1.conversation_id AND chs1.flagged = 1))) AND (`conversations`.`canceled` = 0 )) AND (`conversations`.`error` = 0 )) AND (EXISTS (SELECT chs4.id FROM channels chs4 WHERE conversations.id = chs4.conversation_id AND chs4.asker = 1 AND chs4.has_question = 1))) AND (created_at > '2009-01-21 16:38:23'));

AR sticks it at the end of the query and as a result it runs in 6-10 seconds on acceptance (w/no query cache). Applying the created_at restriction at the end of the chain constructs a query with it at the beginning:

Conversation.with_question.
not_error.
not_canceled.
not_flagged.
not_answered.
not_involving(user).
in_the_last(1.day)


SELECT * FROM `conversations` WHERE (((((((created_at > '2009-01-21 18:07:29') AND (NOT EXISTS (SELECT chs3.id FROM channels chs3 WHERE conversations.id = chs3.conversation_id AND chs3.user_id = 176))) AND (NOT EXISTS (SELECT chs2.id FROM channels chs2 WHERE conversations.id = chs2.conversation_id AND chs2.answer_request_response = 'answer'))) AND (NOT EXISTS (SELECT chs1.id FROM channels chs1 WHERE conversations.id = chs1.conversation_id AND chs1.flagged = 1))) AND (`conversations`.`canceled` = 0 )) AND (`conversations`.`error` = 0 )) AND (EXISTS (SELECT chs4.id FROM channels chs4 WHERE conversations.id = chs4.conversation_id AND chs4.asker = 1 AND chs4.has_question = 1)))

Apparently it can now use the created_at to eliminate rows in the scan. The resulting query time is about 300ms on acceptance (again, w/no query cache).

No comments:

Post a Comment