You'll get the best performance if you drop the WHERE clause and place all conditions in the ON expression.
I think this is because the query first has to join the tables and then run the WHERE clause on the result, so anything that reduces what has to be joined is the fastest way to get the results or do the update.
Example
Scenario
You have a table of users. They can log in using their username, email or account_number. Each account is either active (1) or inactive (0). This table has 50000 rows.
You then have a table of users to disable in one go because you find out they've all done something bad. This table, however, has a single column in which usernames, emails and account numbers are mixed. It also has a "has_run" indicator, which needs to be set to 1 (true) once the row has been processed.
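To make the scenario concrete, here is a minimal sketch of the two tables. Only the table and column names come from the scenario and the query; the column types, lengths and the id keys are assumptions for illustration.

```sql
-- Hypothetical schema: types, lengths and id columns are assumed, not from the post.
CREATE TABLE users (
    id             INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, -- assumed key
    username       VARCHAR(64)  NOT NULL,
    email          VARCHAR(255) NOT NULL,
    account_number VARCHAR(32)  NOT NULL,          -- kept as text so it compares directly with account_ref
    is_active      TINYINT(1)   NOT NULL DEFAULT 1 -- 1 = active, 0 = inactive
);

CREATE TABLE blacklist_users (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, -- assumed key
    account_ref VARCHAR(255) NOT NULL,          -- a username, an email or an account number
    has_run     TINYINT(1)   NOT NULL DEFAULT 0 -- set to 1 once the row has been processed
);
```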
Query
UPDATE users User
INNER JOIN
blacklist_users BlacklistUser
ON
(
-- AND binds tighter than OR, so the OR group needs its own parentheses
(
User.username = BlacklistUser.account_ref
OR
User.email = BlacklistUser.account_ref
OR
User.account_number = BlacklistUser.account_ref
)
AND
User.is_active = 1
AND
BlacklistUser.has_run = 0
)
SET
User.is_active = 0,
BlacklistUser.has_run = 1;
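Before running an update like this, it can help to preview the rows it would touch. The following is a minimal sketch, assuming the column names from the scenario (with account_number as the third login field); the columns in the SELECT list are only illustrative.

```sql
-- Preview: same join conditions as the UPDATE, but read-only.
SELECT
    User.username,
    BlacklistUser.account_ref
FROM users User
INNER JOIN blacklist_users BlacklistUser
    ON (
        User.username = BlacklistUser.account_ref
        OR User.email = BlacklistUser.account_ref
        OR User.account_number = BlacklistUser.account_ref
    )
    AND User.is_active = 1
    AND BlacklistUser.has_run = 0;
```

The parentheses around the OR group matter: AND binds more tightly than OR, so without them the is_active and has_run checks would apply only to the last comparison.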
Reasoning
If we joined on just the OR conditions, the query would essentially need to check each pair of rows against all three comparisons to see if it should join, and could return a lot more rows. However, by giving it more conditions it can "skip" a lot of rows while joining if they don't meet all the conditions.
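One way to check this reasoning against your own data is to ask MySQL for the plan of both forms. This is a sketch, assuming MySQL 5.6.3 or later (where EXPLAIN accepts UPDATE statements) and the column names from the scenario; the WHERE variant is written here only for comparison. For an inner join the optimizer may well produce the same plan for both, so treat the output as the evidence.

```sql
-- Variant 1: every condition in the ON expression.
EXPLAIN
UPDATE users User
INNER JOIN blacklist_users BlacklistUser
    ON (
        User.username = BlacklistUser.account_ref
        OR User.email = BlacklistUser.account_ref
        OR User.account_number = BlacklistUser.account_ref
    )
    AND User.is_active = 1
    AND BlacklistUser.has_run = 0
SET
    User.is_active = 0,
    BlacklistUser.has_run = 1;

-- Variant 2: only the match in ON, the filters moved to WHERE.
EXPLAIN
UPDATE users User
INNER JOIN blacklist_users BlacklistUser
    ON (
        User.username = BlacklistUser.account_ref
        OR User.email = BlacklistUser.account_ref
        OR User.account_number = BlacklistUser.account_ref
    )
SET
    User.is_active = 0,
    BlacklistUser.has_run = 1
WHERE
    User.is_active = 1
    AND BlacklistUser.has_run = 0;
```

EXPLAIN only reports the plan and does not execute the update, so both statements are safe to run against live tables.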
Bonus
It's more readable. All the conditions are in one place and all the values to update are in another.