Solving N+1 queries without Memory Problems — Utilizing scopes in Rails
Killing N+1 queries is essential, however, eliminating them without thinking is even more dangerous.
This article will cover a few options that you can utilize when dealing with N+1 problems including:
- Utilizing Select and Scopes
- Utilizing Custom Relations for limited association problem
N+1 Query
N+1 query happens when you execute the N queries statement to fetch the same data that could have been fetched in the primary query.
Let’s say we have a simple model setup like
class User < ApplicationRecord
has_many :posts
endclass Post < ApplicationRecord
belongs_to :user
end
Thanks to the beauty and convenience of ActiveRecord, it’s very common that we try to iterate users and fetch their posts in the loop which leads to the N+1 problem.
@users.each do |users|
users.posts.size
end
Includes is great but not enough
ActiveRecord provides a great method — includes
which will either preload or eager load associations in a smart way and prevents N+1 problems. You can learn more from this article — Preload, Eagerload, Includes and Joins.
In our case, since we are not querying associations, includes
try to preload
all the posts in a separate query by all users.id
from the primary query.
@users.includes(:posts).each do |users|
users.posts.size
end
N+1 Problem resolved!
However, if we look deep into the query, you’ll notice that we are loading ALL posts in memory.
Imagine you have 100 users and 100 posts per user, you are loading 10000 objects that you don’t even need!
It’s always good to keep in mind what you really needed when fetching data from the database. N+1 is bad, yet unnecessary memory allocation is worst.
Utilizing Select and Scopes
A SQL way to avoid running an extra query per post is to join the posts to count the associated posts in a single query.
class User < ActiveRecord::Base
scope :with_posts_count, -> {
select(
"users.*,
(
select count(*)
from posts
where users.id = posts.user_id
)
as posts_count"
)
.group("users.id")
}
end
This solution prevents extra queries by making the first query slower and more complex. Though the query is slower, it’s still a good direction for optimization if the query plan is healthy.
Remember, you should always check the query plan when you try to build complex queries.
@users.with_posts_count.map do |user|
user.posts_count
end
Noted that Counter cache is another great approach to speed up in this case.
However, it needs an additional column and it doesn’t work with conditions. It’s a space and time trade-off, just like most programming problem, the best solution depends on your use case.
Limited Associations Problems
What if we want to load only the latest post for every user? Is there a way to do it in a native approach?
The good news is we are able to define custom relations in your model with different conditions by passing lambda.
class User < ApplicationRecord
has_many :posts
has_one :latest_post, -> { order(created_at: :desc) },
class_name: "Post"
end
You are able to get the latest post by calling lastest_post
.
user.latest_post
Looks like it also works with includes
…?
@users.includes(:latest_post).each do |user|
user.latest_post
end
No…we run into the same memory issue again — loading all the posts without limit instead of 1 post per user!
You might potentially load ALL the posts without notice. If you’re dealing with large data, sometimes it’s even slower than loading them by N+1 query.
Utilizing Custom Relation
From the count example, we know that we can query more things in the SELECT statement.
The best thing is, we are also able to combine the result and pass it to our custom relation by specifying primary_key
.
class User < ApplicationRecord
has_one :latest_post,
class_name: "Post",
primary_key: :latest_post_id,
foreign_key: :id scope :with_latest_post, lambda {
select(
"users.*,
(
SELECT id as latest_post_id
FROM posts
WHERE user_id = users.id
ORDER BY created_at desc
LIMIT 1
)"
)
}
end
Then we can chain them together to eager load limited associations in a native and graceful way!
@users.with_latest_post.includes(:latest_post).map do |user|
user.latest_post
end
Wrap up
ActiveRecord is so great that we often forget the basics of SQL queries and how powerful the tool is.
By reviewing the SQL and all the options ActiveRecord provides, we are very likely to solve N+1 problems without memory bloating in a relatively graceful way.
References
The featured image is from https://shadowmyst.net/active-record-sin-rails/