ruby

ActiveRecord performance: the N+1 queries antipattern

Jeff Kreeftmeijer

Jeff Kreeftmeijer on

ActiveRecord performance: the N+1 queries antipattern

At AppSignal we help developers with application performance. We're monitoring a huge number of apps that send billions of requests. We thought we could also help a bit with a few blogposts about Ruby and performance. The N+1 queries problem is a common antipattern in Rails applications.

A lot of ORMs, like Rails' ActiveRecord, have lazy loading built in to allow you to defer querying associations until the moment they're needed. It allows being implicit about which associations need to be loaded by offloading this decision to the view.

The N+1 queries problem is a common, but usually easy to spot, performance antipattern that results in running a query for each association, which causes overhead when querying a large number of associations from the database.

Lazy loading in ActiveRecord

ActiveRecord uses implicit lazy loading to make it easier to work with relations. Let's consider the webshop example, where each Product can have any number of Variants which contain the product's color or size, for example.

1# app/models/product.rb
2class Product < ActiveRecord::Base
3  has_many :variants
4end

In ProductsController#show, the detail view for one of the products, we'll use Product.find(params[:id]) to get the product and assign it to the @product variable.

1# app/controllers/products_controller.rb
2class ProductsController < ApplicationController
3  def show
4    @product = Product.find(params[:id])
5  end
6end

In the view for this action, we'll loop over the product's variants by calling the variants method on the @product variable we received from the controller.

1# app/views/products/show.html.erb
2<h1><%= @product.title %></h1>
3
4<ul>
5<%= @product.variants.each do |variant| %>
6  <li><%= variant.name %></li>
7<% end %>
8</ul>

By calling @product.variants in the view, Rails will query the database to get the variants for us to loop over. Aside from the explicit query we did in the controller, we can see another query is executed to fetch the variants if we check Rails' logs for this request.

1Started GET "/products/1" for 127.0.0.1 at 2018-04-19 08:49:13 +0200
2Processing by ProductsController#show as HTML
3  Parameters: {"id"=>"1"}
4  Product Load (1.1ms)  SELECT  "products".* FROM "products" WHERE "products"."id" = ? LIMIT ?  [["id", 1], ["LIMIT", 1]]
5  Rendering products/show.html.erb within layouts/application
6  Variant Load (1.1ms)  SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = ?  [["product_id", 1]]
7  Rendered products/show.html.erb within layouts/application (4.4ms)
8Completed 200 OK in 64ms (Views: 56.4ms | ActiveRecord: 2.3ms)

This request executed two queries to show a product with all of its variants.

  1. SELECT "products".* FROM "products" WHERE "products"."id" = 1 LIMIT 1
  2. SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = 1

Looped lazy loading

Lazy loading has been great so far. By using an implicit query, we don't have to remember to remove it from the controller when we decide we don't want to show the variants on this view anymore, for example.

Let's say we're working on ProductsController#index, where we'd like to show a list of all products with each of their variants. We can implement that with lazy loading the same way as we did before.

1# app/controllers/products_controller.rb
2class ProductsController < ApplicationController
3  def index
4    @products = Product.all
5  end
6end
1# app/views/products/index.html.erb
2<h1>Products</h1>
3
4<% @products.each do |product| %>
5<article>
6  <h1><%= product.title %></h1>
7
8  <ul>
9    <% product.variants.each do |variant| %>
10      <li><%= variant.description %></li>
11    <% end %>
12  </ul>
13</article>
14<% end %>

Unlike the first example we now get a list of products from the controller instead of a single one. The view then loops over each product, and lazy loads each variant for each product.

While this works, there is one catch. Our query count is now N+1.

N+1 queries

In the first example, we rendered a view for a single product and its variants. The query count was 2 because we executed two queries. This request returned all products (3, in this example) from the database, and each of their variants, and it did four queries instead of two.

1Started GET "/products" for 127.0.0.1 at 2018-04-19 09:49:02 +0200
2Processing by ProductsController#index as HTML
3  Rendering products/index.html.erb within layouts/application
4  Product Load (0.3ms)  SELECT "products".* FROM "products"
5  Variant Load (0.2ms)  SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = ?  [["product_id", 1]]
6  Variant Load (0.2ms)  SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = ?  [["product_id", 2]]
7  Variant Load (0.1ms)  SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = ?  [["product_id", 3]]
8  Rendered products/index.html.erb within layouts/application (5.6ms)
9Completed 200 OK in 36ms (Views: 32.6ms | ActiveRecord: 0.8ms)
  1. SELECT "products".* FROM "products"
  2. SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = 1
  3. SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = 2
  4. SELECT "variants".* FROM "variants" WHERE "variants"."product_id" = 3

The first query, which is executed by the explicit call to Product.all in the controller, finds all products. The subsequent ones are lazily executed while looping over each product in the view.

This example results in a query count of N+1, where N is the number of products, and the added one is the explicit query that fetched all products. In other words; this example does one query, and then another one for each of the results in the first query. Because N = 3 in this example, the resulting query count is N + 1 = 3 + 1 = 4.

While this might not really be a problem when having only three products, the query count goes up with the number of products. Because we know this request has N+1 queries, we can predict a query count of 101 when we have 100 products (N + 1 = 100 + 1 = 101), for example.

Eager loading associations

Instead of increasing the number of queries with the number of products like we do now, we'd like to have a static number of requests in this view. We can do that by explicitly preloading the variants in the controller before rendering the view.

1# app/controllers/products_controller.rb
2class ProductsController < ApplicationController
3  def index
4    @products = Product.all.includes(:variants)
5  end
6end

ActiveRecord's includes query method makes sure the associated variants are loaded with their products. Because it knows which variants need to be loaded beforehand, it can fetch all variants of all requested products in one query.

1Started GET "/products" for 127.0.0.1 at 2018-04-19 10:33:59 +0200
2Processing by ProductsController#index as HTML
3  Rendering products/index.html.erb within layouts/application
4  Product Load (0.3ms)  SELECT "products".* FROM "products"
5  Variant Load (0.4ms)  SELECT "variants".* FROM "variants" WHERE "variants"."product_id" IN (?, ?, ?)  [["product_id", 1], ["product_id", 2], ["product_id", 3]]
6  Rendered products/index.html.erb within layouts/application (5.9ms)
7  Completed 200 OK in 45ms (Views: 40.8ms | ActiveRecord: 0.7ms)

By preloading the variants, the query count drops back to 2, even if the number of products increases in the future.

  1. SELECT "products".* FROM "products"
  2. SELECT "variants".* FROM "variants" WHERE "variants"."product_id" IN (1, 2, 3)

Lazy or eager?

In most situations, getting all associated records from the database in a single query is a lot faster than lazy loading them.

In this example application, the database performance difference is measurable with only three products, each having ten variants. On average, eager loading the products list is about 12.5% faster (0.7 ms vs 0.8 ms) than lazy loading. With ten products, that difference jumps to 59% (1.22 ms vs 2.98 ms). With 1000 products, the difference is almost 80%, as the eager queries clock in at 58.4 ms, while lazy loading them takes about 290.12 ms.

Although lazily-loaded associations give more flexibility in the view without having to update the controller, a good rule of thumb is to have the controller handle loading the data before passing it off to the view.

Lazy loading from the view works for views that show one model object and it's associations (like the ProductsController#show in our first example) and can be useful when having multiple views that require different data from the same controller, for example.

Cats and Dolls

Cats might not agree, but sometimes it pays of to be eager rather than lazy. In this post we dove into the lazy loading in ActiveRecord and showed an example of the situations in which this can create a performance issue. Like when it leads to the N+1 queries problem.

In short: always keep an eye on the development logs, or the event timeline in AppSignal, to make sure you’re not doing queries that could be lazy loaded and keep track of your response times, especially when the amount of data that’s processed increases.

If you liked this, check out some more things we wrote on performance and monitoring, like this favorite about Russian Doll Caching or this one about Conditional Get Requests.

Share this article

RSS

AppSignal monitors your apps

AppSignal provides insights for Ruby, Rails, Elixir, Phoenix, Node.js, Express and many other frameworks and libraries. We are located in beautiful Amsterdam. We love stroopwafels. If you do too, let us know. We might send you some!

Discover AppSignal
AppSignal monitors your apps