ActiveRecord is Ruby on Rails’ most magical feature. We don’t usually need to worry about its inner workings, but when we do, here’s how AppSignal can help us know what’s going on under the hood.
What Is ActiveRecord?
To talk about ActiveRecord, we need to first think of frameworks, specifically about MVC frameworks. MVC stands for Model-View-Controller, and it’s a popular software design pattern for graphical and web applications.
MVC frameworks are composed of:
- Model: handles business logic and data persistence.
- View: drives the presentation layer and draws the user interface.
- Controller: ties everything together.
ActiveRecord is the model component in the Ruby in Rails framework. It introduces an abstraction layer between code and data, so we don’t have to write SQL code ourselves. Each model is mapped to one table and provides various methods to perform CRUD operations (Create, Read, Update and Delete).
Monitoring ActiveRecord With AppSignal
Abstractions feel magical — they help us ignore details we don’t need to know and focus on the task at hand. But when things don’t work as expected, the complexity they add can make it harder to determine the root cause. AppSignal can give us a detailed breakdown of what’s really happening in Rails.
The Response Time Graph
Let’s get a first look into the problem. Troubleshooting performance issues is an iterative process. Once you have your Rails application reporting to AppSignal, go to Incidents > Performance.
The Response Time graph will show the response time percentiles for each namespace. ActiveRecord events are automatically assigned to the namespace the request or background job the queries are executed in.
Next, look at the Event Group graph. It shows how much time is consumed by category. Check how much relative time is spent by active_record. Check the usage in all your namespaces.
The graph will immediately tell us where we should focus our code-optimizing efforts.
While you’re in the performance graph dashboard, check the response time and the throughput to ensure there isn’t any higher-than-usual activity on your application.
The Slow Queries Dashboard
Now that we’ve identified that the problem is data-bound let’s see if we can zoom in to determine the root cause.
Open the Improve > Slow Queries dashboard. This page shows the list of SQL queries ranked by impact on the overall time. All ActiveRecord-originated queries are shown as a sql.active_record events.
Try clicking on the topmost query to see its details. The dashboard shows the average duration and the query text.
Scrolling below will show you the query’s response time in the last few hours and the originating action.
You might find that some of the actions have an associated incident. This means that AppSignal created a performance incident while the query was running, but it doensn’t necessarily mean that ActiveRecord is the cause of it.
Performance Measurements Dashboard
Performance measurement incidents are opened when AppSignal records a new endpoint or background job.
Incidents are located on Performance > Issue List dashboard.
The incident page shows the elapsed time and number of allocations for each of the MVC components. ActiveRecord problems will exhibit long durations in the active_record category.
The event timeline shows how the event progressed over time.
Finding ActiveRecord Issues
In this section, we’ll see how AppSignal can help us identify some common ActiveError issues.
Selecting the Relevant Columns
Database wisdom says that we should always retrieve the columns needed for the job. For example, instead of SELECT * FROM people we should SELECT first_name, surname, birthdate FROM people. That’s all well and good, but how do we do it on Rails?
By default, ActiveRecord retrieves all columns.
1Person.all.each {
2      # process data
3}
Luckily, we have the select method to pick and choose the required columns:
1Person.select(:name, :address, :birthdate).each {
2      # process data
3}
It may sound like I’m being obsessive about little details. But on wide tables, selecting all columns is just wasteful. You’ll notice that when this happens, ActiveRecord allocates big chunks of memory:
N+1 Problems
The N+1 problem happens when the application gets a set of records from the database and loops through it. This causes the application to execute N+1 queries, where N is the number of rows initially obtained. As you might imagine, this pattern scales poorly as the table grows. It’s such a damaging problem that AppSignal specifically warns you about it:
N+1 problems usually appear with associated models. Imagine we have a Person model:
1class Person < ApplicationRecord
2    has_many :addresses
3endEach person can have many addresses:
1class Address < ApplicationRecord
2    belongs_to :person
3endThe most straightforward way of retrieving the data leads to the N+1 problem:
1class RelatedTablesController < ApplicationController
2    def index
3        Person.all.each do |person|
4            person.addresses.each do |address|
5            address.address
6            end
7        end
8    end
9endYou can see in AppSignal that the application is running a SELECT per person:
The fix for this particular case is simple: use includes, which tells ActiveRecord to optimize the query for the needed columns:
1class RelatedTablesController < ApplicationController
2    def index
3        Person.all.includes(:addresses).each do |person|
4            person.addresses.each do |address|
5            address.address
6            end
7        end
8    end
9endNow we have two queries instead of N+1:
1Processing by RelatedTablesController#index as HTML
2   (0.2ms)  SELECT sqlite_version(*)
3  ↳ app/controllers/related_tables_controller.rb:12:in `index'
4  Person Load (334.6ms)  SELECT "people".* FROM "people"
5  ↳ app/controllers/related_tables_controller.rb:12:in `index'
6
7  Address Load (144.4ms)  SELECT "addresses".* FROM "addresses" WHERE "addresses"."person_id" IN (1, 2, 3, . . .)
Execute the Least Necessary Queries Per Table
This is sometimes confused with the N+1 problem, but it’s a bit different. When we query a table, we should retrieve all the data we think we’ll need to minimize read operations. There is, however, a lot of innocent-looking code that triggers redundant queries. For example, look how count always results in a SELECT COUNT(*) query in the following view:
1<ul>
2    <% @people.each do |person| %>
3        <li><%= person.name %></li>
4    <% end %>
5</ul>
6
7<h2>Number of Persons: <%= @people.count %></h2>Now ActiveRecord does two queries:
1Rendering duplicated_table_query/index.html.erb within layouts/application
2  (69.1ms)  SELECT COUNT(*) FROM "people" WHERE "people"."name" = ?  [["name", "John Waters"]]
3↳ app/views/duplicated_table_query/index.html.erb:3
4Person Load (14.6ms)  SELECT "people".* FROM "people" WHERE "people"."name" = ?  [["name", "John Waters"]]
5↳ app/views/duplicated_table_query/index.html.erb:6
In AppSignal, the symptom you’ll notice is that there are two active_record events on the same table:
The reality is that we don’t need two queries; we already have all the data we need in memory. In this case, the solution is to swap count with size:
1<ul>
2<% @people.each do |person| %>
3<li><%= person.name %></li>
4<% end %>
5</ul>
6
7<h2>Number of Persons: <%= @people.size %></h2>Now we have a single SELECT, as it should be:
1Rendering duplicated_table_query/index.html.erb within layouts/application
2Person Load (63.2ms)  SELECT "people".* FROM "people" WHERE "people"."name" = ?  [["name", "Abdul Strosin"]]
3↳ app/views/duplicated_table_query/index.html.erb:5
Another solution is to use preload to cache the data in memory.
Computing Aggregated Data in Rails
Aggregation is used to compute a value based on a set of data. Databases are great at working with big datasets. This is what they do and what we use them for. On the other hand, using Rails to aggregate doesn't scale since it requires getting all the records from the database, holding them in memory, and then calculating using high-level code.
We’re doing aggregation in Rails whenever we resort to Ruby functions like max, min, or sum over ActiveRecord elements or other enumerables.
1class AggregatedColumnsController < ApplicationController
2    def index
3        @mean = Number.pluck(:number).sum()
4    end
5endFortunately, ActiveRecord models include specific methods that map to aggregation functions in the database. For example, the following query maps to SELECT SUM(number) FROM ..., which is much faster and cheaper to run than the previous example:
1# controller
2
3class AggregatedColumnsController < ApplicationController
4    def index
5        @mean = Number.sum(:number)
6    end
7end1Processing by AggregatedColumnsController#index as */*
2  (2.4ms)  SELECT SUM("numbers"."number") FROM "numbers"
If you need more complex or combined aggregation functions, you may need to include a bit of raw SQL code:
1sql = "SELECT AVG(number), STDDEV(number), VAR(number) FROM ..."
2@results = ActiveRecord::Base.connection.execute(sql)Managing Big Transactions
SQL transactions ensure consistent and atomic updates. When we use a transaction to make a change, either every row is updated successfully, or the whole thing is rolled back. In any case, the database always remains in a consistent state.
We can bundle a batch of changes in a single transaction with ActiveRecord::Base.transaction.
1class BigTransactionController < ApplicationController
2    def index
3        ActiveRecord::Base.transaction do
4            (1..1000).each do
5                Person.create(name: 'Les Claypool')
6            end
7        end
8    end
9endThere are a lot of legitimate cases for using large transactions. These, however, run into the risk of slowing the database down. Besides, transactions exceeding certain thresholds will result in the database rejecting them.
The first sign of transactions being too big is spending a lot of time on commit transaction events:
Barring configuration issues on the database itself, the solution is to break down the transaction into smaller chunks.
Monitoring The Database
Sometimes even though we search the code, we can’t find anything wrong with it. Then, there may be a problem in the database itself. Database engines are complex, and many things can go wrong: low memory, default settings, missing indexes, inconveniently-scheduled backup jobs. The picture can’t be complete unless we get information about the machine running the database.
If we’re running our own databases, we can install the standalone agent to capture host metrics. To learn more about how to use the standalone agent, read: Monitoring Any System with StatsD and AppSignal’s Standalone Agent.
The following signs might show something is going on with the database. Go to the Inspect > Host Metrics dashboard to see the resource usage in your server:
- High memory usage: databases need a lot of memory — more than most other systems — to run correctly. As the dataset grows, memory requirements usually scale along. We’ll need to either add more memory or split the dataset among different machines from time to time.
- Swap usage: ideally, database machines should not need swap memory at all. Swapping kills database performance. Its presence means there’s a more in-depth configuration or memory problem.
- High I/O usage: peaks in disk activity can be due to maintenance tasks such as re-indexing or backing up the database. Doing these tasks during peak hours will definitely hit performance.
Conclusion
Diagnosing performance issues is never an easy task. Today, we’ve learned how to use Appsignal to quickly pinpoint the source of the problem.
Let’s keep learning about AppSignal and Ruby on Rails:
- ActiveRecord performance: the N+1 queries antipattern
- Rails is Fast: Optimize Your View Performance
- Introduction to Ruby on Rails Patterns and Anti-patterns
P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!
