In today's post, we'll cover the significant differences and benefits of using structure.sql
versus the default schema.rb
schema formats in your Ruby on Rails application. In a data-driven world, knowing how to exploit all of your database's rich features can make the difference between a successful and unsuccessful enterprise.
After evincing the main differences between the two formats, we'll outline how to switch to structure.sql
and demonstrate how it can help with ensuring data integrity as well as database functionality that you might otherwise not be able to preserve.
In the post, I'll give examples of a Rails app that makes use of structure.sql
with a PostgreSQL database, but the underlying concepts can be transposed to other databases as well. No real-world web application is truly complete without a reliable database to support it.
Without further ado, let's dive right in!
The Difference Between schema.rb and structure.sql
One of the first things you need to do when starting a Ruby on Rails project is to run database migrations. If you generate a User model, for instance, Rails will inevitably ask you to run migrations, which will create a schema.rb
file accordingly:
1rails g model User first_name:string last_name:string
Rails will generate the following migration:
1class CreateUsers < ActiveRecord::Migration[6.0]
2 def change
3 create_table :users do |t|
4 t.string :first_name
5 t.string :last_name
6
7 t.timestamps
8 end
9 end
10end
Once the migration is executed, you will find that Rails generated a schema.rb
file for you:
1ActiveRecord::Schema.define(version: 2019_12_14_074018) do
2
3 # These are extensions that must be enabled in order to support this database
4 enable_extension "plpgsql"
5
6 create_table "users", force: :cascade do |t|
7 t.string "first_name"
8 t.string "last_name"
9 t.datetime "created_at", precision: 6, null: false
10 t.datetime "updated_at", precision: 6, null: false
11 end
12
13end
This schema.rb
file is fantastic for relatively basic applications and use cases.
There are two main things to notice here:
- It is a Ruby representation of your database;
schema.rb
is created by inspecting the database and expressing its structure using Ruby. - It is database-agnostic (i.e. whether you use SQLite, PostgreSQL, MySQL or any other database that Rails supports, the syntax and structure will remain largely the same)
However, there may come a time when this strategy becomes too limiting for your growing app.
Say, for instance, you have hundreds or thousands of migration files.
If you need to rapidly spin up a new production system, you might encounter a scenario where running them all in sequence takes too long. Or you might face a situation where some migrations contain code that was meant to be executed on an older version of your database, but that is no longer executable on the current version. You might have a situation where migrations were written with certain data assumptions that are no longer valid, which would cause the migrations to fail.
All these scenarios prevent efficiently setting up a new instance of your application—be it in production or for a new team member—with a simple rails db:create db:migrate
command. If this were the case, how would you go about getting up to speed with a correct database schema?
Certainly, one way would be to go back and fix all the broken migrations. That's never a bad idea!
If going back and fixing a bunch of migrations is too costly, another way would be to run the rails db:setup
task. This task will generate a database schema from your schema.rb
file. However, what if your database contained complex logic that is not represented in the schema.rb
representation of your database?
Luckily, Rails offers an alternative: structure.sql
structure.sql
differs from schema.rb
in the following ways:
- It allows for an exact copy of the database structure. This is important when working with a team, as well as if you need to rapidly generate a new database in production from a
rails db:setup
task. - It allows preserving information of advanced database features. For example, if you are using PostgreSQL, it enables the use of views, materialized views, functions, constraints and so on.
Once an application reaches a certain maturity level, we have to use every trick in the book to boost efficiency, preserve data correctness, and ensure blazing-fast performance. Using structure.sql
to manage the Rails database's behavior allows users to do so.
Switching From schema.rb
to structure.sql
Making the change from schema.rb
to structure.sql
is a relatively straightforward process. All you need to do is set the following line in config/application.rb
:
1module YourApp
2 class Application < Rails::Application
3 config.load_defaults 6.0
4
5 # Add this line:
6 config.active_record.schema_format = :sql
7 end
8end
Then, run rails db:migrate
and you should see the file in db/structure.sql
. Voilà! Rails will dump the database structure using the tool specific to the database you are using (in PostgreSQL's case, that tool is pg_dump
, for MySQL or MariaDB, it will contain the output of SHOW CREATE TABLE
for each table, etc). It is advisable to ensure this file is under version control so that the rest of your team will have the same database structure.
A first glance at that file may be daunting: the schema.rb
file was only 25 lines, whereas the structure.sql
file is a whopping 109 lines! What benefits could such a large file add to the app development?
Adding Database-level Constraints
ActiveRecord is one of my favorite parts of using Rails. It allows you to query the database in a way that feels natural, almost like in a spoken language. For instance, if you want to find all a company's users named Dan, then ActiveRecord allows you to simply run a query like the following:
1company = Company.find(name: 'Some Company')
2
3# Reads just like in a natural language!
4company.users.where(first_name: 'Dan')
There are some cases in which ActiveRecord falls short though. For instance, say you have the following validation on your User model:
1class User < ApplicationRecord
2 validate :name_cannot_start_with_d
3
4 private
5
6 def name_cannot_start_with_d
7 if first_name.present? && first_name[0].downcase == 'd'
8 errors.add(:first_name, "cannot start with the letter 'D'")
9 end
10 end
11end
If you try to create a user with the name 'Dan', you should see an error when the validation runs:
1User.create!(first_name: 'Dan')
2Traceback (most recent call last):
3ActiveRecord::RecordInvalid (Validation failed: First name cannot start with the letter 'D')
This is fine, but suppose you or one of your team members changed the data by bypassing ActiveRecord's validation:
1u = User.create(first_name: 'Pan')
2
3# The update_attribute method bypasses ActiveRecord validations
4u.update_attribute :first_name, 'Dan'
5u.first_name
6=> "Dan"
As demonstrated, it is very easy to bypass the validation.
This can have disastrous consequences for our application. ActiveRecord can be a blessing as well as a curse—while it has a very clean and natural DSL that makes it a pleasure to work with, it is often overly permissive when enforcing model-level validations. The solution, as you may already know, is to add database-level constraints.
1rails g migration AddFirstNameConstraintToUser
This will generate a file that you can edit with the logic to disallow first names that start with the letter 'D':
1class AddFirstNameConstraintToUser < ActiveRecord::Migration[6.0]
2 def up
3 execute "ALTER TABLE users ADD CONSTRAINT name_cannot_start_with_d CHECK (first_name !~* '^d')"
4 end
5
6 def down
7 execute "ALTER TABLE users DROP CONSTRAINT IF EXISTS name_cannot_start_with_d"
8 end
9end
Note that it is very important to add code that successfully reverts the migration. In the above example, I have up
and down
directives. The up
method gets executed when the migration runs, down
gets executed when the migration is rolled back. Without properly reverting your database structure, you may have to do some manual house-cleaning later. I'd recommend always having a migration file that can be executed both up
and down
to avoid future headaches.
Now, run the migration and check whether you can bypass that constraint:
1rails db:migrate
1user = User.create first_name: 'Pan'
2user.update_attribute :first_name, 'Dan'
3
4ActiveRecord::StatementInvalid (PG::CheckViolation: ERROR: new row for relation "users" violates check constraint "name_cannot_start_with_d")
5DETAIL: Failing row contains (2, Dan, null, 2019-12-14 09:40:11.809358, 2019-12-14 09:40:41.658974).
Perfect! Our constraint is working as intended. Even if, for whatever reason, we bypass ActiveRecord's validation, we can still rely on the database—our ultimate goalkeeper—to preserve our data integrity.
What does this have to do with structure.sql
?
If you take a look at it, you'll see that the following was added:
1CREATE TABLE public.users (
2 id bigint NOT NULL,
3 first_name character varying,
4 last_name character varying,
5 created_at timestamp(6) without time zone NOT NULL,
6 updated_at timestamp(6) without time zone NOT NULL,
7 CONSTRAINT name_cannot_start_with_d CHECK (((first_name)::text !~* '^d'::text)));
Your constraint is within the schema itself!
While schema.rb
also supports database-level constraints, it is important to remember that it does not express everything your database may support such as triggers, sequences, stored procedures or check constraints. For example, this is what would happen to your schema file with the same exact migration (AddFirstNameConstraintToUser
) if you were just to use schema.rb
:
1ActiveRecord::Schema.define(version: 2019_12_14_074018) do
2
3 # These are extensions that must be enabled in order to support this database
4 enable_extension "plpgsql"
5
6 create_table "users", force: :cascade do |t|
7 t.string "first_name"
8 t.string "last_name"
9 t.datetime "created_at", precision: 6, null: false
10 t.datetime "updated_at", precision: 6, null: false
11 end
12
13end
The file has not changed! The constraint was not added.
If you were to onboard a new developer to work on your project, you could potentially be operating under different database regulations.
Committing structure.sql
to version control would help ensure that your team is on the same page. If you were to run rails db:setup
having a structure.sql
file, your database's structure will contain the above constraint. With schema.rb
there is no such guarantee.
The same can be said about a production system. If you needed to rapidly spin up a new instance of your application with a fresh database—and running all migrations sequentially takes a long time—setting up the database from the structure.sql
file would be a lot quicker. We can rest assured that the structure.sql
will create our database with the exact same structure as in other instances.
Growing Pains
Managing the concise schema.rb
file across a team is a far easier task than managing the verbose structure.sql
file.
One of the biggest growing pains when migrating to structure.sql
is ensuring that only the required changes get committed to that file, which can sometimes be difficult to do.
Say, for instance, you pull someone's branch and run the migrations specific to that branch. Your structure.sql
will now contain some changes. You then go back to working on your own branch and generate a new migration. Your structure.sql
file will now contain both your branch's and the other branch's changes. This can be a bit of a hassle to deal with, and there is undoubtedly a bit of a learning curve when it comes to managing these conflicts.
By using this approach, we're making a tradeoff. We have to deal with a bit of code complexity upfront that allows us to preserve our database's advanced functionality. In turn, we also have to deal with a simpler schema representation as well as not having all the power of the database at our fingertips, e.g. if we want to set a backup from a db:setup
task. I posit that it's best to put up with a bit of version-control hassle than to suffer through fixing corrupt/incorrect data in a production system, or to not be able to make use of all the advanced functionality that your database offers.
Generally speaking, there are two strategies I've used to ensure my structure.sql
file only contains the necessary changes to a specific branch:
- Once you are done working on a branch that contains migrations, make sure you run
rails db:rollback STEP=n
wheren
is the number of migrations in that branch. This will ensure your database structure reverts to its original state. - You might forget to rollback after working on a branch. In that case, when working on a new branch, make sure you pull a pristine
structure.sql
file from master before creating any new migrations.
As a rule of thumb, your structure.sql
file should only contain the changes relevant to your branch before being merged into master.
Conclusion
Generally speaking, when Rails applications are small or don't need some of the more advanced features that a database offers then it's safe to use schema.rb
, which is very readable, concise and easy to manage.
However, as an application grows in size and complexity, an accurate reflection of the database structure is of the essence. It will allow a team to maintain the right constraints, database modules, functions and operators that otherwise wouldn't be possible. Learning to use Rails with a well-maintained structure.sql
file will offer an edge that the simpler schema.rb
simply cannot.
P.S. If you'd like to read Ruby Magic posts as soon as they get off the press, subscribe to our Ruby Magic newsletter and never miss a single post!