Data modeling in Ecto takes a bit of getting used to, especially for developers who have mostly worked with traditional "heavy" ORMs.
For many novice Ecto users, association-related operations become the first stumbling block. Ecto provides multiple functions for establishing and modifying associations between records, each tailored to a particular use case. Judging from the number of questions about cast_assoc, put_assoc and build_assoc on StackOverflow and other online communities, choosing the right one can be challenging, especially if the user is not yet accustomed to the technical terminology in Ecto's official documentation.
The goal of this post is to give a short but definitive answer to such questions in a few of the most common (and simplest) scenarios.
Traditional "Heavy" ORMs vs. Ecto
Traditional ORMs take on the massively complicated task of "masking" data-related operations, giving developers the illusion of working with language-native data containers. To achieve that, ORMs often perform complex transformations behind the scenes, which sometimes leads to suboptimal database performance. In this sense, ORMs propose a tradeoff between the convenience of language-native syntax and the precision of hand-crafted SQL queries.
Ecto essentially provides the same tradeoff but leans much closer to the side of hand-crafted SQL queries. The core conceptual difference is that Ecto does not intend to abstract away the database operations. Instead, it provides an Elixir syntax for crafting the SQL queries themselves. Ecto relies on the developer to format and validate the data to conform with the database schema, craft queries that use indexes efficiently, associate the records together and perform other tasks that an ORM would try to automate. The result is a somewhat steeper learning curve, but also significantly increased flexibility. For a more in-depth comparison between ActiveRecord and Ecto, check out this excellent ActiveRecord vs. Ecto post.
Direct Casting of Association IDs: cast
Working with associations doesn't always have to be complex. In a situation where you have the target ID, Ecto lets you treat the relation column as a normal database field.
To give a concrete example, let's assume we work with two models, Post and Comment, where multiple comments can refer to a single post. In that case, your models would look something like this.
defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    has_many :comments, Blog.Comment
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema

  # Comments live in their own table, not in "post"
  schema "comment" do
    belongs_to :post, Blog.Post
    field :body, :string
  end
end
These models reflect the following table schemas in the database:
Each table contains a primary key field id by default. The has_many field on Post does not refer to a database field; it only exists to tell Ecto that it's possible to preload comments for a post using the comment's belongs_to field.
The belongs_to field, on the other hand, refers to an existing column in the table. By default, the name of this column differs from the name in Ecto's model: the database column has _id at the end (post_id in our example).
Ecto lets you modify these kinds of association fields the same way you would modify any other field. In Ecto, setting the value of a primitive field from external parameters is called "casting". If you need to create a new comment for a particular post, you don't really need any of the association-specific functions. You can just cast the value of the foreign key:
comment
|> cast(params, [:post_id, :body])
This is the simplest and most straightforward method of creating an association between two tables.
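As a minimal sketch of direct casting, the script below re-declares the two schemas from this post and adds a changeset function. It assumes you run it as a standalone script with Mix.install available; the changeset/2 function, the validations and the sample parameters are illustrative choices, not part of the original example. No database is needed, because a changeset is just data until it is handed to a repo.

```elixir
# Assumption: standalone script; Mix.install fetches Ecto on first run.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comment" do
    field :body, :string
    belongs_to :post, Blog.Post
  end

  # Hypothetical changeset function: post_id is cast like any other field.
  def changeset(comment, params) do
    comment
    |> cast(params, [:post_id, :body])
    |> validate_required([:post_id, :body])
    # Converts a foreign key violation at insert time into a changeset error.
    |> foreign_key_constraint(:post_id)
  end
end

# Parameters as they might arrive from a web form: string keys, string values.
params = %{"post_id" => "7", "body" => "Great story!"}
changeset = Blog.Comment.changeset(%Blog.Comment{}, params)

IO.inspect(changeset.changes, label: "changes")
IO.inspect(changeset.valid?, label: "valid?")
```

Note that the string "7" is converted to an integer during casting, since belongs_to defines post_id with an integer ID type by default.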
Casting Associations: cast_assoc
It is useful to think about cast_assoc as a special version of cast that works on associations.
However, casting associations can be much more complex than casting normal fields. The cast call normally translates more or less directly into a single SQL query, while cast_assoc might result in multiple INSERT, UPDATE or DELETE queries. Let's assume the database tables from the previous example contain the following content:
Post:
| id | title | body |
|---|---|---|
| 7 | A story... | Once upon a time... |

Comment:
| id | post_id | body |
|---|---|---|
| 10 | 7 | Great story! |
| 11 | 7 | What happened next? |
| 12 | 7 | Thanks for the article |
Since the Post model contains a has_many association to Comment, it's trivial to preload all comments on a particular post:
Post
|> Repo.get!(id)
|> Repo.preload(:comments)
The shape of the returned data will be as follows:
%Post{
  id: 7,
  title: "A story...",
  body: "Once upon a time...",
  comments: [
    %Comment{id: 10, post_id: 7, body: "Great story!"},
    %Comment{id: 11, post_id: 7, body: "What happened next?"},
    %Comment{id: 12, post_id: 7, body: "Thanks for the article"}
  ]
}
A single cast_assoc call on :comments will replace the association as a whole. In effect, this means that the values you pass to cast_assoc will be returned in future preload calls. This does not necessarily mean that all database rows are replaced. Ecto compares before and after states and does the minimal amount of work required to reach the desired state. To illustrate that, consider the following changeset:
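# Parameters are plain maps, not %Comment{} structs
params = %{
  "comments" => [
    %{"id" => 11, "body" => "What happened next?"},
    %{"id" => 12, "body" => "Thank you for the post"},
    %{"body" => "Interesting"}
  ]
}

# post must have :comments preloaded; by default, cast_assoc/2
# calls Comment.changeset/2 for each entry
post
|> cast(params, [])
|> cast_assoc(:comments)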
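Passing this changeset to Repo.update results in three calls to the database:
- DELETE the comment with the ID of 10 (this requires on_replace: :delete on the has_many association; with the default, :raise, Ecto raises an error instead)
- UPDATE the comment with the ID of 12 and set body to "Thank you for the post"
- INSERT a comment with the body "Interesting" and assign it a new ID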
The row with the ID of 11 was left unchanged because it matches the preloaded values. An important thing to note here is that Ecto will not preload data on its own, so to make use of cast_assoc, you need to remember to call preload beforehand. However, you are not restricted to preloading a complete association. cast_assoc will work just as well when you preload only a subset of records with Repo.preload(post, comments: query). This feature is very useful for limiting the impact of cast_assoc to a subset of associated records.
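To see how cast_assoc classifies each comment before anything touches the database, here is a rough sketch that builds the changeset in memory and prints the action Ecto assigned to each child. It assumes a standalone script with Mix.install; the hand-built post struct stands in for Repo.get!/2 plus Repo.preload/2, and the on_replace: :delete option and Comment.changeset/2 are additions that the example needs but the schemas earlier in this post do not declare.

```elixir
# Assumption: standalone script; Mix.install fetches Ecto on first run.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Comment do
  use Ecto.Schema
  import Ecto.Changeset

  schema "comment" do
    field :body, :string
    # Plain foreign key field, to keep this sketch free of circular references.
    field :post_id, :id
  end

  # cast_assoc/2 calls this function for every entry under "comments".
  def changeset(comment, params) do
    comment
    |> cast(params, [:body])
    |> validate_required([:body])
  end
end

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
    # Without on_replace: :delete, leaving out an existing comment raises.
    has_many :comments, Blog.Comment, on_replace: :delete
  end
end

import Ecto.Changeset

# Stand-in for a post fetched from the database with comments preloaded.
post = %Blog.Post{
  id: 7,
  title: "A story...",
  comments: [
    %Blog.Comment{id: 10, post_id: 7, body: "Great story!"},
    %Blog.Comment{id: 11, post_id: 7, body: "What happened next?"},
    %Blog.Comment{id: 12, post_id: 7, body: "Thanks for the article"}
  ]
}

params = %{
  "comments" => [
    %{"id" => 11, "body" => "What happened next?"},
    %{"id" => 12, "body" => "Thank you for the post"},
    %{"body" => "Interesting"}
  ]
}

changeset = post |> cast(params, []) |> cast_assoc(:comments)

# Each child is itself a changeset carrying the action Ecto plans to take.
for child <- changeset.changes.comments do
  IO.inspect({child.action, child.data.id, child.changes})
end
```

The SQL statements themselves are only issued once the changeset is passed to a repo function such as Repo.update.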
Defining Associations: put_assoc
At first glance, put_assoc is in many ways similar to cast_assoc: it also works on a whole association and requires you to preload the records to be updated. However, upon closer examination, it turns out to be almost the opposite in the way you use it. The crucial distinction is that put_assoc is designed to update the association "references", not the data. That is, you would typically use put_assoc when you want to connect a record to one or more records that already exist in the database.
put_assoc can be used to associate a new comment with an existing post, similar to what we did in the "direct casting" section, but without using the post_id field directly:
# Repo.get!/2 takes the schema as its first argument
post = Repo.get!(Post, 7)
# ...
comment
|> cast(params, [:body])
|> put_assoc(:post, post)
This makes your code a little bit cleaner in cases with complex primary keys because Ecto does all the bookkeeping for you.
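The following sketch shows what put_assoc leaves in the changeset, again without a database. It assumes a standalone script with Mix.install, and it uses Ecto.put_meta/2 to mark a hand-built post as loaded, standing in for a record that Repo.get!/2 would return. The sample values are made up for illustration.

```elixir
# Assumption: standalone script; Mix.install fetches Ecto on first run.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
  end
end

defmodule Blog.Comment do
  use Ecto.Schema

  schema "comment" do
    field :body, :string
    belongs_to :post, Blog.Post
  end
end

import Ecto.Changeset

# Stand-in for Repo.get!(Blog.Post, 7): marking the struct as :loaded tells
# Ecto the row already exists, so it is not treated as a new post to insert.
post = Ecto.put_meta(%Blog.Post{id: 7, title: "A story..."}, state: :loaded)

changeset =
  %Blog.Comment{}
  |> cast(%{"body" => "Interesting"}, [:body])
  |> put_assoc(:post, post)

# The association is stored as a nested changeset that wraps the post struct.
IO.inspect(changeset.changes.post.data.id, label: "associated post id")
IO.inspect(changeset.changes.body, label: "comment body")
```

The foreign key itself is resolved by Ecto when the comment changeset is inserted through a repo, which is the bookkeeping mentioned above.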
Building Related Records: build_assoc
build_assoc is a convenience function that allows you to create related records through an association on an existing record. To continue our post/comment example, here is another way to create a new comment:
post = Repo.get!(Post, 7)
# ...
# Comment has no title field, so only body is passed
comment_params = %{body: "Interesting"}
Ecto.build_assoc(post, :comments, comment_params)
# %Comment{post_id: 7, body: "Interesting"}
The power of build_assoc is in its expressiveness: the code above clearly shows that the comment belongs to the post. Unlike the functions discussed above, build_assoc does not operate on a changeset. It takes an existing struct and returns a new, unsaved struct for the associated record, which you can then pass to a changeset function or to Repo.insert. This means you would only ever use build_assoc when you want to create a new record.
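A short sketch of build_assoc in isolation, assuming a standalone script with Mix.install and a hand-built post struct in place of a record fetched from the database. It shows that the result is a plain Comment struct with the foreign key already filled in, which can then be wrapped in a changeset.

```elixir
# Assumption: standalone script; Mix.install fetches Ecto on first run.
Mix.install([{:ecto, "~> 3.10"}])

defmodule Blog.Comment do
  use Ecto.Schema

  schema "comment" do
    field :body, :string
    # Plain foreign key field, to keep this sketch free of circular references.
    field :post_id, :id
  end
end

defmodule Blog.Post do
  use Ecto.Schema

  schema "post" do
    field :title, :string
    field :body, :string
    has_many :comments, Blog.Comment
  end
end

# Stand-in for Repo.get!(Blog.Post, 7).
post = %Blog.Post{id: 7, title: "A story..."}

# Returns a %Blog.Comment{} struct (not a changeset) with post_id copied
# from the post's primary key.
comment = Ecto.build_assoc(post, :comments, %{body: "Interesting"})
IO.inspect(comment.post_id, label: "post_id")

# The struct can then be wrapped in a changeset before inserting it.
changeset = Ecto.Changeset.change(comment)
IO.inspect(changeset.data.body, label: "body")
```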
Wrap Up
While this post doesn't begin to cover the variety of use-cases you might encounter in production applications, I hope it gives you a strong foundation from which to begin searching for an answer. And if you need a refresher in the future, here is a simple flowchart that will remind you of the discussed use cases:
Ecto's association functions are relatively thin abstractions over field references in a database. Understanding how each of those functions works on a database level is crucial to becoming an expert Elixir/Phoenix developer. Fortunately, Ecto is built in such a way that each function is relatively small, deterministic, and has a single purpose. After you have mastered the basics, you can expect fewer "gotchas" compared to the traditional ORMs (or at least in my experience that was the case). Happy coding!