Are not as difficult as they seem
When I’m working with a new Phoenix app (written in Elixir), I often come across the following situation. I happened to run into it last night on a side-project I’m building called Remind Me (which you should definitely check out, it’s free and it will help you stay on top of a whole class of to-do’s that often fall through the cracks), so I decided to write up my thoughts on the subject. Here’s the issue (with fake resource names for illustration):
- I have a few associated resources in my database, like a User which has_many TextMessageItems and has_one ApiConnection
- I have a MyApp.Users context, which when generated by default has a function like this:
def list_users do
  Repo.all(User)
end
- I need to be able to get a few Users out of the database and send them to the front-end (or through GraphQL or whatever is needing the data) with the associated TextMessageItems and ApiConnection
- I hate having to make a function like this to do the work:
def list_users_with_text_messages_and_api_connection() do
  User
  |> Repo.all()
  |> Repo.preload([:text_message_items, :api_connection])
end
Because as soon as I need another preload or I don’t need one of them, I have to make related functions like list_users_with_api_connection/0 and so on. Over time you end up with long function names, and the callers are always receiving more or less data than they really need (usually more, which can become a performance problem).
So here’s a more elegant solution.
In the context, e.g. MyApp.Users:
def list_users(opts \\ []) do
  preloads = Keyword.get(opts, :preloads, [])

  User
  |> Repo.all()
  |> Repo.preload(preloads)
end
The first thing we’re doing here is adding a keyword list as a parameter, with an empty list as the default value.
Then, we’re checking the keyword list for the :preloads option, and if it doesn’t exist in opts we’ll default to an empty list.
We rely on the behavior of Ecto.Repo.preload/2, which just returns the struct(s) in the first argument if you pass an empty list as the second argument.
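The option handling described above can be tried without a database. This is only a minimal sketch: PreloadOptsSketch and requested_preloads/1 are hypothetical names, and the function returns the list that would be passed to Repo.preload/2 rather than calling Ecto.

```elixir
defmodule PreloadOptsSketch do
  # Hypothetical stand-in for the context function: it returns the list that
  # would be handed to Repo.preload/2 instead of querying a database.
  def requested_preloads(opts \\ []) do
    Keyword.get(opts, :preloads, [])
  end
end

# No options at all: the default argument and the Keyword.get/3 default
# both resolve to an empty list.
no_opts = PreloadOptsSketch.requested_preloads()
IO.inspect(no_opts)

# The caller names exactly the associations it wants.
some = PreloadOptsSketch.requested_preloads(preloads: [:api_connection])
IO.inspect(some)

# Unrelated options (a made-up :limit here) leave the preloads empty.
other = PreloadOptsSketch.requested_preloads(limit: 10)
IO.inspect(other)
```

Because the default is an empty list in both places, existing callers that pass no arguments keep working unchanged.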
Modifying the function this way gives the caller a lot of flexibility: it can preload only the associations it needs and get back no more data than necessary.
For example, in the controller or resolver:
defmodule MyAppWeb.UserController do
  use MyAppWeb, :controller

  alias MyApp.Users

  def index(conn, _params) do
    preloads = [:text_message_items, :api_connection]
    users = Users.list_users(preloads: preloads)
    render(conn, "index.html", users: users)
  end
end
As a bonus, this approach even works with nested preloads! Imagine that each text_message_item had many attachments. We could preload those like this:
preloads = [:api_connection, text_message_items: [:attachments]]
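The nested form is ordinary Elixir data, which is why it passes through the :preloads option untouched. A small sketch of what that list is made of (the variable names are only for illustration):

```elixir
# Keyword syntax inside a list is sugar for {key, value} tuples, so these
# two preload lists are the same term.
sugared = [:api_connection, text_message_items: [:attachments]]
expanded = [:api_connection, {:text_message_items, [:attachments]}]
IO.inspect(sugared == expanded)

# A caller can therefore append a nested preload to a base list at runtime.
base = [:api_connection]
with_nested = base ++ [text_message_items: [:attachments]]
IO.inspect(with_nested == sugared)
```

Both comparisons print true, so a caller can build the list conditionally before handing it to the context.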
In software, most decisions aren’t about the “right” or “wrong” way to do things, but rather they’re about tradeoffs.
The solution I’ve proposed here has some advantages:
- Easy to read and reason about
But it also has a disadvantage: it requires multiple trips to the database.
Any time you use Repo.preload/2 or the Ecto.Query equivalents (more on that syntax in a minute), each item (including nested preloads) will generate a separate query. So in our example, where our list of preloads looks like [:text_message_items, :api_connection], we will have three queries: one to fetch the users, another to fetch the associated text_message_items, and another to fetch the associated api_connections.
With two preloaded associations, it’s probably not that bad and unless you’re developing for scale that demands optimizing every query to the hilt, it’s probably worth a couple extra trips to the database for the code readability. In addition, you’ll keep your context nice and tidy while requirements are added, and by the time the query becomes a bottleneck you’ll have a much more mature idea of what preloads you really need from the query.
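To get a rough feel for how the number of trips grows, the one-query-per-association rule described above can be turned into a quick estimate. PreloadCountSketch is a hypothetical helper, not part of Ecto, and it assumes every entry in the list is an atom or a {name, nested} pair (no custom preload queries or functions).

```elixir
defmodule PreloadCountSketch do
  # Hypothetical helper, not part of Ecto: counts the associations named in a
  # preload list, assuming each entry is an atom or a {name, nested} pair.
  def count(preloads) when is_list(preloads) do
    preloads |> Enum.map(&count/1) |> Enum.sum()
  end

  def count({_name, nested}), do: 1 + count(nested)
  def count(name) when is_atom(name), do: 1

  # One query for the parent rows, plus one per preloaded association.
  def expected_queries(preloads), do: 1 + count(preloads)
end

flat = PreloadCountSketch.expected_queries([:text_message_items, :api_connection])
IO.inspect(flat)

nested =
  PreloadCountSketch.expected_queries([:api_connection, text_message_items: [:attachments]])

IO.inspect(nested)
```

Under those assumptions the flat list comes out at three queries, matching the example above, and adding the nested attachments preload brings it to four.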
So say you are at the point where every trip to the database makes a big difference, whether it’s about page load times or the stress on the database.
The basic idea for optimizing preloads is that you need to use the Ecto.Query syntax and join the associations yourself. Here’s one way to do it, but we’re back to the dilemma of having long function names, and it’s not nearly as pretty:
# Requires `import Ecto.Query` in the context module.
def list_users_with_text_messages_and_api_connection do
  User
  |> join(:left, [user], _ in assoc(user, :text_message_items))
  |> join(:left, [user, _], _ in assoc(user, :api_connection))
  |> preload([_user, tmi, api], [text_message_items: tmi, api_connection: api])
  |> Repo.all()
end
Ecto is a very powerful tool with a whole range of functions to help you interact with a database at any level of abstraction you like.
When development speed is of the essence, Ecto has high-level abstractions to make your life easy and your code beautiful. It’s easy to maintain and extend, and will keep your development cycle short while you’re building and iterating rapidly.
But, when your app hits scale and performance becomes critical, Ecto has the tools for you to get down into the nitty gritty SQL layers while still writing safe, expressive, composable queries.
There is a great macro solution from joshnuss in the comments that allows you to join and preload in one step: https://gist.github.com/joshnuss/9c68ad2c2649b571dd241693dad6f6f6
It looks really promising (and I tested it and it works great!) and it gives me hope that there are elegant solutions within reach to keep the maintainability and simplicity of piping in preloads from a caller, without multiple queries to the database.