Beyond Functions in Elixir: Refactoring for Maintainability
Elixir can be a beautiful language. It has Ruby’s syntactic elegance, Lisp’s metaprogramming, and many functional features of languages like F#. The user has license to use many idioms and features: pattern matching, macros, behaviours, protocols, GenServers, ETS, etc. Working successfully in Elixir means choosing when to leverage a particular language feature for its ergonomics at the cost of grokking its complexity.
Typically, beginners are advised to organize their Elixir/Phoenix codebases into modules and functions: write well-named functions, and group functions with related concerns into modules. In a functional language like Elixir, writing functions that operate on raw data takes you far. Phoenix 1.3 even popularized the idea of Contexts, which are a nice way of marketing modules and functions in terms of their boundaries and responsibilities.
A developer can think of modules and functions as the simplest tools in their Elixir tool belt. When working on larger projects or designing a library, patterns may start to emerge that can be solved more ergonomically or efficiently with more powerful language features. In this post, I’ll explore a source of complexity we encountered at The Outline and how we reduced that complexity by going beyond modules and functions.
Let’s consider the process for building a blog in Elixir and Phoenix. A blog has
many posts, and each post contains a title, an author, and a body.
defmodule Blog.Post do
  use Blog.Web, :model

  schema "posts" do
    field :title, :string
    field :author, :string
    field :body, :string
  end
end
Let’s write an EEx template to render this data.
<article>
<header>
<h1><%= @post.title %></h1>
<address><%= @post.author %></address>
</header>
<section>
<%= @post.body %>
</section>
</article>
We’re off to a pretty good start; we can render some HTML with a %Blog.Post{} struct, but something is missing. Usually when you’re writing prose, you want more than just a giant paragraph of text. In its current form, a blob of text is all our blog supports. Fortunately, Elixir has great community support for Markdown, a small markup language that converts to HTML. Let’s choose cmark for converting our Markdown to HTML.
defmodule Blog.Web.PostView do
  @moduledoc "View for rendering posts"
  use Blog.Web, :view
  alias Blog.Markdown

  def render_markdown(binary) do
    Markdown.to_html(binary)
  end
end

defmodule Blog.Markdown do
  @moduledoc "Utility for rendering markdown -> html"

  def to_html(binary) when is_binary(binary) do
    Cmark.to_html(binary)
  end

  def to_html(_other), do: ""
end
We need to make one last change to our template.
<section>
<%# Convert the markdown -> HTML %>
<%= render_markdown @post.body %>
</section>
Everything looks great at face value, but when we try to render our test post, the page shows the post body’s HTML tags as literal text instead of formatted content. What’s happening here? If we try to call render on our view and inspect the output, we’ll see something interesting.
iex(2)> Phoenix.View.render(Blog.Web.PostView, "show.html", post: post)
{:safe,
[[[[[[["" | "<article>\n <header>\n <h1>"] | "First post"] |
"</h1>\n <address>"] | "Dave"] |
"</address>\n </header>\n <section>\n"] |
"&lt;p&gt;&lt;em&gt;Hello&lt;/em&gt; &lt;strong&gt;World&lt;/strong&gt;!&lt;/p&gt;\n"] |
" </section>\n</article>\n"]}
Phoenix is safe about how it renders input to templates. Instead of passing our HTML through, it escapes it, and the escaped form is what shows up on our page. Unfortunately, what is smart for computers doesn’t always align with what we want as developers. Phoenix, doing its best to watch our back, is escaping the input to the template. The Phoenix.View documentation hints at this behavior.
The safe tuple annotates that our template is safe and that we don’t need to escape its contents because all data has already been encoded.
The outer part of the returned term is a {:safe, iodata} tuple. This signifies to Phoenix that everything inside the tuple has already been escaped and is safe to emit as HTML. Behind the scenes, Phoenix is calling a form of Phoenix.HTML.html_escape on the input value to the template. (If you’re really curious about the dirty details, you can walk your way back from here).
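As a rough illustration of the escaping step described above, here is a minimal sketch in plain Elixir. EscapeSketch and to_safe/1 are hypothetical names made up for this example; this is not Phoenix’s implementation, only the general idea that untagged strings get escaped while values already tagged as :safe pass through.

```elixir
defmodule EscapeSketch do
  @moduledoc "Hypothetical stand-in for the escaping step; not Phoenix's code."

  # Values already tagged as safe pass through untouched.
  def to_safe({:safe, iodata}), do: {:safe, iodata}

  # Plain binaries are escaped first, then tagged as safe.
  def to_safe(binary) when is_binary(binary) do
    {:safe, escape(binary)}
  end

  # "&" must be replaced first so later entities are not escaped twice.
  defp escape(binary) do
    binary
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
    |> String.replace("'", "&#39;")
  end
end

# An untagged string is escaped, so the browser shows the tags as text.
EscapeSketch.to_safe("<em>Hello</em>")
# => {:safe, "&lt;em&gt;Hello&lt;/em&gt;"}

# A tagged value is trusted and left alone.
EscapeSketch.to_safe({:safe, "<em>Hello</em>"})
# => {:safe, "<em>Hello</em>"}
```

The first call mirrors what happened to our rendered Markdown: it arrived as an ordinary string, so it was treated as untrusted.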
If we want Phoenix to pass through our HTML directly, we can call the Phoenix.HTML.raw function, which wraps it in the :safe tuple for us. Let’s change our render_markdown implementation slightly.
def render_markdown(binary) do
binary
|> Markdown.to_html()
|> Phoenix.HTML.raw() # Convert to {:safe, iodata} tuple
end
Now when we render our Markdown to HTML, we pass it through the Phoenix.HTML.raw()
function, which tells Phoenix not to escape this particular input.
iex(2)> Phoenix.View.render(Blog.Web.PostView, "show.html", post: post)
{:safe,
[[[[[[["" | "<article>\n <header>\n <h1>"] | "First post"] |
"</h1>\n <address>"] | "Dave"] |
"</address>\n </header>\n <section>\n"] |
"<p><em>Hello</em> <strong>World</strong>!</p>\n"] |
" </section>\n</article>\n"]}
This is a great first attempt at solving our problem. We now have a working blog that allows for styling the post body without hand-writing HTML. We can ship this to production, and forget about it for a while. We’ve subscribed to the common wisdom of “modules and functions”, which works great for our use case, but comes with some tradeoffs. For example, our current implementation does nothing to prevent a writer from dropping in a <script> tag. Luckily, we can trust our users for our use case, and avoid solving that problem for now.
As the project grows, we’ll start to get additional business requirements. Maybe we’ll want to expand our blog to have a description, which also supports Markdown. We also might want an index page that shows summaries of posts, and a homepage that has different views of our data. Continuing our philosophy, we begin to generalize our templates and reuse them in different places of our app.
<section>
<h1><%= @title %></h1>
<div class="description">
<%= @description %> <%# Wait is description markdown? %>
</div>
</section>
After reaching a tipping point of reusable templates, contexts, and complexity, it becomes harder to trace every template input back to its source. It’s possible that there may be plain text or Markdown in the @description input, depending on the usage. It’s at this point we should consider if there are any more options than just modules and functions.
Enter Protocols
From the Elixir guides…
Protocols are a mechanism to achieve polymorphism in Elixir. Dispatching on a protocol is available to any data type as long as it implements the protocol.
Polymorphism describes functions that can have different implementations for different types. In Elixir, you can think of Protocols as one method of polymorphism that’s baked into the language. The real power of Protocols comes when you combine their polymorphism with structs. When you pass a struct to a protocol function, it will dispatch to that struct’s implementation.
Elixir comes with several protocols out of the box: Collectable, Enumerable, Inspect, List.Chars, and String.Chars.
When calling inspect on a value, Elixir dispatches to the correct implementation of the Inspect Protocol for the given type. So if I call inspect %{foo: :bar}, it will dispatch to the Map implementation of the Inspect Protocol. You can think of Protocols just like you think of pattern matching with multiple function heads. In fact, when protocols are consolidated at compile time, which Mix does by default, dispatch gets compiled down to something very close to that.
The main difference between Protocols and pattern matching on different values is the inversion of control. Protocols let you add more “function heads” after the fact, so that app and library developers can match on their type separate from the definition of the Protocol itself.
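To make that inversion of control concrete, here is a small sketch using only the standard library. The Summary protocol, the Article struct, and the Demo module are hypothetical names invented for this example. The point is that the Article implementation is added after the protocol is defined, without touching the protocol’s definition.

```elixir
defprotocol Summary do
  @doc "Returns a one-line description of the value."
  def describe(value)
end

# Implementation for a built-in type.
defimpl Summary, for: BitString do
  def describe(string) when is_binary(string) do
    "text of #{String.length(string)} characters"
  end
end

# A struct defined later, perhaps in another application entirely...
defmodule Article do
  defstruct title: "", words: 0
end

# ...can opt in to the protocol without changing the protocol itself.
defimpl Summary, for: Article do
  def describe(%Article{title: title, words: words}) do
    "#{title} (#{words} words)"
  end
end

defmodule Demo do
  # The same function call dispatches on the type of its argument.
  def run do
    [
      Summary.describe("hello"),
      Summary.describe(%Article{title: "First post", words: 120})
    ]
  end
end

Demo.run()
# => ["text of 5 characters", "First post (120 words)"]
```

Passing a type with no implementation, such as an integer, raises Protocol.UndefinedError, which is the protocol equivalent of no function head matching.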
This is really helpful for our Markdown problem. Phoenix defines its own Protocol, Phoenix.HTML.Safe.
In order to promote HTML safety, Phoenix templates do not use
Kernel.to_string/1
to convert data types to strings in templates. Instead, Phoenix uses this protocol which must be implemented by data structures and guarantee that a HTML safe representation is returned.
This is great news for our Markdown problem. We can create a struct in Blog.Markdown
and implement the Phoenix.HTML.Safe
Protocol for it. Anytime we want to render markdown into a template, we would just wrap the string in a %Blog.Markdown{}
struct, and the Protocol would do all the hard work for us! All we need to do is change the implementation of our Markdown
module a bit to add a struct and implement the Protocol.
defmodule Blog.Markdown do
  defstruct text: "", html: nil

  def to_html(%__MODULE__{html: html}) when is_binary(html) do
    html
  end

  def to_html(%__MODULE__{text: text}), do: to_html(text)

  def to_html(binary) when is_binary(binary) do
    Cmark.to_html(binary)
  end

  def to_html(_other), do: ""

  defimpl Phoenix.HTML.Safe do
    def to_iodata(%Blog.Markdown{} = markdown) do
      Blog.Markdown.to_html(markdown)
    end
  end
end
Now, as long as we wrap our :string
fields containing Markdown in the %Markdown{}
struct, they will automatically convert to HTML with no extra fiddling necessary. Our original blog template can go back to this.
# Template
<article>
<header>
<h1><%= @post.title %></h1>
<address><%= @post.author %></address>
</header>
<section>
<%= @post.body %>
</section>
</article>

# Rendering
Phoenix.View.render(
Blog.Web.PostView,
"show.html",
post: %{post | body: %Blog.Markdown{text: post.body}}
)
We’ve now solved the problem of our templates knowing when to render markdown and when not to. When the template gets an input, Phoenix calls to_iodata
on it. If that value passed to to_iodata
happens to be a %Markdown{}
struct, it will be automatically converted to HTML.
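The dispatch described here can be sketched without Phoenix at all. Everything below is hypothetical and exists only for illustration: SafeSketch stands in for Phoenix.HTML.Safe, MarkdownSketch for Blog.Markdown with a toy renderer in place of Cmark, and TemplateSketch for a template interpolating a value.

```elixir
defprotocol SafeSketch do
  @doc "Converts a value into HTML-safe output."
  def to_iodata(value)
end

# Plain strings are untrusted, so they are escaped.
defimpl SafeSketch, for: BitString do
  def to_iodata(binary) when is_binary(binary) do
    binary
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
  end
end

defmodule MarkdownSketch do
  defstruct text: ""

  # Toy renderer standing in for Cmark.to_html/1: it only wraps text in <p>.
  def to_html(%__MODULE__{text: text}), do: "<p>" <> text <> "</p>"
end

# The struct marks its content as Markdown, so it is rendered, not escaped.
defimpl SafeSketch, for: MarkdownSketch do
  def to_iodata(markdown), do: MarkdownSketch.to_html(markdown)
end

defmodule TemplateSketch do
  # Mimics a template interpolating a single value.
  def section(value) do
    IO.iodata_to_binary(["<section>", SafeSketch.to_iodata(value), "</section>"])
  end

  def demo do
    {section("<b>Hi</b>"), section(%MarkdownSketch{text: "Hi"})}
  end
end

TemplateSketch.demo()
# => {"<section>&lt;b&gt;Hi&lt;/b&gt;</section>", "<section><p>Hi</p></section>"}
```

The template code is identical for both values; only the type of the input decides whether it is escaped or rendered.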
Can we do better?
We’ve moved the problem out of our template, but now our View or Model needs to convert the :body
field into a Markdown struct before passing it to the Template. What if this could also be done automatically?
The Ecto.Type Behaviour to the rescue
Right now, our Blog.Post schema contains three :string fields, which are all primitive Ecto types. Ecto has a powerful feature that allows developers to write their own custom types by implementing the Ecto.Type Behaviour. A Behaviour is simply an interface contract: as long as the custom type implements all of the Ecto.Type callbacks and is backed by a primitive type, Ecto will automatically convert our field to this type in and out of the database.
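For readers who have not written a Behaviour before, here is a minimal sketch of the idea in plain Elixir. Renderer, PlainTextRenderer, and Page are hypothetical names for this example and have nothing to do with Ecto; they only show the shape of a contract and a module that fulfills it.

```elixir
defmodule Renderer do
  @doc "Turns source text into HTML."
  @callback render(source :: String.t()) :: String.t()

  @doc "Names the input format the renderer accepts."
  @callback format() :: atom()
end

defmodule PlainTextRenderer do
  @behaviour Renderer

  @impl Renderer
  def render(source), do: "<pre>" <> source <> "</pre>"

  @impl Renderer
  def format, do: :plain
end

defmodule Page do
  # Any module honoring the Renderer contract can be passed in here.
  def build(renderer, source) do
    {renderer.format(), renderer.render(source)}
  end
end

Page.build(PlainTextRenderer, "hello")
# => {:plain, "<pre>hello</pre>"}
```

If PlainTextRenderer left out one of the callbacks, the compiler would emit a warning, which is what makes the contract useful in practice.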
We can write our own Ecto.Type, Blog.Markdown.Ecto, so that when we pull a Blog.Post out of the database, our post.body will be a %Markdown{} struct.
Let’s start by updating our Blog.Post
schema.
defmodule Blog.Post do
  use Blog.Web, :model
  alias Blog.Markdown

  schema "posts" do
    field :title, :string
    field :author, :string
    field :body, Markdown.Ecto # The custom Ecto.Type
  end
end
This is our first step, but we need to implement the behavior as well. That will do the work to automatically convert :body
to %Blog.Markdown{}
when we do post = Repo.get(Blog.Post, id).
There are four functions that we will have to implement in our Ecto.Type: cast, dump, load, and type.
Type is the backing type of our Markdown.Ecto field, which is :string.
Load takes data from the database (our :body field as a :string) and returns a %Markdown{} struct. We assume that this data is already valid.
Dump takes a %Markdown{} struct, validates it, and returns a valid :string.
Cast is called when casting values in an Ecto.Changeset or when passing arguments to an Ecto.Query. It converts valid types into a %Markdown{} struct.
defmodule Blog.Markdown.Ecto do
  alias Blog.Markdown

  @behaviour Ecto.Type

  @impl Ecto.Type
  def type, do: :string

  @impl Ecto.Type
  def cast(binary) when is_binary(binary) do
    {:ok, %Markdown{text: binary}}
  end

  def cast(%Markdown{} = markdown), do: {:ok, markdown}
  def cast(_other), do: :error

  @impl Ecto.Type
  def load(binary) when is_binary(binary) do
    {:ok, %Markdown{text: binary, html: Markdown.to_html(binary)}}
  end

  def load(_other), do: :error

  @impl Ecto.Type
  def dump(%Markdown{text: binary}) when is_binary(binary) do
    {:ok, binary}
  end

  def dump(binary) when is_binary(binary), do: {:ok, binary}
  def dump(_other), do: :error
end
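To see how the callbacks fit together, here is a sketch that walks a value through cast, dump, and load in plain Elixir. The RoundTrip modules are hypothetical: they copy the callback shapes from the type above but drop the @behaviour line and swap Cmark for a toy renderer, so the file runs without Ecto or Cmark installed. Ecto itself decides when each callback is called; round_trip/1 only imitates a write followed by a read.

```elixir
defmodule RoundTrip.Markdown do
  defstruct text: "", html: nil

  # Toy renderer standing in for Cmark.to_html/1.
  def to_html(text) when is_binary(text), do: "<p>" <> text <> "</p>"
end

defmodule RoundTrip.Type do
  alias RoundTrip.Markdown

  # Same shapes as the Ecto.Type callbacks, without the @behaviour line.
  def type, do: :string

  def cast(binary) when is_binary(binary), do: {:ok, %Markdown{text: binary}}
  def cast(%Markdown{} = markdown), do: {:ok, markdown}
  def cast(_other), do: :error

  def load(binary) when is_binary(binary) do
    {:ok, %Markdown{text: binary, html: Markdown.to_html(binary)}}
  end

  def load(_other), do: :error

  def dump(%Markdown{text: text}) when is_binary(text), do: {:ok, text}
  def dump(binary) when is_binary(binary), do: {:ok, binary}
  def dump(_other), do: :error

  # Imitates a write (cast, then dump) followed by a read (load).
  def round_trip(input) do
    with {:ok, cast_value} <- cast(input),
         {:ok, stored} <- dump(cast_value),
         {:ok, loaded} <- load(stored) do
      {:ok, stored, loaded.html}
    end
  end
end

RoundTrip.Type.round_trip("Hello")
# => {:ok, "Hello", "<p>Hello</p>"}

RoundTrip.Type.round_trip(42)
# => :error
```

Note that only the raw text is stored; the HTML is derived again on every load, so the rendered output never needs to live in the database.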
Now our Markdown will flow directly from the database to the template and render into HTML without calling any explicit render_markdown functions or having to do any type casting. This is implicit behavior that we have justified because it removes the opportunity for latent bugs, and helps avoid having to trace through our application to determine whether a certain piece of data supports Markdown or not.
Choosing when to add this kind of complexity in exchange for ergonomics is not an easy choice, and is one that a developer should spend a lot of time thinking about before making. Remember, in most cases it’s better to choose duplication over the wrong abstraction. Once you have enough experience and information about your problem domain, choosing the right abstraction in Elixir becomes an easier, more informed decision.
This story is published in Noteworthy, where 10,000+ readers come every day to learn about the people & ideas shaping the products we love.