Elixir Atoms

The last post took a peek at how functions can be defined and passed around in Elixir. This post centres on atoms and some of their less obvious traits.

What are they?

Atoms are named (or symbolic) constants; in other words, their name is their value. They are written with a leading : and they turn up everywhere in Elixir: as the keys of keyword lists, as markers of success or failure (e.g. :ok and :error), and in many other places that this post touches on.
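
As a rough sketch of those two everyday uses, the script below uses atoms as keyword-list keys and as the tag in a result tuple. The module AtomBasics and its function parse_age/1 are made-up names for illustration only.

```elixir
defmodule AtomBasics do
  # Hypothetical helper: tags its result with :ok or :error.
  def parse_age(text) do
    case Integer.parse(text) do
      {age, ""} when age >= 0 -> {:ok, age}
      _ -> {:error, :invalid_age}
    end
  end
end

# A keyword list is just a list of {atom, value} tuples.
opts = [name: "Rex", breed: :collie]
IO.inspect(opts == [{:name, "Rex"}, {:breed, :collie}])  # true
IO.inspect(Keyword.get(opts, :breed))                    # :collie

IO.inspect(AtomBasics.parse_age("7"))      # {:ok, 7}
IO.inspect(AtomBasics.parse_age("seven"))  # {:error, :invalid_age}
```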

Defining an atom

An atom is written as a leading : followed by a series of letters, digits, _s and @s. The first character after the : must be a letter or an underscore.

iex> :bark@postman
:bark@postman

They can also end with a ? or a !, which can be pretty expressive.

iex> :bark@postman!
:bark@postman!

The @ symbol being legal makes a lot of sense when you consider how nodes can be specified in functions like Node.spawn_link/2:

iex> Node.spawn_link :foo@computername, fn -> Hello.world end

An atom can be declared with characters that would normally be illegal by enclosing the atom’s name in double quotes. This means an atom can be defined that starts with a digit, or has a space in it.

iex> :1
** (SyntaxError) iex:1: unexpected token: ":" (column 1, codepoint U+003A)
iex> :"1"
:"1"
iex> :"bark at postman"
:"bark at postman"

An atom defined either way with the same name will always be equal.

iex> :dog == :"dog"
true

Sneaky atoms

Elixir uses atoms in places that were not obvious to me at first. The first instances I came across are true, false and nil:

iex> true == :true
true
iex> false == :false
true
iex> nil == :nil
true
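
A quick way to check this, using only built-in guards from Kernel, is to ask whether these values are atoms and whether their atom spellings still count as booleans or nil:

```elixir
# true, false and nil are ordinary atoms with special syntax.
IO.inspect(is_atom(true))      # true
IO.inspect(is_atom(false))     # true
IO.inspect(is_atom(nil))       # true

# The quoted-colon spellings are accepted wherever the bare words are.
IO.inspect(is_boolean(:true))  # true
IO.inspect(is_nil(:nil))       # true
```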

Other particularly sneaky atoms are those created whenever you write a name starting with a capital letter, e.g. a module name. Anything starting with a capital letter is actually an alias. Unless an alias has been pointed at something else, it expands to an atom of the same name with an Elixir. prefix.

iex> AnAlias == :"Elixir.AnAlias"
true
iex> Atom.to_string(AnAlias)
"Elixir.AnAlias"
iex> alias :another_atom, as: AnotherAtom
:another_atom
iex> Atom.to_string(AnotherAtom)
"another_atom"
iex> :another_atom == AnotherAtom
true

This becomes particularly relevant when we look at how Erlang modules are invoked. Erlang module names are lower-case atoms, so they are not aliases. That means that to invoke a function on an Erlang module we need to use the : atom syntax. It also means that the : atom syntax can be used to invoke functions on Elixir modules.
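
Since a module name is only an atom, the same call can be spelled several equivalent ways, including through a variable. A small sketch, written as a script rather than an iex session and using only standard library modules:

```elixir
# The alias String is shorthand for the atom :"Elixir.String".
mod = String
IO.inspect(mod == :"Elixir.String")           # true

# Three spellings of the same function call.
IO.inspect(mod.upcase("woof"))                # "WOOF"
IO.inspect(:"Elixir.String".upcase("woof"))   # "WOOF"
IO.inspect(apply(mod, :upcase, ["woof"]))     # "WOOF"

# Erlang modules are plain lower-case atoms.
IO.inspect(:lists.reverse([1, 2, 3]))         # [3, 2, 1]
```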

iex> :rand.uniform()
0.4435846174457203
iex> :"Elixir.Atom".to_string(:hello)
"hello"

Atoms and module loading

Atoms used when defining a module have an impact on the compilation output of .ex files. For example, if a single file called animals.ex defines two modules, Dog and Cat, then two .beam files will be created during compilation. These files are named after the modules' atoms, so Elixir.Dog.beam and Elixir.Cat.beam. This is how modules can be loaded dynamically at runtime: the VM looks up the .beam file corresponding to the atom.
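
The naming scheme can be observed on a module that ships with Elixir. The sketch below assumes a standard installation where Enum is loaded from a .beam file on disk; NoSuchModuleForThisSketch is a deliberately made-up name with no compiled module behind it.

```elixir
# Enum's full name is the atom :"Elixir.Enum".
IO.inspect(Code.ensure_loaded?(Enum))  # true

# :code.which/1 normally returns the path of the .beam file as a charlist.
case :code.which(Enum) do
  path when is_list(path) ->
    # Typically prints Elixir.Enum.beam
    IO.puts(Path.basename(List.to_string(path)))

  other ->
    # e.g. :preloaded or :cover_compiled in special setups
    IO.inspect(other)
end

# An alias with no module behind it is still a valid atom, but nothing loads.
IO.inspect(Code.ensure_loaded?(NoSuchModuleForThisSketch))  # false
```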

Atoms are unique

Atoms are mapped to an integer reference at runtime, and each atom’s text is stored once in the “atom table”, which by default is limited to 1,048,576 entries. This means that any given atom, even across processes, is the same integer mapped to the same text within a single VM instance. Atoms are never garbage collected, and the VM crashes if the table fills up. This is why external user input, such as the params of a Phoenix route, should be keyed with strings, not atoms. We don't want users being able to fill up our atom table with garbage.
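
One way to guard against this, sketched below, is to convert input only when the atom already exists. String.to_existing_atom/1 is part of the standard library; the SafeAtoms module and its to_known_atom/1 function are hypothetical names for this example. The system_info keys :atom_count and :atom_limit assume a reasonably recent OTP release.

```elixir
defmodule SafeAtoms do
  # Hypothetical helper: never creates a new atom from outside input.
  def to_known_atom(text) when is_binary(text) do
    {:ok, String.to_existing_atom(text)}
  rescue
    ArgumentError -> {:error, :unknown_atom}
  end
end

# Size of the atom table and how much of it is in use.
# The limit is 1048576 unless the VM was started with a different setting.
IO.inspect(:erlang.system_info(:atom_limit))
IO.inspect(:erlang.system_info(:atom_count))

IO.inspect(SafeAtoms.to_known_atom("ok"))
# {:ok, :ok}
IO.inspect(SafeAtoms.to_known_atom("surely_not_an_existing_atom_8417"))
# {:error, :unknown_atom}
```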

A handy side effect of this approach is that matching on atoms is extremely fast, since comparing two atoms only means comparing two integers. Considering how often atoms are used in pattern matching, that is a good thing.
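
For a sense of what that matching looks like in practice, here is a minimal sketch that dispatches on atoms in function heads. The Sounds module and speak/1 are invented for this example.

```elixir
defmodule Sounds do
  # Each clause matches on a literal atom in the function head.
  def speak(:dog), do: "Woof"
  def speak(:cat), do: "Meow"
  # Fallback for any other atom.
  def speak(other) when is_atom(other), do: "..."
end

IO.inspect(Sounds.speak(:dog))      # "Woof"
IO.inspect(Sounds.speak(:"cat"))    # "Meow", quoted and unquoted forms are the same atom
IO.inspect(Sounds.speak(:postman))  # "..."
```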

Wrapping up

Atoms are one of the key building blocks when working in Elixir, and they show up everywhere. Hopefully this post has shed some light on some of their properties that are not immediately obvious.