The last post took a peek at how functions can be defined and passed around in Elixir. This post centres on atoms and some of their not-so-obvious traits.
What are they?
Atoms are named (or symbolic) constants; in other words, their name is their value. They are written with a leading : and can be found everywhere in Elixir: they are the keys of keyword lists, they are often used to indicate success or error (e.g. :ok), and they appear in many other places that will be touched on in this post.
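As a rough illustration of those two uses, the sketch below tags results with :ok and :error and builds a keyword list. The AtomBasics module, parse_age/1 and the :invalid_age reason are made-up names for this example, not anything from a library.

```elixir
defmodule AtomBasics do
  # Hypothetical helper: tags its result with the atoms :ok or :error.
  def parse_age(text) do
    case Integer.parse(text) do
      {age, ""} when age >= 0 -> {:ok, age}
      _ -> {:error, :invalid_age}
    end
  end
end

# A keyword list is just a list of {atom, value} tuples.
opts = [name: "Rex", sound: :woof]
IO.inspect(opts == [{:name, "Rex"}, {:sound, :woof}])
IO.inspect(Keyword.get(opts, :sound))

# Callers can pattern match on the leading atom.
IO.inspect(AtomBasics.parse_age("7"))
IO.inspect(AtomBasics.parse_age("seven"))
```

Because the tag is an atom, a caller can branch on it with a plain pattern match instead of comparing strings.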
Defining an atom
Atoms are defined by a leading : followed by a series of letters, digits, underscores and @ symbols. They can also end with a ? or a !, which can be pretty expressive. The @ symbol being legal makes a lot of sense when you consider how nodes can be specified in functions like Node.spawn_link/2:
iex> Node.spawn_link :foo@computername, fn -> Hello.world end
An atom can be declared with characters that would normally be illegal by enclosing the atom’s name in double quotes ("). This means an atom can be defined that starts with a number, or has a space in it.
** (SyntaxError) iex:1: unexpected token: ":" (column 1, codepoint U+003A)
iex> :"bark at postman"
:"bark at postman"
Atoms with the same name are always equal, whichever way they were defined.
iex> :dog == :"dog"
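The naming rules can be checked with a short script. This is only a sketch: the atom names are arbitrary examples, and String.to_atom/1 is used here purely to show that the name alone determines the atom.

```elixir
# The name decides the atom, however it was written or created.
IO.inspect(String.to_atom("dog") == :dog)
IO.inspect(String.to_atom("bark at postman") == :"bark at postman")

# A trailing ? or !, and an @ anywhere after the first character, need no quotes.
IO.inspect(is_atom(:hungry?))
IO.inspect(is_atom(:bark!))
IO.inspect(is_atom(:foo@computername))

# Quoted atoms may contain spaces or start with a digit.
IO.inspect(Atom.to_string(:"bark at postman"))
IO.inspect(Atom.to_string(:"1dog"))
```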
Elixir uses atoms in places that were not obvious to me at first. The first instances I came across are true, false and nil:
iex> true == :true
iex> false == :false
iex> nil == :nil
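One practical consequence is that a guard such as is_atom/1 also accepts booleans and nil, so clause order matters. A minimal sketch, using a made-up SneakyAtoms module:

```elixir
defmodule SneakyAtoms do
  # Hypothetical classifier. The specific clauses must come first,
  # because nil, true and false would also satisfy is_atom/1.
  def kind(nil), do: "nil"
  def kind(value) when is_boolean(value), do: "boolean"
  def kind(value) when is_atom(value), do: "atom"
  def kind(_other), do: "not an atom"
end

IO.inspect(SneakyAtoms.kind(:true))
IO.inspect(SneakyAtoms.kind(:nil))
IO.inspect(SneakyAtoms.kind(:dog))
IO.inspect(SneakyAtoms.kind("dog"))
```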
Other, particularly sneaky, atoms are those created whenever you define a name starting with a capital letter, e.g. a module name. Anything starting with a capital letter is actually an alias. If an alias has no value, it defaults to an atom of the same name with an "Elixir." prefix:
iex> AnAlias == :"Elixir.AnAlias"
iex> alias :another_atom, as: AnotherAtom
iex> :another_atom == AnotherAtom
This becomes particularly relevant when we look at how we invoke Erlang modules. Erlang module names are lower case, so they are not aliases. That means that to invoke a function on an Erlang module we need to use the : atom syntax. It also means that the : atom syntax can be used to invoke functions on Elixir modules.
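Both directions can be tried with modules that ship with Erlang and Elixir. The calls below are just one way to illustrate the point:

```elixir
# An alias such as String is the atom :"Elixir.String" underneath.
IO.inspect(String == :"Elixir.String")
IO.inspect(Atom.to_string(String))

# Erlang modules are plain lower-case atoms.
IO.inspect(:lists.reverse([1, 2, 3]))

# The same atom syntax reaches an Elixir module.
IO.inspect(:"Elixir.String".upcase("woof"))

# Since a module is only an atom, it can be held in a variable.
module = :"Elixir.Enum"
IO.inspect(module.sum([1, 2, 3]))
```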
Atoms and module loading
Atoms used when defining a module have an impact on the compilation output of .ex files. For example, if a file called animals.ex defines two modules, one of them Cat, then two .beam files will be created during compilation. These files are named after the atoms, e.g. Elixir.Cat.beam. This is how modules can be loaded dynamically at runtime: by looking up the .beam file corresponding to the atom.
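The link between a module atom and its file name can be sketched as follows. BeamNames and beam_file/1 are hypothetical helpers that only rebuild the name from the atom; the path printed by :code.which/1 depends on the local installation, so no output is shown for it.

```elixir
defmodule BeamNames do
  # Hypothetical helper: the file name expected for a module's compiled code.
  def beam_file(module) when is_atom(module) do
    Atom.to_string(module) <> ".beam"
  end
end

IO.inspect(BeamNames.beam_file(Cat))
IO.inspect(BeamNames.beam_file(:lists))

# Ask the code server where it finds the String module.
IO.inspect(:code.which(String))

# Loading is driven by the atom: a known module loads, an unknown one does not.
IO.inspect(Code.ensure_loaded?(String))
IO.inspect(Code.ensure_loaded?(NoSuchModuleHere))
```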
Atoms are unique
Atoms are mapped to an integer reference at runtime and the atom’s text value is stored in the “atom table”, which by default has a hard limit of 1048576 entries. This means that any given atom, even across processes, is the same integer mapped to the same value within a single VM instance. This is why external user input, such as params to a Phoenix route, should be keyed with strings, not atoms. We don't want users being able to fill up our atom table with garbage.
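One way to keep user input away from the atom table is to translate only known strings into atoms. The sketch below assumes a made-up SafeParams module and a sort-direction parameter. It also shows String.to_existing_atom/1, which raises rather than creating a new atom, and the two :erlang.system_info/1 keys that report the table's current size and its limit.

```elixir
defmodule SafeParams do
  # Hypothetical whitelist: only these strings ever become atoms.
  @directions %{"asc" => :asc, "desc" => :desc}

  def direction(param) when is_binary(param), do: Map.fetch(@directions, param)
end

IO.inspect(SafeParams.direction("asc"))
IO.inspect(SafeParams.direction("drop_tables"))

# to_existing_atom/1 refuses names that are not already in the atom table.
result =
  try do
    String.to_existing_atom("surely no atom has this name 8f3a1c")
  rescue
    ArgumentError -> :rejected
  end

IO.inspect(result)

# Current number of atoms and the configured limit for this VM.
IO.inspect(:erlang.system_info(:atom_count))
IO.inspect(:erlang.system_info(:atom_limit))
```

Unknown input stays a string and is answered with :error, so the number of atoms does not grow with whatever users send.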
A handy side effect of this approach is that matching on atoms is extremely fast, which is a good thing considering how often they are used when pattern matching and binding variables.
Atoms are one of the key building blocks when working in Elixir; they show up everywhere. Hopefully this post has shed some light on some of their properties that are not immediately obvious.