Battleship Elixir: JSON sunk my Float

As the legend goes, Brendan Eich wrote the first version of Javascript over the course of 10 days in 1995. Among many compromises, Javascript was designed to have only one numeric type, the floating point IEEE 754. Extracted from this quirky language we get JSON, a serialization format that is the literal object notation of Javascript. Today, JSON is used nearly everywhere in almost every programming language and computing platform.

Recently, I got bit by a quirk of JSON in an unexpected place: my Elixir code. I want to share this experience because I think it is interesting, and to help others become aware of limitations in JSON. In particular, this situation highlights why you should be careful when using JSON to transfer numbers, especially monetary values.


In our project, we store data in Postgres using Ecto, Elixir’s most popular database wrapper. In order to store some presentational information, we make heavy use of jsonb columns. Ecto makes it really easy to work with jsonb. Here’s an example of the various ways you can define jsonb types in an Ecto schema.

defmodule MyTable do
use Ecto.Schema
  schema "my_table" do
field(:name, :string)
    field(:meta, :map) # jsonb, no typing
    embeds_one :fields, Fields do # jsonb, with an embedded schema
field(:foo, :integer)
field(:bar, :float)
end
end
end

Due to new design requirements, I needed to migrate a nested jsonb field from an integer to a float. Attempting this is pretty straightforward.

  1. Query all of rows in the table
  2. Cast the value from integer to float
  3. Update the rows in the database

The update pseudo-code looks something like this:

query =
from(
r in "my_table",
where: fragment("?->>'number' ~ '^[0-9]+$'", r.meta),
select: r
)
query
|> Repo.all()
|> Enum.map(&cast_field_to_float/1)
|> Repo.update_all()

Everything works great at this point. All of the rows update, but something unexpected happens. Most, but not all of the fields were cast to float. Numbers like 5 were properly converted to 5.0, but 5000 remained 5000. What the eff?

Let’s jump into the REPL and figure this out. What happens if we hardcode 5000.0?

iex>
meta = put_in(meta.number, 5000.0)
model
|> Ecto.Changeset.cast(%{meta: meta}, [:meta])
|> Repo.update(force: true)
iex> Repo.one(from(r in "my_table", select: r.meta, where: r.id == 1))
%{"number" => 5000}

At this point, I’m thinking that we’ve found a bug in Ecto, Postgrex, or maybe even Postgres itself. It appears that only numbers divisible by 10 will not cast. Hardcoding 5001.0 does seem to work, so maybe we’ve found an edge case?

Let’s go to Postgres directly to see if we can replicate.

outline_dev=# UPDATE my_table SET meta = '{"number": 5000.0}' where id = 1;
outline_dev=# select meta from cards where id = 1;
meta
--------------------
{"number": 5000.0}

Directly using SQL gives me the behavior I’m looking for, so at this point I’m inclined to think that there’s a bug somewhere in Elixir land. After chatting with Michal Muskala in the Elixir Slack for a bit, we found the problem.

In order to convert from an Elixir Map to JSON, Ecto must encode the value. Here’s what a few different values look like after they’ve been encoded to JSON.

iex(6)> Jason.encode(%{number: 5.0})
{:ok, "{\"number\":5.0}"}
iex(7)> Jason.encode(%{number: 5})
{:ok, "{\"number\":5}"}
iex(8)> Jason.encode(%{number: 5000.0})
{:ok, "{\"number\":5.0e3}"}

Notice how 5000.0 is being encoded? According to RFC 8259, the JSON RFC, 5.0e3 is a valid encoding for 5000.0. The RFC states how to parse this value, but it does not specify its storage.

Postgres makes the decision to store 5.0e3 as an integer value for a jsonb field, while Elixir chooses to store it as a float. Due to this mismatch, when we try to store this value as a float, Postgres casts it back into an integer.

So what is solution here? Unfortunately, Postgres is behaving according to spec. We can try leveraging Ecto fragments and the jsonb_set function to set the value directly, but even that has the same storage problem as outlined above. Alternatively, we can change the number to something else first, e.g. 5000 -> 5001.0 -> 5000.0. This is a less than ideal solution, but it tricks Postgres into changing the value the way we want.


JSON seems like a simple format, but it has strange edge cases that go much deeper than its surface. When using JSON to transfer or store data, keep in mind that different languages and platforms may choose to implement JSON in slightly different, yet surprising ways. Becoming aware of these limitations will help you guard against them and architect better, more reliable systems.