Monitoring Erlang Runtime Statistics

So the cloud-based Erlang application you wrote over the weekend is up and running, but all of a sudden it goes down. I can personally attest that it is not a good feeling. You wish you had been able to predict the failure before it happened. In this post, I will show you how we monitor Erlang runtime statistics and send those metrics to Geckoboard, the KPI dashboard where we keep all of our application statistics in one place.

I will use the Phoenix Framework, a popular web framework written in Elixir, which runs on the Erlang VM but offers a friendlier syntax and other useful features. However, the approach can easily be adapted to another Elixir or Erlang framework. Of course, runtime statistics are just one of the metrics you would want to measure. Over the past week, we have also started gathering other important metrics for our platform, such as response times, response status codes, database metrics, and load average.

First, let's add ExErlstats to the list of dependencies for our application by editing mix.exs:

defp deps do
  [
    # ...your existing dependencies...
    {:ex_erlstats, "~> 0.1"}
  ]
end

Run mix deps.get and we are set to use ExErlstats, which is a simple wrapper for getting Erlang VM statistics in Elixir.
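To get a feel for the kind of data involved, here is a minimal sketch that asks the Erlang VM directly. It assumes ExErlstats wraps built-ins along these lines; the module name VmSnapshot is made up for illustration and is not part of ExErlstats.

```elixir
defmodule VmSnapshot do
  # Hypothetical helper, not part of ExErlstats: collects a few values
  # straight from the Erlang VM built-ins.
  def take do
    %{
      # :erlang.memory/0 returns a keyword list of byte counts
      total_memory: :erlang.memory()[:total],
      process_count: :erlang.system_info(:process_count),
      process_limit: :erlang.system_info(:process_limit),
      schedulers_online: :erlang.system_info(:schedulers_online),
      # number of processes ready to run but waiting for a scheduler
      run_queue: :erlang.statistics(:run_queue)
    }
  end
end

IO.inspect(VmSnapshot.take())
```

A process count creeping toward the process limit, or a run queue that keeps growing, is the sort of early warning these numbers can give you.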

Now, let's create a controller to handle the status check:

# web/controllers/status_check_controller.ex
defmodule MyApp.StatusCheckController do
  use MyApp.Web, :controller
end

Let's add a few index functions that pattern match on the request params, and configure router.ex so that we can hit the status check endpoint.

# These functions go inside MyApp.StatusCheckController.

# Memory statistics, with byte counts converted to readable units
def index(conn, %{"gecko" => "true", "memory_stats" => "true"}) do
  msg =
    ExErlstats.memory
    |> Stream.filter(fn {_k, v} -> valid_stat?(v) end)
    |> Enum.map(fn {k, v} ->
      %{
        title: %{
          text: atom_to_str(k)
        },
        description: mem_normalize(v)
      }
    end)

  response(conn, msg)
end

# System information (process and port counts, schedulers, versions)
def index(conn, %{"gecko" => "true", "sysinfo" => "true"}) do
  msg =
    ExErlstats.system_info
    |> Stream.filter(fn {_k, v} -> valid_stat?(v) end)
    |> Enum.map(fn {k, v} ->
      %{
        title: %{
          text: atom_to_str(k)
        },
        description: "#{v}"
      }
    end)

  response(conn, msg)
end

# General runtime statistics
def index(conn, %{"gecko" => "true", "erl_stats" => "true"}) do
  msg =
    ExErlstats.stats
    |> Stream.filter(fn {_k, v} -> valid_stat?(v) end)
    |> Enum.map(fn {k, v} ->
      %{
        title: %{
          text: atom_to_str(k)
        },
        description: "#{v}"
      }
    end)

  response(conn, msg)
end

# Default status check: any other combination of params lands here
def index(conn, _params) do
  # :milli_seconds is the OTP 18 spelling; newer OTP releases use :millisecond
  time_start = :os.system_time(:milli_seconds)
  # Nothing runs between the two timestamps yet, so this reports about 0ms
  time_end = :os.system_time(:milli_seconds)
  time_diff = "#{time_end - time_start}ms"
  success(conn, time_diff)
end

defp success(conn, time_diff) do
  msg = %{
    success: true,
    msg: "ok",
    time: time_diff
  }

  response(conn, msg)
end

defp response(conn, msg, status \\ :ok) do
  conn
  |> put_status(status)
  |> put_resp_content_type("application/json")
  |> render(MyApp.StatusCheckView, "index.json", msg: msg)
end

# Convert a byte count into a human-readable string
defp mem_normalize(v) when is_integer(v), do: _mem_normalize(v)
defp mem_normalize(v), do: v

defp _mem_normalize(v) when v < 1024, do: "#{v} bytes"
defp _mem_normalize(v) when v < 1048576, do: "#{Float.round(v / 1024, 2)} KB"
defp _mem_normalize(v) when v < 1073741824, do: "#{Float.round(v / 1048576, 2)} MB"
defp _mem_normalize(v), do: "#{Float.round(v / 1073741824, 2)} GB"

# Only keep values that can be displayed as plain text
defp valid_stat?(v), do: is_bitstring(v) || is_integer(v) || is_float(v)

# :process_count becomes "Process count"
defp atom_to_str(v) when is_atom(v) do
  v |> Atom.to_string() |> String.capitalize() |> String.replace("_", " ")
end

defp atom_to_str(v), do: v
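The three gecko clauses differ only in where the data comes from and how each value is formatted. As a rough sketch, the shared part could be pulled into one helper. GeckoList, build/2 and bytes/1 are hypothetical names introduced here, not part of ExErlstats or ExGecko, and the sketch works on any enumerable of {key, value} pairs:

```elixir
defmodule GeckoList do
  # Hypothetical helper: turns {key, value} pairs into the
  # list-of-maps shape the controller builds for Geckoboard.
  def build(stats, format \\ fn v -> "#{v}" end) do
    stats
    # drop values that cannot be shown as plain text
    |> Enum.filter(fn {_k, v} -> is_binary(v) or is_number(v) end)
    |> Enum.map(fn {k, v} ->
      %{title: %{text: humanize(k)}, description: format.(v)}
    end)
  end

  # Same thresholds as mem_normalize/1 in the controller
  def bytes(v) when is_integer(v) and v < 1024, do: "#{v} bytes"
  def bytes(v) when is_integer(v) and v < 1_048_576, do: "#{Float.round(v / 1024, 2)} KB"
  def bytes(v) when is_integer(v) and v < 1_073_741_824, do: "#{Float.round(v / 1_048_576, 2)} MB"
  def bytes(v) when is_integer(v), do: "#{Float.round(v / 1_073_741_824, 2)} GB"
  def bytes(v), do: "#{v}"

  defp humanize(k) when is_atom(k) do
    k |> Atom.to_string() |> String.capitalize() |> String.replace("_", " ")
  end

  defp humanize(k), do: k
end

# Made-up sample data; the :undefined entry is filtered out.
sample = [total: 2048, atom: 512, maximum: :undefined]
IO.inspect(GeckoList.build(sample, &GeckoList.bytes/1))
```

With a helper like this, each gecko clause in the controller would shrink to a single call that passes the relevant ExErlstats result and, for memory, the byte formatter.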

We need to create a simple view:

# web/views/status_check_view.ex
defmodule MyApp.StatusCheckView do
  use MyApp.Web, :view

  def render("index.json", %{msg: msg}) do
    msg
  end
end

And the router.ex configuration:

# web/router.ex
scope "/", MyApp do
  get "/status", StatusCheckController, :index
end

What we just did is set up a /status endpoint that accepts a couple of query parameters. I pattern matched on the params to choose which statistics to return, in the format that Geckoboard’s List widget expects.
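If pattern matching on params is new to you, the following stand-alone sketch shows how the clauses are selected. ParamsDemo and pick/1 are hypothetical names used only for this illustration; the maps mimic the string-keyed params that Phoenix passes to a controller action.

```elixir
defmodule ParamsDemo do
  # Hypothetical stand-in for the controller's index/2 clauses; it only
  # reports which clause a given params map would select.
  def pick(%{"gecko" => "true", "memory_stats" => "true"}), do: :memory
  def pick(%{"gecko" => "true", "sysinfo" => "true"}), do: :system_info
  def pick(%{"gecko" => "true", "erl_stats" => "true"}), do: :stats
  def pick(_params), do: :default
end

# Matches the second clause
IO.inspect(ParamsDemo.pick(%{"gecko" => "true", "sysinfo" => "true"}))

# No statistics flag, so it falls through to the catch-all clause
IO.inspect(ParamsDemo.pick(%{"gecko" => "true"}))

# Extra keys are ignored and clauses are tried top to bottom,
# so the memory clause wins here
IO.inspect(ParamsDemo.pick(%{"gecko" => "true", "sysinfo" => "true", "memory_stats" => "true"}))
```

Because a map pattern only requires the listed keys to be present, the order of the clauses decides what happens when a request carries more than one flag.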

GET /status?gecko=true&sysinfo=true
[
  {
    "title": {
      "text": "Otp release"
    },
    "description": "18"
  },
  {
    "title": {
      "text": "Port count"
    },
    "description": "47"
  },
  {
    "title": {
      "text": "Port limit"
    },
    "description": "65536"
  },
  {
    "title": {
      "text": "Process count"
    },
    "description": "381"
  },
  {
    "title": {
      "text": "Process limit"
    },
    "description": "262144"
  },
  {
    "title": {
      "text": "Schedulers"
    },
    "description": "8"
  },
  {
    "title": {
      "text": "Schedulers online"
    },
    "description": "8"
  },
  {
    "title": {
      "text": "Version"
    },
    "description": "7.3"
  }
]

I’m not going to show the other JSON results, but the endpoints to hit are:

GET /status?gecko=true&memory_stats=true
GET /status?gecko=true&erl_stats=true

These all return the JSON structure that Geckoboard’s List widget expects. You will also notice that the status controller has a default clause, which we use as a simple check that our API is up.

GET /status
{
  "time": "0ms",
  "success": true,
  "msg": "ok"
}
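The time is 0ms because the default clause takes its two timestamps back to back, with nothing in between. If you want that number to mean something, one option is to time an actual check. The sketch below is an assumption about how that could look, not what our controller does: TimedCheck and run/1 are hypothetical names, and :timer.sleep/1 stands in for a real check such as a database ping.

```elixir
defmodule TimedCheck do
  # Hypothetical helper: times any zero-arity function and builds the
  # same map shape as success/2 in the controller.
  def run(check) when is_function(check, 0) do
    # :timer.tc/1 returns {elapsed_microseconds, result}
    {micros, _result} = :timer.tc(check)

    %{success: true, msg: "ok", time: "#{div(micros, 1000)}ms"}
  end
end

# The sleep is a placeholder for real work.
IO.inspect(TimedCheck.run(fn -> :timer.sleep(20) end))
```

A real version would probably also set success to false when the check raises or times out.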

And this is what’s displayed on Geckoboard:

[Figure: Erlang VM Statistics on Geckoboard]

While we use Geckoboard, you can easily adapt the above examples to send to the monitoring dashboard of your choice.

If you are looking to use Geckoboard with Elixir, check out ExGecko. ExGecko supports Geckoboard’s Dataset and Push APIs and also provides built-in adapters for feeding Papertrail, Runscope, Heroku server, db and db backup logs to Geckoboard.