Dealing with long-running HTTP Requests and Timeouts in Phoenix

Phoenix is fast and highly concurrent, and it can process HTTP requests in less than a millisecond. Our priority is to serve requests as fast as possible, but in reality processing a request can sometimes take too long. When that happens, a timeout is triggered and the connection is closed.

Timeout with Chrome

Generally, a Phoenix action doesn’t just render a page; while processing the request it deals with databases, remote resources, external services and APIs, and each of these steps can take time.

We’ve seen in the Step-by-Step Tutorial to Build a Phoenix App that Supports User Upload how to deal with uploaded files, saving them locally.

The UploadController.create action is simple, but it does a series of things: it calculates the file's hash, copies the file into a local directory and saves some info into the database. Each of these steps can take time, especially if the file is large or the database is under heavy load.
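To get a feel for how long one of these steps takes, we can time it. The following is only a minimal sketch: UploadTiming, sha256/1 and timed/1 are hypothetical names, and SHA-256 is an assumption, since the hash used by the tutorial's action is not shown here.

```elixir
defmodule UploadTiming do
  # Hypothetical helper for illustration; it is not the tutorial's code.

  # Reads the file in 64 KiB chunks, so a large upload is never loaded
  # into memory all at once, and returns the lowercase hex SHA-256 digest.
  def sha256(path) do
    File.open!(path, [:read, :binary], fn device ->
      device
      |> hash_chunks(:crypto.hash_init(:sha256))
      |> :crypto.hash_final()
      |> Base.encode16(case: :lower)
    end)
  end

  defp hash_chunks(device, state) do
    case IO.binread(device, 65_536) do
      :eof -> state
      data when is_binary(data) -> hash_chunks(device, :crypto.hash_update(state, data))
    end
  end

  # Runs `fun` and returns {elapsed_milliseconds, result}.
  def timed(fun) when is_function(fun, 0) do
    {micros, result} = :timer.tc(fun)
    {div(micros, 1000), result}
  end
end
```

Calling UploadTiming.timed(fn -> UploadTiming.sha256(path) end) with the path of an uploaded file returns the elapsed milliseconds together with the hash, which makes it easy to see how the time grows with the file size.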

Usually, the files are saved in cloud storage (like AWS S3). This means we also need to consider the time it takes to upload the file (sometimes large) to the cloud storage service.

Simulate a Long-Running HTTP Request

To easily simulate a long-running request, we can use Process.sleep/1 in the action that is created by default with a new Phoenix project, PageController.index/2.

# lib/poetic_web/controllers/page_controller.ex

defmodule PoeticWeb.PageController do
  use PoeticWeb, :controller

  def index(conn, _params) do
    Process.sleep(65_000)

    render(conn, "index.html")
  end
end

We force the action’s process to sleep for 65 seconds before rendering the index page. We then make an HTTP request with curl, triggering the PageController.index/2 action.

$ curl -v localhost:4000

* Connected to localhost (127.0.0.1) port 4000 (#0)
> GET / HTTP/1.1
> Host: localhost:4000
> User-Agent: curl/7.54.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server

After 60 seconds, curl reports an error: Empty reply from server.

Phoenix doesn’t log any timeout error, which makes timeouts harder to spot. The only thing we notice is that the usual [info] Sent 2xx in .. line is missing.

[info] GET /
[debug] Processing with PoeticWeb.PageController.index/2
Parameters: %{}
Pipelines: [:browser]

The reason Phoenix doesn’t log a timeout error is that it does not handle this type of timeout. Cowboy — the HTTP server shipped with Phoenix — takes care of timeouts.

How can we solve this?

Phoenix, by default, gives us 60 seconds to process an HTTP request and respond back. And let’s be honest… that’s PLENTY of time in most cases.

But what can we do if we need more than one minute to process the request, for example to upload a really large file to AWS S3?

  • We can use a job queue like rihanna, which is a great choice, especially if we are sure that our task will take a number of minutes. We immediately return a status response to the client, deferring the actual processing to some point in the future. However, this entails further work: we need a Postgres database (which we will have to monitor and maintain) and we need to implement some sort of polling mechanism on the client side to check that the job has finished successfully.
  • We can take advantage of the awesome Phoenix channels, which make it easy to keep a persistent connection with the client, so we can receive the request and push the result back upon completion.
  • In some cases, it’s fine (and easier) to just slightly increase the timeout. Thanks to the BEAM (Erlang virtual machine), Phoenix is highly concurrent and can easily handle thousands (or even millions) of active connections, so a long-running request shouldn’t have a significant effect on other requests.
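The first option, answering immediately and letting the client poll for the result, can be sketched in plain Elixir. This is only an in-memory illustration of the idea and not how rihanna works: DeferredJobs and its functions are hypothetical names, nothing is persisted, and a job that crashes simply stays in the :running state.

```elixir
defmodule DeferredJobs do
  # Hypothetical in-memory stand-in for a job queue; names are illustrative.
  @task_sup DeferredJobs.TaskSupervisor
  @status DeferredJobs.Status

  # Starts the task supervisor and the agent that stores job statuses.
  def start do
    {:ok, _} = Task.Supervisor.start_link(name: @task_sup)
    {:ok, _} = Agent.start_link(fn -> %{} end, name: @status)
    :ok
  end

  # Starts `fun` in the background and returns a job id immediately.
  # The task runs under the supervisor and is not linked to the caller,
  # so it keeps running even if the calling request process goes away.
  def enqueue(fun) when is_function(fun, 0) do
    id = System.unique_integer([:positive])
    Agent.update(@status, &Map.put(&1, id, :running))

    {:ok, _pid} =
      Task.Supervisor.start_child(@task_sup, fn ->
        result = fun.()
        Agent.update(@status, &Map.put(&1, id, {:done, result}))
      end)

    id
  end

  # What a polling client would ask for: :running, {:done, result} or :unknown.
  def status(id), do: Agent.get(@status, &Map.get(&1, id, :unknown))
end
```

In a controller we would call DeferredJobs.enqueue/1, return the id to the client right away, and expose a second action that calls DeferredJobs.status/1. A real job queue adds what this sketch lacks: persistence, retries and failure reporting.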

Cowboy timeout

We focus on the last and simplest solution. Phoenix is based on Plug, which uses Cowboy as its default web server. It’s Cowboy that actually handles timeouts, closing the connections.

Looking at the dependencies, in the mix.exs file, we can see plug_cowboy, the Plug adapter for Cowboy web server.

# mix.exs
defp deps do
  [
    ...
    {:plug_cowboy, "~> 2.0"}
  ]
end

Since version 1.4, Phoenix uses Cowboy 2, which has a new set of timeout options:

  • request_timeout (default 5_000 milliseconds): time in ms with no requests before Cowboy closes the connection.
     This is the maximum time the client has to send the HTTP request. We do not touch this.
  • inactivity_timeout (default 300_000 milliseconds): time in ms with nothing received at all before Cowboy closes the connection.
  • idle_timeout (default 60_000 milliseconds): time in ms with no data received before Cowboy closes the connection.

The connection in the previous example was closed by idle_timeout. Once the client has sent the HTTP request, Cowboy waits up to 60 seconds to receive more data from the client. Since the client is just waiting for our response, no data arrives, and after 60 seconds the web server closes the connection.

To slightly increase the timeout, we need to change the Endpoint configuration in the Phoenix config file for the relevant environment, config/{dev,prod,test}.exs.

# config/dev.exs
config :poetic, PoeticWeb.Endpoint,
  http: [
    port: 4000,
    protocol_options: [
      idle_timeout: 70_000
    ]
  ],
  ...

By adding protocol_options: [idle_timeout: 70_000], we increase the timeout from 60 to 70 seconds, enough to be able to render the index page in our example.

After restarting Phoenix, we see that this time the server, after idling for 65 seconds, answers correctly and logs the processing time.

[debug] Processing with PoeticWeb.PageController.index/2
Parameters: %{}
Pipelines: [:browser]
[info] Sent 200 in 65009ms
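Since the timeout lives in Cowboy, the same behaviour can be reproduced without Phoenix. The following is a minimal sketch, assuming Elixir 1.12 or later (for Mix.install) and network access to fetch plug_cowboy; SlowPlug and port 4001 are hypothetical choices made for this example.

```elixir
# slow_plug.exs (hypothetical file name), run with: elixir slow_plug.exs
Mix.install([{:plug_cowboy, "~> 2.0"}])

defmodule SlowPlug do
  @behaviour Plug
  import Plug.Conn

  @impl true
  def init(opts), do: opts

  @impl true
  def call(conn, _opts) do
    # Same 65 second delay as the PageController example.
    Process.sleep(65_000)
    send_resp(conn, 200, "done\n")
  end
end

# :protocol_options is handed over to Cowboy, just like in the Endpoint config.
{:ok, _pid} =
  Plug.Cowboy.http(SlowPlug, [],
    port: 4001,
    protocol_options: [idle_timeout: 70_000]
  )

# Keep the script alive so the server can answer requests.
Process.sleep(:infinity)
```

With the script running, curl -v localhost:4001 should answer after about 65 seconds. Removing the protocol_options entry brings back the default idle_timeout, and with it the empty reply seen earlier.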

Linked processes and Task

Having a timeout helps to clean stale connections and release retained resources.

Sometimes, to serve an HTTP request, we need to take advantage of concurrency, spawning processes to get remote resources, connect to the database or call an external API.

It’s important to spawn processes linked to the connection process. This ensures that when a connection is closed due to a timeout, all the spawned processes are terminated.

If we just need to process some data concurrently, as we’ve seen in The Primitives of Elixir Concurrency, it’s usually better to use Elixir’s Task module instead of using primitives like spawn_link and receive directly.

Task.async spawns a process linked to the caller's process, so if the request times out, the task is terminated automatically.

# lib/poetic_web/controllers/page_controller.ex

defmodule PoeticWeb.PageController do
  use PoeticWeb, :controller

  def index(conn, _params) do
    task1 =
      Task.async(fn ->
        # long running task 1
        IO.inspect(self(), label: "PID task1")
        Process.sleep(80_000)
      end)

    task2 =
      Task.async(fn ->
        # long running task 2
        IO.inspect(self(), label: "PID task2")
        Process.sleep(75_000)
      end)

    Task.await(task1, :infinity)
    Task.await(task2, :infinity)

    render(conn, "index.html")
  end
end

In this example we start two tasks and then await their results. Using the Erlang observer, we can see the two tasks sleeping.

Erlang Observer — Two tasks sleeping

Since both tasks take longer than the default idle_timeout of 60 seconds, Cowboy closes the connection. We see that once the connection's process is terminated, all the linked processes are terminated, including the two tasks.

Erlang Observer — After Timeout
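The effect of the link can also be checked without Phoenix or the observer. In this minimal sketch, LinkedTaskDemo is a hypothetical module and the spawned "request" process is only a stand-in for the process that is terminated when the connection times out; killing it by hand is an assumption made to simulate the timeout.

```elixir
defmodule LinkedTaskDemo do
  # Hypothetical demo: the "request" process plays the role of the
  # process handling the HTTP request.
  def run do
    parent = self()

    request =
      spawn(fn ->
        task =
          Task.async(fn ->
            # Tell the demo which pid the task got, then "work" forever.
            send(parent, {:task_pid, self()})
            Process.sleep(:infinity)
          end)

        Task.await(task, :infinity)
      end)

    task_pid =
      receive do
        {:task_pid, pid} -> pid
      after
        1_000 -> raise "task did not start"
      end

    # Monitor the task so we are notified when it goes down.
    ref = Process.monitor(task_pid)

    # Simulate the timeout by killing the request process.
    Process.exit(request, :kill)

    receive do
      {:DOWN, ^ref, :process, ^task_pid, _reason} -> :task_terminated
    after
      1_000 -> :task_still_running
    end
  end
end

IO.inspect(LinkedTaskDemo.run(), label: "linked task")
```

Because Task.async links the task to its caller, the exit of the request process is propagated to the task, and run/0 should return :task_terminated. A process started with plain spawn, which creates no link, would keep running instead.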

Originally published at Poeticoding.