How to Pool PostgreSQL Connections with PgBouncer on Compose
The secret to a robust application is knowing how to manage resources, and your database connections are a precious resource. With PostgreSQL there are no exceptions; the more connections used, the more RAM that’s consumed. This is where connection pooling can help. In this article, we’ll look at pooling and give you a head start using PgBouncer, a popular connection pooling middleware for PostgreSQL.
Connection pooling is not a new concept to someone who manages a robust application where a multitude of database connections are used at any given time. However, if you’re new to connection pooling and new to PostgreSQL for that matter, you might be considering how to effectively manage your database connections. Ultimately, managing connections will increase the performance of your applications. This is where connection pooling comes in.
We’ll give you a brief overview of connection pooling and PostgreSQL, and then introduce you to PgBouncer, a popular connection pooling middleware that’s designed specifically for PostgreSQL. Then, we’ll show you how to configure it and how to start using it with your cloud-hosted PostgreSQL database (I’m using Compose).
Connection Pooling and PostgreSQL
PostgreSQL connections are forked processes that are created each time a client needs a connection. These are expensive operations because the client has to perform a handshake with your database, which basically comprises opening a connection, performing authentication tasks — where passwords are verified and TLS is established and configuration settings are shared — then closing the connection after it’s done.
With each connection, there is a memory cost of around 10 MB so let’s say we have 100 or 300 connections, that’s 1 to 3 GB of RAM allocated just for connections alone. This doesn’t include the amount of working memory your database needs to run memory intensive queries. So, the memory needed to create and run connections and other processes, as you might see, can get out of hand quickly when scaling up. Overall, this will probably degrade your application and database performance by orders of magnitude due to the increased workload and even crash your database.
Pooling takes care of this issue by sharing a pool of connections, or a finite set of connections set by the administrator that can use used and reused by any number of applications. A pool is essentially a wrapper around a database connection, with the internals of that connection between the database and the pool hidden from your applications.
Therefore, your applications don’t need to manage the actual connection to the database, they only need to connect to the pool and then close their connection once a task is done; therefore, returning it to the pool and freeing up a connection. You can think of it as a cached pool of connections. However, if all the connections have been used, the pool will either block the connection or return an error if it’s programmed to do so until a connection is freed up in the pool.
It’s preferable to set limits on the number of connections allowed in a pool. PostgreSQL’s default connection limit is set to 100 concurrent connections, which is also the default on Compose for PostgreSQL. Many connection pooling libraries and tools also set connections to 100 by default. Nonetheless, PostgreSQL recommends that you set the pool to a little lower than the actual max connection limit so that you still have a couple direct connections open for database maintenance and monitoring.
Dockerizing and Configuring PgBouncer
PgBouncer is a connection pooling middleware for PostgreSQL. Unlike other connection pooling middleware that exists for PostgreSQL, PgBouncer serves only as a pooling and proxy layer between PostgreSQL and your application. You can download and install it on various platforms — there’s even a Homebrew package for macOS — or you can install by creating your own Docker container.
For this example, we’re going to create a Docker image and run a container, which has been modified from the version here. That repo uses an older version of PgBouncer so we’ve updated it. You can get the docker container over on our Compose examples Github repository for Compose examples. Download the
Dockerfile and the
However, before building this container, let’s look some of the initial set up that we’ll have to do first.
Make sure that you have downloaded the
pgbouncer folder in the Github repository. The folder contains two files that you should be aware of:
userlist.txt. These are configuration files that PgBouncer will use to configure the pool and to authenticate users.
pgbouncer.ini file is the configuration file where you alias your database connections, set up the connection limits, and set up the SSL/TLS configuration. Since we're using Compose, we can configure PgBouncer using the
require SSL mode from the server. We'll come back to that.
In this configuration file, there are a number of settings that have been commented out. Many of these we will not use, but you can try them out to see if you want some of the extra features turned on or fine-tuned. PgBouncer has extensive documentation on each of these settings. A basic configuration that you might use looks similar to this:
Let’s take a look at what’s under
[pgbouncer] in turn.
PgBouncer Database Settings
We’ll assume you have a Compose for PostgreSQL database up and ready to work with.
[databases] that's where you create aliases for the databases you've set up, or the databases you want PgBouncer to manage.
As you can see, we have two databases in our Compose for PostgreSQL deployment. We’ve created users and passwords for these in our deployment because you don’t want to give your administrator password out to this file. It’s a lot safer to designate individual usernames and passwords for databases.
With two databases set up, let’s look at PgBouncer settings.
PgBouncer Pool Settings
[pgbouncer], are the settings that PgBouncer uses for configuring access to the pool, the number of connections allowed, ports to listen to, etc. The settings we've provided are very basic and can be modified at your leisure.
listen_addr is the address where you want to listen for TCP connections. The
*indicates all addresses.
listen_port refers to the port of the TCP connection; therefore, when you're connecting up your client, it will use that port, and when we connect to the Docker container we need to expose this port.
auth_file really go together.
auth_type refers to how you want PgBouncer to authenticate users - we don't authenticate against PostgreSQL here, only PgBouncer, using the
userlist.txt file. This file contains a username and the MD5 hash of a password and that username. Again, do not put your administrator password here. The format of the text file looks like:
admin_users setting, you can assign users from your
userlist.txt as admins who will have access to a special
pgbouncer database that provides performance-related information about PgBouncer.
Since you connect over TLS to Compose for PostgreSQL, you’ll also need to set
require, which is the same as you have in your connection string.
pool_mode, you have three options: transaction, session, and statement. "Session" mode is the default option, for when an application connects to PgBouncer, it gets a connection from the pool, connects to PostgreSQL using PgBouncer's connection, and then the application returns the connection when it's done. "Transaction" mode is not a constant connection between the client and the server, depending on the release of a connection by the application, but it releases the connection once a transaction by the application is finished. "Statement" mode means that the connection is returned after any query is finished, basically, it requires you to auto-commit. Each one of these approaches has its benefits and tradeoffs, and you can read about each one here. Usually, it's preferred to run in session or transaction mode, with transaction mode being preferable in many cases.
For the settings
default_pool_size, the former refers to the number of applications that will make connections and the latter is how many server connections per database. The defaults are set at 100 and 20, respectively. PgBouncer has a formula for determining the pool size and the number of clients that you should set, but the default is usually more than enough.
With the configuration set up, we’re ready to build a Docker container image that will run PgBouncer. From the same directory where you downloaded the example
pgbouncer folder, run:
docker build . -t pgbouncer
This will create an image with PgBouncer installed with the appropriate
userlist.txt files added for us.
Next, let’s run the image:
docker run -d --name pgbouncer -p 6432:6432 --mount source=pgbouncer,target=/etc/pgbouncer pgbouncer
Here, we run the image in the background
-d, or you can use
-it to view the logs interactively, open port
6432 up which we'll use to connect to PgBouncer, mount the
pgbouncer volume to the container, and use the
pgbouncer image we created.
Once it’s running, you can start making connections to PgBouncer which will interact with PostgreSQL. For example, when PgBouncer is running with the above configuration, we can log in using:
psql -p 6432 -d dev -h 0.0.0.0 -U app_admin
This will prompt you for the non-hashed password user “app_admin” who will connect to the
dev database or the alias we configured in the
[databases] section above. We don't need to specify the host in this example since we're running this locally. However, for applications, you'll need to specify the host to connect to since you won't host your apps locally, we hope.
That’s it for connecting to the database … let’s look at PgBouncer stats, briefly.
Getting PgBouncer Stats
If you want to view the stats that PgBouncer collects to determine how your application’s connections are running, who’s connecting, and any other information, then PgBouncer has it’s own database that it creates to provide you with this information. Using one of the admins for PgBouncer, you’d connect like:
psql -p 6432 -U app_admin -h 0.0.0.0 pgbouncer
This will give you a prompt like:
psql (9.6.9, server 1.9.0/bouncer)
Type "help" for help.
From here, you can get all of the commands by running
Here’s the pools look like running when connecting the
And the connected clients using
Time to Bounce …
This was an introduction to getting set up with PgBouncer and Compose for PostgreSQL. PgBouncer allows you to easily set up a connection pool that keeps your database performative, while also managing the connections between your applications and PostgreSQL. The overview provided is only an example of what you can do with the default setting so if your applications are connection hungry, or you have a lot of applications using PostgreSQL, you might consider making adjustments, or exploring other options that PgBouncer has to offer.