PgBouncer Installation, Configuration and Use Cases for Better Performance

Published in

The Startup

7 min readDec 6, 2020

Hi everyone,

In this article, I will try to explain how to install and configure PgBouncer for production use cases.
I hope you all enjoy and solve some issues in your environment :)

WHY DO WE NEED CONNECTION POOLING?

PostgreSQL has heavy weight connection handling architecture. Every incoming connection creates a new backend process which is considerably costly. This is not efficient when it comes to high connection numbers on production.

Usually when a client sends a request to PostgerSQL, it creates a connection on DB. Until it responds back to client, DB connection remains open. This is a problem if connections created and discarded in high frequency. Because these connections use their own memory allocation on the server. This is exactly why we need PgBouncer.

WHAT IS PgBouncer?

PgBouncer is an open-source, lightweight, single-binary connection pooler for PostgreSQL. It can pool connections to every possible database on the server it runs and uses TCP and Unix domain sockets.

When we use PgBouncer, it maintains a pool of connections for each unique user-database pairs. After that when a client tries to connect to PotgreSQL comes to PgBouncer first. PgBouncer gives a connection from the pool and returns it back to the pool after client disconnects. The important point is PgBouncer caches the connections after first connection. It doesn’t create a connection again and again. In this way we can reduce the load and improve the query performance on PostgreSQL dramatically.

PARAMETERS OVERVIEW

pool_mode
* session : Whenever a request comes to PgBouncer, keeps pool allocated. If the pool is full, then request will wait as many seconds as query_wait_timeout parameter. If it gets a connection in that time everything is OK. But if it doesn’t you should probably check your configuration and pgbouncer logs for long running queries.
* transaction : Workers will be assigned according to transactions. Which means it maybe considerable in multi-transactional environments. This configuration is more aggresive than session mode.
* statement : This is the most aggressive configuration parameter for pool mode. If it is well designed probably it will be the most effective mode for pooling. But performance comes with high load. This should be considered before setup.
default_pool_size : This configuration is equal to max pool size for user/db match. Should be combined well with pool mode.
reserve_pool_size : This is a backdoor for business critical operations. If a client has to open a connection and pool is full, client will wait as many seconds as reserve_pool_timeoutand going to use additional connections via this configuration.
reserve_pool_timeout : If a client has not been serviced in this many seconds, will use additional connections from the reserve_pool. Setting 0 disables this parameter.
server_lifetime : The pooler will close an unused server connection that has been connected longer than this. Setting it to 0 means the connection is to be used only once, then closed.
server_idle_timeout : If a server connection has been idle more than this many seconds it will be dropped. If 0 then timeout is disabled.
server_connect_timeout : If connection and login won’t finish in this amount of time, the connection will be closed.
server_login_retry : If login failed, because of failure from connect() or authentication that pooler waits this much before retrying to connect.
query_timeout : Queries running longer than that are canceled. This should be used only with slightly smaller server-side statement_timeout, to apply only for network problems.
query_wait_timeout : Maximum time queries are allowed to spend waiting for execution. If the query is not assigned to a server during that time, the client is disconnected. This is used to prevent unresponsive servers from grabbing up connections.
It also helps when the server is down or database rejects connections for any reason. If this is disabled, clients will be queued indefinitely.
client_idle_timeout : Client connections idling longer than this many seconds are closed. This should be larger than the client-side connection lifetime settings, and only used for network problems.
client_login_timeout : If a client connects but does not manage to log in in this amount of time, it will be disconnected. Mainly needed to avoid dead connections stalling SUSPEND and thus online restart.

Before installation you may want to visit the page below.

http://www.pgbouncer.org/config.html#hba-file-format

INSTALLATION

Environment Setup;
os : centos 7
db : postgresql-11
kernel version : 3.10.0–1062.9.1.el7.x86_64

Installation of PgBouncer is pretty simple. You can follow the codes below.

1. Install PgBouncer
sudo yum install -y pgbouncer2. Folder permissions to postgres user
chown postgres:postgres /etc/pgbouncer/ -R

3. Create Service file for PgBouncer and paste config into it
vi /usr/lib/systemd/system/pgbouncer.service[Unit]
Description=A lightweight connection pooler for PostgreSQL
Documentation=man:pgbouncer(1)
After=syslog.target network.target[Service]
RemainAfterExit=yesUser=postgres
Group=postgres# Path to the init file
Environment=BOUNCERCONF=/etc/pgbouncer/pgbouncer.iniExecStart=/usr/bin/pgbouncer -q ${BOUNCERCONF}
ExecReload=/usr/bin/pgbouncer -R -q ${BOUNCERCONF}# Give a reasonable amount of time for the server to start up/shut down
TimeoutSec=300[Install]
WantedBy=multi-user.target
4. Open PgBouncer config/init file and paste the config below
[databases]
* = port=5432 auth_user=postgres[pgbouncer]
logfile = pgbouncer.log
pidfile = pgbouncer.pid
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = hba
auth_hba_file = /path/to/pg_hba.conf
admin_users = postgres
stats_users = postgres
pool_mode = session
ignore_startup_parameters = extra_float_digits
max_client_conn = 200
default_pool_size = 50
reserve_pool_size = 25
reserve_pool_timeout = 3
server_lifetime = 300
server_idle_timeout = 120
server_connect_timeout = 5
server_login_retry = 1
query_timeout = 60
query_wait_timeout = 60
client_idle_timeout = 60
client_login_timeout = 60


5. Start PgBouncer service
systemctl start pgbouncer

Until here all the configs are default. The last part of the config is PgBouncer config/init file(pgbouncer.ini). There are some different types of authentication for PgBouncer. I will talk about the common ones.
Let’s consider the default auth_file = users.txt.

md5 : Default auth method for PgBouncer.
1.Run the query below.
SELECT CONCAT(‘“‘,pg_shadow.usename, ‘“ “‘, passwd, ‘“‘) FROM pg_shadow;
2.Copy output and paste into /etc/pgbouncer/users.txt
3.In ‘pgbouncer.ini’ make changes below
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

hba : Configuration that i choose. Dont forget to delete replication db configurations in your ‘hba.conf’ file, because pgbouncer doesn’t support replication. You can use a fake ‘hba.conf’ file instead of your original file.

1. Check your pg_hba.conf and comment replication db parts. After that make sure that PgBouncer can parse it.
2. In ‘pgbouncer.ini’ make changes below
auth_type = hba
auth_file = /path/to/pg_hba.conf

trust : No authentication is done. Just write your users to users.txt

1. List all users in enivronment
2. Paste user names into /etc/pgbouncer/users.txt
3.In ‘pgbouncer.ini’ make changes below
auth_type = trust
auth_file = /etc/pgbouncer/users.txt

any : Like the trust method, but the user name given is ignored. Requires that all databases are configured to log in as a specific user. Additionally, the console database allows any user to log in as admin. Means who cares :)
Just like the Rumi’s saying;
“Come, come, whoever you are. Wanderer, worshiper, lover of leaving. It doesn’t matter. Ours is not a caravan of despair. come, even if you have broken your vows a thousand times. Come, yet again , come , come.”
- Maulana Jalaluddin Rumi
Of course this is a joke, i wanted to make you smile guys :)

1. No need for extra config. Only in ‘pgbouncer.ini’ make changes below
auth_type = any

NOTE : If you change the auth type make sure you restarted the service.

BENCHMARK TESTS

Test Parameters are constant.
Number of Clients : 300
Number of Threads : 16
Number of Transactions Per Client : 100

Postgresql Benchmark Results (Without Pooling)

As u can see the results above, PgBouncer makes more efficient connection and session management than PostgreSQL itself. It also much more faster because of caching mechanism. Both benchmark tests are almost used same amount of CPU.
As you can see latency average in PgBouncer %25 lesser than PostgreSQL. Also PgBouncer processed %33 more transactions than PostgreSQL and handled %50 more TPS than PostgreSQL itself.
These statistics show us why we need to use connection pooling ,how will it affect our high frequency transactional environments, how costly is user and session management for PostgreSQL.
Just for better understanding and visualize data, you can see graphical results of benchmark tests below.

CONCLUSION

PostgreSQL connection management is costly. Thats why it needs a lightweight connection pooler for better results in same circumstances.

I mentioned about what i tried and what were the results. These results are always improvable. Please feel free to contact me. I am always open for new ideas. I will also share automation processes for PgBouncer in my next articles.

Thanks to my colleague “Hüseyin Demir” for supporting this improvement work with me. His point of view enriched this work.

My main approach is “No need to have a problem you can always make some things better”. I hope you enjoyed and found this article useful :)

I will continue to write articles about Couchbase, SQL Server and Postgresql as soon as possible. I like to share my knowledge and make improvements in the environments work.

Keep Following guys :)

Ahmet Demir