Go and MySQL: Setting up Connection Pooling

When writing code for high-load conditions, you often run into problems related to the database. I used to run various high-load services in Go with MySQL as the data storage, and while maintaining them I had to solve all sorts of problems with the database: misused connection pooling and prepared statements, poorly written queries… All of this affects performance and fault tolerance. Since then I have accumulated some tips to share with you to make your Go code more reliable in terms of MySQL usage.

One of the most common problems you may encounter is misconfiguration. In this blog post I will explain the basic concepts needed to understand where such problems come from, and give you a compilation of best practices for Go and MySQL interaction.

So I’ll share some advice on configuring connections to MySQL. We’ll go through the main points you need to pay attention to when working with MySQL from Go.

  • We’ll cover the basics of the MySQL client-server protocol, its structure and operating principle.
  • Then we’ll move on to the Go part and analyze the implementation of connection pooling.
  • We’ll go from configuring connections to making queries, looking at the driver code along the way.

MySQL Client-server protocol

It is a half-duplex protocol, in other words a synchronous one: at any given moment you can either send a request or receive a response, not both at once.

After sending a command to the server, the client has to wait for the response before initiating another action. The response also has to be received in its entirety; the client cannot cut it off halfway unless it drops the connection.

All data transmitted over the protocol is packed into packets. There are predefined packet types on the server and client sides, and the two exchange these packets with each other.

Generally the client sends a request in a single packet, while the server may respond with several packets. And even if the client needs only some of those packets, it still has to receive all of them.

TCP is used as the transport over which the client and server exchange these binary packets.

The MySQL protocol has two main phases:

Connection phase

Once a TCP connection is established, the MySQL server sends a handshake packet to the client, after which connection parameters are exchanged. For example, the client can request an SSL connection in the handshake response packet it sends back to the server.
After the first phase completes successfully, the protocol enters the command phase.

Command phase

In this phase the client takes the initiative and sends command packets to the server. These include queries, prepared statements, stored procedures, and replication commands. The server processes the client’s requests and responds with a result.

We can take a look at the MySQL protocol itself. For this, the server has a special plugin that prints a protocol trace. It is meant for testing (not for production), so it is disabled by default; to enable it you need to build the MySQL server with the WITH_TEST_TRACE_PLUGIN option.

You can watch the trace output in the terminal after connecting to the server. Here’s what it looks like.

Another way to look at the data being transmitted is to use the ngrep utility on the client:

sudo ngrep -x -q -d any port 3306

The output between the client and the server will look as follows.

Libraries for Go

The standard library contains only the common interface, database/sql, for all SQL and SQL-like databases. The language maintainers left the implementation of the protocols of specific databases to third-party developers. That is why we will use the go-sql-driver/mysql driver, which ships as a separate package.

The standard database/sql package uses connection pooling for database connections. This is because MySQL protocol requests are synchronous: each client can execute the next request only after it has finished the previous one.

But are there any other options?

One connection

The first thing that might come to mind is to use one global connection for the entire service. But then all new requests have to wait until the connection is free; this becomes a bottleneck, and our service may hang for a long time.

A connection per request

You can create a new connection on each request, closing it on completion.

While it solves the main issue of waiting for a connection, there are some other challenges related to its use.

First, we won’t be able to control the growth of the number of connections to our database. The number of connections will depend directly on the number of incoming requests: how many connections the service opens is determined by how many requests arrive, and that is not good for the database. Under normal load we may not notice any problems, but during peak load we can hit the maximum allowed number of connections on the database side, and it will simply stop responding. Connection errors will then appear both in our service and in every other service that uses this database.

Second, for every request we will be forced to go through the first phase of connecting to the MySQL server, in which the TCP connection, handshake and authorization take place. Compared to the command phase, this one is expensive, so we will spend extra time creating a new connection every time.

Thus, the connection pool solves all these problems to one degree or another.

Connection pooling

A pool helps us control the total number of connections that a service can use to access the database. This way we both protect the database and minimize how often we go through the first phase of the connection, because as a rule the pool holds connections that are already in the command phase, with the first phase already completed.

Pool initialization

Let’s start with initializing our pool and see what happens in the code directly from the documentation of the go-sql-driver package.
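
As a minimal sketch in the spirit of the go-sql-driver documentation (the DSN, credentials and limit values here are placeholders):

package main

import (
    "database/sql"
    "time"

    _ "github.com/go-sql-driver/mysql" // registers the "mysql" protocol in database/sql
)

func main() {
    // Placeholder DSN: user, password, host and database name are illustrative.
    db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/dbname")
    if err != nil {
        panic(err)
    }
    defer db.Close()

    // Pool configuration methods, discussed in detail below.
    db.SetConnMaxLifetime(3 * time.Minute)
    db.SetMaxOpenConns(10)
    db.SetMaxIdleConns(10)
}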

At the very beginning, importing the go-sql-driver package registers the mysql protocol (driver) with database/sql.

Then, when calling the Open method, we say that we want to open a pool that will use the protocol named mysql (the first parameter), implemented in the go-sql-driver package, with the connection configuration passed in the second parameter. After that the methods for configuring our pool are called, which we will discuss a bit later.

Pool view
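
An abridged view of the DB struct from database/sql (around Go 1.15; the exact set of fields varies between Go versions):

// Abridged: only the fields relevant to pooling are listed.
type DB struct {
    connector driver.Connector // used to create new connections

    mu           sync.Mutex                  // protects the fields below
    freeConn     []*driverConn               // the set of free (idle) connections
    connRequests map[uint64]chan connRequest // callers waiting for a free connection
    numOpen      int                         // number of opened and pending-open connections

    maxIdleCount int           // limit set by SetMaxIdleConns
    maxOpen      int           // limit set by SetMaxOpenConns; <= 0 means unlimited
    maxLifetime  time.Duration // set by SetConnMaxLifetime
    maxIdleTime  time.Duration // set by SetConnMaxIdleTime
    // ...
}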

This structure represents our pool and holds the various fields needed for its configuration and operation. For example, the freeConn field is a slice of references to connections of our chosen protocol; in effect, it is the set of free connections itself. The connRequests field is used to queue requests for a new connection, and so on.

Calling the Open method results in the creation of a new DB structure.

Creating an empty pool

Inside Open, the OpenDB function is called, which initializes the structure with its initial data.
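
Roughly as it looks in the standard library around Go 1.15 (details differ between versions):

func OpenDB(c driver.Connector) *DB {
    ctx, cancel := context.WithCancel(context.Background())
    db := &DB{
        connector:    c,
        openerCh:     make(chan struct{}, connectionRequestQueueSize),
        lastPut:      make(map[*driverConn]string),
        connRequests: make(map[uint64]chan connRequest),
        stop:         cancel,
    }

    // A background goroutine that opens new connections on demand.
    go db.connectionOpener(ctx)

    return db
}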

As you can see from the code, the pool is initialized empty: at this stage we have not yet established a single connection to the database. So if the configuration contains invalid data (for example a wrong login or password), calling the Open method will not tell you about it yet.

So, we have created a pool, and after that it must be configured.

Pool configuration

The following methods are mainly used to configure a pool.

Let’s see them in more detail.
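
A minimal sketch of these calls; the limit values are purely illustrative and assume the db handle opened earlier:

// Illustrative values; pick limits that match your database and workload.
db.SetMaxOpenConns(25)                 // maximum number of open connections to the database
db.SetMaxIdleConns(25)                 // maximum number of idle connections kept in the pool
db.SetConnMaxLifetime(5 * time.Minute) // how long a connection may live since it was created
db.SetConnMaxIdleTime(2 * time.Minute) // how long a connection may sit idle in the pool (Go 1.15+)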

Maximum number of connections

Using the SetMaxOpenConns() method we set the maximum possible number of open connections to the database from our pool. Be careful: by default this value is 0, which means an unlimited number of connections. This can lead to the situation we talked about above, when every request to your service opens a new connection.

So when are the connections actually created? Usually, right after the pool is initialized, the Ping method is called, which establishes a connection to the database. At this stage you can already tell whether the configuration is correct.

err = db.Ping()
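
Or, as a sketch with a timeout and error handling (PingContext lets you bound the check):

// Verify that the DSN, credentials and network actually work before serving traffic.
ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
defer cancel()

if err := db.PingContext(ctx); err != nil {
    log.Fatalf("cannot reach MySQL: %v", err)
}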

Connection creation strategies

Before each request our pool tries to get a connection by calling:

conn(ctx context.Context, strategy connReuseStrategy)

The method implements two strategies for obtaining a connection: alwaysNewConn always creates a new connection, while cachedOrNewConn first tries to take a free connection from the pool, if there is one.

If there are no free connections in the pool, the method creates a new one. If a new connection cannot be created because of the limit on the number of open connections, a request for a connection is queued and the caller waits until it is fulfilled.

A connection obtained this way immediately goes into use and is marked as busy; at that point we are ready to execute our queries. Once we release the connection, it tries to return to the pool.

Pool size limitation

Using the SetMaxIdleConns() method we set the maximum number of free connections that can be stored in the pool, waiting to be used. By default up to two connections can be kept in the pool.

The query method is one of the most frequently used methods for executing queries; in it the connection is released after use and an attempt is made to put it back into the pool of free connections. In the database/sql code the dc.releaseConn function is responsible for this. If the pool already holds the maximum number of connections, the freed connection is closed instead.

As for when connections get closed, that too can be configured.

Connection lifetime

The SetConnMaxLifetime() method is responsible for determining the lifetime of the connection in the pool from the moment it was created.

The age check is performed every time you try to get a connection from the pool. The connectionCleaner goroutine also monitors the age: it loops over the free connections and closes the ones whose lifetime has expired.

But wait, there’s more. Starting with Go 1.15 you can also specify an idle time-out for connections.

Pool time-out

The SetConnMaxIdleTime() method sets the maximum amount of time a connection can sit idle in the pool. This setting is very similar to the previous one; the main difference is that the connection lifetime is counted from the moment the connection was created, while the idle time-out is counted from the moment the connection was returned to the pool.

After this time has elapsed the connection will also be closed.

Returning a connection to the pool

As we have already said, when executing a request the connection is taken from the pool or created if there is nothing in the pool, after which it is marked as busy. After making requests we must release it.

When we receive the data for our query, we get back a Rows structure with the results, which also holds our connection.
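
A sketch of the kind of problematic pattern meant here (the query, table and column are invented for illustration):

// Anti-pattern sketch: rows is never closed, not even in a defer.
func clientName(db *sql.DB, id int64) (string, error) {
    rows, err := db.Query("SELECT name FROM client WHERE id = ?", id) // hypothetical query
    if err != nil {
        return "", err
    }

    var name string
    if rows.Next() {
        if err := rows.Scan(&name); err != nil {
            return "", err // rows is still open on this path
        }
    }
    return name, nil // and on this one: the connection can stay busy forever
}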

When the Close() method is called, the connection is released and an attempt is made to put it back in the pool, or to close it if there is no free space in the pool. So writing code as in the example above is not good. Be sure to call Close, at the very least in a defer.

If you do not release connections, they will simply hang around in the busy state, consuming memory and counting toward the number of open connections. As a result we can end up with an effectively empty pool: all existing connections are considered busy, yet we cannot create a new one because we have reached the limit of maximum open connections. This leads to a situation in which we can neither create new connections nor use the existing ones.

But it is even better to do it as shown in the sketch below.

When we call Scan(), Go tries to convert the bytes that came from MySQL into the data types passed as parameters. Note that the query execution method is tied to a context: if the context is cancelled while Scan() is running, the data may end up corrupted. This problem is described in considerable detail in this article, in the section “The race”.

You can catch this error by explicitly calling the Close method and checking the error it returns. We still keep Close in a defer to make sure that if a panic occurs, the connection is returned to the pool.
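
A minimal sketch of that pattern, reusing the invented query from above:

func clientName(ctx context.Context, db *sql.DB, id int64) (string, error) {
    rows, err := db.QueryContext(ctx, "SELECT name FROM client WHERE id = ?", id) // hypothetical query
    if err != nil {
        return "", err
    }
    // Safety net: even if we panic below, the connection goes back to the pool.
    defer rows.Close()

    var name string
    if rows.Next() {
        if err := rows.Scan(&name); err != nil {
            return "", err
        }
    }
    // Explicit Close (and Err) lets us catch errors that a deferred Close would swallow.
    if err := rows.Close(); err != nil {
        return "", err
    }
    if err := rows.Err(); err != nil {
        return "", err
    }
    return name, nil
}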

Next, I will use synthetic examples to show how all this affects real requests. But before that, let’s look at one more interesting nuance that affects our requests.

Interpolation of parameters

Generally, placeholders are used to pass parameters into queries and avoid possible SQL injections.

db.QueryRow("SELECT * FROM client WHERE id=?", id)

But this way of making requests can work differently depending on the settings.

The MySQL driver also supports the interpolateParams=true/false connection parameter. By default its value is false, which is the same as not passing the parameter at all.

If it is false, then to execute the query above a command is first sent to the database server to prepare the statement (a prepared statement), after which a second command is sent to execute it.

If we have a large, complex query with numerous placeholders that will be executed several times within a method (for example, when processing data in batches), prepared statements can speed things up: having done the preparation once on the MySQL server, we reuse the prepared query without spending resources on parsing it every time.

If interpolation is enabled, i.e. set to true, the placeholders are replaced with values on the client side, in the Go driver code. But if you do not enable parameter interpolation, then for ordinary one-off queries we send two requests over TCP to the MySQL server instead of one (not counting establishing the connection itself).

Request with the parameter interpolateParams=false

The same request with interpolateParams=true
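
Enabling interpolation is just a DSN parameter; a sketch with placeholder credentials:

// With interpolateParams=true the driver substitutes placeholders on the client side,
// so a simple parameterized query costs one round trip instead of prepare + execute.
dsn := "user:password@tcp(127.0.0.1:3306)/dbname?interpolateParams=true"
db, err := sql.Open("mysql", dsn)
if err != nil {
    log.Fatal(err)
}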

Connection pooling in practice

Let’s see some pool configurations in action. I will run the query "SELECT SLEEP(2);" against a MySQL database in a loop and watch what happens with the connections. In the code there is a one-second pause after each request is launched, so the connections do not get reused right away and we can observe them more easily.
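
A rough sketch of what such a NewPool helper and the test loop might look like; the helper itself, the DSN and the first run’s limits (a single open connection, nothing kept idle) are assumptions:

package main

import (
    "database/sql"
    "log"
    "sync"
    "time"

    _ "github.com/go-sql-driver/mysql"
)

// Hypothetical helper matching the NewPool(...) calls used in the snippets below.
func NewPool(maxLifeTime time.Duration, maxOpenConns, maxIdleConns int) *sql.DB {
    db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/dbname") // placeholder DSN
    if err != nil {
        log.Fatal(err)
    }
    db.SetConnMaxLifetime(maxLifeTime)
    db.SetMaxOpenConns(maxOpenConns)
    db.SetMaxIdleConns(maxIdleConns)
    return db
}

func main() {
    // Assumed first configuration: one allowed connection, nothing kept idle.
    db := NewPool(10*time.Second, 1, 0)
    defer db.Close()

    var wg sync.WaitGroup
    start := time.Now()
    for i := 0; i < 5; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            var x int
            if err := db.QueryRow("SELECT SLEEP(2);").Scan(&x); err != nil {
                log.Println(err)
            }
        }()
        time.Sleep(time.Second) // one-second pause after launching each request
    }
    wg.Wait()
    log.Printf("done in %s", time.Since(start))
}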

This leads us to the situation we talked about above: each call of our method holds the connection, and the remaining requests wait until it is released. Note that the whole run takes 11 seconds.

Then I will change the configuration to maxOpenConns := 0:

maxLifeTime := time.Second * 10
maxOpenConns := 0
maxIdleConns := 0
db := NewPool(maxLifeTime, maxOpenConns, maxIdleConns)

This brings us to our second case, where every request creates a new connection to the database. As a result we get an error when trying to connect, because of the limit on the number of connections on the database side.

Now we’ll set maxOpenConns = 3, so that we have some limit in place.

maxLifeTime := time.Second * 10
maxOpenConns := 3
maxIdleConns := 0
db := NewPool(maxLifeTime, maxOpenConns, maxIdleConns)

In this case we can open three connections that work in parallel, and the same five requests took 6.2 seconds instead of 11 seconds as in the first case. But at the same time we see that for each request we go through the stage of connecting to the database again.

Let’s configure the pool so that we connect as rarely as possible. Let’s set maxIdleConns = 3:

maxLifeTime := time.Second * 10
maxOpenConns := 3
maxIdleConns := 3
db := NewPool(maxLifeTime, maxOpenConns, maxIdleConns)

As a result we go through the connection stage only three times and put the ready connections into the pool. For subsequent requests the connections are taken from the pool with the authorization phase already completed, and all that is left is to send the command. Overall, the requests run faster.

Let’s sum it up

  1. Never leave the connection pool settings at their defaults; always tune them for your needs.
  2. Set maxOpenConns below the connection limits configured on the database side.
  3. maxOpenConns = 0 means an unlimited number of connections.
  4. If your MySQL server has timeouts configured for idle connections and closes them on its side, there is little need for maxLifeTime, since the Go driver will detect the broken connection and establish a new one.
  5. Choose the maxLifeTime value so that connections are created as rarely as possible during peak load on your service, while idle connections do not hang around for long once the load drops.
  6. Do not forget about SetConnMaxIdleTime (available since Go 1.15). With it you can shorten the time a connection sits in the pool by removing it if nobody is using it, saving open connections on the database side and memory in the service.
  7. Always release the connection after a request (rows.Close()). Some methods do it automatically, but it is fine to repeat the Close() call just to be safe.
  8. If you use multiple pools for the same database (for example, a read pool and a write pool), the sum of their maxOpenConns values must stay below the connection limits configured on the database side.
  9. If you scale your service, remember that the number of open connections grows too. If you run three instances, make sure the sum of maxOpenConns across all three stays below the database-side connection limits.
  10. Do not forget about the various proxies that may stand between you and your database, with their own timeouts and restrictions. Take those intermediary settings into account too.
  11. Consider the specifics of your service: take the nature of the load into account and adjust the pool settings accordingly.
  12. Do not forget about monitoring, and keep an eye on the connection pool itself; see the sketch below.
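
For the last point: database/sql exposes pool statistics through db.Stats(); a minimal sketch of logging them periodically (the interval and format are arbitrary):

// Periodically dump pool statistics; in a real service you would export these as metrics.
go func() {
    for range time.Tick(10 * time.Second) {
        s := db.Stats()
        log.Printf("open=%d inUse=%d idle=%d waitCount=%d waitDuration=%s idleClosed=%d lifetimeClosed=%d",
            s.OpenConnections, s.InUse, s.Idle, s.WaitCount, s.WaitDuration, s.MaxIdleClosed, s.MaxLifetimeClosed)
    }
}()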

Thank you for taking the time to read my article. If you have any questions, please leave a comment below and I will be happy to answer.
