MongoDB Cluster, HAProxy and the failover issue
Our MongoDB cluster runs a single replica set behind HAProxy. We use a simple health check to identify the master, so our applications always connect to it. But restarting the proxy, or bringing down a Mongo instance, caused a strange error: the driver tried to connect to a replica set member by its hostname instead of going through the proxy's IP address, and failed. MongoDB's own replication discovery had started to interfere with the health check in HAProxy. When this happened, our Rails apps using Mongoid displayed empty results with no related errors.
Our health check probes port 27020 to find the master. We added an xinetd service that answers the check like an HTTP request, reporting whether the local node is the master, like this:
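A sketch of what such a setup typically looks like (the service name, script path, and the use of `db.isMaster()` are assumptions, not the original files):

```
# /etc/xinetd.d/mongodbchk -- hypothetical service name
service mongodbchk
{
    disable       = no
    flags         = REUSE
    socket_type   = stream
    port          = 27020
    wait          = no
    user          = nobody
    server        = /usr/local/bin/mongodb-check-master
    only_from     = 0.0.0.0/0
    per_source    = UNLIMITED
}
```

```shell
#!/bin/bash
# /usr/local/bin/mongodb-check-master (hypothetical path)
# Reply with HTTP 200 only when the local mongod is the replica set master,
# so HAProxy's httpchk marks only the master as healthy.
MASTER=$(mongo --quiet --eval 'db.isMaster().ismaster' localhost:27017)
if [ "$MASTER" = "true" ]; then
    printf 'HTTP/1.1 200 OK\r\n\r\nmaster\r\n'
else
    printf 'HTTP/1.1 503 Service Unavailable\r\n\r\nnot master\r\n'
fi
```

HAProxy would then use `option httpchk` against port 27020 on each backend, routing traffic only to the node that answers 200.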
We thought Mongoid couldn't reconnect automatically, but the documentation stated there was a connection pool. After trying all sorts of timeout settings in both HAProxy and Mongoid we still couldn't solve the issue. I started researching the connection pool and discovered a very important setting that bypasses MongoDB's internal failover scheme: connect 'direct'. With it, the driver treats the Mongo cluster as a single instance behind HAProxy. The health check already takes care of failover, so Mongoid shouldn't have to worry about it.
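In a recent Mongoid this would go in `config/mongoid.yml`; the exact keys vary by Mongoid version, and the host and database names below are placeholders:

```
# config/mongoid.yml -- hypothetical app name and host
production:
  clients:
    default:
      database: my_app_production
      hosts:
        - haproxy.internal:27017   # the HAProxy frontend, not the replica set members
      options:
        connect: :direct           # skip replica set discovery; trust the proxy
```

With `connect: :direct` the driver never learns the individual replica set member hostnames, so it can't try to reach them behind HAProxy's back.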
For the plain mongo-ruby-driver gem you can use the following connection URL:
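Something along these lines, where the host and database names are placeholders for your own setup:

```
mongodb://haproxy.internal:27017/my_app_production?connect=direct
```

The `connect=direct` query option tells the Ruby driver to talk only to the address given, instead of discovering and connecting to the replica set members directly.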
So if you have a Mongo cluster load-balanced by HAProxy, be sure not to use Mongo's internal failover system.