How we scaled MongoDB (part 1)
At Touch4IT, we frequently use MongoDB as a database solution for client projects. MongoDB has some interesting properties, but it tends to use a lot of RAM, so we are exploring ways to scale MongoDB beyond a replica set running on a single Docker host.
This is the first post from our upcoming series on scaling MongoDB.
Intro
A slightly advanced understanding of Docker, networking and Linux is assumed.
In this series we show how we scaled our MongoDB across different VMs, without any downtime. The starting state is a MongoDB cluster running on a single VM (nodeA). The desired state is to have a 4th MongoDB replica on a separate VM (nodeB).
Assume the following docker-compose.yml:
version: '3.6'
services:
  mongo1:
    image: mongo:3.4.20
    command: bash -c "mongod --replSet rs1 && mongo"
    container_name: mongo1
    expose:
      - 27017
    volumes:
      - ./mongorc.js:/etc/mongorc.js
    networks:
      mongo:
        ipv4_address: 172.30.0.11
  mongo2:
    image: mongo:3.4.20
    command: bash -c "mongod --replSet rs1 && mongo"
    container_name: mongo2
    expose:
      - 27017
    volumes:
      - ./mongorc.js:/etc/mongorc.js
    networks:
      mongo:
        ipv4_address: 172.30.0.12
  mongo3:
    image: mongo:3.4.20
    command: bash -c "mongod --replSet rs1 && mongo"
    container_name: mongo3
    expose:
      - 27017
    volumes:
      - ./mongorc.js:/etc/mongorc.js
    networks:
      mongo:
        ipv4_address: 172.30.0.13
networks:
  mongo:
    driver: bridge
    name: mongo
    ipam:
      config:
        - subnet: 172.30.0.0/24
And mongorc.js:
rs.initiate(
  { _id: "rs1", members: [
    { _id: 0, host: "mongo1:27017" },
    { _id: 1, host: "mongo2:27017" },
    { _id: 2, host: "mongo3:27017" }
  ]}
);
rs.slaveOk();
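With these two files in place, the cluster can be started and the replica set state verified from nodeA. A minimal sketch (assumes docker-compose and docker are installed on nodeA; not runnable without a Docker daemon):

```shell
# start the three-member cluster on nodeA
nodeA$ docker-compose up -d

# check the replica set state from inside the mongo network;
# one member should report "PRIMARY" and two "SECONDARY"
nodeA$ docker exec mongo1 mongo --quiet --eval \
  'rs.status().members.forEach(function (m) { print(m.name, m.stateStr); })'
```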
The challenge
Since MongoDB is set up so that only local access is allowed, the challenge is to make all MongoDB containers accessible from outside (e.g. from nodeB), without any downtime (no docker restart commands or other tricks). The reason for this is to add a new MongoDB replica (call it mongo4, on nodeB), which is located outside of the docker mongo network. This will allow the cluster to be scaled or moved to other nodes, which is the first step towards high availability.
By default, only containers in the docker network called mongo have access to the MongoDB cluster.
When interacting with the MongoDB cluster from the hosting machine, ping mongo1 works as expected, but what happens when you want to access mongo1 (172.30.0.11) from another machine?
Suppose nodeA is running the docker-compose.yml from the snippet above, and we want to scale the replica set outside nodeA, e.g. to nodeB. nodeA and nodeB are directly connected on the subnet 192.168.123.0/24: nodeA has IP 192.168.123.10 and nodeB has 192.168.123.1.
The first step is to set up a static route on nodeB, so it knows about the docker mongo network.
Static route
# route to mongo network via nodeA IP
nodeB$ ip route add 172.30.0.0/24 via 192.168.123.10
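We can confirm that the routing table entry took effect before testing actual connectivity. A quick sketch (only verifies the route lookup on nodeB, not reachability):

```shell
# ask the kernel which route it would pick for the mongo1 address;
# the output should contain "via 192.168.123.10"
nodeB$ ip route get 172.30.0.11
```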
Great, now we should be able to access mongo1's IP from nodeB.
Well, no.
The router
nodeA must act as a router to forward traffic to MongoDB, therefore we need to enable ip_forward, e.g. like this:
nodeA$ echo 1 > /proc/sys/net/ipv4/ip_forward
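Note that writing to /proc only lasts until the next reboot. A sketch of how the setting can be persisted, assuming a distribution that loads /etc/sysctl.d at boot:

```shell
# persist IP forwarding across reboots via a sysctl drop-in file
nodeA$ echo 'net.ipv4.ip_forward = 1' > /etc/sysctl.d/99-ip-forward.conf

# apply the file immediately without rebooting
nodeA$ sysctl -p /etc/sysctl.d/99-ip-forward.conf
```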
OK, now it should work, right?
Well, no.
The firewall
To understand what else we need to do, we will take a look at nftables (the new iptables).
nftables is considered the new and improved alternative to iptables. This goal can be achieved with iptables as well: just translate the language of nft to the language of iptables and you are good to go. To learn more about nft, see the wiki.
To list the current state of firewall run:
nodeA$ nft list ruleset
To further understand what is going on when we want to reach MongoDB, we can use nft monitor trace to our advantage. First, we enable tracing of packets by adding a simple rule:
# enable tracing for all packets that reach the prerouting chain
nodeA$ nft add rule ip nat PREROUTING meta nftrace set 1
# run a live trace
nodeA$ nft monitor trace
# in another terminal, we try to ping the mongo container
# one packet is enough, so we don't spam
nodeB$ ping -c 1 172.30.0.11
The monitor command will then print the packet's path through the ruleset.
We can see that the rule evaluation ends in the FORWARD chain, which has the default policy drop. If we examine the FORWARD chain, we see that its first rule jumps to the DOCKER-USER chain. Hence we can use this conveniently created chain for our custom rule, which will allow traffic to the MongoDB containers. The last piece of the puzzle is:
# add rule so the communication towards mongo will be accepted
nodeA$ nft add rule ip filter DOCKER-USER ip saddr 192.168.123.0/24 accept
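As mentioned earlier, the same thing can be expressed in classic iptables syntax. A sketch of the equivalent rule, assuming iptables manages the same ruleset on nodeA:

```shell
# iptables translation of the nft rule above:
# insert an ACCEPT rule for the 192.168.123.0/24 subnet
# at the top of the DOCKER-USER chain
nodeA$ iptables -I DOCKER-USER -s 192.168.123.0/24 -j ACCEPT
```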
As of now, we should be able to connect to the MongoDB container from nodeB.
nodeB$ mongo 172.30.0.11
MongoDB shell version v4.2.1
connecting to: mongodb://172.30.0.11:27017/test?compressors=disabled&gssapiServiceName=mongodb
WARNING: No implicit session: Logical Sessions are only supported on server versions 3.6 and greater.
Implicit session: dummy session
MongoDB server version: 3.4.20
WARNING: shell and server versions do not match
Server has startup warnings:
... blah blah ...
>
# and here we are!
Conclusion
We managed to connect to a running MongoDB docker cluster without touching the containers and, most importantly, without downtime. We are now prepared to create a new MongoDB replica and connect it to the existing replica set. To sum it up, these were the actions we took:
- Add a static route to the MongoDB network
- Enable ip_forward
- Allow connections using nftables
- TADAA!
In the next part, we will talk about creating a new MongoDB replica.