Designing a Highly Available MySQL Cluster with Orchestrator and ProxySQL in GCP — Part 1
We recently migrated one of our customers' infrastructure to GCP, and after the migration we published a couple of posts about the ProxySQL setup we implemented for them:
- Reduce MySQL Memory Utilization With ProxySQL Multiplexing
- How max_prepared_stmt_count can bring down production
Now we are going to implement an HA solution with custom filters for failover. We have done a PoC, and this blog is about that PoC configuration. Once again, the whole setup has been implemented in GCP, but you can follow the same steps on AWS and Azure.
A big thanks to Shlomi (developer of Orchestrator), who helped a lot with the Orchestrator setup, and to René Cannaò (author of ProxySQL), who cleared our doubts while integrating the Orchestrator setup with ProxySQL.
Background of the DB setup:
- We are using MySQL 5.6.
- GTID Replication Enabled.
- ProxySQL is configured on top of the MySQL layer.
- ProxySQL is implemented to split read/write workloads.
- One master and 4 slaves (plus one more slave replicating from an existing slave).
- Slaves 1 and 2 handle all the read workloads and data science/analytics queries.
- The 4th replica has a separate replica of its own. That server is used by an internal application and other services, but it needs 90% of the production tables with real-time data. Writes are enabled on this replica; however, the writable tables do not exist on production, so replication is never affected by these writes.
- Slaves 1 and 2 split the read load, so we still have availability even if one node goes down. For now, we didn't set up any additional HA for the read-group servers.
- Slave 4 has additional writable tables, and internal applications depend on it, so HA is mandatory here.
- Finally, the master node. Yes, of course, it must be highly available; in other words, automatic failover must be in place.
- During the main master failover, there should be no data loss between the failed node and the new master.
Our approach for HA:
- For the read group, we already have 2 instances, so it's not a big problem for now.
- Slave 4 already has a replica, so we'll fail over automatically to that replica. We can use a virtual IP and swap it to the new master during the failover.
- We still have one more replica (Replica 3), so we can use it as the failover target for the master.
The Risks/Restrictions in Failover:
We can achieve the above solution with some HA software for MySQL, but we have some restrictions:
- If the master goes down, then the HA tool must fail over to Replica 3. It shouldn't promote any other node.
- Once the master failover has been performed, all other slaves should start replicating from the newly promoted master.
- The Report Server is an intermediate master. If it goes down, its responsibilities move to its own replica: the replica of the Report Server will continue replicating from any node (the master or any slave), and the report application will communicate with that replica. In simple words, the replica will be promoted like a master.
- I have simulated various failover scenarios in the image below.
Top Layer - Reader (Blue)
Middle Layer - Writer (Yellow)
Low Layer - Report (Pink)
Problems with Other HA tools:
- They will promote any replica as a master. In our case, the Finance DB is also a replica of the production DB, so there are many ways the Finance DB could end up becoming the master for production.
- If there is a network outage between the HA tool and the master node, while connectivity between the app servers and the slave nodes is fine, the tool will consider the master down and start promoting a slave.
- There is no guarantee that the HA tool itself is not a single point of failure.
- During planned maintenance, a graceful master failover (manual failover) is not possible.
To achieve auto failover with our filters (don't promote the report server, and prefer Replica 3 as the new master), we decided to use Orchestrator. A couple of impressive Orchestrator features made us try it:
- If the master seems down, Orchestrator checks with all the slave nodes whether the master is really down. Only if the slaves also report it as down does the failover happen.
- We can define which node can become the new master.
- Graceful failover helps us during the maintenance window, so we can perform maintenance on all the nodes without breaking replication or causing major downtime.
- We can set up HA for Orchestrator itself, so the Orchestrator service will not be a single point of failure.
- Once the failover has been done, Orchestrator repoints the slaves to replicate from the new master.
- Web UI is also available.
- Hooks are available to perform/run/automate any scripts on failure detection, pre-failover, and post-failover.
- You can read more about Orchestrator here.
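As a sketch of how those filters can be expressed, here is a minimal excerpt of an Orchestrator config file. The hostnames, log path, and hook commands are placeholders, not our actual values:

```json
{
  "RecoverMasterClusterFilters": ["*"],
  "ApplyMySQLPromotionAfterMasterFailover": true,
  "PromotionIgnoreHostnameFilters": ["report-server", "report-replica"],
  "OnFailureDetectionProcesses": [
    "echo 'Detected {failureType} on {failureCluster}' >> /var/log/orchestrator-recovery.log"
  ],
  "PostFailoverProcesses": [
    "echo 'Recovered {failedHost} in {failureCluster}' >> /var/log/orchestrator-recovery.log"
  ]
}
```

To prefer Replica 3 as the promotion target, Orchestrator also supports per-host promotion rules, e.g. `orchestrator-client -c register-candidate -i replica3:3306 --promotion-rule prefer` (the host name here is again a placeholder).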
ProxySQL — An Alternative to Virtual IP:
After the promotion has been done, Orchestrator sets READ_ONLY=OFF on the promoted replica. Traditionally we would use a VIP here: once the failover is performed, a hook or script switches the VIP from the dead node to the new master. But in many cases this is not a really quick task.
Here ProxySQL helps us and acts as an alternative to the VIP. How does this magic happen? ProxySQL has a table called mysql_replication_hostgroups, where we initially define which hostgroup IDs are the writer and reader groups. ProxySQL's monitor then checks the read_only flag on every server at a regular interval (controlled by the mysql-monitor_read_only_interval variable). Whenever it sees the flag has changed while the writer node is down, it understands that a failover has been performed and the reader has become a master, so it moves that server into the writer hostgroup. All connections simply continue to go to hostgroup ID 1.
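A minimal sketch of that configuration on the ProxySQL admin interface (the hostgroup IDs and IP addresses are just example values):

```sql
-- Hostgroup 1 = writer, hostgroup 2 = reader
INSERT INTO mysql_replication_hostgroups (writer_hostgroup, reader_hostgroup, comment)
VALUES (1, 2, 'production cluster');

-- Register the servers; ProxySQL moves them between the two
-- hostgroups based on each server's read_only flag
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
  (1, '10.0.0.10', 3306),  -- current master
  (2, '10.0.0.11', 3306),  -- slave 1
  (2, '10.0.0.12', 3306);  -- slave 2

LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```

When the promoted replica's read_only flips to OFF, ProxySQL places it in hostgroup 1 automatically, so the application keeps writing to the same hostgroup without any VIP swap.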
Semi-synchronous replication is a great feature to prevent data loss during the failover: the master waits for an acknowledgement from at least one slave before committing the transaction. So we enabled semi-sync between the master and Replica 3.
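On MySQL 5.6 this is done with the semisync plugins. A sketch (the timeout value is just an example):

```sql
-- On the master:
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms; falls back to async after this

-- On Replica 3:
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;  -- restart the IO thread to pick up the new setting
```

Add the corresponding settings to my.cnf as well so they survive a restart.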
Alias IP in GCP:
This is the point where we got stuck for some time. For the report server we are not using ProxySQL, so once the failover is done, there is no way to announce that the replica of the report server is the new master. The report server and its replica are in the same subnet but in different zones.
So we decided to use a VIP for this. In GCP, we can use an alias IP address for it, and we can swap this IP address to any node available in the same subnet, so we don't need to worry about the zone. (AWS and Azure don't provide a subnet spanning multiple zones, so there your report server and its replica must be in the same subnet.)
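The swap itself can be scripted with gcloud, for example from a post-failover hook. A sketch, where the instance names, zones, and alias IP are all placeholders:

```shell
# Remove the alias IP from the failed report server...
gcloud compute instances network-interfaces update report-server \
    --zone us-central1-a --aliases ""

# ...and attach it to the promoted replica
# (same subnet, different zone is fine in GCP).
gcloud compute instances network-interfaces update report-replica \
    --zone us-central1-b --aliases "10.128.0.100/32"
```

The report application keeps connecting to the alias IP and transparently lands on the new master.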
- We run the ProxySQL servers in a managed instance group; at any point in time, 2 instances must be running, with autoscaling.
- We have also deployed Orchestrator in a managed instance group, keeping 3 nodes at any time.
- Orchestrator needs a MySQL backend to work, but it doesn't care much about that database, so we used a tiny Cloud SQL instance with failover.
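The backend connection is just a handful of keys in the Orchestrator config. A sketch with placeholder host and credentials pointing at the Cloud SQL instance:

```json
{
  "MySQLOrchestratorHost": "10.0.0.50",
  "MySQLOrchestratorPort": 3306,
  "MySQLOrchestratorDatabase": "orchestrator",
  "MySQLOrchestratorUser": "orchestrator",
  "MySQLOrchestratorPassword": "********"
}
```

With all Orchestrator nodes pointing at the same shared backend, they elect a leader among themselves, so losing one Orchestrator instance doesn't stop failover handling.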
In the next part, we explain how to configure this setup in GCP. So stay tuned and hit the link below.