Encryption and security have been important right from the days of Caesar (well, even before that), as empires, kingdoms, individuals, and enterprises have all sought to protect their data, each for their own reasons.
But if one isn't careful, a single duplicate certificate can wreak havoc on your own secure system.
Now, this story won't be telling you how to create certificates, or even how to enable the TLS/SSL options in GlusterFS.
The strange error
Suppose it's a beautiful Friday evening and you're already done with most of your tasks for the day. The web applications are deployed in the cluster, the DB is optimized, load balancing is handled, and you're left with just one small errand: enabling TLS in GlusterFS.
The documentation is pretty simple, and you know what to do. You go on to create self-signed certificates, i.e. N certs for the N nodes in GlusterFS, and then concatenate all the public certs into one single file called glusterfs.ca. You could have gone with the other approach of using a root CA, issuing CSRs, etc., but then again, it is a Friday evening and the weekend camping trip seems very near now. Just one command-line execution away.
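The cert-creation step above might look roughly like this (a sketch with hypothetical host names; the key sizes, validity, and subject fields are assumptions, not the document's exact commands):

```shell
# Generate one self-signed cert per node (host names are placeholders).
for host in node1 node2 node3; do
  openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "${host}.key" -out "${host}.pem" \
    -days 365 -subj "/CN=${host}"
done

# Concatenate every public cert into the shared CA bundle.
cat node1.pem node2.pem node3.pem > glusterfs.ca
```

Note that nothing in that `cat` line stops you from listing the same file twice, which is exactly how a duplicate sneaks into the bundle.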
Things start going berserk. The peers don't connect with each other, the storage itself is not accessible, and since the DB uses this storage and the web apps use the DB… you get the picture, right?
Goodbye, camping!
So what might've gone wrong, you ask?
You go through the logs and find one entry:
[2021-02-18 11:11:11.12312] E [socket.c:246:ssl_dump_error_stack] 0-socket.management: error:0B07C065:x509 certificate routines:X509_STORE_add_cert:cert already in hash table
Ring any bells?
Here's the part that went wrong: glusterfs.ca
The error description itself says that there's a cert already in a hash table. (Technically, you just added the certs to a file, but under the hood OpenSSL loads each cert from glusterfs.ca into its X509 store, and adding the same cert twice triggers this error.) The takeaway is that there's a duplicate certificate entry inside glusterfs.ca. Plain and simple.
So, after a sigh of relief, you go into glusterfs.ca, weed out the duplicate certificate entry, restore the dependent services, and head off on that camping trip.
If you want to prevent such a scenario in the first place, i.e. avoid the time wasted in unknowingly adding one (or multiple) duplicate cert entries to the .ca file and destabilizing the cluster and its dependent services, you have to be pre-emptive.
Pre-emptive check for duplicate cert entries
Now, one can manually go through glusterfs.ca (please don't!), but what if you have a cluster with many nodes?
The better way is to use a script that finds duplicate entries before glusterfs.ca is even deployed to the nodes.
I've created a script that goes over glusterfs.ca, finds the duplicate entries, and reports which public cert is repeated. For those who want to use it, the repo link is duplCertDetect.
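The core idea can be sketched in a few lines of shell. To be clear, this is not the duplCertDetect script itself, just one way to implement the same check: split the bundle into individual certs, fingerprint each one with openssl, and print any fingerprint that occurs more than once (the `find_dup_certs` name and GNU csplit usage are my assumptions):

```shell
# Sketch of a duplicate-cert check for a PEM bundle such as glusterfs.ca.
# Prints one line per duplicated fingerprint; prints nothing if the bundle is clean.
find_dup_certs() {
  bundle="$1"
  workdir=$(mktemp -d)
  # csplit cuts the bundle at every "BEGIN CERTIFICATE" marker (GNU csplit;
  # -z drops the empty leading piece, -s silences the size output).
  csplit -s -z -f "${workdir}/cert-" "${bundle}" '/BEGIN CERTIFICATE/' '{*}'
  # Fingerprint each cert; identical certs yield identical lines,
  # and `uniq -d` keeps only the lines that repeat.
  for cert in "${workdir}"/cert-*; do
    openssl x509 -in "$cert" -noout -fingerprint -sha256
  done | sort | uniq -d
  rm -rf "${workdir}"
}
```

Run it as `find_dup_certs glusterfs.ca` before copying the bundle to the nodes; any output means a repeated cert.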
The aforementioned script could be improved to even remove the duplicate entries. (Any takers for contributing some code?)
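A starting point for that improvement might look like this sketch (again, not part of duplCertDetect; the `dedup_certs` name is hypothetical): keep the first occurrence of each fingerprint and drop the rest while rebuilding the bundle.

```shell
# Sketch: write a copy of a PEM bundle with duplicate certs removed,
# keeping the first occurrence of each cert.
dedup_certs() {
  bundle="$1"
  out="$2"
  workdir=$(mktemp -d)
  # Split the bundle into one file per cert (GNU csplit).
  csplit -s -z -f "${workdir}/cert-" "${bundle}" '/BEGIN CERTIFICATE/' '{*}'
  : > "$out"
  seen=""
  for cert in "${workdir}"/cert-*; do
    fp=$(openssl x509 -in "$cert" -noout -fingerprint -sha256)
    case "$seen" in
      *"$fp"*) ;;  # fingerprint already kept once, skip this copy
      *) cat "$cert" >> "$out"
         seen="$seen $fp" ;;
    esac
  done
  rm -rf "${workdir}"
}
```

Usage would be `dedup_certs glusterfs.ca glusterfs.ca.clean`, leaving the original bundle untouched so you can diff the two before deploying.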
In conclusion, most errors and failures have some human contribution to them, and manual verification is never the answer; nor is the method of running things and then seeing if we hit any errors. Many a time, a simple lint or parse can prevent grave failures that would ruin your camping trip (or whatever your weekend plan is).