Silly mistake with ceph-mon

My deployment script for Ceph was half-baked, and I forgot about it. After running it I got an ‘almost working’ Ceph cluster with one simple issue: one of the monitors was down. It was the first monitor (the bootstrap monitor), which is configured by a separate part of my Ansible playbook.

It looked like this:

...
1 mons down, quorum 0,1 mon3,mon2
monmap e1: 3 mons at {mon1=x.x.x.109:6789/0,mon2=x.x.x.108:6789/0,mon3=x.x.x.107:6789/0}
election epoch 4, quorum 0,1 mon3,mon2

The logs on that monitor were very clear, but ugly:

log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
mon.mon1@0(probing) e0 ms_verify_authorizer bad authorizer from mon x.x.x.108:6789/0
-- x.x.x.109:6789/0 >> x.x.x.108:6789/0 pipe(0x560fee691400 sd=21 :6789 s=0 pgs=0 cs=0 l=0 c=0x560fee46ff80).accept: got bad authorizer
cephx: verify_authorizer could not decrypt ticket info: error: NSS AES final round failed: -8190
mon.mon1@0(probing) e0 ms_verify_authorizer bad authorizer from mon x.x.x.107:6789/0
-- x.x.x.109:6789/0 >> x.x.x.107:6789/0 pipe(0x560fee690000 sd=21 :6789 s=0 pgs=0 cs=0 l=0 c=0x560fee708d80).accept: got bad authorizer

The real issue was very simple: a bad /var/lib/ceph/mon/ceph-mon1/keyring, whose key did not match the one the other monitors were using.

After I fixed it, the monitor came up.
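
A quick way to confirm this kind of mismatch is to compare the keys stored in two keyring files. Below is a minimal sketch in Python, not something from my playbook: it assumes the usual keyring layout (an `[entity]` header followed by a `key = ...` line), and the helper names `parse_keyring` and `mismatched_entities` are made up for illustration.

```python
import re

# Assumed keyring layout (hypothetical example, not a real key):
#   [mon.]
#       key = AQD...==
#       caps mon = "allow *"
SECTION_RE = re.compile(r"^\[(?P<entity>[^\]]+)\]\s*$")
KEY_RE = re.compile(r"^key\s*=\s*(?P<key>\S+)\s*$")


def parse_keyring(text):
    """Return {entity: key} for the contents of a keyring file."""
    keys = {}
    entity = None
    for raw in text.splitlines():
        line = raw.strip()
        section = SECTION_RE.match(line)
        if section:
            entity = section.group("entity")
            continue
        key = KEY_RE.match(line)
        # Only "key = ..." lines are recorded; caps lines are ignored.
        if key and entity is not None:
            keys[entity] = key.group("key")
    return keys


def mismatched_entities(keyring_a, keyring_b):
    """List entities present in both keyrings whose keys differ."""
    a = parse_keyring(keyring_a)
    b = parse_keyring(keyring_b)
    return sorted(name for name in a.keys() & b.keys() if a[name] != b[name])
```

To use it, read the keyring of the broken monitor and the keyring of a healthy one into strings and pass both to `mismatched_entities`. A non-empty result (for example `mon.`) points at the entity whose key has to be fixed.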

TL;DR: ‘bad authorizer’ and ‘could not decrypt ticket info’ are just signs of inconsistent keys (check the keyrings of the services, or look at the output of ceph auth list).