Designate service_statuses stops working when the number of workers is increased

Hirose Takahito
4 min read · Apr 30, 2019


While configuring Designate, I found an issue: service_statuses stopped working when I increased the number of workers.

So I investigated why service_statuses was not working.

First of all, I checked the relevant code:

https://github.com/openstack/designate/blob/master/designate/service_status.py#L45-L99

The link above points to the relevant part of the code.

Case of 1 worker

I checked this case first, because if it did not work, it would be a very, very high-priority issue.

Please check the log below. I added some logging points so that we can see the process ID and the state of the _running flag.

2019-04-04 12:14:52.670 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.__init__: 96400 __init__ /usr/lib/python2.7/site-packages/designate/service_status.py:53
2019-04-04 12:14:52.730 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: 96400 _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:75
2019-04-04 12:14:52.730 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: _running=False _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:76
2019-04-04 12:14:52.731 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: 96400 start /usr/lib/python2.7/site-packages/designate/service_status.py:100
2019-04-04 12:14:52.732 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: _running=True start /usr/lib/python2.7/site-packages/designate/service_status.py:102
2019-04-04 12:14:57.735 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: 96400 _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:75
2019-04-04 12:14:57.735 96400 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: _running=True _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:76
2019-04-04 12:15:02.732 96400 DEBUG designate.service_status [req-2b1269ed-5802-4787-a2f2-8ef583d01bd7 - - - - -] designate/service_status.HeartBeatEmitter._emit_heartbeat: 96400 _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:75
2019-04-04 12:15:02.733 96400 DEBUG designate.service_status [req-2b1269ed-5802-4787-a2f2-8ef583d01bd7 - - - - -] designate/service_status.HeartBeatEmitter._emit_heartbeat: _running=True _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:76
2019-04-04 12:15:07.731 96400 DEBUG designate.service_status [req-b3c89e7e-a321-4c4c-8a0b-d8bb880962bf - - - - -] designate/service_status.HeartBeatEmitter._emit_heartbeat: 96400 _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:75
2019-04-04 12:15:07.732 96400 DEBUG designate.service_status [req-b3c89e7e-a321-4c4c-8a0b-d8bb880962bf - - - - -] designate/service_status.HeartBeatEmitter._emit_heartbeat: _running=True _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:76

I can see _emit_heartbeat being logged every 5 seconds with _running=True, so it’s OK. :D
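For reference, logging points like the ones that produced the log above can be sketched roughly as follows. This is a simplified stand-in, not the actual Designate code: the class name HeartBeatEmitterSketch, the use of the standard logging module (Designate itself uses oslo.log), and the boolean return values are assumptions made for illustration.

```python
import logging
import os

LOG = logging.getLogger("designate.service_status")


class HeartBeatEmitterSketch(object):
    """Hypothetical stand-in that only models the _running flag and the logging."""

    def __init__(self):
        self._running = False
        LOG.debug("HeartBeatEmitter.__init__: %s", os.getpid())

    def _emit_heartbeat(self):
        # Log which process runs the timer and what it sees in the flag.
        LOG.debug("HeartBeatEmitter._emit_heartbeat: %s", os.getpid())
        LOG.debug("HeartBeatEmitter._emit_heartbeat: _running=%s", self._running)
        if not self._running:
            return False  # assumed guard: nothing is written until start() ran
        return True  # the real service would store its status here

    def start(self):
        LOG.debug("HeartBeatEmitter.start: %s", os.getpid())
        self._running = True
        LOG.debug("HeartBeatEmitter.start: _running=%s", self._running)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    emitter = HeartBeatEmitterSketch()
    emitter._emit_heartbeat()  # logs _running=False
    emitter.start()
    emitter._emit_heartbeat()  # logs _running=True
```

Logging os.getpid() next to the flag is what makes it possible to tell which process executed each method.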

I also checked the data in the database.

> select * from service_statuses where service_name='api'\G
*************************** 1. row ***************************
id: 4e24db8d45be496f8119fda6eb706d96
created_at: 2019-04-04 03:14:57
updated_at: 2019-04-04 03:37:18
service_name: api
hostname: dns001.host
heartbeated_at: 2019-04-04 03:37:18
status: UP
stats: {}
capabilities: {}
1 row in set (0.00 sec)

It’s ok too. Great!! :D
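The same check can be scripted. Below is a minimal sketch, assuming the row has already been fetched and that a heartbeat counts as fresh when it is newer than some threshold. The function name is_heartbeat_fresh and the 15-second threshold are hypothetical and are not Designate settings.

```python
from datetime import datetime, timedelta


def is_heartbeat_fresh(heartbeated_at, now, max_age_seconds=15):
    """Return True if the last heartbeat is recent enough (hypothetical rule)."""
    last = datetime.strptime(heartbeated_at, "%Y-%m-%d %H:%M:%S")
    return now - last <= timedelta(seconds=max_age_seconds)


if __name__ == "__main__":
    # Value taken from the row above; `now` is picked by hand for the example.
    row_heartbeat = "2019-04-04 03:37:18"
    print(is_heartbeat_fresh(row_heartbeat, datetime(2019, 4, 4, 3, 37, 20)))
    print(is_heartbeat_fresh(row_heartbeat, datetime(2019, 4, 4, 3, 40, 0)))
```

With a healthy service, heartbeated_at keeps moving forward, so the first call reports a fresh heartbeat and the second one a stale heartbeat.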

That completes the check for the 1-worker case.

Case of 2 or more workers

This is the main point of this article, because I suspected that service_statuses does not work when the number of workers is set to 2 or more.

I set the number of workers to 5 and added the same logging points.

2019-04-04 12:37:24.512 98588 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.__init__: 98588 __init__ /usr/lib/python2.7/site-packages/designate/service_status.py:53
2019-04-04 12:37:24.533 98598 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: 98598 start /usr/lib/python2.7/site-packages/designate/service_status.py:100
2019-04-04 12:37:24.533 98598 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: _running=True start /usr/lib/python2.7/site-packages/designate/service_status.py:102
2019-04-04 12:37:24.536 98599 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: 98599 start /usr/lib/python2.7/site-packages/designate/service_status.py:100
2019-04-04 12:37:24.536 98599 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: _running=True start /usr/lib/python2.7/site-packages/designate/service_status.py:102
2019-04-04 12:37:24.537 98600 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: 98600 start /usr/lib/python2.7/site-packages/designate/service_status.py:100
2019-04-04 12:37:24.538 98600 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: _running=True start /usr/lib/python2.7/site-packages/designate/service_status.py:102
2019-04-04 12:37:24.563 98601 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: 98601 start /usr/lib/python2.7/site-packages/designate/service_status.py:100
2019-04-04 12:37:24.563 98601 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter.start: _running=True start /usr/lib/python2.7/site-packages/designate/service_status.py:102
2019-04-04 12:37:24.636 98588 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: 98588 _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:75
2019-04-04 12:37:24.636 98588 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: _running=False _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:76
2019-04-04 12:37:29.637 98588 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: 98588 _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:75
2019-04-04 12:37:29.637 98588 DEBUG designate.service_status [-] designate/service_status.HeartBeatEmitter._emit_heartbeat: _running=False _emit_heartbeat /usr/lib/python2.7/site-packages/designate/service_status.py:76

__init__ and _emit_heartbeat run under the same process ID, but start runs under different ones.

__init__, _emit_heartbeat: 98588

start: 98598, 98599, 98600, 98601

Then I checked the process IDs in question.

# ps auxwwf | grep designate-api
designa+ 98588 1.9 0.7 347336 62380 ? Ss 12:37 0:02 /usr/bin/python2 /usr/bin/designate-api --config-file /etc/designate/designate.conf --log-file /var/log/designate/api.log
designa+ 98598 0.2 0.8 392348 69792 ? S 12:37 0:00 \_ /usr/bin/python2 /usr/bin/designate-api --config-file /etc/designate/designate.conf --log-file /var/log/designate/api.log
designa+ 98599 0.2 0.8 392344 69784 ? S 12:37 0:00 \_ /usr/bin/python2 /usr/bin/designate-api --config-file /etc/designate/designate.conf --log-file /var/log/designate/api.log
designa+ 98600 0.2 0.8 392344 69784 ? S 12:37 0:00 \_ /usr/bin/python2 /usr/bin/designate-api --config-file /etc/designate/designate.conf --log-file /var/log/designate/api.log
designa+ 98601 0.2 0.8 392348 69784 ? S 12:37 0:00 \_ /usr/bin/python2 /usr/bin/designate-api --config-file /etc/designate/designate.conf --log-file /var/log/designate/api.log

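98588 is the parent process, and the others are child processes.

So the _running flag is set to True only in the child processes, while _emit_heartbeat runs in the parent process, where _running stays False. As a result, the heartbeat is never emitted.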
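The underlying behaviour is ordinary fork semantics: after a fork, each process has its own copy of the object, so a flag changed in a child is invisible to the parent. A minimal sketch of that effect, assuming a POSIX system where os.fork is available; the Emitter class and the function name are hypothetical stand-ins, not Designate code.

```python
import os


class Emitter(object):
    """Hypothetical stand-in for the emitter; only the flag is modelled."""

    def __init__(self):
        self._running = False

    def start(self):
        self._running = True


def flag_seen_by_parent_and_child():
    emitter = Emitter()  # created in the parent, like __init__ in the log
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of a forked worker.
        os.close(read_fd)
        emitter.start()  # changes the child's copy only
        os.write(write_fd, b"1" if emitter._running else b"0")
        os.close(write_fd)
        os._exit(0)

    # Parent: read what the child saw, then look at its own copy.
    os.close(write_fd)
    child_flag = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    os.waitpid(pid, 0)
    return emitter._running, child_flag


if __name__ == "__main__":
    parent_flag, child_flag = flag_seen_by_parent_and_child()
    print("parent _running=%s, child _running=%s" % (parent_flag, child_flag))
```

The parent still sees _running=False after the child has called start(), which matches the _running=False lines logged by PID 98588.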

How I fixed this problem

# diff -u /usr/lib/python2.7/site-packages/designate/cmd/api.py.upstream /usr/lib/python2.7/site-packages/designate/cmd/api.py
--- /usr/lib/python2.7/site-packages/designate/cmd/api.py.upstream 2019-04-03 18:10:09.908918965 +0900
+++ /usr/lib/python2.7/site-packages/designate/cmd/api.py 2019-04-03 18:09:49.886038819 +0900
@@ -40,4 +40,5 @@

     server = api_service.Service(threads=CONF['service:api'].threads)
     service.serve(server, workers=CONF['service:api'].workers)
+    server.heartbeat_emitter.start()
     service.wait()

I fixed it by calling the start method from the parent process. I think the other Designate service processes have the same problem, so I think they need to be fixed in the same way.

If you fix the issue this way, you can manage the service status again.
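The effect of the one-line change can be illustrated with a small simulation. This is only a sketch under stated assumptions: Emitter, serve and run are hypothetical names, the guard in emit() is assumed from the log output, and the children exit immediately instead of serving requests as real workers would. It needs a POSIX system for os.fork.

```python
import os


class Emitter(object):
    """Hypothetical stand-in for the heartbeat emitter."""

    def __init__(self):
        self._running = False

    def start(self):
        self._running = True

    def emit(self):
        # Assumed guard: a heartbeat is written only while _running is True.
        return self._running


def serve(emitter, workers):
    """Fork `workers` children; each one calls start() on its own copy."""
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            emitter.start()
            os._exit(0)  # a real worker would keep running here
        pids.append(pid)
    for pid in pids:
        os.waitpid(pid, 0)


def run(workers, start_in_parent):
    emitter = Emitter()
    serve(emitter, workers)
    if start_in_parent:
        emitter.start()  # corresponds to the added line in the diff
    return emitter.emit()  # the timer fires in the parent process


if __name__ == "__main__":
    print("without parent start:", run(4, start_in_parent=False))
    print("with parent start:   ", run(4, start_in_parent=True))
```

Without the extra call the parent never emits, and with it the parent's own flag is set, so the heartbeat goes out regardless of how many workers were forked.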

Update: I submitted the fix upstream, and it has been merged.

https://review.opendev.org/#/c/657382/
