DevOps toolkits: Supervisor — managing multiple processes
A web crawler is thrown at a DevOps engineer, who immediately asks:
- what if it crashes?
- how to make it reliable?
- how to scale it up?
- how to monitor the process effectively?
Developing workable code is easy, but making it robust is hard. One of the jobs of a DevOps engineer, or of any mature programmer, is to make their code reliable.
Supervisor
First, we will introduce a simple yet powerful tool: supervisor. It is a process control system with the following features:
- Simple
One just needs to edit the config file, supervisord.conf, to get most of the work done.
- Efficient
Supervisor doesn’t rely on PID files and periodic polling to restart failed processes. Instead, it starts subprocesses via fork/exec, and subprocesses don’t daemonize; the OS signals supervisor immediately when a process terminates.
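As a minimal sketch of that mechanism (not supervisord's actual code), a parent can fork/exec a child and let the kernel report the child's termination directly, with no polling; the `run_and_reap` helper below is an illustrative name of my own:

```python
import os
import sys


def run_and_reap(argv):
    """Start a child via fork/exec (the child does not daemonize) and
    block until the kernel reports its termination - no PID file and
    no periodic polling involved."""
    pid = os.fork()
    if pid == 0:
        # child: replace this process image with the target program
        os.execvp(argv[0], argv)
    # parent: waitpid returns as soon as the child terminates
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

A restart policy is then just a loop around this call, restarting when the returned code is unexpected.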
- Extensible
With an event notification protocol, one can implement features like an event-driven email notifier, or easily integrate with other tools such as the ELK stack.
- User-friendly
There are a CLI and a web dashboard to monitor and control processes, and several open-source dashboards are available. You can easily manage your processes and even adapt a dashboard into your own workstation.
Usage
Installing supervisor:
pip install supervisor
Supervisord itself is most stable on Python 2. However, that doesn’t mean your applications have to be implemented in Python 2, or even in Python; they can be anything.
Suppose we have a Python crawler called crawler.py. We can do the following things to make it reliable and easy to integrate with other services.
Keeping it alive
The supervisord.conf file will implement the following features:
- start the process when supervisor is started
- restart the process if the exit code is neither 0 nor 2
- output log in specific path
- enable log rotation
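For autorestart=unexpected to be useful, the crawler itself has to use its exit codes deliberately. A hypothetical sketch of that convention (the `exit_code` helper and the outcomes are made up for illustration): exit with 0 or 2 for expected outcomes, and anything else to request a restart.

```python
import sys


def exit_code(pages, crashed):
    """Map a crawl outcome to a process exit code.

    0 and 2 are listed under exitcodes in supervisord.conf, so supervisor
    treats them as expected and leaves the process alone; any other code
    triggers a restart because of autorestart=unexpected.
    """
    if crashed:
        return 1                   # unexpected -> supervisor restarts us
    return 0 if pages else 2       # finished / nothing new -> left alone

# at the end of crawler.py one would call, e.g.:
#     sys.exit(exit_code(pages_fetched, crashed=False))
```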
[supervisord]
nodaemon=true
loglevel=info

[program:crawler]
command=python crawler.py
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM
That’s it! This is everything we need for a minimal yet functioning supervisor setup.
To start it, open a terminal, go to the directory containing the config file, and type:
supervisord
Then you can see the logging messages.
Placing the log files
We will then specify where the log files should be located.
[supervisord]
nodaemon=true
loglevel=info
logfile=../log/supervisord.log
logfile_maxbytes=50MB

[program:crawler]
command=python crawler.py
stdout_logfile=../log/crawler_out.log
stderr_logfile=../log/crawler_err.log
logfile_maxbytes=50MB
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM
Note that we can use relative paths in the conf file.
Enabling the dashboard
Enable supervisorctl by adding the following lines to the same config file:
[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
serverurl=http://127.0.0.1:9001

[inet_http_server]
port = 127.0.0.1:9001

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
Then, when we visit 127.0.0.1:9001, the supervisor web dashboard will be shown.
Alternatively, typing supervisorctl in the terminal brings up the CLI interface.
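Both the dashboard and supervisorctl talk to supervisord over the XML-RPC endpoint we just enabled, so our own scripts can query it too. A sketch, assuming the serverurl configured above and using the documented supervisor.getAllProcessInfo method (the helper names are mine):

```python
import xmlrpc.client


def supervisor_proxy(url="http://127.0.0.1:9001/RPC2"):
    """Connect to supervisord's XML-RPC endpoint, the same one the
    web dashboard and supervisorctl use."""
    return xmlrpc.client.ServerProxy(url)


def process_states(proxy):
    """Return {process name: state name} for every supervised process."""
    return {p["name"]: p["statename"]
            for p in proxy.supervisor.getAllProcessInfo()}
```

With supervisord running, process_states(supervisor_proxy()) would return something like {'crawler': 'RUNNING'}.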
Implementing an event-driven email notifier
Stdout and stderr events have to be enabled for the event listener to receive the data.
We will write a listener in Python that sends an email when the crawler exits unexpectedly.
config file:
[program:spider]
command=python crawler.py
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=../log/crawler_out.log
stderr_logfile=../log/crawler_err.log
logfile_maxbytes=50MB
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM

[eventlistener:listener]
command=python listener.py
events=PROCESS_STATE
autorestart=true
stdout_logfile=../log/listener_out.log
stderr_logfile=../log/listener_err.log
listener.py:
import sys

from send_email import send
from supervisor.childutils import listener


def write_stdout(s):
    sys.stdout.write(s)
    sys.stdout.flush()


def write_stderr(s):
    sys.stderr.write(s)
    sys.stderr.flush()


def main():
    while True:
        # listener.wait writes READY for us (the transition from
        # ACKNOWLEDGED to READY) and blocks until the next event arrives
        headers, body = listener.wait(sys.stdin, sys.stdout)
        body = dict(pair.split(":") for pair in body.split(" "))
        if body['processname'] == 'spider' and headers['eventname'] == 'PROCESS_STATE_STOPPED':
            send('spider stopped', 'spider stopped', <your_email>)
        if body['processname'] == 'spider' and headers['eventname'] == 'PROCESS_STATE_EXITED':
            send('spider exited', 'spider exited', <your_email>)
        write_stderr('Headers: %r\n' % headers)
        write_stderr('Body: %r\n' % body)
        # acknowledge the event so supervisor will send the next one
        write_stdout('RESULT 2\nOK')


if __name__ == '__main__':
    main()
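The one-liner that turns the event body into a dict relies on PROCESS_STATE payloads being space-separated key:value pairs. Pulled out as a standalone helper (the sample body below is representative of the documented format, not captured output):

```python
def parse_event_body(payload):
    """Split a supervisor PROCESS_STATE event body of the form
    'key1:value1 key2:value2 ...' into a dict - the same trick
    listener.py uses inline."""
    return dict(pair.split(":") for pair in payload.split(" "))


# a representative PROCESS_STATE_EXITED body:
sample = "processname:spider groupname:spider from_state:RUNNING expected:0 pid:2766"
```

Note that all values come back as strings; expected and pid would need an int() if used numerically.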
The helper function, send_email.py:
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

password = <password>
sender = <email_sender>


def send(sub, body, rep):
    # build a fresh message per call so headers don't accumulate
    msg = MIMEMultipart()
    msg['From'] = sender
    msg['Subject'] = sub
    msg['To'] = rep

    # add in the message body
    msg.attach(MIMEText(body, 'plain'))

    # create the server connection
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.starttls()

    # login credentials for sending the mail
    server.login(sender, password)

    # send the message via the server
    server.sendmail(sender, rep, msg.as_string())
    server.quit()
The final version of supervisord.conf:
[program:spider]
command=python crawler.py
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=../log/crawler_out.log
stderr_logfile=../log/crawler_err.log
logfile_maxbytes=50MB
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM

[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
serverurl=http://127.0.0.1:9001

[inet_http_server]
port = 127.0.0.1:9001

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[eventlistener:listener]
command=python listener.py
events=PROCESS_STATE
autorestart=true
stdout_logfile=../log/listener_out.log
stderr_logfile=../log/listener_err.log