DevOps toolkits: Supervisor — managing multiple processes

(Part 2 of this series covers the ELK stack.)

James · Published in Coinmonks · Sep 10, 2018

A web crawler gets handed off to a DevOps engineer, who then asks the following questions:

  • what if it crashes?
  • how to make it reliable?
  • how to scale it up?
  • how to monitor the process effectively?

Developing workable code is easy, but making it robust is hard. One of the jobs of a DevOps engineer, or of any mature programmer, is to make their code reliable.

Supervisor

First, we will introduce a simple yet powerful tool: Supervisor. It is a process control system with the following features:

  • Simple

One just needs to edit a single config file, supervisord.conf, to get most of the work done.

  • Efficient

Supervisor doesn’t rely on PID files and periodic polling to restart failed processes. Instead, it starts subprocesses via fork/exec, and the subprocesses don’t daemonize; the OS signals supervisor as soon as a process terminates.

  • Extensible

With an event notification protocol, one can implement features like an event-driven email notifier, or easily integrate with other tools such as the ELK stack.

  • User-friendly

There’s a CLI and a web dashboard to monitor and control processes, and several open-source dashboards are available. You can easily manage your processes and even adapt one of these dashboards into your own control panel.

Usage

Installing supervisor:

pip install supervisor

Supervisord is most stable on Python 2. However, that doesn’t mean your applications have to be written in Python 2, or even in Python; they can be anything else.

Suppose we have a Python crawler called crawler.py. We can do the following to make it reliable and easy to integrate with other services.

Keeping it alive

The supervisord.conf file will implement the following features:

  • start the process when supervisor is started
  • restart the process if the exit code is neither 0 nor 2
  • write logs to a specific path
  • enable log rotation
[supervisord] 
nodaemon=true
loglevel=info
[program:crawler]
command=python crawler.py
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM
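
For the autorestart=unexpected policy to be meaningful, crawler.py should use its exit codes deliberately. A minimal, hypothetical sketch (the meaning of exit code 2 is just an assumption, chosen to match exitcodes=0,2 above):

# crawler.py -- hypothetical skeleton, only here to illustrate the exit codes
import sys

def crawl():
    # fetch and parse pages here; return True if there was work to do
    return True

if __name__ == '__main__':
    try:
        had_work = crawl()
    except Exception as exc:
        sys.stderr.write("crawler crashed: %s\n" % exc)
        sys.exit(1)                      # not in exitcodes, so supervisor restarts it
    sys.exit(0 if had_work else 2)       # expected exit, so supervisor leaves it alone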

That’s it! This is all the configuration we need for a minimal yet functioning Supervisor setup.
To start it, open a terminal, go to the directory where you placed the config file, and type:

supervisord

Then you can see the logging messages.
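
If the config file is named something other than supervisord.conf, or does not live in one of the default locations supervisord searches, you can point to it explicitly with the -c flag:

supervisord -c /path/to/supervisord.conf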

Placing the log files

We will then specify where the log files should be located.

[supervisord] 
nodaemon=true
loglevel=info
logfile=../log/supervisord.log
logfile_maxbytes=50MB
[program:crawler]
command=python crawler.py
stdout_logfile=../log/crawler_out.log
stderr_logfile=../log/crawler_err.log
stdout_logfile_maxbytes=50MB
stderr_logfile_maxbytes=50MB
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM

Note that we can use relative paths in the config file.
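
If you would rather anchor paths to the config file itself instead of the directory supervisord is started from, the config format also supports the %(here)s expansion, which expands to the directory containing supervisord.conf:

[program:crawler]
command=python crawler.py
stdout_logfile=%(here)s/../log/crawler_out.log
stderr_logfile=%(here)s/../log/crawler_err.log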

Enable the dashboard

Enable supervisorctl and the web dashboard by adding the following lines to the same config file:

[supervisorctl] 
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
serverurl=http://127.0.0.1:9001 ;
[inet_http_server]
port = 127.0.0.1:9001
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

When we then visit 127.0.0.1:9001, the supervisor web dashboard is shown.
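
If the dashboard is reachable by anyone other than you, it is worth protecting it with basic auth. username and password are standard [inet_http_server] options (the values below are placeholders); if you set them, add the same username and password to the [supervisorctl] section so the CLI can still connect:

[inet_http_server]
port = 127.0.0.1:9001
username = admin
password = change_me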

Alternatively, typing supervisorctl in the terminal drops us into the CLI interface.
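
A few supervisorctl commands cover most day-to-day use:

supervisorctl status            # list all managed processes and their states
supervisorctl restart crawler   # restart just the crawler
supervisorctl tail -f crawler   # follow the crawler's stdout log
supervisorctl reread            # re-read the config after editing it
supervisorctl update            # apply the config changes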

Implement an event-driven email notifier

In the config below we also enable stdout and stderr events, so a listener could receive the process's output data as well; our listener itself subscribes to PROCESS_STATE changes.
We will write a listener in Python that sends an email when the crawler stops or exits unexpectedly.
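
For reference, each notification supervisor sends to a listener looks roughly like this: one header line, then a payload whose size is given by the len header (the exact values here are made up):

ver:3.0 server:supervisor serial:21 pool:listener poolserial:10 eventname:PROCESS_STATE_EXITED len:76
processname:crawler groupname:crawler from_state:RUNNING expected:0 pid:2766

This is why listener.py below can turn the payload into a dict simply by splitting on spaces and colons.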

config file:

[program:crawler]
command=python crawler.py
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=../log/crawler_out.log
stderr_logfile=../log/crawler_err.log
stdout_logfile_maxbytes=50MB
stderr_logfile_maxbytes=50MB
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM
[eventlistener:listener]
command=python listener.py
events=PROCESS_STATE
autorestart=true
stdout_logfile=../log/listener_out.log
stderr_logfile=../log/listener_err.log
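
One [eventlistener:x] option worth knowing about is buffer_size: events that arrive while the listener is busy are queued, and only that many are kept (the default is 10). If a slow email send risks dropping events, raise it in the section above:

buffer_size=100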

listener.py:

import sys
from send_email import send
from supervisor.childutils import listener

def write_stdout(s):
    sys.stdout.write(s)
    sys.stdout.flush()

def write_stderr(s):
    sys.stderr.write(s)
    sys.stderr.flush()

def main():
    while True:
        # listener.wait() writes the READY token for us (the transition from
        # ACKNOWLEDGED to READY) and blocks until supervisord sends an event
        headers, body = listener.wait(sys.stdin, sys.stdout)
        # the payload is a space-separated list of key:value pairs
        body = dict([pair.split(":") for pair in body.split(" ")])

        if body['processname'] == 'crawler' and headers["eventname"] == "PROCESS_STATE_STOPPED":
            send('crawler stopped', 'crawler stopped', <your_email>)
        if body['processname'] == 'crawler' and headers["eventname"] == "PROCESS_STATE_EXITED":
            send('crawler exited', 'crawler exited', <your_email>)

        write_stderr("Headers: %r\n" % (headers,))
        write_stderr("Body: %r\n" % (body,))
        # acknowledge the event so supervisord can dispatch the next one
        write_stdout("RESULT 2\nOK")

if __name__ == '__main__':
    main()
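
Once the email helper below is in place, an easy way to exercise the listener without waiting for a real crash (assuming the program names used above):

supervisorctl stop crawler          # triggers a PROCESS_STATE_STOPPED event
supervisorctl start crawler
supervisorctl signal KILL crawler   # the crawler dies, triggering PROCESS_STATE_EXITED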

helper function — send_email.py:

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import smtplib

password = <password>
sender = <email_sender>

def send(sub, body, rep):
    # build a fresh message each call so repeated sends don't accumulate headers
    msg = MIMEMultipart()
    msg['From'] = sender
    msg['Subject'] = sub
    msg['To'] = rep
    # add in the message body
    msg.attach(MIMEText(body, 'plain'))

    # create the server connection
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.starttls()
    # login credentials for sending the mail
    server.login(sender, password)

    # send the message via the server
    server.sendmail(sender, rep, msg.as_string())
    server.quit()

Final version of supervisord.conf:

[supervisord]
nodaemon=true
loglevel=info
logfile=../log/supervisord.log
logfile_maxbytes=50MB
[program:crawler]
command=python crawler.py
stdout_events_enabled=true
stderr_events_enabled=true
stdout_logfile=../log/crawler_out.log
stderr_logfile=../log/crawler_err.log
stdout_logfile_maxbytes=50MB
stderr_logfile_maxbytes=50MB
autorestart=unexpected
exitcodes=0,2
stopsignal=TERM
[supervisorctl]
;serverurl=unix:///tmp/supervisor.sock ; use a unix:// URL for a unix socket
serverurl=http://127.0.0.1:9001 ;
[inet_http_server]
port = 127.0.0.1:9001
[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface
[eventlistener:listener]
command=python listener.py
events=PROCESS_STATE
autorestart=true
stdout_logfile=../log/listener_out.log
stderr_logfile=../log/listener_err.log
