Writing a simple scheduling service with APScheduler

Chetan Mishra
Jul 26, 2018


If you are looking for a quick but scalable way to get a scheduling service up and running, APScheduler might be just what you need.

You can start using it from the very beginning of development and scale it as your product grows.

A simple Flask app that does this would look something like the following:

from datetime import datetime

from flask import Flask, request
from apscheduler.schedulers.background import BackgroundScheduler

schedule_app = Flask(__name__)

# initialize scheduler with your preferred timezone
scheduler = BackgroundScheduler({'apscheduler.timezone': 'Asia/Calcutta'})
scheduler.start()


@schedule_app.route('/schedulePrint', methods=['POST'])
def schedule_to_print():
    data = request.get_json()
    # get the time to schedule and the text to print from the JSON body
    time = data.get('time')
    text = data.get('text')
    # convert to datetime
    date_time = datetime.strptime(str(time), '%Y-%m-%dT%H:%M')
    # schedule 'printing_something' to run at the given 'date_time' with the args 'text'
    # (the date trigger's own parameter is run_date; next_run_time overrides the first run)
    job = scheduler.add_job(printing_something, trigger='date',
                            next_run_time=date_time, args=[text])
    return "job details: %s" % job


def printing_something(text):
    print("printing %s at %s" % (text, datetime.now()))

If I hit my server with curl, I get the following response:

curl -X POST http://127.0.0.1:5000/schedulePrint -H 'content-type: application/json' -d '{"time":"2018-07-27T01:25","text": "apscheduler"}'

job details: printing_something (trigger: date[2018-07-27 01:24:54 IST], next run at: 2018-07-27 01:25:00 IST)

And at the scheduled time it printed:

printing apscheduler at 2018-07-27 01:25:00.005645
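The handler above hands the `time` field straight to `strptime`, so a missing or malformed value surfaces as an unhandled exception. Here is a minimal sketch of a validation step you could run first. The helper name `parse_schedule_request` and its `now` parameter are my own and not part of Flask or APScheduler. The past-time check compares against the server's local clock, which is only right if the server runs in the scheduler's timezone.

```python
from datetime import datetime

TIME_FORMAT = '%Y-%m-%dT%H:%M'  # same format the handler above expects


def parse_schedule_request(data, now=None):
    """Return (run_at, text) from a decoded JSON body, or raise ValueError.

    Hypothetical helper for illustration; `now` exists only to make
    the function easy to test.
    """
    if not isinstance(data, dict):
        raise ValueError('request body must be a JSON object')

    text = data.get('text')
    if not isinstance(text, str) or not text:
        raise ValueError("'text' must be a non-empty string")

    try:
        run_at = datetime.strptime(str(data.get('time')), TIME_FORMAT)
    except ValueError:
        raise ValueError("'time' must look like 2018-07-27T01:25") from None

    # naive comparison against the local clock (an assumption, see above)
    if run_at <= (now or datetime.now()):
        raise ValueError("'time' must be in the future")

    return run_at, text
```

In the route you would call this inside a `try` block and return a 400 response carrying the error message whenever it raises.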

If you don’t configure a job store, the scheduler falls back to an in-memory one, which means the scheduled jobs are lost whenever your application restarts.

To tackle this problem and scale up, you will want persistent storage. At this stage, a lightweight SQL database such as SQLite, accessed through APScheduler’s SQLAlchemy job store, would suffice.

from datetime import datetime

from flask import Flask, request
from apscheduler.schedulers.background import BackgroundScheduler

schedule_app = Flask(__name__)

# initialize scheduler with your preferred timezone
scheduler = BackgroundScheduler({'apscheduler.timezone': 'Asia/Calcutta'})
# add a custom jobstore to persist jobs across sessions (default is in-memory)
scheduler.add_jobstore('sqlalchemy', url='sqlite:////tmp/schedule.db')
scheduler.start()


@schedule_app.route('/schedulePrint', methods=['POST'])
def schedule_to_print():
    data = request.get_json()
    # get the time to schedule and the text to print from the JSON body
    time = data.get('time')
    text = data.get('text')
    # convert to datetime
    date_time = datetime.strptime(str(time), '%Y-%m-%dT%H:%M')
    # schedule 'printing_something' to run at the given 'date_time' with the args 'text'
    # (the date trigger's own parameter is run_date; next_run_time overrides the first run)
    job = scheduler.add_job(printing_something, trigger='date',
                            next_run_time=date_time, args=[text])
    return "job details: %s" % job


def printing_something(text):
    print("printing %s at %s" % (text, datetime.now()))
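To convince yourself that jobs survive a restart, you can peek into the SQLite file directly. This is a rough sketch that assumes the default table name used by APScheduler 3.x's SQLAlchemy job store (`apscheduler_jobs`) with `id` and `next_run_time` columns, the latter stored as a UTC Unix timestamp. `list_persisted_jobs` is a hypothetical helper, and the schema may differ in other versions.

```python
import sqlite3
from datetime import datetime, timezone


def list_persisted_jobs(db_path, table='apscheduler_jobs'):
    """Return [(job_id, next_run_time_utc)] read straight from the SQLite file.

    Assumes the table layout described above; not an APScheduler API.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            'SELECT id, next_run_time FROM "%s" ORDER BY next_run_time' % table
        ).fetchall()
    finally:
        conn.close()

    jobs = []
    for job_id, ts in rows:
        # a NULL next_run_time means the job is paused
        run_at = datetime.fromtimestamp(ts, tz=timezone.utc) if ts is not None else None
        jobs.append((job_id, run_at))
    return jobs
```

Calling `list_persisted_jobs('/tmp/schedule.db')` after scheduling a job and restarting the app should still list that job with its pending run time. Inside the application itself, `scheduler.get_jobs()` is the supported way to get the same information.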

If the product scales further and you move to a more distributed setup, you only need to swap the job store for a central one such as ZooKeeper or MongoDB.

from kazoo.client import KazooClient
scheduler.add_jobstore('zookeeper', path=<path_in_zk_to_store_metadata>, client=KazooClient(hosts=<zookeeper_hosts>))
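Since only the `add_jobstore` call changes between these setups, one option is to pick the job store from configuration. Below is a small sketch, assuming two made-up environment variables (`SCHEDULER_STORE` and `SCHEDULER_STORE_URL`) and a hypothetical helper `jobstore_settings`. The `'memory'`, `'sqlalchemy'` and `'mongodb'` names are the job store aliases I know from APScheduler 3.x, so check them against your installed version.

```python
def jobstore_settings(env):
    """Return (jobstore_type, kwargs) to pass to scheduler.add_jobstore().

    SCHEDULER_STORE and SCHEDULER_STORE_URL are hypothetical variable
    names chosen for this sketch.
    """
    store = env.get('SCHEDULER_STORE', 'memory')
    url = env.get('SCHEDULER_STORE_URL')

    if store == 'memory':
        return 'memory', {}
    if url is None:
        raise ValueError('SCHEDULER_STORE_URL is required for %r' % store)
    if store == 'sqlalchemy':
        return 'sqlalchemy', {'url': url}
    if store == 'mongodb':
        # extra keyword arguments are forwarded to pymongo's MongoClient
        return 'mongodb', {'host': url}
    raise ValueError('unsupported job store: %r' % store)


# Usage (with the scheduler from the examples above):
#   import os
#   store_type, kwargs = jobstore_settings(os.environ)
#   scheduler.add_jobstore(store_type, **kwargs)
```

This keeps the route and the job function untouched while the storage backend moves from a local file to a shared service.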
