Writing a simple scheduling service with APScheduler
If you are looking for a quick but scalable way to get a scheduling service up and running, APScheduler might be just what you need.
You can start using it right from the beginning of development and scale it as your product grows.
A simple Flask app that schedules a task would look something like this:
from flask import Flask, request
from apscheduler.schedulers.background import BackgroundScheduler
from datetime import datetime

schedule_app = Flask(__name__)

# initialize scheduler with your preferred timezone
scheduler = BackgroundScheduler({'apscheduler.timezone': 'Asia/Calcutta'})
scheduler.start()


@schedule_app.route('/schedulePrint', methods=['POST'])
def schedule_to_print():
    data = request.get_json()
    # get time to schedule and text to print from the json
    time = data.get('time')
    text = data.get('text')
    # convert to datetime
    date_time = datetime.strptime(str(time), '%Y-%m-%dT%H:%M')
    # schedule the method 'printing_something' to run at the given 'date_time' with the args 'text'
    job = scheduler.add_job(printing_something, trigger='date', next_run_time=str(date_time),
                            args=[text])
    return "job details: %s" % job


def printing_something(text):
    print("printing %s at %s" % (text, datetime.now()))
If I hit my server with curl, I get the following response:
curl -X POST http://127.0.0.1:5000/schedulePrint -H 'content-type: application/json' -d '{"time":"2018-07-27T01:25","text": "apscheduler"}'
job details: printing_something (trigger: date[2018-07-27 01:24:54 IST], next run at: 2018-07-27 01:25:00 IST)
And at the scheduled time it printed:
printing apscheduler at 2018-07-27 01:25:00.005645
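The handler above trusts that the payload always carries a well-formed `time` and a `text`. Here is a minimal sketch of validating the payload before scheduling, using only the standard library. The helper name `parse_schedule_request` and the rule that past times are rejected are assumptions for illustration, not part of the original app.

```python
from datetime import datetime

TIME_FORMAT = '%Y-%m-%dT%H:%M'  # same format the endpoint parses


def parse_schedule_request(data, now=None):
    """Return (run_at, text) from a decoded JSON payload, or raise ValueError.

    Hypothetical helper: the name and the rules below are illustrative.
    """
    if not isinstance(data, dict):
        raise ValueError('payload must be a JSON object')
    text = data.get('text')
    if not isinstance(text, str) or not text:
        raise ValueError("'text' must be a non-empty string")
    try:
        run_at = datetime.strptime(str(data.get('time')), TIME_FORMAT)
    except ValueError:
        raise ValueError("'time' must look like 2018-07-27T01:25") from None
    # Assumption: a time in the past is treated as a client error.
    # This is a naive comparison, so it assumes the server clock is in
    # the same timezone the scheduler is configured with.
    now = now or datetime.now()
    if run_at <= now:
        raise ValueError("'time' must be in the future")
    return run_at, text
```

In the route, a `ValueError` raised here could be turned into a 400 response instead of letting the request fail with a 500.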
Now, if we don't specify a job store for the scheduler, it uses an in-memory store, which means the scheduled jobs are lost if your application restarts.
To tackle this problem and scale up, you will want persistent storage. At this stage, a lightweight SQL database such as SQLite, used through APScheduler's SQLAlchemy job store, would suffice.
from flask import Flask, request
from apscheduler.schedulers.background import BackgroundScheduler
from datetime import datetime

schedule_app = Flask(__name__)

# initialize scheduler with your preferred timezone
scheduler = BackgroundScheduler({'apscheduler.timezone': 'Asia/Calcutta'})
# add a custom jobstore to persist jobs across sessions (default is in-memory)
scheduler.add_jobstore('sqlalchemy', url='sqlite:////tmp/schedule.db')
scheduler.start()


@schedule_app.route('/schedulePrint', methods=['POST'])
def schedule_to_print():
    data = request.get_json()
    # get time to schedule and text to print from the json
    time = data.get('time')
    text = data.get('text')
    # convert to datetime
    date_time = datetime.strptime(str(time), '%Y-%m-%dT%H:%M')
    # schedule the method 'printing_something' to run at the given 'date_time' with the args 'text'
    job = scheduler.add_job(printing_something, trigger='date', next_run_time=str(date_time),
                            args=[text])
    return "job details: %s" % job


def printing_something(text):
    print("printing %s at %s" % (text, datetime.now()))
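One way to check that jobs really survive a restart is to look inside the SQLite file. The sketch below reads it with the standard-library `sqlite3` module. It assumes APScheduler 3.x's default table name `apscheduler_jobs` and its `id` and `next_run_time` columns (the latter stored as a UTC timestamp); `list_pending_jobs` is a hypothetical helper name.

```python
import sqlite3
from datetime import datetime, timezone


def list_pending_jobs(db_path):
    """Return [(job_id, next_run_utc), ...] read straight from the job store file.

    Assumes the default SQLAlchemy job store layout of APScheduler 3.x:
    table 'apscheduler_jobs' with 'id' and 'next_run_time' columns.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            'SELECT id, next_run_time FROM apscheduler_jobs '
            'ORDER BY next_run_time'
        ).fetchall()
    finally:
        conn.close()
    # next_run_time is NULL for paused jobs, so keep None in that case.
    return [
        (job_id,
         datetime.fromtimestamp(ts, tz=timezone.utc) if ts is not None else None)
        for job_id, ts in rows
    ]
```

Calling `list_pending_jobs('/tmp/schedule.db')` after restarting the app should still show any job whose run time has not yet passed.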
If the product scales further and you move to a more distributed setup, you can replace the job store with a central ZooKeeper or MongoDB instance.
from kazoo.client import KazooClient
scheduler.add_jobstore('zookeeper', path=<path_in_zk_to_store_metadata>, client=KazooClient(hosts=<zookeeper_hosts>))
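Since only the job store changes as you scale, one option is to pick it from configuration instead of editing code. The following is a rough sketch that builds the dict-style configuration APScheduler 3.x accepts, under the `apscheduler.jobstores.default` key. The helper name `build_scheduler_config`, the environment variable names, and the default URLs and paths are all assumptions made for illustration.

```python
import os


def build_scheduler_config(backend=None):
    """Return an APScheduler config dict for the chosen job store backend.

    Hypothetical helper: the environment variable names and the default
    values below are illustrative, not part of APScheduler.
    """
    backend = backend or os.environ.get('JOBSTORE_BACKEND', 'sqlalchemy')
    if backend == 'sqlalchemy':
        store = {
            'type': 'sqlalchemy',
            'url': os.environ.get('JOBSTORE_URL', 'sqlite:////tmp/schedule.db'),
        }
    elif backend == 'mongodb':
        # Extra options are handed to the MongoDB client by the job store.
        store = {
            'type': 'mongodb',
            'host': os.environ.get('JOBSTORE_URL', 'mongodb://localhost:27017'),
        }
    elif backend == 'zookeeper':
        # Extra options are handed to the Kazoo client by the job store.
        store = {
            'type': 'zookeeper',
            'path': os.environ.get('JOBSTORE_ZK_PATH', '/apscheduler'),
            'hosts': os.environ.get('JOBSTORE_ZK_HOSTS', '127.0.0.1:2181'),
        }
    else:
        raise ValueError('unknown job store backend: %r' % (backend,))
    return {
        'apscheduler.timezone': 'Asia/Calcutta',
        'apscheduler.jobstores.default': store,
    }
```

The scheduler would then be created with `BackgroundScheduler(build_scheduler_config())`, and switching backends becomes a deployment setting. The MongoDB and ZooKeeper stores need the `pymongo` and `kazoo` packages installed, respectively.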