What exactly does the Kubernetes CronJob’s `startingDeadlineSeconds` mean?
I was working on a Kubernetes CronJob and wondered what `startingDeadlineSeconds` is. There is official documentation, but I was still confused after reading it.
After looking at the source code, I think `startingDeadlineSeconds` means that if the CronJob controller cannot start a job run on schedule, it will keep retrying until `startingDeadlineSeconds` has elapsed since the scheduled time.
Before showing a few examples, we need to clarify some concepts:
Controller check: the CronJob controller checks things (watching and syncing jobs) every 10 seconds.
Schedule: the time to execute the job according to the given schedule expression.
Job run: a job object is created about once per execution time of its schedule.
Below, I will use a few examples to demonstrate its use cases.
Example #1. Assume that the schedule is 8:30, 9:30, … and `startingDeadlineSeconds` is 60 seconds. Between 8:29 and 8:35 there is downtime, so no job run can be started. In this case, the 8:30 job will not be executed: when the system becomes healthy again at 8:35, the deadline of 8:31 (8:30 plus 60 seconds) has already passed. The 8:30 schedule is lost, and the next scheduled run is at 9:30.
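The deadline check in Example #1 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the controller's actual code: the helper `can_still_start` is a hypothetical name, the date is arbitrary, and treating the deadline as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def can_still_start(scheduled_time, now, starting_deadline_seconds):
    """Return True if a run scheduled at `scheduled_time` may still be started at `now`.

    Hypothetical helper: a run is considered startable until
    scheduled_time + starting_deadline_seconds (inclusive, by assumption).
    """
    deadline = scheduled_time + timedelta(seconds=starting_deadline_seconds)
    return scheduled_time <= now <= deadline


# Example #1: schedule at 8:30, deadline 60 s, system healthy again at 8:35.
scheduled = datetime(2024, 1, 1, 8, 30)   # the date is arbitrary
recovered = datetime(2024, 1, 1, 8, 35)
print(can_still_start(scheduled, recovered, 60))   # False: 8:35 is after 8:31

# With a 10-minute deadline, the same run could still start at 8:35.
print(can_still_start(scheduled, recovered, 600))  # True: deadline is 8:40
```

In other words, a larger `startingDeadlineSeconds` widens the window in which a missed run can still be picked up after recovery.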
This is the case when the interval between schedules is longer than `startingDeadlineSeconds`. What if it is shorter?
Example #2. Assume that the schedule is 8:30, 8:31, 8:32, …, i.e. every minute, and `startingDeadlineSeconds` is 2 hours. The downtime is between 8:30 and 10:20.
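What will happen in this case? The CronJob controller will keep trying to start a job for the most recent schedule. However, by the time the downtime ends at 10:20, 111 schedules (8:30 through 10:20, one per minute) have been missed, all of them within the 2-hour deadline window. Once the number of missed schedules exceeds 100, the controller no longer tries to start a job and logs the error: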
Cannot determine if job needs to be started. Too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew.
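To make the counting in Example #2 concrete, here is a rough Python sketch of how missed schedules could be tallied. It is a simplification under stated assumptions, not the controller's implementation: the schedule is modeled as a fixed interval instead of a cron expression, the last successful schedule is assumed to be 8:29, and the names `count_missed_schedules` and `MAX_MISSED_SCHEDULES` are hypothetical.

```python
from datetime import datetime, timedelta

MAX_MISSED_SCHEDULES = 100  # threshold quoted in the error message


def count_missed_schedules(last_schedule_time, now, interval,
                           starting_deadline_seconds=None):
    """Count schedule times in (earliest, now] for a fixed-interval schedule.

    Assumption: schedule times are last_schedule_time + k * interval.
    If a deadline is set, only schedules inside the window
    [now - deadline, now] are considered.
    """
    earliest = last_schedule_time
    if starting_deadline_seconds is not None:
        window_start = now - timedelta(seconds=starting_deadline_seconds)
        if window_start > earliest:
            earliest = window_start

    # First schedule time strictly after `earliest`.
    k = (earliest - last_schedule_time) // interval + 1
    t = last_schedule_time + k * interval

    missed = 0
    while t <= now:
        missed += 1
        t += interval
    return missed


last_ok = datetime(2024, 1, 1, 8, 29)   # assumed last run before the downtime
now = datetime(2024, 1, 1, 10, 20)      # downtime ends
every_minute = timedelta(minutes=1)

missed = count_missed_schedules(last_ok, now, every_minute, 2 * 60 * 60)
print(missed, missed > MAX_MISSED_SCHEDULES)   # 111 True

# A 1-hour deadline narrows the window to 9:20-10:20.
missed = count_missed_schedules(last_ok, now, every_minute, 60 * 60)
print(missed, missed > MAX_MISSED_SCHEDULES)   # 60 False
```

This also illustrates why the error message suggests decreasing `.spec.startingDeadlineSeconds`: a shorter window contains fewer missed schedules, so the count can stay under the limit.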
Related GitHub issue
I also found some redundant code in this part that could be cleaned up, so I submitted an issue: