A real-world problem with NestJS/Schedule and a practical solution

JeongSeop Byeon
5 min read · Dec 25, 2023


Servers often take responsibility for periodic tasks: DB backups, weekly sales reports, analytics, and so on. Nest offers an abstracted way to schedule them using decorators like Cron or Interval. These tools are quite useful, but sometimes not enough, because production environments are complicated.

Multiple server instances

Usually, server developers deploy more than a single instance to make the server robust. That means the cron jobs attached to Node processes by NestJS/Schedule will be executed on every instance.

If a job is idempotent, this causes no problem, but often a task should be executed exactly once. Imagine your server instances producing wrong data because of duplicated executions. It’s a disaster. So how can we guarantee a single execution?

Can’t I just make another instance only for cron jobs?

It’s a bad idea for sure; it just recreates the problem in a loop. We encountered the problem in the first place because we deployed multiple servers to achieve stability. If we deploy a single instance for cron jobs, who will assure the execution of that single instance? Should we deploy multiple instances again to assure those jobs? Nonsense.

In addition, you don’t want yet another server to maintain. Keep things as simple as possible and don’t waste your time on extra labor; reuse your existing server resources, like DB connections and service methods.

In fact, having multiple attempts for a job is not bad at all: it helps guarantee that the job actually runs. Imagine one of your servers panicking right when the scheduled time arrives.

Old friend helps us again

In fact, this kind of problem is very common in computer science, and we can leverage the implementation of a DB (no matter which DBMS).

A DB can assure atomicity and serial execution in many ways, and we’ll extend a DB lock into our application layer. We won’t cover locks in depth in this article; if you’re not familiar with the concept, just think of a lock as a helper for serial execution.

Main Concept

We’ll maximize stability by letting every server invoke the cron job, to prevent failure, while guaranteeing a single execution using a DB lock.

Certain queries lock a record (or a document, for NoSQL DBs), like the SQL statement SELECT … FOR UPDATE or the ‘findAndModify’ family in MongoDB. We’ll use MongoDB in this article, but the idea is universal across DBMSs. Even Redis has similar commands, like SET with NX, or SETNX.

If you have dedicated cache storage outside your servers, it is even more suitable for our purpose: locks hold only state, so they don’t have to be stored on a persistent disk, and as you know, in-memory caches are super fast.

Extending the DB lock to our application

So how does this concept actually solve our problem? Imagine we have three server instances. When a cron job fires (say, every day at 06:00:00), each server queries for the lock first. Because these queries on the same record run serially, each server can decide whether to execute or not.
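The race can be sketched with a toy in-memory model. The Map below is a stand-in for the locks collection, and tryAcquire mimics the atomic check-and-set the DB gives us; in production the atomicity comes from the DBMS, not from JavaScript’s single thread, so treat this purely as an illustration:

```typescript
// Hypothetical stand-in for the DB's atomic check-and-set on a lock record.
type LockRecord = { name: string; locked: boolean };

const store = new Map<string, LockRecord>([
  ['TASK_1', { name: 'TASK_1', locked: false }],
]);

// Mirrors findOneAndUpdate({ name, locked: false }, { $set: { locked: true } }):
// only a caller that sees `locked: false` may flip it to true.
function tryAcquire(name: string): boolean {
  const rec = store.get(name);
  if (!rec || rec.locked) return false; // someone else already holds it
  rec.locked = true;
  return true;
}

// Three "server instances" fire at the same scheduled moment.
const results = [1, 2, 3].map(() => tryAcquire('TASK_1'));
console.log(results); // [true, false, false] — exactly one winner
```

Because the DB serializes the three findOneAndUpdate calls the same way, exactly one instance observes `locked: false` and wins.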

Enough with explanation. Let’s code!

Take a look at our goal first

@Cron("0 0 6 * * *", {
  name: LOCKS.TASK_1,
})
async task_1() {
  const lock = await this.getLock(LOCKS.TASK_1);

  if (!lock) return "This server instance did not get the lock!";

  try {
    // ...EXECUTIONS...
  } catch (e) {
    // ...ERROR HANDLING...
  } finally {
    await this.releaseLock(LOCKS.TASK_1);
  }
}

As you can see, a job MUST obtain the lock before executing; otherwise, it just returns.

Implementing Locks

We’ll inject the functions themselves (of course, you can choose another way, like injecting a class or not injecting at all). So first, we’ll implement a LockModule and its providers using useFactory.

// Why is this not a const enum?
// We'll use this object in a later section.
export const LOCKS = {
  TASK_1: 'TASK_1',
  TASK_2: 'TASK_2',
} as const;

export type Locks = typeof LOCKS[keyof typeof LOCKS];

export const enum LOCK_CONTROL {
  GET_LOCK = 'GET_LOCK',
  RELEASE_LOCK = 'RELEASE_LOCK',
}

export type LockDoc = {
  name: Locks;
  locked: boolean;
  last_locked_at: Date;
  last_released_at: Date;
};

// findOneAndUpdate may find no matching record, hence the nullable return.
export type LockControl = (lock: Locks) => Promise<LockDoc | null>;


import { Module } from '@nestjs/common';
import { getConnectionToken } from '@nestjs/mongoose';
import { Connection } from 'mongoose';

@Module({
  providers: [
    {
      provide: LOCK_CONTROL.GET_LOCK,
      // Inside `inject`, use the connection token, not the parameter decorator.
      inject: [getConnectionToken('YOUR_CONNECTION')],
      useFactory: (connection: Connection) => {
        const locks = connection.db.collection<LockDoc>('locks');

        return async (key: Locks) => {
          // findOneAndUpdate is findAndModify internally,
          // so this query guarantees atomicity.
          const lock = await locks.findOneAndUpdate(
            {
              name: key,
              locked: false,
            },
            {
              $set: {
                locked: true,
                last_locked_at: new Date(),
              },
            },
          );

          // With MongoDB driver v4/v5 the result wraps the document in `value`;
          // driver v6+ returns the document directly.
          return lock.value;
        };
      },
    },
    {
      provide: LOCK_CONTROL.RELEASE_LOCK,
      inject: [getConnectionToken('YOUR_CONNECTION')],
      useFactory: (connection: Connection) => {
        const locks = connection.db.collection<LockDoc>('locks');

        return async (key: Locks) => {
          const releasedLock = await locks.findOneAndUpdate(
            {
              name: key,
              locked: true,
            },
            {
              $set: {
                locked: false,
                last_released_at: new Date(),
              },
            },
          );

          return releasedLock.value;
        };
      },
    },
  ],
})
class LockModule {}

And in service layer,

@Injectable()
class SomeService {
  constructor(
    @Inject(LOCK_CONTROL.GET_LOCK)
    private readonly getLock: LockControl,

    @Inject(LOCK_CONTROL.RELEASE_LOCK)
    private readonly releaseLock: LockControl,
  ) {}
}

What happens now?

Let’s go back to our scenario. All three instances invoke the cron job at 06:00:00, and all three query with ‘findOneAndUpdate’; these operations are executed serially.

So only one instance gets the lock for the job. The other two? They just terminate the procedure because they didn’t get the lock.

When the job is done, the lock is released in the ‘finally’ block.

Don’t misunderstand

This approach assures that a task never runs in parallel across multiple instances. It may not be appropriate when a job finishes in a really short period of time (I mean, really short): if another instance’s get-lock query arrives at the DB after the job is done (after the lock was released), it will be evaluated as a valid acquisition, and the job will run again.

If that’s your case, you have to implement another guard, like checking dates or an execution record.
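One possible sketch of such a date check, using the last_locked_at field from the LockDoc type above (the function name and the 60-second window are my own assumptions, not part of the original design): before executing, compare the previous acquisition time against the current scheduled slot and skip if this slot already ran.

```typescript
// Sketch: treat a lock acquired within the current scheduling window as
// "this run already happened", even if the lock itself was already released.
// `windowMs` is an assumed tolerance; pick something shorter than the
// interval between scheduled runs.
function alreadyRanThisWindow(
  lastLockedAt: Date | null | undefined,
  now: Date,
  windowMs: number,
): boolean {
  if (!lastLockedAt) return false; // never ran before
  return now.getTime() - lastLockedAt.getTime() < windowMs;
}

const now = new Date('2023-12-25T06:00:05Z');
// A lock taken 4 seconds ago belongs to this 06:00:00 run — skip.
console.log(alreadyRanThisWindow(new Date('2023-12-25T06:00:01Z'), now, 60_000)); // true
// A lock taken an hour ago belongs to a previous run — proceed.
console.log(alreadyRanThisWindow(new Date('2023-12-25T05:00:00Z'), now, 60_000)); // false
```

The check itself can piggyback on the findOneAndUpdate filter (e.g. also requiring last_locked_at to be older than the window), which keeps it atomic.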

Leveraging TS exhaustive checks to make sure all locks are handled

As you can see, lock communication needs an initial record in the DB, so it’s good to automate checking that every initial record exists. We’ll use the exhaustive check trick: if a lock is not handled, TS won’t compile.

(In the code below, insertLockIfNotExists is a fake function; you can implement your own.)

@Module({...})
class LockModule implements OnModuleInit {
  constructor(
    @InjectConnection('YOUR_CONNECTION') private readonly connection: Connection,
  ) {}

  async onModuleInit() {
    for (const lock of Object.values(LOCKS)) {
      switch (lock) {
        case LOCKS.TASK_1: {
          await insertLockIfNotExists(this.connection, lock);
          break;
        }

        case LOCKS.TASK_2: {
          await insertLockIfNotExists(this.connection, lock);
          break;
        }

        // Assigning the value to a variable of type 'never' checks that the
        // switch statement has handled all branches.
        // This is typically called an 'exhaustive check'.
        default: {
          const _exhaustiveCheck: never = lock;
        }
      }
    }
  }
}

Even better ways

You can package all of these features, for example as a decorator (maybe named MutexCron?) or as a library. Your choice. Good luck!

Thanks for reading
