Stop creating Sling Scheduler in AEMaaCS. Instead…

Saravana Prakash
4 min read · Dec 19, 2023

Why is my job running multiple times in the cloud?

Problem Statement:

Below is the easiest, smallest scheduler we have been writing for years in AEM.

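import org.apache.sling.commons.scheduler.Scheduler;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

@Component(immediate = true, service = Runnable.class)
public class MyScheduler implements Runnable {

    @Reference
    private Scheduler scheduler;

    @Activate
    protected void activate() {
        // Runs on every instance that activates this component
        scheduler.schedule(this, scheduler.EXPR("0 * * * * ?"));
    }

    @Override
    public void run() {
        // do something
    }
}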

Or, more simply, with just the annotation @Component(immediate = true, service = Runnable.class, property = {"scheduler.expression=0 * * * * ?"})

This method uses Sling Commons Scheduler.
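For anyone unfamiliar with the expression: as far as I know, the Commons Scheduler takes Quartz-style cron expressions, where the first field is seconds, so "0 * * * * ?" fires at second 0 of every minute. Here is a small sketch that simply labels the fields. CronFieldsDemo and describe are illustrative names of my own, not part of any Sling or Quartz API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper only; it labels the fields and does not validate their values.
class CronFieldsDemo {

    // Field order assumed from the Quartz cron format; "year" is optional.
    private static final String[] FIELD_NAMES = {
        "seconds", "minutes", "hours", "day-of-month", "month", "day-of-week", "year"
    };

    /** Maps each whitespace-separated part of the expression to its field name. */
    static Map<String, String> describe(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("Expected 6 or 7 fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            fields.put(FIELD_NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {seconds=0, minutes=*, hours=*, day-of-month=*, month=*, day-of-week=?}
        System.out.println(describe("0 * * * * ?"));
    }
}
```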

When I wrote this scheduler, tested it successfully on local and then ran it on AEMaaCS, the result was:

The job got executed multiple times

I re-verified the code and the cron expression. It works perfectly on local. But I confirmed that it consistently runs multiple times on higher environments.

WHY?

It's explained in the Apache documentation here.

Yes, there is a typo in the Apache documentation for the word ‘environemnt’.

When a Sling Commons Scheduler job is registered in a clustered cloud environment (AEMaaCS), EACH cloud instance triggers the job separately. When I checked the logs, I saw multiple executions, one per instance in the cluster.

Worse, during deployments or cloud upscaling, the number of instances increases or decreases, and the number of jobs fired changes with it. This can lead to resource conflicts if the same resource needs to be updated by each run.
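To see why the count multiplies, here is a minimal plain-Java sketch. It uses no Sling APIs, and the class and method names are made up for illustration. The assumption is that each instance keeps its own in-memory schedule and registers the same job when the component activates, so a single cron tick across the cluster runs the job once per instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java simulation; Instance is a hypothetical stand-in for one AEM instance.
class PerInstanceSchedulerDemo {

    static class Instance {
        private final List<Runnable> scheduled = new ArrayList<>();

        // Mirrors activate(): every instance registers the job in its own scheduler.
        void activate(Runnable job) {
            scheduled.add(job);
        }

        // Mirrors one cron tick firing on this instance.
        void tick() {
            scheduled.forEach(Runnable::run);
        }
    }

    /** Returns how many times the job body runs for one cron tick across the cluster. */
    static int runsPerTick(int instanceCount) {
        AtomicInteger runs = new AtomicInteger();
        List<Instance> cluster = new ArrayList<>();
        for (int i = 0; i < instanceCount; i++) {
            Instance instance = new Instance();
            instance.activate(runs::incrementAndGet);
            cluster.add(instance);
        }
        cluster.forEach(Instance::tick);
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("1 instance:  " + runsPerTick(1)); // local setup
        System.out.println("3 instances: " + runsPerTick(3)); // clustered setup
    }
}
```

With one instance the job runs once, which matches what I saw on local. With three instances it runs three times per tick.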

Solution:

A more controlled way to manage jobs on AEMaaCS is to use Scheduled Jobs. It is very similar to the traditional Commons Scheduler above, but split into a Producer and a Consumer.

The run() method moves into JobConsumer.java and the schedule() call moves into JobProducer.java.
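As a rough mental model of this split, the sketch below is again plain Java, not Sling code. SharedJobStore, registerConsumer and tick are hypothetical names. The assumption is that the schedule is stored once for the whole cluster, every instance registers a consumer for the topic, and each triggered job is handed to exactly one consumer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Simplified model only; the real Sling job handling is more involved.
class ScheduledJobDemo {

    /** Hypothetical stand-in for the schedule shared by all instances. */
    static class SharedJobStore {
        private final List<String> scheduledTopics = new ArrayList<>();
        private final Map<String, List<Consumer<String>>> consumers = new HashMap<>();
        private int nextConsumer = 0;

        // Producer side: one entry per schedule() call.
        void schedule(String topic) {
            scheduledTopics.add(topic);
        }

        // Consumer side: each instance registers its consumer for the topic.
        void registerConsumer(String topic, Consumer<String> consumer) {
            consumers.computeIfAbsent(topic, t -> new ArrayList<>()).add(consumer);
        }

        // One cron tick: each schedule entry yields one job, given to one consumer.
        void tick() {
            for (String topic : scheduledTopics) {
                List<Consumer<String>> candidates = consumers.getOrDefault(topic, List.of());
                if (!candidates.isEmpty()) {
                    candidates.get(nextConsumer++ % candidates.size()).accept(topic);
                }
            }
        }
    }

    /** Returns how many times the job body runs for one tick with the given cluster size. */
    static int runsPerTick(int instanceCount) {
        AtomicInteger runs = new AtomicInteger();
        SharedJobStore store = new SharedJobStore();
        for (int i = 0; i < instanceCount; i++) {
            store.registerConsumer("my/job/test", topic -> runs.incrementAndGet());
        }
        store.schedule("my/job/test"); // the producer schedules once, not once per instance
        store.tick();
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("1 instance:  " + runsPerTick(1));
        System.out.println("3 instances: " + runsPerTick(3));
    }
}
```

In this model the job runs once per tick whether there is one instance or three, because the number of runs follows the number of schedule entries rather than the number of instances.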

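The important difference between the Commons Scheduler and Scheduled Jobs is that a scheduled job is persisted in the repository (under /var/eventing/scheduled-jobs) instead of being registered in memory by each instance, which is what prevents the multiple executions.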

Implementation:

1. Create a JobConsumer.java

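import org.apache.sling.event.jobs.Job;
import org.apache.sling.event.jobs.consumer.JobConsumer;
import org.osgi.service.component.annotations.Component;

@Component(service = JobConsumer.class, immediate = true, property = {
        JobConsumer.PROPERTY_TOPICS + "=my/job/test"})
public class MyJobConsumer implements JobConsumer {

    @Override
    public JobResult process(final Job job) {
        // do something
        return JobResult.OK;
    }
}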

This is a simple class that implements JobConsumer and overrides the process() method. Make sure to return JobResult.OK after the job completes successfully.

2. Create a JobProducer.java

As per the Apache documentation, we can implement JobExecutor and write executors for more controlled execution. My use case was very simple:

When the job is fired, launch a workflow that does the job processing.

So I wrote a servlet as my JobProducer, like this:

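import static org.apache.sling.api.servlets.ServletResolverConstants.SLING_SERVLET_METHODS;
import static org.apache.sling.api.servlets.ServletResolverConstants.SLING_SERVLET_PATHS;

import java.util.Collection;
import java.util.Collections;
import java.util.Optional;
import javax.servlet.Servlet;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.servlets.HttpConstants;
import org.apache.sling.api.servlets.SlingSafeMethodsServlet;
import org.apache.sling.event.jobs.JobBuilder;
import org.apache.sling.event.jobs.JobManager;
import org.apache.sling.event.jobs.ScheduledJobInfo;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

@Component(
        service = Servlet.class,
        name = "Servlet used to start/stop my Jobs ",
        property = {SLING_SERVLET_PATHS + "=/bin/job/producer", SLING_SERVLET_METHODS + "=" + HttpConstants.METHOD_GET})
public class SalsifyJobProducer extends SlingSafeMethodsServlet {
    public static final String TOPIC = "my/job/test";
    @Reference
    private JobManager jobManager;

    @Override
    protected void doGet(final SlingHttpServletRequest slingRequest, final SlingHttpServletResponse slingResponse) {
        // Note: valueOf() throws if "mode" is missing or is not exactly "start" or "stop"
        MODE modeEnum = MODE.valueOf(slingRequest.getParameter("mode"));
        if (modeEnum == MODE.start) {
            startScheduledJob();
        } else if (modeEnum == MODE.stop) {
            unScheduleExistingJobs();
        }
    }

    private void unScheduleExistingJobs() {
        Optional.ofNullable(jobManager.getScheduledJobs(TOPIC, 0, null))
                .orElse(Collections.emptyList())
                .forEach(ScheduledJobInfo::unschedule);
    }

    private void startScheduledJob() {
        unScheduleExistingJobs();
        Collection<ScheduledJobInfo> myJobs = jobManager.getScheduledJobs(TOPIC, 0, null);
        if (myJobs.isEmpty()) {
            /* Good to schedule job */
            JobBuilder.ScheduleBuilder scheduleBuilder = jobManager.createJob(TOPIC).schedule();
            scheduleBuilder.cron("0 * * * * ?");
            scheduleBuilder.add();
        }
    }

    private enum MODE {start, stop}
}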

This is a simple servlet, executed by requesting /bin/job/producer?mode=start on author. This schedules the job. If there is any mishap or a rollback is required, I simply request /bin/job/producer?mode=stop. Once the coast is clear, I start the job again. This gives me better control over the job in Production.
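One thing to watch in the servlet: MODE.valueOf(slingRequest.getParameter("mode")) throws if the parameter is missing or misspelled. Here is a minimal sketch of a more forgiving parser in plain Java. ModeParsingDemo and parseMode are hypothetical names, and what the servlet should do with an empty result (for example, answer with a 400) is left open.

```java
import java.util.Locale;
import java.util.Optional;

// Illustrative helper; not part of the servlet above.
class ModeParsingDemo {

    enum Mode { START, STOP }

    /** Returns the parsed mode, or empty when the value is missing or unknown. */
    static Optional<Mode> parseMode(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        try {
            // Accept "start", "Start", " STOP " and similar variants.
            return Optional.of(Mode.valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseMode("start"));   // Optional[START]
        System.out.println(parseMode(null));      // Optional.empty
        System.out.println(parseMode("restart")); // Optional.empty
    }
}
```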

How to check Sling Jobs status on AEMaaCS?

In the on-prem AEM world, we had the liberty of accessing the Felix console at /system/console, and it was very easy to verify job status using JMX. In the AEMaaCS cloud world, the screws are tightened. Here is how to verify Sling jobs:

1. Log on to the Developer Console of the environment. You need developer access in Cloud Manager.

2. From the Developer Console, select Sling Jobs.

3. Filter by your topic name. This shows the active jobs for that topic.

This way, if more jobs are running than intended, you can stop them all and restart in a controlled way.

Conclusion:

Sling Commons Scheduler has been the de facto way of writing scheduled jobs in AEM. But it no longer works as expected on AEMaaCS, since the job gets fired once per instance in the cluster. A more controlled way is to use Sling Scheduled Jobs. These persist the active schedule under /var/eventing/scheduled-jobs, preventing the multiple-execution issue.

Saravana Prakash

AEM Fullstack Enthusiast. Working on AEMCaaS, Adobe EDS, Adobe IO and other Adobe Marketing Cloud tools