Automating Node Deletion in AEM with Sling Job and Workflow

Varoon Kulanthaivel
Published in Activate AEM
Sep 3, 2024


Introduction

Maintaining a large-scale Adobe Experience Manager (AEM) instance often involves tasks such as cleaning up obsolete nodes. Automating these tasks saves considerable time and reduces human error. The goal is an automated solution for removing specific nodes in AEM based on criteria supplied as a payload in a CSV file. In our case, a previous implementation had created a set of nodes; when that feature was later discarded, the nodes remained in the production environment and needed to be deleted. This article outlines how to set up a Sling Job in AEM to automate node deletion via a workflow triggered by a CSV payload. The proposed solution uses a Sling Job Consumer and a workflow to process the CSV and trigger the node deletion.

Context

Automating node deletion involves several steps:

  1. Creating a Sling Job Consumer to handle the node deletion.
  2. Setting up a workflow to trigger the Sling Job with the necessary CSV payload.
  3. Processing the CSV to decide which nodes to remove.

The CSV contains a root path, a page template, and a relative path.

Example:

Removal of an existing Spacer component, which is no longer required, from all parent pages (parent_page_1, parent_page_2, parent_page_3).

This can be done by removing the spacer node from all parent pages.
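To make the deletion target concrete, here is a tiny sketch (with hypothetical paths) of how a matched page's jcr:content path and the CSV's relative path combine into the node that gets removed, mirroring the spacer example:

```java
public class PathComposition {
    // The node to delete is simply the matched page's jcr:content path
    // with the CSV's relative path appended.
    public static String nodeToDelete(String pageContentPath, String relativePath) {
        return pageContentPath + relativePath;
    }

    public static void main(String[] args) {
        // Hypothetical values matching the spacer example.
        String page = "/content/project/language-masters/en/parent_page_1/jcr:content";
        String rel = "/root/container/spacer";
        System.out.println(nodeToDelete(page, rel));
        // -> /content/project/language-masters/en/parent_page_1/jcr:content/root/container/spacer
    }
}
```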

Before: parent_pages under project/language-masters/en, each containing a Button and a Spacer component.

After: the Spacer component removed from each parent page.

Payload CSV: the general format, and the values specific to this example.
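As an illustration, such a payload CSV might look like the following. The column names match the keys the job consumer reads (rootPath, template, relativePath); the concrete paths are hypothetical.

```
rootPath,template,relativePath
/content/project/language-masters/en,/conf/project/settings/wcm/templates/content-page,/root/container/spacer
```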

In this article, we will walk through the implementation of these components, with a focus on the Sling Job Consumer and how the workflow integrates with it.

The Sling Job Consumer

A Sling Job Consumer plays a crucial role in processing asynchronous jobs within AEM environments. It registers itself to specific job topics, defining which tasks it can handle. Jobs are processed using the process method, which returns outcomes like success, failure, or the need for rescheduling. The advantages of using a Sling Job Consumer include efficient management of long-running tasks, enhanced system performance through asynchronous processing, and robust error handling with automatic retries.

This mechanism ensures tasks are executed at least once, even if there are failures during processing, making it essential for managing asynchronous tasks such as creating and deleting nodes, updating node properties, and sending email notifications.

Advantages of Using a Sling Job Consumer:

  1. Asynchronous Task Processing: Manages background tasks independently of the main application threads, thereby improving overall system performance.
  2. Reliable Execution: Ensures tasks are attempted at least once, leveraging persistent job states and retry mechanisms in case of failures.
  3. Error Handling and Retrying: Built-in support for handling errors ensures tasks complete reliably even under challenging conditions.
  4. Scalability: Separates job creation from execution, enabling efficient management of a large volume of tasks without impacting system responsiveness.
  5. Simplified Maintenance: Clearly defined job definitions and execution steps streamline the management of complex workflows, making the system easier to maintain.

How do we declare a Job?

// A Job Consumer implements the Sling JobConsumer interface (which is
// annotated with @ConsumerType in the Sling API) and registers itself
// for a topic via its component properties:
@Component(service = JobConsumer.class,
           immediate = true,
           property = {
               Constants.SERVICE_DESCRIPTION + "=Job Name",
               JobConsumer.PROPERTY_TOPICS + "=" + JobTopic
           })
public class MyJobConsumer implements JobConsumer { ... }

How do we trigger a Job?

Every Job typically requires two key parameters:

  1. TOPIC: This property is unique and assigned to a specific Job Consumer. It serves as a categorization mechanism to route jobs to the appropriate consumer capable of handling tasks related to that topic.
  2. PAYLOAD: This parameter is a key-value map of serializable objects. The payload provides the input data the Job Consumer needs to execute its task.
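Putting the two together: when a job is added, the topic and the payload are simply a String and a Map of serializable values. A minimal plain-Java sketch follows; the topic and the payloadPath key are the ones used later in this article, while the DAM path is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class JobParameters {
    // Topic string that routes the job to the consumer registered for it.
    public static final String TOPIC = "node_removal/job";

    // Build the payload map; every value must be serializable so the job
    // engine can persist and retry the job.
    public static Map<String, Object> buildPayload(String csvPath) {
        Map<String, Object> props = new HashMap<>();
        props.put("payloadPath", csvPath);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical DAM path for the uploaded CSV.
        Map<String, Object> props = buildPayload("/content/dam/project/node-removal.csv");
        System.out.println(TOPIC + " -> " + props.get("payloadPath"));
        // Inside AEM, with a JobManager reference, this becomes:
        // jobManager.addJob(TOPIC, props);
    }
}
```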

There are three common ways of triggering a Job Consumer:

Workflow: Workflows in AEM are sequences of steps that automate processes like content approval, form submission handling, or any custom business process. Workflows can trigger jobs by defining steps that create or update job nodes in the JCR (Java Content Repository), specifying the job topic and payload. When a workflow step executes that triggers a job, it effectively initiates asynchronous processing by a Sling Job Consumer registered for that job topic.

Servlet Trigger: A servlet in AEM can be designed to handle HTTP requests. This servlet can be configured to receive requests that include a payload, either as request parameters or in the request body. When such a request is received, the servlet can trigger a job by invoking the JobManager API or by directly adding a job node to the JCR with the appropriate job topic and payload.

Scheduler: AEM allows for scheduling tasks using CRON expressions. These scheduled tasks can trigger jobs at specified intervals or at specific times. When a scheduled time is reached, AEM triggers the job, passing any necessary payload or parameters to the registered Sling Job Consumer. This enables periodic execution of tasks such as content publishing, report generation, or data synchronization.

Specific to this case, we trigger the JobConsumer through a workflow: the workflow is started once, fires the job, and the consumer cleans up the nodes in a single run.

The Sling Job Consumer is a key component that handles the actual node deletion process. Given below is a Code Snippet breakdown of the essential functional parts of the NodeRemovalJobConsumer class.

CODE SNIPPET:

@Component(service = JobConsumer.class,
           immediate = true,
           property = {JobConsumer.PROPERTY_TOPICS + "=node_removal/job"})
public class NodeRemovalJobConsumer implements JobConsumer {

    @Reference
    private transient ResourceResolverFactory resolverFactory;

    @Override
    public JobResult process(Job job) {
        String payloadPath = job.getProperty("payloadPath", String.class);
        // try-with-resources ensures the service resolver is closed when done.
        try (ResourceResolver resourceResolver = resolverFactory.getServiceResourceResolver(null)) {
            Resource payloadResource = resourceResolver.getResource(payloadPath);
            if (payloadResource == null) {
                return JobResult.FAILED;
            }
            Asset payloadAsset = payloadResource.adaptTo(Asset.class);

            // Read rootPath, template and relativePath from the CSV payload.
            HashMap<String, String> map = readCsvFile(payloadAsset);
            String rootPath = map.get("rootPath");

            if (rootPath != null && !rootPath.isEmpty()) {
                // Find all jcr:content nodes under rootPath built on the given template.
                AemQuery aemQuery = new AemQuery(resourceResolver);
                aemQuery.inPath(rootPath)
                        .byType("cq:PageContent")
                        .byProperty("cq:template", map.get("template"))
                        .limitTo(-1);
                Optional<SearchResult> searchResult = aemQuery.query();

                if (searchResult.isPresent()) {
                    Iterator<Resource> resources = searchResult.get().getResources();
                    Session session = resourceResolver.adaptTo(Session.class);
                    while (resources.hasNext()) {
                        Resource resource = resources.next();
                        // Resolve the node to delete relative to each matched page.
                        Resource relativeResource = resourceResolver
                                .getResource(resource.getPath() + map.get("relativePath"));
                        if (relativeResource != null) {
                            resourceResolver.delete(relativeResource);
                            session.save();
                        }
                    }
                }
            }
            return JobResult.OK;
        } catch (Exception e) {
            // FAILED lets the job engine retry according to the queue configuration.
            return JobResult.FAILED;
        }
    }
}

Key Components

  • Job Topic: The job topic is defined as node_removal/job. This is the identifier for the job that the workflow will trigger.
  • Resource Resolver: Utilized to gain access to the AEM resources.
  • CSV Parsing: The readCsvFile method is used to read and parse the CSV file to get the required data for node deletion.
  • AEM Query: AemQuery is used to search for nodes based on criteria specified in the CSV.
  • Node Deletion: The nodes identified by the query are deleted.
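The readCsvFile helper is not shown above; below is a minimal sketch of one way it could work, assuming the CSV has a header row and a single value row whose columns become the map keys. The stream handling is plain Java; in the consumer, the input stream would typically come from the asset's original rendition (e.g. payloadAsset.getOriginal().getStream()).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;

public class CsvPayloadReader {

    // Parse a header row and a single value row into a key/value map,
    // e.g. "rootPath,template,relativePath" mapped to their values.
    public static HashMap<String, String> readCsv(InputStream in) throws IOException {
        HashMap<String, String> map = new HashMap<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String headerLine = reader.readLine();
            String valueLine = reader.readLine();
            if (headerLine == null || valueLine == null) {
                return map; // empty or malformed payload
            }
            String[] headers = headerLine.split(",");
            String[] values = valueLine.split(",");
            for (int i = 0; i < headers.length && i < values.length; i++) {
                map.put(headers[i].trim(), values[i].trim());
            }
        }
        return map;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical payload matching the spacer example.
        String csv = "rootPath,template,relativePath\n"
                + "/content/project/language-masters/en,/conf/project/templates/page,/root/container/spacer";
        HashMap<String, String> map =
                readCsv(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)));
        System.out.println(map.get("rootPath"));
    }
}
```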

Triggering the Job with a Workflow

The workflow initiates the Sling Job using the CSV payload. Here’s an overview of the JobProducerWorkflowProcess class, which is responsible for triggering the job.

CODE SNIPPET :

@Component(
    immediate = true,
    service = WorkflowProcess.class,
    property = {
        "service.description=Trigger Job Consumers",
        "process.label=Trigger Job Consumers"
    }
)
public class JobProducerWorkflowProcess implements WorkflowProcess {

    @Reference
    private JobManager jobManager;

    @Override
    public void execute(WorkItem item, WorkflowSession session, MetaDataMap args) throws WorkflowException {
        String payloadType = item.getWorkflowData().getPayloadType();
        // The job topic is passed in via the workflow step's PROCESS_ARGS.
        String jobTopic = args.get("PROCESS_ARGS", String.class);

        final Map<String, Object> props = new HashMap<>();
        if (StringUtils.equals(payloadType, "JCR_PATH")) {
            // Hand the CSV's JCR path to the consumer as the job payload.
            String path = item.getWorkflowData().getPayload().toString();
            props.put("payloadPath", path);
        }
        jobManager.addJob(jobTopic, props);
    }
}

Key Components

  • JobManager: The JobManager service is used to create and manage jobs. The addJob method is called to add a new job to the queue.
  • Workflow Process Execution: The execute method is the entry point for the workflow process. It reads the payload and arguments, and then triggers the job.
  • Payload Handling: The workflow process extracts the payload path from the WorkItem and adds it to the job properties.

How the Workflow Uses Payload

  1. CSV Upload: The CSV file is uploaded to a specific path in the DAM.
  2. Workflow Launch: A workflow is triggered upon the upload of the CSV.
  3. Job Creation: The workflow creates a job with the topic node_removal/job and includes the path of the uploaded CSV as its payload.
  4. Job Processing: The NodeRemovalJobConsumer processes the job and deletes the nodes as specified in the CSV.

Conclusion

By automating node deletion using a Sling Job and workflow in AEM, you can streamline maintenance tasks and reduce the risk of human error. This approach leverages the flexibility of AEM and the power of automated jobs to keep your content repository clean and organized. Implementing this solution involves setting up the job consumer, parsing the CSV payload, and integrating it with a workflow for seamless operation.
