Adobe I/O-Powered AEMaaCS Assets Export Implemented in Real-Time or Batch Mode
AEMaaCS Assets offers a cloud-native digital asset management solution that is scalable, reliable, and hassle-free. There has never been a better time for enterprises to adopt such a solution as their centralized DAM system. To share or distribute assets (or their renditions), AEMaaCS provides multiple options such as asset link share, asset selector, brand portal, etc. Still, there are common use cases where assets (or renditions) simply need to be exported to external cloud storage such as Azure blob store, an S3 bucket, or SFTP (for other systems to consume), or directly into 3rd party systems such as Marketo, Adobe Campaign, or a PIM, as long as they provide secured asset ingestion endpoints.
Traditionally, in AEM 6.x, this could usually be accomplished by “publishing” assets into downstream systems (within a workflow, for example) via a custom replication agent. This won’t work in AEMaaCS, which relies on content distribution rather than replication agents. There is also a paradigm shift in how AEM should be integrated with external systems in a cloud world: integrations are more event-driven and microservices-based. In this sense, App Builder (Events and Runtime actions) can really help bridge the gap.
Let’s take a look at a use case like this: when assets are published, their original rendition (optionally with one or more other renditions) should be exported to either Azure cloud storage or a 3rd party system such as Marketo (eventually, any 3rd party destination can be supported as long as it provides a public asset ingestion endpoint). The destination is driven by an export destination property defined in the asset’s metadata.
We will see how easily Adobe I/O can make it happen in either a real-time (sync) or a batch (async) fashion, with literally no custom code required in AEMaaCS. First of all, let’s quickly go over what Adobe I/O encompasses for implementing such a use case.
Adobe I/O Events
Adobe I/O Events enables building reactive, event-driven applications based on events originating from various Adobe services or solutions. In our case, the AEMaaCS author will be registered as the event provider. You can follow this to set up AEMaaCS I/O integration via the event-proxy package.
asset_published is not an event type that comes with I/O event-proxy by default, but you can follow this to create a custom asset_published event mapping. Once event-proxy is successfully configured with the asset_published mapping, publishing an asset should produce an event payload that references the published asset.
Adobe I/O Runtime Action
Adobe I/O Runtime is Adobe’s serverless computing platform, on which our actual asset exporting and ingestion functions are executed. Those functions are usually written in Node.js and deployed to Runtime as “actions”, and as a developer you don’t need to worry about the overhead of spinning up or configuring servers. As the key ingredient of an Adobe I/O event-driven application, a runtime action can provide easy access to Adobe cloud services, data, and content, orchestrate custom processes, and respond to Adobe I/O events.
I/O Runtime serverless platform is powerful, but we still need to be mindful of its limitations and find workarounds if necessary. Because of this, our implementations of asset export would differ (real-time vs. batch). We will talk more about them in the next sections.
Adobe App Builder
Adobe App Builder is a complete development framework that simplifies the process of building and deploying custom Adobe I/O serverless applications. It is strongly recommended to use App Builder to implement any web (React SPA) or headless (microservices) application that runs on the Adobe I/O serverless platform. App Builder provides the following developer tools and services out of the box:
- aio CLI
- SDKs for Adobe cloud services
- Cloud storage for data and files
- Seamless integration with Adobe IMS authentication
- Developer console
- CDN, etc.
To save development effort and follow best practices, our AEMaaCS assets export is a headless application that is initially bootstrapped from an App Builder template.
Now let’s take a look at how assets export can be implemented in an event-driven way powered by Adobe I/O events and actions.
In the Adobe I/O developer console, you can subscribe to any AEMaaCS event of interest; in our case, it is the “Asset Published” (asset_published) event we mapped earlier. The assets export runtime action can then be hooked up to that event directly as a webhook.
The I/O event flow and handling work as follows:
- An asset is published from AEM author to publisher.
- “Asset published” event is sent to I/O events.
- I/O events get notified and invoke the runtime action that is registered as the webhook. The event payload is passed securely to the assets export action.
- Assets export action parses the event payload to get the asset reference, then calls back to AEM author to get more metadata (which metadata depends on the particular use case; in our case, we use exportImmediately to determine whether to export now and exportDestination to determine the destination the asset should be exported to, such as Azure blob store, Marketo, etc.).
- Assets export action fetches asset binary from AEM author and ingests it into the downstream system based on the specified destination. With the factory pattern, it is easy to extend FileIngestor to support any 3rd party system as long as it has a public asset ingestion endpoint.
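The last two steps can be sketched as a small handler built around a factory. This is a minimal illustration under stated assumptions, not the code from the sample repository: the FileIngestor subclasses, createIngestor, handleAssetPublished, and the injected aem helper are hypothetical names, and the ingest methods only report what they would send instead of calling a real endpoint.

```javascript
// Base class: every destination implements ingest(name, binary).
class FileIngestor {
  async ingest(name, binary) {
    throw new Error('ingest() must be implemented by a subclass');
  }
}

// Hypothetical destinations. A real one would call the destination's
// asset ingestion endpoint instead of returning a summary.
class AzureBlobIngestor extends FileIngestor {
  async ingest(name, binary) {
    return { destination: 'azure', name, bytes: binary.length };
  }
}

class MarketoIngestor extends FileIngestor {
  async ingest(name, binary) {
    return { destination: 'marketo', name, bytes: binary.length };
  }
}

// Factory: maps the exportDestination metadata value to an ingestor class.
// Supporting a new destination means adding one subclass and one entry here.
const INGESTORS = new Map([
  ['azure', AzureBlobIngestor],
  ['marketo', MarketoIngestor],
]);

function createIngestor(exportDestination) {
  const Ingestor = INGESTORS.get(exportDestination);
  if (!Ingestor) {
    throw new Error(`Unsupported export destination: ${exportDestination}`);
  }
  return new Ingestor();
}

// aem is an injected helper (assumed) that talks to AEM author:
//   getMetadata(path) resolves to { exportImmediately, exportDestination }
//   getBinary(path)   resolves to a Buffer or Uint8Array
async function handleAssetPublished(assetPath, aem) {
  const metadata = await aem.getMetadata(assetPath);
  if (!metadata.exportImmediately) {
    return { exported: false, reason: 'exportImmediately is not set' };
  }
  const binary = await aem.getBinary(assetPath);
  const ingestor = createIngestor(metadata.exportDestination);
  const fileName = assetPath.split('/').pop();
  const result = await ingestor.ingest(fileName, binary);
  return { exported: true, result };
}
```

Keeping the AEM calls behind an injected helper makes the handler easy to exercise with stubs before it is wired into a Runtime action.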
As you can see, the whole event-driven flow happens in near real time. As soon as the asset is published from AEM author, the asset binary is ingested into the destination system. This works well for publishing individual assets, which is relatively small and fast. However, with bulk asset publishing (e.g., more than 100 assets), you may observe 429 errors in the I/O console’s Debug Trace.
This is because the concurrency limit of I/O Runtime (no more than 100 activations concurrently submitted per namespace) has been reached. In such a situation, Adobe I/O Events will retry the request up to 5 times using an exponential backoff strategy within the next 31 minutes. Ideally, during that window, the number of concurrent activations in your runtime namespace drops below the threshold; otherwise the export action/webhook is marked as disabled and all in-flight events go unhandled and are lost (events handled by webhook are fire-and-forget). If that can’t be tolerated in your application, consider the journaling approach.
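The figures of 5 retries within 31 minutes are consistent with a delay that starts at one minute and doubles on each attempt (1 + 2 + 4 + 8 + 16 = 31). The sketch below only illustrates that arithmetic; the one-minute starting delay is an assumption inferred from the numbers above, not a documented guarantee, and backoffDelays is a hypothetical helper.

```javascript
// Delay in minutes before each retry, doubling from an assumed first delay.
function backoffDelays(retries, firstDelayMinutes = 1) {
  return Array.from({ length: retries }, (_, i) => firstDelayMinutes * 2 ** i);
}

const delays = backoffDelays(5);
const totalMinutes = delays.reduce((sum, d) => sum + d, 0);
console.log(delays, totalMinutes);
```

Under that assumption, a burst that keeps the namespace above the limit for longer than about half an hour would exhaust every retry.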
I/O events journal is an ordered list of events. To consume it, you just need to use the journaling API endpoint.
Events are persisted in the journal for up to 7 days, and your action can be defined as a scheduled/cron job that reads and processes them at your own pace: 10 events hourly, 50 events daily, etc., entirely according to your needs. Unlike the webhook’s real-time event handling, journaling is better suited to batched asset processing and bulk exporting, and it offers more dependable event handling because events wait in the journal until your action is available to process them.
- One or multiple assets are published from AEM author to publisher.
- Bulk “Asset published” events are sent to I/O events and persisted in the I/O journal.
- A scheduled/cron assets export action fetches events (with event payloads) from the journal regularly and processes them. So that the next action run can pick up where the previous one left off, the aio-lib-state SDK is used to store the last position reached in the journal.
- For each event fetched, similar to the webhook approach, assets export action gets the asset reference and calls back to the AEM author to get more metadata.
- For each asset, assets export action fetches asset binary from AEM author and ingests it into the downstream system.
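The pull loop above, including how the journal position is carried from one run to the next, can be sketched as follows. This is an illustration under stated assumptions rather than the sample repository's code: runBatch, POSITION_KEY, and the injected journal reader with its fetchEvents(since, limit) method are hypothetical, and state is any object with get(key) resolving to { value } (or undefined when the key is missing) and put(key, value), which is the shape aio-lib-state exposes.

```javascript
const POSITION_KEY = 'journal-position'; // hypothetical state key

// journal:     injected reader (assumed) whose fetchEvents(since, limit)
//              resolves to an ordered array of { position, event }
// state:       key-value store with get(key) and put(key, value)
// exportAsset: async function handling one event (metadata lookup + ingestion)
// limit:       how many events a single run processes, e.g. 10 for an hourly job
async function runBatch(journal, state, exportAsset, limit = 10) {
  const saved = await state.get(POSITION_KEY);
  // No saved position means this is the first run: start from the oldest event.
  const since = saved ? saved.value : undefined;

  const entries = await journal.fetchEvents(since, limit);
  let processed = 0;
  for (const { position, event } of entries) {
    await exportAsset(event);
    // Persist after each event so a failed run resumes at the first
    // unprocessed event instead of repeating the whole batch.
    await state.put(POSITION_KEY, position);
    processed += 1;
  }
  return processed;
}
```

In a real action, state would come from the aio-lib-state SDK and journal would wrap the journaling API endpoint. The put call of aio-lib-state also accepts a TTL option, which is worth setting so the stored position outlives the gap between scheduled runs.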
As you can see, webhook (event push model) and journaling (event pull model) are both powerful event handling mechanisms. Usually, for use cases that require real-time handling with short-lived actions, and where guaranteed event handling is not critical (events can be recovered by retries or subsequent runs), webhook is a great fit. On the other hand, if handling of your events must be guaranteed and/or events are usually emitted concurrently at large scale, journaling is the direction to look into.
Full Adobe I/O sample code for this use case can be referenced here https://github.com/kelvinxzf/aemaacs-firefly.