Zone’s head of .NET development, Andy Butland, on why we should avoid the trap of always building additional features directly on to the web application…
At the heart of many of the technical solutions we have built at Zone for our clients is a content management system (CMS), usually one of the Episerver, Umbraco or Sitecore triumvirate. Depending on the requirements of the application we are building it’s not always a necessary component, but any time clients need an intuitive, feature-rich user interface to manage their own content, we’ll usually reach for a Scandinavian-born, .NET-based CMS platform of one form or another.
Given these platforms are particularly extensible, there’s a temptation to build each additional feature required by the application on, and as part of, the CMS/web application itself. Perhaps there’s a need for some data to be presented that isn’t generated by the editors themselves; if so, we could import it and store it as content within the CMS. Or maybe we need to expose some data for mobile or other clients, so we look to build an API as an additional feature of the website itself. There may also be a need for integrations with other systems which, if they expose an appropriate interface, we can call as needed, server- or client-side, directly from the web application.
There are certainly good reasons why we might build things this way; it may even be the only sensible approach for a given requirement. And it’s tempting to do so, as we already have the necessary hosting and deployment infrastructure in place. We should question it though, and not fall into the trap of building directly onto the web application simply because it’s initially the easiest approach to take.
I first saw this note of caution a couple of years ago, as an entry in the Thoughtworks Technology Radar, a twice-yearly publication in which the company summarises its view of different trends in technology, assessing whether particular techniques and tools should be adopted, trialled, assessed or put on hold. A key quote from this entry was:
“While we are very supportive of providing content producers with the right tools and workflows, for applications with complex business logic we tend to recommend treating your CMS as a component of your platform… …cooperating cleanly with other services, rather than attempting to implement all of your functionality in the CMS itself.”
In this article I’m going to discuss some examples of different ways we’ve considered and adopted (or sometimes rejected) this approach in recent CMS projects at Zone.
Going headless
If there’s a bandwagon in the CMS space at the moment, it’s almost certainly the idea of headless. And a justified one, as the approach brings a lot of potential benefits. Traditionally a CMS platform will handle both the content management and the presentation layers of a website experience, but with headless, the focus is just on the former. A headless CMS will provide a back-office interface for editors, along with features like workflows and version control, but avoids any concern with the presentation of the website via templates. Instead, it will provide an API that another component — which might be a mobile app, a chat bot or a separate website application — reads and takes the responsibility for rendering the content in the appropriate way.
We’re seeing a number of relatively new players coming into this space — such as Contentful, which we use for the Zone website — as well as existing providers, including Sitecore, Episerver and Umbraco, all releasing features or additional products to address this need. In doing so, they expand the reach of the platform into other developer ecosystems — even if the CMS is built in .NET, you don’t need to be a .NET developer to build a website using it.
Whether it’s the right approach to take for a given project depends a lot on how the information managed in the CMS will be consumed by the end users. If this is not a “typical” website using server-side templating — instead being perhaps a single page application (SPA), a generated static site, or for some other client like a mobile app or chat/voice interface — then the separation of concerns this provides is very valuable. We’re able to independently make changes and deploy different client applications, working against a stable and consistent content API.
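To make the shape of this concrete, here is a minimal sketch of what a client of a headless content API does: fetch a raw content entry and map it into a flat view model for rendering. It’s written in Python purely for brevity (a real client on this stack would more likely be C# or a JavaScript SPA), and the payload shape is a simplified, hypothetical one — each platform (Contentful, and the headless offerings from Sitecore, Episerver and Umbraco) has its own schema and SDKs.

```python
# Sketch: mapping a headless CMS delivery-API payload to a view model.
# The "sys"/"fields" shape below is illustrative, not any platform's real schema.

def map_entry(payload: dict) -> dict:
    """Flatten a raw content entry into a model a client can render."""
    fields = payload.get("fields", {})
    return {
        "id": payload["sys"]["id"],
        "title": fields.get("title", ""),
        "body": fields.get("body", ""),
        "updated": payload["sys"].get("updatedAt"),
    }

# A sample payload, standing in for the JSON returned by the content API.
sample = {
    "sys": {"id": "article-1", "updatedAt": "2019-05-01T10:00:00Z"},
    "fields": {"title": "Hello", "body": "Content from the CMS."},
}
model = map_entry(sample)
```

The point of the separation is that any number of clients — website, mobile app, chat bot — can consume the same stable payload and own their rendering independently.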
Where a dynamic, search-engine-optimised, public website is being created though, we have to weigh up these benefits against what we lose by taking on responsibility for the presentation layer in a separate application we need to write and maintain — namely the sophisticated templating, preview and content retrieval features provided by the “traditional” CMS. Even here though, a hybrid approach is possible: one we’ve used in our work with Greene King, where we have an Umbraco CMS with standard server-side templating, but with islands of more sophisticated user interfaces provided by single page applications using custom APIs exposed from the CMS content.
Heterogeneous data sources
Another area where we may look to implement part of a website solution outside of the CMS itself is when it comes to data that isn’t directly generated as content by the editors themselves. In other words, the “record of truth” for the information isn’t the CMS, but instead is some other database that either may already exist, or might be something to be built and maintained independently.
We broadly have two approaches here: either build some form of on-demand or scheduled import component to create copies of the data as content in the CMS; or keep the data mostly in the original system, perhaps importing just an identifier field, such that we can link it with content that is created in the CMS.
We often find this situation where we need to present some “factual” information — held in an internal client system, taking the form of detailed data such as prices, addresses and scores — alongside “marketing” content, such as images and descriptive text, that is managed in the CMS.
Often the decision between importing and referencing isn’t an easy one to make, as there are a number of factors to consider. Firstly, if the information isn’t held in the CMS, it still needs to be accessible when pages are being rendered, so that we can present both the CMS-held marketing content and the external factual data on a single page. As my colleague Peter has discussed, we achieved this for the Greene King website in the presentation of the venue detail pages by utilising a separate MongoDB database, populated via a scheduled process from a feed provided by the client’s systems. On page render, we use a common identifier field to retrieve the appropriate data for the page being rendered and present all the details together.
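The render-time join described here is simple in essence: the two stores share a common identifier, and the page model is the union of what each holds for it. A minimal Python sketch (the names, fields and in-memory dictionaries are illustrative stand-ins; in the real solution the factual side is a separate MongoDB database):

```python
# Sketch: joining CMS "marketing" content with external "factual" data
# on a shared identifier at page-render time. All data here is illustrative.

cms_content = {
    "venue-42": {"headline": "A cosy riverside pub", "image": "/img/venue-42.jpg"},
}

factual_store = {  # stands in for the separate, feed-populated database
    "venue-42": {"address": "1 River Lane", "opening_hours": "11:00-23:00"},
}

def render_model(venue_id: str) -> dict:
    """Build a single view model for the template from both sources."""
    marketing = cms_content.get(venue_id, {})
    facts = factual_store.get(venue_id, {})
    return {**marketing, **facts}
```

The appeal of this approach is that the factual store can be refreshed on its own schedule without touching CMS content or creating content versions.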
If the data isn’t accessible in this way though, and building and maintaining a separate database doesn’t appeal, we might consider importing into the CMS, but have to be careful to understand how the content will be treated “under the hood” by the platform. The particular concern here is versioning. Normally with a CMS platform, each change to content made by editors is maintained as a separate version, allowing us to preview, roll back and perhaps personalise the content by presenting different experiences to different user groups. If we’re importing data from another system that is the source of truth though, we likely don’t want to maintain versions within the CMS — in fact, if the data changes often, doing so might perform poorly. There’s also a danger of contention between editor changes and import-driven amends. For example, if we’re allowing editors to augment the data with marketing information, we need to be careful we don’t prematurely publish, or even lose, their updates as part of the scheduled import and publish operation.
In a recent project for one of the UK’s largest housebuilders, we had exactly this trade-off to make with regard to the details of prices and configuration of the various houses they wanted to present for sale on their website, alongside images and descriptions that were to be managed by editors in the CMS. Here we were using the Sitecore CMS, which has the concept of unversioned (called “shared”) fields. It also supports a component-based architecture, which allowed us to cleanly separate the “factual” and “marketing” data, and hence the changes made by automated updates from those made by editors. Given that, we felt the import approach was the best one to use here. The CMS features allowed us to work around the downsides, and by having all content in the CMS, the downstream processing — page presentation, content indexing and searching, caching and invalidation — was more straightforward, as we only had a single data source to contend with.
As a further example, we recently added a new feature to the suite of websites we build and support for Electrolux, using the Episerver CMS. They had partnered with a knowledge management specialist, Comaround, to provide contact centre staff with a portal for creating, searching and accessing support articles when they are in telephone conversations with customers. They wanted to make this content available to customers directly, but the portal wasn’t ideal for this purpose: although styled similarly to the websites, it would be clear to site visitors that they were jumping to a separate site. They would also lose much of the SEO benefit of the content, as it would be hosted on a separate domain.
Whilst we explored the idea of a content provider — an Episerver feature that allows content from other sources to be presented “as if” it’s managed locally — we didn’t find this worked well for the content in multiple languages that was required here. As we again wanted the benefits of caching, indexing, template rendering and SEO-friendly URLs, which we’d get most straightforwardly if the content was imported into Episerver, we opted for that approach. A scheduled job runs, querying the Comaround API and creating or updating pages as appropriate. From that point on, the content can be treated for rendering just the same as any created by editors in the CMS.
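An import job like this wants to be idempotent, and (given the versioning concerns above) to save a page only when the source article has actually changed, avoiding needless version churn. A hedged Python sketch of that shape — `fetch_articles` and the page store are stand-ins, not the real Comaround or Episerver APIs:

```python
# Sketch: an idempotent scheduled import with change detection.
# Only writes (and so only creates a new content version) when the
# source article's content has actually changed.

import hashlib
import json

def content_hash(article: dict) -> str:
    """Stable digest of an article's content, for change detection."""
    return hashlib.sha256(json.dumps(article, sort_keys=True).encode()).hexdigest()

def run_import(fetch_articles, pages: dict) -> int:
    """Upsert fetched articles into the page store; return number of writes."""
    writes = 0
    for article in fetch_articles():
        key = article["id"]
        digest = content_hash(article)
        existing = pages.get(key)
        if existing and existing["hash"] == digest:
            continue  # unchanged since last run: skip the save entirely
        pages[key] = {"hash": digest, "data": article}  # create or update
        writes += 1
    return writes
```

Running the job twice against unchanged source data performs no writes on the second pass, which is exactly the property that keeps a frequently scheduled import from flooding the CMS with identical versions.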
Asynchronous processing
When we have a transactional website that’s doing more than just presenting content, there’s value in breaking down what needs to happen before a user receives a response following their form post, and what can potentially happen afterwards. The classic example here comes from e-commerce. When an order is placed, there are likely some checks we need to do in real-time — validating the details provided, perhaps making inventory checks — that we’ll do before accepting the order and providing a confirmation to the customer that it has been accepted. There may be others though, such as booking a delivery slot, that either can or must be handled at a later time.
It’s for those types of tasks that we can look to break another piece off our CMS/web application monolith and handle it in a separate component, as an asynchronous process.
In essence, our transaction logic on the web application itself becomes much simpler. We carry out some quick validation, ensure the user’s details are safely stored — likely in a queue — and return a confirmation to the user. We can then build and deploy a separate component such as a serverless function that monitors this queue and either carries out or triggers the necessary additional steps.
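The flow just described can be sketched in a few lines. This is illustrative Python (on this stack the real thing would more likely be C#, with a durable queue such as an Azure storage queue and a serverless function as the worker; here a thread and an in-process queue stand in for both):

```python
# Sketch: validate quickly, enqueue, confirm; a separate worker drains
# the queue and performs the slower downstream processing.

import queue
import threading

submissions = queue.Queue()  # stands in for a durable message queue
processed = []               # stands in for the downstream systems

def handle_form_post(form: dict) -> dict:
    """The web handler: quick validation, enqueue, immediate confirmation."""
    if not form.get("email"):
        return {"accepted": False, "error": "email required"}
    submissions.put(form)
    return {"accepted": True}  # the user isn't kept waiting

def worker():
    """Stands in for the separate component monitoring the queue."""
    while True:
        form = submissions.get()
        if form is None:  # shutdown signal for this sketch
            break
        processed.append(form)  # slower work happens here, after the response

t = threading.Thread(target=worker)
t.start()
result = handle_form_post({"email": "a@example.com"})
submissions.put(None)
t.join()
```

The handler returns before the downstream work runs; the queue is the only coupling between the two halves, which is what lets them be deployed and scaled independently.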
We do have some additional complexity to handle here in informing the user of updates to their transaction after they’ve received the confirmation. For very quick updates, we might use something like SignalR/web sockets to notify the user via their browser that a change has occurred. For longer timescales, we’d likely want to use email or some other form of communication. There are also edge cases to handle, such as unexpected issues downstream in the transaction processing, that mean rolling back earlier changes and sending further notifications.
But by doing this we gain benefits too. We don’t need to couple the deployment of changes to the transaction processor with the website itself. We can also scale independently — if there’s a spike in transactions, we can, ideally in an automated way, spin up additional instances of our specific processing components without having to do this across the solution as a whole. We also provide a more responsive experience to our users as they don’t have to wait for all the processing to complete before they get confirmation that their form submission has been accepted.
With the aforementioned housebuilder’s websites, users can transact in various ways via a set of forms, for example downloading brochures, requesting a call back or booking an appointment to view a development at a specific date and time. When analysing these, it’s clear that for the latter, we’ll probably want to do the processing synchronously, in real time. It’s likely the user wants to know they can definitely show up at the requested location and time, and accepting the request but later rejecting it wouldn’t be a good experience to provide. We also want to be sure we don’t accept more coincident appointments than a sales adviser can cope with.
For the others though, there’s no real reason why we need to make the user wait whilst we update the necessary internal systems to register the brochure or callback request. Here we can simply accept the request, ensure it is valid and return to the user, carrying out the rest of the work via decoupled components.
Background tasks
Another area of functionality we could implement on our web application, but may not have to, is background tasks. With tools like Hangfire, we have the facility to reliably run workloads that aren’t triggered by web requests, but if the work doesn’t depend, at least in part, on running within the context of the web application or CMS, we can reduce complexity and coupling by hiving it off into separate components.
Again with the housebuilder’s example, we had a need to provide a set of XML feeds to various third party property portals, who would show the houses for sale on their websites. The content of the XML feeds was sourced from the CMS — partly from marketing content prepared by editors and partly factual details like prices that have been imported into the CMS from external systems.
We decoupled this by utilising the Solr search index that, when configured for use with Sitecore, is populated automatically as changes to content are made. This then became the data source for our portal feed generation component, which was implemented using a set of Azure durable functions. Durable functions are a serverless offering, but rather than handling single, short-lived tasks, they can be strung together to carry out longer-running, linked operations. So here we had a number of steps: importing from Solr into Azure table storage using direct Solr queries, aggregating data across developments, preparing the XML feeds for the different providers and finally distributing them via individual blob storage containers.
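The steps above amount to a chain of small, single-purpose functions. A hedged sketch of that pipeline in Python (the durable functions runtime adds the checkpointing, retries and fan-out that this simple chain lacks, and the data, field names and XML shape here are illustrative, not the real feed formats):

```python
# Sketch: the feed-generation pipeline as a chain of simple steps:
# import from the search index, aggregate by development, build the XML feed.

from xml.etree.ElementTree import Element, SubElement, tostring

def import_from_index(index_rows):
    """Stands in for querying Solr and landing rows in working storage."""
    return [dict(row) for row in index_rows]

def aggregate_by_development(rows):
    """Group individual plots under their parent development."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["development"], []).append(row)
    return grouped

def build_feed(grouped) -> bytes:
    """Render the grouped data as an XML document for a portal."""
    root = Element("developments")
    for name, plots in grouped.items():
        dev = SubElement(root, "development", name=name)
        for plot in plots:
            SubElement(dev, "plot", price=str(plot["price"]))
    return tostring(root)

rows = [
    {"development": "Riverside", "price": 250000},
    {"development": "Riverside", "price": 300000},
]
feed = build_feed(aggregate_by_development(import_from_index(rows)))
```

Because each step takes the previous step’s output as input, the stages can be developed, tested and retried independently, which is the property the durable functions orchestration exploits.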
Breaking off micro-services
Whether it’s an existing application we’re updating or a new one we’re planning, in all these examples of moving away from CMS as a platform, we’re effectively following the approach of deconstructing the monolithic application into micro-services (another hot topic in software architecture). With a CMS-centric solution, particularly one still involved with the presentation layer, we won’t get all the way to a micro-service architecture of course, but we can still learn from this approach and gain benefits by breaking pieces off into separate, independently built and deployed components.
As well as arguably reducing complexity by having more, but simpler, components to work with, and adhering to separation of concerns at an architectural level, the benefit we get here is decoupling and independence. Deploying a complex CMS solution is often not a straightforward task and, much as we might like to get to a continuous deployment scenario, it’s a hard thing to achieve in practice. The reason is that it’s not just code we’re deploying. There are all the visual and styling aspects which, while not impossible, are difficult to regression test automatically. We often have to deploy changes to CMS structure along with code, create required content for new features to use, and run re-index or bulk publish operations. Given that, for many projects, “release day” hasn’t gone away yet.
Of course we can and should look to make the CMS release as automated and pain-free as possible, but given the constraints there, if we can take the chance to independently amend, build and deploy part of the solution without having to update the CMS and web application itself, we’ll likely benefit by being more responsive to our users and stakeholders.
We’ve taken this approach with our work with Electrolux, systematically removing and re-implementing certain features that previously existed as part of the CMS, and planning new ones, as independent micro-services. The details of products, imported from internal systems and presented on the website, are held in a separate database, and exposed for reads and updates via a REST API, which is used by the website and is also available for other systems. When integrating with an event booking service, rather than using the provider’s API directly, we built a micro-service that encapsulates that integration, and exposes its own, much simpler API with appropriately cached responses, for access from the web platform.
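The event-booking wrapper illustrates a common micro-service shape: the service owns the integration with the provider and exposes a simpler interface with cached responses. A minimal Python sketch of that caching wrapper (the class, method names and TTL are illustrative assumptions, not the real Electrolux service or the provider’s API):

```python
# Sketch: a service wrapping a third-party API behind a simpler interface,
# caching responses for a short period so the web platform stays fast and
# isn't coupled to the provider's availability or rate limits.

import time

class CachedEventService:
    def __init__(self, provider_fetch, ttl_seconds: float = 300.0):
        self._fetch = provider_fetch   # the expensive upstream call
        self._ttl = ttl_seconds
        self._cache = {}               # key -> (expiry time, cached value)

    def get_events(self, location: str):
        """Return events for a location, serving from cache when fresh."""
        now = time.monotonic()
        hit = self._cache.get(location)
        if hit and hit[0] > now:
            return hit[1]              # cache hit: no upstream call made
        value = self._fetch(location)  # cache miss: call the provider
        self._cache[location] = (now + self._ttl, value)
        return value
```

From the web platform’s point of view there is now one simple, fast call; the messy details of the provider integration live entirely inside this one component and can change without a website deployment.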
While our CMS platforms give us a great blend of out-of-the-box features and extensibility, it’s important not to consider them the hammer for every nail, nor the home for every feature of our website solutions. I’ve discussed here a number of ways in which we have looked to break off pieces of data and functionality, implement them as decoupled components and gain benefits in responsiveness, maintainability and performance.