Creating an amazing space for audience participation!
At the BBC, we put audiences at the heart of everything we do, so it’s important they feel they have a safe space to interact with all of our online products. Whether it’s a child uploading a picture for their favourite CBBC show, or an avid fan of current affairs sharing their thoughts on the latest breaking news story, the BBC is built to give a voice to everybody.
At the core of this is moderation. As text, pictures, videos and audio are submitted by the public to the BBC, we need to make sure our platform is an inclusive, enjoyable, and safe place to be, not only for those of you at home browsing our online presence but also for the moderators who sift through user-submitted content.
Although often underappreciated, moderation is a fascinating area, both technically and philosophically. As well as offering challenging implementation problems, it raises the question of what should be circulated or censored online.
The purpose of this article is to explore the BBC’s moderation platform and explain what goes on behind the scenes. As this is an engineering blog, we will include some technical information, but it will stay fairly high level, so non-developers shouldn’t be discouraged!
What exactly does the moderation platform do?
There are plenty of places for audiences to submit content to the BBC. However, not all of this content is suitable for BBC platforms. The job of moderation is to take submissions from the public, assess risk, surface content to moderators, then communicate decisions out to commenting systems, the account team, our uploader tool, or any other moderation service consumers.
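To make that flow a little more concrete, here is a minimal in-memory sketch in Java. It is not our production code: the class, record and method names (ModerationFlow, QueuedItem, Decision and so on) are invented for illustration, the risk assessor is a toy function, and the real platform is spread across several services rather than a single class.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;
import java.util.function.ToDoubleFunction;

// Illustrative only: every name here is a hypothetical stand-in.
final class ModerationFlow {

    enum Outcome { APPROVED, REJECTED }

    record Submission(String id, String clientId, String text) {}
    record QueuedItem(Submission submission, double risk) {}
    record Decision(String submissionId, String clientId, Outcome outcome) {}

    private final ToDoubleFunction<Submission> riskAssessor;
    private final Queue<QueuedItem> moderatorQueue = new ArrayDeque<>();
    private final List<Consumer<Decision>> consumers = new ArrayList<>();

    ModerationFlow(ToDoubleFunction<Submission> riskAssessor) {
        this.riskAssessor = riskAssessor;
    }

    /** Registers a consumer (a commenting system, the uploader tool, ...). */
    void subscribe(Consumer<Decision> consumer) {
        consumers.add(consumer);
    }

    /** Takes a submission, assesses its risk and queues it for a moderator. */
    void submit(Submission submission) {
        double risk = riskAssessor.applyAsDouble(submission);
        moderatorQueue.add(new QueuedItem(submission, risk));
    }

    /** Returns the next item for a moderator, or null if the queue is empty. */
    QueuedItem next() {
        return moderatorQueue.poll();
    }

    /** Communicates the moderator's decision to every registered consumer. */
    void decide(QueuedItem item, Outcome outcome) {
        Decision decision = new Decision(
                item.submission().id(), item.submission().clientId(), outcome);
        consumers.forEach(consumer -> consumer.accept(decision));
    }

    public static void main(String[] args) {
        // Toy assessor, assumed purely for the example.
        ModerationFlow flow = new ModerationFlow(
                s -> s.text().toLowerCase().contains("spam") ? 0.9 : 0.1);
        flow.subscribe(d -> System.out.println(
                d.clientId() + " notified: " + d.submissionId() + " " + d.outcome()));

        flow.submit(new Submission("s-1", "sport-comments", "Great match!"));
        QueuedItem item = flow.next();
        flow.decide(item, Outcome.APPROVED);
    }
}
```

The sketch keeps the human moderator in the loop: the risk score travels with the queued item as extra information, and the decision itself is made by whoever calls decide.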
Why we’re rebuilding moderation
One of the best things about engineering is the pace at which technology moves. As languages, frameworks and platforms sprint forward, new and exciting possibilities open up. In order to embrace these changes, it was decided to move on from the old engine that had done so well for so many years, and instead build something new.
How we’re going about it
One of the most brilliant and most challenging things about starting a new project is choosing the technologies to work with. At the BBC there is a definite preferred stack, so some choices were made for us.
Initially, we’re coding in Java using Spring Boot. This allows us to rapidly spin up APIs, and the wider Spring framework gives us access to some really useful tools and patterns. The front end as accessed by moderators is developed in Node and React, with work going ahead to implement WebSockets, allowing for a more responsive experience.
In terms of infrastructure, we’re mainly utilising Amazon Web Services, where we work closely with their client teams to discuss best practices, performant patterns and their latest offerings. We’ve employed a lot of the more basic functionality, but also had the chance to deep dive into products like Step Functions for serverless service orchestration. This is particularly useful when wiring up our risk assessors and content enrichers, something we’ll touch on later.
Let’s construct these factors into a (very) rough diagram of the system architecture.
One of the more tricky technical issues we’ve faced is the development of a client agnostic submission model. The BBC has a rich tapestry of services, and we need to build a way of integrating with all of them. Comments on sports articles, photographs for CBBC campaigns and even display names for accounts all need to filter through the same system, so deciding how best to structure our data was a really interesting challenge.
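As a rough illustration of what a client-agnostic model can look like, the Java sketch below wraps every submission in the same envelope and pushes anything client-specific into a free-form metadata map. The field names, client identifiers and content types are assumptions made up for this example, not our actual schema.

```java
import java.util.Map;

// Hypothetical envelope: names and fields are illustrative assumptions.
final class SubmissionModel {

    enum ContentType { TEXT, IMAGE, VIDEO, AUDIO }

    /** One shape for every client; client-specific details live in metadata. */
    record Submission(String id, String clientId, ContentType type,
                      String payload, Map<String, String> metadata) {
        Submission {
            if (id == null || id.isBlank()) {
                throw new IllegalArgumentException("id is required");
            }
            if (clientId == null || clientId.isBlank()) {
                throw new IllegalArgumentException("clientId is required");
            }
            if (type == null) {
                throw new IllegalArgumentException("type is required");
            }
            if (payload == null) {
                throw new IllegalArgumentException("payload is required");
            }
            // Defensive copy so the envelope stays immutable; null means "no metadata".
            metadata = metadata == null ? Map.of() : Map.copyOf(metadata);
        }
    }

    public static void main(String[] args) {
        // Three very different clients, one model.
        Submission comment = new Submission("s-1", "sport-comments",
                ContentType.TEXT, "Great match!", Map.of("articleId", "a-42"));
        Submission displayName = new Submission("s-2", "accounts",
                ContentType.TEXT, "FootieFan99", Map.of("field", "displayName"));
        Submission photo = new Submission("s-3", "cbbc-uploader",
                ContentType.IMAGE, "uploads/example.jpg", null);

        for (Submission s : new Submission[] { comment, displayName, photo }) {
            System.out.println(s.clientId() + " -> " + s.type());
        }
    }
}
```

The trade-off in a design like this is that the shared fields are strictly validated while the metadata map is deliberately loose, so a new client can be on-boarded without changing the core model.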
Testing is also very close to our hearts, and the complexity of the system has seen the team adopt a plethora of new tools. One of our main obstacles is testing the system end to end, which has meant adapting a functional testing framework to integrate with our serverless components.
Additionally, this is a system whose traffic will only ever increase. As more sites and services are on-boarded, we need to be able to deal with a wide range and a large quantity of data. Getting all of the different pieces of AWS infrastructure to function effectively under heavy load has required a high level of developer engagement, and a lot of learning.
Technology is great, but processes are also crucial. The entire operation has been underpinned by the constant and conscientious adapting of our Agile workflow. Due to COVID it’s been an undeniably difficult time, putting a variety of pressures on all members of the team. By constantly reviewing our ways of working we’ve been able to ease pressure on those who needed it, whilst keeping the project moving forward at an impressive pace.
Automated moderation assistance
So far we’ve offered a brief insight into the development processes and technology behind the moderation re-platforming. However, before we go, I thought we should highlight one of the most interesting facets of the new system.
Previously we mentioned the idea of risk assessors and content enrichers. ‘Risk assessor’ is the term we give to services that provide extra information about the suitability of a submission; a common example would be highlighting potential profanity. A content enricher supplies supplementary information about the content itself, such as a transcript of submitted audio or text pulled from a video.
Employing a service orchestration tool such as AWS Step Functions allows us to easily expand our range of risk assessors and content enrichers to incorporate the latest offerings from a number of suppliers, including our very own R&D department. The automated detection of inappropriate images or hate speech becomes a simple plug-in, and we can easily transcribe audio or pull text from video.
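The plug-in idea can be sketched in plain Java. To be clear, this is a stand-in and not how Step Functions is wired up: in the real system each assessor or enricher is a separate service and the orchestration lives in AWS. The interfaces, the profanity word list and the word-count enricher below are all hypothetical, chosen only to show that adding a new capability means adding one more entry to a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java stand-in for orchestration; all names here are hypothetical.
final class AssessorPlugins {

    record Flag(String assessor, String reason) {}
    record Report(List<Flag> flags, Map<String, String> enrichments) {}

    /** Adds information about how suitable a submission is. */
    interface RiskAssessor {
        List<Flag> assess(String text);
    }

    /** Adds supplementary information about the content itself. */
    interface ContentEnricher {
        Map<String, String> enrich(String text);
    }

    /** Toy profanity assessor driven by a placeholder word list. */
    static RiskAssessor profanity(Set<String> blockedWords) {
        return text -> {
            List<Flag> flags = new ArrayList<>();
            for (String word : text.toLowerCase().split("\\W+")) {
                if (blockedWords.contains(word)) {
                    flags.add(new Flag("profanity", "matched '" + word + "'"));
                }
            }
            return flags;
        };
    }

    /** Toy enricher that records how many words the text contains. */
    static ContentEnricher wordCount() {
        return text -> {
            int count = text.isBlank() ? 0 : text.trim().split("\\s+").length;
            return Map.of("wordCount", String.valueOf(count));
        };
    }

    /** Runs every plug-in in turn and merges the results into one report. */
    static Report run(String text, List<RiskAssessor> assessors,
                      List<ContentEnricher> enrichers) {
        List<Flag> flags = new ArrayList<>();
        for (RiskAssessor assessor : assessors) {
            flags.addAll(assessor.assess(text));
        }
        Map<String, String> enrichments = new LinkedHashMap<>();
        for (ContentEnricher enricher : enrichers) {
            enrichments.putAll(enricher.enrich(text));
        }
        return new Report(flags, enrichments);
    }

    public static void main(String[] args) {
        Report report = run("What a darn good goal",
                List.of(profanity(Set.of("darn"))),
                List.of(wordCount()));
        System.out.println(report);
    }
}
```

An image or hate speech detector from another supplier would slot in as another RiskAssessor, and a transcription service as another ContentEnricher, without the run method changing.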
We have given a whirlwind introduction to moderation at the BBC. At a very high level, we have discussed our technologies and processes, illustrating some of the patterns that make our tool a flexible, forward-looking project.