What are radio programmes made of? Getting more data out of radio production

Chris Roberts
BBC Product & Technology
Oct 5, 2018

The BBC makes hundreds of hours of radio content every week. We use in-house producers and independent production companies to create this content. A lot of our radio content is broadcast live from our own studios. We want to find out how we can best serve the public by creating more, and better-quality, programme metadata from those studios.

Creating Metadata

The definition of metadata is “data which gives information about another set of data”. In the context of BBC Radio programmes, the set of data in question is the audio that you listen to. The metadata is a list of facts about that audio. It can be as simple as the name of the programme, or as complex as the topics being spoken about, related topics, or details about the music being played.

In digital photography, each photograph has metadata embedded within it (Exif metadata). A lot of this metadata is technical detail which tells us how the image was captured. One example is how wide the aperture in the lens was (the f-stop); another is what the shutter speed was (in fractions of a second). It is impossible (or at least wildly implausible) to derive these measurements from the actual photograph. The simplest way of capturing this data is to record the settings from the camera itself.
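
This kind of capture is easy to see in practice. Below is a minimal sketch of reading those two settings from a JPEG’s Exif data, assuming a reasonably recent version of the Pillow library (the filename is a placeholder):

```python
# A minimal sketch: read aperture and shutter speed from Exif data
# using Pillow ("photo.jpg" is a placeholder).
from PIL import Image
from PIL.ExifTags import TAGS

image = Image.open("photo.jpg")
exif = image.getexif().get_ifd(0x8769)  # 0x8769 points at the Exif sub-IFD

for tag_id, value in exif.items():
    name = TAGS.get(tag_id, hex(tag_id))
    if name in ("FNumber", "ExposureTime"):  # aperture and shutter speed
        print(name, value)
```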

The BBC has a wealth of metadata about programmes. The structure of a programme is especially interesting: the constituent parts that make it up, and the metadata that relates to those parts. We call these Segments, and James Calendar has written about the Segments API that we are using to store them. Radio 4’s magazine programmes, such as Woman’s Hour or You and Yours, can have segments in the form of “chapters”. Music-focussed programmes like Radio 1’s Future Sounds with Annie Mac may have a mixture of music tracks and interviews, or an exclusive guest DJ mix.
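
To illustrate the idea (this is a sketch, not the actual Segments API schema), a programme’s structure might be represented as a list of typed segments:

```python
# An illustrative sketch of a programme broken into typed segments.
# The field names and values are invented for illustration.
from dataclasses import dataclass

@dataclass
class Segment:
    kind: str            # e.g. "music" or "chapter"
    title: str
    start_offset: float  # seconds from the start of the programme
    duration: float      # seconds

programme = [
    Segment("chapter", "Interview: consumer rights update", 0.0, 540.0),
    Segment("music", "Artist - Track Title", 540.0, 210.0),
    Segment("chapter", "Phone-in: listeners' questions", 750.0, 930.0),
]
```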

Some of these segments are already created automatically: music tracks, for example, are created by our audio playback systems. Other types of segment, however, must be entered manually. Whilst there are internal tools available for producers to enter this data by hand, the additional workload is significant, and our producers have already captured much of this information in other forms while planning their programmes. But programmes sometimes diverge from their plan, often in well-defined ways, so simply taking the planned content as truth is not necessarily an accurate representation of the programme that was actually created and published.

It’s clear that we need a mix of data about what was planned, as well as what actually happened, in order to construct an accurate set of metadata about a programme.

The “what-actually-happened” aspect of this has been somewhat elusive until relatively recently. Modern audio production technology, such as audio mixers and audio playback systems, can now send telemetry data over an IP network about what it is currently doing. Telemetry data gives us measurements about the operation of those systems. The technique is widely used in other fields: in aircraft, for example, measurements from sensors and controls are stored in a “black box” flight recorder, which can be used to reconstruct the exact situation that occurred in the case of an accident or fault, in order to understand how it came about. Or, like the digital camera mentioned earlier, we know the state of the machine which created the audio, and we can record that.
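
To make that concrete, here is a hypothetical sketch of consuming such telemetry. The message format, field names, and threshold are invented for illustration; real mixers expose this through protocols such as Ember+ or vendor-specific APIs.

```python
# A hypothetical sketch of interpreting studio telemetry events
# arriving as JSON messages. All fields shown are invented.
import json

def handle_event(raw: bytes) -> None:
    event = json.loads(raw)
    # e.g. {"source": "Mic 1", "type": "fader", "level_db": -12.5,
    #       "timestamp": "2018-10-05T10:31:04Z"}
    if event["type"] == "fader" and event["level_db"] > -60.0:
        print(f'{event["timestamp"]}: {event["source"]} is live on air')
```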

The challenge is converting this telemetry data into meaningful metadata that describes the content of a programme, rather than simply how the studio operator created it. For that, we need to look at the audio signals themselves.

Deriving Metadata

Whilst the telemetry data can tell us how the programme was created, the audio itself carries information about the programme’s content in the form of speech. The advent of increasingly accurate speech-to-text algorithms has made the prospect of deriving metadata from an audio signal much more approachable.
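
As a rough illustration of the building block involved, the sketch below transcribes a WAV file with the open-source SpeechRecognition library; this is not our production pipeline, and the filename is a placeholder.

```python
# A minimal sketch of producing a transcript from an audio file
# using the SpeechRecognition library ("segment.wav" is a placeholder).
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("segment.wav") as source:
    audio = recognizer.record(source)

transcript = recognizer.recognize_google(audio)  # cloud-backed engine
print(transcript)
```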

Using audio interfaces to the studio, we have access to every individual audio source as a stream on an Audio over IP network, and therefore have the luxury of being able to isolate each source and process it separately. This means we can improve the accuracy of the algorithms by supplying them with clean audio of just speech, with no background “noise” (or music, as we sometimes call it). Just like trying to understand a conversation with a friend in a noisy pub or bar, speech-to-text algorithms struggle to perform well when speech is mixed with other, irrelevant sounds. This is such a problem that many speech-to-text algorithms intentionally turn themselves off when they detect other sounds like music, so the more of these extraneous sounds we can remove before running the audio through the algorithm, the better.
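
One way to do that gating is a voice activity detector. The sketch below uses the py-webrtcvad package to keep only frames classified as speech, assuming 16-bit mono PCM at 16 kHz split into 30 ms frames (one of the frame sizes the detector accepts).

```python
# A sketch of gating audio before transcription with WebRTC's
# voice activity detector (py-webrtcvad).
import webrtcvad

vad = webrtcvad.Vad(2)  # aggressiveness 0 (least) to 3 (most)
SAMPLE_RATE = 16000
FRAME_MS = 30
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 16-bit samples

def speech_frames(pcm: bytes):
    """Yield only the 30 ms frames that the detector classifies as speech."""
    for i in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES):
        frame = pcm[i:i + FRAME_BYTES]
        if vad.is_speech(frame, SAMPLE_RATE):
            yield frame
```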

If we take these automatically generated transcripts as inputs to natural language processing, sentiment analysis, and the plethora of other algorithmic methods of analysing textual content, we can begin to understand the topics being spoken about in a programme, or in a segment of a programme.
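
For instance, a first pass at surfacing topics could be as simple as running named entity recognition over a transcript. A minimal sketch with spaCy (the example sentence is invented):

```python
# A minimal sketch of pulling candidate topics out of a transcript
# with spaCy's named entity recogniser. Real topic extraction would
# layer more on top (linking entities to identifiers, scoring relevance).
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline
doc = nlp("The Chancellor announced new funding for the NHS in Manchester.")

for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g. "NHS" ORG, "Manchester" GPE
```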

If we combine this with the data we have about the plan for a programme, we can really start to understand exactly what a radio programme is made of. A lot of the metadata can then be built automatically, without much extra effort, so our producers are free to concentrate on creating the best content for the audience rather than spending valuable time filling in forms.
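
Reconciling the plan with what actually happened can start very simply, for example by matching segments on time overlap. The sketch below is hypothetical: the data structures and threshold are invented for illustration.

```python
# A hypothetical sketch of pairing planned segments with detected ones
# by time overlap. Segments are dicts with "start" and "end" in seconds.
def overlap(a_start, a_end, b_start, b_end):
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def match_segments(planned, detected, min_overlap=30.0):
    """Pair each planned segment with the detected segment it overlaps
    most, if they share at least min_overlap seconds."""
    matches = []
    for plan in planned:
        best = max(
            detected,
            key=lambda d: overlap(plan["start"], plan["end"],
                                  d["start"], d["end"]),
            default=None,
        )
        if best and overlap(plan["start"], plan["end"],
                            best["start"], best["end"]) >= min_overlap:
            matches.append((plan, best))
    return matches
```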

Publishing Metadata

We’ve been working with the Segments API team to build tools that allow radio producers to build an accurate representation of their programmes in real time, providing as much automatic assistance as possible whilst giving the producer ultimate control over their metadata output. We can then deliver this metadata to the various BBC products that might want to make use of it, both to enhance the audience’s experience online and to power internal tools, using the infrastructure which the BBC already has available to deliver it at scale.
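
Delivery itself can be as plain as an HTTP call. The sketch below is entirely hypothetical: the Segments API is an internal BBC service, so the endpoint, payload shape, and identifier here are invented.

```python
# A hypothetical sketch of publishing a finished segment over HTTP.
# The URL, payload fields, and programme id are all invented.
import requests

segment = {
    "programme_id": "b0000000",  # illustrative placeholder
    "kind": "chapter",
    "title": "Interview: consumer rights update",
    "start_offset": 0.0,
    "duration": 540.0,
}

response = requests.post(
    "https://segments.example.internal/v1/segments",  # hypothetical URL
    json=segment,
    timeout=10,
)
response.raise_for_status()
```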

Of particular importance is ensuring there is a “human-in-the-loop” as part of any automated metadata flow to the audience. This ensures that we retain editorial control of not only the content itself, but also the metadata that relates to it. For this we need to develop tools that are not intrusive to the production process but powerful enough to give producers a complete view of the metadata that they are producing.

Archiving Metadata

Ultimately, all programmes are archived, and their metadata is too. Whatever the means of recording it, the metadata we create needs to be as independent of technology as possible, to ensure its useful lifetime extends far beyond the system that was used to produce it. Additionally, the metadata for a programme may change after it’s broadcast, for example as better tools become available or inaccuracies are corrected, which introduces a new set of challenges.
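
One way to keep archived metadata independent of any particular system is a self-describing record: plain JSON with an explicit schema version and a revision number, as in the hypothetical sketch below (all field names are illustrative).

```python
# A sketch of a self-describing, versioned archive record. Plain JSON
# with an explicit schema version can outlive the system that wrote it.
import json
from datetime import datetime, timezone

record = {
    "schema_version": "1.0",
    "programme_id": "b0000000",  # illustrative placeholder
    "generated_at": datetime.now(timezone.utc).isoformat(),
    "revision": 2,  # metadata can be corrected after broadcast
    "segments": [
        {"kind": "music", "title": "Artist - Track Title",
         "start_offset": 540.0, "duration": 210.0},
    ],
}

print(json.dumps(record, indent=2))
```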

The topic of archiving in general is quite wide-ranging, from developing standards for storing the metadata to ensuring that the data will still be readable when all of our current technology is obsolete, and deserves a blog post of its own!

Conclusion

We are striving to create and deliver more metadata for our programmes. We have a variety of tools available to do this, and many of them can be machine-assisted, allowing us to deliver more value than the proportional effort required to create it. This will give our producers new and innovative ways of presenting BBC content to our audience.
