How artificial intelligence can improve online news

Introducing Orbit

Over the past few years we at Bakken & Bæck have been working closely with media companies of different shapes and sizes. Few industries are as fun to work with, and the challenges they are facing are tremendous. The “innovate or die” mantra is stronger here than in any other industry on the planet, and the desire to invest in new opportunities and experiment with unique ways of providing news and valuable content to consumers is high.

That being said, the user experience for online news sites today is very much like it was ten or fifteen years ago. You enter a homepage where a carefully selected combination of articles on sports, celebrity reality shows, dinner recipes and even actual news screams for your attention. There’s a huge focus on page views, and hardly any attention given to personal relevance for the reader. Smart use of technology could improve the online news experience vastly by just adding a bit more structure. That is why we created Orbit.

Rich structured data is the foundation for taking the online news experience to the next level. Orbit is a collection of artificial intelligence APIs that use machine learning-based content analysis to automatically transform unstructured text into rich structured data.

Orbit can extract entities like locations, people and organizations from any given text, and group them together.

By analyzing and organizing content in real time, and by automatically tagging and structuring large pieces of text into clusters of topics, Orbit creates a platform on which you can build multiple data-rich applications.

The now 5-month-old leaked innovation report from the New York Times pointed to several challenges for keeping and expanding a digital audience. To face some of the most critical issues you need to create a better experience for the reader by:

1. Serving up better recommendations of related content
2. Providing new ways to discover news and add context
3. Introducing personalization and filtering

Better recommended stories

Relevance is essential to creating loyal readers, and even more so at a time when more and more visits to news sites go directly to a specific article, mainly due to search and social media, avoiding the front page altogether. Readers arriving through side doors like Twitter or Facebook are less engaged than readers arriving directly, which means it’s important to keep these visitors on the site and convert them into loyal readers. Yet little is being done to improve the relevance of recommendations and to connect readers to the huge amounts of valuable content that already exist.

Orbit understands not only the topics a piece contains but also related topics. It thereby understands the context of the article and can bring up related content that the reader wouldn’t otherwise have seen, extending the reader’s time spent on the site and increasing page views.

Understanding context means that, for an article on China signing a historic gas deal with Russia, the cluster of related topics includes Russia, Ukraine, Putin, Gazprom and energy. Recommendations are then drawn from within that cluster, creating connections between content.
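To make the idea of recommending within a topic cluster concrete, here is a minimal sketch in Python. It is not Orbit's actual method, which this article does not describe; it simply assumes each article already carries a set of topics and ranks archive articles by Jaccard overlap with the current article's cluster. The article titles, the `ARCHIVE` data and the function names are all invented for illustration.

```python
def jaccard(a, b):
    """Share of topics two topic sets have in common, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(cluster, archive, limit=3):
    """Return titles of archive articles that overlap most with the cluster."""
    scored = [(jaccard(cluster, topics), title) for title, topics in archive.items()]
    # Drop articles that share no topic at all with the cluster.
    scored = [(score, title) for score, title in scored if score > 0]
    # Highest overlap first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Topic cluster for the gas deal article (the topics named in the text, plus China).
CLUSTER = {"China", "Russia", "Ukraine", "Putin", "Gazprom", "energy"}

# Hypothetical archive: title -> topics assumed to have been extracted earlier.
ARCHIVE = {
    "Gazprom raises prices for Ukraine": {"Russia", "Ukraine", "Gazprom", "energy"},
    "Putin visits Shanghai": {"Russia", "China", "Putin"},
    "Tour de France stage results": {"cycling", "France"},
}

if __name__ == "__main__":
    for title in recommend(CLUSTER, ARCHIVE):
        print(title)
```

In this toy data the cycling article shares no topic with the cluster and is never recommended, while the two Russia-related articles are ranked by how much of the cluster they cover. A production system would likely weight topics rather than count them equally.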

Good recommendations at the article level are important for building loyalty among new readers.

New ways to discover news

Rich structured data opens up new ways to navigate and discover news. The classic navigation through carefully edited front pages has been pretty much the same since the dawn of online news publishing. Structured data enables the reader to follow certain topics or stories, improves search, and enables timeline navigation of a news story to help the reader better understand the context of the story and how it has developed.

At the same time, a journalist writing a story on the uproar in Ukraine has no way of knowing how the story will unfold in the weeks to come. Manual tagging of news stories leads to inconsistent and incomplete structures because each person has a subjective understanding of which topics are important and related. Machine learning-based content analysis can identify people, organizations and places and relate them to each other in real time, thereby identifying related stories as they unfold and clustering them together.

As the NYT Innovation report brought up, the true value of structured data emerges only when the content is structured equally throughout.

An illustration showing how the story on the Greek debt crisis unfolded and branched out into several different stories. Source: The PhD work of Dafna Shahaf

News and content apps like Circa, Omni and Prismatic, and news sites like Vox, have incorporated some of these elements and are experimenting with how to develop original ways to discover news.

Personalization and filtering

There are many arguments against personalization, and they are often related to the dystopian fear of a “fragmented” public sphere or the horrors of the echo chamber. That doesn’t mean personalization can’t be a good thing; it merely means being aware of what a particular type of user wants at a particular time. We are not talking about a fully customizable news feed based on your subjective interests, where I would see only articles related to Manchester United, finance, TV shows and Kim Kardashian, and be uninformed on all other topics. We are merely suggesting a smart filtering system that lets you adjust which subjects you would like to see more and less of in your feed. After all, we do have different interests. For example, you may be entirely uninterested in the Tour de France during its three-week media frenzy in July each year; in that case you could unfollow the topic, or turn the “volume” down.
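As a rough illustration of the "volume" idea, the sketch below assumes every article has an editorial relevance score and a list of topics, and that the reader can set a volume per topic: 0.0 to unfollow, below 1.0 to see less, above 1.0 to see more. Topics the reader has not touched keep the default volume, so the feed stays broad. The names `feed`, `ARTICLES` and `VOLUMES`, the scores and the headlines are hypothetical and not taken from any real product.

```python
DEFAULT_VOLUME = 1.0  # topics the reader has not adjusted are left as they are


def feed(articles, volumes):
    """Order articles by editorial score multiplied by the reader's topic volumes."""
    ranked = []
    for title, score, topics in articles:
        volume = 1.0
        for topic in topics:
            # Each topic scales the article; a volume of 0.0 mutes it entirely.
            volume *= volumes.get(topic, DEFAULT_VOLUME)
        weight = score * volume
        if weight > 0:
            ranked.append((weight, title))
    # Heaviest first; ties are broken alphabetically by title.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in ranked]


# Hypothetical front page: (title, editorial score, topics).
ARTICLES = [
    ("Tour de France: stage 12 report", 0.9, ["Tour de France", "cycling"]),
    ("Gas deal signed", 0.6, ["Russia", "energy"]),
    ("Transfer rumours", 0.5, ["Manchester United"]),
]

# This reader unfollows the Tour de France and turns Manchester United up.
VOLUMES = {"Tour de France": 0.0, "Manchester United": 1.5}

if __name__ == "__main__":
    for title in feed(ARTICLES, VOLUMES):
        print(title)
```

The reader still sees the gas deal story they never asked for, which is the point: the editorial ranking remains the baseline and the volumes only nudge it.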

Follow suggested topics for a better personal experience.
Personalization […] means using technology to ensure that the right stories are finding the right readers in the right places at the right times.
- The New York Times report on innovation

Today, getting the news isn’t the hard part. Filtering out the excessive info and navigating the overwhelming stream of news in a smart way is where you need great tools.

A foundation of rich structured data will not only benefit the reader, but make life easier for journalists and editors as well.

Enriching your content

To provide context to a story about Syria you could add several components of extra information that would enrich the article: a box of background information on the conflict, facts about Bashar al-Assad and the different Syrian rebel groups, and so forth.

Illustrate large, complex or popular stories with engaging visualizations.

With rich structured data in place, you can automatically add relevant fact boxes and other interactive elements to a piece of content, based on third-party content databases such as Wikipedia. Each topic can automatically get its own page with all the related articles, facts, visualizations and insights relevant to that specific topic cluster.

Moreover, you can use the data to create new and compelling presentations of your content, including visualizations and timelines that give the reader a better experience and new insights.

News content generally has a short lifespan, but this doesn’t mean that old content can’t be valuable in a new context. A consistent structuring of archived content will give new life to old content, making it easier to reuse and resurrect articles that are still relevant and create connections between old and new articles.

Analytics, insights and making more money

What are the trending topics, people or organizations this week? What regions got the most media attention? How many of the sources were anonymous, how many were women versus men? Knowing more about your audience’s preferences will make it easier to create good content at the right time.
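Once every article carries the same kinds of tags, questions like these reduce to counting. The sketch below assumes each article is a dictionary with lists of extracted people, organizations and locations; that shape, the `trending` function and the sample week are assumptions made for illustration, not Orbit's output format.

```python
from collections import Counter


def trending(articles, entity_type, top=3):
    """Count in how many articles each entity of the given type appears."""
    counts = Counter()
    for article in articles:
        # set() ensures an entity is counted once per article, not once per mention.
        counts.update(set(article.get(entity_type, [])))
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top]


# Hypothetical week of tagged articles.
WEEK = [
    {"people": ["Putin"], "organizations": ["Gazprom"], "locations": ["Russia", "China"]},
    {"people": ["Putin", "Bashar al-Assad"], "locations": ["Syria", "Russia"]},
    {"people": ["Putin"], "organizations": ["Gazprom"], "locations": ["Ukraine", "Russia"]},
]

if __name__ == "__main__":
    print(trending(WEEK, "people", top=2))
    print(trending(WEEK, "locations", top=1))
```

The same counting works for any consistently tagged attribute, such as whether a source was anonymous, provided that attribute is recorded for every article.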

Better organized content creates a strong foundation for good insights into how content is consumed and why. With a better ecosystem for your content, including higher relevance and more contextual awareness, you can offer your advertisers better context-based ads and give them better insights into who is watching and acting on those ads.

By using the right technology in smart ways, journalists and editors can focus on what they are best at: creating quality news content.