Some thoughts on a new project: Leveraging NLP to augment intellectual exploration routines

12K
5 min read · Apr 17, 2018

The ability to familiarize oneself with new topics quickly is a key competency for professionals in all innovation-related sectors, be it venture capital, corporate strategy or consulting. Dynamic market environments shaped by accelerating innovation cycles and blurring industry borders require us to explore new technologies, market sectors and industry ecosystems continuously. Following our original vision of “building technology that helps you stay in the know,” we’re currently working on software that helps users get familiar with new topics. Here are some first thoughts on it.

How does intellectual exploration work?

We spoke to many friends and colleagues working in the industries mentioned above to understand how they explore topics. The conversations revealed that intellectual exploration routines are very similar, even if the tools and workflows are highly individual. Most often, the process starts with superficial knowledge of a few individual concepts within a topic. Bitcoin, for example, was the starting point for many people exploring the broader field of distributed ledgers and blockchain. These concepts are the starting point for an iterative research process across sources like Wikipedia, YouTube, forum threads, expert media, scientific publications or company websites. The research process results, first, in a better understanding of the initial concepts and, second, in the discovery of further related concepts. Over time, we gradually recognize relationships and hierarchies, prioritize the concepts and finally gain a holistic understanding of the topic. On an abstract level, the intellectual exploration of a topic is an iterative process of

  • discovering underlying concepts of the topic,
  • gaining understanding of these concepts and
  • grasping the interdependencies between them.

Augmentation vs Automation

Over the past few years there has been a lot of buzz around the “automation of knowledge work,” aka “AI killing white-collar jobs” if you prefer a little more sensationalism. We’re deeply convinced that AI technologies, like machine learning or natural language processing, will fundamentally impact knowledge work and already do. But we also believe that augmentation is a better concept than automation for thinking about how these technologies will change the workflows of knowledge workers. This is especially true for the highly iterative and intuitive process of intellectual exploration, where the human can’t be cut out but which offers massive potential for acceleration and objectification.

The basic idea of our new project, 12K EXPLORER, is to accelerate and objectify intellectual exploration routines. The software is based on four key elements:

  • Data set: Our software is fueled by historical and continuously updated content from innovation-related sources (media, patents, academia). Our data set currently includes more than 4 million documents.
  • Topic model: We use NLP algorithms to transform the unstructured text content into a structured topic model of concepts. Our current topic model covers 1.5 million innovation-related concepts.
  • Graph interface: The graph interface is a visual interface to the topic model. It allows the user to gradually explore topics by detecting related concepts and mapping the interdependencies between concepts.
  • Content ranking: We developed an algorithm to detect the most relevant articles on single concepts or entire topics. The relevance ranking is based on the technology that we use for our algorithmically curated newsletter, 12K FILTER, and considers content properties like topic fit, fact density, article length, article diversity and expert shares.
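
The content ranking described above can be pictured as a weighted score over the listed content properties. The following is only a minimal sketch under stated assumptions: the article names the properties but not how they are combined, so the weights, the `Article` class and the `relevance` and `rank` functions are hypothetical, and every feature is assumed to be pre-normalized to the range 0 to 1.

```python
from dataclasses import dataclass

# Hypothetical weights: the properties come from the text, the numbers do not.
WEIGHTS = {
    "topic_fit": 0.4,
    "fact_density": 0.2,
    "article_length": 0.1,
    "article_diversity": 0.1,
    "expert_shares": 0.2,
}


@dataclass
class Article:
    title: str
    features: dict  # property name -> value, assumed normalized to [0, 1]


def relevance(article: Article) -> float:
    """Weighted sum of the content properties; missing properties count as 0."""
    return sum(w * article.features.get(name, 0.0) for name, w in WEIGHTS.items())


def rank(articles, top_n=3):
    """Return the top_n articles, highest relevance first."""
    return sorted(articles, key=relevance, reverse=True)[:top_n]


# Made-up example data.
articles = [
    Article("Broad overview", {"topic_fit": 0.3, "fact_density": 0.5,
                               "article_length": 0.5, "article_diversity": 0.5,
                               "expert_shares": 0.5}),
    Article("Focused deep dive", {"topic_fit": 0.9, "fact_density": 0.5,
                                  "article_length": 0.5, "article_diversity": 0.5,
                                  "expert_shares": 0.1}),
]

for a in rank(articles):
    print(f"{relevance(a):.2f}  {a.title}")
```

A linear score is the simplest choice and keeps each property's contribution easy to inspect; a production ranking could just as well be a learned model.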

V1-beta and initial learnings

To give you a first glimpse of the product, we created a topic graph for neurotech (inspired by Clement Vouillon’s and nicolas debock’s great newsletter). Our starting point was the concept “neural interfaces,” from which we iteratively added related concepts to the topic graph.

The graph interface suggests related concepts based on the topic model
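
How the topic model decides which concepts are related is not spelled out here. As an illustration only, the sketch below uses plain co-occurrence counts as a stand-in: concepts that appear in the same documents as the concepts already in the graph are suggested first. The toy corpus and the `cooccurrence` and `suggest` functions are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy corpus: each document reduced to the set of concepts it mentions (made-up data).
docs = [
    {"neural interfaces", "bci", "brain implants"},
    {"neural interfaces", "bci", "eeg"},
    {"eeg", "fmri", "neuroimaging"},
    {"bci", "neuroprosthetics"},
]


def cooccurrence(docs):
    """Count, for every ordered pair of concepts, the documents mentioning both."""
    counts = Counter()
    for concepts in docs:
        for a, b in combinations(sorted(concepts), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def suggest(graph_concepts, counts, top_n=3):
    """Suggest concepts not yet in the graph, scored by co-occurrence with it."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a in graph_concepts and b not in graph_concepts:
            scores[b] += n
    return [concept for concept, _ in scores.most_common(top_n)]


counts = cooccurrence(docs)
graph = {"neural interfaces"}          # starting concept
first = suggest(graph, counts)         # "bci" ranks first in this toy corpus
graph.add(first[0])                    # the user accepts a suggestion ...
second = suggest(graph, counts)        # ... and new neighbors become reachable
print(first, second)
```

Each accepted suggestion widens the set of reachable concepts, which mirrors the iterative exploration loop described earlier.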

After a few minutes, we ended up with a nice graph covering key aspects of neurotech, namely

  • brain imaging / neuroimaging / eeg / fmri (🧠)
  • brain machine interaction / bci / brain implants (🤖)
  • neural stimulation / neuroprosthetics / neural implants (⚡)
  • transhumanism / posthumanism / AGI (🔮)

The size of each node indicates how much attention the concept receives in innovation-related media sources. The color intensity of each node indicates how strongly that media attention has grown over recent years.
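
The mapping from media attention to node size and color can be sketched as follows. This is a rough illustration, not the actual implementation: the mention counts are made up, and the square-root scaling, the growth cap and the function names `node_size`, `color_intensity` and `style` are all assumptions.

```python
import math

# Made-up yearly mention counts per concept.
mentions = {
    "eeg": {2015: 400, 2017: 500},
    "bci": {2015: 100, 2017: 400},
}


def node_size(total_mentions, min_px=10.0, scale=2.0):
    """Square-root scaling so very popular concepts do not dwarf the rest."""
    return min_px + scale * math.sqrt(total_mentions)


def color_intensity(old, new, max_growth=4.0):
    """Map growth (new / old) to [0, 1]; no growth -> 0, max_growth or more -> 1."""
    if old <= 0:
        return 1.0  # assumption: a concept with no earlier mentions counts as maximal growth
    growth = new / old
    return max(0.0, min(1.0, (growth - 1.0) / (max_growth - 1.0)))


def style(yearly_counts):
    """Derive size from total attention and intensity from first-to-last-year growth."""
    years = sorted(yearly_counts)
    return {
        "size": node_size(sum(yearly_counts.values())),
        "intensity": color_intensity(yearly_counts[years[0]], yearly_counts[years[-1]]),
    }


styles = {concept: style(counts) for concept, counts in mentions.items()}
for concept, s in styles.items():
    print(concept, round(s["size"], 1), round(s["intensity"], 2))
```

In this toy data, "eeg" gets the larger node because it has more total mentions, while "bci" gets the more intense color because its mentions grew faster.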

Besides playing with the tool ourselves, we asked partners, friends and clients to test the beta version and use it in their workflows. Here are the key takeaways from the feedback:

  • The graph interface is a great tool for discovering the underlying concepts of a topic. Creating the neurotech graph took us about 15 minutes and definitely helped us get an overview of a topic that we knew little to nothing about before.
  • We developed a simple feed based on our content ranking algorithm that shows the top articles on any given concept right next to the graph. But building something that outperforms fantastic services like Wikipedia, Google or Quora for researching a specific concept is not an easy task. That’s why we will focus on the discovery of concepts and the mapping of their interdependencies in the next step.
  • The content ranking is, however, very useful in another way. We built a newsletter functionality that sends users the top articles on a topic once a week. That feature is a great help in staying up to date on various research topics.
  • The topic graphs not only help individual users explore topics but are also a great tool for creating a common understanding of topics within teams or with clients.

The initial feedback showed us that there is potential for acceleration and objectification even in the research routines of the brightest humans. That motivated us to keep working on the project. Get in touch if you’re interested in learning more, and check out the beta: hi@12k.co.

12K

A software company operating at the intersection of machine learning and data science. Our specialty is to gather, structure and leverage data. // www.12k.co