
CNN insights: What do convolutional neural networks learn about free text? Part 1 of 7

Exploring the representations a CNN learns about free text

3 min read · May 21, 2018


Deep learning techniques ‘learn their own representations’ of the inputs we give them. But what representations do they learn? We have some idea of what convolutional neural networks (CNNs, also known as convnets) are doing when trained on images, but there seems to be less insight into what they are doing with text data. Sure, we know we can investigate the word embedding space, but what representations have been learned about sequences of word embeddings, i.e. phrases and sentences?

This blog series discusses techniques for getting insight into the features your CNN extracts when you train it as a text classifier. The series came out of work I did with my collaborators, using CNNs as text classifiers on a very large clinical dataset. Some of this project was written up as a research paper, but quite a few interesting ideas didn’t fit into the paper, so I decided to present them here instead.

Although the domain insights we obtained with these methods were specific to our particular dataset and classification challenge, the methods themselves are general: anyone who wants to understand how their CNN is fitting to their dataset can use them.

Series overview

We will first take a look at what CNNs do when they are trained to classify images and discuss three methods that explore the representations learned by CNNs trained on imaging data. In later posts, we will adapt these three methods to interrogate the representations CNNs learn about text data. You can also read about the specific insights we gained into how a CNN works on our domain problem. To give these insights some context, we also introduce our dataset, VetCompass™, one of the largest clinical corpora in the world, comprising 9.5 million animal records, and discuss the domain problem we trained a CNN to solve for us.

Highlights

To do well on our task, the CNN must fit to diagnostic language in the clinical notes, i.e. identify sections of the text where a clinician is writing about making or ruling out a diagnosis. Two of the techniques that I go into detail about later reveal, at least partly, what the CNN is doing. In short: it does often fit to the most relevant sections of the text.

Here are some example short diagnostic token sequences that the CNN fitted to:

risk of cushings / diabetes
rule out cushings / hypot4
rule out cushings / addisons
risks of cushings / dm
rule out cushings , t4
risk of demodex or sarcoptic
rule out diabetis , cushings
indicative of cushings or addisons
risks eg diabetes , arthritis
risk of spay in season
rule out cushings / hypothyroidism
diseases eg diabetes / hyperthyroid
diseases like cushings or diabetes
risks , diabetes , cushings
rule out chellietella or sarcoptes
too possible cushings or hypothyroid

Here is an example visualisation indicating the most relevant diagnostic tokens in a sentence. Don’t worry too much about how to interpret this chart yet; the longer a token’s bar, the more relevant the CNN thought the token was to a diagnosis of cardiomyopathy:

[Figure: bar chart of per-token relevance scores in a clinical sentence, for a cardiomyopathy diagnosis]
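
For a flavour of how per-token scores like these can be computed, here is a minimal sketch of the token-occlusion idea covered in Part 5: mask each token in turn and measure how much the model’s predicted probability drops. The helper name and the convention of masking with a padding id are illustrative assumptions, not taken from our codebase.

```
import numpy as np

def token_relevance(model, token_ids, pad_id=0):
    """Score each token by how much hiding it lowers the model's prediction.

    token_ids: 1-D array of integer token ids for one document.
    pad_id: the id used to hide a token (an assumed convention).
    """
    x = np.asarray(token_ids)[None, :]        # shape (1, sequence_length)
    baseline = float(model.predict(x)[0, 0])  # prediction on the intact text
    scores = []
    for i in range(x.shape[1]):
        occluded = x.copy()
        occluded[0, i] = pad_id               # hide one token
        # a large drop means the token mattered to the classifier
        scores.append(baseline - float(model.predict(occluded)[0, 0]))
    return scores                             # one relevance score per token
```

The resulting scores can be plotted directly as the bar lengths in a chart like the one above.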

Series links

Part 1 : Introduction

Part 2 : What do convolutional neural networks learn about images?

Part 3 : Introduction to our dataset and classification problem

Part 4 : Generating text to fit a CNN

Part 5 : Hiding input tokens to reveal classification focus

Part 6 : Scoring token-sequences by their relevance

Part 7 : Series conclusion

Note to implementors

If you want to implement some of the ideas in this series, you should probably already know a bit about Python, Keras and the basics of CNNs and word embeddings before you start (but this isn’t necessary for the general reader).

All my coding examples assume you have a training set and have already built and trained a CNN model. My code examples are written in Keras.
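
To make those assumptions concrete, here is a minimal sketch of the kind of Keras CNN text classifier the series has in mind. The architecture and hyperparameters (vocabulary size, sequence length, filter width and so on) are illustrative, not our exact model:

```
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense

vocab_size = 20000  # assumed vocabulary size
max_len = 200       # assumed maximum document length, in tokens
embed_dim = 100     # assumed embedding dimensionality

model = Sequential([
    Embedding(vocab_size, embed_dim, input_length=max_len),
    Conv1D(filters=128, kernel_size=3, activation='relu'),  # 3-token filters
    GlobalMaxPooling1D(),                                   # max-over-time pooling
    Dense(1, activation='sigmoid'),                         # binary classification
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# x_train: (n_samples, max_len) integer token ids; y_train: (n_samples,) labels
# model.fit(x_train, y_train, epochs=5, batch_size=32)
```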

Next part : Part 2 : What do convolutional neural networks learn about images?
