Decoding imagined and spoken speech using ECoG: insights from neuroscience — Paper Summary

Alexander Kovalev
the last neural cell
Jan 28, 2022

#03 Review.

Imagined speech can be decoded from low- and cross-frequency intracranial EEG features. Paper review.

🧐 At a glance:

Brain-computer interfaces (BCIs) let you use your brain activity to control devices: a computer, a prosthesis, and even a speech vocoder.

Researchers are actively investigating the capabilities of such interfaces. In this paper, the authors investigate the possibility of developing “language prostheses”.

paper: link
code: link
data: not available, but you can try asking the authors. → link

🤿 Motivation:

This article is about decoding imagined speech. Speech retrieval is one of the most significant and challenging tasks in BCI. We can observe significant progress in explicit (overt) speech decoding. In other words, a person says the words out loud and we can tell from the brain activity what they said.

However, decoding imagined speech is more difficult. The problem is that it is not clear where to measure brain activity and how to process it (which features to use). These are the questions the authors try to answer:

  • What brain regions have the best decoding potential?
  • What’s the most informative neural feature?

In the context of developing a direct-speech BCI (DS-BCI), it is important to identify regions specifically correlated with imagined speech: regions independent of movement (and therefore not reflecting overt speech production) and independent of stimuli (and therefore not reflecting speech perception).

🍋 Main Ideas

Neuroscience views:

Actually, it’s not easy to say how speech is made in our brains. Several theories exist.

  • Motor hypothesis. Imagined speech and overt speech share a similar articulatory plan in the brain.
  • Abstraction hypothesis. We can produce imagined speech without an explicit motor plan.
  • Flexible abstraction hypothesis. Imagined speech is phoneme-based (the sounds of language). In this case, the neural activity depends on how each person imagines speech: through subarticulation or perceptually.

Experiment description. What did they do?

Electrocorticography (ECoG) — electrodes placed directly on the brain surface (like EEG, but without the annoying scalp in the way). It is an invasive procedure.

Patients with ECoG implants perform language tasks. There are 3 studies with different experimental protocols and different ECoG electrode positions. In each task, the participant imagines, speaks, or listens to certain words after a cue.

Figure: visual description of the experiments.

Feature extraction.

  • Frequency decomposition. Extract 4 frequency bands.
  • Cross-frequency coupling (CFC) links activity that occurs at different rates (frequencies). The authors use the coupling between the phase of one band and the amplitude of another (see the sketch after this list).
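
A minimal sketch of what such feature extraction could look like (my own illustration, not the authors' code): band-pass each channel into the four bands and take the Hilbert amplitude and phase, which then feed both the power features and the CFC features. Band limits are from the paper; function and variable names are assumptions.

```python
# Minimal sketch (not the authors' code): per-band amplitude and phase for one
# ECoG channel via band-pass + Hilbert transform. Band limits follow the paper;
# function and variable names are my own assumptions.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

BANDS = {                    # Hz
    "theta": (4, 8),
    "low_beta": (12, 18),
    "low_gamma": (25, 35),
    "bha": (80, 150),        # broadband high-frequency activity
}

def band_features(x, fs):
    """Return {band: (amplitude_envelope, instantaneous_phase)} for 1-D signal x."""
    feats = {}
    for name, (lo, hi) in BANDS.items():
        b, a = butter(4, [lo, hi], btype="bandpass", fs=fs)
        analytic = hilbert(filtfilt(b, a, x))
        feats[name] = (np.abs(analytic), np.angle(analytic))
    return feats
```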

Analysis.

  • Compare brain activity during different tasks (listening vs. speaking vs. imagining).
  • Determine how much each region and feature contributes to word decoding accuracy (a toy decoding sketch follows this list).
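
To make the second point concrete, here is a hedged sketch of how decoding accuracy could be estimated for one feature set and then compared across regions or feature types. This is my own illustration, not the authors' pipeline; `X_bha`, `X_cfc`, and `y` are hypothetical arrays.

```python
# Hedged illustration (mine, not the authors' pipeline): cross-validated word
# decoding accuracy for one feature set; rerun on different subsets of features
# or electrodes to compare their contributions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def decoding_accuracy(X, y, n_folds=5):
    """X: (n_trials, n_features), e.g. band power per electrode; y: word labels."""
    clf = LinearDiscriminantAnalysis()
    return cross_val_score(clf, X, y, cv=n_folds).mean()

# Hypothetical usage: X_bha and X_cfc are feature matrices built from BHA power
# and theta-phase CFC, respectively.
# print(decoding_accuracy(X_bha, y), decoding_accuracy(X_cfc, y))
```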

📈 Experiment insights / Key takeaways:

  • Researchers found that overt and imagined speech production have different dynamics and neural organization. The biggest difference between them was that broadband high-frequency activity (BHA) in the superior temporal cortex increased during overt speech but decreased during imagined speech. The high-frequency band is the best one for telling overt and imagined speech apart.
  • This means that transferring language decoding from overt to imagined speech might be tricky. (We cannot simply train a model on overt speech and use it for imagined speech decoding.)

Differences and similarities.

  • Superior temporal lobe: BHA increased during overt speech but decreased during imagined speech.
  • Sensorimotor region: BHA increased in both cases.
  • Left inferior and right anterior temporal lobes: strong CFC with the theta phase in both cases.

Decoding features part.

  • They showed that the high-frequency band (BHA) provides the best performance for overt speech decoding.
  • Low-frequency bands help decode imagined and overt speech at roughly the same level.
  • The beta band is a good feature for decoding imagined speech, both in terms of power and CFC. Decoding imagined speech is possible largely thanks to CFC.
  • Decoding worked better when using not only the articulatory regions (sensorimotor cortex): imagined speech is defined at the phonemic rather than purely motor level.
Figure: these graphs show which features help decode words (overt on the left, imagined on the right).

ECoG signal analysis.

Signal processing

  • Remove DC shifts with a high-pass filter at 0.5 Hz + notch filters at 60 Hz, 120 Hz, and so on (line-noise harmonics).
  • Common average re-referencing + downsampling to 400 Hz (with anti-aliasing).
  • Morlet wavelet transform to extract 4 bands: theta (4–8 Hz), low-beta (12–18 Hz), low-gamma (25–35 Hz), and broadband high-frequency activity (BHA, 80–150 Hz). A rough reconstruction of this chain is sketched below.
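
Here is a rough reconstruction of that preprocessing chain in Python. It is my reading of the bullets above, not the released code; the filter orders, the 60 Hz mains assumption, and the variable names are mine.

```python
# Rough reconstruction of the preprocessing chain (my reading of the bullets
# above, not the released code). Assumes `raw` has shape (n_channels, n_samples)
# at sampling rate fs_raw and 60 Hz mains.
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch, decimate

def preprocess(raw, fs_raw, fs_out=400):
    # 1) High-pass at 0.5 Hz to remove DC shifts and slow drifts.
    b, a = butter(2, 0.5, btype="highpass", fs=fs_raw)
    x = filtfilt(b, a, raw, axis=-1)
    # 2) Notch filters at 60 Hz and harmonics (line noise).
    for f0 in (60, 120, 180):
        bn, an = iirnotch(f0, Q=30, fs=fs_raw)
        x = filtfilt(bn, an, x, axis=-1)
    # 3) Common average reference: subtract the mean across electrodes.
    x = x - x.mean(axis=0, keepdims=True)
    # 4) Downsample to 400 Hz; decimate applies an anti-aliasing filter first.
    factor = int(round(fs_raw / fs_out))
    return decimate(x, factor, axis=-1, zero_phase=True)

# The four bands can then be extracted with Morlet wavelets, e.g. via
# mne.time_frequency.tfr_array_morlet (one possible tool, not necessarily theirs).
```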

Cross-frequency coupling

  • They use phase-amplitude cross-frequency coupling (PAC), computed between the phase of one band and the amplitude of a higher-frequency band. It measures the interaction between frequency bands; here, how strongly the amplitude of the faster oscillation is locked to the phase of the slower one (see the sketch below).
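
A minimal PAC sketch using the mean-vector-length approach; the paper's exact estimator may differ, so treat the choice of metric as an assumption on my part.

```python
# Minimal phase-amplitude coupling (PAC) sketch using the mean-vector-length
# approach; the paper's exact estimator may differ, so treat this as an
# assumption about the metric.
import numpy as np

def pac_mvl(phase_low, amp_high):
    """phase_low: instantaneous phase of the slow band (radians);
    amp_high: amplitude envelope of the fast band. Both 1-D, same length."""
    # Each sample is a vector with length amp_high and angle phase_low; a long
    # mean vector means the fast amplitude is locked to the slow phase.
    return np.abs(np.mean(amp_high * np.exp(1j * phase_low)))

# Hypothetical usage with the band features sketched earlier:
# amp_bha, _ = feats["bha"]; _, phase_theta = feats["theta"]
# print(pac_mvl(phase_theta, amp_bha))
```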

✏️ My Notes:

First, I think it is important that the authors studied the neuroscience side of language decoding rather than focusing on algorithm development. However, the decoding accuracy is not very high, and it would be interesting to apply advanced ML algorithms to their datasets.

Improvements and next steps:

  • It is essential to develop adaptive algorithms, since ECoG electrode positions differ significantly across patients.
  • It would be interesting to explore how neural networks could learn cross-frequency coupling automatically; transformers should be investigated for that purpose.
  • For CFC calculation, it might be useful to use transfer entropy (or other causality metrics) between phases and amplitudes; it is a time-resolved algorithm.
  • We can also use a dynamic, time-resolved PAC algorithm for online CFC extraction (a sliding-window sketch follows this list). site
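
One simple way to get a time-resolved CFC estimate is to compute PAC in short overlapping windows. This is my own sketch of the idea, not the dynamic-PAC algorithm linked above; the window and step sizes are arbitrary examples.

```python
# My own sketch of a time-resolved CFC estimate: mean-vector-length PAC computed
# in short overlapping windows. Not the dynamic-PAC algorithm linked above.
import numpy as np

def sliding_pac(phase_low, amp_high, win=400, step=100):
    """win and step are in samples, e.g. 1 s and 0.25 s at 400 Hz."""
    out = []
    for start in range(0, len(phase_low) - win + 1, step):
        p = phase_low[start:start + win]
        a = amp_high[start:start + win]
        out.append(np.abs(np.mean(a * np.exp(1j * p))))
    return np.asarray(out)
```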

This review was made in collaboration with Alexey Timchenko.


Alexander Kovalev
the last neural cell

CEO of ALVI Labs | Machine learning engineer | Brain computer interfaces researcher. 🧠