Brain interfacing applications

Beyond the obvious

Viktor Tóth
Mindsoft
15 min read · Feb 2, 2020


I’ve been reading articles and scientific publications on future brain interfaces, their use-cases, ethical considerations and security caveats. So far, I’ve barely come across detailed ideas that have any remote ties to neuroscience. Most of them just feed off of the Matrix trilogy or Ghost in the Shell, while others take the leap to equate the human brain to a computer (once accessed by machines) and claim biologically implausible feats to be inevitable. Art, novels, and movies in particular, have great influence on how we predict the future and hence, how we could exploit it or avoid undesirable outcomes. Compared to the abundant media on human-level artificial intelligence, there is a lack of narrative stories, movies and video games discussing brain interfacing, which is, I believe, a necessary precursor.

It seems we missed the beat; sorry HAL, we can’t do that, we first have to hook our brains to computers.

In this article, I’d like to present compelling yet reasonable applications of future high-bandwidth brain interfaces. I refrain from dealing with moral arguments; I may present the ethical boundaries that I consider appropriate in another article. I still signal my stance: [✗] marks where I judge the application morally flawed to implement, [?] indicates a questionable case where more nuance is necessary to determine the ethical substance, and [✓] marks anywhere I’d be fine with the particular solution.

Although brain-computer interfaces have been around for a while now, they have yet to impact the everyday consumer. Devices featuring electrodes mounted on the scalp (EEG) entered the consumer market long ago (~2008), but they’ve mainly been used for assistive purposes, such as guiding meditation, reading rough motor movement signals, and aiding the disabled in few-choice selection applications, like typing. Further research has gone into extracting emotional states, or the attitude on the approach-withdrawal dimension, the accuracy of which remains significant mostly in laboratory settings. Oh yeah, and don’t forget the arcane field of neuromarketing [✗].

Laboratory brain imaging and stimulation devices, like fMRI and TMS, are too expensive, cumbersome and inaccurate to escape the lab environment. Invasive methods that buy accuracy at the price of risk, such as ECoG, are also limited to desperate cases, where carving a wide hole in the skull is justified; only to have the electrodes removed within a couple of years because of brain tissue damage or dural thickening that would eventually ruin the signals.

The next generation of brain interfaces

Neuralink’s announcement excited the public, as a consumer brain stimulation device appeared on the horizon. I’m not going into the technical details in this article, but the theoretical limit of their approach is indeed promising: deep brain stimulation and spike detection from thousands of electrodes would require only a couple of tiny holes drilled in the skull. Compared to previous solutions, the risk-accuracy trade-off is exceptional.

However, there are other, less covered, early-stage approaches out there, like the one Openwater pioneers. Their device is completely non-invasive: it applies near-infrared light and ultrasound to excite and record neural states by processing the scattered light beams once they have journeyed through brain tissue. Kernel, another next-generation non-invasive brain interfacing startup, is still working behind closed doors; not much is known about their tech, but the ambitious $100 million of out-of-pocket starting capital demands attention.

You ready?

Applications

I would like to point out that the field of brain-computer interfacing is immensely rich, and I honestly doubt that any idea I present in this article has not been contemplated by dozens in the field, probably in even more depth than I could elaborate here. I hope not to displease the neuroscience-minded reader with occasional superficial derivations, while I also wish to keep the layman on board. Machine learning concepts are thrown around left and right, though.

The low-hanging and ethically pure ambitions of making the blind see and the disabled walk stand as clear first steps [✓]; so do the diagnosis and remedy of brain diseases [✓]. Neuralink and Openwater further champion telepathy (high-bandwidth, direct, mind-to-mind communication) [✓] as a relevant application for ALS patients and the deaf, as well as for the healthy population.

Telepathy

Speech processing in the cortex according to Hickok and Poeppel, 2007.

I would differentiate between three gears of telepathy: audio, speech and concept level.

In the first gear (see figure below), telepathy is channeled by reading the primary auditory cortex (A1) of the source brain and stimulating the same region of the target brain according to the recorded activity. The source needs to vocally imagine speech, which should result in a sequence of neural activity comparable to what actual audible speech would elicit. Auditory cortex activity is then decoded into sound, which in turn is encoded back into brain activity on the target side. This indirect neural(A1) → sound → neural(A1) cross-brain mapping can be enabled by two deep learning models, one performing the neural(A1) → sound conversion, the other the sound → neural(A1) conversion. Theoretically speaking, this approach only requires enough samples of paired neural activity and simultaneously captured audio from both participants (1st gear, first option in the figure below).

The sound decoding process may be further improved by incorporating motor cortex activity in addition to the auditory, in which case the source has to mouth (or imagine mouthing) the words.

Alternatively, a direct neural(A1) → neural(A1) translation would entail the concurrent recording of both participants talking to each other (1st gear, second option in the figure below). One could and should further assume partially similar encoding between the source and target auditory regions and inject it as an inductive bias into the translating deep learning model.
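To make the first gear a bit more tangible, here is a minimal sketch of the indirect route (the direct route would collapse the chain into a single neural → neural model). Electrode counts, spectrogram size, model architectures and the stand-in data below are all my own assumptions, not a description of any existing device.

    # Minimal sketch of the indirect neural(A1) -> sound -> neural(A1) route.
    # Channel counts, shapes and architectures are hypothetical placeholders.
    import torch
    import torch.nn as nn

    N_ELECTRODES = 1024   # assumed number of recorded/stimulated A1 channels
    N_MEL = 80            # assumed mel-spectrogram bins per audio frame

    class A1ToSound(nn.Module):
        """Decode a window of the source's A1 activity into spectrogram frames."""
        def __init__(self):
            super().__init__()
            self.rnn = nn.GRU(N_ELECTRODES, 256, batch_first=True)
            self.out = nn.Linear(256, N_MEL)

        def forward(self, neural):              # (batch, time, N_ELECTRODES)
            h, _ = self.rnn(neural)
            return self.out(h)                  # (batch, time, N_MEL)

    class SoundToA1(nn.Module):
        """Encode spectrogram frames into the target brain's expected A1 pattern."""
        def __init__(self):
            super().__init__()
            self.rnn = nn.GRU(N_MEL, 256, batch_first=True)
            self.out = nn.Linear(256, N_ELECTRODES)

        def forward(self, spec):                # (batch, time, N_MEL)
            h, _ = self.rnn(spec)
            return self.out(h)                  # (batch, time, N_ELECTRODES)

    # Each model is trained separately on (neural, audio) pairs recorded from
    # its own user; at "telepathy time" the two are simply chained together.
    decode, encode = A1ToSound(), SoundToA1()
    source_a1 = torch.randn(1, 100, N_ELECTRODES)   # stand-in recording
    target_a1_pattern = encode(decode(source_a1))   # pattern to stimulate into the target

The hard part this sketch sweeps under the rug is the last line: turning a desired A1 pattern into actual stimulation parameters, which is discussed further below.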

The gears of telepathy: an analogy carried too far. First gear: indirect (neural → sound → neural) and direct (neural → neural) translation of early auditory brain signals. Second gear: indirect (chaining in a first gear translation model, e.g. STS → A1 → sound → A1 → STS) and direct telepathy between speech processing regions; the latter enabled by a speech recognition deep learning model that labels speech to be correlated with neural activity. Third gear: concept-level information transfer between Alice and Bob, chaining in the trained deep learning models of the previous gear, or maybe some other, smarter solutions.

Let’s shift to the next gear by abstracting away the spectrotemporal (sound) features and transmitting only phonological symbols between minds. To encode and decode phonemes, we would need to ascend the auditory pathway and access speech recognition regions; the (left, mid-posterior) superior temporal sulcus (STS, purple region in the figures above) is a likely candidate, as are other regions on the auditory ventral stream primarily responsible for speech comprehension. At the level of the STS, the information content of neural activity equates to roughly 100 ms worth of sound; hence, it is a higher-bandwidth interface for telepathy than vocal transfer or A1 stimulation. Simply put, the further up the ventral stream we stimulate, the faster the communication may become, although it’s far from trivial to what extent. Speeding up thoughts in the second gear involves the mapping functions introduced in the first gear, chained together with neural(STS) → neural(A1) and neural(A1) → neural(STS) transformations (2nd gear, first option in the figure above). Alternatively, relying on speech recognition machine learning models, phonemes could be extracted from audio and paired up with STS patterns to train the mentioned transformations (2nd gear, second option, bottom of the figure above).
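The second option can be pictured roughly as follows; a sketch only, assuming an off-the-shelf speech recognizer (or forced aligner) that can hand back one phoneme label per 100 ms audio frame, plus a made-up channel count.

    # Sketch: train an STS -> phoneme decoder from speech-recognition labels.
    from sklearn.linear_model import LogisticRegression

    N_STS_CHANNELS = 256   # assumed number of recorded STS channels

    def phoneme_labels(audio_frames):
        """Placeholder for a forced-alignment / speech recognition step that
        returns one phoneme index per 100 ms audio frame."""
        raise NotImplementedError

    def train_sts_phoneme_decoder(sts_windows, audio_frames):
        # sts_windows: (n_frames, N_STS_CHANNELS) STS activity, one row per
        # 100 ms frame, time-aligned with audio_frames
        y = phoneme_labels(audio_frames)
        clf = LogisticRegression(max_iter=1000)
        clf.fit(sts_windows, y)
        return clf   # clf.predict(new_sts_windows) -> phoneme sequence to transmit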

The third gear entails reliable recording and stimulation of associative (prefrontal) areas to encode the message at the concept level. Well, that’s easier said than done: concepts are sparsely distributed in the cortex, high dimensional and interconnected, but more importantly, we lack explicit representations of them that we could correlate with neural activation. Activation patterns in these higher associative regions are more experience-dependent and less determined by the coincident stimuli that we can record. Therefore, deriving a mapping from or to higher, associative regions is significantly harder.

I’ve painted a simplified picture here both about the neural mechanisms supporting speech processing and about the machine learning methods necessary to stimulate the cortex. Considering stimulation, it’s relatively easy to derive a mapping from sound to corresponding auditory neural patterns, when compared to the task of deducing the stimulation sequence leading to the desired brain patterns. There’s no one-to-one mapping between the stimulation and the induced neural activity — the (sparse) positioning of the electrodes, the local neural connectivity and the range of possible stimulation parameters complicate the realization of such a mapping. To obtain it, one would need to apply a huge array of different, varying stimulation parameters and record the unfolding neural activity.

Or, one could optimize the stimulation (given the speech to telepathically transmit) in a reinforcement learning setting: adjust the reward to be proportional to the similarity between the desired (temporally delayed and spatially downstream) brain pattern and the actual, induced one. So, given a speech audio to transmit and the corresponding STS neural activity of the target brain, one could stimulate A1 and subsequently measure STS activation, then match it with the desired activity; if they match well, the reinforcement learning model receives a high reward. (Neural stimulation likely contaminates immediate recording in the same brain region, which necessitates that the subsequent recording be spatially downstream, or at least temporally delayed.)
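A toy version of that reward might look like the sketch below. The device hooks (apply_stimulation, measure_sts_after) and the cosine similarity metric are my own stand-ins; the delay is there for the artifact problem noted in the parenthesis above.

    # Sketch of the reward for stimulation optimization in an RL setting.
    import numpy as np

    def cosine_similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    def stimulation_reward(desired_sts, stim_params, apply_stimulation, measure_sts_after):
        """Apply a candidate A1 stimulation, then reward the agent by how closely
        the delayed, downstream STS pattern matches the pattern that the
        speech-to-transmit should have evoked."""
        apply_stimulation(stim_params)                  # drive the A1 electrodes
        induced_sts = measure_sts_after(delay_ms=100)   # record downstream, later
        return cosine_similarity(desired_sts, induced_sts)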

Notice that none of the above-mentioned implementations of telepathy assumes or relies on a shared neural coding between the users’ auditory regions. It’s clear that some amount of overlap exists that can be exploited, especially in the early sensory areas. One may apply unsupervised learning methods to discover lower dimensional latent spaces in auditory patterns, which can then be aligned to another user’s similarly derived latent auditory space. Once aligned, translation of neural activity from one to the other becomes straightforward and won’t require intermediate transformations (aside from the stimulation problem mentioned above). This approach, at the least, presumes a similar history of auditory experience between the conversing minds, e.g. a shared mother tongue. The three gears analogy applies here as well.
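To make the alignment idea concrete, here is a deliberately simple sketch: each user’s A1 recordings are compressed with PCA, and the two latent spaces are aligned with an orthogonal Procrustes fit on responses to a shared set of calibration sounds. A real system would need nonlinear models and far more care; the structure, not the specific method, is the point.

    # Sketch: align two users' low-dimensional auditory latent spaces, assuming
    # both listened to the same calibration sounds so their latent trajectories
    # can be matched sample-by-sample.
    from sklearn.decomposition import PCA
    from scipy.linalg import orthogonal_procrustes

    def fit_alignment(a1_user_a, a1_user_b, n_components=32):
        # a1_user_*: (n_samples, n_channels) responses to the SAME stimuli
        pca_a = PCA(n_components).fit(a1_user_a)
        pca_b = PCA(n_components).fit(a1_user_b)
        z_a, z_b = pca_a.transform(a1_user_a), pca_b.transform(a1_user_b)
        R, _ = orthogonal_procrustes(z_a, z_b)   # rotate A's latents onto B's
        return pca_a, pca_b, R

    def translate_a_to_b(a1_samples_a, pca_a, pca_b, R):
        z = pca_a.transform(a1_samples_a) @ R    # A's activity in B's latent space
        return pca_b.inverse_transform(z)        # target pattern to stimulate in B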

Under unsupervised concept alignment, the transfer of complex ideas would turn into a risky business. What if the target, Alice, hasn’t heard of certain concepts, or assigns different meanings to them? In the absence of concept labels on neural patterns, we could hardly assess the accuracy of the alignment before live testing it.

In conclusion, I would anticipate that we first map the primary sensory and motor areas, before we get to multi-modal regions (integrating senses), and finally to the prefrontal cortex and the limbic system. Such a strategy would enable us to slowly but surely shift between the gears of telepathy. Our structural and functional understanding of primary sensory/motor areas is profoundly more detailed and accurate than our knowledge of any other section of the cortex, which also supports this incremental approach. Moreover, that’s the strategy on which the industry and the scientific community can converge, as direct sensory stimulation thrills the industry, while research will likely focus on the step-by-step functional mapping of brain regions, approximating them with deep learning models. As we have no explicitly defined symbols for fuzzy concepts, we will probably reverse engineer prefrontal activity to its corresponding sensory, lower-level features that have been associated with the given concepts in the past, using such derived patterns to then predict prefrontal activity in another (target) mind; aka telepathy in the third gear.

Sight restoration

In the realm of ethically unambiguous applications, I’m most excited about aiding the blind. My 5-day-long blindfolded experience gave me a marginal insight into the life of the visually impaired; even such a short exposure made me reassess the value of vision and our fundamental dependence on it.

As the types of blinding diseases and corresponding physiological alterations are numerous, finding an all-encompassing cure seems unattainable; unless you have a high-bandwidth link to the primary visual cortex (V1), in which case you happen to skip the major source of variability in said diseases: the eye. However, the visual cortex could suffer apathy and cortical reorganization from the lack of visual experience, to the extent that it’s completely or partially irreversible. This is the case for the congenital and early blind, who were born visually impaired or lost their eyesight before the age of 6. For them, the functional connectivity of the visual cortex resembles the sighted’s only coarsely: the rough polar coordinate system of eccentricity and angle is somewhat preserved, yet the fine small-scale connections are not. Without any visual input to work on, V1 becomes a bitch to other modalities, doing dirty post-processing work for them, while they get lazy and lose neural tissue in the process. We may be able to restore neural tissue by stimulation and kickstart overarching plasticity after the early years have passed, but it is doubtful that we could fully establish the vision of a 50-year-old congenitally blind person: total lack of visual experience forms a particular mind that cannot, for instance, think allocentrically.

Nevertheless, the late blind have a fair chance of regaining actual visual experience. Even hacky approaches like visual-to-auditory sensory substitution — representing visual imagery as sound — have led to enhanced visual experience for two late blind users. Second Sight has taken up vision restoration, developing two lines of products: the Argus II retinal implant and the Orion visual cortical prosthesis; the latter stimulates V1 subdurally and has been shown to evoke phosphenes for the late blind in a spatially consistent manner.

I believe it is worthwhile to exploit sensory substitution studies on the quest for sight restoration. Training the early blind to acquire a brand new sense is analogous to learning audio representations of visual information — although the interface is different (V1 vs audio), both require thorough neural plasticity to take place. Building on the sensory substitution literature, we should assemble a comprehensive training procedure for early blind sight restoration, using as much immediate, multimodal feedback as possible. Just some ideas:

  • Provide immediate feedback through other sensory inputs to consolidate plasticity: woven textures to touch, haptic screens, solid objects in hand for 3D learning.
  • Start with simple contrasts (lines, triangles, rectangles), textures (square and hexagonal lattices), objects (platonic solids), and simple directional motion; put off complex natural scenes until the final stages of the training.
  • Use depth detection to delineate parts of the visual field that are too far away to be touched, and remove them from the image (see the sketch after this list). The early blind need haptic or auditory feedback to consolidate the information stimulated onto their visual cortex: if an object in the scene cannot be reached, the association between its haptic shape and its visual 2D representation cannot be made. Increase the allowed visual field depth as the training proceeds.
  • Synthesize audio from the visual stimulus (as in visual-to-auditory sensory substitution) and play it during stimulation as an artificial auditory side dish to the stimulated image.
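The depth-gating item above could be prototyped along these lines; a minimal sketch assuming an RGB-D camera aligned with the stimulated visual field and depth values in meters.

    # Sketch: blank out pixels beyond the current reach threshold before the
    # frame is turned into V1 stimulation, so every stimulated shape can be
    # confirmed by touch.
    import numpy as np

    def mask_beyond_reach(rgb, depth, max_reach_m=0.8):
        # rgb: (H, W, 3) uint8 color frame; depth: (H, W) float32 depth map in meters
        reachable = depth <= max_reach_m
        masked = rgb.copy()
        masked[~reachable] = 0    # remove everything too far away to touch
        return masked

    # As training proceeds, max_reach_m is gradually increased, letting more of
    # the scene through once tactile-visual associations have been consolidated.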

All in all, the path to stimulating sight into the late blind seems relatively straightforward, while the training of the early blind promises to be less trivial. In both cases, there are many moving parts that I ignored here: e.g. I assumed that stimulation is applied to V1, but one could access higher visual areas instead and get away with fewer electrodes, potentially conveying the same amount of visual information. Moreover, we don’t know how much the stimulation in itself matters compared to the internal predictions, and correspondingly, to the visual experience of the mind. The majority of the input signal arriving at the visual cortex is generated internally; external stimuli are overwhelmed by the internal predictions of the mind, filling up the visual space with imagined content, which sounds counterintuitive given our everyday experience, yet it’s all too apparent under the influence of psychedelics, for instance.

Ambitious applications

Several potential applications have been discussed by the founder of Kernel, Bryan Johnson. They mainly revolve around fixing cognitive biases [?], measuring cognitive power and effort [✓], enhancing creativity [✓] or reading comprehension [✓], substituting for caffeine to maintain vigilance and arousal [✓], and feeding other minds’ sensory inputs or states into our own to walk in their shoes [?].

Eliminating the cognitive biases of high-impact decision makers is indeed a desirable goal. However, the removal of biases will inherently introduce variance into the possible set of actions we can take, or the variety of choices we will have to cycle through before arriving at a decision. One may argue that most of our cognitive biases are outdated, evolution having a hard time catching up; but in our everyday life, some are useful, or at least prevent us from spending cognitive power on finding optimal solutions when a sub-optimal, yet practically comparable one is already within reach, encoded in us.

Let’s take an example: how would one go about correcting loss aversion? What amount of additional value would you assign to the positive outcome compared to the potential loss? Even with access to brain patterns and surrounding stimuli, the brain interface falls far short of the context needed to “unbias” the human in an objective manner. While blindly encouraging risk-taking behavior is an obvious no-no, gathering more context about the sparse, unique situations people wind up in just seems intractable. In other words, we can only introduce further cognitive bias, which, if done right, could oppose our innate biases. However, the magnitude of biasing can hardly be determined objectively without comprehension of broad contextual information external to the mind being recorded. An alternative plan of attack would be to measure the amount of influence subcortical regions (like the amygdala, responsible for aggressive responses, among other emotion-fueled behaviors) have on our motor outputs, including both action and speech. Once measured, the user can be notified when the actions currently taken are prone to be limbically biased. I’m more in favor of such an approach, which makes us more aware, instead of just inhibiting limbic control.
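One crude way to operationalize that notification, purely as a sketch with made-up signals, window sizes and threshold: regress recent motor/speech-output features on recent amygdala activity and warn when the explained variance crosses a line.

    # Sketch: flag moments when motor/speech output covaries strongly with
    # amygdala activity. The 0.4 threshold and the feature windows are made up.
    from sklearn.linear_model import LinearRegression

    def limbic_bias_warning(amygdala_window, motor_window, threshold=0.4):
        # amygdala_window: (n_samples, n_amygdala_features)
        # motor_window:    (n_samples, n_motor_features)
        model = LinearRegression().fit(amygdala_window, motor_window)
        r2 = model.score(amygdala_window, motor_window)
        return r2 > threshold   # True -> notify the user of likely limbic bias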

Rapid motor learning [✓] Appropriating the Matrix trilogy, brain interface evangelists tend to rave about how the motor control skills of a martial arts expert will be uploaded to our minds in no time. These empty claims make me think of my chiropractor/puzzle games salesman buddy, who once told me that some excessively fat people may even forget how to consciously drive individual muscles in the abdomen or the arms. Not that those muscles don’t exist or function; it’s just that their coding in the motor cortex is likely entangled with the movements of other muscles, as they are mostly used in conjunction, if at all. Yet, sportsmen go through cortical motor plasticity while practicing: they build muscle and alter the neural connectivity that drives that muscle. Such plasticity takes time, which is my point; it can’t just be uploaded. Alternatively, a viable path would be to accelerate motor learning by exciting motor neurons that have previously led to a dopamine rush (an internally validated successful move), or to 3D scan the user’s movements, fit them to the desired movement, and perform subsequent brain stimulation according to the adjustments that need to be made. One could also practice dangerous, complicated movements in a virtual world by just imagining moving the muscles: do a backflip in VR before executing it in real life.
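The scan-compare-stimulate variant could be pictured as a simple comparison loop; a sketch under the big assumptions that joint angles can be captured, a reference movement exists, and the resulting per-joint errors can somehow be turned into stimulation or haptic cues.

    # Sketch: compare a captured movement to a reference and pick the joint
    # that most needs correcting; how that correction is delivered is left open.
    import numpy as np

    def movement_feedback(captured, reference):
        # captured, reference: (n_frames, n_joints) joint angles in radians,
        # already resampled to the same length (e.g. via DTW in practice)
        per_joint_error = np.abs(reference - captured).mean(axis=0)
        worst_joint = int(np.argmax(per_joint_error))
        return worst_joint, float(per_joint_error[worst_joint])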

Learning languages [✓] Equivalently, I doubt we could just upload the ability to speak foreign languages without active conscious involvement from the user over a longer period of time. Though we can definitely fake it first by decoding speech center activity into words and just running machine translation on said words (e.g. speak English internally, and the brain interface translates it to Chinese). Translated expressions could then either be fed back to the speaker as audio input or as motor cortex stimulation driving the vocal tract to pronounce the desired phrases. Such immediate feedback would hasten language learning, be it lexical or spoken, as associations between symbols of different languages can be built on the fly without much of a temporal delay. Just imagine thinking of a word and mouthing the translated version of it immediately. It’s way easier than flipping through dictionaries. Neural connections can thus be built between an expression and the retrieval and pronunciation of the corresponding foreign word. These connections will function even when the brain interface is detached.
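The “fake it first” pipeline is, in essence, three stages glued together; in this sketch, each stage is a placeholder I made up for a model or device that doesn’t exist off the shelf in this form.

    # Sketch: decode internal speech, machine-translate it, feed the result back.
    def telepathic_translation_step(speech_area_activity,
                                    decode_inner_speech,   # neural -> source-language text
                                    translate,             # source text -> target text
                                    synthesize_audio):     # target text -> waveform
        english = decode_inner_speech(speech_area_activity)
        chinese = translate(english, src="en", tgt="zh")
        return synthesize_audio(chinese)   # played back, or mapped to vocal-tract motor stimulation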

Some applications from the Wait But Why article on Neuralink: a surgeon’s scalpel as an 11th finger [✓]; decoupling sensory experience from the physical world, eating the cake and having it too (or more like not having to have the cake, yet still eating it) [✓]; extinguishing pain [✓]; seeing in the dark by delivering infrared camera frames onto the visual cortex [✓].

App-to-brain adaptation [✓] It’s simple: after enough interactions with an application (mobile or IoT), the intent of using its different functionalities, and the corresponding neural patterns, can be recorded and recognized later on. Swiping on toxic dating apps won’t burn your thumbs anymore, Facebook and the fridge will open automatically, and you will take selfies at the right moment as seamlessly as you’ll spew emoji compositions matching your exact thoughts and feelings. I’m pretty sure we could find some meaningful use-cases too. The implementation might not be that simple: at the least, we would need to spatially bias the learning models that associate brain activity with app functions; e.g. the intention to swipe should be extracted from the motor cortex.
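The “spatial bias” mostly amounts to restricting which channels a given intent model is allowed to look at; a minimal sketch with hypothetical channel indices, features and labels.

    # Sketch: let a swipe-intent classifier see only channels over motor cortex.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    MOTOR_CHANNELS = np.arange(0, 128)   # assumed electrode indices over motor cortex

    def train_swipe_detector(recordings, swiped):
        # recordings: (n_trials, n_channels) feature vectors; swiped: binary labels
        X = recordings[:, MOTOR_CHANNELS]
        return LogisticRegression(max_iter=1000).fit(X, swiped)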

Some more, illustrated

  • A Neuroengineer’s Guide on Training Rats to Play Doom
  • Slow down perceptual time = speed up perceptual processing. Given that information transfer on silicon is orders of magnitude faster than on electrochemical neural pathways, silicon shortcuts between brain regions could accelerate cortical communication, improving reaction times, for instance.
  • Mind memes.
  • Make music in your head: reinforce the imagined signals superimposed on your auditory areas, like pressing a mic to a speaker, but in a more controlled way. Why not then have a music jam session with others in a shared mind space?

I believe that truly impactful applications should enable a synergy between the brain and fast, reliable computing. It’s not about laying Google Maps directions over your visual input; it’s about a constant back and forth with rich feedback mechanisms between the brain and the device, both learning from each other. It’s about augmenting our brain, not about dumbing it down by merely exporting functionalities to the device. We have the chance to increase productivity, expand human experience, and train the biological substrate simultaneously — a chance that mobile applications largely missed, merely exploiting our tendency for addiction and instant gratification. Wearing a brain interface should improve our online and offline cognitive capabilities, rather than simply making us feel lost when detached. That’s how human nature, and accordingly, beauty, uniqueness and cognitive variability are maintained, instead of just being traded for efficiency.
