# Neural Networks: Latent Space’s *Physics* as a Loss on the Encoder

~ *if a timeseries’ latent space obeys rules, it formed abstractions* ~

**TL;DR** — *Add a loss to the encoder proportional to how **un-easy-physics-esque** the latent space’s **own behavior** is. Anneal toward a latent space which has simple rules for the timestep-to-timestep motion of the state vector, **in that latent space itself**. You can also substitute running-the-world-simulation with **running-the-mini-physics** on the state’s latent-space vector.*

When a neural network is asked to *encode* the state of the world as a compressed *feature vector*, we often find that the space those feature vectors inhabit forms Cartesian coordinates. That is, if you measure the **distance and direction** between *encoding*[QUEEN] and *encoding*[KING], that line closely matches the line between *encoding*[WOMAN] and *encoding*[MAN]! That’s a good sign that the *latent space* these feature vectors inhabit is some kind of sensible ordering of the concepts. The network has ‘made sense of things’.
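The analogy test above can be sketched in a few lines. The embeddings here are made-up toy vectors, purely for illustration; in practice they would come from a trained encoder.

```python
import numpy as np

# Toy 3-d embeddings (hypothetical values, purely illustrative --
# real embeddings would come from a trained encoder).
enc = {
    "KING":  np.array([0.9, 0.1, 0.3]),
    "QUEEN": np.array([0.9, 0.8, 0.3]),
    "MAN":   np.array([0.2, 0.1, 0.7]),
    "WOMAN": np.array([0.2, 0.8, 0.7]),
}

def offset(a, b):
    """Vector pointing from encoding[a] to encoding[b]."""
    return enc[b] - enc[a]

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# If the latent space is 'Cartesian', the QUEEN->KING line should
# closely parallel the WOMAN->MAN line.
sim = cosine(offset("QUEEN", "KING"), offset("WOMAN", "MAN"))
print(sim)  # close to 1.0 for these toy vectors
```

A cosine similarity near 1.0 between the two offset vectors is exactly the “that line closely matches the line” claim, stated numerically.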

Yet, we do not know if that sensibility is anything *meaningful*. Are they forming what we would *identify as* real abstract concepts? How might we measure if this has occurred? I don’t assume we can guarantee finding *all instances* of abstraction — instead, I propose one path which, if we *did* find a network like it, would be proof of *existence*. And, it might be darn handy.

So, we would first determine a broad *grammar and lexicon* of what we mean by “physics-esque” behavior. (And, I should clarify: these are explicit and exact *equations* which are being *discovered, tested, and re-assembled* by a mini-neural network…) Then, we form that neural network, whose job is this: *given a latent space’s set of **observed timestep-pairs** (before-after), find an ‘easy physics’ which **describes the motions observed** in this latent space*. If that little neural network cannot *easily (and early in the training regime) discover* a **simple physics**, then **punish the encoder network** a little bit, specifically in the *places* that showed up as errors in **each** rule the mini-physics network attempted.
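A minimal sketch of this training step, under heavy assumptions: the encoder, decoder, and mini-physics network are all stand-in linear layers, the “easy physics” is a single linear rule on the latent vector, and the weighting `lam` is a hypothetical annealed coefficient. None of these choices come from the essay itself; they only show where the extra loss term attaches.

```python
import torch
import torch.nn as nn

# Stand-in modules (hypothetical shapes; real ones would be deep networks).
latent_dim = 8
encoder = nn.Linear(16, latent_dim)
decoder = nn.Linear(latent_dim, 16)
mini_physics = nn.Linear(latent_dim, latent_dim, bias=False)  # one 'easy physics' rule

# A batch of observed before/after world states (random stand-ins).
x_before = torch.randn(32, 16)
x_after = torch.randn(32, 16)

z_before = encoder(x_before)
z_after = encoder(x_after)

# The mini-physics network tries to describe the motion z_t -> z_{t+1}
# inside the latent space itself.
pred_after = mini_physics(z_before)
residual = (pred_after - z_after).pow(2).sum(dim=-1)  # per-sample error sites

recon_loss = (decoder(z_after) - x_after).pow(2).mean()
physics_loss = residual.mean()

# Punish the encoder in proportion to how un-physics-esque its space is.
lam = 0.1  # annealed weight (assumption)
total_loss = recon_loss + lam * physics_loss
total_loss.backward()  # gradients flow into BOTH mini_physics and the encoder
```

The key design choice is that `physics_loss` backpropagates through `z_before` and `z_after` into the encoder, not just into the mini-physics network, so the latent space is pushed toward obeying a simple rule.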

Wait, what? So, your encoder produces a latent space. Then the mini-physics network tries to find a simple physics for that latent space. When it attempts various rules, there are **locations of error when that rule is presumed**, a loss signal for the mini-physics network, as it hunts for the right rules. Those sites of loss, accumulated, should *ALSO* be the sites receiving a loss function, on the *encoder* network! The **fewer** the number of errors, when using a *single* attempted physics, the **stronger** *the loss signal* that those errors’ sites receive, **on the** *encoder*: “Yes, encoding that input to this location *did* **allow a high-fidelity reconstruction… but it was the** *only part of the latent space* **that didn’t fit a** *polar rotation! Apply HUGE losses to the encoder, there.*”

This “greatest-loss-signal-when-nearly-perfect” **concentrates** the loss that the encoder receives from the mini-physics network’s searching, by *ignoring* the numerous physics which each produced many errors, in favor of the **almost-perfect latent-space physics’ last remaining errors**. The encoder network *anneals loss* in the most favorable direction, ‘snapping’ into a physics as it comes closer, because the *few remaining errors create a signal so strong that it* **overwhelms reconstruction losses**, forcing the *decoder* to adapt to the coherent physics’ ‘*insight*’.
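One way to realize “fewer errors, stronger signal” is to weight each candidate rule’s residuals by the inverse of its error count, so a nearly perfect rule’s last few failure sites dominate the encoder’s loss. The residual numbers and the `0.1` error threshold below are hypothetical, chosen only to illustrate the concentration effect.

```python
import numpy as np

# Per-site residuals under three candidate physics rules
# (hypothetical numbers: rule 2 is nearly perfect except at two sites).
residuals = np.array([
    [0.9, 1.1, 0.8, 1.0, 0.7, 1.2],   # rule 0: errors everywhere
    [0.6, 0.5, 0.9, 0.4, 0.8, 0.7],   # rule 1: still sloppy
    [0.0, 0.0, 0.0, 0.0, 0.9, 0.8],   # rule 2: almost-perfect
])

eps = 1e-6
error_counts = (residuals > 0.1).sum(axis=1)   # how many sites each rule fails at
rule_weights = 1.0 / (error_counts + eps)      # fewer errors -> stronger signal

# Encoder loss per site: the near-perfect rule dominates; sloppy rules fade.
site_loss = (rule_weights[:, None] * residuals).sum(axis=0)
print(site_loss.round(3))
```

With these numbers, the two sites that break the almost-perfect rule receive far larger losses than any site under the error-riddled rules, which is exactly the “apply HUGE losses to the encoder, there” behavior.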

[I should mention, as well, that this is intended for the complex, emergent, swirling world and all its strange subspaces. Yeah, neural networks have re-imagined known physics, which is fitting a set of equations to the observed behavior of the *world itself*. That’s distinct from finding a physics *of the latent space*. I’m hoping a neural network can learn *explicit logic* and *exact relationships* from YouTube videos. A latent space *physics* would be the proof by demonstration that such reasoning was occurring. Also, you may need to restrict yourself to certain subspaces of the latent space, for predictions to hold — the other variables may be stochastic! I’ll stop there. Good luck.]
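The TL;DR’s other promise, swapping the world simulation for the mini-physics, and the subspace caveat above, can be sketched together. The linear rule `A`, the initial latent state, and the `predictable` mask are all hypothetical stand-ins; the point is that prediction happens entirely in latent space and is only checked on the non-stochastic dimensions.

```python
import numpy as np

# Once a simple latent physics A is found, you can step the world forward
# cheaply in latent space instead of re-running the full simulation.
rng = np.random.default_rng(0)
latent_dim = 6
A = np.eye(latent_dim) * 0.99          # a trivially 'easy' physics: slow uniform decay
z0 = rng.standard_normal(latent_dim)

# Hypothetical mask: treat the last two dimensions as stochastic.
predictable = np.array([1, 1, 1, 1, 0, 0], dtype=bool)

def rollout(z, steps):
    """Advance the latent state with the mini-physics, not the world model."""
    for _ in range(steps):
        z = A @ z
    return z

z10 = rollout(z0, 10)
# Only compare the predictable subspace; the stochastic dims are ignored.
drift = np.abs(z10[predictable] - z0[predictable] * 0.99**10).max()
print(drift)  # ~0 for this linear rule
```

For a rule this simple the ten-step rollout matches the closed-form prediction on the predictable subspace, which is the sense in which “running-the-mini-physics” can substitute for running the simulation.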