I agree it is pretty cool. Interlinguas are used in interlingual machine translation, which makes a lot of sense: an interlingua is an abstract, language-independent representation built from a vocabulary common to the widest possible range of languages. Training a machine learning model that generalizes to unseen data is directly analogous. In fact, we can think of every machine learning model as an ‘interlingua’ that helps map raw input to a desired output (though the mapping is less direct in unsupervised learning). The ability to make predictions (to generalize) depends on having a non-explicit representation that codifies only what is most fundamental to the system. Codifying more than that means fitting noise and losing generalizability. A good model is therefore like a language-independent representation of some system we have captured through data.
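To make the noise-fitting point concrete, here's a minimal sketch (a toy linear system with hypothetical parameters chosen purely for illustration): a high-degree polynomial that codifies the training noise along with the signal tends to predict worse on unseen inputs than a simple fit that captures only the fundamental structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy system: an underlying linear signal observed with noise.
true_signal = lambda x: 2.0 * x + 1.0
x_train = np.linspace(0.0, 1.0, 20)
y_train = true_signal(x_train) + rng.normal(0.0, 0.3, x_train.shape)

# The simple model codifies only the fundamental (linear) structure;
# the high-degree model also codifies the training noise.
simple_fit = np.polyfit(x_train, y_train, deg=1)
overfit = np.polyfit(x_train, y_train, deg=12)

# Evaluate against the noise-free signal on unseen inputs.
x_test = np.linspace(0.0, 1.0, 200)
mse = lambda coef: np.mean((np.polyval(coef, x_test) - true_signal(x_test)) ** 2)
simple_err, overfit_err = mse(simple_fit), mse(overfit)
print(simple_err, overfit_err)
```

On this data the degree-1 fit comes out with the lower test error: the degree-12 model has driven its training error down by memorizing noise, and pays for it on unseen inputs.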
Good on you for pointing out the most interesting aspects of the paper. You have a natural affinity for machine learning; keep it up! Even better, go build machine learning products and see what you can come up with. No better way to learn.