How Aliens Speak (Part 1)

Tim KD
Published in ILLUMINATION
7 min read · May 20, 2024

Have you ever wondered what language two people from entirely different cultures would use to speak to each other? What common rationale would govern any exchange of information between two sentient species?

Prompt by Calum on MidJourney

Let’s make this both easier and tougher by laying down the following baseline assumptions:

  1. They can only speak through modulated frequencies.
  2. They are a technologically advanced species.

Introduction to LLMs like ChatGPT

A large language model (LLM) is the inference engine behind ChatGPT, Llama, and Grok. It is trained on a vast body of text, something like a library of dictionaries and encyclopedias, and it answers questions by predicting the most likely next words from the patterns it learned there.

The interesting part about LLMs is their use of word associations. I think in much the same way: I see patterns in the world. From a young age, I associated some of these patterns with words, such as the letter A with the word apple. As I got older, the associations became more personal. The three letters SNV, for instance, are the name of a company one of my friends used to work for, so when I see SNV, I remember that friend.

The funny part about LLMs is that they see the world just like that. They learn which words tend to appear together and infer the meaning or context of a group of words from the other words in that same family of meaning. Orange and apple would both be found in the fruit family, but the LLM wouldn’t necessarily know what a fruit is.
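To make the idea of word families a little more concrete, here is a toy sketch in Python. It is not how ChatGPT or any real LLM works internally (those learn their associations from enormous amounts of text); it simply counts which words share a sentence in a tiny made-up corpus and compares the counts. The corpus and the names context_vector and cosine are invented for illustration.

```python
from collections import Counter
from math import sqrt

# Tiny made-up corpus; a real model is trained on vastly more text.
SENTENCES = [
    "i ate a sweet apple",
    "i ate a sweet orange",
    "she peeled a ripe orange",
    "she peeled a ripe apple",
    "he drove a fast car",
    "he parked a fast car",
]

def context_vector(word):
    """Count the words that share a sentence with `word`."""
    counts = Counter()
    for sentence in SENTENCES:
        tokens = sentence.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return counts

def cosine(a, b):
    """Cosine similarity between two count vectors (1.0 means same direction)."""
    dot = sum(a[k] * b[k] for k in a)  # a missing key in a Counter counts as 0
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

apple, orange, car = (context_vector(w) for w in ("apple", "orange", "car"))

# Apple and orange keep the same company, so they land in the same family.
print(cosine(apple, orange) > cosine(apple, car))  # True
```

Nothing in the sketch knows what a fruit is; apple and orange end up close together only because they appear in the same kinds of sentences.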

We can use LLMs to derive some level of abstract meaning and to break down complex physics terminology into simple ideas already established in our minds. For example, ask one to explain the Fourier transform to a grade four student, and this is what it might say:

Alright, imagine you have a big box of LEGO bricks. Each LEGO brick represents a different sound or piece of music. Now, let’s say you want to understand what makes each piece of music special. The Fourier transform is like a special machine that helps you break down each piece of music into its different LEGO bricks or building blocks.
You see, every piece of music is made up of different notes, just like how a LEGO house is made up of different types of bricks. The Fourier transform helps us understand how many of each type of note, or LEGO brick, is in the music and how strong each one is. This helps us understand if the music is loud or soft or if it has a lot of high-pitched sounds (like a whistle) or low-pitched sounds (like a drum).
Just like you can use LEGO bricks to build many different things, you can use the Fourier transform to understand many different types of sounds or music. It’s like having a magic tool that helps us understand the secrets hidden inside every piece of music!
-ChatGPT
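To see the LEGO analogy in numbers, here is a minimal Python sketch of a naive discrete Fourier transform. The signal is a made-up "chord" of two notes, and the function name dft_magnitudes and the chosen note frequencies are assumptions for illustration only.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the strength of each frequency bin in `samples` (naive DFT)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))) / n
        for k in range(n // 2)
    ]

# A made-up chord: a loud note at 3 cycles plus a quieter note at 7 cycles.
N = 64
signal = [
    1.0 * math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.sin(2 * math.pi * 7 * t / N)
    for t in range(N)
]

mags = dft_magnitudes(signal)

# The two strongest bins are the two notes we mixed in.
loudest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
print(sorted(loudest))  # [3, 7]
```

The transform hands back the two "bricks" the chord was built from, along with how strong each one is.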

The self-interpreting code

It is not enough to simply share a large language model with an alien, for we would have to make a lot of assumptions:

i. The alien is able to understand our computer machine language, the language of logic gates and control units. Given some sort of manual, this would enable it to translate the binary form of the LLM.

ii. The alien would have to spend time encoding all the written data of its civilization in the same LLM before the model could help it decode our language.

Humans learn foreign languages by making word associations that matter. That is how we were able to crack Sumerian cuneiform and Ancient Egyptian hieroglyphics. We suspect the words Ptah and Pneuma might have some correlation because of the way they are spoken (phonetics) and the context in which they are used (to refer to a spirit).

With Sumerian, Akkadian-Sumerian dictionaries helped, but with Ancient Egyptian writing we relied on phonetic similarity with the Coptic language. That was how one researcher, Jean-François Champollion, was able to infer the name Ramses, which appeared in a recurring hieroglyph. The Rosetta Stone, a Ptolemaic decree written in hieroglyphic, Demotic, and Greek script, helped crack it further.

What are the basic words which we would need to establish, for us to communicate properly with an alien race? How do we communicate their meaning?

Computer Vision

A useful, if simplified, way to picture self-driving software such as Tesla’s Full Self-Driving is that it assigns some measure of “danger” to the differently labelled objects it tracks. A bag on the road, for instance, might be given a danger weighting of 0, while a rock or a pedestrian might be given 10. In effect, the “solidness” of an object is encoded as a danger parameter.

Solids, liquids, and gases

In that same way, I’d argue the first step in encoding the basic structure of our lingual world map is to encode what we identify as solids, liquids, and gases. A scientifically advanced sentient species talking to another should at least have established the existence of the basic elements of matter, so this would not be hard to do.

Step One

Hydrogen would be encoded in Binary (or Morse) as one binary digit (i.e., a one or a zero) representing its number of protons, followed by another binary digit, let’s say 1, representing its atomic weight, which is also one. Hydrogen would therefore appear on our communication channel as 11 or 00, depending on which symbol we choose.

The binary would then be transmitted by modulating a wave, with each digit mapped to a particular frequency or amplitude.
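One way to picture this step is frequency-shift keying, where each binary digit becomes a short burst of one of two tones. The Python sketch below is only an illustration of that idea: the sample rate, the bit length, the two tone frequencies, and the function names are all assumptions, not part of the scheme described in this article.

```python
import math

SAMPLE_RATE = 8000                  # samples per second (assumed)
BIT_SAMPLES = 80                    # samples spent on each bit (assumed)
FREQ = {"0": 1000.0, "1": 2000.0}   # tone for each digit, in Hz (assumed)

def modulate(bits):
    """Turn a bit string into wave samples, one tone burst per bit."""
    wave = []
    for bit in bits:
        f = FREQ[bit]
        wave.extend(math.sin(2 * math.pi * f * t / SAMPLE_RATE) for t in range(BIT_SAMPLES))
    return wave

def tone_energy(chunk, freq):
    """Measure how strongly `chunk` matches a tone at `freq` (one Fourier bin)."""
    s = sum(x * math.sin(2 * math.pi * freq * t / SAMPLE_RATE) for t, x in enumerate(chunk))
    c = sum(x * math.cos(2 * math.pi * freq * t / SAMPLE_RATE) for t, x in enumerate(chunk))
    return s * s + c * c

def demodulate(wave):
    """Recover the bit string by picking the stronger tone in each burst."""
    bits = []
    for i in range(0, len(wave), BIT_SAMPLES):
        chunk = wave[i:i + BIT_SAMPLES]
        bits.append(max(FREQ, key=lambda b: tone_energy(chunk, FREQ[b])))
    return "".join(bits)

helium = "10100"  # Helium as atomic number 10 followed by atomic weight 100
print(demodulate(modulate(helium)) == helium)  # True
```

The receiver needs no dictionary for this step, only the ability to tell two tones apart.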

Next would be Helium, with an atomic number of two and an atomic weight of four. This would be conveyed as 2–4, or in Binary as 10–100.

Lithium is three and six when its atomic weight is rounded down, and so forth until the last element we wish to convey. The final element we share might be interpreted as the last element we are confident about sharing to facilitate our basic discussion, or as the last element we are aware of, indicating how advanced or primitive our technology is.

I would propose we end at element 20, Calcium, because by then we have enough to continue communicating our intent.

This would be our first signal sequence of the 20 periodic table elements:

11–24–36–49–510–612–714–815–918–1020–1122–1224–1326–1428–1530–1632–1735–1839–1939–2040

or in binary:

11–10100–11110–1001001–1011010–1101100–1111110–10001111–100110010–101010100–101110110–110011000–110111010–111011100–111111110–10000100000–10001100011–10010100111–10011100111–10100101000 [1].
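The sequence can be generated with a few lines of code. What follows is a minimal Python sketch, not the C++ program mentioned in footnote [1]; the names ELEMENTS, encode_element, and encode_sequence are made up for illustration, and the weights are the standard atomic weights rounded down.

```python
# (atomic number, atomic weight rounded down) for Hydrogen through Calcium
ELEMENTS = [
    (1, 1), (2, 4), (3, 6), (4, 9), (5, 10),
    (6, 12), (7, 14), (8, 15), (9, 18), (10, 20),
    (11, 22), (12, 24), (13, 26), (14, 28), (15, 30),
    (16, 32), (17, 35), (18, 39), (19, 39), (20, 40),
]

SEPARATOR = "–"  # the en dash used between elements in this article

def encode_element(number, weight, binary=False):
    """Write the atomic number immediately followed by the atomic weight."""
    fmt = "b" if binary else "d"
    return format(number, fmt) + format(weight, fmt)

def encode_sequence(binary=False):
    """Join all twenty encoded elements into one signal string."""
    return SEPARATOR.join(encode_element(z, w, binary) for z, w in ELEMENTS)

print(encode_sequence())             # decimal form, starting 11–24–36
print(encode_sequence(binary=True))  # binary form, starting 11–10100–11110
```

Oxygen, for example, is 8 and 15, which comes out as 815 in decimal and 1000 followed by 1111 in Binary.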

Step Two

We would now proceed with sending a signal that clusters only the gaseous elements in our sequence together, i.e. elements with the atomic numbers 1, 2, 7, 8, 9, 10, 17, 18. This would help us establish two things:

One is that these elements share something in common, which on Earth is that they remain gaseous at room temperature.

Two, that we are going to use this idea of “gas” to aid our discussions from here on out.

We may as well cluster the remaining elements to define what is solid, i.e. those with the atomic numbers 3, 4, 5, 6, 11, 12, 13, 14, 15, 16, 19, 20.

So our signal now transmits two things:

1.) The first twenty elements of the periodic table according to their atomic numbers and masses.

11–10100–11110–1001001–1011010–1101100–1111110–10001111–100110010–101010100–101110110–110011000–110111010–111011100–111111110–10000100000–10001100011–10010100111–10011100111–10100101000.

2.) A signal that encodes the solidness and gaseous property:

The cluster of atomic numbers 12789101718, the atomic numbers of all our gases, would be followed by a unique number, say 100, to define what gas is, while the cluster of solids would be followed by 200.

So, (without their atomic weights):

12789101718100 (gases), 34561112131415161920200 (solids), or:

1278910171810034561112131415161920200 (gases and solids).

This means from now on, whenever we use the numbers 100 and 200 in Binary, they will know we are referring either to that family of elements or to the gaseous or solid property itself.
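The clustering step can be sketched the same way. In this minimal Python example the names GASES, SOLIDS, and encode_cluster are invented for illustration, and Potassium (19) is counted among the solids because it is solid at room temperature.

```python
GASES = [1, 2, 7, 8, 9, 10, 17, 18]
SOLIDS = [3, 4, 5, 6, 11, 12, 13, 14, 15, 16, 19, 20]
GAS_TAG, SOLID_TAG = 100, 200  # the property numbers proposed in the text

def encode_cluster(atomic_numbers, tag, binary=False):
    """Write the atomic numbers back to back, then the property number."""
    fmt = "b" if binary else "d"
    return "".join(format(n, fmt) for n in atomic_numbers) + format(tag, fmt)

print(encode_cluster(GASES, GAS_TAG))               # 12789101718100
print(encode_cluster(SOLIDS, SOLID_TAG))            # 34561112131415161920200
print(encode_cluster(GASES, GAS_TAG, binary=True))  # 11011110001001101010001100101100100
```

Each cluster reuses atomic numbers the receiver has already seen, so the only new symbol it has to learn is the trailing property number.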

How will we communicate liquids?

We will use Mercury, of course, since it is liquid at room temperature. We might even take advantage of familiar compounds such as water, or methane, which is a gas on Earth but a liquid on colder worlds such as Titan, and choose to group them together.

Mercury would be represented as 80200: 80 for its atomic number and 200 for its atomic weight. If we use Mercury to define liquids, we might also need to pick another number to define solids, not 200, so that the two do not collide. Alternatively, we could leave it as is, assuming they would understand that 200 now means two things, like Orange the colour and the fruit, or Apple the fruit and the company.

Liquids would, in this case, be clustered with 80200 for Mercury, 1181511 for Hydrogen-Oxygen-Hydrogen, i.e., water, and 11612111111 for Methane.

80200–1181511–11612111111–300.

300 would be our property for liquids.
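As a rough check on the liquids cluster, here is a minimal Python sketch that spells each molecule out atom by atom. The names WEIGHTS, encode_atoms, and encode_liquids are hypothetical, and the atom order for methane (Hydrogen, Carbon, then three more Hydrogens) simply follows the digits used above.

```python
# atomic number -> atomic weight rounded down, for the atoms used here
WEIGHTS = {1: 1, 6: 12, 8: 15, 80: 200}

MERCURY = [80]
WATER = [1, 8, 1]          # Hydrogen-Oxygen-Hydrogen
METHANE = [1, 6, 1, 1, 1]  # Hydrogen, Carbon, then three more Hydrogens
LIQUID_TAG = 300           # the property number proposed for liquids

def encode_atoms(atomic_numbers):
    """Write each atom as its atomic number followed by its weight."""
    return "".join(f"{z}{WEIGHTS[z]}" for z in atomic_numbers)

def encode_liquids():
    """Join the three liquids and the liquid property number."""
    parts = [encode_atoms(m) for m in (MERCURY, WATER, METHANE)]
    return "–".join(parts + [str(LIQUID_TAG)])

print(encode_liquids())  # 80200–1181511–11612111111–300
```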

Hoorah! We have basically communicated solids, gases, and liquids as part of our universally decipherable self-interpreting code, which would look like this:

11–24–36–49–510–612–714–815–918–1020–1122–1224–1326–1428–1530–1632–1735–1839–1939–2040–12789101718100–34561112131415161920200–80200–1181511–11612111111–300

or in Binary:

11–10100–11110–1001001–1011010–1101100–1111110–10001111–100110010–101010100–101110110–110011000–110111010–111011100–111111110–10000100000–10001100011–10010100111–10011100111–10100101000–11011110001001101010001100101100100–111001011101011110011011110111110000100111010011001000–101000011001000–100101100 [2].

In fact, you might not even need to convert to Binary if your signal can carry the different decimal digits and modulate each one slightly differently. In the receiver’s defense, however, Binary would be the easiest to interpret.

We can now communicate with aliens.

How do we now define the things that matter? What we’ve achieved is good: much as a self-driving car assigns some measure of importance to the solidity of what it sees, we have established the fundamental axioms for what we are about to say. But what about emotion, empathy, worldviews, morality, religion?

Photo by Rick Han on Pexels

Author’s note: I stumbled on this question of how to make a self-interpreting code after re-reading The Three-Body Problem, which I highly recommend.

Footnotes

[1] For the C++ code of how I generated an output of Hydrogen to Calcium in Binary, refer to this link or the code below, which can be run on any online compiler such as W3schools.

[2] I left out the compounds, just in case you wondered.
