Caster’s Baby (Part 2 & 3) {You and The Machine}

Opemipo Durodola
5 min read · Sep 18, 2018


A certain author said, “We will need to plunge even more deeply into our comparison between speech and vision if we wish to have a truly comprehensive picture of the situation … and it now seems fairly certain that both human speech and vision are implemented within the brain in stack-like fashion.”

Art and Artificial Intelligence by G. W. Smith

There is a Czech proverb that says, “As many languages as you know, so many times you are a human being.” Speaking two different languages makes me feel I am perceived differently, and I want to believe this has to do with my ability to express myself with the vocabulary each one offers. Certain events leave me with no words to describe them in one language, so I tend to combine the two. In those moments, people who understand my mixed description can empathize because we share the same multilingual ability. Yet people’s perception of me has little to do with any common language we speak, because I am no different a person in either. How completely I can express myself depends on the context, my preferred language, and whichever language adequately captures that context. How can I present an idea about something I have no word for?

Imagine a world where a machine could empathize with human feelings. Onyx’s bias was created out of syntax rather than semantics: Onyx has no basis for its bias other than the words it has acquired, words it does not actually understand. The implication is that truly understanding the words you hear is what lets you form your own bias. A structured placement of words does not, by itself, form a model for conversation about an idea; that model is formed from one’s perception of the idea, which is what leaves room for another possible word to describe it.

How is a word understood? What comes to mind when a word is uttered? A blue car, a house, a child, an office, a school. The first images that come to mind when each of these words is mentioned or thought of are your understanding of those words. This is where your imagination comes in, and it is why abstract concepts are harder to grasp until an analogy to something concrete illustrates the idea. Very often, your representation of a concept is linked to the image you saw first, or most recently. This process is very similar to how your babies actually began to understand the world you brought them into.

I propose a scheme that attempts to make machines (Artificial) form their own Artificial Natural Language, grounded in sounds and visuals, which a human (Natural) is capable of learning and which another machine could learn in the human’s place. By ascribing a sound or a group of sounds to an object, scene, or action, the machine deepens its understanding of it. With an Artificial Natural Language, a machine can understand and see things from a different perspective, allowing it to form its own empathy.
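To make this concrete, here is a minimal sketch of what ascribing invented sounds to perceived things might look like. Everything in it is an illustrative assumption: the feature vectors are random stand-ins for real perception, k-means is just one way of grouping similar percepts into concepts, and the consonant-vowel alphabet is my own choice, not part of the proposal.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative syllable inventory the agent draws from when it
# invents a "word" for something it perceives (an assumption,
# not a detail from the proposal).
CONSONANTS = list("bdgkmnpt")
VOWELS = list("aeiou")

def invent_word(cluster_id: int, length: int = 2) -> str:
    """Deterministically turn a cluster id into a pronounceable
    consonant-vowel pseudo-word, so the same concept always gets
    the same invented sound."""
    rng = np.random.default_rng(cluster_id)
    return "".join(
        rng.choice(CONSONANTS) + rng.choice(VOWELS) for _ in range(length)
    )

# Pretend each row is a feature vector the agent extracted from
# camera/microphone input for one observed entity.
observations = np.random.default_rng(0).normal(size=(200, 16))

# Group similar percepts; each group becomes one "concept".
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(observations)

# The agent's emerging Artificial Natural Language: concept -> sound.
lexicon = {cid: invent_word(cid) for cid in range(kmeans.n_clusters)}
print(lexicon)  # e.g. {0: 'paku', 1: 'nemi', ...}
```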

This scheme attempts to find semantics and context rather than correct syntactic structures. After all, syntax alone only gives birth to ambiguity. Children, for example, often make grammatical errors when they speak, yet we do not fail to find meaning in the idea they are trying to convey.

The learning agent (the machine) would be equipped with microphones and high-definition cameras, and with machine learning algorithms tailored for pattern recognition in speech streams: identifying high and low pitch, rhythm, and the loudness of syllables; matching them against visual input using timestamps; and using gesture recognition and analysis as a basis for forming opinions.
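As one possible reading of that pipeline, the sketch below pulls pitch, loudness, and rough syllable timing out of a recording, assuming the open-source librosa library. The filename is a placeholder, and treating onsets as syllable boundaries is a simplification of real syllable segmentation.

```python
import librosa
import numpy as np

# 'utterance.wav' is a placeholder recording from the agent's mic.
y, sr = librosa.load("utterance.wav", sr=16000)

# Pitch track (high vs. low) via probabilistic YIN.
f0, voiced, _ = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)

# Loudness per frame as RMS energy.
rms = librosa.feature.rms(y=y)[0]

# Rough rhythm: onset times approximate syllable boundaries, which
# can later be matched against camera frames by timestamp.
onsets = librosa.onset.onset_detect(y=y, sr=sr, units="time")

print(f"mean pitch: {np.nanmean(f0):.1f} Hz")
print(f"mean loudness (RMS): {rms.mean():.4f}")
print(f"syllable-ish onsets at: {np.round(onsets, 2)} s")
```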

In its infant stage, the machine will be an observer in a controlled environment in which a finite, known set of entities is intentionally placed. Activities would be carried out in this environment such that spoken words relate to the items present. The infant stage is complete when the agent has a persistent sound or word it utters for every item in the environment, so that it can perform object recognition without a labeled training dataset. It would then be allowed to interact with the environment, making attempts to engage with the objects it recognizes.
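One way to arrive at a persistent word per item without labeled data is cross-situational learning: count which heard words and which visible items co-occur across episodes, and keep the most reliable pairing. This is my reading of the infant stage, not a method the author names, and the episodes below are toy data standing in for the speech and vision pipelines above.

```python
from collections import Counter, defaultdict

# Each "episode" in the controlled room: the words heard and the
# items visible at the same moment (toy data).
episodes = [
    ({"ball", "red"}, {"ball"}),
    ({"cup", "blue"}, {"cup"}),
    ({"ball", "cup"}, {"ball", "cup"}),
    ({"red", "ball"}, {"ball"}),
]

# Cross-situational learning: count how often each word co-occurs
# with each item across episodes.
cooc = defaultdict(Counter)
for words, items in episodes:
    for w in words:
        cooc[w].update(items)

# The agent's "persistent word" for an item is whichever word
# co-occurs with it most reliably.
for word, counts in cooc.items():
    item, n = counts.most_common(1)[0]
    print(f"{word!r} -> {item!r} (seen together {n}x)")
```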

Gradually, foreign objects would be introduced into the environment, with their actual names uttered only occasionally. This lets us test for curiosity and speculation in the agent, and shows how quickly it can associate a word or sound with a new object. Sounds the agent utters would be noted and used in attempts to communicate back with it. Eventually, items whose names are never pronounced would be introduced, to see how the machine invents a sound for an object it has no previous word or sound for.
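A simple way to model that moment of invention is novelty detection: if a percept sits far from every concept the agent already knows, it coins a word for it. The distance threshold and the toy centroids below are illustrative assumptions, and invent_word refers back to the earlier sketch.

```python
import numpy as np

def is_novel(feature_vec, centroids, threshold=3.0):
    """Flag a percept as novel if it is far from every known
    concept centroid (the threshold is an illustrative constant)."""
    dists = np.linalg.norm(centroids - feature_vec, axis=1)
    return dists.min() > threshold

# Known concepts from the infant stage (toy centroids).
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])

percept = np.array([9.0, -4.0])  # a foreign object's features
if is_novel(percept, centroids):
    # No stored word applies, so the agent would coin one (e.g. via
    # invent_word) and store the percept as a new concept.
    print("novel object: inventing a sound for it")
```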

The next stage, the formative stage, is where we observe how the machine recognizes actions. It would be exposed to streams of repeated actions involving known objects, with those actions also repeated in real life, the expectation being that it recognizes them mostly through their consequences for known objects. This phase is an attempt to learn what certain actions do to items and how they affect their states (a small sketch follows below). After this phase, the machine would be exposed to the ambiguity of the wider world and would have to find meaning in it under the guidance of a patron, who would interact with the agent using motherese.
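The consequence-learning idea could be sketched as a frequency table over (object, action, outcome) triples: after enough repetitions, the agent knows which state change usually follows which action on which object. The author does not specify a representation, so the table and the toy stream below are my own assumptions.

```python
from collections import Counter, defaultdict

# Repeated observations of (object, action, resulting state change),
# toy data standing in for the vision pipeline's output.
stream = [
    ("cup", "push", "moved"),
    ("cup", "push", "moved"),
    ("cup", "drop", "broken"),
    ("ball", "push", "moved"),
    ("ball", "drop", "bounced"),
    ("ball", "drop", "bounced"),
]

# Learn consequences: for each (object, action), which state change
# follows most often?
effects = defaultdict(Counter)
for obj, action, outcome in stream:
    effects[(obj, action)].update([outcome])

for (obj, action), outcomes in effects.items():
    outcome, n = outcomes.most_common(1)[0]
    print(f"{action} {obj} -> usually {outcome} ({n} of {sum(outcomes.values())})")
```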

Motherese, in the context of this research, is formed by the agent and learned by the patron, who plays back sounds generated by the agent’s algorithm as it analyzes the patterns it finds in speech. The whole methodology is based on a critical study of how babies learn their first words: an attempt to create a digital approximation of every stage involved in child language acquisition. The stages are designed to cover both longitudinal and cross-sectional acquisition.
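For the patron to play back the agent’s invented sounds, some form of synthesis is needed. The sketch below is a deliberately crude stand-in for whatever synthesis the agent would actually use: it renders each vowel of an invented word as a pure tone and writes a WAV file with Python’s standard library, just enough for a patron to hear and learn the sound.

```python
import numpy as np
import wave

SR = 16000

def tone(freq, dur=0.15):
    """One short sine tone at the given frequency."""
    t = np.linspace(0, dur, int(SR * dur), endpoint=False)
    return 0.3 * np.sin(2 * np.pi * freq * t)

# Toy stand-in for speech synthesis: each vowel of an invented word
# maps to a fixed pitch (an arbitrary assignment for illustration).
VOWEL_HZ = {"a": 440, "e": 494, "i": 523, "o": 587, "u": 659}

def render(word: str) -> np.ndarray:
    return np.concatenate([tone(VOWEL_HZ[c]) for c in word if c in VOWEL_HZ])

audio = render("paku")  # an invented word from the agent's lexicon
with wave.open("agent_word.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)
    f.setframerate(SR)
    f.writeframes((audio * 32767).astype(np.int16).tobytes())
```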

In conclusion, at the end of this research one would expect a machine that can communicate its ideas without necessarily following the correct syntax of any given language; hence, a machine that can empathize. Empathize in the sense that an inanimate machine excels, to a degree, at manifesting something like warmth and compassion, an offshoot of how these qualities emanate in human minds, that mind itself being a biochemical machine. After all, symbols were not the primary means of communication for early humans; sounds were. Symbols only came in when knowledge had to be annotated on cave walls, hence the birth of semantic annotation. So why should machines be any different, if the goal is to model them to act like humans and simulate the human mind?

Caster’s Baby (Part 1) {Your Baby}
