Not all Artificial Intelligence is Generative AI (yet)

Rubén Rodriguez Cardos
Published in SDG Group
19 min read · Apr 8, 2024

The term Artificial Intelligence (AI) is more popular than ever. Everybody is talking about AI's quasi-divine ability to perform complex tasks with good results: ChatGPT has almost replaced Google as a search engine (and teachers too), Midjourney produces amazing drawings, and translating a video into another language in real time is almost a trivial task. But do we really know what Artificial Intelligence is? The goals of this article are to go deeper into the definition of AI, to give a brief chronology of the field, and to give an overview of the fields that exist within AI and of where Artificial Intelligence may evolve.

What is Artificial Intelligence?

Colloquially, almost anyone intuitively understands Artificial Intelligence as something like "a machine/computer that is capable of performing tasks like a human". This is not far from reality, but it is somewhat imprecise: a calculator can perform a task that a human can do, arithmetic operations, so is a calculator Artificial Intelligence? For this reason, it is common to reserve the term for systems (a system here being a program running on a computer that takes an input and produces an output) that are capable of performing increasingly complex tasks, such as classifying elements, recognizing patterns in images, or processing orders in natural language. In the computer world, these are often referred to as Intelligent Systems.

However, the EU definition is rather more complex and focused on technical aspects that are not intuitive or easily understandable:

Artificial intelligence (AI) refers to systems that display intelligent behavior by analyzing their environment and taking actions — with some degree of autonomy — to achieve specific goals. AI-based systems can be purely software-based, acting in the virtual world (e.g. voice assistants, image analysis software, search engines, speech and facial recognition systems) or AI can be embedded in hardware devices (e.g. advanced robots, autonomous cars, drones or Internet of Things applications)

In addition, there have been multiple definitions of AI over time, which focus on different aspects of AI, for example:

  • A new and exciting effort to make computers think… machines with minds, in the broadest literal sense, by Haugeland in 1985.
  • The automation of activities that we link to human thought processes and activities such as decision-making, problem-solving, learning, by Bellman in 1978.
  • The study of mental faculties through the use of computational models, by Charniak and McDermott in 1985.

Given all these possible definitions, all of which are true, it is of interest to study the classification of AI/Intelligent Systems proposed by Stuart Russell and Peter Norvig:

  • Systems that THINK like humans
  • Systems that THINK rationally
  • Systems that ACT like humans
  • Systems that ACT rationally

As can be seen, there are two dimensions to classify possible definitions of Artificial Intelligence:

  • Acting/reasoning: This dimension classifies definitions according to whether they assess the ability to reason (imitating human mental processes) or the behavior that is produced. The main difference is that acting is usually considered a quick reaction that does not take the available knowledge into account, while reasoning requires deliberating over the available knowledge and making a decision.
  • Being human/being rational: This dimension distinguishes between reproducing human thought/actions, which may not be the best possible due to internal biases such as emotions, unknown information, or physical limitations, and actual rationality, where "rational" is generally understood as "obtaining the maximum possible benefit from an action/decision".

A simple example to understand this division could be the following: a son has been accused of a crime, and the father decides to confess and take sole responsibility for a crime he has not committed so that his son can go free. The father has acted humanly, due to his personal biases, but not rationally, since he will be convicted of a crime he did not commit; the rational behavior would have been not to confess and not be accused at all. (This example is inspired by the classic Prisoner's Dilemma.)

It is also interesting to note the relationship of the field of AI with other fields, which in principle might seem unrelated, such as economics (how can I obtain the greatest possible benefit with the least possible effort in my actions), logic (how can I reason according to the available facts), linguistics (how can I understand commands), game theory (how can I interact with other systems), psychology (how can I simulate mental processes), etc.

Based on the above, we can see that the intuitive definition of AI as “Systems that are capable of acting like a human” is well-focused, although incomplete, and that there are other fields within AI that also simulate the behavior of other species, such as swarm intelligence, or purely biological behavior, such as evolutionary algorithms.

A brief chronology of Artificial Intelligence

The beginnings of Artificial Intelligence date back to the 1950s, when the necessary foundations began to appear, especially with the work of Alan Turing, who built systems that could carry out tasks that until then required human labor, such as decoding the ENIGMA cipher, and who proposed the well-known Turing Test, used to determine whether a machine is intelligent or not.

The emergence of the term Artificial Intelligence is commonly attributed to John McCarthy, who defined it as the construction of computer programs that engage in tasks that are currently more satisfactorily performed by human beings because they require high-level mental processes such as perceptual learning, memory organization, and critical reasoning.

First Golden Age of Artificial Intelligence, 1950s and 1960s

During the 50s and 60s there was great initial enthusiasm, as multiple successful projects began to appear in the field of Artificial Intelligence, for example the General Problem Solver (GPS), which through the application of logical rules would in theory be able to solve any problem the way a human would. Another major milestone of this era was the emergence of LISP, which became a major programming language in Artificial Intelligence, especially in the field of Expert Systems. Despite the advances, much of this work remained at a theoretical level, as there was not enough computing power to implement what was proposed; for example, neural networks appeared in the 60s but could not be implemented until decades later.

First Winter of Artificial Intelligence, 1970s

Due to the great initial success of Artificial Intelligence, many experts of the time assured that within the next 10 years we would have computers capable of thinking, creating, and learning. This was mainly due to overconfidence built on good results in simple problems; most of these systems failed when they started to work with more complex or real problems.

Another decisive factor in the failure of these systems when working with real problems was the lack of real knowledge in many of the fields in which they worked, for example, in the first real translation systems many problems were encountered because most of the developers were not linguists, and were not able to deal with complex aspects of language, such as idioms, dialects, polysemous words, or ironies.

Some theories also appeared in this period that, if well focused, were capable of solving quite complex problems, such as genetic algorithms, which initially proposed that by generating a set of small mutations in the machine code of any program it was possible to obtain a program with good performance. This type of theory could not go any further at the time due to the inability to handle the combinatorial explosion of possible changes it produced.

Second Golden Age of Artificial Intelligence, 1980s

The late 1970s and early 1980s saw the emergence of Expert Systems, introduced by Edward Feigenbaum, which initiated this second golden age of Artificial Intelligence, often referred to as the "era of symbolic models". This approach is based on developing search mechanisms to find specific solutions to domain-specific problems.

In general terms, an Expert System is a system in which formalized knowledge from one or more experts on a specific problem/domain is available and can be exploited by means of a set of logical rules to obtain conclusions. They were very successful, especially in the field of Medicine. One of the most successful and well-known Expert Systems was MYCIN, which was able to diagnose infectious blood diseases: MYCIN reasoned on the basis of "facts", the patient's symptoms, and through a set of "rules", the knowledge provided by doctors, determined which disease the patient suffered from and prescribed a treatment.

Expert Systems began to be adopted by industry in the mid-1980s and remained highly successful until almost the end of the 1990s. In the late 1990s and early 2000s, due to the exponential increase in computing power, the field of Machine Learning became popular, because its methods are able to obtain acceptable results from data sets, recovering techniques proposed in the 1960s, such as Artificial Neural Networks, and proposing new ones, such as decision trees. The field has become so popular that the ease of use and acceptable results of the "models" generated by machine learning create a new problem: results are obtained, but the reason for those results is, in most cases, unknown, since most of these models are not explainable.

Second Winter of Artificial Intelligence, 1990s and 2000s

Although in the 80s and 90s Expert Systems, and in part the beginning of the popularisation of Machine Learning, made the field of Artificial Intelligence very successful again, the lack of new proposals meant that the only significant advances were re-imaginings of existing methods and techniques, which eventually reached their limits. For example, Expert Systems became increasingly difficult to implement because the problems and domains they had to work with grew larger and more complex, making their development technically and economically unfeasible.

The Third Golden Age of Artificial Intelligence, today

Currently, due to the even greater capacity for computation, storage, and generation of large volumes of data, Machine Learning techniques, and specifically Neural Networks, have been pushed to the limit, thus entering a "third golden age of Artificial Intelligence" (although this label is still under discussion). This has created a new paradigm called "Generative AI", in which a single model is capable of carrying out multiple tasks, whereas until now it was more optimal to generate one model per task. However, this paradigm also has aspects under discussion that could be improved: training one of these models requires large volumes of data, in the order of terabytes or petabytes; it takes a long time to train them; and they are not very explainable. Another reason why it is often said that we are in the "third golden age of Artificial Intelligence" is its popularisation and integration into everyday life, going from something used practically only at an academic/industrial level to something used by the general public in everyday tasks, such as information search/text generation with ChatGPT, image generation with Midjourney, and other similar projects.

Artificial Intelligence fields

Although, indeed, there are currently many fields within AI, the following are commonly considered to be the main fields:

  • Natural Language Processing: How we can interact with a computer using natural language, or how to make a computer understand natural language.
  • Knowledge Representation: How the available knowledge can be formalized and stored.
  • Automatic Reasoning: How it is possible, on the basis of previously stored knowledge, to use it in order to obtain new knowledge and draw new conclusions.
  • Machine Learning: How to learn based on different experiences/data in order to improve performance based on them, being able to extract patterns.
  • Computer Vision: How it is possible to perceive specific objects or patterns in images.
  • Robotics: How it is possible to manipulate, modify or move objects and/or other types of entities, both in the real world, as hardware robots, and in digital environments, as software robots.

In this article, we will only present some of the best-known and most widely used of these fields, from their appearance up to the present day.

Machine Learning

The field of Machine Learning is one of the most popular fields of Artificial Intelligence at the moment. The general definition of this field could be “Creating systems/models that are able to improve their performance based on previous data sets”, in other words, systems that are able to learn to perform a task based on a set of training data.

Generally, this field is further divided into:

  • Unsupervised Learning: This group contains methods and techniques that carry out their training without direct human intervention, using only the data provided, commonly described by their "features". These methods focus on "clustering" the existing elements into similar groups, "dimensionality reduction" of the existing data in order to process it in a simpler/more efficient way, and "association", which aims to find possible associations between existing features. The image below shows an example of how such techniques work, in which existing elements are gathered into groups based on the similarity of their characteristics.
Unsupervised Learning
  • Supervised Learning: This group contains methods and techniques that require direct human intervention for their training, in the form of "labelled data": each of the existing data points, described by its features, has an associated "label" that assigns it a category or value. These methods focus on "classifying" new data (assigning a class/category) based on the learning done, or on calculating the "value" that new data will have (commonly called inference). The image below shows an example of how this type of technique works: based on "labelled" data, in this case the colour of each circle, we can infer the colour of a new case, in this example the circle without colour.
Supervised Learning
  • Reinforcement Learning: This group contains methods and techniques that carry out their training following an "action and reward" scheme: the system decides which action to carry out and obtains a reward depending on its performance, so that over time, aiming for the best possible reward, it learns to take the best possible action. This way of training is quite similar to human learning.
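As a minimal sketch of the supervised idea above (the points and labels are hypothetical, and plain Python is used instead of any ML library), a nearest-neighbour classifier assigns a new point the label of the closest labelled point:

```python
import math

def nearest_neighbour_label(labelled_points, new_point):
    """Return the label of the labelled point closest to new_point.

    labelled_points is a list of ((x, y), label) tuples; this mimics the
    'coloured circles' example, where each training point carries a label.
    """
    closest = min(labelled_points, key=lambda item: math.dist(item[0], new_point))
    return closest[1]

# Hypothetical training data: two clusters of "blue" and "red" points
training = [
    ((1.0, 1.0), "blue"), ((1.2, 0.8), "blue"), ((0.9, 1.1), "blue"),
    ((5.0, 5.0), "red"), ((5.2, 4.9), "red"), ((4.8, 5.1), "red"),
]

# The "uncoloured circle": a new point whose label we infer
print(nearest_neighbour_label(training, (1.1, 1.0)))  # lands in the blue cluster
```

Real supervised methods are far more elaborate, but they share this shape: labelled examples in, a predicted label for unseen data out.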

One of the most popular methods currently in the field of Machine Learning, within Supervised Learning, is the Artificial Neural Network: a directed graph composed of artificial neurons that tries to mimic the behavior of the human brain. Below is a graphical example of how, based on a set of training data, a network could infer the class of a new element from its characteristics; the image contains some symbols and concepts that will be explained in more depth in future articles.

Example of an Artificial Neural Network
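To make the notion of an artificial neuron concrete, here is a minimal sketch of a single neuron with a step activation function; the weights below are chosen by hand (not trained) so that the neuron behaves like a logical AND:

```python
def artificial_neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of the inputs plus a bias,
    passed through a step activation function (fires if the sum is positive)."""
    activation = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 if activation > 0 else 0

# Hand-picked weights that make the neuron compute a logical AND:
# it only fires when both inputs are 1 (1 + 1 - 1.5 > 0)
and_weights, and_bias = [1.0, 1.0], -1.5
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", artificial_neuron([a, b], and_weights, and_bias))
```

A real network connects many such neurons in layers and learns the weights from data instead of fixing them by hand.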

Today, Artificial Neural Networks are at the heart of so-called Generative AI, since, for the most part, Generative AI is based on neural networks. The rise of Generative AI is commonly associated with Generative Adversarial Networks (GANs), which are actually two artificial neural networks working together: originally, one of these networks took an image as input, modified it as much as possible, and passed the modified image to the other network, which tried to find out whether the image had been modified or not. Over time, the first network learns how to modify images without the second one detecting it (i.e. it learns how to cheat), and the second learns how to detect whether an image has been modified (i.e. it learns how to detect cheating).

Knowledge Representation and Automated Reasoning: Knowledge Engineering

These two fields are usually closely related, since the first is about how to formalize the available knowledge and the second about how to use this knowledge to reach conclusions and obtain new knowledge; the most common field combining them is Knowledge Engineering and, in particular, Expert Systems. An Expert System has a symbolic model of knowledge and is able to obtain conclusions based on it. These systems usually have three components:

  • Fact base: This stores the facts that exist in our context, as defined by experts. For example: the temperature in a room is 45 degrees; wood is a combustible element; etc.
  • Rule set: This represents the logical reasoning that exists in the context, which the experts usually define. For example: to produce a fire you need oxygen, a temperature above 40 degrees, and a combustible element.
  • Inference engine: This is the component that applies the logical rules to the facts in order to obtain new knowledge. For example: since in the room where we are the temperature is above 40 degrees and there is a combustible element, wood, a fire can be produced.
  • There may be other components, such as graphical user interfaces to interact with the system, but these vary depending on the system and/or context.

In general, Expert Systems have been very successful, using languages such as Prolog or Lisp, but they require domain experts and require formalizing their knowledge into a usable symbolic model. This has reached such a degree of complexity that it has become necessary to carry out feasibility studies before building such a system. The following is an example of how a simple context, the possibility of a fire starting, could be implemented in Prolog:

% It is considered high temperature if it is over 40 degrees
high_temperature(Temperature) :- Temperature > 40.

% Elements that are considered combustible
fuel_element(wood).
fuel_element(oil).
fuel_element(cotton).

% A fire can start if there is a high temperature and a combustible element
fire_started(Temperature, Element) :- high_temperature(Temperature), fuel_element(Element).

Enter the current context with the query:

% We are in a room at 45 degrees and with wood
?- fire_started(45, wood).

By entering this query, and using the inference engine, you would get:

?- fire_started(45, wood).
true.    % Oh, better get out of that room

This indicates that, based on the facts introduced, the temperature (45 degrees) is above the 40-degree threshold and there is a combustible element, wood, so a fire can start.
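For readers less familiar with Prolog, the same facts and rules can be sketched in plain Python (the function and set names below are illustrative, not a real inference engine):

```python
# Facts defined by the experts: which elements are combustible
FUEL_ELEMENTS = {"wood", "oil", "cotton"}

def high_temperature(temperature):
    """Rule: it is considered high temperature if it is over 40 degrees."""
    return temperature > 40

def fire_started(temperature, element):
    """Rule: a fire can start if there is a high temperature
    and a combustible element."""
    return high_temperature(temperature) and element in FUEL_ELEMENTS

print(fire_started(45, "wood"))  # the room from the example: a fire can start
print(fire_started(30, "wood"))  # too cold for a fire
```

The difference with a real Expert System is that here the rules are hard-coded as functions, whereas an inference engine treats facts and rules as data and can chain them to derive conclusions that were never written down explicitly.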

Natural Language Processing

As previously exposed, Natural Language Processing consists of human-computer interaction through natural language. This field began in the 1940s, although new objectives were added over time, fundamentally the comprehension of natural language at a higher level and scale, going from tasks focused on simple interpretation or translation between languages to much more complex tasks, such as summarising texts, detecting key terms within collections of documents, or discovering the topics that a collection of documents deals with.

Furthermore, when working with natural language, certain aspects arise that complicate its processing: not all languages have the same grammar, with simpler languages, such as English or Russian, and more complex ones, such as Spanish or Portuguese, as well as variations within a language itself, such as dialects, regional expressions, terms with different meanings depending on the region, etc. Perhaps the most complex issue within Natural Language Processing is the processing of ambiguity: polysemous words (words with several meanings), irony, and measuring how similar two words/concepts are remain some of the great challenges still to be resolved.

To give an overview of this field, some of the most interesting and widely used concepts and techniques will be presented:

  • Stopword: A stopword is a word that lacks semantics but is used for grammatical reasons; articles, prepositions, and adverbs are usually stopwords. In the sentence "The dollar falls against the yen", the word "The" lacks semantics but is used for grammatical reasons.
  • Named Entity Recognition: This technique consists of detecting words in a text that refer to entities existing in the real world; for example, in the sentence "The dollar falls against the yen", the words "dollar" and "yen" are currencies that exist in the real world.
  • Tokenisation: This technique consists of dividing a text into different "tokens" (a token can be understood in this case as a word and/or concept). If a text is not divided correctly it can lose part of its meaning; for example, words/terms that include blanks or special characters could be split into several parts and lose their meaning.
  • Stemming: This technique consists of reducing a word to its morphological root (the stem) to facilitate its processing, eliminating aspects of the word that could complicate it, such as plurals or conjugated verbs. An example: the stem of the word "running" is "run".
  • Ontologies: This technique consists of the creation of a knowledge model called an "ontology", which contains a set of terms, their definitions, and the relationships between them. For example, medical vocabulary is an example of an ontology, in which there is a multitude of clearly defined and related terms. One of the most complex and interesting problems that arise in the use of ontologies is the similarity relationship between different terms; for example, "monitor" is a synonym of "screen", but it is also a synonym of "teacher", and depending on the context it is more feasible to exchange one word for the other. In the sentence "I have bought a new monitor" it is easy to change the term "monitor" for "screen", but it would not make sense to exchange it for "teacher".
  • Reverse Index: This technique consists of indexing, for each term, the documents in which it appears; for example, the term "brave" appears in the documents identified as 1, 5, and 7. This technique is useful when working with large collections of documents, making it possible to quickly retrieve the relevant documents based on the terms they contain.
  • Topic Modelling: This technique consists of creating a series of topics for a collection of documents based on the terms that appear in them.
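Several of these techniques (tokenisation, stopword removal, and a reverse index) can be combined in a short sketch; the mini-corpus and stopword list below are purely illustrative:

```python
# Hypothetical mini-corpus: document identifier -> text
documents = {
    1: "The dollar falls against the yen",
    5: "The brave investor buys the dollar",
    7: "A brave decision",
}

# Illustrative stopword list: words kept only for grammatical reasons
STOPWORDS = {"the", "a", "against"}

def tokenise(text):
    """Tokenisation: split a text into lowercase word tokens."""
    return text.lower().split()

def build_reverse_index(docs):
    """Reverse (inverted) index: map each non-stopword term to the
    identifiers of the documents in which it appears."""
    index = {}
    for doc_id, text in docs.items():
        for token in tokenise(text):
            if token not in STOPWORDS:
                index.setdefault(token, set()).add(doc_id)
    return index

index = build_reverse_index(documents)
print(sorted(index["brave"]))   # documents mentioning "brave"
print(sorted(index["dollar"]))  # documents mentioning "dollar"
```

Production systems add stemming, named entity recognition, and much smarter tokenisation on top of this scheme, but the core lookup structure is the same.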

Soft Computing

While this is not one of the main fields discussed above, it is an interesting field to mention, as it has achieved some success and popularity due to the different techniques and methods it includes. The field of Soft Computing is a set of methods and techniques that are able to deal with, or are tolerant to, imprecision, uncertainty, partial truth, and approximation in order to find solutions that are tractable, robust, and computationally inexpensive.

This field began to be defined to handle the inaccuracy of language with Lotfi A. Zadeh's work on fuzzy logic, which states that an element does not simply belong or not belong to a group, but has a degree of membership in that group between 0 and 1. For example, is someone who is 1.75m in height tall? Each person would probably give a different answer, so the concept of "tall" is fuzzy. Based on this concept, others have been introduced to handle fuzzy concepts, such as Mamdani- and Sugeno-type systems, in which membership intervals are defined for each of the possible categories. Continuing with the previous example, the concept of "tall" refers to the height of a person, so, working within that context, we could establish the following membership intervals or categories:

  • Short: the maximum membership in this group (membership value equal to 1) is reached at 1.50m or less, and the minimum (value 0) at 1.60m or more.
  • Medium: the maximum membership (value 1) is reached between 1.60m and 1.70m, and the minimum (value 0) at 1.50m or less and at 1.80m or more.
  • Tall: the maximum membership (value 1) is reached at 1.80m or more, and the minimum (value 0) at 1.70m or less.

The following image shows graphically what these membership intervals would look like.

Fuzzy representation of height categories, according to the intervals of membership

Based on the defined membership intervals it is possible to assign a category to a person. For example, someone who is 1.72m tall would belong to the different categories with the degrees shown in the following image; the highest membership value is in the Medium class, so he/she would be assigned to this category.

Assignment of a person to a height category according to the intervals of membership.
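The membership intervals above can be written as simple piecewise-linear membership functions; the sketch below covers only the fuzzification step, not the full Mamdani machinery:

```python
def ramp_down(x, full, zero):
    """Membership 1 up to `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)

def ramp_up(x, zero, full):
    """Membership 0 up to `zero`, rising linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)

def height_memberships(height):
    """Degree of membership of a height (in metres) in each category,
    following the intervals defined in the text."""
    return {
        "short": ramp_down(height, 1.50, 1.60),
        "medium": min(ramp_up(height, 1.50, 1.60), ramp_down(height, 1.70, 1.80)),
        "tall": ramp_up(height, 1.70, 1.80),
    }

memberships = height_memberships(1.72)
print(memberships)
print(max(memberships, key=memberships.get))  # the best-fitting category
```

For 1.72m this yields zero membership in Short, a high degree in Medium, and a small degree in Tall, matching the assignment shown in the image.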

How to add intelligence to your system, a small practical example.

As previously exposed, Artificial Intelligence can be understood as a system able to make decisions in which the maximum benefit is obtained with the least effort; we have called this rationality. In many Intelligent Systems (commonly called AIs), this concept is also often called a performance measure, which serves to measure how good our intelligent system is. So a first approach to making any system intelligent is to design this performance measure, or rationality.

To demonstrate how we could design this performance measure and how it makes our system intelligent, we will use a classic example of a "greedy algorithm", which always takes the best partial solution. In this case we want to load packages, each with a value and a weight (in kg), onto a forklift with a fixed capacity, so our performance measure will be the value we can obtain within the capacity of the forklift. Let's suppose that the following packages are available and the capacity of the forklift is 10 kg.

=== Generated bundles ===
Bundle weight: 7 & value: 5
Bundle weight: 5 & value: 4
Bundle weight: 4 & value: 3

The greedy solution would be to load the highest-value package first; in this example we would load the value-5 package, thus exhausting the capacity of the forklift (no other package fits) and obtaining a total benefit of 5. How can we make this solution smart? An intuitive idea that a human might have when faced with this problem is to move the packages that offer the most benefit (the value of the package) versus the effort required to move them (the weight of the package), in other words, to move the packages according to their value/weight ratio. Continuing with the same example:

=== Generated bundles ===
Bundle weight: 7 & value: 5, ratio: 0.71
Bundle weight: 5 & value: 4, ratio: 0.8
Bundle weight: 4 & value: 3, ratio: 0.75

The smart solution would have us load the 0.8-ratio bundle first and then the 0.75-ratio bundle, using 9 of the 10 kg of capacity and obtaining a total value of 7, so our performance measure, the value/weight ratio, has improved the value obtained. Below is an example in Python of how this problem could be implemented, with the greedy solution and the smart solution.

import random


class Bundle:

    def __init__(self):
        self.weight = random.randrange(1, 10)
        self.value = random.randrange(1, 25)
        self.ratio = self.value / self.weight

    def __str__(self):
        return f"Bundle weight: {self.weight} & value: {self.value}, ratio: {self.ratio:.2f}"


TROLLEY_CAPACITY = 10
N_OF_BUNDLES = 10


def main():

    # Generate bundles
    bundle_list = []
    for _ in range(N_OF_BUNDLES):
        bundle_list.append(Bundle())

    print(" === Generated bundles ===")
    for bundle in bundle_list:
        print(bundle)

    # Greedy version

    ## Order the bundles by value
    greedy_bundle_list = sorted(bundle_list, key=lambda x: x.value, reverse=True)
    greedy_bundle_list_carried = []

    ## Take bundles, skipping any that would exceed the capacity
    weight_carried = 0
    for bundle_to_carry in greedy_bundle_list:
        if weight_carried + bundle_to_carry.weight <= TROLLEY_CAPACITY:
            greedy_bundle_list_carried.append(bundle_to_carry)
            weight_carried += bundle_to_carry.weight

    ## Results
    print(" === Greedy bundles ===")
    greedy_value = 0
    for bundle in greedy_bundle_list_carried:
        print(bundle)
        greedy_value += bundle.value
    print(f"Total value carried: {greedy_value}")

    # Smart version

    ## Order the bundles by value/weight ratio
    smart_bundle_list = sorted(bundle_list, key=lambda x: x.ratio, reverse=True)
    smart_bundle_list_carried = []

    ## Take bundles, skipping any that would exceed the capacity
    weight_carried = 0
    for bundle_to_carry in smart_bundle_list:
        if weight_carried + bundle_to_carry.weight <= TROLLEY_CAPACITY:
            smart_bundle_list_carried.append(bundle_to_carry)
            weight_carried += bundle_to_carry.weight

    ## Results
    print(" === Smart bundles ===")
    smart_value = 0
    for bundle in smart_bundle_list_carried:
        print(bundle)
        smart_value += bundle.value
    print(f"Total value carried: {smart_value}")


if __name__ == '__main__':
    main()

I hope you enjoyed this article, future articles will explore in more depth the concepts of each of the AI fields that have been presented.
If you are interested in a particular one, please post it in the comments!

Thanks for reading.
