The Boring State of Artificial Intelligence

kh al
24 min read · Apr 17, 2023


I recently had a debate with a friend of mine about the intelligence and rapid advancement of AI models in recent months.

While some people are in awe of AI, I don’t quite share their excitement or fear of its current advancements. And so… to make things interesting we made a bet that I would fall for the current AI hype within six months 🙃️.

So, I thought it would be fun to write this blog post and share my perspective in the hopes of convincing a few other bored souls to join my side of the conversation!

Disclaimer N 1 :: This is my first time … writing a blog post (or anything else, to be honest), so I’ll do my best 😤.

Chapter 1: Why are people so hyped/frightened about AI?

The first issue is what I like to call the “Wikipedia scientist” syndrome:

This refers to the common tendency for people who lack knowledge in a particular subject to believe that reading one or two articles on the topic makes them an expert. However, this is far from the truth. True mastery of a subject requires much more than just a surface-level understanding. It takes a strong theoretical foundation and, most importantly, a humongous amount of practice to develop real expertise.

If you want to become an artist, for example, it will take more than just reading a few books. You’ve got to grab a pen, a brush, or whatever you fancy and start creating. Draw, paint, experiment, and embrace failure until you start to get the hang of it. It takes months, if not years, of hard work and dedication to become a true artist. Of course, if you’re a genius, you might get there a bit quicker!
Similarly, reading a blog post, a Wikipedia article, a book, or even a hundred books about rocket science or artificial intelligence won’t make you an expert.

However, the real problem arises when some of these Wikipedia scientists have a large audience of followers -sometimes very religious followers-. This can be particularly concerning, as their lack of genuine expertise and experience can result in the spread of misguided advice and misinformation. It is important to be discerning about the sources of information we rely on and to seek advice from true experts.

The second issue is “The Early Optimistic” Syndrome :

This tends to occur whenever a new technology first appears. For example, when cars were first invented, people immediately began dreaming of flying cars and futuristic cities, believing that these innovations would appear overnight. More than a hundred years later and ….

With AI, this phenomenon has occurred for the N-th time in history. Over the past century, new breakthroughs in AI have arisen every few years, leading people to become overly excited and start making wild predictions about sentient AI, robots taking over, the end of the world, and other bla-bla-bullshit.

Disclaimer N 2 :: Let’s get one thing straight right off the bat. I am by no means an expert. I’ve been working as a Data Scientist/Engineer for ~6 years now and I still lack a lo-ooooooo-t. So most of what I’ll be sharing is my own personal understanding of the matter and not the absolute truth. Take it with a grain of salt.
With that said, buckle up and let’s dive into the boring/bored and very opinionated realm of what I think!

Chapter 2: What is Artificial Intelligence?

Now that introductions have been introduced let’s dive into the meat of the matter!

So what is AI?

Artificial intelligence -in this case, I’m referring mostly to the family of neural network algorithms- is a universal function approximator.

That’s it?

No. For now, that’s all that matters and what you should remember more than anything else for the rest of this blog post and the rest of your life!
If you want a more boring yet comprehensive definition of AI you can just Wikipedia it 😉

Disclaimer N 3:: The term “art” is broad and sometimes subjective, encompassing a wide range of creative expressions such as visual arts, performing arts, literature, and more. In this context, I will be specifically referring to 2D art as it is currently the primary medium for Generative Art models.

Chapter 3: Midjourney the Artist AI?

Our initial exploration will delve into Generative Art models, with a particular emphasis on examining the extent to which they can be considered true artists.

So what is it that makes a true Artist?

That’s an interesting question😤! In my opinion, there are three main components that make a great artist: technical skills, creativity, and control.

  • Technical skills involve a deep understanding and mastery of art fundamentals, such as color theory, form, lighting, anatomy, and composition…. All of these art fundamentals can be defined as pure and beautiful mathematical functions.
    Fundamentals such as anatomy require a high level of precision with very little margin for error - if an artist draws the nose or the hand even slightly inaccurately, it will be noticeable and may detract from the overall quality of the art piece. However, intentionally distorting or exaggerating these features can be a deliberate artistic choice that requires an even greater mastery of anatomy. This level of skill allows an artist to manipulate the human form to their will, resulting in a unique and expressive style.
    Other fundamentals, like color theory, can be inferred. Many artists discover or rediscover the different equations of color through experimentation, which allows for greater creativity and innovation in their work.

Below is an example of an anatomical study by Michelangelo.

The study of human anatomy, in particular, is considered one of the most challenging technical skills for artists to acquire. Some artists even dedicate their entire lives to mastering this subject.

  • Creativity is the personal and unique characteristic that defines an artist’s style and imagination. It’s what sets them apart from other artists and allows them to create something truly original and captivating.
  • Control is one of the most challenging skills for any artist. It requires a combination of technical and creative expertise at their highest level. Control is the ability to intentionally manipulate every brush stroke on the canvas with the purpose of eliciting a desired emotional response from the viewer — whether it be deception, anger, or joy. Every aspect of the artwork, including the colors, lighting, and posture, has a specific meaning and intention behind it. A truly skilled artist is able to execute this level of control with finesse and precision, resulting in a work of art that is both captivating and meaningful.

An example of an art piece by Piotr Jabłoński, one of my favorite digital artists, who demonstrates exceptional mastery in the areas of lighting, color, composition, anatomy, and a unique, unsettling imagination.

Alfred Broge’s “Morning Sunlight” is a beautiful traditional painting that showcases the artist’s mastery of his craft. In fact, many of Broge’s works are considered masterpieces because of the exceptional care and precision he put into every detail. There is very little to criticize in his paintings, as they are executed with such skill and artistry.

So, can Midjourney be called a true Artist?

In short, the answer is yes.
The reason I have chosen to focus on Midjourney as opposed to Stable diffusion or Dall-e is because I believe the individuals behind the Midjourney model possess a keen sensitivity to art.

In my opinion, the quality of the art produced by Midjourney can be attributed to both the dataset and loss function used during training. While Stable Diffusion and Dall-e were trained using a brute-force approach, which prioritizes quantity over quality of images during training, it’s possible that Midjourney was trained on a curated dataset. Additionally, it’s possible that some of the art fundamentals were incorporated into the loss function used during training, penalizing the model for producing images that don’t meet a certain standard of quality.

By exclusively being trained on high-quality art pieces, the model is not even familiar with mediocre art.

But that’s not enough. While Midjourney may produce amazing art pieces, it is far from being even a good artist.

Let’s evaluate it based on the various components that make an artist.

  • When it comes to technical skills such as lighting, color, and composition, Midjourney’s performance is impressive and may even surpass that of many existing artists.
    However, its performance in terms of anatomy and overall structure is not as strong. The model has been observed to generate some extremely poor anatomical structures for humans and other living or imaginary creatures. Additionally, there are noticeable issues with the overall structure of the generated drawings, such as irregular shapes, misplaced elements, and a lack of coherence.
  • In terms of Creativity and Imagination, it is very hard to judge.
    Imagination is viewed as the ability to form mental images or concepts of things that are not present in the physical environment or that have never been experienced before.
    However, this definition can be misleading as it implies that imagination is a mystical or innate ability that requires a special kind of genius.
    In art, imagination is referred to as a “visual library”. To improve their imagination skills, artists often begin by studying existing subjects, such as animals, objects, and scenery, and learning to manipulate them through drawing from different perspectives, under different lighting conditions …. Once an artist has mastered a subject, it becomes part of their visual library and can be incorporated into future works, either wholly or partially. Building a large visual library provides artists with the ability to combine elements at will, resulting in the creation of imaginary creatures and fantastical scenery.
    With that in mind, it’s clear that Midjourney possesses an incredibly vast visual library compared to any human artist, having been trained on millions of images. Thus, when it comes to imagination, Midjourney reigns supreme.
  • Although Midjourney possesses an extensive visual library that enables it to produce highly imaginative and precise images, it does not actually create art. It is the prompter who provides the input and guidance to generate a specific image. Developing one’s own personal visual library can be a challenging and time-consuming process. Midjourney can be seen more as a tool that enables artists to skip the arduous process of building a personal visual library and instead focus on the art of the matter.
  • In terms of control, there isn’t much to discuss, since Midjourney doesn’t independently create art. Even if we omit this part, the generated art lacks a specific direction or style, it is just random noise shaped to mathematically fit a certain input.

Disclaimer N 4:: Art is a highly subjective field, and while certain fundamentals may be considered more essential than others, there is no set formula for creating great art. Midjourney apart, many artists deliberately choose to focus on a limited number of fundamentals and can still create astonishing pieces.

In this art piece by Nathan Fowkes, the focus is not on anatomy or structure. Instead, this piece is a study of color and light, with the artist choosing to abstract any details that do not contribute to this goal. As a result, the piece showcases Fowkes’ ability to use color and light in a way that captivates the viewer, while also demonstrating his willingness to prioritize certain fundamentals over others.

On the other hand, Steve Huston, an anatomy master, has created a masterful painting that focuses mostly on the subject’s anatomy and gesture. In this piece, Huston demonstrates his skill in capturing the intricate details of the subject’s anatomical structure while also conveying a sense of movement and expression through the subject’s gesture. The painting showcases Huston’s ability to prioritize and emphasize specific artistic elements, in this case, anatomy and gesture, to create a powerful and engaging work of art.

In conclusion, while Midjourney’s capacity for generating high-quality images is impressive, it cannot be qualified as an artist in the traditional sense of the term. Midjourney lacks the creative spark, intentionality, and emotional resonance that characterize human-made art. The art generated by Midjourney is not the product of personal expression or cultural context, nor is it part of any larger artistic tradition. Midjourney generates images based on statistical patterns it has learned from its training dataset, without any underlying purpose or meaning. While these algorithms can simulate various styles and techniques, they cannot replicate the nuanced and subjective decision-making that goes into human-made art.

Disclaimer N 5:: I chose to start the discussion on AI with the topic of generative art to establish a common ground for all readers. Art is something that everyone has some degree of familiarity with. By doing so, it will be easier for everyone to comprehend the concepts being discussed …

Chapter 4: Sparks of AGI (Artificial General Intelligence) in chatGPT?

To begin, it’s important to address the concept of intelligence and its definition. Next, we’ll compare it to the current AI models and the way they function to finally try and answer the question of AGI.

Intelligence is a fascinating and intricate concept that has sparked countless debates among psychologists, philosophers, and computer scientists over the years. While there is no single, all-encompassing definition of intelligence, there is a global-ish consensus that intelligence involves a range of mental abilities, including reasoning, planning, problem-solving, abstract thinking, comprehending complex ideas, learning quickly, and applying knowledge gained from experience. -source: https://arxiv.org/pdf/2303.12712.pdf.

The paper titled “Sparks of Artificial General Intelligence: Early Experiments with GPT-4” provides a detailed analysis of each characteristic of intelligence and its relevance to artificial intelligence.
So let’s take a different approach.

To define intelligence in a manner that enables easy comparison and highlights the differences between human and machine intelligence, I would propose the following definition:

I believe that Intelligence can be described as the ability to use existing information in novel ways, either by applying it to a different field or by creating entirely new knowledge and extending the boundaries of the information domain.

OK, let’s see it in action to get a better understanding:

One of the first events in human history, and what set us apart from other species, is fyre-🤓-manipulation. We were able to manipulate existing information -in this case, the information was natural occurrences of fire in nature- to our advantage and create entirely new uses for it.
By using fire, we started to cook our food, shape metal, and carry out countless other activities that were previously nonexistent.

While it may appear to be a simple task on the surface, in reality, it is incredibly complex and demands a tremendous amount of intellectual capacity. This is why I believe that the ability to manipulate and harness information in new manners is what intelligence is.

A more visual representation of this concept is shown below. The black box represents the entirety of information that exists. The blue portion represents the combined information domain of all humans, and the white dots represent the personal information domain of a single individual -the scale is not accurate or relevant in this representation-.
One of the key factors that distinguishes humans from other beings is our unique ability to expand our domain of information beyond what is currently known or understood.

To gain a better understanding of neural networks, let’s construct a simple theoretical network and build upon it to explore the workings of artificial intelligence.

Our initial network will be a basic one that can perform simple addition. This network will consist of two inputs that will take in two numbers, and the output will be the sum of those two numbers.

If you remember well, neural networks are universal function approximators. But what does this mean exactly?
From the moment it is created, a neural network has the ability to approximate any function -as complex as the model is built to support; in this case, our NN is very small and can only learn relatively simple functions-. However, the initial output will not be correct, as the function has not been trained yet. This is where the process of training comes in.

To train a neural network, we need to provide it with a set of training data and a loss function. The loss function serves as an objective for the model to minimize, guiding the training process. During training, the network updates the weights of its neurons to create better approximations. The output of the network is compared to the ground truth output -which is the desired output- and the model adjusts the weights of the neurons to minimize the difference between the output and the ground truth.

In essence, training a neural network involves teaching it to improve the weights of its neurons, thereby making the initial approximate function more useful.

In this example, the NN did not truly learn how to perform addition, but rather, it learned a highly accurate approximation function that mimics addition. This function is simply an encoding of the training dataset. Consequently, the network is only capable of performing addition and lacks the ability to perform other basic arithmetic operations such as subtraction or multiplication.
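To make this concrete, here is a minimal toy sketch of the idea -my own illustration, not any specific production network-: a one-hidden-layer network trained with plain gradient descent to approximate addition on numbers in [0, 1]. Inside its training range the approximation is close; far outside it, the "addition" falls apart, because the network only encoded the dataset, not the concept.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: pairs (a, b) drawn from [0, 1], with target a + b.
X = rng.uniform(0, 1, size=(1000, 2))
y = X.sum(axis=1, keepdims=True)

# One hidden layer of 16 tanh units, randomly initialized.
W1 = rng.normal(0, 0.5, size=(2, 16))
b1 = np.zeros(16)
W2 = rng.normal(0, 0.5, size=(16, 1))
b2 = np.zeros(1)

lr = 0.1
for _ in range(10000):
    h = np.tanh(X @ W1 + b1)      # hidden activations
    pred = h @ W2 + b2            # network output
    err = pred - y                # difference from the ground truth
    # Backpropagate the mean-squared-error gradient and update the weights.
    gW2 = h.T @ err / len(X)
    gb2 = err.mean(axis=0)
    gh = err @ W2.T * (1 - h**2)
    gW1 = X.T @ gh / len(X)
    gb1 = gh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

def predict(a, b):
    h = np.tanh(np.array([a, b]) @ W1 + b1)
    return float(h @ W2 + b2)

print("inside training range:", predict(0.3, 0.4))    # close to the true 0.7
print("outside training range:", predict(50.0, 50.0)) # nowhere near 100: tanh saturates
```

The second print is the interesting one: the network never "understood" addition, so once the inputs leave the region covered by the training data, the approximation breaks down completely.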

To address this limitation, we can modify the network architecture to support additional operations. We can introduce a third input that specifies the type of operation to be performed -e.g., addition, subtraction, multiplication, or division-. We can also increase the number of neurons in the network to enable it to encode more complex information.

By incorporating these changes, we can train the neural network with our new dataset so that it learns to perform multiple arithmetic operations and enable it to learn a broader range of functions.
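As a sketch of what that modified training setup could look like -again my own toy illustration-, the operation can be fed to the network as a one-hot input alongside the two numbers:

```python
import numpy as np

rng = np.random.default_rng(1)
OPS = {0: np.add, 1: np.subtract, 2: np.multiply, 3: np.divide}

def make_example():
    a, b = rng.uniform(1, 10, size=2)     # operands (kept > 0 to avoid division by zero)
    op = rng.integers(0, 4)               # which operation this example teaches
    one_hot = np.eye(4)[op]               # e.g. [0, 1, 0, 0] for subtraction
    x = np.concatenate(([a, b], one_hot)) # 6-dimensional input: two numbers + op selector
    y = OPS[op](a, b)                     # ground-truth output
    return x, y

X, y = zip(*(make_example() for _ in range(1000)))
X, y = np.array(X), np.array(y)
print(X.shape, y.shape)  # (1000, 6) (1000,)
```

The network itself would be the same kind of model as before, just with a wider input layer and more neurons; what changes is that the dataset now encodes four functions instead of one.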

Disclaimer N 6:: Drawing these NN graphs is very boring and annoying. I will assume that you now have enough knowledge to imagine the next descriptions of NNs yourself.

Now let’s improve our model further. Instead of just feeding it two numbers, let’s use natural language as input to perform arithmetic operations. For example, we can ask our model, “Can you please give me the sum of 12 and 35?” and it will respond with a natural language output like, “Sure, the result of 12 + 35 is 47.” To achieve this, we can utilize a large language model -a variant of GPT-, and train it to perform this specific task.

Can our model be considered intelligent now?

No, our NN model is just a calculator, albeit a less reliable one than a standard calculator app. The real value of large language models like GPT lies in their ability to understand and process natural language input, which enables us to perform tasks like arithmetic operations without the need for a separate calculator application.

We can continue to train our model to perform other tasks, such as text summarization, translation, writing style transfer, poetry generation, law document classification, drug discovery analysis, and more. As we continue to train and expand our model, it will become more versatile and capable of handling a wide range of tasks.

The capability of our large language model to perform a vast number of tasks does not necessarily make it more intelligent than the previous calculator model. Instead, it is -mostly- the result of its size, which enables it to encode a large amount of information and develop an approximate function capable of handling different tasks.

This is why I do not find large language models particularly impressive. The tasks that ChatGPT can perform were already possible with previous neural network models. For instance, text summarization, translation, and law document classification were tasks that other models could perform exceptionally well. The only difference now is that we have a single model that can perform all these tasks.

One of the challenges that arise with LLM networks is their tendency to hallucinate. This is probably due to some of the neurons within these models attempting to learn multiple tasks simultaneously, such as addition and text summarization, becoming just noise and leading to incorrect outputs in some cases. Another issue that comes with LLMs is that it becomes more difficult to maintain and optimize the model’s performance across different tasks.

To address this issue, we build even larger models, which can minimize the interactions between neurons, allowing specific sections of the neural network to focus solely on a particular task.

Let’s explore some of the capabilities -and incapabilities- of an LLM model -in this case, ChatGPT Mar 23 Version- through some fun examples.

The actual result of the operation is 1999999982.99999998.

In this case, the model was unable to perform a simple addition operation correctly, demonstrating two distinct failures.

  • Firstly, it rounded off the result yet still included numbers after the decimal point. A result of 1999999983 would have been accepted. This highlights one of the limitations of LLM models when it comes to even the most fundamental tasks. The model’s inability to perform a simple addition operation accurately suggests that it may have overfit the training data, resulting in an approximation function that is insufficiently flexible.
  • Furthermore, the model’s failure to reason appropriately is also evident in this example. It unnecessarily performs the entire calculation, whereas in this case, a simple equation of 99 - 23 + 3 + 3 = 82 and concatenation of the remaining numbers would have been enough.

The incapability of LLM models to reason accurately about basic problems is another significant limitation. Reasoning is a fundamental skill that once learned can be applied across various contexts and is not restricted to a single specific case.

Neural network models are very good at approximating the behavior of an existing system, they are not necessarily capable of reproducing or comprehending it, nor are they capable of generalizing it for other purposes. Thus, although it may appear that the model is reasoning, in reality, it is simply mimicking the behavior of a system in a certain case.

I later tried to simplify the operation, and even started a new session.

And here is another different example.

Well,

There are several issues with the model’s response. It includes irrelevant information about ethics and gives unhelpful advice about putting a banana on top of a guitar and tries to brag about how it knows that leaving a banana can attract insects. Moreover, the model mistakenly assumes that “Stongbunroftak” is a living being, even though it is a completely made-up word.

This highlights another boring limit of Neural Networks, which is the impact of the training dataset. The model has been trained to prioritize political correctness and safety, causing it to perceive anything slightly unconventional as dangerous and harmful. In this case, the model assumes that I have a nose, a guitar, a banana, a chinchilla, and a Strogbunroftak -whatever it thinks that is- lying around, and that I am trying to solve the very complex life issue of stacking them.

The second issue we see here is related to the functioning of our model. Unlike our initial calculator model, GPT and other LLM models are designed for next-token prediction -a token can be a word or a group of words depending on how the training data has been set-. At every iteration, the model attempts to predict the most likely next word to follow the given input. As demonstrated in the previous example with irrelevant information about safety and bananas, the model may generate several words that could potentially follow the input, but it cannot confidently determine the correct one that leads to a meaningful result.

This limitation is also apparent when false facts are produced by chatGPT or similar models. This can be attributed to the training datasets that the models learn from. For example, if the model is provided with a statement claiming that the Earth is flat, it may learn that there is a probability that “the Earth is…” can be followed by “flat.” Additionally, the model doesn’t differentiate between scientific papers and blog posts or tweets written by random users; to the model, it’s all just text. This means that if a false fact is presented in a training dataset, the model might learn to reproduce it as if it were true, regardless of the source’s credibility.
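Both points can be seen in a toy bigram model -a deliberately crude stand-in for a learned next-token distribution, and my own illustration-: the model only knows which tokens tend to follow which, weighted by their frequency in the training text, with no notion of truth.

```python
from collections import Counter, defaultdict

# A tiny "training corpus" containing both a true and a false statement.
corpus = ("the earth is round . the earth is flat . "
          "the earth is round . the sky is blue .").split()

# Count which token follows which.
follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1

def next_token_distribution(token):
    # Normalize the follow-counts into probabilities.
    counts = follows[token]
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

# "flat" gets a real probability mass simply because it appeared in training.
print(next_token_distribution("is"))  # {'round': 0.5, 'flat': 0.25, 'blue': 0.25}
```

Real LLMs learn vastly richer conditional distributions over long contexts, but the underlying mechanic is the same: likely continuations, not verified facts.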

Disclaimer N 7:: I mentioned GPT above as it is the core model behind chatGPT. However, chatGPT incorporates several additional techniques and hacks to function effectively as a chatbot. For more information on these techniques, I highly recommend reading the following article or visiting OpenAI’s website.

Also, during the training of ChatGPT, more weight was given to data from Wikipedia, making the model see it as a more valuable source of information - and also making it our biggest “Wikipedia scientist” 😉.

Let’s see another chatGPT example

The level of intelligence displayed here is truly remarkable, bordering on the edge of what could be perceived as a threat to human existence. Is it?

Let’s make it a tad more complex.

I tried another time in a new session and …

Well,

While LLM models excel at generating highly convincing text, it’s important to note that the output is not the product of some superintelligent entity engaging in complex reasoning or formulating intricate formulas. As the example above is extended beyond the common cases encountered during training, the model’s ability to accurately predict the output decreases and it begins generating text with whatever seems probable enough.

One reason why Large Language Models like chatGPT might appear more impressive to some than models that specialize in a single task is that it is difficult to perceive the boundaries of their information domain.
When a model is designed for a specific task, its information domain is more clearly defined -A model that has been trained only on text summarization can only do that-.

Earlier, we defined intelligence as the ability to expand our knowledge domain by utilizing existing information.

How well does AI perform in this regard?

AI is more similar to the intelligence of other animals than it is to human intelligence. AI is akin to a monkey being taught a trick in exchange for a treat -that is how AI is trained- and what it learns is merely a distorted approximation that meets our desired output. In other words, AI does not truly understand concepts such as addition, subtraction, or text summarization. Rather, it produces a mathematical estimate to satisfy our desired results.

AI also cannot expand its information domain beyond what it has been trained on. For instance, if Midjourney has only been trained on realistic photographs, it would never be able to create images in the styles of Van Gogh or John Singer Sargent.
Similarly, if a model is trained to perform a specific task such as addition, its information domain is limited to just that. Even if you add other tasks like subtraction and multiplication, the model will only be able to perform those tasks and never tasks like text summarization or health diagnosis. Adding a hundred more tasks will not expand the information domain of the model beyond those tasks.

The image above highlights the information domain of models such as Midjourney and ChatGPT. One key distinction between these models and intelligent beings is that the models are limited to their existing knowledge, whereas a creature possessing intelligence has the capacity to transcend and expand its information domain. However, AI models are exceptionally proficient at interpolating within their respective domains. For instance, a model like Midjourney can produce images mimicking the style of any artist it has been trained on, in addition to creating combinations of those styles with varying levels of nuance. Our domain of information encompasses the entirety of human knowledge and experience, not a single individual, as depicted in the image above.

To showcase chatGPT’s interpolation skills, I asked it to generate a poem about my two loves — cats and chocolate. However, the resulting poem turned out to be a bit dark due to the model’s training data, which warns of the dangers of feeding chocolate to cats and other bla-bla-bullshit.
— Enjoy 😑

While the model can generate poetry on any subject, it is important to note that its capabilities are limited by the information it has been trained on. Interpolation can only go as far as the encoded information allows.
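The same point can be made with ordinary curve fitting -my own toy illustration, not an LLM-: a model fit on a limited range of data tracks the truth well inside that range and drifts arbitrarily far from it outside.

```python
import numpy as np

# "Training data": 50 samples of the true function on [-1, 1] only.
x_train = np.linspace(-1, 1, 50)
y_train = np.sin(np.pi * x_train)

# Fit a degree-9 polynomial - flexible enough to interpolate very well.
coeffs = np.polyfit(x_train, y_train, deg=9)

def model(x):
    return np.polyval(coeffs, x)

def true_fn(x):
    return np.sin(np.pi * x)

inside = abs(model(0.37) - true_fn(0.37))  # inside the training range: tiny error
outside = abs(model(3.0) - true_fn(3.0))   # far outside it: the fit explodes
print(f"error inside: {inside:.5f}, error outside: {outside:.1f}")
```

Interpolation within the covered domain is excellent; extrapolation beyond it is not a slightly worse version of the same skill, it is a different problem the model was never equipped for.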

If an AI model such as chatGPT were trained on data from a time before humans discovered fire, it wouldn’t be able to discover cooking or other technologies that require knowledge beyond that time period. This underscores the complexity of true intelligence and the limits of what machines can accomplish without human guidance and training.

In fact, the information domain of an AI model is often more limited than we might assume. For instance, consider self-driving cars. While these models may be trained on millions of images of cars and roads, the training dataset is usually limited to specific locations and scenarios. As a result, the model may fail to operate successfully in unfamiliar environments, even if it has performed well in its original training dataset. For example, a self-driving car model trained primarily on roads in the United States may not perform as well in other countries with different road designs and driving habits.

One of the most influential Wikipedia scientists of our time once said: “Humans drive with eyes; add eyes to cars, cars drive like humans.” However, this statement demonstrates a complete lack of understanding from that person. As we have discussed in this blog article, neural networks are a fragile technology that is subject to various limitations and challenges.
Creating self-driving cars is a highly complex task that requires much more than simply replicating human driving skills. Instead, it involves improving upon these skills in many different ways. That’s why more serious companies are using additional sensors, such as Lidars and radars, to provide an extra layer of information and security. These sensors allow self-driving cars to better perceive and understand their environment, which is critical to their safe operation.

Another example of the fragility of neural networks is face recognition technology. As you may recall, individuals of different ethnicities have faced issues with having their faces recognized correctly. One might assume that since the model has been trained on millions of faces, it should be exceptionally accurate in recognizing all types of faces. However, the reality is quite the opposite.
Despite being trained on millions of faces, the model’s performance is biased toward a specific ethnicity that is most present in the training dataset. As a result, the model has only learned to recognize faces that match this particular type, and those belonging to other ethnicities may not even be within its information domain.
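A crude sketch of this dataset-bias effect -synthetic points in a toy "embedding space", not a real face model-: with a 95/5 training split, query faces from the under-represented group sit much farther from any face the model has seen.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "face embedding" space: group A contributes 950 training faces,
# group B only 50 - a 95/5 split like many skewed datasets.
train_A = rng.normal([0, 0], 1.0, size=(950, 2))
train_B = rng.normal([6, 6], 1.0, size=(50, 2))
templates = np.vstack([train_A, train_B])

def mean_nearest_template_distance(queries):
    # Distance from each query face to its closest known face - a crude
    # stand-in for how well the model "covers" this part of face space.
    d = np.linalg.norm(queries[:, None, :] - templates[None, :, :], axis=2)
    return d.min(axis=1).mean()

query_A = rng.normal([0, 0], 1.0, size=(200, 2))
query_B = rng.normal([6, 6], 1.0, size=(200, 2))

gap_A = mean_nearest_template_distance(query_A)
gap_B = mean_nearest_template_distance(query_B)
print(f"mean gap, group A: {gap_A:.3f}  group B: {gap_B:.3f}")  # group B is larger
```

The minority group's region of the space is sparsely covered, so new faces from that group land farther from everything the model knows - exactly the "not even within its information domain" failure described above.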

While I may come across as an AI critic in this blog post, my intention is to provide a balanced perspective on the subject. I do not intend to diminish the value of current AI models, but rather to dispel any exaggerated notions of their capabilities. I am quite excited about the recent advancements in AI and eager to see what lies ahead.

To me, these models are akin to a natural language query system, opening up new possibilities for human-computer interaction. Instead of having to learn and use multiple software solutions, AI can serve as a universal interface for various tasks.

While AI may pose a threat to certain professions and income sources, it is important to remember that our intelligence is not defined by our jobs -in many cases, our jobs involve imitating AI by repeatedly performing mundane tasks to fulfill certain societal needs-. There are numerous boring tasks that I would gladly automate to focus on more interesting work.

When it comes to Intelligence, AI models nowadays do not even possess the level of intelligence that our ancestors had when they first learned to manipulate fire hundreds of thousands of years ago. AI -as is today- should be viewed as a tool to augment us, rather than replace us.

Chapter 5: Conclusion.

There is no doubt that the advancements in AI are impressive, and despite the current limitations, the future of AI looks very promising. The direction we are heading towards is AI becoming a Universal User Interface for interacting with our computers rather than AI becoming sentient. The end game of that can potentially lead to an Operating System that does not rely on the multitude of applications and software that we currently use, but instead, only requires a Natural Language or Voice interface to perform all the tasks that we do today.

Thank you for taking the time to read this opinionated blog post. As I wrap up, here are some final thoughts:

  • AGI is currently a concept that is more akin to science fiction than reality. Although goats possess great athletic abilities, they cannot match the beauty and grace of a human athlete. While they may excel in certain areas, it is very debatable whether that can truly be considered athleticism.
  • Creating a genuine AGI demands technology that is still largely in the realm of science fiction. Self-awareness, emotions, and motivation are intrinsic characteristics that cannot be learned through parameters. While AI may appear to be self-aware, this is merely a consequence of being trained on human data such as conversations. The resulting text may seem self-aware because humans possess self-awareness, and our conversations reflect this trait.
  • While increasing the size of the models may enhance specific aspects, it will not necessarily lead to significant improvements in intelligence. Rather, it will likely result in only slight improvements over the current versions.
  • The next significant development in AI might not come from big AI companies but from Epic Games. They have been acquiring several data-centric companies in recent years and standardizing data formats across different platforms, which suggests that they may be working on AI, possibly related to 3D generation.
  • DeepMind has recently published two astonishing papers -AlphaTensor and AlphaCode- that highlight their significant progress in AI reasoning. This achievement may have gone unnoticed due to the overwhelming attention given to ChatGPT and other news.
  • Another great recent model is Meta’s Segment Anything Model. This model can serve as a generic Visual Scene Understanding Model, which has the potential to make a significant impact on the field of AI. Visual Scene Understanding is a critical task that will enable future AI models to develop a better understanding of the visual content they are analyzing. It is essential for Vision models to comprehend the visual information in a similar way to us. This includes the ability to distinguish different objects, identify their materials and characteristics, understand light and shadow in the scene, and…. Models like Segment Anything are a significant step in the right direction.
