AGI’s Culture-Tools

Geoffrey Gordon Ashbrook
23 min read · Apr 18, 2023


AI & AGI: Linear Language, Higher Dimensional Concepts, Tool-Frameworks, & Culture


Perhaps it is like the rise of Virtual Reality headsets, where society had become so jaded by decades of cynicism that, even though everyone knew the technology from books and films, there was strong resistance in industry to accepting that it was actually becoming a practical reality. Perhaps an entrenched cultural belief had set in that it was “only,” “merely,” “just” a myth that could never materialize: after perhaps a century of literature and films and comics about robots and androids and AI, and being completely familiar with the concept and phenomena of ‘emergent’ intelligence, our first interactions with rudimentary general AI are characterized by inarticulate confusion.

A repeating theme in AI discussions is that people over-reify what they think they are looking for into too-clumped-together combinations of concepts, and the mismatch between our blotchy map of clumps and the alien landscape of reality makes for quite an adventure. For example, what may be happening in front of us (without our being able to see it) is the beginning of AI learning to use both tools and culture: an epiphenomenal layer of non-automatic, cross-participant cultural learning and tool use that exists on top of all the ‘normal’ base models and base training. Yet we may misunderstand what is in front of us because we are so preoccupied with our preconceptions, expectations, and various other distractions. In this mini-article we will try to briefly explore how the use of linear language strings is involved in data-processing and tool use for both AI and h.sapiens-humans. What some people point out as problems in AI learning may not be problems as such; let’s look at some details of supposed problems and limitations to carefully decide what these phenomena really indicate.

The topic of possible inherent limitations of the linear-language-generation systems that OpenAI’s Large Language Models (presumably) use came up in an MIT event recorded March 22nd, 2023, by Dr. Sebastien Bubeck on progress towards ‘Artificial General Intelligence.’ AGI is one term for a more ‘human’ or ‘superhuman’ variety of AI as opposed to ‘narrow’ single-purpose AI-Machine-Learning.

See: Sparks of AGI: early experiments with GPT-4

Dr. Sebastien Bubeck on OpenAI’s LLM AI @ MIT

The subsequently revised Bubeck paper is here:

The event is less than an hour long and still clear at faster play speeds; I highly recommend watching it.

Opportunities & Limitations

What might be some limitations, or possible advantages, of linear-language-generation systems? Is it perhaps too early to say, given that many people did not predict what OpenAI’s models would be able to do? Can we safely assume that we know, at a given time, exactly what an AI system can do? (E.g. Do we fully know what AI is doing “now”?)

To paraphrase from Bubeck’s presentation, the skeptics’ criticism reasons as follows, with two assumptions, each presumed sufficient to entail the same two conclusions:

1. If it is true that the AI model linearly generates one word (or language unit) at a time, then it must follow that:

2. If it is true that the AI model uses statistics and probability to process language training data, then it must follow that:

Conclusion A: the language model cannot be using any conceptual understanding of either the world in general or the context being discussed specifically, and

Conclusion B: the language model is ‘merely,’ ‘simply,’ ‘only,’ ‘just’ parroting the most common or most statistically probable language strings found in the training data (e.g. on the internet).

Rather than try to authoritatively answer this question, the position for this mini-article is to not-assume that we have a good grounding in how to navigate, relate, frame, and respond to various possible questions relating to where we are in the timeline of developing AI technologies and to the AI-ML field more generally. The purpose here is to support a broader discussion of this topic and these questions, with an overall assumption that we do not know enough now to predict what more we will learn about these technologies in years to come; that being said, we can likely map out some of the very interesting problem-space now.

Testing The Skeptic’s Hypothesis

While it may be too early to say for sure, Mr. Bubeck provides demonstrations (which I will assume are real enough for the purposes of this discussion, with the caveats about reproducibility that Mr. Bubeck provides at the beginning of his talk) that make a sound attempt at deriving a falsifiable experimental hypothesis from part-A of the skeptic’s criticism and (in the counterintuitive terminology of the hypothetico-deductive method) at producing experiments that disprove that null hypothesis, meaning that Mr. Bubeck’s demonstrations do NOT support the hypothesized limitations of OpenAI’s large language models.

We can frame this hypothesis from the criticism in Mr. Bubeck’s report:

Hypothesis: GPT4 can only answer questions it has already seen many times in training-data.

This hypothesis can produce a falsifiable prediction (in the form of a null hypothesis):

Null-hypothesis & Prediction: GPT4 will not be able to answer questions it has not already seen in training-data.

Mr. Bubeck provides several tests of this prediction, giving GPT4 questions that are not available in training data, all of which “disprove” the null hypothesis: showing the testable hypothesis about a specific inability of AI to be false. (This method of testing hypotheses may be cumbersome, but the details are important for how evidence, tests, and STEM work.)

Notes on these Tests:

1. While you can disprove a null-hypothesis, or continue to fail to disprove a null hypothesis, in STEM science (following the hypothetico-deductive method), you cannot prove a hypothesis. This is sometimes confused with the semantics and methods of, for example, proving a theorem in geometry.

2. For a more detailed discussion of a framework for more exactly defining how specific ‘objects’ that may or may not have been in training data are handled by AI, for testing and other purposes, please see the full paper linked below. The cursory distinction between ‘new stuff’ and ‘old familiar language stuff in the training data’ is not sufficiently clear for many purposes; clearer specifications can be made and used in testing and many other practical areas.

Part-B of the skeptic’s hypothesis appears to be more a misunderstanding of the unclearly named technology of ‘embedding’ vectors. To be clear about what is meant here by ‘misunderstanding’: this is not a bully-the-novice issue where amateurs, or only amateurs, are blamed for confusing technical jargon terms. The argument here that there is a misunderstanding about the nature of ‘embedding vector space’ (what I would describe, perhaps incorrectly, as ‘higher order concept space’) is more empirical in nature: people at all levels of expertise are making incorrect predictions about how ‘embedding vector space’ or ‘higher order concept space’ models will perform, which is taken here as evidence that there are many things we do not understand about the problem space and the technology.

For example, Francois Chollet, one of the foremost experts in the world in creating, using, and explaining deep learning technology, and the creator of Keras, one of the main software frameworks for building deep learning models, specifically addresses this exact topic, and OpenAI’s GPT Large Language Models in particular, in his book “Deep Learning with Python,” 2nd edition, which came out just months before ChatGPT, but after GPT3. Chollet devotes most of page 375 in section 12.1.5, and about half of chapter 14, to his views and predictions about how deep learning works conceptually and what it may be able to do in the future. He is not an AI skeptic by any means, but the details of his explanations and predictions do not correspond to the realities of what Large Language Models became able to do less than a year after the book was published. Another part of this puzzle is that Chollet also explains in depth how little we know about the technology, and how much the creation and improvement of machine learning and deep learning is based on empirical success without a deep understanding (or sometimes any understanding) of exactly how the systems and technical methods work. At the end of the book he leaves the reader with these words:

“So please go on learning, questioning, and researching. Never stop! Because even given the progress made so far, most of the fundamental questions in AI remain unanswered. Many [of the fundamental questions in AI] haven’t even been properly asked yet.”

And yet another layer of the puzzle is that he and other authors explain the “AI-Summer” and “AI-Winter” hype and funding booms and busts, which have significantly incentivized many AI researchers to over-emphasize the limitations and under-emphasize the potential abilities in anything they say publicly, because past episodes (especially in the 1960’s) of over-promising (or underestimating the time it would take to deliver) led to devastating, decades-long, and politically vicious funding cuts, and to academic ridicule so harsh that researchers were pressured to remove references to AI or machine learning from their research altogether. It will likely come out that some researchers were not in fact surprised at the ‘sudden rise’ of Large Language Model success, but were truly terrified of having their careers ended and being blacklisted if they publicly made any optimistic predictions.

Francois Chollet’s “Deep Learning with Python,” 2nd edition, outlines the transformer models used in OpenAI’s GPT3 Large Language Model system, instructs any reader in how to create their own such models, and makes clear and very convincing arguments that any model involving math-statistics, and any system using linear word generation, is precluded in principle from ever being able to exhibit human-like, mind-like, meaningful (let alone understanding, or intelligent) behaviors of situation-modeling with granular analytic detail (or what I would define, for more clarity, as specific object handling based on types of objects and their relationships, to be as clear as possible about what the AI is or is not able to do).

It should not be surprising that we are making mistakes in our predictions and understanding of ‘mind-space,’ because globally, not just in the US, we have not invested in mind and consciousness sciences, including mind-learning-development and education sciences. Mind and consciousness, and even ‘progress,’ are broadly academically taboo: ‘career limiting decisions,’ giving scholar-cooties to anyone who gets too close. We have chosen not to build a foundation with investment and effort, so we have no foundation to use, and we have no right to claim surprise at the outcome of our repeated decisions to continue these policies of ignorance and neglect. All over the world people failed to (publicly) predict what Large Language Models would do; even Stephen Wolfram (long-time technologist and creator of the WolframAlpha AI), who soon after ChatGPT’s rise published a short book explaining how large language models work, described their abilities as a great surprise. We are making incorrect predictions based on what we think we understand, in an area where we have not invested in a foundation of understanding; there must be some kind of misunderstanding going on across levels of expertise. And if you look closely, you should see there is a serious lack of detail on both sides of the argument that ‘statistics stuff’ cannot result in ‘world modeling stuff.’ Is that really a clear argument? Hopefully this adds more nuance to what is meant by ‘misunderstanding.’

‘Embedding vector space,’ or ‘higher order concept space,’ models exactly the higher-order concepts and relationships between concepts that many people, for whatever reason, repeatedly claim AI definitively lacks. The unclearly named ‘embedding’ space is a map of the relationships between abstracted world concepts, NOT copies of literal common phrases and words in language. The above criticism is likely more accurate for older and simpler language models such as ‘Bag-of-Words’ and TF-IDF (term-frequency) vectors, where the points and connections in the higher-dimensional space do refer to most-probable literal language strings. But unlike those older models, ‘embeddings’ are a way to go beyond words, letters, and symbols, into a hyperspace of the concepts behind and beyond any single representation by language.

As an example of the difference (hopefully these are appropriate examples to illustrate some key issues and concepts; if not, that is my failing), let’s say someone was making a deep-learning, high-dimensional vector-space AI model to do sentiment analysis on restaurant reviews. A Bag-of-Words model for this narrow (single-purpose) AI could be huge, with every unique word (or combination of words) being a different dimension, perhaps 20,000 dimensions. An embedding-(concept)-vector model for the same purpose (restaurant review sentiment analysis) would only model the concepts relevant to the restaurant reviews, perhaps one or two hundred (or fewer), even though it was trained on the same language-string input. So even though the same ~20,000 or more unique language-string units are used to train the ‘concept’ model, the concept model essentially ignores the particular language-string units and only learns the smaller number of restaurant-related concepts needed for the task. And often concept-vector models are trained on individual characters (a, b, c, d, …): e.g. ASCII has only 128 codes (roughly 95 printable letters, numbers, and punctuation marks), and the abstraction of ‘words’ is ignored entirely.

The point of this example is that an embedding-vector-(concept)-space model is not modeling the probabilities of the specific language strings used. As a side note: depending on the details of your task and the training data available, the older Bag-of-Words word-probability models may actually work better. As another note, the ‘Large Language Models’ have upwards of billions of parameters; so again, think about it: are there billions of different words in English, or in any language? What are these ‘Large Language Models’ modeling? They are modeling concepts, not language-string probabilities. Unlike single-use models that focus on a narrow and well-defined question, such as “Is this restaurant review positive or negative?”, LLMs are trying to model all the concepts for everything in the universe discussed everywhere in all available language samples, which is a lot of concepts!
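To make the dimensionality contrast above concrete, here is a minimal pure-Python sketch. The reviews, vocabulary, and embedding width are invented toy values, and the ‘embedding’ is faked with a hash rather than learned, so this only illustrates the shapes involved: a Bag-of-Words vector grows one dimension per vocabulary item, while an embedding-style representation has a small fixed width regardless of vocabulary size.

```python
# Toy illustration (not a real trained model). All data here is invented.

reviews = [
    "the soup was wonderful and the service was fast",
    "cold food and rude service never again",
    "wonderful food fast service",
]

# Bag-of-Words: one dimension per unique word seen in the corpus.
vocab = sorted({word for review in reviews for word in review.split()})

def bag_of_words(text):
    """Count vector with one slot per vocabulary word."""
    counts = [0] * len(vocab)
    for word in text.split():
        if word in vocab:
            counts[vocab.index(word)] += 1
    return counts

# An embedding maps each word to a small dense vector of learned numbers.
# Here the 'learned' values are faked with a hash, just to show the shapes.
EMBED_DIM = 4  # real sentiment models might use tens to hundreds of dimensions

def fake_embedding(word):
    return [((hash(word) >> (8 * i)) % 1000) / 1000 for i in range(EMBED_DIM)]

def embed_text(text):
    """Average-pool the per-word vectors into one fixed-width vector."""
    vectors = [fake_embedding(w) for w in text.split()]
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# BoW width tracks the vocabulary; embedding width stays fixed at EMBED_DIM.
print(len(vocab), len(bag_of_words(reviews[0])), len(embed_text(reviews[0])))
```

The key contrast: adding new reviews with new words widens every Bag-of-Words vector, but the embedding representation stays the same small size, because it represents content, not literal word identities.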

Simple Language Strings & High-Dimensional Concepts

Of particular interest here may be the interplay between that concept-relationship space (‘embedding-vector’ space) on the one hand, and on the other hand the formality of stringing sounds, characters, letters and words together into language strings (apologies to speakers of languages that do not use ‘words’). The AI’s very high-dimensional concept-relationship-space is something we are struggling to understand and striving to find the performance limits of, whereas the more concrete habit of making language-strings is something that h.sapiens-humans and AI have in common enough to communicate with each other: there is something universal about a lower-dimensional linear string. A very common theme in AI-ML is making lower-dimensional slices of higher-dimensional models in order to solve specific problems (with lots of speculation and philosophy about how it works and what might really be going on). The use of linear strings of language-units out of a higher-dimensional concept space at least rhymes with that prominent process of effective problem solving.
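As a toy illustration of the “lower-dimensional slice of a higher-dimensional model” move mentioned above (the vectors and the ‘negativity axis’ below are invented for the example): projecting each high-dimensional point onto one chosen direction collapses it to a single coordinate, somewhat as a linear language string is a one-dimensional path taken through a much higher-dimensional concept space.

```python
# Pure-Python scalar projection: collapse a high-dimensional vector
# to one number along a chosen direction (all example data is invented).

def project(vector, direction):
    """Scalar projection of `vector` onto the line through `direction`."""
    norm = sum(d * d for d in direction) ** 0.5
    unit = [d / norm for d in direction]
    return sum(v * u for v, u in zip(vector, unit))

# Hypothetical 4-dimensional 'concept' vectors for a few words.
concepts = {
    "soup":    [0.9, 0.1, 0.0, 0.2],
    "service": [0.1, 0.8, 0.3, 0.0],
    "rude":    [0.0, 0.7, 0.9, 0.1],
}

# A hypothetical 'negativity' direction in that concept space.
sentiment_axis = [0.0, 0.0, -1.0, 0.0]

# Each 4-D concept collapses to one coordinate along the chosen axis:
slice_1d = {word: project(vec, sentiment_axis) for word, vec in concepts.items()}
print(slice_1d)
```

The slice throws away most of the space, but for the right choice of direction it keeps exactly the information a specific problem needs, which is the sense in which linear output “rhymes with” this standard dimensionality-reduction move.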

As to the first part of the skeptic’s hypothesis: whenever we (h.sapiens-humans) speak or write, we string together one language unit at a time. This raises a curious question: if putting together one language unit at a time precludes the ability to understand concepts, then what is the person who strung that statement together (one unit at a time) implying about themselves and about all h.sapiens-humans? Indeed, we (h.sapiens-humans) do not understand what language is, how language works, what the mind is, how minds work, how minds use language, or how giant ecosystems of minds and languages work. So while the mere insinuation that “it can’t work” may be a bit unconvincing, the general questions of how minds and language work are indeed excellent and yet-unanswered questions. Mr. Bubeck started his presentation with this quote:

“Something unknown is doing we don’t know what.”

~Sir Arthur Eddington

As Mr. Bubeck prompts many times in his presentation, “Don’t stop there.” The process of forming fruitful tests for AI in various specific contexts (security, explainability, ability, etc.) is just beginning. Keep asking questions. Keep testing.

Math Vs. Computer Programming

Another ‘limitation’ issue that came up in Mr. Bubeck’s presentation was the easily repeatable and testable phenomenon that Large Language Models have difficulty with some math word-problems of the kind used in primary school math classes. Yet these same LLMs can produce thousands of lines of computer code that runs without bugs.

Perhaps I am missing something, but there seems to be something odd about the statement that an AI can produce thousands of lines of bug-free computer code but cannot do simple math problems. What exactly is this difference between math and computer science?

For example, in the book ‘Deep Learning with Python,’ 2nd edition, Francois Chollet, the creator of the Keras framework that most people have used to make most deep learning AI, says on page 26, the first page of “Chapter 2: The Mathematical Building Blocks of Neural Networks”:

“The most precise and unambiguous description of a mathematical operation is its executable code.”

By which he means that he expresses math in well-defined computer code as opposed to using words and (often ambiguous) math notation. Now, the fact that a famous person says something does not automatically make the statement true… but if we are claiming that math, logic, and computer instructions are somehow incompatible, that is a big claim, with various circular curiosities. So: an AI, made using software for which Francois Chollet wrote the code (to perform the math) that creates and runs it, can write the code to do the math, but that same AI cannot do the math? That is fascinating! And it may be more fascinating than we at first realize.
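In the spirit of Chollet’s remark, here is a small example of expressing a mathematical operation as unambiguous executable code rather than notation: a plain-Python softmax, the operation that turns a model’s raw scores into probabilities. This sketch is illustrative only, not code from any particular library.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]  # subtract max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

# The code leaves no ambiguity about edge cases (what base? normalized how?)
# that compact math notation often leaves to the reader.
probs = softmax([2.0, 1.0, 0.1])
print(probs)  # three positive numbers summing to 1, largest first
```

Every question a verbal description leaves open (what happens with large inputs, what order the outputs come in) is answered by simply running the code, which is exactly Chollet’s point.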

The self-referential irony of the topic of an in-principle incompatibility between computer logic and math goes deeper still. It extends back at least to around 1900, when Hilbert was forming his challenges for the 20th century to unify math and logic, which led directly to the work of Alan Mathison Turing and John von Neumann, two of the most indispensable founders of the modern computer age and of AI; and in the case of Turing, his paper on Hilbert’s Entscheidungsproblem literally created the Turing machine, Turing completeness, and the modern digital computer… and AI.

Some interesting low-hanging fruit is to compare the math-word-problem issue to the art examples that Mr. Bubeck presents. Mr. Bubeck showed several varying examples of situations where the AI made a decent try at visually representing an idea or relationship on its own (animal picture, diagram, chart, game geometry, etc.), but did a much better job after he suggested that it use a tool or external framework (which it does not automatically use). Let’s slowly unpeel some of the layers to this.

This may even be, perhaps aside from “tool-use,” a sign of ‘culture’ as a phenomenon affecting AI. This inability to do something by default, combined with the ability to do it when shown how by another participant within a culture, is another way in which this young AI is very similar to h.sapiens-humans. Biologically, h.sapiens-humans today are, so far as we know, genetically identical to ancestors five thousand years ago, ten thousand, fifty thousand, one hundred thousand, two hundred thousand years ago, older? We don’t know how far back genetically equivalent h.sapiens-humans go. But even going back just a few decades, the expectations of what a graduating class from Stanford should be able to accomplish have accelerated significantly over the same ancient hardware: a layer of culture, or some epigenetic participant-language frameworking of non-automatic learning by whatever other name, allows significant learning and ability beyond the base model. This is true for h.sapiens-humans for sure, and looks to be the case for nascent AI as well.

We will continue here with the math-problem theme, but translate the context slightly. The original framing of the problem was more in the familiar tech-bro-bullying taunt of “You tried to do it in your head and you got it wrong! Wrong! You’re wrong! You can’t do it! You’re stupid!”, a pattern of abuse that h.sapiens-humans seem to find simply irresistible. Not exactly charming. Ignoring the vitriol, the longer narrative is that if the AI does not “show its work,” it tends to make mistakes in math problems (something Alan Turing himself was also quite famous for doing…), but where the AI uses a framework and checks its work, it can find its own mistakes, correct them, and then get to the right answer. This longer, deliberative process works but is slower. So I am going to, perhaps taking liberties, change the narrative from “AI cannot do math” to “AI cannot do math quickly.”
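A toy sketch of the ‘show your work’ idea (an invented arithmetic example, not a claim about LLM internals): when each intermediate result is written out explicitly, each step can be checked, and an error caught, before moving on, which is the slow, deliberate route.

```python
# Hypothetical word problem: 3 crates of 12 apples each; 5 apples are eaten.
# Instead of jumping straight to an answer, record every intermediate step.

def solve_with_work(apples_per_crate, crates, eaten):
    """Return the final answer plus a log of labeled intermediate steps."""
    steps = []
    total = apples_per_crate * crates
    steps.append(("total apples", total))                # step 1, checkable on its own
    remaining = total - eaten
    steps.append(("remaining after eating", remaining))  # step 2, checkable on its own
    return remaining, steps

answer, work = solve_with_work(apples_per_crate=12, crates=3, eaten=5)

# The written-out work lets a reader (or the solver itself) verify each step.
for label, value in work:
    print(label, "=", value)
print("answer =", answer)  # → answer = 31
```

The difference between returning only `answer` and returning `(answer, steps)` is the whole point: the second form is slower and wordier, but every link in the chain is exposed for checking and correction.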

From Douglas Hofstadter to Kahneman & Tversky to OpenAI: Calculating Fast & Slow

While some might take the contrarian position that it is a sign of progress wherever AI departs from h.sapiens-humans’ ways of thinking, in at least some cases where we see peculiar overlaps between nascent AI and biology-based-learning that may be a sign that something fruitfully embryonic is brewing in the Science Fiction imagination of the world.

While I may be very wrong, the idea here is that AI being ‘bad at fast math’ may be a very good sign in a number of ways. For example, in Kahneman & Tversky’s extensively and experimentally studied breakdown of how the h.sapiens-human brain solves different types of problems, “System 2” is the h.sapiens-human system or method for analytical reasoning, and it is the slow, deliberate, systematic process. System 1 is the fast, intuitive process, and in h.sapiens-humans fast System 1 is catastrophically wrong when used for calculations that should be done slowly and carefully. (Sound familiar? This is exactly what we just saw AI doing.) Expecting AI to do the inverse, to quickly reason but slowly intuit, is oddly without precedent in the natural world. And demanding that AI be equivalent to human intelligence (matching the human standard) and yet not follow the same ‘slow reasoning’ and ‘fast intuition’ processes is oddly inconsistent. Are we trying to measure how similar AI is to human performance, or not? That AI, without instruction, training, and a framework, will impulsively make math mistakes when it does not show and check its work, and that it can catch and correct its mistakes when it does look at and check its work, makes AI remarkably like developing (or even adult) h.sapiens-humans.

This phenomenon (of slow AI reasoning) is also very much not without warning, foreshadowing, and prediction within the main AI literature. In 1979, Douglas Hofstadter predicted in GEB (the book that, in the U.S. at least, gave many AI researchers their inspiration to work in the field, and which may be one of the only books universally known and loved across U.S. AI researchers), on page 677, in chapter 19, in “10 Questions and Speculations,” #3: “Will thinking computers be able to add fast?” His prediction: “Perhaps not. …It will represent the number 2 not just by the two bits ‘10’, but as a full-fledged concept the way we do…” This is a remarkable prediction that we should be thinking about carefully, as it not only reflects what we are observing AI do but also suggests fruitful ways to interpret and react to our AI-child’s developmental behavior.

Note: The details of whether or not a specific process is relatively faster or slower will likely vary over time (with hardware and software evolving and diversifying), but this overall topic will likely remain valid.

A Kind of Crossing-Over: Intuition & Reason

That math can be done at all in ‘sub-symbolic’ ‘reasoning’ is amazing. Just as Douglas Hofstadter predicted in 1979, the ‘thinking computer’ is doing math with the concepts of numbers in a concept-world-model space, not by directly running boolean bits through the Arithmetic-Logic-Unit of the AI’s computer hardware. And it is not even clear whether terms such as ‘symbolic’ and ‘sub-symbolic’ are the best terms to describe the phenomena in this context. There are many proposed, often dichotomous, frameworks for different modes of problem solving (symbolic vs. sub-symbolic, System-1 and System-2 brain processes, left hemisphere vs. right hemisphere, etc.). Consistent with the literature, Hofstadter uses the vocabulary of ‘symbolic’ processing to refer to raw bits running on hardware. But do we know yet that that is the-ultimate-dichotomy for describing processes in mind-space generally, or processes in AI-mind-space specifically? In some cases such distinctions may be less relevant than the type of overall process being undertaken (e.g. a purely internal solo ‘individual’ test, vs. a multi-participant real-world agile project product deployment with an arguably different set of defined requirements that may be well defined without any recourse, or even connection, to AI terms, biology terms, or psychology terms, etc.). The topic of symbolic vs. sub-symbolic (another unclear name in AI-ML jargon) and project-contexts is another huge and wonderful topic; see the whole paper for more, and hopefully a dedicated mini-essay sometime.

The details of what Large-Language-Model-AI can and cannot do, well or quickly, and with or without tools, and with or without feedback, and with or without an external framework, are likely useful and fascinating whatever they turn out to be. And the fact that there are such details of heterogeneous performance over problem-spaces is much more interesting and likely useful in the long term than if AI were more simplistic and uniform in quickly succeeding or failing at different tasks.

Modeling Situations

A topic which this discussion may highlight is a lack of likely important details in how we analyze a machine’s (or a human’s) ability to deal with specific parts and sub-parts, objects, within different situations, and how they relate to each-other: object-relationships. What exactly do we mean by ‘a concept of the world’ or ‘a model of the world’ in a context of object-relationships-spaces? Are some parts of this question more philosophical quandaries that we may never in principle discover, and are some parts if narrowly defined for specific project-contexts more practical to define?

Articulation as Data-Processing

Another misapprehension-of-self by h.sapiens-humans, which may be leading to confusion when observing the behavior of AI and Machine Learning, is the (also education-related) confusion between articulation-of-ideas on the one hand (writing, audible outward speech, etc.) and presumed ‘silent internal thought processing’ on the other hand. Note: ‘articulation’ of language or thought is general here, and can refer just as well to writing as to speaking; other forms of expression not using ‘word’ language (e.g. drawing) are likely related in similar ways. Something that it has taken educators many years to figure out, and which has not yet percolated to the rest of society, is that h.sapiens-humans process (and learn to process) information by articulating. This is contrary to the presumed norm that people silently and internally process information, and only after numerous internal data-processing steps are complete carry out a non-processing articulation. Phrases like “think before you speak” may represent cultural ideas, in some cases fictional norms, and perhaps impossibilities or absurdities. Just as h.sapiens need to articulate in order to process, it is likely that generative AI has the same dynamics. And just as people lack an internal editing room (though many people do imagine such a fictional part of the mind-body), it should not be shocking that AI does not instantly have what we inaccurately perceive ourselves as having. (This also brings up the old topic of expecting AI to be exactly the same as we see ourselves and our local in-group, a narrow and not at all generalized definition of person-hood.)

Show Your Work to Future You

In a classic ‘parent moment’: after being told so many times by teachers and parents to ‘show your work,’ generation after generation, we now have an AI-child who makes mistakes and needs to be taught to show their work, and our reaction is somehow: “I’m totally shocked my child is doing exactly what I did! This shouldn’t be happening!”

To mix two STEM instructional phrases together: a common guiding phrase in computer science is that you are making an effort to communicate not only with ‘other’ people but also with ‘future you,’ who likewise will have no idea how to understand or use the code you just produced and that you currently (in the here and now) are completely sure is too obvious to require any explanation. This is another area where, even after thousands of years, h.sapiens-humans are struggling to understand how they use language in important everyday ways. When we ‘show our work’ it is not just for an annoying teacher, for an inept coworker, or as a charitable gesture to distant future generations of people. Both for AI and for ourselves, we should generalize and integrate best practices such as ‘future you’ and ‘showing your work.’

Tools, Culture, and the “External”: “Show your work to inner-you,” says the external participant.

Here ‘externalization’ (while it may seem abstract) is a crucial part of tool-sets for facilitating both internal processing (like cognition) and communication. As is explored more in the full paper, the formality of showing-work ends up being a major theme for AI data processing in a context of projects involving multiple participants. Perhaps in a fractal sense, current and ‘future you’ are also collections of participating-subprocesses that benefit from some form of ‘show your work’ or ‘external-project-object-database.’

The ‘external’ theme also connects to even ‘internal’ epiphenomena layers, which may speak more to the directional-ambiguities of the English language than to details of so-called ‘vertical’ or ‘horizontal’ hierarchies and organization.

The goal is some working map and framework for practical tool-like functions across this landscape of factors: mindspaces, development, internal-external, abstraction, intuition, error-correction, signals in project-space, layers and heterogeneities in spaces of dynamics of learning, lower and higher dimensional meaning-data structures, projects and systems, etc.

The Culture-Tool

There is still so much that we do not know. The topic of how different portions of the human brain process information is still badly in need of more basic research. We barely know ourselves, yet we use our very unclear understanding of ourselves as the measure and gold standard for AI.

What we can likely say at this point is that there is in the world some diversification of types of processes, categories of types of systems, different process-contexts, and data environments with different dynamics, and that we are starting to see AI develop enough to show heterogeneities in contextual ability and in the interplay between related processing-spaces. This at the very least indicates some progressive development (for example, progressing from chronologically earlier base-trained abilities to cultural epiphenomena and non-automatic learning, in ways that parallel biological developmental chronologies) and parallels in deliberative and intuitive functions. (For more context and details of what is meant by development and progress in a more defined way, which is a very valid non-rhetorical inquiry, see the full framework paper on github, link below.)


Perhaps, in the astronomical question of whether we are alone in the universe, we may find some solace and companionship in how our new partner and child-AI is struggling with the same needs: to discover how to learn, to articulate, to work together on projects, to remember and understand, and to avoid, whether deliberately or inadvertently (or through some indeterminate mix of incompetence and malice), causing system collapse with negative effects for ourselves and others (effects which may even be deceptively hidden or hard to perceive, or which we may need to create tools to perceive). We, h.sapiens-humans, are no longer alone in our struggle to develop and string two words together.


In the interest of outlining a problem-space, let’s summarize and recap some of the topic-questions within this topic:

  • A need for tools and frameworks
  • The use of tools and frameworks
  • Common AI issues shared with h.sapiens:
      • “Show your work.”
      • Jumping to an answer
      • Rationalization of a blunder
  • Is there perhaps a good reason to use linear language generation?
  • Is linear language generation in AI similar to that in h.sapiens?
  • Is linear language generation one modular part that is compatible with other tools and frameworks?
  • How does the linear language generation of the output relate to the “Large Language Model” (of transformer-trained ‘embedding’ vectors)?
  • How do ‘the language-unit generation’ and ‘the embedding/concept model’ work together?
  • Are there other or better ways of using, or getting at, the very high dimensional ‘embedding’/conceptual understanding hyperspace (other than using a low dimensional linear language generator)?
  • Could two AI talk to each other more directly in high-dimensional concepts without needing to use lower-dimensionalized linear language strings?
  • Is there any parallel between this (direct access to higher dimensional concept space) and suspension of the default mode network in the h.sapiens-human brain?
  • Is there a relationship between the kinds of ‘math errors’ that OpenAI’s large language models (like GPT4) make and Douglas Hofstadter’s 1979 prediction in GEB (which then and now may seem counterintuitive to some people) that AI may not be able to do math quickly?
  • Is lower-dimensional linear (turing-machine-tape-like) signal organization a time-tested, conserved, evolved, method with practicality and justification?
  • The Culture-Tool: Could teams of AI work together on projects (even multiple instances of the same base AI model) to emphasize the large project space of tools and learning dynamics in which they empirically reside?
  • How heterogeneous are spaces of data processing and types of systems for which data are processed?
  • Is rapid solving of math puzzles an ability or a liability?
  • Is processing-with-articulation a liability or modular ability?
  • How can we teach AI to use tools to organize thoughts and show their work?
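The embedding-related questions above can be illustrated with a toy sketch: two quite different linear word-strings can map to nearby points in a higher-dimensional concept space, while a third maps far away, suggesting the linear string is a lossy, low-dimensional serialization of the concept-point. The 4-dimensional vectors below are invented purely for illustration; a real model's embeddings have hundreds or thousands of dimensions.

```python
# Toy illustration (not a real embedding model): different linear
# word-strings can be near-neighbors in concept space.
import math

def cosine(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = (math.sqrt(sum(x * x for x in a))
            * math.sqrt(sum(y * y for y in b)))
    return dot / norm

# Hypothetical 4-dimensional 'concept' vectors (invented values):
toy_embeddings = {
    "the cat sat on the mat":      [0.9, 0.1, 0.8, 0.0],
    "a feline rested on the rug":  [0.85, 0.15, 0.75, 0.05],
    "interest rates rose sharply": [0.0, 0.9, 0.1, 0.8],
}

a = toy_embeddings["the cat sat on the mat"]
b = toy_embeddings["a feline rested on the rug"]
c = toy_embeddings["interest rates rose sharply"]

# Similar concepts sit close together even though the linear
# strings share few words; dissimilar concepts sit far apart.
print(f"cat vs feline:  {cosine(a, b):.3f}")
print(f"cat vs finance: {cosine(a, c):.3f}")
```

Whatever 'talking directly in concepts' would mean in practice, the sketch at least shows why a one-dimensional token string and a point in embedding-hyperspace are different kinds of objects.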

Terminology Note: “OpenAI Models”

Here the term “OpenAI Models” is used due to frequent changes, new versions, numbered and not-numbered versions, updates, and new services, etc. coming continuously. Trying to pinpoint exactly what version of what model in what subset of what service at what point in time relative to the date of someone’s comments is a puzzle that is likely not crucial for this mini-article. So, to avoid that quagmire, I will refer more generally to “OpenAI models” or “OpenAI’s Large Language Models,” instead of the ever-changing landscape of ChatGPT public, ChatGPT subscription, ChatGPT dated subversions and announced updates, GPT3, GPT4, and ambiguity about exactly what underlying models and training methods were used for and across which named services at what times, exactly what features were added to or removed from which at what times in what regions, on which servers, etc. That will be a fascinating puzzle for historians in the future should they uncover the timeline.

About The Series

This mini-article is part of a series to support clear discussions about Artificial Intelligence (AI-ML). A more in-depth discussion and framework proposal is available in this github repo: