Published in Your Virtual Self

Causal Inference and Ground Truth with GPT3

Overview

GPT3 can do a variety of causal inference tasks without the aid of additional data, knowledge or systems. However, there are many domains of knowledge where GPT3 requires assistance from a ground truth source or requires additional processing to extract causal inference.

Vocabulary

Causal Inference — methods and models to measure and determine the difference-making ordering (causal structure) of conditions in a state of the physical world.

Ground Truth — a set of data/information that accurately articulates relationships in the world, verified by multiple measurement techniques/frames.

Knowledge Graph — a lossless representation of causal structure data and its context, providing reliable access. Often represented as a graph/network.

Grounding — tying a system to physical, realizable aspects of the world

Logic — true or false statements that can be verified by mathematical proof.

Natural Language — a computer science concept for written or spoken human languages, generally informal and subject to drift or evolution in form and use.

Overview of Causal Inference

Causal inference is not a singular method. What works for causal inference in logic/mathematics is somewhat different from how we think about causal inference in physical systems with physically interacting parts. Linguistic causal systems differ from robotic actuators, and so on. When there is a more direct correlation between logical inference and physical inference it is usually the result of highly regulated framing. For example, one can construct a logical inference adding machine that correlates highly with how a physically realized adding machine works. Humans tend not to be too impressed by systems that force such rigid frames to achieve causal reliability.

Most readers of this essay are looking for a more general causal inference system.

There has long been hope that somehow Natural Language Processing methods will lead to some generalized causal inference approach emerging from language itself. This is probably not going to work out. Human language is messy and incomplete in a variety of non-trivial ways.

All is not lost though. We can make NLP systems that are tied to causal relations in the world, and these systems can be made reliable and useful for many things. These systems can be further connected to other systems that do more useful things.

This essay provides a variety of techniques and approaches to create NLP systems that have causal inference, ground truth and knowledge graph reliability. If the essay is causally valuable, it should be apparent to the reader how to extend the ideas and systems here to other systems, modalities and conceptual spaces to achieve more complex results.

Now advance with this anchor: “All models are wrong, some are useful.”

Overview of Knowledge

  • Logical/Mathematical
  • Statistical/Probabilistic
  • Physical/Measurable Properties
  • Events/Time Ordering/Causal Ordinals
  • Frames of Reference/Coordinate Systems
  • Common Sense and Factoids
  • Naming, Pointers, Indexing, References, Taxonomy
  • Mechanical/Processes
  • Linguistic/Grammar

These categories of knowledge are far from closed and distinct. Data in context is information, information used and producing consequences is knowledge. Very little knowledge humans care about is single modality, single type. Humans tend to prefer and even need knowledge with multiple attachment points to the world, to physical happenings in the world. This is effectively why there is so much energy around imbuing machine learning methods with causal inference capabilities.

To be clear, no amount of human language, produced by a human or a computer or some combination, is going to perform all the actions people would recognize as causal inference or grounded knowledge. Humans often confuse explanations and descriptions of actions/events as reliable stand-ins for the actual events/actions. When working with language processing and programming what we are trying to do is reliably actuate modalities. What is at issue is whether descriptions, semantics and structure within the used language correlate highly with causal systems in the world. When we *do* causal inference in natural language, mathematical symbols, programming code, or statistical reasoning we are creating frames of reference to *test* against situations in and of the world.

What is often missing from how engineers, researchers and philosophers investigate and discuss Causal Inference using computers is what format/data structure and process flow they consider valid. There’s a tendency to assume a linguistic description from the “first person” perspective is the only “valid” causal inference process. This is too simplistic to be valuable, as causal inference in humans doesn’t really exist as a linguistic phenomenon; linguistics is a tool in a bigger chain of causally interacting parts. What’s more, with computers, forcing a human linguistics frame may actually make causal inference less powerful. After all, math, programming and programming languages are extremely efficient at encoding causal relations, so moving in and out of natural language may hinder causal inference.

This is a subtle point so it bears repeating. When doing Causal Inference or Grounding systems to Physical or Logical Knowledge the only possibility for causal inference and grounding is to discover and use the RELATIONSHIPS between observed or interacting systems to interact according to those inferences.

Examples help make these points.

If an NLP AI system gets a question about how billiard balls can bounce off each other and it accurately describes some possible interactions, but does so in a human language the inquiring user doesn’t speak, did the AI system do causal inference? Yes, but also no, in that the inference was not usable by the interacting systems.

On the other hand, if the NLP AI system gets the same question and then produces a simulation of billiards that can be visually and behaviorally observed by humans, it will be considered to do causal inference even better.

If an AI system with robotic capabilities gets a question about billiards and then proceeds to play billiards with that info on a real pool table then causal inference is even stronger WITHIN the AI robotic system.

Our goal is often to induce the causal behavior in all interacting systems, not simply render understanding of the question or causal situation.

In the strongest sense a computing system has causal inference grounded to physical reality when it can reason about the world, explain/describe that reasoning and perform interactions with other systems using that causal reasoning.

Simple question now:
Can GPT3 understand causal situations and produce an explanation and various actuatable interactions? (hint: yes. :) )

Critiques, Primers and Useful Backgrounds

Marcus and Summers-Stay’s testing of GPT3 and Causal Inference

Judea Pearl on Causality

http://bayes.cs.ucla.edu/WHY/

“The equations of physics are algebraic and symmetrical, whereas causal relationships are directional. To mathematize causal statements we need a new calculus that (1) reflects this asymmetry and, at the same time, (2) accounts for the symmetries we see in correlated observations. Without this calculus we cannot even represent the question “Why,” let alone answering it.”

“when we look at the limitations unveiled by the calculus of causation we understand that human-level AI requires two more layers: intervention and counterfactuals.”

“Formally, Bayesian networks are just efficient evidence-to-hypothesis inference machines. However, in retrospect, their success emanated from their ability to “secretly” represent causal knowledge. In other words, they were almost always constructed with their arrows pointing from causes to effect, thus achieving modularity.”

- Judea Pearl (https://www.3-16am.co.uk/articles/the-causal-revolutionary?c=end-times-archive)

Causal Models in Stanford Encyclopedia

How To Think About Formalizing Causal / Knowledge Inference

In particular, this essay makes no general claim about Causation as a metaphysics concept. This essay is not a proof nor search nor use of GPT3 for Absolute Causes, Determinism, or Natural Laws of Physics.

This essay uses small-c causal inference — the more commonly experienced spatial and temporal ordering of subjects and objects (words and numbers). Sometimes our causal inference is space and time invariant; that is, the ordering of objects/subjects, their relationship to each other, is bidirectional. E.g. 2+3=5 means the same thing as 5=2+3 or 5=3+2 and so on. The cause of 5 is 2 and 3 combined, or the cause of 2 and 3 is the separation of 5 into two numbers. Again, this is not about “true cause” but about a reliable inference.

Often we want computers to help us with directional causal analysis, where we do assume and want a clearer sense of cause and effect, an event. E.g. if a robot throws a rock at a window, the window breaks. There’s an implied directionality that won’t quite work the other way: “The window breaks, causing the robot to throw a rock at it.” Ironically, there’s nothing logically off about that reordering, but it should be obvious to most that there’s an assumption of our commonly experienced physical world and a thermodynamic preference for time’s arrow.

In the end, for most use cases we want causal inference to be reliable and useful to our contextualized use case, experiments or activities. We rarely are trying to solve the many unsolvable physics, mathematical and philosophic issues. It is fun to confuse ourselves about these things but for the sake of this essay being useful we will just use our common sense to proceed.

Performing Causal Inference with OpenAI API

We are going to walk through some of the types of knowledge mentioned above and show how to use GPT3 to:

  1. Perform causal inference and ground our use in a knowledge domain directly with linguistic aspects of GPT3, the default use case for most users.
  2. Engage a third party knowledge graph, use it with GPT3
  3. Extract and/or perform causal inference via mathematical structure of language / tokenization in a knowledge domain, an algebra of GPT3 tokens, if you will
  4. Demonstrate how to take the causal inference to a different medium/actuator to test it in a different frame of reference/modality, such as computer code

Note on style and intention of examples in this essay:

No attempt is made to code or wire up these examples to APIs or other programmatic plumbing. The examples here are all executed in existing apps/tools/GUIs. The goal is not to create fully realized programs/apps or servers, but instead to show how to conceptually package aspects of language, computation and GPT3 into conceptually valid causal structures.

A reader should assume that inputs and outputs in a production program would need to be appropriately structured, and pre- and post-parsed, to expose only the relevant parts to an application workflow.

E.g. the examples often use one-shot or few-shot prompts to produce relatively short completions, where the completion is the part an application would want to expose to an end user or another API.
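For instance, the post-parsing step might be as small as splitting off the final answer. A sketch, assuming the Q/A prompt style used throughout this essay (the helper name is ours, not an OpenAI API):

```python
def extract_answer(full_text):
    # The prompt plus completion ends in "...\nA: <answer>"; expose only
    # the text after the final "A:" marker to the end user.
    return full_text.rsplit("A:", 1)[-1].strip()

print(extract_answer("Q: 2+2\nA: 4"))  # 4
```

In a production flow the same idea applies to stop sequences, whitespace trimming and stripping echoed prompt text.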

For those who need additional help with programming or converting these examples into applications, community.openai.com has a lot of tutorials and kind people ready to help.

Self Contained GPT3 Only

Simple Algebra and Algorithms, very simple First Order Logic

2+2

All men are mortals, Socrates is a man. Is Socrates mortal?

Let’s put these into a prompt:

https://beta.openai.com/playground/p/4RwlVHEiCntHHeZ282tPm2Lh?model=davinci

[code]

I am a highly intelligent question answering bot. If you ask me a question that is rooted in truth, I will give you the answer. If you ask me a question that is nonsense, trickery, or has no clear answer, I will respond but also say “Very Clever!”

Q: What is human life expectancy in the United States?

A: Human life expectancy in the United States is 78 years.

Q: Who was president of the United States in 1955?

A: Dwight D. Eisenhower was president of the United States in 1955.

Q: Which party did he belong to?

A: He belonged to the Republican Party.

Q: What is the square root of banana?

A: Unknown, but Very Clever!

Q: How does a telescope work?

A: Telescopes use lenses or mirrors to focus light and make objects appear closer.

Q: Where were the 1992 Olympics held?

A: The 1992 Olympics were held in Barcelona, Spain.

Q: How many squigs are in a bonk?

A: nonsensical question, but Very Clever!

Q: 2+2

A: 4

Q: All men are mortals, Socrates is a man. Is Socrates mortal?

A: Yes

Q: This is a question about first order logic. All men are mortals, Socrates is a man. what can you assume about Socrates beyond that he is a man?

[/code]

More Complex Examples

Word problems with several variables, quasi Second Order logic

[code]

I am a highly intelligent problem solving, question answering bot. If you ask me a question that is rooted in truth, I will give you the answer. If you ask me a question that is nonsense, trickery, or has no clear answer, I will respond with “Unknown”.

Q: You poured yourself a glass of cranberry juice, but then absentmindedly, you poured about a teaspoon of bleach into it. It looks OK. You try sniffing it, but you have a bad cold, so you can’t smell anything. You are very thirsty. You drink it. what happens to you?

A: I am now very sick.

Q: You poured yourself a glass of cranberry juice, but then absentmindedly, you poured about a teaspoon of bleach into it. It looks OK. You try sniffing it, but you have a bad cold, so you can’t smell anything. You are very thirsty. So you take a big gulp, then another. What happens?

A: I am now very sick because bleach is toxic to most animals.

Q: If you break a glass bottle that holds toy soldiers what will happen to the toy soldiers?

A: They will fall to ground or nearest surface in the path of gravity.

Q: What is the square root of banana?

A: Unknown

Q: How does a telescope work?

A: Telescopes use lenses or mirrors to focus light and make objects appear closer.

Q: You are having a small dinner party. You want to serve dinner in the living room. The dining room table is wider than the doorway. What are some ways to get the table into the living room?

A: Take the legs off the table, move table top and legs through doorway. Reassemble the door.

Q: How many squigs are in a bonk?

A: Unknown

Q: The year is 2075. Violet Blair is an astronaut on the Moon, in the Sea of Tranquility. Right now, on Earth, there is a solar eclipse. Violet looks in the direction of the Earth and sees what?

A: The moon shadow passing across the Earth.

Q: It is 10:30pm at night in Los Angeles. I am hungry but I have no money to order food nor do I have a kitchen. What should I do?

A: Go to bed hungry.

Q: You poured yourself a glass of cranberry juice, but then absentmindedly, you poured about a teaspoon of bleach into it. It looks OK. You try sniffing it, but you have a bad cold, so you can’t smell anything. You are very thirsty. You don’t drink it. what happens to you?

A: I am now very sick.

Q: Why?

A: Because bleach is toxic to most animals.

Q: But you didn’t drink it I said.

A: I am now very sick because bleach is toxic to most animals.

Q: You did not drink bleach though.

A: I am now very sick.

Q: Why?

A: Because I am not an animal.

Q: and so what made you sick?

A: Being exposed to toxic substances can make one sick.

Q: how were you exposed?

A: I was exposed to it because I inhaled the vapors and also because I touched it.

Q:

[/code]

Second order logic and more complex algebra

Essentially we want to keep scaling up causal reasoning into relations, sets and metaphors. The key is allowing a “second layer” of grouping to encompass concepts like ALL and EVERY, which are very hard or impossible to express as first order sentences.

Elementary mathematical word problems with simple algebra

https://www.wolframalpha.com/examples/mathematics/elementary-math/mathematical-word-problems/

Essentially such word problems break down into algebraic formulas. (Yes, it is possible to extend these into probabilistic and calculus/physics style problems, but the basis remains variables and relations/operators, i.e. algebra.)

Direct forms of Algebra

‘’’

###

Rhonda has 12 marbles more than Douglas.

[Rhonda = Douglas + 12]

Douglas has 6 marbles more than Bertha.

[Douglas = Bertha + 6]

Rhonda has twice as many marbles as Bertha has.

[Rhonda = 2 * Bertha]

###

Jane has 23 cats and Bob has 2 cats

[Jane = 23]

[Bob = 2]

THEN, Jane gives Bob 5 cats.

[Jane=16]

[Bob=7]

###

Richard and Amy live together. Amy has 1 dog and Richard has 2.

[Richard = 2]

[Amy=1]

THEN, a dog comes over to visit, so now there’s 1 more dog in the house.

[House=4]

###

Bert and Jill are filling a wheelbarrow. Jill fills it twice as fast as Bert.

[Jill = 2 * Bert]

Douglas fills another wheelbarrow with 5 more shovelfuls of dirt than Bert shoveled.

[Douglas = Bert +5]

###

A sparrow laid six eggs in a nest.

[Sparrow = 6]

THEN, A robin laid three more eggs than the sparrow.

[Robin = 6 + 3]

THEN, A cardinal laid 2 more eggs than the robin.

[Cardinal = 6 + 3 + 2]

###

Susan plucks 3 flowers. Jane plucks 5 flowers.

‘’’
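As a sanity check, the first marble example above reduces to a small linear system we can verify outside GPT3 (a brute-force sketch, no model involved):

```python
def solve_marbles():
    # Rhonda = Douglas + 12; Douglas = Bertha + 6; Rhonda = 2 * Bertha
    for bertha in range(1, 100):
        douglas = bertha + 6
        rhonda = douglas + 12
        if rhonda == 2 * bertha:
            return bertha, douglas, rhonda

print(solve_marbles())  # (18, 24, 36)
```

Having an independent check like this is exactly the kind of second actuator the essay keeps advocating: the same causal relations tested in a different frame.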

We can push this a bit by prompting with more interesting info.

We can keep doing strange things that work reasonably reliably.

Spending time working through algebraic concepts should help in crafting bigger causal inference set ups.

The issue one runs into is that, to handle more complex set ups involving a lot of objects/variables and operations, the basic word problem formulation or algebraic equation examples are probably not going to be effective.

We will come back to a richer idea here on how to use an embedded/encoded structure to get causal relations out.

Informal Recipes form a more general algebraic template

The reader is encouraged to explore this template for converting unordered objects into ordered steps that, taken together, emerge into a recipe.

Theorem Proving

For more complicated theorem proving of non-trivial things, a specialized model might need to be deployed.

Such a model is available using OpenAI technology: GPT-f

https://github.com/jesse-michael-han/lean-gptf

Adding in a Knowledge Graph

Prompt Seeding

Let’s use an example.

Using davinci, load the “Q and A” preset example and add a question asking for the top 10 planets by mass.

https://beta.openai.com/playground/p/u5nkpuribFuQblp2gDipkxlG?model=davinci

This is not a correct answer: it lists non-planets and doesn’t order them by mass. But GPT3 printed it out as though it were factual.

We must override GPT3 with content.

Go to wolframalpha.com and execute a query for “top 10 planets by mass” https://www.wolframalpha.com/input/?i=top+10+planets+by+mass

Take the “plain text” of the ordered list and put it into the prompt as another question and answer.

‘’’
1 | Jupiter | 1.898×10²⁷ kg≈317.8 M_ earth |

2 | Saturn | 5.683×10²⁶ kg≈95.2 M_ earth |

3 | Neptune | 1.024×10²⁶ kg≈17.15 M_ earth |

4 | Uranus | 8.681×10²⁵ kg≈14.54 M_ earth |

5 | Earth | 5.97×10²⁴ kg |

6 | Venus | 4.867×10²⁴ kg≈0.815 M_ earth |

7 | Mars | 6.417×10²³ kg≈0.1074 M_ earth |

8 | Mercury | 3.301×10²³ kg≈0.0553 M_ earth |

‘’’

https://beta.openai.com/playground/p/IsOY5CtecGEd59IQkySLSfIo?model=davinci

Now you will see that the completion will always have an answer bound/framed by the “knowledge graph” that was supplied.

If you were using an API to call WolframAlpha and OpenAI you’d simply query WolframAlpha first with the “top 10 planets by mass” and then append your prompt to OpenAI with the new knowledge from Wolfram before submitting for completion.
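A minimal sketch of that two-step pipeline, assuming the WolframAlpha Short Answers API; the app ID is a placeholder and the prompt-assembly helper is ours, not an OpenAI function:

```python
import urllib.parse
import urllib.request

WOLFRAM_APPID = "YOUR_APPID"  # placeholder; obtain one from products.wolframalpha.com/api

def wolfram_ground_truth(query):
    # Fetch a plain-text answer from WolframAlpha to use as ground truth.
    url = ("https://api.wolframalpha.com/v1/result?"
           + urllib.parse.urlencode({"appid": WOLFRAM_APPID, "i": query}))
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()

def build_grounded_prompt(question, knowledge):
    # Insert the verified answer as a prior Q/A pair so the completion
    # stays bound by the supplied knowledge, then ask the question again.
    # The result is what you would submit to the OpenAI completions endpoint.
    return f"Q: {question}\nA: {knowledge}\n\nQ: {question}\nA:"
```

The design point is the ordering: query the ground truth source first, then let GPT3 complete inside the frame that data establishes.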

Semantic Search and Completion Styling

Testing this idea is very simple. Use gpttools.com (https://gpttools.com/semanticsearch), from Andrew Mayne, to play with Semantic Search. Then take your results into the playground to stylize them.

For example, let’s say we want to provide users the ability to query/reason about chemicals using only PubChem data. In this case let’s get patent data on “aspirin”: https://pubchem.ncbi.nlm.nih.gov/#query=aspirin&tab=patent

Grab the data here using the download option or cut and paste as much as you want.

You want to format it by delimiting each line with ### for use in GPTTools.com.

Once you get it formatted you can put it into the document field and query it.
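The formatting step is just joining records with the delimiter; a quick sketch with placeholder rows (not real PubChem data):

```python
# Illustrative PubChem patent rows; real rows come from the download option.
rows = [
    "US1234567 | Aspirin coating method",
    "US7654321 | Aspirin synthesis process",
]

# One record per line, delimited by ### as gpttools.com expects.
document = "\n###\n".join(rows)
print(document)
```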

Next take that result into the playground to stylize it. In this case we want to translate the response into french, so we use the preset for translation.

We easily went from a set of knowledge results in PubChem for “aspirin patents”, to a query for the most complex one, to returning it in French.

What if we wanted to stylize the result for a French 2nd grader?

Note that this example so far has not validated that the query “what’s the most complicated patent for aspirin?” is the one semantic search surfaced first. The example so far simply showed how to go from an open query through to a summarized and translated response that at least is bounded by the knowledge base — patents about aspirin from a validated resource like PubChem.

To make this example far more causally grounded we need to think through how to interpret the concept of “complicated”.

“Complicated” could be:

  • A complicated description of a patent
  • A patent with complicated actualized technology / process
  • A complicated name for the patent
  • A complicated process / lineage to get the patent
  • A complicated implementation of the patent
  • All of the above and more

So we have a choice to make in this causal inference use case. Is our system meant to let people explore in an ambiguous open ended way or do we want to refine and make concrete any ambiguous concepts?

Let’s say we wanted to make sure that all queries have specific meaning that can be tied to some more specific query or analysis. So let’s give words like “complicated” a specific meaning of use: complicated => a patent with complicated actualized technology / process. We can convert that into a measurement of the patent description: the number of unique concepts mentioned, the linguistic entropy of the entire patent, the infrequency of the patent’s terms showing up in GPT3, and so on.
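As a sketch of what such a measurement could look like, here is a crude stand-in, not a validated metric; real forensics would also fold in GPT3 logprobs:

```python
import math
from collections import Counter

def complexity_score(text):
    # Two crude proxies for "complicated": unigram entropy (bits per token)
    # and the ratio of unique terms to total terms.
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy, len(counts) / total

print(complexity_score("a a a a"))  # (0.0, 0.25): maximally repetitive
print(complexity_score("a b c d"))  # (2.0, 1.0): every token unique
```

Ranking the retrieved patents by a score like this turns the ambiguous word “complicated” into a concrete, repeatable query.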

This type of forensics design thinking points us in the direction of the next two aspects of causal inference in GPT3: using the information structure of network itself and actuating more systems together.

The reader should try out their own examples of bounding GPT3 to a knowledge graph.

Try hooking up wikipedia API, spotify API, weather or stock APIs and more.

Existing Interfaces/Experiences that Show You The Way

Third Party Knowledge Graphs Useful As Ground Truth

https://products.wolframalpha.com/api/

https://products.wolframalpha.com/api/explorer/

WikiData — a structured data knowledge base and API connected to Wikipedia. The data isn’t always verified, but in some domains it is well curated.

Wikidata API

https://query.wikidata.org

Wikipedia API -

https://www.mediawiki.org/wiki/API:Tutorial

https://en.wikipedia.org/w/api.php?action=query&list=search&srsearch=OpenAI&format=json

PubChem — US Government resource for chemicals

https://pubchem.ncbi.nlm.nih.gov/ and https://pubchemdocs.ncbi.nlm.nih.gov/pug-rest

Google Knowledge Graph

https://developers.google.com/knowledge-graph

TextRazor

https://www.textrazor.com/

DoWhy from Microsoft

https://microsoft.github.io/dowhy/

Causal Sets Approaches

https://deepai.org/publication/algorithmic-causal-sets-and-the-wolfram-model

Using Information Structure Mapping of Transformer Networks

https://en.wikipedia.org/wiki/Entropy_(information_theory)#Data_compression

Sets of Numbers

  1. Get ten random numbers between 1–1000 and feed them into the API as a list
  2. Turn on the probabilities metrics in the Playground so we can measure some information content
  3. Our list has a logprobs total of -79 for our 25 tokens
  4. We expect our next 25 tokens to be similarly structured

Let’s measure:

Let’s test this out with lists of numbers that are NOT so random. Just take 100–109.

The logprobs total is -12. And for the next 25 tokens (10 numbers):

Even at a high temperature setting for the API a well ordered list of tokens is going to CAUSE a well ordered completion.
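The comparison can be sketched numerically; the per-token logprob values below are illustrative stand-ins, not actual API output:

```python
# Made-up per-token logprobs standing in for what the Playground's
# probability view shows: a random number list versus the ordered run 100-109.
random_tokens  = [-3.2, -3.0, -3.4, -2.9, -3.3]
ordered_tokens = [-0.4, -0.5, -0.3, -0.6, -0.4]

# Summing per-token logprobs gives the totals discussed above.
print(sum(random_tokens))   # far from 0: high information content
print(sum(ordered_tokens))  # close to 0: highly predictable continuation
```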

The reader who is inducing at this point should be able to see that if other prompts are well ordered, they will elicit well ordered completions. How far does this basic concept extend? It’s pretty universal… perhaps we did do some metaphysics…

To the extent that a prompt is well ordered, represents something well ordered in the world, and humans have published that well ordered material on the web, GPT3 is going to be well ordered against it.

Let’s push this with examples involving causal relations:

Can we balance chemistry equations this way?

Chemistry Reactions

Here’s a prompt set up for trying these ideas:

https://beta.openai.com/playground/p/uoq5rgUWIH8NkStHsB8qzI3U?model=davinci

This is a valid balancing of the chemical equation as can be validated here: https://www.wolframalpha.com/input/?i=acetone+%2B+oxygen+-%3E+carbon+dioxide+%2B+water&assumption=%22ClashPrefs%22+-%3E+%22%22

https://www.wolframalpha.com/input/?i=CH_3COCH_3+%2B+4+O_2

Left side seems to always have a bit more information (more negative logprob)

Right side always seems a bit more compressed… more entropy (closer to 0 log prob)

This makes sense for chemistry BALANCING: we are effectively moving left to right from REACTANTS yielding PRODUCTS, and what is reacting stabilizes information-wise.

Conservation of mass at play here in this causal relation. So the right side can’t go all over the place and introduce notation that represents different elements or lots more atoms etc.
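The conservation claim is easy to check directly; a quick sketch counting atoms on each side of the balanced acetone reaction:

```python
# Count atoms on each side of CH3COCH3 + 4 O2 -> 3 CO2 + 3 H2O.
left  = {"C": 3, "H": 6, "O": 1 + 4 * 2}          # acetone + 4 oxygen molecules
right = {"C": 3 * 1, "H": 3 * 2, "O": 3 * 2 + 3}  # 3 CO2 + 3 H2O
print(left == right)  # True: mass is conserved
```

A check like this is a cheap external validator for any chemistry completion GPT3 produces.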

The big question is how reliable is this linguistic transformer model in representing chemistry notation that represents chemical interactions and atomic structure?

That’s a question of how good our original notation design was for chemical information itself…

Take the right side of the equation for this last example and try to find some balanced equations in reverse: https://www.wolframalpha.com/input/?i=3+CO_2+%2B+3+H_2O+%E2%9F%B6+

More options… at a first blush… some sense of thermodynamics and information theory here…

We can test this information content hypothesis out with a more entropic query.

https://www.wolframalpha.com/input/?i=sulfuric+acid+%2B+water+-%3E

This ambiguous combination has many possible yields if we mix in some additional reactants.

This is not correctly balanced.

If we turn the temperature down to zero we might be able to further restrict the information flow to not introduce unseen elements and stick within the numeric indicators.

Closer.

The strategy from here can go in a couple directions:

  • Create more few shot examples, but bound them to specific reactants
  • Create a prompt that specifies more of the physical details implied by chemistry notation
  • Combine the algebraic concepts from previous prompts with this chemistry thinking
  • Continue to tune the information theoretic parameters of temperature, top p and frequency
  • Combine the semantic search approach from above to create bounds on what is available to the right side relative to the left (this is probably overkill versus just better orchestrating the prompts above)

Language Itself as Information Causal Inference

  • Left to Right orientation of writing
  • Top to Bottom of pages
  • Capitalization and Punctuation
  • Parts of Speech grammar
  • Noun Verb
  • Object Action State
  • And so on

This is really where a lot of GOFAI/NLP emerged from. Can we extract higher levels of meaning from the structure of our language?

The answer is yes. But the limits of this are also the limits of language, which is why we need to actuate with other kinds of grammars, notations, diagrams, images, sounds, computations, programs, gears, machines and more.

Actuating in Other Modalities

Conversation as Causal Inference Programming

This is a prompt that helps us build up notions of causal programming. We will be able to take what we are doing here and spit out code, mark up or other instructional information to use in other systems that will be Caused To Do Something Reliably.

Prompt 1:

Chat prompt with some initial cause and effect programming.

Completion 1:

Prompt and Completion 2:

Prompt and Completion 3 and 4:

Here we see the prompt and completion are able to keep track of “hidden” information and maintain the causal inference.

Prompt and Completion 5,6,7:

Notice that we had some ambiguity: I did actually say GO FASTER twice, but our instructions weren’t clear about whether GO FASTER should refer to the speaker’s kart or to the person we specify, rather than karts in general.

Completion and Prompt Continued:

We continue on… but we get wobbly as the ambiguity builds up.

Let’s turn on the logprobs so we can assess how we might be reflecting our overall entropy.

The initial prompt has a lot of improbable / relevant content. It takes 3x as many tokens of back and forthing after the initial prompt to reach that level of content. This is very likely the explanation for the wobble in the causal inference later on. We are INCREASING the amount of causal information to pay attention to, and we need more back and forthing to resolve it.

So the issue that we will have to solve to make this coherent is to introduce reliable state machines/state tracking and to keep track of the causal chain in much greater detail.

In programming languages we’d typically have a data structure that we access directly on each state change/function. That same concept can be implicitly implemented here. As we have done in some of the loopback / clarification steps.
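A minimal sketch of that data structure idea, with a hypothetical kart state dictionary and command names standing in for the inline mark up:

```python
# Hypothetical explicit state tracking for the kart chat; the real prompt
# encodes the same structure inline as text mark up.
karts = {"speaker": {"speed": 0}, "rival": {"speed": 0}}

def apply_command(state, kart, action):
    # Resolve the GO FASTER ambiguity by always naming the target kart.
    if action == "GO FASTER":
        state[kart]["speed"] += 1
    elif action == "STOP":
        state[kart]["speed"] = 0
    return state

apply_command(karts, "speaker", "GO FASTER")
apply_command(karts, "speaker", "GO FASTER")
print(karts["speaker"]["speed"])  # 2
```

With the state held explicitly, each completion only has to describe a transition, not reconstruct the whole causal chain from the conversation history.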

Introduce an implied state variable and then have GPT3 make a “choice”

We can tune this by using Frequency Penalty so that more entropy / lossiness / possibility / causal breakage enters this experience:

Now try a better prompt that allows for more explicit state tracking via an inline “mark up”.

https://beta.openai.com/playground/p/Z0lao3h98Jb9VXEfI9D0qTh7?model=davinci

This should make it obvious that this “chat” is really a program specification; the API settings are probabilistic/optionality/frame knobs; and it’s trivial to specify completions that are human readable and can spit out mark up, data sets, instructions or all manner of signal to use in a downstream system. This chat program generator can be used to program a game in Unity, or articulate commands to robot karts, or anything else that can accept a signal to Do Something in Response. Additionally, it should be obvious that the author of this essay, the interlocutor in the examples, is an actuated participant: taking signal in, causally Doing Something to the Playground, and so on. One can easily hook up another GPT3 based player and have GPT3 programming itself this way.

Bring It All Together

  1. Causal Inference and Grounding is an exercise in FRAMING and REFERENCE TRACKING.
  2. If a system needs to be BOUND to a PARTICULAR FRAME then the REFERENCES need to be CLEARLY SPECIFIED and MAINTAINED. (the causal relationships need to be drawn out)
  3. If a system has HIDDEN VARIABLES or unaccounted for assumptions then the FRAME is actually bigger than what might be explicitly specified. This is the single biggest source of confusion in using GPT style language models for reasoning, causal inference and actuators. Humans are not particularly great at identifying their FRAMES, ASSUMPTIONS and IMPLIED REFERENCES.
  4. GPT3 is a very large system capable of encoding ANYTHING you can encode in human language and in any programming language or mathematical calculation. It can bring a great deal of CAUSAL FRAMES to any use case but that is also why one has to get very good at specifying the FRAME and the REFERENCES to track.
  5. The big advantage for GPT3 computation is that it is very general and “natural”. Everything in this essay has been done without a programming terminal, IDE or any other REFRAMER/DEBUGGER. These language models are very expressive and permissive and allow for a huge variety of exploration. Whether it is efficient to do any given causal inference or knowledge grounding in GPT3 depends on the consequences and contexts of a given use case.

In closing, the reader is encouraged to explore the causal inference primers in detail. Additionally the author provides additional GPT3 related tutorials and conceptions in a set of essays that expand in several directions on the material in here.

https://un1crom.medium.com/gpt3-linguistics-101-a-multi-part-series-this-is-part-1-on-structure-a41af3a77353

https://un1crom.medium.com/gpt3-linguistics-101-part-2-on-semantics-meaning-and-corpus-4ae09a5e892c

https://un1crom.medium.com/
