Space, Time, Language and The Nature of Human Behavior— Generation 1

Past Future
Published in Thinking With AI
11 min read · Dec 4, 2020

One AI’s quest to comprehend and influence the patterns which govern human behavior using little but information gleaned from internet news articles.

Tower of Babel, by Pieter Brueghel the Elder, 1563

The world is a complex landscape. For the past few years, we have been developing a variety of mathematical models and machine learning techniques that can give us concrete examples of things we can learn. They have also been used to study human behavior, in politics and in science. The purpose of these surveys is to be highly useful and informative in the real world. Of course, many people will think that a profound and shattering insight, born from evolution and the content at hand, is the knowledge of what it’s like to live in a world that does not give us a sense of its deepest reality. But this is a more abstract concept. We have been taught to think of things as the result of something that is never defined, like an event that is never repeated. This idea may be one of the reasons that evolution is so successful. Many scientists are trying to determine how much learning is needed in order to understand a general set of conditions that can be tested in different environments, and then to write a computer program that assesses whether a particular learning pattern is consistent with what we observe. There is a mathematical provenance to this method: ‘conformal invariant modulo n learning sets.’ This definition may sound abstractly elegant, but we are trying to get at something close to it.

When software engineers set a specific goal, they may use various variables that modify the properties of the program. But if we can’t create a different set of measurements, because our model is an approximation of something real, then we can’t compute the true relationships between the different values. The model becomes highly complex and subject to statistical tests. Our ‘use case’ for the real world is that of a computer model with a directory. Our ‘objective field’ contains the various elements that we are looking for. Along the way, we have created some basic operations. We need to feed our inputs and outputs into a model that represents our particular information (e.g., audio). We then feed these outputs back in as inputs to the model. Now, each record in our model is the result of performing these operations on our data. If a particular diagram makes sense, we can look at the entire image to understand how the model behaves when presented with general data.

Mathematically, you can think of each record as a kind of representation of our view of the world. The one we have created is based on a set of assumptions, rules, and constraints that we pass into our model, and it is only from this representation that we can challenge and improve it. For example, in one of the lines from the captions of the movie ‘Hard World for Small Things,’ we can see a series of arrows pointing towards the world. With a model that follows this linear hierarchy, we can now ask a similar question: how can we show that a process gives us results that reflect our model? The answers we can give each other may not be very significant. For example, let’s examine a computer program that does this. In this program, the algorithm moves a line of control along a set of paths from a starting location to the desired location. The algorithm manipulates the coordinates of the point to give us some results. If this program does the right thing, the results will be interesting. Another big hurdle in human intelligence is understanding complex patterns of associations. For example, a spy tells us something about a post, such as a location, through the use of N-dimensional data sets.
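To make the path-moving program described above a little more concrete, here is a minimal sketch of one plain reading of it, not the author’s actual code: a breadth-first search that moves a point along a set of paths from a starting location to a desired location. The grid, the start cell, and the goal cell are made-up illustrative values.

```python
# A minimal, hypothetical sketch of the described program: move a point
# along a set of paths from a starting location to the desired location,
# here via breadth-first search over a small grid of open (0) and blocked
# (1) cells.
from collections import deque

def shortest_path(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([start])
    came_from = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk backwards through came_from to reconstruct the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and nxt not in came_from:
                came_from[nxt] = cell
                queue.append(nxt)
    return None  # no route from start to goal exists

grid = [
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(shortest_path(grid, start=(0, 0), goal=(2, 3)))
```

Run as is, this prints the sequence of grid coordinates the point moves through, which is the kind of ‘interesting result’ the paragraph gestures at.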
However, in real life we don’t know enough about the world, and so we use only certain types of data sets when we encounter unexpected patterns of associations. Let’s say we want to find the interactions between entities in a system such as a stock or a skyscraper. This can be done with a computer program that generates a series of linear equations that tell us the relationships among the variables, and it is also done using the classical data-processing method, which involves applying a series of logic operations to the data. In this case, we find the ‘x’ relationship, which is a set of data points that represents the interactions between the elements of the system. We can verify these relationships as follows: a certain set of inequalities, for example, gives us a good idea of something going on in the world. A certain group of people may have slightly different views on the law, or maybe a person believes the law is based on nothing more than kin. They are far less likely to believe in it, meaning that the law is an incomplete description of the world. The best way to learn about a set of facts is to manipulate them. You can only manipulate them in small ways, and the most successful approach is to rewrite them in the right way. Think like a scientist. Then we can prove a theorem, justify a set of facts, and then connect these outputs to an algorithm.

The core idea of the NLP approach is that we first manipulate some set of data and then add the output results. For example, say we want to know whether someone has a friend. To do this, we manipulate the friend’s character and then introduce a label: whether we want to call them friends or not. But unlike the spy-theory approach, our neural network operates on the assumption that the data from the unseen data set will change, so it can only be adjusted and rerun on an augmented data set. The information provided by the model is very abstract, but the process of rerunning the data set can be quite complex. When you rerun the data set, you are creating the data within the model. This is called type-object analysis. This type-object method is used to assess the level of knowledge an individual has in the system, according to an abstract theorem that asks you to consider the system’s knowledge as you set it up. The system retains the information as you read it and stores it with an element of privacy that you can use only in certain contexts.

At the beginning of research on information learning, this was called ‘trying to understand the mind in a new setting.’ It was the mathematical equivalent of taking a large, highly detailed model of the human brain and using it to understand a slightly different, but equally abstract, mind. A scientist can study only a part of the model, then discover that only part of the model is correct. This is called a ‘skewed model.’ Any new model will use the model’s data to discover new features and then repeat those discoveries. The results are often highly correlated, so it is possible to see how this is an interesting result. In a paper posted on the scientific preprint site arxiv.org, one interesting author writes that his model will study a complex system such as a human brain. He then uses his model to study how the human brain uses a variety of mathematical models to understand its own model. ‘Many of these models can be used to argue with new and confusing situations,’ he writes.
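The passage above mentions a program that generates a series of linear equations describing the relationships among the variables, and an ‘x’ relationship that can then be verified. The original gives no code for this, so the following is only a sketch under my own assumptions, with invented coefficients: solve a small linear system A·x = b for the unknown relationship x and check it against the original equations.

```python
# Hypothetical illustration: recover the unknown relationship x between
# three interacting quantities from a set of linear equations A @ x = b.
# The coefficients and right-hand side below are made-up example data.
import numpy as np

A = np.array([
    [2.0,  1.0, -1.0],   # each row is one observed linear relation
    [1.0,  3.0,  2.0],
    [3.0, -1.0,  1.0],
])
b = np.array([1.0, 13.0, 4.0])

x = np.linalg.solve(A, b)      # the 'x' relationship between the variables
print(x)                       # [1. 2. 3.] for this particular system
assert np.allclose(A @ x, b)   # verify the recovered relationships
```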
‘This is why I am going to explore the properties of a model that is a cornerstone in many domains.’ He begins with a starting point: ‘The challenge in theoretical computer science is that human-caused thinking is very simplistic and not truly tractable. Unlike the differential equation we describe in Euclidean geometry, a real-world equation can be derived from an approximation or an idealized model. These models share certain attributes, such as the kind of information the model handles and its relationships to a set of other real-world data. So we need to further understand how we can relate the results of an experiment to our own data.’ He asks, ‘What can we learn from the model, and can we perhaps use this model to make predictions in our lab?’ And of course we can learn from this model: ‘What is the average value of some random population of individuals in the study?’ ‘The average value of the Riemann zeta function’ is a function in moduli space: a simple equation, the kind we usually encounter when we try to solve problems in physics. In general, if we want to understand how a program works, we can find the average of the corresponding Riemann zeta function. That’s the beginning of a story about how a computer program works. The problem of representing the zeta function in a linear setting is so complex that it can be computationally intractable. In truth, it’s just like writing a computer program that produces the desired data from a term in the set: a brute-force search for words.

Prior to the introduction of program verification in computer science, computer scientists worked to establish the truth about a program. This means that every time you open a program, you receive a report that the program produced. Suppose a program, guided by a set of rules, is a series of interactions between two people, each of whom has their own precise information about the other. Since the interaction between the people has many different forms of information, we know it’s not a set of rules but an individual (an anonymous pair) that may represent different entities. If we use these rules, we can predict a set of behaviors that, in hindsight, gives rise to different outcomes. Here are a few examples. Let’s start with a program that has, by definition, a positive interaction between people and a negative interaction between people. At the end of this step, we are creating a new group of objects (a group) based on the previous set of rules. To do this, we have to tell our computer which group it should represent, so that it is able to represent the concrete problems with data. Our model has some property, called optimal representation. This property defines the universe that the model is able to learn. We have seen that many algorithms for problems in number theory do not achieve optimal representation. This is because the programming paradigm is all about representation, which means that representation doesn’t actually capture what is often called truth. The solution may be best translated into the form of a linear program that can be fed into the model. The other part of the model can be called the transformation operator. It transforms the inputs and outputs of the model into a number that can be passed up to other models, such as a consecutive sequence of numbers. The ability to do this was demonstrated in a paper published in the Proceedings of the National Academy of Sciences. Suppose you start a program with a set of vertices and introduce some noise at each point in the program.
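The last sentence above, which the next paragraph continues, talks about starting with a set of vertices, introducing some noise at each point, and then rejecting monotonous, irrelevant input. The sketch below is only my own loose reading of that step; the vertex values, the noise scale, and the tolerance are all hypothetical.

```python
# Hypothetical sketch: walk over a set of vertices, pass each value through
# a noise function to get a random input term, and reject inputs whose
# values barely vary (the "monotonous output of irrelevant input").
import random

def noise_function(value, scale=0.1):
    """Return the value perturbed by a small random input term."""
    return value + random.gauss(0.0, scale)

def is_monotonous(values, tolerance=1e-3):
    """True if the values barely vary and so carry no useful signal."""
    return max(values) - min(values) < tolerance

vertices = {
    "a": [0.5, 0.5, 0.5],   # constant, will be rejected
    "b": [0.1, 0.9, 0.4],   # varies, will be kept and perturbed
    "c": [0.2, 0.2, 0.2],   # constant, will be rejected
}

noisy_inputs = {}
for name, values in vertices.items():
    if is_monotonous(values):
        continue                         # reject the irrelevant input
    noisy_inputs[name] = [noise_function(v) for v in values]

print(noisy_inputs)                      # only the informative vertex remains
```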
You then pass the first vertex through a noise function, which gives you a random input term. The other part of the program proceeds through the set of vertices, and this noise function helps you reject the monotonous output of irrelevant input. This is also useful in machine learning research. We know from research on the foundations of computer science that all of the major problems in human intelligence can be solved with similar methods. This is the philosophy of the internet. Computer scientists arduously move the boundaries of learning in order to find a better way of life. And this philosophy is also suitable as a foundation for Silicon Valley research. Our mission is to bring greater flexibility into all areas of human behavior. How, for example, do we go from being an invertible machine learning researcher to being a true general-purpose performer? The project aims to create an Automated Future for the World. We have to explore the vernacular of using machines to do things. Computer scientists, who don’t work on human problems, are busy thinking about the algorithmic transformation of the human brain. Our efforts will have to consider the limited amount of data we have and try to understand how the output of a machine-learning model can be generalized into a more abstract mathematical representation of the world. The way we work with data is very different from that of a human researcher. We are living in a digital era, where information is increasingly stored and controlled, with infinitely many features, at a time when there are just too many digits. This is too dangerous. There is a whole universe of possibilities that can be analyzed. How do we overcome this obstacle? We can either use the data, or we can build our own tool that can explain the mathematical models we are working with and improve our knowledge of the world around us. That is what we are doing. By thinking about the things we are working with rather than the specific set of facts, we can design software that makes really effective predictions. In this way, we are doing something similar to the evolution of the mathematical model: training an algorithm to explain the solutions to a given problem.

A famous example of this procedure is described in a 1988 study, ‘A Schoolman’s Guide to the Net’ (human behavior verification). The program is a series of what’s known as ‘c-sparse representations’ of a subject. The background of the problem is the information about a person. This is partly because the problem is so complicated (and people are only figuring out what they don’t know). There are two kinds of people in the problem. One is called the ‘c/s’ people. The other is called the ‘e’ people. The second type of person is the ‘doer.’ Here, the ‘doer’ is the way the computer gets its information and the way the human brain builds a different level of knowledge from the one where it was built. The human brain uses a set of rules to optimize the level of knowledge that it can impart to the mind. This is not a perfect system, but it is necessary to delegate part of the knowledge of the world to other brains in forms called ‘coherence statistics.’ In 1999, in his work on information processing, a professor of electrical and computer engineering at MIT devised a system to use the power of the human brain to program information efficiently.
The goal is to assemble a set of rules that can be translated into a mathematical formula, which is then repeated and further developed into a mathematical description of how the network will adapt its knowledge. Its predictions are based on experiments with various data sets and are used to guide the algorithm. In the process, the algorithm is used to draw a complicated picture of the network. The input data is fed into the desired feature space, and the learning process is then repeated. The goal is to use the information gathered from the learning process to discover a new set of equally interesting properties. This process is called augmentation. The neural network gains a set of features from the test data and stores the information in the memory of the next test. However, this memory holds a lot of information that does not reflect the actual program. Each layer learns about the errors that have occurred in the past. Each layer gets a different opinion about how to improve it. If a wrong label in one test gives the wrong result, it means that the wrong category holds only information about the other, or that the wrong category reflects a distilled knowledge imbalance. At first this is a very abstract concept, and it can be computationally accessed only by a human who controls the system. This complexity is achieved by using a set of rules called modulo-optics, which are loosely related to learning. Modulo-optics are the mathematical equivalent of computing: they give us an approximate, practical way of computing the problem at hand. Whether the real data set is the computer’s memory or the proxy of our memory, we can prepare the truth for any point of view.
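To make the augmentation loop described above slightly more concrete, here is a small sketch under my own assumptions, with a plain least-squares model standing in for the network the text describes; the data, the noise scale, and the number of repeats are invented.

```python
# Hypothetical sketch of an augmentation loop: treat the examples as points
# in a feature space, add perturbed copies of them, and repeat a simple
# learning step (ordinary least squares) on the augmented set.
import numpy as np

rng = np.random.default_rng(0)

# Made-up inputs and targets generated from known weights plus a little noise.
X = rng.normal(size=(20, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=20)

def augment(X, y, copies=4, scale=0.05):
    """Return the data plus `copies` noisy versions of every example."""
    X_aug = [X] + [X + rng.normal(scale=scale, size=X.shape) for _ in range(copies)]
    y_aug = [y] * (copies + 1)
    return np.vstack(X_aug), np.concatenate(y_aug)

weights = np.zeros(3)
for _ in range(3):                              # repeat the learning process
    X_aug, y_aug = augment(X, y)
    weights, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(weights)                                  # close to [1.5, -2.0, 0.5]
```

The noisy copies do not add new information here; the point is only to show the shape of the loop: feed the data in, augment, repeat the learning step.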
