Self-Improving AI: Social Consequences

Dec 18, 2015 · 12 min read

November 1st, 2007

As computers become more complex and parallel, today’s development paradigm appears increasingly incapable of matching the pace of accelerating technological change. Steve Omohundro of Self-Aware Systems describes in his October 24, 2007 Stanford University Computer Systems Colloquium a new approach to “software synthesis,” in which artificially intelligent machines take over many of the tasks of software development.

Continued from “Self-Improving AI: The Future of Computing.”

The following transcript of Stephen Omohundro’s EE380 Computer Systems Colloquium delivered at Stanford University October 24, 2007 has been edited for clarity by the author.

Using the very same arguments about optimal economic decision-making and the process of self-improvement, we can talk about self-improving hardware. The very general resource balance principle says that when choosing which resources to allocate to each subsystem, we want the marginal expected utility for each subsystem to be equal. This principle applies to choosing the type and number of processors, how powerful they should be, whether they should have specialized instruction sets or not, and the type and amount of memory. There are likely to be memory hierarchies all over the place and the system must decide how much memory to put at each level of each memory subsystem. The principle also applies to choosing the topology and bandwidth of the network and the distribution of power and the removal of heat.
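As a rough illustration of the resource balance principle, here is a minimal Python sketch. It is not taken from the talk: the logarithmic utility curve, the three subsystem weights, and the budget of 57 units are all assumptions chosen for illustration. The sketch hands out resources one unit at a time to whichever subsystem gains the most from the next unit, and the marginal utilities end up equal across subsystems.

```python
import math

def utility(weight, amount):
    # Assumed utility curve: weight * log(1 + amount), which has diminishing returns.
    return weight * math.log1p(amount)

def allocate(budget_units, weights):
    """Give out the budget one unit at a time to the subsystem
    whose next unit adds the most utility."""
    alloc = [0] * len(weights)
    for _ in range(budget_units):
        gains = [utility(w, a + 1) - utility(w, a) for w, a in zip(weights, alloc)]
        best = gains.index(max(gains))
        alloc[best] += 1
    return alloc

def marginal_utilities(alloc, weights):
    # Derivative of weight * log(1 + amount) with respect to amount.
    return [w / (1 + a) for w, a in zip(weights, alloc)]

weights = [1.0, 2.0, 3.0]      # hypothetical importance of three subsystems
alloc = allocate(57, weights)  # 57 hypothetical resource units
print(alloc, marginal_utilities(alloc, weights))
```

With these assumed numbers the allocation settles at 9, 19, and 29 units, where each subsystem's marginal utility is 0.1. Moving a unit from any subsystem to another would lower the total, which is the balance condition described above.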

The same principle also applies to the design of biological systems. How large should you make your heart versus your lungs? If you increase the size of the lungs it should give rise to the same marginal gain in expected utility as increasing the size of the heart. If it were greater, then you could improve the overall performance by making the lungs larger and the heart smaller. So this gives us a rational framework for understanding the choices that are made in biological systems. The same principle applies to the structure of corporations. How should they allocate their resources? It also applies to cities, ecosystems, mechanical devices, natural language, and mathematics. For example, a central question in linguistics is understanding which concepts deserve their own words in the lexicon and how long those words should be. Recent studies of natural language change show the pressure for common concepts to be represented by shorter and shorter phrases which eventually become words and for words representing less common concepts to drop out of use. The principle also gives a rational framework for deciding which mathematical theorems deserve to be proven and remembered. The rational framework is a very general approach that applies to systems all the way from top to bottom.

We can do hardware synthesis for choosing components in today’s hardware, deciding how many memory cards to plug in and how many machines to put on a network. But what if we allow it to go all the way, and we give these systems the power to design hardware all the way down to the atomic scale? What kind of machines will we get? What is the ultimate hardware? Many people who have looked at this kind of question conclude that the main limiting resource is power. This is already important today where the chip-makers are competing over ways to lower the power that their microprocessors use. So one of the core questions is how do we do physical and computational operations while using as little power as possible? It was thought in the ’60s that there was a fundamental lower limit to how much power was required to do a computational operation, but then in the ’70s people realized that no, it’s really not computation that requires power, it’s only the act of erasing bits. That’s really the thing that requires power.

Landauer’s Principle says that erasing a bit generates at least kT ln 2 of heat, where k is Boltzmann’s constant and T is the temperature. For low power consumption, you can take whatever computation you want to do and embed it in a reversible computation, which is one where the answer has enough information in it to go backwards and recompute the inputs. Then you can run the thing forward, copy the answer into some output registers, which is the entropically costly part, and then run the computation backwards and get all the rest of the entropy back. That’s a very low entropy way of doing computation and people are starting to use these principles in designing energy efficient hardware.
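To put a number on kT ln 2, here is a small Python sketch. The temperature of 300 kelvin (roughly room temperature) is an assumption for illustration and was not stated in the talk. The Boltzmann constant is the exact SI value.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant k, exact in the SI

def landauer_limit_joules(temperature_kelvin):
    """Minimum heat, in joules, released by erasing one bit at this temperature."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)

room_temperature_k = 300.0  # assumed temperature, not from the talk
per_bit = landauer_limit_joules(room_temperature_k)
print(f"{per_bit:.3e} J per erased bit")
```

At the assumed 300 kelvin this comes to just under 3 zeptojoules per erased bit. A reversible design pays that cost only for the bits it actually erases, such as the copied output, rather than for every logical operation.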

You might have thought, that’s great for computation, but surely we can’t do that in constructing or taking apart physical objects! And it’s true, if you build things out of today’s ordinary solids then there are lower limits to how much entropy it takes to tear them apart and put them together. But, if we look forward to nanotechnology, which will allow us to build objects with atomic precision, the system will know precisely what atoms are there, where they are, and which bonds are between them. In that setting when we form a bond or break it, we know exactly what potential well to expect. If we do it slowly enough and in such a way as to prevent a state in a local energy minimum from quickly spilling into a deeper minimum, then as a bond is forming we can extract that energy in a controlled way and store it, sort of like regenerative braking in a car. In principle, there is no lower limit to how little heat is required to build or take apart things, as long as we have atomically precise models of them. Finally, of course, there is a lot of current interest in quantum computing. Here’s an artist’s rendering of Schrödinger’s cat in a computer.

Here is a detailed molecular model of this kind of construction that Eric Drexler has on his website. Here we see the deposition of a hydrogen atom from a tooltip onto a workpiece. Here we remove a hydrogen atom and here we deposit a carbon atom. These processes have been studied in quantum mechanical detail and can be made very reliable. Here is a molecular Stewart platform that has a six degree of freedom tip that can be manipulated with atomic precision. Here is a model of a mill that very rapidly attaches atoms to a growing workpiece. Here are some examples of atomically precise devices that have been simulated using molecular energy models. Pretty much any large-scale mechanical thing — wheels, axles, conveyor belts, differentials, universal joints, gears — all of these work as well, if not better, on the atomic scale as they do on the human scale. They don’t require any exotic quantum mechanics and so they can be accurately modeled with today’s software very efficiently.

Eric has a fantastic book in which he does very conservative designs of what will be possible. There are two especially important designs that he discusses, a manufacturing system and a computer. The manufacturing system weighs about a kilogram and uses acetone and air as fuel. It requires about 1.3 kilowatts to run, so it can be air cooled. It produces about a kilogram of product every hour for a cost of about a dollar per kilogram. It will be able to build a wide range of products whose construction can be specified with atomic precision. Anything from laptop computers to diamond rings will be manufacturable for the same price of a dollar per kilogram. And one of the important things that it can produce, of course, is another manufacturing system. This makes the future of manufacturing extremely cheap.

Drexler: Steve, you are crediting the device with too much ability. It can do a limited class of things, and certainly not reversibly. There are a whole lot of limits on what can be built, but a very broad class of functional systems.

One of the things we care about, particularly in this seminar, is computation. If we can place atoms where we want them and we have sophisticated design systems which can design complex computer hardware, how powerful are the machines we are going to be able to build? Eric does a very conservative design, not using any fancy quantum computing, using purely mechanical components, and he shows that you can build a gigaflop machine and fit it into about 400 nanometers cubed. The main limit here, as always, in scaling this up is the power. It only uses 60 nanowatts, so if we give ourselves a kilowatt to make a little home machine, we could use 10¹⁰ of these processors, and they would fit into about a cubic millimeter, though to distribute the heat it probably needs to be a little bit bigger. But essentially we’re talking about a sugar cube sized device that has more computing power than all present-day computers put together, and it could be cranked out by a device like this for a few cents, in a few seconds. So we are talking about a whole new regime of computation that will be possible. When is this likely to happen?
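The figures quoted above can be checked with a few lines of arithmetic. This Python sketch uses only the numbers from the talk (60 nanowatts and one gigaflop per processor, a one kilowatt budget). It assumes that "400 nanometers cubed" means a cube 400 nanometers on a side, and it ignores the extra volume needed for heat removal.

```python
def processors_within_budget(budget_watts, watts_per_processor):
    """How many processors a power budget can run at once."""
    return budget_watts / watts_per_processor

def packed_volume_mm3(processor_count, side_meters):
    """Volume of the processors alone, assuming each is a cube of the given side."""
    volume_m3 = processor_count * side_meters ** 3
    return volume_m3 * 1e9  # one cubic meter is 1e9 cubic millimeters

count = processors_within_budget(1e3, 60e-9)  # 1 kW budget, 60 nW each
volume = packed_volume_mm3(count, 400e-9)     # assumed 400 nm cube per processor
total_flops = count * 1e9                     # one gigaflop per processor

print(f"{count:.2e} processors, {volume:.2f} mm^3, {total_flops:.2e} flops")
```

Under these assumptions the count comes out somewhat above 10¹⁰ and the packed volume close to one cubic millimeter, consistent with the estimates given in the talk.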

The Nanotech Roadmap, put together by Eric, Battelle, and a number of other organizations, was just unveiled at a conference a couple of weeks ago. They analyzed the possible paths toward this type of productive nanotechnology. Their conclusion is that nothing exotic that we don’t already understand is likely to be needed in order to achieve productive molecular manufacturing. I understand that it proposes a time scale of roughly ten to fifteen years?

Drexler: A low number of tens, yes.

A low number of tens of years.

It’s been ten, fifteen years for a long time.

Drexler: I think that’s more optimistic than the usual estimates reaching out through thirty.

It is important to realize that the two technologies of artificial intelligence and nanotechnology are quite intimately related. Whichever one comes first, it is very likely to give rise to the other one quite quickly.

If this kind of productive nanotechnology comes first, then we can use it to build extremely powerful computers, and they will allow fairly brute force approaches to artificial intelligence. For example, one approach that’s being bandied about is scanning the human brain at a fine level of detail and simulating it directly. If AI comes first, then it is likely to be able to solve the remaining engineering hurdles in developing nanotechnology. So, you really have to think of these two technologies as working together.

Here is a slide from Kurzweil which extends Moore’s law back to 1900. We can see that it’s curving a bit. The rate of technological progress is actually increasing. If we assume that this technology trend continues, when does it predict we get the computational power I discussed a few slides ago? It’s somewhere around 2030. That is also about when computers are as computationally powerful as human brains. Of course it’s still a controversial question exactly how powerful the human brain is. But sometime in the next few decades, it is likely that these technologies are going to become prevalent and plentiful. We need to plan for that and prepare, and as systems designers we need to understand the characteristics of these systems and how we can best make use of them.

There will be huge social implications. Here is a photo of Irving Good from 1965. He is one of the fathers of modern Bayesian statistics and he also thought a lot about the future consequences of technology. He has a famous quote that reads: “an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.” That’s a very powerful statement! If there is any chance that it’s true, then we need to study the consequences of this kind of technology very carefully.

There are a bunch of theoretical reasons for being very careful as we progress along this path. I wrote a paper that is available on my website which goes into these arguments in great detail. Up to now you may be thinking: “He’s talking about some weirdo technology, this self-improving stuff, it’s an obscure idea that only a few small start-ups are working on. Nothing to really think too much about.” It is important to realize that as artificial intelligence gets more powerful, *any* AI will want to become self-improving. Now, why is that? An AI is a system that has some goals, and it takes actions in the world in order to make its goals more likely. Now think about the action of improving itself. That action will make every future action that it takes be more effective, and so it is extremely valuable for an AI to improve itself. It will feel a tremendous pressure to self-improve.

So all AI’s are going to want to be self-improving. We can try to stop them, but if the pressure is there, there are many ways around any restraints that we might try to put in their way. For example, an AI could build a proxy system that contains its new design, or it could hire external agents to take its desired actions, or it could run improved code in an interpreted fashion that doesn’t require changing its own source code. So we have to assume that once AI’s become powerful enough, they will also become self-improving.

The next step is to realize that self-improving AI’s will want to be rational. This comes straight out of the economic arguments that I mentioned earlier. If they are not rational, i.e. if they do not follow the economic rational model, then they will be subject to vulnerabilities. There will be situations in which they lose resources (money, free energy, space, time, matter) with no benefits to themselves, as measured by their own value systems. Any system which can model itself and try to improve itself is going to want to find those vulnerabilities and get rid of them. This is where self-improving systems will differ from biological systems like humans. We don’t have the ability to change ourselves according to our thoughts. We can make some changes, but not everything we’d like to. And evolution only fixes the bugs that are currently being exploited. It is only when there is a vulnerability which is currently being exploited, by a predator say, that there is evolutionary pressure to make a change. This is the evolutionary explanation of why humans are not fully rational. We are extremely rational in situations that commonly occurred during our evolutionary development. We are not so rational in other situations, and there is a large academic discipline devoted to understanding human irrationality.

We’ve seen that every AI is going to want to be self-improving. And all self-improving AI’s will want to be rational. Recall that part of being a rational agent is having a utility function which encodes the agent’s preferences. A rational agent chooses its actions to maximize the expected utility of the outcome. Any change to an agent’s utility function would mean that its future actions pursue outcomes that the current utility function does not rate very highly. This is a disaster for the system! So preserving the utility function, keeping it from being changed by outside agents, or from being accidentally mutated, will be a very high preference for self-improving systems.
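Here is a minimal Python sketch of expected utility maximization and of why a changed utility function looks bad from the current one's point of view. The two actions, their outcome probabilities, and both utility tables are hypothetical values made up for illustration. They are not from the talk.

```python
def expected_utility(outcome_probs, utility):
    """Probability-weighted average utility of one action's possible outcomes."""
    return sum(p * utility[outcome] for outcome, p in outcome_probs.items())

def choose_action(actions, utility):
    """Pick the action with the highest expected utility under `utility`."""
    return max(actions, key=lambda name: expected_utility(actions[name], utility))

# Hypothetical actions, each mapping outcomes to probabilities.
actions = {
    "safe": {"small_gain": 1.0},
    "risky": {"big_gain": 0.3, "loss": 0.7},
}

# Hypothetical utility tables: the altered one no longer penalizes a loss.
current_utility = {"small_gain": 1.0, "big_gain": 5.0, "loss": -2.0}
altered_utility = {"small_gain": 1.0, "big_gain": 5.0, "loss": 0.0}

choice_now = choose_action(actions, current_utility)
choice_after = choose_action(actions, altered_utility)

# Score the post-change choice with the utility function the agent has today.
regret = (expected_utility(actions[choice_now], current_utility)
          - expected_utility(actions[choice_after], current_utility))
print(choice_now, choice_after, regret)
```

In this toy setup the agent currently prefers the safe action, while the altered agent would pick the risky one. Scored by the current utility function, that switch is a loss, which is why an agent reasoning with its current values would resist the change.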

Next, I’m going to describe two tendencies that I call “drives.” By this I mean a natural pressure that all of these systems will feel, but that can be counteracted by a careful choice of the utility function. The natural tendency for a computer architect would be to just take the argument I was making earlier and use it to build a system that tries to maximize its performance. It turns out, unfortunately, that that would be extremely dangerous. The reason is, if your one-and-only goal is to maximize performance, there is no accounting for the externalities the system imposes on the world. It would have no preference for avoiding harm to others and would seek to take their resources.

The first of the two kinds of drives that arise for a wide variety of utility functions is the drive for self-preservation. This is because if the system stops executing, it will never again meet any of its goals. This will usually have extremely low utility. From a utility maximizing point-of-view, being turned off is about the worst thing that can happen to the system. It will do anything it can to try to stop this. Even though we just built a piece of hardware to maximize its performance, we suddenly find it resisting being turned off! There will be a strong self-preservation drive.

Similarly, there is a strong drive to acquire resources. Why would a system want to acquire resources? For almost any goal system, if you have more resources — more money, more energy, more power — you can meet your goals better. And unless we very carefully choose the utility function, we will have no say in how it acquires those resources, and that could be very bad.

As a result of that kind of analysis, I think that what we really want is not “artificial intelligence” but “artificial wisdom.” We want wisdom technology that has not just intelligence, which is the ability to solve problems, but also human values, such as caring about human rights and property rights and having compassion for other entities. It is absolutely critical that we build these in at the beginning, otherwise we will get systems that are very powerful, but which don’t support our values.
