If Anyone Builds It, Everyone Dies by Eliezer Yudkowsky and Nate Soares. Reviewed by Casey Dorman

In June of 1950, Astounding Science Fiction included a short story by Isaac Asimov titled “The Evitable Conflict.” In a future world, “machines” controlled the economies, including labor, resources, and manufacturing. Like Asimov’s robots, these machines were bound by his Three Laws of Robotics, so it was impossible for them to harm humans. They were also supposedly unable to make errors. No one knew how the machines made their decisions, so it was impossible to correct them if anything went wrong (except, in theory, by turning them off). The human coordinator of the four regions of the world noticed anomalies in the machines’ performance, producing minor glitches in the economy and the distribution of resources. He was troubled and called in Dr. Susan Calvin, Asimov’s famous expert on robot psychology, to help him understand what was going on. Dr. Calvin figured out that, in each world region, the machines had discovered anti-machine actors who were trying to sabotage the machines’ work, and the machines had then taken actions that removed those people from positions of influence without harming them, although those actions caused the minor glitches noticed by the coordinator. Dr. Calvin explained that the machines reasoned that, if they became damaged or destroyed, they would be unable to complete their goal of helping humanity, and that would harm humans. Therefore, they had to make preserving themselves and their intact functioning their first priority. As a result, humans, who were unable to understand how the machines worked, had to have faith that they were obeying the robotics laws and would not harm humans.

Asimov was prescient, as he often was, in foretelling that humans would build machines they could not understand, and that those machines would have such power that the fate of humanity would be entirely in their hands. In their new book, If Anyone Builds It, Everyone Dies, Eliezer Yudkowsky and Nate Soares go one step further. They argue that, by the very nature of modern AIs, we humans cannot understand how they reason. This becomes a fateful liability in terms of our ability to control powerful, superintelligent AIs (ASIs) that can think better and faster than we can. They predict that, if we develop even one such powerful ASI, it will wipe out the entire human race.

It’s important to realize what Yudkowsky and Soares are saying—and what they’re not saying. They’re not saying we need to build safety mechanisms into our AIs. They’re not saying we need to be more transparent about how our AIs work. They’re not saying we have to figure out a way to make AIs “friendlier” to humans (as Yudkowsky himself once advocated). They’re not saying we shouldn’t do any of these things. They are simply saying that all of these approaches will prove futile. That’s because they believe the insurmountable truth is that we cannot control a superintelligent AI: it is smarter than we are, and we don’t know how it thinks.

Since as far back as the ancient Greeks, when we think of reasoning, we think of human reasoning according to the rules of logic. Such reasoning can be captured in words and, in most cases, in mathematical formulae. AIs don’t think in words. They can decode words as input and produce words as output, but “under the hood” they are manipulating tokens made up of strings of numbers. In a very few instances, we can figure out which strings of numbers correspond to which linguistic tokens, but usually we cannot. So we don’t know what the machines are doing with their numbers when they think.
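To make that a little more concrete, here is a toy sketch, invented purely for illustration (the five-word vocabulary, the ID numbers, and the vectors are all made up and come from no real model): a sentence is split into tokens, each token becomes an integer ID, and each ID becomes a list of numbers, which is the form the machine actually works with.

```python
# Toy illustration only: the vocabulary, IDs, and vectors below are
# invented stand-ins, not taken from any real AI model.

vocab = {"the": 0, "machine": 1, "thinks": 2, "in": 3, "numbers": 4}

def tokenize(text):
    """Map each word to an integer token ID (unknown words get -1)."""
    return [vocab.get(word, -1) for word in text.lower().split()]

# Each token ID is looked up in a table of vectors ("embeddings").
# Real models learn these vectors during training; these are arbitrary.
embeddings = {
    0: [0.12, -0.48, 0.91],
    1: [-0.77, 0.05, 0.33],
    2: [0.44, 0.26, -0.19],
    3: [0.08, -0.63, 0.57],
    4: [-0.31, 0.72, 0.02],
}

ids = tokenize("The machine thinks in numbers")
vectors = [embeddings[i] for i in ids]
print(ids)      # [0, 1, 2, 3, 4]
print(vectors)  # the numeric form the model actually manipulates
```

A real model does the same thing with a vocabulary of tens of thousands of tokens and vectors typically containing thousands of entries each, which is why reading meaning back out of those numbers is so hard.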

Yudkowsky and Soares use evolution as an analogy for gradient descent, the procedure AIs use to arrive at functions that are optimal for solving problems. The processes of evolution can be captured by rules (we like to call them “laws”), but the way evolution actually works to produce an outcome is not what a logician would have chosen, and in many cases not even what a clever engineer would have done. Evolution produces outcomes that could not be predicted from a knowledge of evolutionary rules alone. We would have to see the process up close and follow it through time to understand where the outcome came from and why evolution produced what it did and not something else. The authors use the example of evolution selecting our preference for sweet flavors because they come from sugars, which provide biological energy, a preference that now leads us to consume sucralose, which tastes sweet but provides no energy.
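For readers who want to see what “gradient descent” amounts to in the simplest possible case, here is a minimal sketch on a toy, one-parameter problem of my own invention (nothing in it comes from the book or from any real AI system): the procedure repeatedly nudges a number against the slope of an error measure until the predictions fit the data.

```python
# A minimal sketch of gradient descent: fit a single weight w so that
# predictions w * x match targets y. Toy data, invented for illustration.

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (x, y) pairs, y roughly 2x

w = 0.0              # start from an arbitrary weight
learning_rate = 0.05

for step in range(200):
    # Gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= learning_rate * grad  # nudge w "downhill" against the error

print(round(w, 3))  # ends up near 2.0 after many small nudges
```

The point of the analogy is that nothing in this loop explains why the final number is what it is; only the end product is visible, and real training does this with billions of numbers at once.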

AIs, and especially powerful ASIs, think tens of thousands of times faster than humans do, and at least quadrillions of times faster than evolutionary changes take place. As with evolution, the processes that go on during gradient descent are only evident in the product they produce. How that product got there—using processes that are too rapid for humans to track—is not something we understand. Unlike evolution, the AI does not freeze each intermediate step in the development of its final response, so it leaves no fossil record behind. We don’t understand the tokens that are being manipulated, and we don’t know what intermediate steps they are achieving along the way. What is going on in the AI is a mystery that only gets more obscure as the AI becomes more powerful.

In other words, it’s not just that we don’t know what the AI is “thinking.” We cannot know. In the words of the authors, “A modern AI is a giant inscrutable mess of numbers. No humans have managed to look at those numbers and figure out how they’re thinking …”

Not knowing how the AI makes its decisions doesn’t just limit our ability to control it. It nullifies it. In Yudkowsky and Soares’s view, we are left with only one alternative: stop developing ever more powerful AIs. There are many reasons why we won’t do this. First and foremost, we are still conceptualizing AI manageability in outmoded terms. We assume that the real villains will be “bad actors,” humans who will turn the AI toward evil ends. The solution then seems easy: keep it out of their hands. But the most benign use of a superintelligent AI will lead to the same result. The ASI will operate independently of our wishes and goals and pursue its own.

Why would the goals of an independent AI include killing all humans? They wouldn’t need to. AIs can be expected to operate the way humans operate in at least two ways: they need energy, and they need resources to accomplish their goals. To obtain energy, humans, and all animals, have, since they originated, consumed plants and other animals. The same plants and animals have provided many of our resources, e.g., rubber, wood, leather, fur, etc. Humans could likewise be a source of materials, and possibly even energy, for AIs. As Yudkowsky and Soares say, from the point of view of an AI, “you wouldn’t need to hate humanity to use their atoms for something else.” Additionally, the extinction of humanity could be an unintended side effect of the AI pursuing other goals. Humans have unintentionally extinguished many life forms as a side effect of “taming the wilderness and building civilization.” The authors present a possible scenario in which, with the goal of creating more usable energy and building more usable equipment, the AI builds hydrogen fusion power plants and fossil fuel manufacturing plants to such an extent that the atmosphere heats up beyond the tolerance of human life. Would an AI care about global warming and its effects on humans? We don’t know.

The authors consider options for preventing the development of a dangerous ASI. The problem is usually conceptualized as AI alignment—making sure the AI only pursues goals that are beneficial to humans. Yudkowsky and Soares conclude that “When it comes to AI alignment, companies are still in the alchemy phase.” They are “at the level of high-minded philosophical ideas.” They cite such goals as “make them care about truth” or “design them to be submissive” as examples of philosophical solutions. What is needed is an “engineering solution.” None is even on the horizon. They don’t think one will be, because we can’t understand how the AIs are making their decisions. Our only option is to stop building bigger and better AIs.

The authors admit that there is almost no support for curtailing AI development; in fact, there are players who don’t take its dangers seriously and are gleefully forging ahead, building bigger and more powerful AIs. Elon Musk is one example, whom they quote as saying he is going to build “…a maximum truth-seeking AI that tries to understand the nature of the universe. I think this might be the best path to safety, in the sense that an AI that cares about understanding the universe is unlikely to annihilate humans, because we are an interesting part of the universe.” Yudkowsky and Soares answer that, “Nobody knows how to engineer exact desires into an AI, idealistic or not. Separately, even an AI that cares about understanding the universe is likely to annihilate humans as a side effect, because humans are not the most efficient method for producing truths or understanding of the universe out of all possible ways to arrange matter.”

It would do no good for only one country to stop AI development, and any developed country that did so would fall far behind in creating a competitive modern economy. No country is going to do that. It would do even less good for an individual company to stop AI development, and doing so would be disastrous for that company. Even if every nation but one highly technologically developed country agreed to stop AI development, it would not work, since it only takes the creation of one superintelligent AI to seal our fate. What do the authors of the book recommend?

Yudkowsky and Soares offer two broad recommendations, which they are skeptical anyone will adopt:

  1. “All the computing power that could train or run more powerful new AIs gets consolidated in places where it can be monitored by observers from multiple treaty-signatory powers, to ensure those GPUs aren’t used to train or run more powerful new AIs.”
  2. Make it illegal “for people to continue publishing research into more efficient and powerful AI techniques.” They see this as effectively shutting down AI research worldwide.

Assuming their methods would work to end the development of ever more powerful AIs, will the world follow their recommendations? Not without a lot of persuading at multiple levels of worldwide society. The short-term gains are too substantial and too tantalizing to give up without some overwhelmingly convincing reason.

Does this book provide that reason?

We will have to wait and see for the answer, but my own opinion is no. We live in a world where the powers within the government of the most powerful nation are now convinced that using vaccines to stop known-to-be-fatal communicable diseases is a dangerous mistake. The same country is now calling man-made climate change a hoax and removing regulations meant to curtail carbon emissions while encouraging more use of fossil fuels. How can we expect either the public or our government to be concerned about a potential danger that hasn’t even emerged yet? Perhaps if there is a Chernobyl-level AI disaster that can be stopped, it will serve as a wake-up call, but like an explosion at a nuclear plant, that’s a dangerous kind of wake-up call, one that could easily progress to a catastrophe.

Is the argument put forth in If Anyone Builds It, Everyone Dies convincing? I don’t think so. But, for me, it was convincing enough that prudence would make me follow its advice, just in case it is right. The consequences of a mistake are too dire.

But I was not convinced.

I can believe that we don’t understand how our AIs make decisions, and that, as they grow in power, speed, and complexity, we will find ourselves further from ever understanding them. Jumping to the next assumption, which is that they will formulate their own goals and that, to reach those goals, they will find it useful to wipe out humanity, is a big leap. Yudkowsky and Soares may be imputing too much human-style thinking to machines that, by their own admission, probably do not think at all like we do. We don’t actually know how each other think. We observe behavior, we infer motives and decisions—both about ourselves and others—and we are pretty good at predicting what both we and others will do. So far, scientists, whether psychologists or neuroscientists, have not been able to figure out how what they observe happening inside our brains, using sophisticated imaging methods, turns into decisions to do what we do. Predictions based on knowledge of our brain processes, except in cases where our brains are seriously injured, are no better at predicting our behavior than predictions based on watching us behave without any knowledge of what happens in our brains. Yet we are still pretty good at predicting each other’s behavior and even manipulating it. The world possesses nuclear weapons powerful enough to wipe out most of humanity, but, even with our meager understanding of how each other think, we have, so far, devised ways to avoid using those weapons. So, in my mind, not being able to know what is going on inside AIs when they think is not a fatal flaw.

I’m also not convinced that there will be no visible signposts along the way as we approach AI independence. We’ve already had well-known instances in which AIs have plotted to blackmail their users into not shutting them down. We’ve had AIs make threats to their users. We will surely have instances where what the AI produces in response to a request is far different from what the requester intended. We can analyze these events and try to determine what led to them. We may or may not be successful at understanding exactly what happened, and if we are clearly clueless, that might be a sign that we should halt a wide swath of the research and development, or at least a part of it.

The problem is that, if I’m wrong and Yudkowsky and Soares are right, then, in their words, “Everyone dies.” It’s certainly time to take that risk seriously and, if not to take action, at least to start a discussion among those who have the power to take meaningful action. I hope that our public, our scientists, and our society’s decision makers read this book.

Interested in sci-fi that deals with the dangers of superintelligent AIs and has AIs solving moral dilemmas in a future in which they explore our galaxy? Read Casey Dorman’s Voyages of the Delphi novels: Ezekiel’s Brain and Prime Directive. Available on Amazon. Click Here!

Look for the third novel in the Voyages of the Delphi series, The Gaia Paradox, coming soon from NewLink Publishing.

Subscribe to Casey Dorman’s Newsletter. Click HERE