The Alignment Problem:
Machine Learning and Human Values
By Brian Christian
I must admit that I was taken by surprise by the contents of Brian Christian’s recent book, The Alignment Problem. The book came out in 2020 and made quite a splash in the artificial intelligence (AI) and machine learning community. Much of the public, including myself, had been made aware of “the alignment problem” by Nick Bostrom’s book, Superintelligence, or by the writings of people such as MIT physicist Max Tegmark. In fact, in my case, it was the conundrum of the alignment problem that spurred me to write my science fiction novel, Ezekiel’s Brain. Simply put, the alignment problem in the AI world is the question of how you create a superintelligent AI that is “friendly,” i.e., helpful rather than dangerous, to humanity. It’s such a difficult question that, in my novel, Ezekiel’s Brain, the creators of the superintelligent AI fail, and the result is disastrous for the human race. What I was expecting from Brian Christian’s book was another description of nightmare scenarios of the kind I wrote about in my novel and that experts such as Bostrom and Tegmark describe in their writings. That wasn’t what The Alignment Problem was about… or at least not what it was mostly about.
Christian gives some detailed accounts of the disastrous results of applying the most sophisticated AI learning algorithms to actual human situations. Some of these are well known, such as attempts to censor social media content, to produce an algorithm that aided judges in criminal sentencing, or to develop screening tools for employment selection. Training AIs on data about human decisions simply amplified the gender, racial, and ideological biases we humans bring to our own decisions. These were instances of AIs performing in ways that were more harmful than helpful to humans, and they were results of which I had previously been only vaguely aware. Although they were not the kind of misalignment that concerned me and had prompted me to buy the book, they expanded my concept of alignment considerably.
Instead of providing nightmare scenarios about the dangers of superintelligent AIs that are not aligned with what is best for humanity, the bulk of Christian’s book provides an exquisite history, up to the present, of the AI community’s efforts to define how machines can learn, what they are learning and what they ought to be learning, and how to identify whether the progress being made is bringing AIs into closer alignment with what humans want from them. What was most surprising and gratifying to me, as a psychologist, was how much this effort is entwined with progress in understanding how people learn and what affects that learning process.
Christian writes his book like a good mystery, but instead of following a narrow plot, it ranges across an extraordinary breadth of inquiry. Even as a psychologist, I learned about findings in psychology, learning, and child development of which I was unaware. It is fascinating to hear how the computer scientists who develop AI use psychological findings to open up new avenues in machine learning. The collaborations are thrilling, and both psychologists and AI researchers who are not aware of how much is happening on this front should read Christian’s book to get an idea of how exciting and important this area of research is becoming.
Although I have some background in psychology, AI, and the alignment problem, this book is written for the non-expert, and the interested layperson can easily understand it and bring their knowledge of the subject up to date. I found it one of the most captivating and informative books I have read in the last several years, and I recommend it to anyone in whom this topic sparks an interest.
The Alignment Problem may be purchased through Amazon by clicking HERE.
Ezekiel’s Brain may be purchased through Amazon by clicking HERE.