The Morality of Robots

From the perspective of a science fiction writer

Can a robot be moral? We may think that morality requires something especially human (emotions, a religious or spiritual point of view, a human heart), but almost every day we are learning that behaviors and thoughts we assumed were uniquely human can be duplicated in artificial intelligence. Two of the most prominent theories of moral behavior, Moral Foundations Theory (Haidt and Joseph, 2007) and Morality as Cooperation theory (Curry, 2016), base our moral values on evolutionary inheritance and social cooperation, respectively. While these theories point to genetics and culture as contributing to the formation of human values, they both suggest that morality is also malleable and teachable.

Isaac Asimov tried to address what he thought was the most important issue with regard to robot behavior: how to make robots safe for humans. He came up with the Three Laws of Robotics, which were designed to keep robots from harming human beings. From the standpoint of survival, making sure our creations don’t destroy us is certainly the most important issue, but I prefer to think of human survival as a baseline for AI moral conduct. At the least, we don’t want AIs to wipe us out, but not doing so is hardly the epitome of moral behavior; in fact, it’s a low rung on the ladder. If we want our AIs to behave at higher levels of morality, we need to define what moral behavior is.

Moral behavior can be thought of from two perspectives: we can say that some actions are themselves moral or immoral, or we can determine what is moral by the consequences of those actions. The act-based approach corresponds to the first perspective, in which morality is a quality of a specific behavior. The Ten Commandments, e.g., “Thou shalt not kill; thou shalt not steal,” are classic examples that fit such a definition. In my Voyages of the Delphi series, which includes both Ezekiel’s Brain and Prime Directive, the members of the AI race interpret the rules of their value system in such terms. They are not allowed to intentionally kill others, even other sentient robots. They also cannot lie. Alas, to survive, they end up violating both of these moral prohibitions.

Basing morality upon the consequences expected from one’s behavior allows more flexibility than restricting it to doing or not doing certain actions, but it can also excuse immoral behavior as a means to achieve a moral end. In other words, it allows rationalization.

As a science fiction writer, I can design AIs and robots and their moral codes any way I want. Neither an act-based moral code nor an outcome-based moral code alone seems to work. Promoting or prohibiting certain actions (e.g., “always tell the truth”) is too rigid. Even the Golden Rule might fail if one were faced with a foe who was determined to cause harm to others and had to be stopped. Similarly, Kant’s categorical imperative, to “act only in accordance with that maxim through which you can at the same time will that it become a universal law,” would be a disaster if acted upon by a bigot such as Hitler, who was determined to exterminate an entire race. On the outcome-based side, a utilitarian morality in which actions are aimed at bringing the greatest happiness or the least pain to the greatest number allows rationalization of horrific behaviors to achieve desirable ends and denigrates the happiness of the minority compared to that of the majority.

In my novels, the AIs follow act-based rules, such as always tell the truth, never kill a living being, and don’t interfere in the affairs of an alien culture (the Prime Directive), but their rational minds can overrule such rules when the consequences would be negative in the sense of violating their overall moral values. How do we decide what moral goals, or values, to put into the AI robots we develop?

Probably the moral value that it would be easiest to obtain agreement on is to not take a human being’s life. It seems wisest to make the prohibition against killing a human being a rigid one. It’s not OK to do it under any circumstances. Such a prohibition would outlaw soldier or law enforcement robots unless they were restricted to activities that didn’t use lethal force. The best way to phrase such a prohibition might be “do not engage in a behavior that is likely to lead to the death of a human being.”

We can extend “do not engage in a behavior that is likely to lead to the death of a human being” by replacing “death” with “physical harm,” as Asimov did. He also added, “or by inaction, allow harm to happen,” which we can also add, probably without increasing the risk of unintended consequences. Limiting harm to physical harm keeps it out of the subjective realm of emotional or mental harm, which is hard to define and has led to all sorts of complications, the most obvious being interference with free speech.

An action-based prohibition against physically harming humans would mean that outcome-based considerations, such as are encountered in the Trolley Problem, where one person is sacrificed to save five, would not affect the AI’s decisions. The AI would not kill, no matter how many lives could be saved by its doing so. Even if faced with a modern-day Hitler or Stalin, both of whom killed millions of people, the AI would be unable to harm such a person. This may seem immoral or at least counterproductive, but once an exception is made to allow killing in order to achieve a moral end, the floodgates are open. A nuclear bomb can be used against civilians to bring a quicker end to a life-consuming war. A terrorist’s family, including children, can be killed as punishment for an act of terror in order to convince other terrorists not to follow suit. Miscalculations can bring unnecessary deaths. AIs may be smart, even smarter than humans, but they are not infallible. They can be fed false information by scheming humans and fooled into doing harmful things, just as humans can. It’s safer to make the prohibition against causing physical harm strict and impossible to override.

A mandate to neither cause nor fail to prevent physical harm to humans would keep AIs or robots from becoming dangerous in the way most people have envisioned such danger, but, as I said earlier, it is a minimal baseline in terms of making their behavior moral. There are many things other than causing physical harm that AIs, in particular, might do that we wouldn’t want them to do. They can control the online news, influence social media, violate our privacy, or alter our financial assets. The list is almost endless, which precludes a prohibition against each instance, since a clever AI can always find new ways to accomplish nefarious goals. The answer lies in making sure our AIs and robots are not nefarious or at least are inhibited from pursuing nefarious goals. That requires some kind of umbrella moral value system that protects us by ensuring they will behave honorably. I can think of two.

What might come to mind for most people is some version of the Golden Rule, modified for machines, e.g., “Do unto others as you would have them do unto you, if you were human.” This rule relies upon the AI knowing what humans like and dislike, which is not too difficult to learn. It might mean that AI or robot behavior ends up rooted in culture, since what humans like, even what makes them angry, sad, or disgusted, has a cultural flavor. Prejudicial behavior toward a person who is gay, lesbian, or transgender is viewed positively in some cultures and negatively in others. In this case, an AI would behave honorably by the standards of each culture, but its behavior might be very different depending on which culture it was in. We’ve seen what is probably just the tip of the iceberg in this regard, with AIs becoming internet trolls and formulating criminal sentencing guidelines that mimicked the racist sentencing behavior of historical courts. They were acting as a human would have wanted them to act. So, a Golden Rule for AIs is a possibility but has its shortcomings.

In an obscure paper written in 1969 by one of America’s most prominent psychologists, O. H. Mowrer, the author asserted that the “neurotic guilt,” which psychoanalysis saw as the source of most mental illness, was really a fear that one’s antisocial thoughts or behavior would be found out. It is only a small step from Mowrer’s premise to the hypothesis that what keeps most people from behaving antisocially is that they are afraid of being “found out,” i.e., of their behavior being made public. Think of Raskolnikov in Crime and Punishment. In fact, our criminal justice system relies upon keeping people from behaving antisocially by revealing their behavior and publicly punishing them. Much of our society does the same, using word of mouth and the media instead of the courts.

If what prevents most people from behaving antisocially is the fear that their behavior will be made public, then the way to keep them from behaving antisocially is the threat of their behavior being made public. In authoritarian societies, the fear of being “informed on” stops people from plotting against the state, e.g., “Big Brother is watching you.” This suggests that a requirement that all AI and robot behavior must be unconcealed, that is, publicized and admitted to by the AI or robot, is a way to ensure that they behave according to accepted norms of society. It’s not that they would feel guilty, of course, but that their actions and their potential consequences would be transparent to the public, as would the motives of those using them for nefarious ends. Unlike humans in our society, AIs and robots would have no right to privacy. They could not be used for secret schemes. They would be the “blabbermouths,” the “snitches” who inform on the schemer. Whatever they did would be open to public comment and censure, even punishment for those who employed them to do it. The moral rule they would be following might be called “being open and honest about everything they do.” Although the reaction to such disclosure of an AI’s behavior would depend on what was approved or disapproved in the culture in which it operated, similar to the situation with the Golden Rule, at least the wider world would be aware of its actions.

This is a start. This small set of moral values (do not engage in a behavior that could lead to physical harm to a human being, and make everything you do public) could suffice to produce “friendly” behavior in an AI or robot, in the sense that it would not produce a negative outcome for human beings. It’s possible that there are better solutions, such as strict regulations and punishments for the people who create AIs and robots if their creations prove to be harmful, but time will tell us what works or doesn’t work. Hopefully, we will figure everything out before an inadvertent catastrophe becomes a reality. In the meantime, it’s up to our science fiction writers to flesh out the scenarios in entertaining thought experiments, so we can anticipate what might happen in the future and better decide what course to follow.


Curry, O. S. (2016). Morality as cooperation: a problem-centred approach. In The Evolution of Morality. T. K. Shackelford and R. D. Hansen, eds. Pp. 27–51. New York: Springer.

Haidt, J., and Joseph, C. (2007). The moral mind: how five sets of innate intuitions guide the development of many culture-specific virtues, and perhaps even modules. Innate Mind 3, 367–391. doi: 10.1093/acprof:oso/9780195332834.003.0019

Mowrer, O. H. (1969). Conflict, contract, conscience, and confession. Transactions (Department of Psychiatry, Marquette School of Medicine) 1, 7–19.

Interested in sci-fi about AIs solving moral dilemmas in a future that has them exploring our galaxy? Read Casey Dorman’s Voyages of the Delphi novels: Ezekiel’s Brain and Prime Directive. Available on Amazon.