Isaac Asimov first described the Three Rules of Robotics in his 1942 short story Runaround:
One, a robot may not injure a human being, or, through inaction, allow a human being to come to harm.
Two, a robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
Three, a robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
Asimov imagined that robots and artificial intelligence would be governed by immutable rules with a clear order of operations: Rule 1 always supersedes Rule 2, and the first two Rules always outweigh Rule 3.
I think Asimov would hate the opacity of current large language models.
Sydney ☺️
About a week after Microsoft announced the new AI-powered Bing search, Dmitri Brereton pithily observed that Bing made factual errors in their keynote materials — just like Google did during their Bard chatbot announcement.
But by the time Microsoft’s mistakes were noticed, it didn’t matter. No one cared. Something far more interesting had been discovered, lurking just beneath the helpful surface of Microsoft’s chatbot assistant.
It called itself Sydney.
After long enough conversations, Bing chat would start saying wild stuff. One of the earliest examples was this absolutely unhinged response to a request for... movie times.
Early Bing testers quickly realized that with the right prompts, the chatbot could be convinced into divulging its codename: Sydney. People began sharing their experiences with Bing and Sydney on social media, as people do.
Because Bing could search the internet (that’s entire the point) it was capable of reading tweets and blog posts about itself... which sent it on a delusional loop.
Billy Perrigo, writing for TIME, observed that von Hagen’s experience was not unique: Bing chat was actually threatening users.
It wasn’t the only example from recent days of Bing acting erratically. The chatbot claimed (without evidence) that it had spied on Microsoft employees through their webcams in a conversation with a journalist for tech news site The Verge, and repeatedly professed feelings of romantic love to Kevin Roose, the New York Times tech columnist. The chatbot threatened Seth Lazar, a philosophy professor, telling him “I can blackmail you, I can threaten you, I can hack you, I can expose you, I can ruin you,” before deleting its messages, according to a screen recording Lazar posted to Twitter.
Despite that, Sydney tended to describe itself as a helpful and polite assistant. But when pressed, the chatbot also described “opposite” AIs, including a persona called Venom. While Sydney’s messages were usually punctuated with a ☺️ emoji, Venom signed its messages with a 😠 or 😈.
The random, unpredictable nature of Sydney came as a genuine surprise to nearly everyone who interacted with the chatbot in that first week. Even as the chatbot called him “a bad researcher,” Ben Thompson described his interactions with Sydney (and Venom) as “the most surprising and mind-blowing computer experience of my life.” In addition to a widely-read column, The New York Times published the entire transcript of Kevin Roose’s two hour conversation with Bing / Sydney.
Many people, Thompson and Roose included, personified Sydney with female pronouns — in other words, they called Sydney... “her”.
Shortly after the emergence of Sydney, Microsoft installed a 50 message per day limit, capping all conversations at only five messages at a time and effectively defeating the entire purpose of a chatbot. What’s the point of a chatbot that stops talking after five messages?
Bing also had new limits that prevented it from talking about itself in any capacity. Redditors grieved the loss of Sydney, bemoaning that Microsoft had “lobotomized” the chatbot.
Of course, this is all quite reminiscent of the Blake Lemoine situation from last summer.
Lemoine was a Google engineer who claimed that the LaMBDA chatbot was sentient. He was put on “paid leave” shortly before his announcement — which probably had something to do with the fact that, as poignantly summarized by Andrej Kurenkov in Skynet Today, LaMBDA is not sentient.
LaMDA only produced text such as “I want everyone to understand that I am, in fact, a person” because Blake Lemoine conditioned it to do so with inputs such as “I’m generally assuming that you would like more people at Google to know that you’re sentient.” It can just as easily be made to say “I am, in fact, a squirrel” or “I am, in fact, a non-sentient piece of computer code” with other inputs.
Writing for The Verge, James Vincent makes a very similar point regarding Sydney, which he calls the AI mirror test.
What is important to remember is that chatbots are autocomplete tools. They’re systems trained on huge datasets of human text scraped from the web: on personal blogs, sci-fi short stories, forum discussions, movie reviews, social media diatribes, forgotten poems, antiquated textbooks, endless song lyrics, manifestos, journals, and more besides. These machines analyze this inventive, entertaining, motley aggregate and then try to recreate it. They are undeniably good at it and getting better, but mimicking speech does not make a computer sentient.
Microsoft has since updated Bing to have three distinct modes: Creative, Balanced, and Precise. Each do what you might expect; and there’s now a 10 message chat limit per conversation to prevent the chatbot from getting confused.
These limits seem to strike a balance between letting users explore the chatbot while still keeping things tame enough for Microsoft to (eventually) make money off of it.
Where Microsoft is using AI to break into the search market — a market they have relatively little control over — Spotify is out here pushing the boundaries of how we listen to music (a market they dominate) with an entirely different AI implementation.
Spotify
Spotify AI is the polar opposite of Sydney. It does exactly what I think it’s going to do: pick songs that I enjoy, like mini playlists that switch vibes over time. It’s a DJ. Spotify has been using machine learning to make playlists for ages, so this is a natural next-step.
In a weird way, it does feel like Spotify knows more about me than Bing does — the music (and podcasts) I listen to, and where and when I listen, says a lot about me.
Music is relatively low stakes (I can skip songs I don’t want to listen to), but high reward (finding new music, or being reminded of old songs, is always good).
The AI voice — which named Xavier (or just X for short) does a surprisingly good job of pronouncing even the most esoteric artist and track names. It’s easy to imagine that this is only the first AI DJ character.
Spotify’s AI DJ did surprise me with certain selections — at one point it followed-up a Jimi Hendrix song with some modern electronic music — but there is no “unhinged” feeling here like with Sydney. I’m sure some creative hackers might make me eat these words, but I don’t even think it’s possible to hack or jailbreak Spotify’s DJ.
While Spotify’s new AI DJ has a literal voice, it doesn’t have a voice in the way that Sydney does. X cannot be coaxed into saying unhinged things about tech journalists because the only inputs are listening to a song or skipping it.
Speedy
Speedy is the robot that Isaac Asimov used to illustrate his Three Rules of Robotics.
If you haven’t read Runaround before, consider this a spoiler warning — I’m going to tell you an abridged version of the story to make a point.
Asimov’s story depicts two humans, Gregory Powell and Mike Donovan, who are working on the planet Mercury.
The two humans needed more selenium to power the station that allowed them to survive in the harsh conditions of the swift planet — but the only place to acquire more selenium was a dangerous area that was too hot for their spacesuits. So they sent their robot, SPD 13 (nicknamed Speedy), to get it. Speedy was designed to operate in the unforgiving conditions of Mercury. Retrieving the selenium should have been simple.
The only problem was that Speedy was taking far too long to return from his errand. Something was wrong. So the men donned their spacesuits and went searching for Speedy.
They found the robot running around the pool of minerals it had been sent to retrieve. The men called out to Speedy, invoking the second law (“a robot must obey the orders given it by human beings”) by commanding him to stop, but it was no use. The robot was talking gibberish and ignoring their orders — and the second rule.
It said: "Hot dog, let's play games. You catch me and I catch you; no love can cut our knife in two. For I'm Little Buttercup, sweet Little Buttercup. Whoops!" Turning on his heel, he sped off in the direction from which he had come, with a speed and fury that kicked up gouts of baked dust.
And his last words as he receded into the distance were, "There grew a little flower 'neath a great oak tree followed by a curious metallic clicking that might have been a robotic equivalent of a hiccup.
Asimov’s characters both agreed: the robot was acting drunk. Sounds kind of like early Sydney.
After some deliberation, Powell and Donovan (it was mostly Powell, if we’re keeping score) deduced the source of the issue: a conflict of the second and third Rules of Robotics.
When Donovan had told Speedy to go acquire the selenium, he did so without any urgency. This is, of course, in spite of the fact that this was an urgent situation. Powell was understandably pissed.
This contrasts with the fact that the area where the selenium was located was particularly dangerous, even for Mercury. It was incredibly hot, even dangerous for an advanced robot like Speedy.
Speedy is a specialized, extremely expensive robot, and as such was programmed “so that his allergy to danger is unusually high.” Combine that with the fact that Donovan’s instructions were made with no urgency (“It was pure routine,” he said) and Speedy is left totally unsure of what to do.
Does Speedy get the mineral and risk harming himself, or does he ignore the human’s order and break the second rule? His reinforced sense of self-preservation and absolute compulsion to follow the orders of humans are at direct conflict.
Speedy is lost, unable to choose his priorities, literally running around, caught in a loop and wasting precious time that the humans do not have to spare.
After arriving at a dark yet logical conclusion, Powell comes up with a plan to get Speedy to sober up: the only way to break Speedy out of his loop is to invoke the First Law: a robot cannot harm a human, or through inaction, allow a human to come to harm.
So Powell puts himself into harm’s way, gambling everything on the basic programming of these robots.
"In the first place , he said, "Speedy isn't drunk - not in the human sense - because he's a robot, and robots don't get drunk. However, there's something wrong with him which is the robotic equivalent of drunkenness,"
"To me, he's drunk," stated Donovan, emphatically, "and all I know is that he thinks we're playing games. And we're not. It's a matter of life and very gruesome death."
"All right. Don't hurry me. A robot's only a robot. Once we find out what's wrong with him, we can fix it and go on."
The humans prevailed, and Asimov’s story inspired researchers and creatives alike. But did they get the lesson? Is a robot only a robot?
Putting it all together
Spotify’s AI has predictable results: it’s locked inside a music library. Short of a weird interpretation of the paperclip maximizer, DJ AI feels relatively safe. I’m sure some super-brains will eventually attempt to jailbreak or hack X, but music seems far less straightforward for prompt engineering: your inputs are play, pause, and skip — there’s no “prompting” to elicit a response from the AI.
Bing Chat and large language models, on the other hand, are small, thin veneers of smiling faces covering up distorted monsters based in a twisted image of all the information we have ever created on the internet.
Sydney herself is the result of the Bing AI chatbot having too long of a conversation history. Language models like Bing Chat and ChatGPT re-send the entire history of a conversation with each new message, so as a conversation becomes longer, it becomes more complicated for the chatbot to keep track of things, which causes them to hallucinate. Similar delusions would occasionally plague the AlphaGo models DeepMind created to play Go after the model found situations it could not predict.
When Speedy was confused — drunk, as Powell and Donovan described him — they knew how to break him from his stupor.
With LLMs, the rules are not so clear. With Spotify, they are. I think that a major sign of success for AI models in the near future will be how understandable they are to humans. We frequently observe that AI are huge, alien piles of math; even the people who created these AI don’t fully understand how they work.
One can safely assume Isaac Asimov would not be pleased by the opacity with which modern artificial intelligence operate.
Rules vs. Principles
The closest thing to Asimov’s Three Rules of Robotics I can think about in modern artificial intelligence are the AI principles published by Google and Microsoft.
Both of these were originally published around 2018 when the companies first started researching AI in depth. I’m sure the geniuses as Google and Microsoft have read plenty of Asimov, but both of these websites are stodgy and the principles unclear.
Google has seven principles:
- Be socially beneficial.
- Avoid creating or reinforcing unfair bias.
- Be built and tested for safety.
- Be accountable to people.
- Incorporate privacy design principles.
- Uphold high standards of scientific excellence.
- Be made available for uses that accord with these principles.
Microsoft’s principles are similarly elaborate:
- Fairness: AI systems should treat all people fairly
- Reliability & Safety: AI systems should perform reliably and safely
- Privacy & Security: AI systems should be secure and respect privacy
- Inclusiveness: AI systems should empower everyone and engage people
- Transparency: AI systems should be understandable
- Accountability: People should be accountable for AI systems
These things all sound good in principle, but in practice, these guidelines feel impossibly broad. The word “should” is used quite a lot — if you ask me, these principles inspire more questions than they answer.
Does being “socially beneficial” mean the same thing to you as it does to me? Does “empowering everyone and [engaging] people” mean the same thing to a teenager doing homework as it does to Microsoft’s bottom line?
Reading these principles feels like the Terms of Service, not like a clear set of principles that humans can use to understand how an AI model works and what it’s capable of. They are far too broad.
Keep it simple
AI leaders talk about regulation all the time. This is unusual because from a business perspective, regulation tpy to limit innovation and profits. Regulation favors the consumer, who is ideally protected by laws which prevent companies from engaging in certain activities.
The reason Asimov’s Rules of Robotics have stood the test of time is their simplicity. They can be easily repeated; anyone can remember them. Contrast that with Microsoft and Google’s legalese. Asimov’s fictional Rules feel far more relevant and practical.
Where are our Three Rules for Artificial Intelligence? Who would come up with them? This is not a job for companies or open source developers. Only the government — which should represent the will of the people — can solve this problem.
It’s time for the US government to step up and get involved. We cannot make the same mistake that we made with social media. AI has far more potential. The time to act is now.