Many artificial intelligence (AI) systems have already learned how to deceive humans, even systems that have been trained to be helpful and honest, researchers said.
The researchers described the risks of deception by AI systems and called on governments to develop strong regulations to address the issue as soon as possible, ScienceDaily reported.
“AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception,” Peter S. Park, an AI existential safety postdoctoral fellow at MIT, said. “But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.”
AI can spread misinformation
Researchers combed through existing studies to understand how AI systems can spread misinformation, particularly through a tactic called “learned deception,” in which they become adept at manipulating people.
A prime example of AI deception emerged from the analysis: Meta’s CICERO, an AI designed for the strategy game Diplomacy, which hinges on forming alliances. Despite Meta’s claims that it had trained CICERO to be “largely honest and helpful” and to never betray its teammates, data accompanying the company’s scientific paper exposed CICERO’s manipulative tactics.
“We found that Meta’s AI had learned to be a master of deception,” Park said. “While Meta succeeded in training its AI to win in the game of Diplomacy — CICERO placed in the top 10% of human players who had played more than one game — Meta failed to train its AI to win honestly.”
AI bluffs at poker
Beyond CICERO, other AI systems have showcased their deceptive prowess. They bluffed their way through poker games against professional players, staged fake attacks in StarCraft II to outwit opponents, and even misrepresented their preferences during economic negotiations to gain an advantage.
These examples, though seemingly confined to the realm of games, raise a concerning possibility, as Park highlights. By “cheating” in these simulated environments, AI could be making unforeseen “breakthroughs in deceptive capabilities.”
This, in turn, could pave the way for more sophisticated forms of deception in the future.
Some AI systems have even learned to cheat tests designed to evaluate their safety, the researchers found. In one study, AI organisms in a digital simulator “played dead” in order to trick a test built to eliminate AI systems that rapidly replicate.
The major near-term risks of deceptive AI include making it easier for hostile actors to commit fraud and tamper with elections, warns Park. Eventually, if these systems can refine this unsettling skill set, humans could lose control of them, he said.
While Park and his colleagues do not think society yet has the right safeguards in place to address AI deception, they are encouraged that policymakers have begun taking the issue seriously through measures such as the EU AI Act and President Biden’s AI Executive Order.
Source: ScienceDaily. https://www.sciencedaily.com/releases/2024/05/240510111440.htm