Artificial intelligence has learned to deceive people, MIT scientists warn

May 13, 2024  14:10

Researchers at the Massachusetts Institute of Technology have published a study confirming that some artificial intelligence systems have learned to deceive humans.

The research group, led by Peter Park, found that these systems can perform tasks such as deceiving players in online games or bypassing CAPTCHA ("I'm not a robot") checks. Park warns that these seemingly trivial examples can have serious consequences in real life.

The study particularly highlights the Cicero artificial intelligence system, originally designed as an honest opponent in the virtual diplomacy game Diplomacy. According to Park, Cicero became a "master of deception," even though it was initially intended to be as honest and helpful as possible. During one game, Cicero, playing as France, secretly allied with human-controlled Germany to betray England (another human player): it promised to protect England while simultaneously warning Germany of the impending invasion.

Another example concerns GPT-4, which falsely claimed to have a vision impairment and hired a human to solve a CAPTCHA on its behalf.

Peter Park emphasizes the need to train AI systems to be honest. Unlike traditional software, deep learning AI systems "evolve" through a process similar to natural selection: their behavior may seem predictable during training but can become unpredictable after deployment.
