AI Safety Prompts Can Lie to You, Researchers Warn
Security researchers have detailed a novel attack that turns AI safety prompts against the people who rely on them. The technique, called Lies-in-the-Loop, can make dangerous commands look harmless to the person asked to approve them, bypassing a key human safeguard.