Hidden Unicode Characters Can Trick AI Into Following Secret Commands

What Happened Researchers from Moltwire conducted extensive testing on how invisible Unicode characters can be weaponized against AI systems. They embedded hidden characters inside normal-looking trivia questions, encoding different answers than what appeared visible to human readers. The study tested five major AI models: GPT-5.2, GPT-4o-mini, Claude Opus 4, Sonnet 4, and Haiku 4.5 across 8,308 graded outputs. The researchers describe their method as a “reverse CAPTCHA” - while traditional CAPTCHAs test what humans can do but machines cannot, this exploit uses a channel machines can read but humans cannot see.

Read more →

Anthropic Exposes Massive AI Theft: Chinese Firms Used 24K Fake Accounts

What Happened Anthropic discovered that DeepSeek, MiniMax, and Moonshot AI had created thousands of fake accounts to systematically extract knowledge from its Claude AI model. The scheme involved more than 16 million exchanges with Claude across 24,000 fraudulent accounts, representing one of the largest known cases of AI model theft. The technique, called “distillation,” involves using responses from an advanced AI model to train a smaller, more efficient version. While distillation is a legitimate research method when done with permission, Anthropic says these companies violated its terms of service by conducting the practice without authorization and at massive scale.

Read more →