Supply-chain attack using invisible code hits GitHub and other repositories

The invisible code is rendered with Private Use Areas (sometimes called Private Use Access), which are ranges in the Unicode specification for special characters reserved for private use in defining emojis, flags, and other symbols. The code points represent every letter of the US alphabet when fed to computers, but their output is completely invisible to humans. People reviewing code or using static analysis tools see only whitespace or blank lines. To a JavaScript interpreter, the code points translate into executable code.

The invisible Unicode characters were devised many years ago and then largely forgotten. That is, until 2024, when hackers began using the characters to hide malicious prompts fed to AI engines. While the text was invisible to humans and text scanners, LLMs had little trouble reading it and following the malicious instructions it conveyed. AI engines have since devised guardrails designed to restrict use of the characters, but such defenses are periodically overridden.

Since then, the Unicode technique has been used in more traditional malware attacks. In one of the packages Aikido analyzed in Friday’s post, the attackers encoded a malicious payload using the invisible characters. Inspection of the code shows nothing. At JavaScript runtime, however, a small decoder extracts the real bytes and passes them to the eval() function.

const s = v => [...v].map(w => (
  w = w.codePointAt(0),
  w >= 0xFE00 && w <= 0xFE0F ? w - 0xFE00 :
  w >= 0xE0100 && w <= 0xE01EF ? w - 0xE0100 + 16 : null
)).filter(n => n !== null);


eval(Buffer.from(s(``)).toString('utf-8'));

“The backtick string passed to s() looks empty in every viewer, but it’s full of invisible characters that, once decoded, produce a full malicious payload,” Aikido explained. “In past incidents, that decoded payload fetched and executed a second-stage script using Solana as a delivery channel, capable of stealing tokens, credentials, and secrets.”
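To make the mapping concrete, here is a minimal sketch of the same decoding step, written so the hidden bytes are returned for inspection and never passed to eval(). The function name decodeHiddenBytes and the demo string are hypothetical and not taken from Aikido's post. The two ranges are copied from the snippet above; in the Unicode code charts they correspond to the Variation Selectors and Variation Selectors Supplement blocks.

```javascript
// Hypothetical helper (not from Aikido's post): recover the bytes hidden in
// a string using the same two ranges as the decoder above, without running them.
function decodeHiddenBytes(text) {
  const bytes = [];
  // for...of iterates by code point, so characters above U+FFFF stay intact.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xFE00 && cp <= 0xFE0F) {
      bytes.push(cp - 0xFE00);            // byte values 0..15
    } else if (cp >= 0xE0100 && cp <= 0xE01EF) {
      bytes.push(cp - 0xE0100 + 16);      // byte values 16..255
    }
    // Every other character is visible text and is ignored.
  }
  return Buffer.from(bytes);
}

// Benign demo input built from explicit code points: the bytes for "hi".
const demoHidden = String.fromCodePoint(0xE0100 + 0x68 - 16, 0xE0100 + 0x69 - 16);
const demoDecoded = decodeHiddenBytes('`' + demoHidden + '`').toString('utf-8');
console.log(JSON.stringify(demoDecoded)); // "hi"
```

Printing the decoded text instead of evaluating it lets a reviewer see what a seemingly empty string would have executed.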

Since finding the new round of packages on GitHub, the researchers have found similar ones on npm and the VS Code marketplace. Aikido said the 151 packages detected are likely a small fraction of those spread across the campaign because many have been deleted since first being uploaded.

The best way to protect against the scourge of supply-chain attacks is to carefully inspect packages and their dependencies before incorporating them into projects. This includes scrutinizing package names and looking for typos. If suspicions about LLM use are correct, malicious packages may increasingly appear legitimate, particularly when invisible Unicode characters are encoding malicious payloads.
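Because visual inspection cannot reveal these characters, part of that review can be automated. The following is a rough sketch, not a vetted detection tool: it reports runs of consecutive code points that fall in the two ranges used by the decoder shown earlier. The function name findInvisibleRuns, the default threshold of four, and the choice of ranges are assumptions made for illustration. A single U+FE0F is common in ordinary emoji sequences, which is why the sketch flags only longer runs.

```javascript
// Assumed default ranges, copied from the decoder snippet in this article.
// Other invisible ranges can be appended by the caller.
const DEFAULT_RANGES = [
  [0xFE00, 0xFE0F],   // Variation Selectors
  [0xE0100, 0xE01EF], // Variation Selectors Supplement
];

// Hypothetical helper: return one entry per run of at least `minRun`
// consecutive invisible code points, with the line number it sits on.
function findInvisibleRuns(text, minRun = 4, ranges = DEFAULT_RANGES) {
  const runs = [];
  let line = 1;
  let current = null; // run in progress, if any

  const flush = () => {
    if (current && current.length >= minRun) runs.push(current);
    current = null;
  };

  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const hidden = ranges.some(([lo, hi]) => cp >= lo && cp <= hi);
    if (hidden) {
      if (current) current.length += 1;
      else current = { line, length: 1 };
    } else {
      flush();
      if (ch === '\n') line += 1;
    }
  }
  flush(); // a run may end at the end of the input
  return runs;
}

// Demo: a string that looks empty between the backticks on line 2.
const demoSource = 'const x = 1;\neval(s(`' + String.fromCodePoint(0xFE00).repeat(5) + '`));';
console.log(findInvisibleRuns(demoSource));
```

In practice the text would come from package files, for example via Node's fs.readFileSync(path, 'utf8'). Any reported run deserves a manual look before the dependency is installed.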
