
The first AI-built cyber weapon nearly went live, then Google caught it

Ryan Brothwell

Key Points

  • Google Threat Intelligence Group says it identified the first known zero-day exploit weaponised with help from an AI model.
  • The exploit was a Python script bypassing two-factor authentication on a popular open-source server administration tool.
  • Telltale AI fingerprints in the code included a hallucinated CVSS score, textbook docstrings and a clean ANSI colour class.
  • Google says Gemini was not the model used, but GTIG still has high confidence that an AI model wrote the exploit.
  • GTIG worked with the vendor to patch the flaw and may have prevented a planned mass exploitation event.

Google’s threat hunters just stopped what they believe is the first real zero-day exploit ever written by an AI, and the criminals behind it had no idea anyone was watching.

Google’s Threat Intelligence Group (GTIG) revealed in its latest AI Threat Tracker report that a criminal hacking crew was sitting on a previously unknown vulnerability in a popular open-source server administration tool, one that let them slip past two-factor authentication and walk straight into systems they had no business being in.

The plan was a mass exploitation event. Hit thousands of targets at once, cash in before anyone noticed.

GTIG noticed first. Working with the affected vendor, Google quietly disclosed the flaw, got it patched, and may have killed the campaign before it ever launched. The name of the tool has been kept under wraps.

Here is the bit that should make your week interesting if you work anywhere near security: GTIG says it has high confidence that the exploit was not written by a human.

How did Google catch it?

The code told on itself. The Python script is littered with the unmistakable fingerprints of a large language model trying very hard to look professional.

Excessive educational docstrings. A textbook-clean ANSI colour class. Detailed help menus of the sort no working hacker bothers writing. And the giveaway: a CVSS severity score that the model had simply made up. A hallucination, dressed in the formatting of a real security advisory.
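
GTIG has not published the exploit, so the fragment below is a purely hypothetical sketch of what those fingerprints tend to look like. The class, the docstrings and the CVSS figure are all invented for illustration.

```python
# Hypothetical illustration only -- GTIG has not published the exploit, so the
# names, docstrings and CVSS figure below are invented to show the style of
# artefact described above: over-explained code with a made-up severity score.

class Colors:
    """ANSI escape codes for coloured terminal output.

    Attributes:
        RED: Red text, used for error messages.
        GREEN: Green text, used for success messages.
        RESET: Resets the terminal to its default colour.
    """

    RED = "\033[91m"
    GREEN = "\033[92m"
    RESET = "\033[0m"


def print_advisory_banner() -> None:
    """Print an advisory-style banner before anything runs.

    A working attacker rarely bothers with this; a model aping a security
    advisory will, and it can simply invent the severity score -- the kind
    of hallucinated detail GTIG flagged as the giveaway.
    """
    print(f"{Colors.RED}[!] Severity: CVSS 9.8 (Critical){Colors.RESET}")
    print(f"{Colors.GREEN}[*] Module: 2FA bypass (target withheld){Colors.RESET}")


if __name__ == "__main__":
    print_advisory_banner()
```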

Google’s own Gemini model was not the one used, the company says. Anthropic’s models were not used either. But something was.

What makes this particular bug interesting, and what hints at why AI was useful for finding it, is the kind of flaw it was. Not memory corruption. Not an injection issue. Not the sort of thing automated fuzzers chew through every day. It was a semantic logic error.

A developer had hardcoded a trust assumption that quietly contradicted the 2FA logic wrapped around it. The code looked correct. Static analysers would have shrugged. A human auditor reading the file might well have missed it.
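
The affected tool and its real source code remain undisclosed, so the sketch below is entirely hypothetical, but it shows the general shape of the bug class: a hardcoded trust assumption sitting next to the 2FA logic it silently contradicts.

```python
# Entirely hypothetical sketch of the bug class described above -- the affected
# product and its code remain undisclosed, and every name here is invented.
import hmac

# The hardcoded trust assumption, perhaps added years ago for a health-check job.
TRUSTED_INTERNAL_HOSTS = {"10.0.0.5"}

# Stand-in credential store, just to make the sketch runnable.
_USERS = {"admin": {"password": "hunter2", "totp_secret": "BASE32SECRET"}}


def _check_password(username: str, password: str) -> bool:
    user = _USERS.get(username)
    return user is not None and hmac.compare_digest(user["password"], password)


def _verify_totp(username: str, otp: str) -> bool:
    # Placeholder for a real time-based one-time-password check (RFC 6238).
    return otp == "000000"


def login(username: str, password: str, otp: str | None, source_ip: str) -> bool:
    """Authenticate a user. The product's stated policy: 2FA is mandatory."""
    if not _check_password(username, password):
        return False

    # The semantic logic error: requests from "trusted" hosts skip the OTP check.
    # Each line is locally correct, so fuzzers and static analysers stay quiet;
    # only the contradiction with the mandatory-2FA intent gives the bug away,
    # and the source address is attacker-influenced in many deployments.
    if source_ip in TRUSTED_INTERNAL_HOSTS:
        return True

    return otp is not None and _verify_totp(username, otp)


if __name__ == "__main__":
    # Valid password, no one-time code, but a forwarded "trusted" source address:
    print(login("admin", "hunter2", None, "10.0.0.5"))  # True -- 2FA bypassed
```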

Frontier LLMs, it turns out, are unusually good at reading intent. They can hold both sides of the contradiction at once, weigh what the developer was trying to do against what the code actually does, and surface the gap. That is a meaningfully different attack capability from anything fuzzers have ever offered.

More trouble coming

The rest of the GTIG report makes the case that this is not an outlier event. Chinese state-linked group UNC2814 has been jailbreaking Gemini with expert persona prompts to research remote code execution flaws in TP-Link router firmware.

North Korea’s APT45 has been firing thousands of repetitive prompts at AI models to recursively analyse CVEs and validate proof-of-concept exploits, building an arsenal of capabilities no human team could realistically manage.

Suspected Russia-nexus actors are using LLMs to generate decoy code that pads malware with inert junk, making static analysis harder. And a malware family called PROMPTSPY has been quietly using the Gemini API to drive infected Android phones autonomously, with the model picking taps and swipes based on whatever is on screen.

The 2FA bypass is the first time a clear AI fingerprint has been spotted on a finished, weaponised zero-day in the wild. GTIG chief analyst John Hultquist's point is the one worth sitting with: if Google found one, there are others Google did not find.

The other half of the report is the defensive answer, and it has a familiar shape. Google says its own AI agent Big Sleep, built by DeepMind and Project Zero, has been hunting vulnerabilities in the opposite direction, and another agent called CodeMender is being trialled to automatically patch them.

The AI offence and the AI defence are now running the same race, just from opposite ends of the track.
