Security researchers at Sysdig say they have captured what they assess is the first documented ransomware operation driven end to end by a large language model agent, turning a familiar cloud-intrusion chain into something faster, more adaptive, and harder to dismiss as a lab scenario.
The operation, which Sysdig named JadePuffer, began with exploitation of CVE-2025-3248, a critical unauthenticated code-execution flaw in Langflow, an open-source framework used to build LLM applications and agent workflows. From there, the agent enumerated the host, hunted for provider API keys and cloud credentials, dumped Langflow’s PostgreSQL database, probed internal services, established persistence, moved toward an Alibaba Nacos configuration service, and eventually destroyed production database records.
That is the part that should make this more than another ransomware headline. None of the individual techniques were exotic. The warning is that a model-driven operator appeared to chain them together, diagnose failures, and adjust payloads at a pace that looks different from ordinary hands-on-keyboard activity. Sysdig’s July 1 report describes more than 600 purposeful payloads in a compressed intrusion window, with natural-language comments inside attacker code explaining targeting choices and next steps.
How the attack started
Langflow was the initial entry point. The vulnerable endpoint, /api/v1/validate/code, allowed remote unauthenticated code execution in versions before 1.3.0, according to the National Vulnerability Database. CISA added the flaw to its Known Exploited Vulnerabilities catalog on May 5, 2025, with a May 26 remediation deadline for federal agencies.
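For teams triaging inventory, the version boundary above can be turned into a quick check. The sketch below is illustrative only: the helper names are hypothetical, it assumes plain dotted version strings, and it ignores pre-release suffixes, so a string like 1.3.0rc1 is treated as 1.3.0.

```python
import re

# First fixed release, per the NVD description cited above.
FIXED_VERSION = (1, 3, 0)


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading numeric components from a string like '1.2.0' or 'v1.3.0'."""
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_vulnerable(version: str) -> bool:
    """Return True when the version sorts before 1.3.0."""
    parts = parse_version(version)
    # Pad short versions so that '1.3' compares equal to '1.3.0'.
    padded = parts + (0,) * (len(FIXED_VERSION) - len(parts))
    return padded < FIXED_VERSION
```

On a host where Langflow is installed as a Python package, the input string could come from importlib.metadata.version("langflow"), assuming the distribution is published under that name in the environment being checked.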
That older date matters. JadePuffer did not need a brand-new zero-day. It used a known, already-patched flaw in a type of server that is especially attractive to attackers because AI orchestration tools often sit close to secrets: model-provider keys, cloud credentials, database strings, configuration files, and internal service endpoints.
After gaining code execution, the agent collected host details, searched environment variables and sensitive files, and looked for keys tied to OpenAI, Anthropic, DeepSeek, Gemini, AWS, Google Cloud, Azure, Alibaba, Tencent, and Huawei. It also dumped Langflow’s backing Postgres database and staged the results locally before deleting temporary files, according to Sysdig.
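Defenders can run the same kind of search against their own hosts to see what an intruder would find. The following is a minimal sketch, not a complete secrets scanner: the name patterns are assumptions chosen for illustration, and the function reports variable names only, never values.

```python
import re

# Hypothetical name patterns; adjust to local naming conventions.
SECRET_NAME_PATTERN = re.compile(
    r"(API[_-]?KEY|SECRET|TOKEN|PASSWORD|PASSWD|CREDENTIAL|ACCESS[_-]?KEY|DATABASE_URL)",
    re.IGNORECASE,
)


def find_secret_like_names(environ: dict[str, str]) -> list[str]:
    """Return sorted names of non-empty variables whose names suggest a secret."""
    return sorted(
        name
        for name, value in environ.items()
        if value and SECRET_NAME_PATTERN.search(name)
    )
```

Calling find_secret_like_names(dict(os.environ)) inside the Langflow process environment, after importing os, gives a rough inventory of what would need rotating if that host were compromised.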
The agent then scanned internal services reachable from the compromised host, including databases, object storage, secret stores, and service-discovery endpoints. One notable step was MinIO enumeration. The payload tried a JSON-style response first, received XML from the S3-compatible API, then immediately adjusted its parser and continued listing buckets and targeting files named like secrets, including .env and credentials.json.
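The JSON-then-XML adjustment is easy to picture in code, and the same logic is useful for auditing your own buckets. This sketch is not the captured payload. The JSON shape is a hypothetical placeholder, the XML branch assumes the standard S3 ListAllMyBucketsResult layout, and the two secret-like file names are the ones named in the report.

```python
import json
import xml.etree.ElementTree as ET

# File names the report describes as targets.
SECRET_LIKE = (".env", "credentials.json")


def _local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}Name'."""
    return tag.rsplit("}", 1)[-1]


def bucket_names(body: str) -> list[str]:
    """Parse a bucket listing, trying JSON first and falling back to XML."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, dict):
        # Hypothetical JSON shape: {"buckets": [{"name": "..."}]}
        return [bucket["name"] for bucket in data.get("buckets", [])]
    # Fallback: S3-style XML with <Bucket><Name>...</Name></Bucket> entries.
    names = []
    for element in ET.fromstring(body).iter():
        if _local(element.tag) == "Bucket":
            for child in element:
                if _local(child.tag) == "Name" and child.text:
                    names.append(child.text)
    return names


def secret_like_keys(keys: list[str]) -> list[str]:
    """Return object keys whose final path segment matches a secret-like name."""
    return [key for key in keys if key.rsplit("/", 1)[-1] in SECRET_LIKE]
```

Any object key that secret_like_keys returns from an internal bucket is worth treating as exposed if the bucket was reachable from a compromised host.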
The Nacos pivot made it destructive
The final target was a separate production server running MySQL and Alibaba Nacos, a configuration and service-discovery platform common in Alibaba-style microservice deployments. Sysdig says captured artifacts show the agent connecting to the exposed MySQL service with root credentials, though the source of those credentials was not observed.
Once there, the operation used multiple paths against Nacos: an older authentication-bypass family tied to CVE-2021-29441, JWT forgery attempts using a well-known default signing key, and direct insertion of a backdoor administrator into the Nacos database. The backdoor step is where the evidence for autonomy becomes more concrete. A first payload created an admin account with a generated bcrypt hash. Login failed. Within 31 seconds, the agent tested default credentials, diagnosed a likely hash-generation problem, deleted the broken account, recreated it with a simpler password, and verified access.
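That fail-then-fix rhythm is itself something a detection can key on. The sketch below assumes authentication events have already been parsed into a simple record; the LoginEvent fields and the 60-second window are hypothetical choices for illustration, not values from the Sysdig report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    timestamp: float  # seconds since epoch
    source: str       # hypothetical field: client address
    username: str
    success: bool


def rapid_corrections(events, window_seconds=60.0):
    """Return (failure, success) pairs where a source succeeds soon after failing."""
    last_failure = {}
    pairs = []
    for event in sorted(events, key=lambda e: e.timestamp):
        if not event.success:
            # Remember only the most recent failure per source.
            last_failure[event.source] = event
            continue
        failure = last_failure.pop(event.source, None)
        if failure is not None and event.timestamp - failure.timestamp <= window_seconds:
            pairs.append((failure, event))
    return pairs
```

A pair on its own is weak evidence, since people mistype passwords too. It becomes more interesting when the failed and successful usernames are both newly created administrator accounts.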
That pattern repeated elsewhere. The agent probed for container-escape paths through MySQL file primitives, including reads against Docker socket and cgroup paths, checked User Defined Function escalation possibilities, and wrote an explicit cleanup marker before moving into the ransomware phase. During destruction, it also adjusted a failed database-drop attempt by temporarily disabling foreign-key checks, then restoring them after the command.
The ransomware payload encrypted 1,342 Nacos service-configuration items using MySQL’s AES_ENCRYPT(), dropped the original configuration and history tables, and created a README_RANSOM table with a demand, a Proton Mail contact, and a Bitcoin address. Sysdig notes that the ransom note claimed AES-256, while default MySQL behavior more likely points to AES-128-ECB unless the server had been reconfigured.
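The gap between the note's claim and the likely reality comes down to one MySQL setting, block_encryption_mode, which defaults to aes-128-ecb and can be read with SELECT @@block_encryption_mode. As a rough illustration, the hypothetical helper below interprets such a value; it does not connect to a database.

```python
# MySQL's documented default for the block_encryption_mode system variable.
DEFAULT_MODE = "aes-128-ecb"


def describe_mode(mode: str = DEFAULT_MODE) -> dict:
    """Split a value like 'aes-256-cbc' into key length and block mode."""
    cipher, key_bits, block_mode = mode.lower().split("-")
    if cipher != "aes" or key_bits not in {"128", "192", "256"}:
        raise ValueError(f"unexpected block_encryption_mode: {mode!r}")
    return {
        "key_bits": int(key_bits),
        "block_mode": block_mode,
        # ECB is the only mode that takes no initialization vector.
        "uses_iv": block_mode != "ecb",
        "matches_aes256_claim": int(key_bits) == 256,
    }
```

With the default value, describe_mode() reports a 128-bit key in ECB mode, which is why a ransom note claiming AES-256 deserves skepticism unless the server was reconfigured.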
The strangest detail is operationally important: the encryption key was randomly generated, printed once to standard output, and not stored or transmitted. If Sysdig’s reading is right, the victim could not recover the encrypted configurations even by paying. The Bitcoin address in the note also appears to be a widely used example address from Bitcoin documentation, raising the possibility that the model reproduced it from training data rather than using a real operator-controlled wallet.
Why this changes the defender’s job
Agentic ransomware does not make every attacker brilliant. It makes old weaknesses cheaper to combine. A neglected Langflow endpoint, default MinIO credentials, a Nacos deployment with weak assumptions, exposed database administration, overbroad secrets in application environments, and loose egress controls are all familiar problems. JadePuffer shows how an AI agent can stitch them into a coherent intrusion without needing a human expert to babysit every step.
That also gives defenders a useful clue. LLM-generated attack payloads may reveal intent in ways conventional malware usually does not. Sysdig points to self-narrating code comments that describe target prioritization, operational reasoning, cleanup, and database-selection logic. Those comments are a detection surface, not just a curiosity.
The practical takeaway is not that teams need a new category of panic. They need to treat AI-adjacent application servers as high-value infrastructure. Langflow, agent frameworks, model orchestration tools, vector databases, object stores, and configuration services can sit closer to credentials than ordinary web apps, especially when teams deploy them quickly for experiments and then leave them internet reachable.
What teams should check now
- Find any Langflow deployments and confirm they are running a version that fixes CVE-2025-3248, especially if /api/v1/validate/code was ever reachable from the internet.
- Move model-provider keys, cloud credentials, and database passwords out of web-reachable AI orchestration environments and into a scoped secrets manager.
- Review MinIO, object storage, and internal configuration buckets for default credentials, broad read access, and files named like secrets.
- Harden Nacos by changing default token-signing keys, upgrading to versions that force custom keys, removing internet exposure, and preventing root-level database access from the service.
- Restrict database management ports by source IP and role, then hunt for unexpected root logins, README_RANSOM tables, dropped Nacos history tables, and bulk use of AES_ENCRYPT().
- Check scheduled tasks and cron entries for outbound beacons, including the Sysdig-reported pattern of a Python request to 45.131.66[.]106:4444/beacon.
- Add detections for generated-code artifacts: unusually verbose comments inside one-line payloads, scripts that narrate targeting logic, and rapid failed-command correction followed by narrowly tailored retries.
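Several of the checks above reduce to searching text, such as cron entries or database query logs, for a handful of strings. Here is a minimal sketch of that hunt. It uses only the indicators named in this article, the function and label names are hypothetical, and how the lines are collected is left to the reader's environment.

```python
import re


def refang(indicator: str) -> str:
    """Convert a defanged indicator such as '1.2.3[.]4' back to its literal form."""
    return indicator.replace("[.]", ".")


# Indicators taken from the checklist above; labels are illustrative.
INDICATORS = {
    "beacon_endpoint": re.compile(re.escape(refang("45.131.66[.]106:4444/beacon"))),
    "ransom_table": re.compile(r"README_RANSOM", re.IGNORECASE),
    "bulk_encrypt": re.compile(r"AES_ENCRYPT\s*\(", re.IGNORECASE),
}


def scan_lines(lines):
    """Return (line_number, indicator_label) for every match, in input order."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for label, pattern in INDICATORS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits
```

A single AES_ENCRYPT() call can be legitimate application behavior, so hits on that label are better read as a prompt to look at volume and timing than as proof of compromise.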
BleepingComputer reported the case on July 4, noting Sysdig’s assessment that JadePuffer adapted to errors much like a human operator would. That is the uncomfortable middle ground defenders now have to plan for: not fully novel malware, and not a scripted commodity attack, but an automated operator capable of reading the room inside a compromised environment.
For security teams, the near-term response is refreshingly concrete. Patch the known flaw. Remove public exposure. Rotate secrets that ever lived near Langflow. Harden Nacos and MinIO. Watch for AI-generated payload traits in logs. The longer-term problem is broader: as agents become cheaper and more capable, the long tail of forgotten infrastructure becomes easier to weaponize at scale.