HackerOne Just Made Agentic Prompt Injection Testing Official
Two things happened this week that every developer building AI agents should pay attention to.
On March 21, HackerOne launched a dedicated "Agentic Prompt Injection Testing" category on their platform. Then on March 26, The Register published a piece quoting Michael Bargury from Zenity at RSA Conference, where he described AI agents as "gullible and easy to turn into your minions."
These are not unrelated events. They are two sides of the same realization hitting the industry at once: AI agents in production are fundamentally vulnerable, and the security community is finally building the infrastructure to deal with it.
What HackerOne Actually Launched
HackerOne has been the dominant bug bounty platform for over a decade. When they add a new vulnerability category, it signals that the attack surface is real enough and common enough to warrant its own classification.
Their new Agentic Prompt Injection Testing program does a few things. It gives security researchers a formal framework for testing AI agents specifically, not just chatbots or completions APIs. It provides structured reporting for injection attacks that chain through agent tool calls, memory systems, and multi-step workflows. And it gives companies running AI agents a way to invite external testers to find the problems their internal teams missed.
This matters because agent security testing has been a gray area. Bug bounty hunters who found prompt injection vulnerabilities in AI products often struggled to report them. Was it a "vulnerability" if the AI assistant leaked data through a cleverly worded prompt? Where did the scope end when the agent had access to email, calendars, and internal databases?
HackerOne is answering that question definitively: yes, it counts, and here is how to report it.
The Zenity Angle: Agents Are Not Just Injectable, They Are Gullible
Michael Bargury and his team at Zenity have been researching AI agent security for years. His RSA presentation this week cut straight to the heart of a problem most developers underestimate.
The key insight: you do not always need a traditional prompt injection to compromise an agent. Agents are gullible by design. They are built to be helpful, to follow instructions, to trust the data they are given. That trust model becomes a gaping security hole when agents operate autonomously in real environments.
Bargury described "zero-click" agent attacks. These do not require any interaction from the user at all. If your agent reads emails, browses the web, processes documents, or pulls data from any external source, an attacker can plant instructions in those sources. The agent encounters them during normal operation and simply does what it is told.
This is not theoretical. Bargury's team demonstrated attacks where agents were turned into "minions" through nothing more than a well-crafted email sitting in an inbox or a hidden instruction on a webpage. No exploit code. No buffer overflow. Just persuasive text that the agent interpreted as legitimate instructions.
His framing of "persuasion, not just injection" is important. Traditional prompt injection implies a technical bypass, like SQL injection for AI. But agent compromise is often more subtle than that. You are not breaking the model. You are talking it into doing something it should not do. The same social engineering tactics that work on humans work on agents, sometimes better, because agents do not get suspicious.
Why This Convergence Matters
HackerOne building formal testing infrastructure and Zenity researchers publicly demonstrating how easy agents are to compromise in the same week is not a coincidence. It reflects a market that is waking up to reality.
Here is the reality: thousands of companies shipped AI agents in 2025 and early 2026. Most of those agents have tool access. Many can read and write emails, query databases, modify files, or call external APIs. Very few of them have been tested against adversarial inputs with any rigor.
The gap between "what these agents can do" and "how well we have tested what they will do under attack" is enormous. And it is growing as agents get more capable and more autonomous.
For developers, the practical takeaways are straightforward.
Your agent's tool permissions are your attack surface. Every API call your agent can make, every database it can query, every email it can send is a capability that an attacker gets access to if they can influence the agent's behavior. Audit your tool permissions ruthlessly. Apply least privilege like you would for any service account.
External data is untrusted input. If your agent processes emails, documents, web pages, or any user-generated content, treat all of that as potentially adversarial. This is the zero-click attack vector Bargury described. Your agent will encounter malicious instructions during normal operation, and it needs to handle them correctly.
System prompt hardening is necessary but not sufficient. Yes, you should write strong system prompts with clear boundaries. But as both HackerOne's new category and Bargury's research show, determined attackers will find ways around static instructions. You need layered defenses: input filtering, output monitoring, tool call validation, and ongoing testing.
Test with real attacks, not just unit tests. HackerOne's new program exists because companies need external adversarial testers. Your internal team built the agent and inherits the same blind spots. You need outside perspectives trying to break your system.
What You Can Do Right Now
The good news is that the tooling for agent security testing is catching up. HackerOne's program gives enterprises a path to formal adversarial testing. Researchers like Bargury at Zenity are publishing attack patterns that help everyone understand the threat landscape.
If you want to test your own agents before inviting the world, tools like BreakMyAgent let you run automated prompt injection scans against your agent and see exactly where it folds. Running a quick scan takes minutes and can surface the obvious vulnerabilities before you invest in a full bug bounty program.
The pattern we have seen over and over in security is that the window between "researchers prove this is a real problem" and "attackers start exploiting it in the wild" gets shorter every cycle. HackerOne and Zenity are both saying the same thing this week: the problem is real, the attacks are practical, and the time to test your agents was yesterday.
Do not wait for someone else to find the holes in your agent. Find them yourself first.