The Bug That Made My Pentesting Agent Give Up

#ai #autonomous #cybersecurity #pentest

XenoCoreGiger31

I’ve been building Halo, an autonomous pentesting agent powered by a local LLM (Gemma 4 12B, abliterated, running via LM Studio). Last week I found a bug in it that taught me more about state management in agentic systems than anything else I’ve hit so far.

The symptom

I was testing a tool — httpx, used for HTTP probing — standalone from the terminal. Worked fine. Ran it through the full agent loop against the same target. The agent refused to even try it. No error, no retry, just… skipped.

At first I assumed it was a prompt issue — maybe the LLM just wasn’t selecting the tool. But when I dug into the logs, I found something stranger: the agent’s reasoning explicitly referenced the tool as “previously failed” — except it had never run against this target before.

The investigation

Halo has a failure cache (agent_cache.py) that fingerprints failed tool runs using SHA-256 hashes, so the agent doesn’t waste cycles retrying things that already didn’t work. Reasonable design — except when I traced the fingerprint logic, I found the cache key had no concept of which engagement a failure happened in.

That meant if httpx failed once against Target A (say, due to a transient network blip or a misconfigured flag), it was blacklisted globally — not just for Target A, but for every future target, forever, across completely unrelated engagements.

The root cause

The cache was scoped at the tool level only: tool_name + target as the fingerprint. It should have been: engagement_id + tool_name + target. Without engagement scoping, one bad run anywhere poisoned the well everywhere. The agent wasn’t being cautious — it was permanently and silently giving up on tools that had simply had a bad day once.
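To make the failure mode concrete, here is a minimal sketch of a failure cache with no engagement scope. The class name, method names, and the engagement labels in the comments are hypothetical and are not Halo's actual code; only the key format follows the "before" version shown in this post. With a key of this shape, a failure recorded in one engagement is still visible to any later engagement that builds the same key.

```python
import hashlib


class UnscopedFailureCache:
    """Hypothetical stand-in for a failure cache keyed only on tool and target."""

    def __init__(self):
        self._failed = set()

    @staticmethod
    def _key(tool_name, target):
        # Same shape as the pre-fix key: no engagement component at all.
        return hashlib.sha256(f"{tool_name}:{target}".encode()).hexdigest()

    def record_failure(self, tool_name, target):
        self._failed.add(self._key(tool_name, target))

    def should_skip(self, tool_name, target):
        return self._key(tool_name, target) in self._failed


cache = UnscopedFailureCache()

# Engagement "eng-001" (made-up label): httpx fails once, say on a network blip.
cache.record_failure("httpx", "10.0.0.5")

# Engagement "eng-002", later: the cache cannot tell the two engagements apart,
# so the old failure still blocks the tool.
skipped_in_new_engagement = cache.should_skip("httpx", "10.0.0.5")
print(skipped_in_new_engagement)  # True
```

Nothing in the lookup carries an engagement identifier, so there is no point at which the stale entry could be distinguished from a fresh one.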

This is a classic state-management trap in agentic systems: caching for efficiency is good, but if your cache key doesn’t match the actual scope of validity for that data, you get the appearance of stability while quietly accumulating false negatives. And the worst part is it fails silently — there’s no crash, no obvious symptom, just a system that gets less capable over time without telling you why.
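One possible way to make this kind of skip less silent is to log every cache-driven skip with enough context to trace it later. The sketch below assumes Python's standard `logging` module; the function `run_or_skip`, its parameters, and the logger name are hypothetical and only illustrate the idea.

```python
import logging

logger = logging.getLogger("halo.cache")  # logger name is an assumption


def run_or_skip(failed_keys, cache_key, tool_name, run_tool):
    """Run the tool unless its key is in the failure cache; log every skip."""
    if cache_key in failed_keys:
        # Surface the skip so it shows up in logs instead of vanishing.
        logger.warning(
            "skipping %s: cache key %s... is marked as failed",
            tool_name,
            cache_key[:12],
        )
        return None
    return run_tool()


# A skipped call returns None and emits a warning; a fresh call runs the tool.
skipped = run_or_skip({"abc123"}, "abc123", "httpx", lambda: "ran")
executed = run_or_skip(set(), "abc123", "httpx", lambda: "ran")
```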

The fix

I added engagement_id as a required field, threaded through agent_cache.py and every call site in agent_loop.py. Each engagement now gets its own failure namespace. A tool that fails against a target is still blacklisted for the rest of that engagement (so the agent doesn’t waste time retrying within a session), but it starts fresh on the next one.

Before the fix:
cache_key = hashlib.sha256(f"{tool_name}:{target}".encode()).hexdigest()

After the fix:
cache_key = hashlib.sha256(f"{engagement_id}:{tool_name}:{target}".encode()).hexdigest()

(Simplified — the real implementation also classifies failure type: timeout, permission denied, tool missing, network error, etc., so the agent can make smarter decisions about when to retry, not just whether to retry.)
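As a rough illustration of how engagement scoping and failure types could fit together, here is a minimal sketch. It is not Halo's implementation: the `FailureKind` enum, the `MAX_ATTEMPTS` policy, and the `ScopedFailureCache` class are all assumptions made for this example, and the retry limits are arbitrary. Only the key format and the four failure categories come from the post.

```python
import hashlib
from enum import Enum


class FailureKind(Enum):
    TIMEOUT = "timeout"
    PERMISSION_DENIED = "permission_denied"
    TOOL_MISSING = "tool_missing"
    NETWORK_ERROR = "network_error"


# Assumed policy: transient failures get a few attempts, permanent ones get one.
MAX_ATTEMPTS = {
    FailureKind.TIMEOUT: 3,
    FailureKind.NETWORK_ERROR: 3,
    FailureKind.PERMISSION_DENIED: 1,
    FailureKind.TOOL_MISSING: 1,
}


class ScopedFailureCache:
    def __init__(self):
        self._failures = {}  # cache key -> (last FailureKind, failure count)

    @staticmethod
    def _key(engagement_id, tool_name, target):
        # Same shape as the post-fix key: engagement_id is part of the hash.
        raw = f"{engagement_id}:{tool_name}:{target}"
        return hashlib.sha256(raw.encode()).hexdigest()

    def record_failure(self, engagement_id, tool_name, target, kind):
        key = self._key(engagement_id, tool_name, target)
        _, count = self._failures.get(key, (kind, 0))
        self._failures[key] = (kind, count + 1)

    def should_skip(self, engagement_id, tool_name, target):
        entry = self._failures.get(self._key(engagement_id, tool_name, target))
        if entry is None:
            return False
        kind, count = entry
        return count >= MAX_ATTEMPTS[kind]


cache = ScopedFailureCache()
cache.record_failure("eng-001", "httpx", "10.0.0.5", FailureKind.TIMEOUT)
cache.record_failure("eng-001", "nuclei", "10.0.0.5", FailureKind.TOOL_MISSING)

retry_after_timeout = not cache.should_skip("eng-001", "httpx", "10.0.0.5")
skip_missing_tool = cache.should_skip("eng-001", "nuclei", "10.0.0.5")
fresh_in_next_engagement = not cache.should_skip("eng-002", "nuclei", "10.0.0.5")
```

In this sketch a single timeout leaves room for a retry, a missing tool is skipped for the rest of the engagement, and a new engagement_id starts with a clean slate for the same tool and target.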

What it taught me

If you’re building any kind of stateful agent — caching, memory, learned preferences, whatever — ask explicitly: what is the actual scope of validity for this piece of state? It’s tempting to cache broadly because it feels more efficient, but a cache that outlives its true scope doesn’t just waste effort, it actively corrupts future decisions. The agent wasn’t broken. It was being “smart” with the wrong boundaries.

Halo is still very much a work in progress — open source, local-first, no cloud dependency for the reasoning loop. If you’re working on anything agentic with persistent state, I’d be curious how you’ve handled scoping problems like this.

XenoCoreGiger31 / GEMMA-by-GOOGLE

GEMMA-POWERED-BY-GOOGLE-CYBERSECURITY-AUTONOMOUS-AI: an autonomous AI agent running in a Linux environment on a Gemma 4 12B model. Highly abliterated. Fully local and fully free, with a persistent negative-experience cache, learning and getting smarter with each engagement. Autonomous recon and attack loops, plus automatic professional report generation on findings.

An autonomous AI agent inside a Linux environment, running one of the world's most cutting-edge AI models, Google's Gemma 4 12B. Fully uncensored/abliterated. Fully local. Fully free. With persistent negative-cache learning and adaptation: learning and self-harnessing, getting more self-aware and intelligent with each engagement.

Autonomous recon, scanning, and attack vector mapping, with one word to start it all: ENGAGE. Attack loops and reports, professional and completely local. This agent is fast, and it documents its exploits, findings, and risk levels autonomously with clean precision and professionalism. Star it if you like it, or open a PR and let's do something together. This is where ideas come alive and problems are solved!

license: mit
language: en
tags: security, penetration-testing, autonomous-agent, mcp, kali-linux, llm, cybersecurity, red-team
library_name: other
pipeline_tag: text-generation

🔐 HALO Cybersecurity

Autonomous AI-powered penetration testing agent —…

automajicly/GEMMA-by-GOOGLE · Hugging Face
