Over the past month, I've been systematically scanning open source repositories for security vulnerabilities. Not hypothetical ones — real, exploitable bugs that I've reported through bounty programs.
After scanning 100+ repos across ML frameworks, web servers, and developer tools, the same patterns keep showing up. Here's what I found, with real code examples.
I use a combination of Semgrep rules, custom Python scripts, and manual code review. The workflow:

1. Run Semgrep with security rulesets to catch the obvious issues.
2. Run custom pattern matching for ML-specific sinks (deserialization, archive extraction, dynamic execution).
3. Manually trace user input from entry points to dangerous sinks.

If you want to automate step 2, I built a Security Scanner Skill for Claude Code that runs 50+ checks in one pass.
## Pattern 1: Zip Slip

Found in: 23% of repos that handle file uploads

The classic: extracting a zip or tar archive without validating that entry paths stay inside the target directory.
```python
import zipfile

# VULNERABLE — real pattern from a top ML framework
def extract_archive(archive_path, dest_dir):
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)  # No path validation!
```
An attacker crafts a zip with entries like `../../etc/cron.d/backdoor`. The fix:
```python
import os
import zipfile

def safe_extract(archive_path, dest_dir):
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            # Compare against the prefix plus a separator, otherwise
            # "/data-evil" would pass a check against "/data".
            if not target.startswith(dest_root + os.sep):
                raise ValueError(f"Path traversal detected: {info.filename}")
        zf.extractall(dest_root)
```
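A quick way to sanity-check the fix: build a malicious archive in memory and confirm the guard trips. This is a minimal sketch; `safe_extract` here repeats the logic above so the snippet runs standalone.

```python
import io
import os
import tempfile
import zipfile

def safe_extract(archive, dest_dir):
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if not target.startswith(dest_root + os.sep):
                raise ValueError(f"Path traversal detected: {info.filename}")
        zf.extractall(dest_root)

# Craft an in-memory zip whose entry escapes the destination directory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("../evil.txt", "pwned")

blocked = False
with tempfile.TemporaryDirectory() as dest:
    try:
        safe_extract(buf, dest)
    except ValueError as exc:
        blocked = True
        print(exc)  # Path traversal detected: ../evil.txt
```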
I found two distinct zip slip vulnerabilities in a single ML experiment tracking framework — both in functions that extract model artifacts. Combined bounty value: $3,000.
## Pattern 2: Pickle Deserialization

Found in: 41% of ML repos

This is the big one in ML. `pickle.load()` executes arbitrary Python code during deserialization. Everyone knows this. Nobody fixes it.
```python
import pickle

# VULNERABLE — seen in multiple model serving frameworks
def load_model(path):
    with open(path, 'rb') as f:
        return pickle.load(f)  # RCE if attacker controls the file
```
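To see why this is remote code execution, note that pickle lets any object name a callable to invoke at load time via `__reduce__`. A harmless demonstration, using `os.getcwd` where an attacker would put something destructive:

```python
import os
import pickle

class Payload:
    def __reduce__(self):
        # pickle will call os.getcwd() at load time; an attacker would
        # substitute any importable callable plus arguments here.
        return (os.getcwd, ())

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the callable runs during deserialization
print(result)
```

No `Payload` class needs to exist on the victim's side; the pickle stream itself carries the instruction to import and call the target.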
I discovered 31 distinct techniques to bypass the most popular ML model security scanner. The scanner blocks os.system and subprocess, but misses:
- `http.client.HTTPConnection` (Python 2/3 naming gap: `httplib` is blocked, `http.client` isn't)
- `multiprocessing.Pool` (spawns worker processes via `fork()`, but only `subprocess` is blocked)
- `smtplib.SMTP` (wraps `socket`, which IS blocked, but the wrapper isn't)
- `tempfile.mkstemp` (creates persistent files on disk during deserialization)

The 31 bypass techniques fall into 4 categories: module blocklist gaps, Python 2/3 naming inconsistencies, wrapper modules, and format-specific skips.
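The defensive takeaway: blocklists lose. If you must accept pickle at all, allow-list the exact classes you expect by overriding `Unpickler.find_class`. A minimal sketch, with an illustrative allow-list:

```python
import io
import pickle

# Illustrative allow-list: only these (module, name) globals may load.
ALLOWED = {("collections", "OrderedDict"), ("builtins", "dict")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse any global not on the explicit allow-list.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"blocked: {module}.{name}")
        return super().find_class(module, name)

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Safer still is avoiding pickle entirely where you can: safetensors for model weights, JSON for configs.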
## Pattern 3: SQL Injection

Found in: 18% of repos with database integrations

Even with ORMs, developers drop to raw SQL for complex queries. When they do, they concatenate user input directly.
```python
# VULNERABLE — real pattern from a data platform
def search_documents(query, db_type):
    if db_type == "postgres":
        sql = f"SELECT * FROM docs WHERE content LIKE '%{query}%'"
        cursor.execute(sql)
```
I found SQL injection and NoSQL injection in a popular LLM framework's database connectors — the query builder trusted user input for column names and filter values.
The fix is always parameterized queries:
```python
cursor.execute("SELECT * FROM docs WHERE content LIKE %s", [f"%{query}%"])
```
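The difference is easy to demonstrate with stdlib `sqlite3` (which uses `?` placeholders rather than `%s`; the principle is identical). The attacker input breaks out of the string literal in the concatenated version but stays inert in the parameterized one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (content TEXT)")
conn.executemany("INSERT INTO docs VALUES (?)", [("public note",), ("secret token",)])

query = "%' OR '1'='1"  # attacker-controlled input

# Concatenation: the quote breaks out and the clause matches every row.
rows_bad = conn.execute(
    f"SELECT * FROM docs WHERE content LIKE '%{query}%'"
).fetchall()

# Parameterized: the whole string is treated as one LIKE pattern.
rows_ok = conn.execute(
    "SELECT * FROM docs WHERE content LIKE ?", [f"%{query}%"]
).fetchall()

print(len(rows_bad), len(rows_ok))  # prints "2 0"
```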
## Pattern 4: SSRF

Found in: 29% of repos with HTTP client functionality

Any feature that fetches a URL based on user input is an SSRF candidate. ML frameworks are especially vulnerable because they download models, datasets, and configs from URLs.
```python
# VULNERABLE
@app.route('/fetch-model')
def fetch_model():
    url = request.args.get('url')
    response = requests.get(url)  # SSRF — attacker can hit internal services
    return response.content
```
Cloud metadata endpoints (`169.254.169.254`) are the classic target. But the real damage comes from hitting internal APIs that assume network-level trust.
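One reasonable first layer, sketched here with stdlib only: resolve the host yourself and refuse private, loopback, link-local, and reserved ranges before fetching. (On its own this does not stop DNS rebinding; for that, connect to the IP you validated rather than re-resolving.)

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_safe_url(url: str) -> bool:
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        # Blocks 10/8, 172.16/12, 192.168/16, 127/8, 169.254/16, etc.
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True

print(is_safe_url("http://169.254.169.254/latest/meta-data/"))  # False
```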
## Pattern 5: exec/eval on Untrusted Input

Found in: 15% of repos (but 34% of ML/notebook tools)

The worst pattern. I found a critical RCE in a popular ML UI framework where a single unauthenticated POST request writes arbitrary Python code to a file and `exec()` runs it within 50ms.
```python
# VULNERABLE — real pattern from an ML UI framework
@app.post("/run-code/")
async def run_code(request):
    code = (await request.json())["code"]
    exec(code)  # Full RCE, no auth required
```
This one was particularly bad because the endpoint had no authentication. Any network-adjacent attacker gets full code execution on the server.
## Pattern 6: Path Traversal in File Serving

Found in: 21% of web-facing repos

Serving files based on user-supplied paths without proper sanitization:
```python
# VULNERABLE
@app.route('/files/<path:filename>')
def serve_file(filename):
    return send_file(os.path.join(UPLOAD_DIR, filename))
```
An attacker requests `/files/../../../etc/passwd`. The fix is `os.path.realpath()` plus a prefix check, the same as the zip slip fix.
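A sketch of that fix as a framework-agnostic helper (the `/srv/uploads` location is illustrative). In the Flask handler you would return `send_file(resolve_upload_path(filename))`; Flask's `send_from_directory` performs an equivalent containment check for you:

```python
import os

UPLOAD_DIR = os.path.realpath("/srv/uploads")  # assumed upload location

def resolve_upload_path(filename: str) -> str:
    """Return the real path for filename, or raise if it escapes UPLOAD_DIR."""
    target = os.path.realpath(os.path.join(UPLOAD_DIR, filename))
    # Trailing separator so "/srv/uploads-evil" can't slip past the check.
    if not target.startswith(UPLOAD_DIR + os.sep):
        raise PermissionError(f"Path traversal detected: {filename}")
    return target
```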
## Pattern 7: Missing Authentication

Found in: 12% of repos (but nearly universal in "local-first" tools that get deployed to servers)

Tools designed for local development often expose admin endpoints without auth. When someone deploys them on a server (which always happens), those endpoints become attack surface.
```python
# VULNERABLE — "it's just a local tool"
@app.route('/admin/delete-all', methods=['POST'])
def delete_all():
    db.drop_all()
    return "Done"
```
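Even a single shared token beats nothing. A minimal framework-agnostic sketch (`ADMIN_TOKEN` and the decorator are illustrative; `hmac.compare_digest` avoids timing leaks in the comparison):

```python
import functools
import hmac
import os

ADMIN_TOKEN = os.environ.get("ADMIN_TOKEN", "change-me")  # assumed config

def require_token(handler):
    @functools.wraps(handler)
    def wrapper(token: str, *args, **kwargs):
        # Constant-time comparison of the presented token.
        if not hmac.compare_digest(token, ADMIN_TOKEN):
            raise PermissionError("invalid admin token")
        return handler(*args, **kwargs)
    return wrapper

@require_token
def delete_all():
    return "Done"  # db.drop_all() would go here
```

In a real web app the token would arrive in a header and the guard would return a 401 instead of raising.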
## The Numbers

After 100+ repos:
| Pattern | Prevalence | Avg Bounty | Difficulty to Find |
|---|---|---|---|
| Pickle deserialization | 41% of ML repos | $500-2,000 | Easy (grep for `pickle.load`) |
| SSRF | 29% of HTTP repos | $500-1,500 | Medium |
| Zip slip | 23% of upload repos | $1,000-1,500 | Easy (grep for `extractall`) |
| Path traversal | 21% of file-serving repos | $500-1,000 | Easy |
| SQL injection | 18% of DB repos | $500-3,000 | Medium |
| eval/exec | 15% overall | $1,000-3,000 | Easy (grep for `eval`/`exec`) |
| Missing auth | 12% overall | $500-1,500 | Hard (need to understand the intended access model) |
## How to Replicate This

The fastest way to start:
1. Semgrep for the obvious stuff:

   ```bash
   semgrep --config=p/python-security --config=p/owasp-top-ten .
   ```
2. Custom grep for ML-specific patterns:

   ```bash
   # Dangerous deserialization
   grep -rn "pickle\.load\|joblib\.load\|torch\.load\|yaml\.load" --include="*.py" .

   # Dangerous execution
   grep -rn "eval(\|exec(\|compile(" --include="*.py" .

   # Archive extraction without validation
   grep -rn "extractall\|extract(" --include="*.py" .
   ```
3. Trace user input to sinks. This is the step that catches what automated tools miss. Follow every `request.args`, `request.json`, and function parameter from an API endpoint, and see where it ends up.
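Step 2 above can be pushed past grep with a small AST walk, which skips matches inside strings and comments. A minimal sketch that flags calls to dangerous sinks by name (the sink list is illustrative, not exhaustive, and this does no data-flow analysis):

```python
import ast

SINKS = {"eval", "exec", "pickle.load", "pickle.loads", "yaml.load", "torch.load"}

def call_name(node: ast.Call) -> str:
    """Render a call target like 'pickle.load' or 'eval' (best effort)."""
    func = node.func
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    if isinstance(func, ast.Name):
        return func.id
    return ""

def find_sinks(source: str, filename: str = "<memory>"):
    hits = []
    for node in ast.walk(ast.parse(source, filename)):
        if isinstance(node, ast.Call) and call_name(node) in SINKS:
            hits.append((filename, node.lineno, call_name(node)))
    return hits

code = "import pickle\nmodel = pickle.load(open(p, 'rb'))\n"
print(find_sinks(code))  # [('<memory>', 2, 'pickle.load')]
```

Pointing this at every `.py` file in a repo gets you a triage list in seconds; the manual work is then deciding which hits are reachable from untrusted input.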
I automated most of this into a Claude Code skill that chains Semgrep scanning with custom pattern matching and generates structured reports. It catches patterns 1-6 automatically and flags pattern 7 for manual review.
## Lessons Learned

- **ML repos are softer targets than web frameworks.** Web devs have 20 years of OWASP education. ML engineers are focused on model accuracy, not input validation.
- **Scanner bypasses are everywhere.** The most popular ML model scanner has 31 distinct bypass techniques. Don't trust a single tool; layer your defenses.
- **"Local-only" tools always end up on servers.** If it has a web interface, assume it will be network-accessible. Add auth.
- **The money is in specificity.** A generic "you have XSS" report gets ignored. A working PoC that demonstrates data exfiltration gets paid.
- **Volume beats perfection.** Scanning 100 repos with basic patterns found more real bugs than deep-diving 5 repos with advanced techniques.
If you're building developer tools, I also built a Dashboard Builder skill for creating monitoring dashboards and an API Connector for integrating with vulnerability databases and security APIs.
All vulnerabilities mentioned have been responsibly disclosed through appropriate bounty programs before publication.