I built an open-source AI agent that explains any ML model in plain English — real SHAP, real LIME, real bias detection

#hermesagentchallenge #devchallenge #agents

Simran Shaikh


The problem I kept running into

Every time I finished training a model, the same conversation happened:

Manager: "Why did it predict that?"
Me: opens SHAP plot
Manager: glazed eyes

SHAP and LIME are powerful — but they output numbers and plots that
only data scientists can read. Nobody builds the bridge to plain English.
Nobody automates the bias check. Nobody generates a report your legal
team can actually use.

So I built XAI-Agent to do all of that — powered by Hermes Agent's
autonomous multi-step planning pipeline.

What it does

Upload any trained ML model (.pkl) + dataset (.csv) →
Hermes Agent runs 5 tools autonomously →
You get a full plain-English explainability report in under 3 minutes.

The 5-step Hermes Agent pipeline:

file_reader — loads model, auto-detects task type, picks right explainer
shap_analyzer — runs real SHAP, ranks all features by impact + direction
lime_explainer — explains 3 individual predictions in plain English
bias_checker — scans for demographic features, flags disparities
report_writer — writes structured Markdown report, downloadable instantly
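As a rough illustration of what a step like bias_checker could do, the sketch below scans column names for demographic keywords and compares positive-prediction rates across groups. The keyword list, the 0.8 ratio threshold (borrowed from the common "four-fifths" rule of thumb), and all function names are assumptions for illustration, not taken from the XAI-Agent source.

```python
import re
from collections import defaultdict

# Hypothetical keyword list; the real tool may use a different one.
DEMOGRAPHIC_KEYWORDS = ("age", "gender", "sex", "race", "ethnicity", "religion")

def find_demographic_columns(columns):
    """Return column names that contain a demographic keyword as a whole token."""
    found = []
    for name in columns:
        # Split on anything that is not a letter so "average" does not match "age".
        tokens = re.split(r"[^a-z]+", name.lower())
        if any(tok in DEMOGRAPHIC_KEYWORDS for tok in tokens):
            found.append(name)
    return found

def positive_rate_by_group(groups, predictions):
    """Share of positive (1) predictions within each group value."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for g, p in zip(groups, predictions):
        counts[g][0] += int(p == 1)
        counts[g][1] += 1
    return {g: pos / total for g, (pos, total) in counts.items()}

def disparity_flag(rates, min_ratio=0.8):
    """Flag when lowest group rate / highest group rate falls below min_ratio."""
    highest = max(rates.values())
    if highest == 0:
        return False
    return min(rates.values()) / highest < min_ratio
```

A dataset like breast cancer has no matching columns, which is consistent with a "no demographic features found" outcome; a credit dataset with a column such as `applicant_age` would be picked up and checked.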

What makes this genuinely agentic

Context flows between all 5 tools. The model type from Step 1
determines which SHAP explainer Step 2 uses. The feature ranking
from Step 2 informs Step 3's LIME analysis. The bias verdict from
Step 4 shapes Step 5's recommendations.
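One way to picture this hand-off is a shared context dictionary that each step reads from and writes to. The sketch below is an assumption about the shape of such a pipeline, not Hermes Agent's actual API: the step functions and dictionary keys are hypothetical stand-ins that return placeholder values instead of running real analysis.

```python
def file_reader(ctx):
    # Stand-in: tree ensembles get a tree explainer, everything else a generic one.
    tree_models = {"RandomForestClassifier", "XGBClassifier", "LGBMClassifier"}
    ctx["explainer"] = "tree" if ctx["model_name"] in tree_models else "kernel"
    return ctx

def shap_analyzer(ctx):
    # Stand-in ranking; a real step would compute it with the chosen explainer.
    scores = ctx["importances"]
    ctx["top_features"] = sorted(scores, key=scores.get, reverse=True)
    return ctx

def lime_explainer(ctx):
    # Uses the Step 2 ranking to decide which features to describe first.
    ctx["lime_focus"] = ctx["top_features"][:2]
    return ctx

def bias_checker(ctx):
    # Stand-in check: flag if any feature name is an obvious demographic field.
    ctx["bias_detected"] = any(f in {"age", "gender"} for f in ctx["importances"])
    return ctx

def report_writer(ctx):
    # The Step 4 verdict decides which recommendation ends up in the report.
    advice = "Review flagged features." if ctx["bias_detected"] else "No bias follow-up needed."
    ctx["report"] = f"Explainer: {ctx['explainer']}. Top: {ctx['top_features'][0]}. {advice}"
    return ctx

PIPELINE = [file_reader, shap_analyzer, lime_explainer, bias_checker, report_writer]

def run_pipeline(ctx):
    """Run every step in order, passing the same context along."""
    for step in PIPELINE:
        ctx = step(ctx)
    return ctx
```

Because every step only touches the context, a later step can be swapped or reordered without changing the others, as long as the keys it depends on are already present.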

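It also handles a real edge case many tutorials miss: for classifiers,
newer SHAP versions can return a single 3D array (samples, features, classes)
where older versions returned a list of 2D arrays, one per class.
The agent detects the layout automatically and slices out the class of
interest, avoiding a shape mismatch that breaks code written for the older layout.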
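A minimal sketch of that detection, assuming the explainer output is either a list of per-class 2D arrays or a single NumPy array that may be 2D or 3D. The helper name `to_2d_shap` and the default class index of 1 are illustrative assumptions, not the project's actual code.

```python
import numpy as np

def to_2d_shap(shap_values, class_index=1):
    """Return a (samples, features) array for one class, whatever the input layout."""
    if isinstance(shap_values, list):
        # List layout: one (samples, features) array per class.
        return np.asarray(shap_values[class_index])
    values = np.asarray(shap_values)
    if values.ndim == 3:
        # 3D layout: (samples, features, classes); keep a single class.
        return values[:, :, class_index]
    if values.ndim == 2:
        # Already (samples, features), e.g. a regression model.
        return values
    raise ValueError(f"Unexpected SHAP shape: {values.shape}")
```

Downstream code can then assume a 2D array and never needs to know which SHAP version produced it.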

Sample output

Running on the breast cancer dataset (569 patients, 30 features):

Executive Summary (auto-generated):

This RandomForestClassifier was analyzed across 569 samples and
30 features. The most influential predictor is 'worst area'.
No demographic bias was detected.

SHAP top features:

worst area — 0.0756 — ↑ increases malignancy prediction
worst concave points — 0.0538 — ↑ increases malignancy prediction
mean concave points — 0.0503 — ↑ increases malignancy prediction
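A global ranking like this is commonly built from the mean absolute SHAP value per feature. The sketch below assumes that convention and a 2D (samples, features) array of SHAP values; the direction heuristic (sign of the correlation between feature value and SHAP value) and the `rank_features` helper are illustrative choices, not necessarily what XAI-Agent does.

```python
import numpy as np

def rank_features(shap_2d, X, feature_names, top_k=3):
    """Rank features by mean |SHAP| and attach a rough direction arrow."""
    shap_2d = np.asarray(shap_2d, dtype=float)
    X = np.asarray(X, dtype=float)
    importance = np.abs(shap_2d).mean(axis=0)
    order = np.argsort(importance)[::-1][:top_k]  # largest impact first
    ranked = []
    for j in order:
        if X[:, j].std() == 0 or shap_2d[:, j].std() == 0:
            arrow = "~"  # constant column: direction is undefined
        else:
            # Positive correlation: higher feature values push the prediction up.
            corr = np.corrcoef(X[:, j], shap_2d[:, j])[0, 1]
            arrow = "↑" if corr >= 0 else "↓"
        ranked.append((feature_names[int(j)], round(float(importance[j]), 4), arrow))
    return ranked
```

Each tuple holds the feature name, its mean absolute impact, and the arrow, which maps directly onto one line of the ranking shown above.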

Prediction explained in plain English:

Row 0 — Predicted benign at 94% confidence.
'worst area' was well below the malignancy threshold
(impact: −0.141). 'worst concave points' also supported
benign classification (impact: −0.089).
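To show how numeric LIME weights might become sentences like these, here is a small sketch. It assumes the input is a list of (feature description, weight) pairs, which is the shape returned by LIME's `Explanation.as_list()`; the wording templates and the `describe_prediction` helper are hypothetical.

```python
def describe_prediction(row_id, label, confidence, weights, top_n=2):
    """Turn (feature, weight) pairs into a short plain-English explanation."""
    # Strongest contributions first, regardless of sign.
    strongest = sorted(weights, key=lambda fw: abs(fw[1]), reverse=True)[:top_n]
    lines = [f"Row {row_id}: predicted {label} at {confidence:.0%} confidence."]
    for feature, weight in strongest:
        effect = "pushed toward" if weight > 0 else "pushed away from"
        lines.append(f"'{feature}' {effect} the positive class (impact: {weight:+.3f}).")
    return " ".join(lines)

example = describe_prediction(
    0, "benign", 0.94,
    [("worst concave points", -0.089), ("worst area", -0.141), ("mean radius", 0.01)],
)
print(example)
```

Which class counts as "positive" depends on how the labels are encoded, so a real implementation would substitute the actual class name into the template.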

Why this matters beyond the challenge

The EU AI Act imposes transparency requirements on high-risk AI systems.
GDPR gives individuals rights around automated decisions, often
summarized as a right to explanation.
US financial regulators require adverse action explanations for
ML credit scoring.

Existing tools (Fiddler, Arize, Arthur AI) cost $50K+/year.
XAI-Agent is free and open-source, runs locally, and produces a report in under 3 minutes.

Tech stack

Hermes Agent (autonomous multi-step planning)
SHAP + LIME (real explainability — not simulated)
Streamlit (UI)
scikit-learn, XGBoost, LightGBM

Try it yourself

GitHub: https://github.com/SimranShaikh20/xai-agent

git clone https://github.com/SimranShaikh20/xai-agent
cd xai-agent
pip install -r requirements.txt
streamlit run app.py

Test files (sample_model.pkl + sample_dataset.csv) are included,
so a first run takes under 3 minutes with zero extra setup.

What model would YOU run this on first? Drop it in the comments 👇