I already wrote about why I built CodeClone and why I cared about baseline-aware code health.
Then I wrote about turning it into a read-only, budget-aware MCP server for AI agents.
This post is about what changed in 2.0.0b4.
The short version: if b3 made CodeClone usable through MCP, b4 made it feel like a product.
Not because I added more analysis magic or built a separate "AI mode." But because I pushed the same structural truth into the places where people and agents actually work — VS Code, Claude Desktop, Codex — and tightened the contract between all of them.
A lot of developer tools are strong on analysis and weak on workflow. A lot of AI-facing tools shine in a demo and fall apart in daily use.
For b4, I wanted a tighter shape. That is the release theme: not "more output," but better day-to-day workflows.
Clone detection tells you this logic is repeated. Complexity tells you this function is locally hard to reason about.
Overloaded Modules asks a different question: which modules are taking on too much responsibility?
The signals include module size pressure, dependency pressure, hub-like shape, and reimport-heavy structure. This points to code that often feels wrong before it is easy to classify. You know the file keeps attracting logic. Every change in it feels heavier than it should. But it is not a clone group or a single high-complexity function.
The important design choice: this layer is report-only for now. It shows up in JSON, HTML, Markdown, text, MCP, and the VS Code extension — but it does not affect health score, gates, baseline novelty, or SARIF.
I wanted the signal to be useful before letting it become consequential.
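To make the signal families above concrete, here is a minimal sketch of how per-module pressure signals could be computed. The names, thresholds, and data shapes are assumptions for illustration, not CodeClone's actual implementation; the point is that each signal is a simple, independent boolean that stays report-only.

```python
from dataclasses import dataclass

# Hypothetical sketch, not CodeClone's internals: illustrates the kind of
# per-module pressure signals described above, kept report-only (no gating).

@dataclass
class ModuleStats:
    name: str
    lines: int       # module size pressure
    imports: int     # dependency pressure
    importers: int   # hub-like shape: how many modules import this one
    reimports: int   # reimport-heavy structure: symbols re-exported from elsewhere

def overload_signals(m: ModuleStats,
                     max_lines: int = 800,
                     max_imports: int = 25,
                     max_importers: int = 15,
                     max_reimports: int = 10) -> dict[str, bool]:
    """Return which pressure signals fire for a module."""
    return {
        "size_pressure": m.lines > max_lines,
        "dependency_pressure": m.imports > max_imports,
        "hub_shape": m.importers > max_importers,
        "reimport_heavy": m.reimports > max_reimports,
    }

stats = ModuleStats("app/core.py", lines=1200, imports=31, importers=22, reimports=4)
print(overload_signals(stats))
```

Keeping each signal a plain boolean over an explicit threshold is what makes a report-only layer honest: a reader can see exactly why a module was flagged before the signal is ever allowed to become consequential.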
The preview VS Code extension is the first release where CodeClone feels properly usable inside an editor instead of only around one.
It is now live on the Visual Studio Marketplace.
The extension is not a generic linter panel; it is built around a review loop.
A lot of extensions get this wrong by dumping every result into the IDE and calling it integration. I wanted the opposite: a client that is baseline-aware, triage-first, source-first, trust-aware, and read-only.
b4 also tightened the surrounding UX in a handful of smaller ways, and the last of those mattered more than I expected.
I also added native client paths for Claude Desktop and Codex.
The goal was not "be available in more places." It was keeping one analysis contract across all of them:
Claude Desktop gets a local .mcpb bundle with pre-loaded review instructions.
Codex gets a native plugin with two focused skills — full review and quick hotspot discovery. Both sit on top of the same codeclone-mcp server.
That may sound boring, but boring is good here. The more clients you add, the easier it becomes to fork your own semantics without noticing. A lot of the b4 work was about resisting exactly that.
CodeClone defaults are intentionally conservative. That is the right first pass for CI, baseline-aware review, and agent-driven workflows.
But there is a real second need: sometimes the default pass looks clean, and you want to go hunting for smaller, more local repetition.
b4 makes that distinction explicit:
pyproject.toml thresholds.This now shows up clearly in MCP help topics and in the VS Code analysis profiles.
"More sensitive" is not the same as "more correct." A clean conservative pass
does not prove there is no finer-grained repetition. But a lower-threshold exploratory pass should not quietly pretend to have the same meaning as the default profile. That distinction needed to become product-level.
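As a sketch of what the two-profile distinction could look like in configuration, here is a hedged pyproject.toml fragment. The table layout and option names below are assumptions for illustration, not CodeClone's documented settings:

```toml
# Hypothetical sketch: real CodeClone option names may differ.
# Default profile: conservative thresholds for CI, baselines, and agents.
[tool.codeclone]
min-clone-lines = 12

# Exploratory profile: lower thresholds for hunting smaller, local repetition.
# Results under this profile are exploratory, not equivalent to the default pass.
[tool.codeclone.profiles.exploratory]
min-clone-lines = 6
```

Naming the exploratory pass as its own profile, rather than letting users silently lower the defaults, is what keeps the two result sets from being confused with each other.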
Two things happened on the MCP side that are easy to miss but matter a lot in practice.
First: the help tool. In b3, agents had 20 analysis and query tools but no way to ask "what should I do next?" or "what does this baseline state mean?" without burning tokens on trial and error.
b4 adds a help(topic=...) tool with bounded, static answers for common uncertainty points: workflow sequencing, analysis profile semantics, baseline interpretation, suppression rules, review state, and changed-scope review. An agent can ask one cheap question instead of making three exploratory tool calls to figure out the right next step.
This is a small surface — seven topics, short answers, no dynamic analysis. But it changes the economics of agent workflows significantly. The difference between "the agent guesses and retries" and "the agent asks and proceeds" is often 3–5x in token cost.
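The shape of such a help tool can be sketched in a few lines. The topic names and answer wording below are illustrative assumptions, not CodeClone's actual surface; what matters is that answers are static and bounded, so every call has a small, predictable token cost.

```python
# Hypothetical sketch of a bounded help(topic=...) tool: static answers,
# no dynamic analysis. Topic names and wording are illustrative only.

HELP_TOPICS: dict[str, str] = {
    "workflow": "Analyze first, then query hotspots, then review findings.",
    "profiles": "The default profile is conservative; exploratory lowers thresholds.",
    "baseline": "Baseline-novel findings are new since the recorded baseline.",
    "suppressions": "Suppressed findings stay in the report but are excluded from gates.",
    "review-state": "Findings carry a review state that persists across runs.",
    "changed-scope": "Changed-scope review limits findings to files touched in a diff.",
    "sequencing": "Prefer one help call over several exploratory tool calls.",
}

def help_tool(topic: str) -> str:
    """Return a short static answer, or list valid topics on a miss."""
    if topic in HELP_TOPICS:
        return HELP_TOPICS[topic]
    return "Unknown topic. Valid topics: " + ", ".join(sorted(HELP_TOPICS))

print(help_tool("profiles"))
```

Even the failure path is cheap: a miss returns the valid topic list instead of forcing the agent into another round of guessing.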
Second: tighter token budgets across the board. b4 continued the budget-aware work from b3:
- The derived section in MCP payloads is projected down to what agents actually need.
- metrics_detail is paginated with family and path filters, so agents never pull the full metrics table by accident.

None of this changes the canonical report; the JSON is still the complete truth. But the MCP view over it is now meaningfully leaner.
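The filtered, paginated view over the metrics table can be sketched as follows. The function name echoes metrics_detail, but the parameter names and payload shape are assumptions for illustration, not CodeClone's actual API.

```python
# Hypothetical sketch of budget-aware pagination with family and path filters,
# in the spirit of the metrics_detail behavior described above.
from fnmatch import fnmatch

def metrics_detail(rows, family=None, path_glob=None, page=0, page_size=50):
    """Filter by metric family and path glob, then return one bounded page."""
    filtered = [
        r for r in rows
        if (family is None or r["family"] == family)
        and (path_glob is None or fnmatch(r["path"], path_glob))
    ]
    start = page * page_size
    return {
        "rows": filtered[start:start + page_size],
        "total": len(filtered),            # lets an agent decide whether to page on
        "has_more": start + page_size < len(filtered),
    }

rows = [
    {"family": "complexity", "path": "src/app/core.py", "value": 31},
    {"family": "complexity", "path": "src/app/util.py", "value": 4},
    {"family": "size", "path": "src/app/core.py", "value": 1200},
]
page = metrics_detail(rows, family="complexity", path_glob="src/app/*.py", page_size=1)
print(page["total"], page["has_more"])
```

Returning the total and a has_more flag alongside each page is the cheap part that keeps agents honest: they can see the full result size without ever being handed it.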
Some of my favorite changes in b4 are not flashy:
This is not the kind of work that looks impressive in a screenshot. But it is exactly the kind of work that makes an engineering tool feel trustworthy over weeks and months.
b4 feels like the point where the betas add up:

- b1: CodeClone became more than a clone detector.
- b3: it became a serious MCP server.
- b4: it started to feel coherent across the CLI, the report, MCP, and every client surface.
You can start in the editor. You can stay aligned with baseline-aware truth. You can inspect module-level pressure without turning it into fake gating. You can move between human and agent workflows without changing the underlying semantics.
That is much closer to what I wanted CodeClone to become.
```shell
uv tool install --pre codeclone          # core CLI (beta)
uv tool install --pre "codeclone[mcp]"   # + MCP server for agents and IDEs

codeclone .                              # analyze the current project
codeclone . --html --open-html-report    # open the interactive report
```
If you are building review workflows around IDEs, MCP clients, or AI-assisted refactoring, I would love feedback on one question:
What makes a structural analysis tool feel trustworthy once it leaves the CLI and starts living inside real developer workflows?