WSL 3 preview delivers near-native GPU/NPU for Claude Code + Ollama on Copilot+ laptops, but WSL 2 still handles NVIDIA CUDA fine for desktop users.
Microsoft announced WSL 3 on June 2, 2026 at Build. The headline: a paravirtualized hardware access layer replaces WSL 2's full Hyper-V VM, cutting GPU compute overhead from ~15-20% down to 3-5% of bare-metal Linux. More importantly, it exposes the NPU to Linux for the first time—not just the GPU.
The catch: This preview is locked to Copilot+ PCs with Snapdragon X Elite, Intel Meteor Lake, or Lunar Lake NPUs. AMD and discrete NVIDIA desktop setups aren't on the launch list.
If you run Claude Code or Aider on a Windows laptop with a local Ollama model, WSL 3 is the upgrade you've been waiting for. Here's the practical difference:
On an RTX desktop running WSL 2, ollama run qwen2.5-coder:14b is already using your GPU. Stay put.
For Claude Code specifically: the agent itself calls Anthropic's API, so your GPU isn't doing its inference. But if you pair Claude Code with a local model gateway (e.g., for code review, linting, or test generation), or run local tooling that the agent drives, the near-native I/O of WSL 3 cuts friction.
# In PowerShell
wsl --version
# If you see WSL version: 2.x.x.x, you're on WSL 2
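The same check can be scripted from inside a WSL shell. The sketch below is an illustration, not part of the original article: the helper name wsl_major_version is hypothetical, and it assumes the English "WSL version:" label that wsl --version prints. It strips NUL bytes and carriage returns first, because wsl.exe output can arrive UTF-16 encoded.

```shell
#!/bin/sh
# Hypothetical helper: print the major WSL version found in
# `wsl --version` output read from stdin (prints nothing if absent).
wsl_major_version() {
  # Drop NUL bytes and carriage returns, then capture the digits that
  # follow the "WSL version: " label. LC_ALL=C keeps sed byte-oriented.
  tr -d '\000\r' \
    | LC_ALL=C sed -n 's/.*WSL version: \([0-9][0-9]*\)\..*/\1/p' \
    | head -n 1
}

# Usage from a WSL shell (wsl.exe is reachable through Windows interop):
#   wsl.exe --version | wsl_major_version
```

On a localized Windows install the label text may differ, in which case the function prints nothing rather than guessing.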
You need a Windows Insider build from the Dev or Canary channel:
Settings → Windows Update → Windows Insider Program → Pick Dev or Canary channel → Install update → Reboot.
# Inside WSL (Ubuntu recommended)
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull a local model
ollama pull qwen2.5-coder:14b
# Install Claude Code (if you haven't)
npm install -g @anthropic-ai/claude-code
# Install Aider (recent Ubuntu releases block system-wide pip installs; use pipx or a virtualenv there)
pip install aider-chat
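Before moving on, it can help to confirm that all three tools actually landed on your PATH. This is a minimal sketch: missing_tools is a hypothetical helper name, and it assumes the usual binary names (ollama, claude for Claude Code, aider for Aider).

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not on PATH,
# one per line. Prints nothing when everything is found.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage inside WSL after the installs:
#   missing_tools ollama claude aider
```

An empty result means every tool resolved. Any name printed points at an install step that needs another look, often a PATH entry that only takes effect in a new shell.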
The most common failure: your editor (VS Code with Cline, Continue.dev, or Cursor) can't reach Ollama inside WSL because of the virtual NIC boundary.
Fix it — bind Ollama to all interfaces inside WSL:
# Inside WSL
export OLLAMA_HOST=0.0.0.0:11434
ollama serve
# Find the WSL IP
ip addr show eth0 | grep inet
# Example: 172.20.0.2
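Then point your editor at that address, for example http://172.20.0.2:11434, instead of localhost:11434. The WSL address can change after a restart, so re-check it if the connection breaks.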
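The lookup and the URL can be scripted so you don't copy the address by hand. This is a sketch under assumptions: the function names are hypothetical, the interface is assumed to be eth0 as in the commands above, and the probe uses Ollama's /api/tags endpoint, which lists locally pulled models.

```shell
#!/bin/sh
# Hypothetical helper: extract the first IPv4 address from
# `ip addr show` output read on stdin.
first_ipv4() {
  awk '$1 == "inet" { split($2, parts, "/"); print parts[1]; exit }'
}

# Hypothetical helper: build the base URL for an editor on the Windows side.
# $1 = address, $2 = port (defaults to Ollama's 11434).
ollama_url() {
  printf 'http://%s:%s\n' "$1" "${2:-11434}"
}

# Hypothetical helper: succeed if the Ollama HTTP API answers at the URL.
check_ollama() {
  curl -fsS --max-time 5 "$1/api/tags" >/dev/null
}

# Usage inside WSL, with `ollama serve` already running:
#   url=$(ollama_url "$(ip addr show eth0 | first_ipv4)")
#   check_ollama "$url" && echo "reachable at $url"
```

A probe run from inside WSL only confirms that Ollama is bound to the eth0 address. The request from the Windows-side editor still has to cross the virtual NIC, so test from there as well.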
# Check whether Ollama is using the GPU
ollama ps
# Lists loaded models; the PROCESSOR column should show GPU (for example "100% GPU"), not CPU
If you see CPU-only, your NPU or GPU isn't being passed through. On WSL 3, this should work automatically on supported hardware.
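To make that check scriptable, you can classify the ollama ps output instead of reading it by eye. This is a rough sketch: classify_processor is a hypothetical name, and it assumes the PROCESSOR column uses strings such as "100% GPU", "100% CPU", or a split like "48%/52% CPU/GPU", which may vary between Ollama versions.

```shell
#!/bin/sh
# Hypothetical helper: read `ollama ps` output on stdin and print
# gpu, cpu, mixed, or none (no model loaded).
# Assumption: matching is done on the whole row, so a model whose name
# itself contains "GPU" or "CPU" would confuse it.
classify_processor() {
  # Skip the header row (NAME ID SIZE PROCESSOR UNTIL).
  body=$(tail -n +2)
  case "$body" in
    *CPU/GPU*) echo "mixed" ;;
    *GPU*)     echo "gpu" ;;
    *CPU*)     echo "cpu" ;;
    *)         echo "none" ;;
  esac
}

# Usage inside WSL, with a model loaded:
#   ollama ps | classify_processor
```

A result of cpu or mixed on hardware that should be accelerated suggests that passthrough is not working or that the model does not fit in device memory.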
If you already run an RTX desktop with WSL 2, your CUDA-backed Aider and Cline setup is fine — stay put. WSL 3 is the real upgrade for Copilot+ laptop owners who want their NPU and GPU available to Linux coding agents without dual-booting. Treat it as preview, not production.
For everyone else: the single biggest bottleneck in your local AI coding workflow on Windows isn't the hypervisor — it's the network boundary between WSL and the host. Fix that first.
Source: dev.to
[Updated 24 Jun via devto_claudecode]
The NPU in a Copilot+ PC is not designed for coding agents — it powers OS features like Recall, Cocreator, and Live Captions, not Cursor or Claude Code [per dev.to]. The real local-coding breakthrough is NVIDIA's RTX Spark, announced at Computex 2026: a Grace Blackwell superchip with up to 128GB unified memory, 6,144 CUDA cores, and 1 petaflop AI performance, capable of running 120B-parameter models with 1M context tokens. It ships fall 2026, with pricing unannounced but likely well above the DGX Spark's $3,999–$4,699 range.
Originally published on gentic.news