
Nilesh Raut
If you are using Claude API, OpenAI API, Cursor, or AI coding tools daily, your API bill can grow very fast.
A lot of developers are now moving to local LLM setups because they want:
The good news is:
You can now run powerful AI models directly on your laptop using tools like Ollama, which lets you run LLMs locally.
This setup works great for:
Let’s set it up step by step.
Install Ollama normally, like any other software.
After installation, open Command Prompt or a terminal and check the version:
ollama --version
If you see a version number, it is installed correctly.
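If you want to run the same check from a script, a minimal Python sketch could look like this. It assumes only that the `ollama` executable is on your `PATH`; the helper name `ollama_version` is made up for illustration.

```python
import shutil
import subprocess


def ollama_version():
    """Return the text printed by `ollama --version`, or None if ollama is not on PATH."""
    exe = shutil.which("ollama")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some builds print extra warnings to stderr, so fall back to it if stdout is empty.
    return result.stdout.strip() or result.stderr.strip()
```

Calling `ollama_version()` returns `None` when Ollama is not installed, which is easy to turn into a friendly error message in a setup script.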
Now pull a model locally.
Example:
ollama pull llama3
Or for coding:
ollama pull qwen2.5-coder:7b
The first download may take a few minutes because models are several GB in size.
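To see which models are already downloaded and roughly how much disk space they use, you can ask the local Ollama server. The sketch below assumes Ollama is running on its default address, `http://localhost:11434`, and that the `/api/tags` response lists each model with a `name` and a `size` in bytes. The helper names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; change it if you moved the server


def summarize_models(tags_response):
    """Turn an /api/tags response dict into (name, size in GB) pairs."""
    return [
        (m["name"], round(m["size"] / 1e9, 1))
        for m in tags_response.get("models", [])
    ]


def list_local_models(base_url=OLLAMA_URL):
    """Fetch the list of pulled models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return summarize_models(json.load(resp))
```

With Ollama running, `list_local_models()` returns one tuple per pulled model, so you can check that a pull has finished before pointing an editor at it.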
Start chatting with the model:
ollama run llama3
Example:
>>> Explain Docker in simple words
You now have a local AI assistant running directly on your machine.
No cloud API required.
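The same model can also be reached from your own scripts, because Ollama serves a local HTTP endpoint. Here is a hedged sketch using only the Python standard library. It assumes the default address `http://localhost:11434` and the `/api/generate` route with a non-streaming request; `build_generate_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request


def build_generate_request(model, prompt, base_url="http://localhost:11434"):
    """Build a POST request for Ollama's /api/generate route (non-streaming)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model, prompt):
    """Send the prompt to the local model and return the generated text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.load(resp)["response"]
```

For example, `ask("llama3", "Explain Docker in simple words")` sends the same prompt as the interactive session above, and nothing leaves your machine.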
Install:
Both work with Ollama locally.
In Continue.dev config:
{
  "models": [
    {
      "title": "Local AI",
      "provider": "ollama",
      "model": "qwen2.5-coder:7b"
    }
  ]
}
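A common stumbling block is a config that names a model you have not pulled yet. The sketch below checks a config dictionary shaped like the one above against a list of locally available model names. It assumes Ollama treats a name without a tag as `:latest`; the function names are illustrative, not part of Continue.dev or Ollama.

```python
def normalize(name):
    """Add the implicit ':latest' tag when a model name has no tag."""
    return name if ":" in name else f"{name}:latest"


def missing_models(config, local_names):
    """Return the Ollama models named in the config that are not available locally."""
    local = {normalize(n) for n in local_names}
    return [
        entry["model"]
        for entry in config.get("models", [])
        if entry.get("provider") == "ollama"
        and normalize(entry["model"]) not in local
    ]


config = {
    "models": [
        {"title": "Local AI", "provider": "ollama", "model": "qwen2.5-coder:7b"}
    ]
}
print(missing_models(config, ["llama3:latest"]))  # ['qwen2.5-coder:7b']
```

Any name in the result needs an `ollama pull` before the editor can use it.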
Now VS Code can use your local model for:
You can also use a ChatGPT-like interface locally.
Install Open WebUI using Docker:
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
Open:
http://localhost:3000
Now you have your own private AI chat app.
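The container can take a little while to start, so the page may not load right away. A small sketch for checking whether anything is answering on the port yet, assuming the `3000` port mapping from the command above (`is_reachable` is a hypothetical helper):

```python
import urllib.error
import urllib.request


def is_reachable(url="http://localhost:3000", timeout=3):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # Bypass any system proxy, since the target runs on this machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar failures.
        return False
```

If `is_reachable()` keeps returning `False`, `docker logs open-webui` is usually the next place to look.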
| Model | Best For |
|---|---|
| Qwen2.5 Coder | Coding |
| DeepSeek Coder | Refactoring |
| Llama 3 | General AI |
| Phi | Low-end laptops |
| Mistral | Fast responses |
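If you switch models often, the table can be turned into a small lookup. This is only a sketch: `qwen2.5-coder:7b` and `llama3` are the tags used earlier in this post, while the other three tags are assumptions about how those models are named in the Ollama library, so check them before pulling.

```python
# Task -> Ollama model tag, following the table above.
MODEL_FOR_TASK = {
    "coding": "qwen2.5-coder:7b",    # tag used earlier in this post
    "general": "llama3",             # tag used earlier in this post
    "refactoring": "deepseek-coder", # assumed tag
    "low-end": "phi3",               # assumed tag
    "fast": "mistral",               # assumed tag
}


def pick_model(task, default="llama3"):
    """Return the suggested model tag for a task, or the default for unknown tasks."""
    return MODEL_FOR_TASK.get(task.lower(), default)
```

For example, `pick_model("Coding")` gives `qwen2.5-coder:7b`, and an unknown task falls back to `llama3`.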
Basic setup:
CPU-only works too, but it is slower.
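A rough way to judge whether a model fits in your RAM or VRAM is to multiply its parameter count by the space each weight takes. This is a back-of-the-envelope sketch, not a figure from Ollama: it assumes 4-bit quantized weights (an assumption about the build you pull) and ignores the context cache and runtime overhead, so treat the result as a lower bound.

```python
def weights_size_gb(num_params, bits_per_weight=4):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


print(weights_size_gb(7e9))      # 3.5  (7B parameters at 4 bits)
print(weights_size_gb(7e9, 16))  # 14.0 (same model at 16 bits)
```

By this estimate a 7B model such as `qwen2.5-coder:7b` needs at least about 3.5 GB of free memory for its weights alone.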
Main reasons:
For daily coding workflows, local LLMs are becoming surprisingly useful.
Cloud models are still stronger for advanced reasoning, but local AI is now good enough for many real-world tasks.
If you are spending too much on AI APIs, this is probably the easiest way to reduce costs.
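To get a feel for the possible savings, you can do the arithmetic on your own usage. The numbers below are placeholders, not real pricing: a hypothetical 2 million tokens per day at a hypothetical 3 dollars per million tokens, with 60% of that traffic moved to a local model.

```python
def monthly_cost(tokens_per_day, price_per_million, days=30):
    """API cost for a month, given daily token volume and a price per million tokens."""
    return tokens_per_day / 1_000_000 * price_per_million * days


def monthly_savings(tokens_per_day, price_per_million, local_fraction, days=30):
    """Cost avoided when `local_fraction` of the traffic runs on a local model."""
    return monthly_cost(tokens_per_day, price_per_million, days) * local_fraction


print(monthly_cost(2_000_000, 3.0))                    # 180.0
print(round(monthly_savings(2_000_000, 3.0, 0.6), 2))  # 108.0
```

Plug in your real token counts and your provider's current price to see whether the setup time pays off.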
Start simple:
That alone can replace a large percentage of your daily AI usage.
Useful links: