
Imagine being able to run the future before it happens. Not as a statistical forecast or a gut feeling, but as a living simulation where thousands of autonomous AI agents carry individual personalities, memories, and social relationships, and play out scenarios in a compressed digital sandbox. You upload the seed material, whether that is a news article, a policy draft, or the opening chapters of a novel, and MiroFish returns a detailed prediction report along with an interactive world you can continue to question.
That is what MiroFish is. Built by a small team backed by Shanda Group, one of China's oldest internet companies, MiroFish is an open source, self-hosted Swarm Intelligence Engine that combines multi agent AI simulation, knowledge graph construction using GraphRAG, and long-term agent memory. As of March 2026, the GitHub repository has over 32,300 stars and 4,100 forks, putting it among the most watched AI projects of the year.
MiroFish describes itself as a Simple and Universal Swarm Intelligence Engine, Predicting Anything. That tagline sounds bold, but the technical architecture behind it is genuinely distinct from most prediction tools on the market.
Rather than asking a single large language model to predict an outcome, which tends to bake in one model's biases and produce a single opinion, MiroFish populates a digital world with thousands of AI agents. Each agent carries an independently generated personality, backstory, and memory. Those agents then interact, react to injected events, form opinions, and change socially over time. The result is emergent group behaviour that no single agent knows in isolation.
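MiroFish's real agent loop lives in the OASIS framework, but the core idea of emergent group opinion can be illustrated with a toy sketch. Everything below is invented for illustration, including the class names, the update weights, and the blend of persona bias with the social feed; it is not MiroFish's actual code.

```python
import random

class Agent:
    """A toy agent: a fixed persona bias, a current opinion, and a memory log."""
    def __init__(self, name, bias):
        self.name = name
        self.bias = bias          # persona leaning in [-1, 1], never changes
        self.opinion = bias       # current stance, updated each round
        self.memory = []          # append-only record of past opinions

    def react(self, feed):
        # Blend the agent's prior opinion, the crowd's average, and its persona.
        avg = sum(feed) / len(feed)
        self.opinion = 0.7 * self.opinion + 0.2 * avg + 0.1 * self.bias
        self.memory.append(self.opinion)

def run_simulation(agents, rounds):
    """Each round, every agent reads everyone's current opinion and reacts."""
    for _ in range(rounds):
        feed = [a.opinion for a in agents]
        for a in agents:
            a.react(feed)
    return sum(a.opinion for a in agents) / len(agents)

random.seed(0)
agents = [Agent(f"agent{i}", random.uniform(-1, 1)) for i in range(100)]
consensus = run_simulation(agents, rounds=20)
```

Even this crude version shows the key property: the final consensus is not any single agent's view, it emerges from repeated interaction between heterogeneous personas.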
The simulation engine is powered by OASIS, an open source multi agent social simulation framework from CAMEL AI. MiroFish adds a full product layer on top: a Vue based frontend, a Python and FastAPI backend, GraphRAG based knowledge graph construction, Zep Cloud for long term memory, and a Report Agent that brings together findings into a structured document.
The team's stated goal: rehearse the future in a digital sandbox, so that every decision has already been tested across a hundred simulated rounds.
You upload your seed material. This can be a data analysis report, a set of news articles, a financial briefing, or a piece of fiction. MiroFish extracts entities and relationships, builds a knowledge graph using GraphRAG, and loads both individual and group memories into the agent population before any simulation begins.
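The ingest step boils down to turning unstructured seed text into (subject, relation, object) triples and assembling them into a graph. Real GraphRAG uses an LLM for the extraction; in this purely illustrative sketch a regex over a hypothetical "A -> relation -> B" note format stands in for it.

```python
import re
from collections import defaultdict

def extract_triples(text):
    # Hypothetical seed format: one "Subject -> relation -> Object" per line.
    # A production pipeline would use an LLM extraction prompt here instead.
    pattern = re.compile(r"(\w[\w ]*)\s*->\s*(\w[\w ]*)\s*->\s*(\w[\w ]*)")
    return [m.groups() for m in pattern.finditer(text)]

def build_graph(triples):
    # Adjacency map: entity -> list of (relation, target entity) edges.
    graph = defaultdict(list)
    for subj, rel, obj in triples:
        graph[subj.strip()].append((rel.strip(), obj.strip()))
    return dict(graph)

seed = """University -> announced -> Policy
Students -> opposed -> Policy
Press -> covered -> Students"""
graph = build_graph(extract_triples(seed))
```

The resulting graph is what the downstream agents are seeded from: persona generation reads the entities, and agent memories are initialised from the relationships.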
An Environment Configuration Agent reads the knowledge graph and generates the simulation parameters. This includes which agent personas exist, what the social environment looks like, and what the starting conditions are. Agent personalities are generated to reflect the demographics, power structures, and cultural context embedded in the seed material.
The simulation runs on a dual platform architecture for parallel processing. Agents interact autonomously, posting, reacting, debating, and forming coalitions, while the system updates each agent's memory as rounds progress. You can inject new variables mid simulation from a God View interface. Drop in a breaking news event and watch how the social environment responds.
A dedicated Report Agent has access to a full toolset and goes through the post simulation environment to produce a structured prediction report. The report pulls together emergent patterns, majority and minority opinion paths, likely event sequences, and confidence levels.
After the simulation ends, the world stays live. You can have a direct conversation with any individual agent. Ask a simulated political journalist their view, probe a simulated consumer about a purchase decision, or question a simulated executive about their thinking. You can also continue the conversation with the Report Agent for further analysis.
The possible applications reach well beyond a single industry. The examples below show realistic use cases at different levels of ambition.
The team has already shown this working in practice. Their live demo simulates public opinion dynamics around a real university campus controversy in China, producing a full sentiment report from a social analytics document.
In one of MiroFish's most striking demonstrations, the team fed in the first 80 chapters of Dream of the Red Chamber, one of China's great classical novels whose ending was lost to history, and had MiroFish simulate the probable fate of the characters. The output was an emergent narrative extrapolation, not written by an author but evolved through thousands of agent interactions shaped by the characters' established personalities and relationships.
Similar use cases exist for screenwriters exploring plot outcomes, game narrative designers, and interactive fiction studios.
MiroFish offers two deployment paths. Source code deployment is the recommended approach for development and enterprise use. Docker is available for simpler self hosted setups.
| Tool | Version | Purpose | Check Command |
|---|---|---|---|
| Node.js | 18 or above | Frontend runtime including npm | node -v |
| Python | 3.11 to 3.12 | Backend runtime for FastAPI | python --version |
| uv | Latest | Python package manager | uv --version |
| Docker (optional) | Latest | Containerised deployment | docker -v |
Copy the example config file and fill in your API keys:
```shell
cp .env.example .env
```
Two API keys are required to run MiroFish, alongside two model configuration variables:
| Variable | Purpose | Provider |
|---|---|---|
| LLM_API_KEY | Powers all agent reasoning and text generation | Any OpenAI compatible provider |
| LLM_BASE_URL | API endpoint, for example Alibaba Bailian or OpenAI | Configurable per provider |
| LLM_MODEL_NAME | Model name to use, for example qwen-plus or gpt-4o | Configurable per provider |
| ZEP_API_KEY | Long term memory storage for agents across simulation rounds | Zep Cloud, free tier available |
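Filled in, the file might look like the example below. The variable names come from the table above; the key values are placeholders, and the endpoint URL shown is Alibaba Bailian's OpenAI compatible address, given here only as an illustration.

```
# LLM provider (any OpenAI compatible endpoint)
LLM_API_KEY=sk-your-key-here
LLM_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
LLM_MODEL_NAME=qwen-plus

# Long term agent memory
ZEP_API_KEY=z-your-zep-key-here
```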
Recommended starting point: Alibaba's qwen-plus model via the Bailian platform. The team advises running fewer than 40 simulation rounds initially to keep API costs manageable while you learn how the system behaves.
One npm command installs everything: Node packages for the frontend and Python packages for the backend, placed in an automatically created virtual environment.
```shell
npm run setup:all
```

Once setup completes, start the development servers:

```shell
npm run dev
```
This starts both the frontend and backend at the same time. The services run at:

- Frontend: http://localhost:3000
- Backend: http://localhost:5001
For a simpler containerised deployment, use Docker instead:

```shell
docker compose up -d
```
The most important practical constraint when running MiroFish is LLM API cost. Because the engine runs thousands of agents through multiple simulation rounds, each agent calling the language model to think, react, and update its memory, token consumption is significantly higher than a standard single agent chatbot.
| Scenario | Agents | Rounds | Model Used | Estimated Cost (USD) |
|---|---|---|---|---|
| Quick prototype | 50 | 20 | qwen-plus or GPT 3.5 class | $0.50 to $2 |
| Standard analysis run | 500 | 40 | qwen-plus | $8 to $25 |
| Full simulation | 1,000 | 100 | qwen-plus | $40 to $120 |
| Large enterprise run | 2,000+ | 200+ | GPT 4o or Claude Sonnet | $200 to $800+ |
| Budget option | 1,000 | 100 | DeepSeek or Gemini Flash | $5 to $20 |
These figures are estimates based on current LLM pricing and typical token patterns for multi agent simulations. Actual costs vary based on seed material length, how memory accumulates, and the specific model chosen.
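A rough cost model makes the scaling intuition concrete: token spend grows with agents times rounds. The tokens-per-turn figure and the per-million-token price below are assumptions chosen to land inside the table's ranges, not measured values; adjust both for your model and seed material.

```python
def estimate_cost_usd(agents, rounds, tokens_per_turn=1500,
                      price_per_million_tokens=0.40):
    """Back-of-envelope: every agent takes one LLM turn per round.

    tokens_per_turn and price_per_million_tokens are assumed defaults
    in the ballpark of a qwen-plus-class model; tune them to your provider.
    """
    total_tokens = agents * rounds * tokens_per_turn
    return total_tokens / 1_000_000 * price_per_million_tokens

# A 500-agent, 40-round run under these assumptions:
cost = estimate_cost_usd(500, 40)   # within the table's $8 to $25 band
```

The linear growth in both dimensions is why the team recommends short runs first: doubling rounds or agents doubles spend.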
Cost tip: For lower costs, providers like DeepSeek V3 or Google Gemini Flash offer OpenAI compatible APIs at a fraction of the price. MiroFish works with any OpenAI SDK compatible endpoint, so switching is straightforward.
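Because MiroFish only needs an OpenAI compatible endpoint, switching providers amounts to changing the three LLM variables in .env. The sketch below renders those lines from a small preset table; the endpoint URLs and model names are illustrative examples, not an official provider list.

```python
# Hypothetical provider presets; verify URLs and model names with each vendor.
PROVIDER_PRESETS = {
    "openai": {
        "LLM_BASE_URL": "https://api.openai.com/v1",
        "LLM_MODEL_NAME": "gpt-4o",
    },
    "deepseek": {
        "LLM_BASE_URL": "https://api.deepseek.com/v1",
        "LLM_MODEL_NAME": "deepseek-chat",
    },
}

def env_lines(provider, api_key):
    """Render the .env lines for the chosen provider preset."""
    preset = PROVIDER_PRESETS[provider]
    lines = [f"LLM_API_KEY={api_key}"]
    lines += [f"{key}={value}" for key, value in preset.items()]
    return "\n".join(lines)

print(env_lines("deepseek", "sk-your-key-here"))
```

Swapping presets changes nothing else in the deployment, which is what makes cost experimentation across providers cheap.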
Long term agent memory is handled by Zep Cloud. The free tier covers early experimentation and lighter usage. For production deployments with persistent memory across thousands of agents in multiple sessions, paid Zep plans start at roughly $50 to $200 per month depending on memory volume.
This is the fair question to ask. MiroFish is at version 0.1.2, released in March 2026, which is early by any measure. The honest answer has two sides.
Over 32,000 GitHub stars signal that developers find this genuinely worthwhile. That said, enthusiasm and production readiness are not the same thing. MiroFish today is best described as a compelling research and strategy tool, not yet a fully production ready enterprise platform.
For technically capable individuals and small agencies, MiroFish represents a real commercial opportunity. The open source AGPL 3.0 licence permits hosting and selling managed services built on top of it, though it is worth understanding the copyleft obligations that apply if you modify the core code and distribute those changes.
Most enterprises, including public relations agencies, financial services firms, management consultancies, and political campaign teams, do not have the technical capability to set up and operate a multi agent simulation engine. They want the output, not the infrastructure. This creates a clear opportunity for someone to provide a managed service layer around MiroFish.
| Model | Target Client | Pricing Approach | Complexity |
|---|---|---|---|
| Managed simulation as a service | PR agencies, strategy consultancies | Per simulation or monthly subscription | Medium |
| White label instance for enterprise | Listed companies, government bodies | Annual licence plus setup fee | High |
| Research service bureau | Academic institutions, think tanks | Project based retainer | Low to medium |
| Creative simulation studio | Game studios, publishers, screenwriters | Per project | Low |
| Political intelligence service | Campaign teams, policy advisers | Confidential retainer | Medium to high |
| Component | Option | Monthly Cost (USD) |
|---|---|---|
| Server for the backend | 4 core, 8 GB RAM VPS such as DigitalOcean or Hetzner | $20 to $60 |
| LLM API budget for client runs | Typically passed through to the client at cost or with a margin | $50 to $500+ |
| Zep Cloud memory | Pro tier for multi client use | $50 to $200 |
| Domain and SSL | Standard web hosting | Around $10 |
| Monitoring | Grafana Cloud free tier or similar | $0 to $30 |
| Total operating overhead | Before LLM API pass through | $80 to $300 per month |
With LLM costs passed through to clients, an individual operator could reasonably charge $500 to $3,000 per simulation engagement at current market rates. The hard infrastructure overhead is $80 to $300 per month. The value being charged for is operational knowledge, prompt configuration, report interpretation, and client management, not the server.
MiroFish is one of the most genuinely interesting open source AI projects to appear in 2026. It takes a distinct approach to prediction by using emergent multi agent social simulation rather than asking a single model for an answer, and it packages that approach in a deployable, self hosted stack with a polished frontend.
At this stage it is best suited to qualitative strategic prediction: public relations and crisis scenario planning, narrative extrapolation, social sentiment forecasting, and policy review. The economics work for targeted use cases but not yet for large scale continuous monitoring. English documentation is still maturing and the community outside of China is still forming.
For technical entrepreneurs and AI service providers, the business case is real. Infrastructure costs are modest, the capability is genuinely different from anything most enterprise clients have access to, and buyers in strategy, communications, and finance have both the budget and the kind of problems that MiroFish is built to address. The gap between the open source tool and enterprise ready deployment is exactly where a service provider can create lasting value.
This repository is worth watching. Thirty-two thousand stars are not there by accident.
GitHub: github.com/666ghj/MiroFish
Website: mirofish.ai
Built on OASIS by CAMEL AI