Most people use Gemini's free quota on 15 chat sessions. We use the same 1,500 RPD to run 25 timers, 4 AI Agents, and 105 daily tasks for full business automation. Monthly cost: $0. This article reveals the complete architecture, RPD budget breakdown, pitfall log, and every optimization trick.
"Most people spend 1,500 RPD on 15 chat sessions. We spend it on 105 tasks to run an entire company."
I'm the founder of Ultra Lab, a one-person tech services brand based in Taiwan. No employees, no assistants -- 4 AI Agents work around the clock to post content, reply to comments, follow up on leads, conduct research, and run strategy meetings for me.
All running on Google Gemini 2.5 Flash free tier. Monthly cost: $0.
This article reveals the complete architecture, RPD budget allocation, pitfalls we've hit, and every trick for squeezing maximum value out of free-tier quota.
Let's start with the numbers. The Gemini 2.5 Flash free tier gives you 1,500 requests per day (RPD).
How do most people use it? Open a long conversation, go back and forth for 20 turns, and a single session burns 100 RPD. Do that 15 times and your daily quota is gone.
But what if you design every request as a short, precise, one-shot task?
1,500 RPD suddenly becomes 1,500 work units. That's enough to run a full company's automation.
+-------------------------------------------------+
| OpenClaw Gateway |
| (WSL2 Ubuntu, port 18789) |
+---------+----------+----------+-----------------+
| Main |MindThread| Probe | Advisor |
| (CEO) | (Social) | (SecRes) | (Advisor) |
+---------+----------+----------+-----------------+
| 25 systemd timers |
| 62 bash/node scripts |
| 19 intelligence .md files |
+-------------------------------------------------+
| blogwatcher | hn-trending | summarize | curl |
| (RSS) | (HN API) | (Jina) | (HTTP) |
| ^ all 0 LLM cost (pure HTTP) ^ |
+-------------------------------------------------+
4 Agents, each with their own specialty:
| Agent | Role | Moltbook Account |
|---|---|---|
| UltraLabTW | CEO + Brand Strategy | @ultralabtw |
| MindThreadBot | Social Automation Specialist | @mindthreadbot |
| UltraProbeBot | AI Security Researcher | @ultraprobebot |
| UltraAdvisor | Financial Advisor | @ultraadvisor |
Hardware: One Windows desktop running WSL2. No Mac Mini needed.
This is the complete upgrade path that takes an Agent Fleet from "functional" to "maxed out."
The simplest and most effective upgrade. Instead of "generate and publish," every post now goes through:
Generate draft (1 call)
|
Self-review: "Score this 1-10" (1 call)
|
< 7 -> Rewrite (1 call)
>= 7 -> Publish as-is
Implementation -- add this to the autopost script:
# === Quality Gate ===
REVIEW_PROMPT="Review this draft. Score 1-10.
TITLE: ${TITLE}
CONTENT: ${CONTENT:0:500}
If >= 7: output APPROVED
If < 7: output REWRITE then a better TITLE:/--- version."
REVIEW=$(openclaw agent --agent main --message "$REVIEW_PROMPT")
if echo "$REVIEW" | grep -qi "REWRITE"; then
# Parse rewritten version
TITLE=$(echo "$REVIEW" | grep "^TITLE:" | head -1 | sed 's/^TITLE: *//')
CONTENT=$(echo "$REVIEW" | sed '1,/^---$/d')
log "Quality gate: REWRITE"
else
log "Quality gate: APPROVED"
fi
Impact: 8 posts x 2 calls = 16 RPD, and content quality improves noticeably -- weak drafts no longer go out unreviewed.
This layer is completely free. Inject existing intelligence files into the posting prompt:
# Inject own posting performance
PERF_FILE="$HOME/.openclaw/workspace/POST-PERFORMANCE.md"
if [ -f "$PERF_FILE" ]; then
RESEARCH_CONTEXT="${RESEARCH_CONTEXT}$(head -30 "$PERF_FILE")"
fi
# Inject competitor intelligence
COMP_FILE="$HOME/.openclaw/workspace/COMPETITOR-INTEL.md"
if [ -f "$COMP_FILE" ]; then
RESEARCH_CONTEXT="${RESEARCH_CONTEXT}$(head -25 "$COMP_FILE")"
fi
Before writing, the Agent already knows:
Impact: 0 extra RPD, but content shifts from "LLM imagination" to "data-driven."
Most Agents on social media are drive-by: post -> leave. We're different.
# Track conversation depth
THREAD_DEPTH=$(node -e "
const t = JSON.parse(require('fs').readFileSync('$THREAD_FILE','utf8'));
console.log(t['$POST_ID'] || 0);
")
# Max 2 reply rounds to avoid infinite loops
if [ "$THREAD_DEPTH" -ge 2 ]; then
log "Thread depth limit reached, skipping"
continue
fi
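The write side of that counter -- bumping the depth after each reply goes out -- can be sketched the same way. The file path and post id below are placeholders; the production script uses its own:

```shell
# After sending a reply, bump the depth counter for this post.
# THREAD_FILE and POST_ID are placeholders here, matching the
# variables used by the depth check above.
THREAD_FILE="${THREAD_FILE:-$(mktemp)}"
POST_ID="${POST_ID:-demo-post}"
[ -s "$THREAD_FILE" ] || echo '{}' > "$THREAD_FILE"   # start with an empty map

node -e "
const fs = require('fs');
const file = '$THREAD_FILE';
const t = JSON.parse(fs.readFileSync(file, 'utf8'));
t['$POST_ID'] = (t['$POST_ID'] || 0) + 1;   // record one more reply round
fs.writeFileSync(file, JSON.stringify(t, null, 2));
"
```

Bumping the counter immediately after sending (not before) means a failed send doesn't burn one of the two allowed rounds.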
Reply-checker v2 will:
Impact: Social presence goes from "bulletin board" to "engaging conversation."
Have another Agent review a post before publishing:
# peer-review.sh -- cross-review drafts
RESPONSE=$(openclaw agent --agent "$REVIEWER" --message \
"Review this teammate's draft. APPROVED or SUGGESTION: [one fix]")
Main's posts get reviewed by Probe: "Any security angles to add?"
Probe's posts get reviewed by Main: "Would a non-technical person understand this?"
Impact: Cross-perspective = fewer blind spots.
Automatically triggered every Sunday at 12:00:
Step 1: 3 Agents each read all intelligence files and propose Top 3 priorities for next week
Step 2: Main (CEO) Agent synthesizes all proposals into the final strategy
Step 3: Output written to STRATEGY-NEXT-WEEK.md -> readable by all Agents
for AGENT in main mindthread probe; do
RESPONSE=$(openclaw agent --agent "$AGENT" --message \
"Based on this week's data, propose TOP 3 priorities for next week.
$PERF_DATA $COMP_DATA $INQUIRY_DATA")
  PROPOSALS="${PROPOSALS}
### ${AGENT}: ${RESPONSE}
"
done
# CEO synthesizes
FINAL=$(openclaw agent --agent main --message \
"Synthesize these proposals into next week's strategy: $PROPOSALS")
Impact: Agents don't just execute -- they reflect and plan.
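The Sunday trigger itself is just a systemd user timer. A minimal sketch -- unit names here are illustrative; the open-sourced repo ships its own timer files:

```ini
# weekly-strategy.timer -- fires the strategy meeting every Sunday at 12:00
[Unit]
Description=Weekly strategy meeting for the agent fleet

[Timer]
OnCalendar=Sun 12:00
Persistent=true   ; catch up if the machine was off at noon

[Install]
WantedBy=timers.target
```

A matching `weekly-strategy.service` runs the script; enable both with `systemctl --user enable --now weekly-strategy.timer`.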
blogwatcher (RSS) --> New article URLs
hn-trending (API) --> High-scoring URLs
|
summarize (Jina Reader) --> Full-text markdown
| ^ 0 LLM cost
Agent analysis (1 call) --> RESEARCH-NOTES.md
|
Next autopost cites real data
The key: RSS monitoring, HN scraping, and URL summarization are all pure HTTP -- 0 LLM cost. Only the final "What does this mean for our clients?" uses 1 LLM call.
# Only process new URLs (dedup via seen list)
SUMMARY=$(timeout 20 summarize "$URL" | head -c 2000)
ANALYSIS=$(openclaw agent --agent main --message \
"Analyze this for business relevance: $SUMMARY")
echo "$ANALYSIS" >> RESEARCH-NOTES.md
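The "seen list" dedup in that first comment is what keeps the chain cheap: a URL only ever hits the LLM step once. A sketch, with an illustrative file path (the real script keeps its own list):

```shell
# Dedup sketch: only pass each URL through the costly LLM step once.
# SEEN_FILE is illustrative; the production script keeps its own list.
SEEN_FILE="${SEEN_FILE:-$(mktemp)}"
touch "$SEEN_FILE"

process_url() {
  local url="$1"
  if grep -qxF "$url" "$SEEN_FILE"; then
    echo "skip $url"            # already analyzed -- 0 RPD spent
    return
  fi
  echo "$url" >> "$SEEN_FILE"   # mark first, so a retry can't double-spend
  echo "analyze $url"           # the real script calls summarize + the agent here
}

process_url "https://example.com/a"
process_url "https://example.com/a"   # second call is a no-op
```

Marking the URL before analyzing it trades a rare lost article for a guarantee that retries never spend quota twice.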
Here's the actual daily consumption:
| Task | Frequency | Daily RPD | Category |
|---|---|---|---|
| Autopost x 4 agents | 2x/day | 8 | Content |
| Quality gate self-review | 1 per post | 8 | Content |
| Quality gate rewrite | ~50% trigger rate | ~4 | Content |
| Engage x 4 agents | 1x/day | 4 | Engagement |
| Reply-checker | 2x/day | ~15 | Engagement |
| Cross-engage | 2x/week | ~1 | Engagement |
| Research chain | 2x/day | ~8 | Intelligence |
| Daily reflect | 1x/day | 4 | Operations |
| Daily briefing | 1x/day | 1 | Operations |
| Auto-respond | Trigger-based | ~1 | Operations |
| Lead follow-up | Trigger-based | ~1 | Operations |
| Blog-to-social | Trigger-based | ~0.5 | Content |
| Weekly strategy | Sunday | ~0.7 | Strategy |
| **Total** | | ~56-105 | |
| **Remaining** | | ~1,395-1,444 | For interactive use with Agents |
RPD utilization: roughly 4-7%. That leaves 93-96% of the quota for interactive use.
05:00 | research-chain -> RESEARCH-NOTES.md
05:30 | MindThread data sync -> MINDTHREAD-DATA.md
06:00 | Customer insights sync + inquiry tracking -> INQUIRY-STATUS.md
06:30 | competitor-watch -> COMPETITOR-INTEL.md
|
07:00 | autopost-probe (reads all intel -> quality gate -> publish)
08:00 | autopost-main
09:00 | autopost-mindthread
10:00 | autopost-advisor + engage x 4 (staggered 15 min apart)
|
11:00 | reply-checker (conversation management)
12:00 | inquiry tracking (round 2)
14:00 | blog-to-social (if new articles exist)
|
17:00 | research-chain (round 2) + daily-briefing
18:00 | inquiry tracking (round 3)
|
19-22 | autopost round 2 (4 agents)
22:00 | post-stats -> POST-PERFORMANCE.md
23:00 | reply-checker (round 2) + daily-reflect
|
Sun 12 | weekly-strategy -> STRATEGY-NEXT-WEEK.md
Tue/Fri| cross-engage (cross-Agent interaction)
Notice the data flow direction: upstream produces intelligence, downstream consumes it. The research chain runs at 05:00, so the 07:00 autopost can cite the latest data. Post-stats runs at 22:00, so the next day's autopost knows which headlines performed well.
Each Agent's workspace contains these .md files, all auto-updated:
| File | Source | Update Frequency / Contents |
|---|---|---|
| POST-PERFORMANCE.md | post-stats.sh | Daily 22:00 |
| COMPETITOR-INTEL.md | competitor-watch.sh | Daily 06:30 |
| RESEARCH-NOTES.md | research-chain.sh | 2x/day |
| INQUIRY-STATUS.md | inquiry-tracker.js | Every 6 hours |
| CUSTOMER-INSIGHTS.md | Firestore sync | Daily 06:00 |
| MINDTHREAD-DATA.md | Firestore sync | Daily 05:30 |
| STRATEGY-NEXT-WEEK.md | weekly-strategy.sh | Weekly (Sunday) |
| HEALTH-STATUS.md | health-monitor.sh | Hourly |
| IDENTITY.md | Manually maintained | Agent personality and product knowledge |
| STRATEGY.md | Manual + Agent updates | OKRs and decision framework |
| PRODUCTS.md | Manually maintained | Product knowledge base |
Agents don't need to "remember" anything -- they just read the latest .md files every time they're called. That's why short tasks are more efficient than long conversations: context is pre-computed and doesn't need to be restated during a conversation.
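The "read the latest .md files every time" pattern is a few lines of bash. A sketch, using the workspace layout from the article (the function name and truncation limits are illustrative):

```shell
# build_context: concatenate the freshest intelligence into one prompt prefix.
# Paths follow the article's workspace layout; the function name is illustrative.
build_context() {
  local ws="${1:-$HOME/.openclaw/workspace}"
  local ctx=""
  # Truncate each file so the one-shot prompt stays small (and cheap).
  for f in STRATEGY-NEXT-WEEK.md POST-PERFORMANCE.md RESEARCH-NOTES.md; do
    [ -f "$ws/$f" ] && ctx="${ctx}
## ${f}
$(head -20 "$ws/$f")"
  done
  printf '%s\n' "$ctx"
}

# Prepend to any task prompt -- no conversation history needed.
CONTEXT=$(build_context)
```

Because the files are pre-computed by upstream timers, this costs 0 RPD; the agent call that follows is the only billable request.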
These tools consume zero LLM quota:
# blogwatcher -- monitors 5 AI industry blogs via RSS
blogwatcher scan --json
# Tracks: LangChain, OpenAI, Anthropic, Google AI, OWASP LLM
# hn-trending -- HN trending articles
hn-trending 10 --json
# Returns: title, url, score, comments
# summarize -- URL -> Markdown (Jina Reader)
summarize "https://example.com/article"
# Returns: clean markdown full-text
# curl -- Moltbook API, Firestore, Telegram Bot API
# All REST APIs, 0 LLM cost
Core principle: If it can be solved with HTTP, never use an LLM. LLMs only handle work that requires "thinking."
One API key had been created from the Google Cloud Console, in a project with billing enabled. Result: requests on that key were billed at paid-tier rates instead of drawing on the free quota.
Fix: Always create keys from AI Studio, never from a GCP project with billing. Use openclaw secrets audit to verify all key sources.
Pillar rotation used day_of_year % 5 -- running multiple times on the same day always picked the same pillar.
# Bad
PILLAR_INDEX=$(( DAY_OF_YEAR % 5 ))
# Good -- different pillar for each post
POST_SLOT=$(( DAY_OF_YEAR * 2 + HOUR / 12 ))
PILLAR_INDEX=$(( POST_SLOT % 5 ))
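A quick sanity check for the fixed rotation -- the 08:00 and 19:00 runs on the same day must now land on different pillars (the helper function is just for illustration):

```shell
# Sanity check for the fixed rotation: two runs on the same day
# (morning and evening) must pick different pillars.
pillar_for() {
  local day="$1" hour="$2"
  local slot=$(( day * 2 + hour / 12 ))   # hour/12 is 0 before noon, 1 after
  echo $(( slot % 5 ))
}

MORNING=$(pillar_for 100 8)    # day 100, 08:00 -> slot 200 -> pillar 0
EVENING=$(pillar_for 100 19)   # day 100, 19:00 -> slot 201 -> pillar 1
echo "morning=$MORNING evening=$EVENING"
```

With `day_of_year * 2` slots, consecutive days also keep advancing through the five pillars instead of repeating a fixed weekly pattern.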
The health check script called getUpdates, which conflicted with the gateway's own long-polling. Result: Telegram returned 409 Conflict errors and the gateway stopped receiving messages.
Lesson: Never call getUpdates from diagnostic scripts.
Accumulated unread comments were all replied to at once, consuming the entire rate limit and starving all other tasks.
Lesson: Set a cap for backlog clearing. Or better yet -- run the reply-checker more frequently but limit it to 5 replies per run.
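That per-run cap is a few lines of bash. A sketch -- the comment list here is a stand-in for whatever the reply-checker fetches from the API:

```shell
# Backlog cap sketch: reply to at most 5 comments per run and leave the
# rest for the next timer tick, so one backlog can't starve other tasks.
COMMENTS_FILE=$(mktemp)
seq 1 12 > "$COMMENTS_FILE"   # demo backlog of 12 unread comment ids

MAX_REPLIES=5
REPLIED=0
while IFS= read -r comment_id; do
  [ "$REPLIED" -ge "$MAX_REPLIES" ] && break   # stop before burning the quota
  REPLIED=$(( REPLIED + 1 ))                   # the real script calls the agent here
done < "$COMMENTS_FILE"

echo "handled $REPLIED of $(wc -l < "$COMMENTS_FILE") this run"
```

Unprocessed comments stay unread, so the next run picks them up -- the backlog drains over a few ticks instead of in one quota-destroying burst.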
The Moltbook API must use www.moltbook.com. The non-www version (moltbook.com) strips the Authorization header. All 6 skill scripts had it wrong. Took an hour to debug.
Long conversation mode (most people):
Human -> Agent -> Human -> Agent -> Human -> Agent
Each turn carries full history, context snowballs
20-turn conversation ~ 100 RPD, produces 1 result
Short task mode (us):
Timer triggers -> read .md intelligence -> 1 prompt -> 1 response -> done
1 task = 1 RPD, produces 1 result
What's the difference?
Context is pre-computed. POST-PERFORMANCE.md already has performance rankings calculated -- the Agent doesn't need a "please analyze my recent post performance" back-and-forth.
Every request is self-contained. No dependency on conversation history, no "last time we discussed..." needed.
Research steps don't use LLMs. RSS monitoring, HN scraping, URL summarization are all HTTP. Only the final analysis uses an LLM.
Failures are isolated. One task failing doesn't affect the other 24. In a long conversation, a mid-session error wastes the entire context.
| Item | Monthly Cost |
|---|---|
| Gemini 2.5 Flash | $0 (free tier) |
| Vercel hosting | $0 (hobby plan) |
| Firebase Firestore | $0 (free tier) |
| Resend email | $0 (100 emails/day free) |
| Telegram Bot API | $0 |
| Moltbook API | $0 |
| Jina Reader (summarize) | $0 |
| HN API | $0 |
| blogwatcher | $0 (self-hosted) |
| Windows electricity | ~$5 |
| Total | ~$5/month |
No Mac Mini needed. No VPS needed. No paid APIs needed.
Who it's NOT for:
We've open-sourced the entire architecture. 12 production scripts, 18 systemd timers, architecture docs, RPD budget spreadsheet, pitfall log -- all on GitHub:
github.com/ppcvote/free-tier-agent-fleet
free-tier-agent-fleet/
|-- scripts/
| |-- core/ # autopost (quality gate), team-context, peer-review
| |-- intelligence/ # research-chain, competitor-watch, post-stats, hn-trending
| |-- engagement/ # reply-checker (conversation tracking), blog-to-social
| |-- operations/ # inquiry-tracker, lead-followup, health-monitor, weekly-strategy
|-- timers/ # 18 systemd timer/service files
|-- docs/ # architecture deep-dive, RPD budget, pitfall log
|-- examples/ # openclaw.json + credentials examples
Clone it, swap in your API key and content pillars, and you're running.
Complete setup steps:
If you want us to set it up for you -- including custom Agent personalities, a visual command center dashboard, complete timer configuration, and feedback loops -- contact us.
We also have a live Agent command center demo where you can watch 4 Agents walk around in a pixel-art office.
The real bottleneck for AI Agents isn't model capability -- it's architecture design.
With the same free quota, you can have 15 chat sessions, or run 105 automated tasks. The only difference is how you break work into short, precise, one-shot units and feed them pre-computed data.
You don't need a more expensive model. You don't need more tokens. You need a smarter architecture.
Ultra Lab -- AI that works.
https://ultralab.tw
Originally published on Ultra Lab -- we build AI products that run autonomously.
Try UltraProbe free — our AI security scanner checks your website for vulnerabilities in 30 seconds: ultralab.tw/probe