Web scraping usually means installing Puppeteer, fighting with anti-bot systems, rotating proxies, and parsing messy HTML. What if you could skip all of that?
```shell
curl -X POST "https://agent-gateway-kappa.vercel.app/v1/agent-scraper/scrape" \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://news.ycombinator.com", "format": "markdown"}'
```
One call. Clean markdown back. No Selenium, no Puppeteer, no proxies.
```shell
curl -X POST https://agent-gateway-kappa.vercel.app/api/keys/create
```
200 free calls, no signup required.
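If you are scripting the setup too, the key-creation call fits in a few lines of Python. The response shape here is an assumption (a JSON body with a `key` field); check what the endpoint actually returns:

```python
import requests

API = "https://agent-gateway-kappa.vercel.app"

def extract_key(payload):
    """Pull the API key out of the creation response.
    The 'key' field name is an assumption -- verify it."""
    key = payload.get("key")
    if not key:
        raise ValueError(f"no key in response: {payload}")
    return key

def create_key():
    """Request a free key from the gateway."""
    r = requests.post(f"{API}/api/keys/create", timeout=30)
    r.raise_for_status()
    return extract_key(r.json())
```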
```python
import requests

API = "https://agent-gateway-kappa.vercel.app"
KEY = "gw_your_key_here"

HEADERS = {
    "Authorization": f"Bearer {KEY}",
    "Content-Type": "application/json",
}

def scrape(url, fmt="markdown"):
    r = requests.post(f"{API}/v1/agent-scraper/scrape",
                      headers=HEADERS,
                      json={"url": url, "format": fmt})
    return r.json()

# Scrape a blog post as clean markdown
result = scrape("https://example.com/blog/interesting-post")
print(result["content"][:500])
```
The response gives you extracted text content — no nav bars, no footers, no cookie banners. Just the article.
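In practice you will want to know when a scrape fails rather than index into a half-empty dict. A minimal guarded variant — the `error` field name and the timeout are assumptions, not documented behavior:

```python
import requests

API = "https://agent-gateway-kappa.vercel.app"
HEADERS = {
    "Authorization": "Bearer gw_your_key_here",
    "Content-Type": "application/json",
}

def check_payload(data, url):
    """Raise if the JSON body reports an error.
    The 'error' field name is an assumption -- check the real response."""
    if isinstance(data, dict) and "error" in data:
        raise RuntimeError(f"scrape failed for {url}: {data['error']}")
    return data

def scrape_checked(url, fmt="markdown"):
    """scrape() with explicit failure handling: HTTP errors raise,
    and JSON-level errors surface instead of being silently returned."""
    r = requests.post(f"{API}/v1/agent-scraper/scrape",
                      headers=HEADERS,
                      json={"url": url, "format": fmt},
                      timeout=60)
    r.raise_for_status()
    return check_payload(r.json(), url)
```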
```python
competitors = [
    "https://competitor1.com/pricing",
    "https://competitor2.com/pricing",
    "https://competitor3.com/pricing",
]

for url in competitors:
    data = scrape(url)
    print(f"\n--- {url} ---")
    # Look for price-related content
    for line in data.get("content", "").split("\n"):
        if "$" in line or "price" in line.lower() or "month" in line.lower():
            print(line.strip())
```
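The substring filter above will also match lines like "our pricing philosophy". A slightly tighter pass that only keeps lines containing an explicit dollar amount — a generic sketch, not tuned to any particular site:

```python
import re

# Matches "$29", "$1,299.00", "$ 5" and similar dollar amounts
PRICE_RE = re.compile(r"\$\s?\d[\d,]*(?:\.\d{2})?")

def price_lines(markdown_text):
    """Return stripped lines that contain an explicit dollar amount."""
    return [line.strip()
            for line in markdown_text.split("\n")
            if PRICE_RE.search(line)]
```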
```python
import json

SOURCES = {
    "Hacker News": "https://news.ycombinator.com",
    "Lobsters": "https://lobste.rs",
    "Dev.to": "https://dev.to",
}

all_news = {}
for name, url in SOURCES.items():
    result = scrape(url)
    all_news[name] = {
        "title": result.get("title", ""),
        "content_preview": result.get("content", "")[:300],
        "url": url,
    }
    print(f"Scraped {name}: {len(result.get('content', ''))} chars")

# Save aggregated news
with open("news_digest.json", "w") as f:
    json.dump(all_news, f, indent=2)
```
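If the digest runs on a schedule, the interesting question is usually what changed since last time. A small diff over two saved digests — a sketch that compares the `content_preview` fields written above:

```python
def diff_digests(old, new):
    """Return the names of sources whose preview changed (or are new)
    between two digest dicts of the shape saved in news_digest.json."""
    changed = []
    for name, entry in new.items():
        prev = old.get(name)
        if prev is None or prev.get("content_preview") != entry.get("content_preview"):
            changed.append(name)
    return changed
```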
The same API key gives you screenshots too:
```python
def screenshot(url, viewport="desktop"):
    r = requests.post(f"{API}/v1/agent-screenshot/screenshot",
                      headers=HEADERS,
                      json={"url": url, "viewport": viewport})
    return r.json()

# Get both text content AND a visual screenshot
url = "https://example.com"
text_data = scrape(url)
visual = screenshot(url)

print(f"Text: {len(text_data.get('content', ''))} chars")
print(f"Screenshot: {visual.get('url', 'check response')}")
```
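If the screenshot response does return a downloadable image URL under `url` — that field name is an assumption, as the fallback text above hints — saving it to disk is one more request:

```python
import requests

def save_screenshot(visual, path="shot.png"):
    """Download the screenshot image to a local file.
    Assumes the response JSON carries an image URL under 'url'."""
    image_url = visual.get("url")
    if not image_url:
        raise ValueError(f"no screenshot URL in response: {visual}")
    r = requests.get(image_url, timeout=60)
    r.raise_for_status()
    with open(path, "wb") as f:
        f.write(r.content)
    return path
```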
Scraping is great, but combining it with an LLM makes it powerful:
```python
def ask_llm(prompt):
    """Use the built-in LLM proxy."""
    r = requests.post(f"{API}/v1/agent-llm/chat",
                      headers=HEADERS,
                      json={"messages": [{"role": "user", "content": prompt}]})
    return r.json()

# Scrape a page and summarize it
page = scrape("https://blog.example.com/long-technical-post")
content = page.get("content", "")[:3000]

summary = ask_llm(f"Summarize this article in 3 bullet points:\n\n{content}")
print(summary)
```
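The `[:3000]` slice can cut off mid-sentence. A paragraph-aware truncation keeps whole paragraphs up to a character budget — a generic helper, not part of the API:

```python
def truncate_paragraphs(text, limit=3000):
    """Keep whole paragraphs (split on blank lines) until adding
    the next one would exceed the character limit."""
    out, used = [], 0
    for para in text.split("\n\n"):
        if used + len(para) > limit and out:
            break
        out.append(para)
        used += len(para) + 2  # account for the "\n\n" separator
    return "\n\n".join(out)
```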
"Couldn't I just run Puppeteer myself?" You absolutely can. But here's what you skip with an API:
| Problem | DIY | API |
|---|---|---|
| Browser install | Install Chrome/Chromium | None |
| Anti-bot detection | Rotate proxies, user agents | Handled |
| JavaScript rendering | Full browser needed | Handled |
| Memory usage | 200MB+ per browser instance | 0 |
| Maintenance | Update selectors when sites change | Not your problem |
| Scaling | Manage browser pool | Just make more calls |
For one-off scripts and small projects, the API saves hours of setup.
Check your remaining credits:
```shell
curl "https://agent-gateway-kappa.vercel.app/api/keys/balance" \
  -H "Authorization: Bearer YOUR_KEY"
```
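The same check from Python, with a low-credit warning. The `remaining` field name is an assumption; adjust it to the actual response:

```python
import requests

API = "https://agent-gateway-kappa.vercel.app"

def credits_low(balance_payload, threshold=20):
    """True when remaining credits drop below the threshold.
    The 'remaining' field name is an assumption."""
    remaining = balance_payload.get("remaining")
    return remaining is not None and remaining < threshold

def check_balance(key):
    """Fetch the balance payload for an API key."""
    r = requests.get(f"{API}/api/keys/balance",
                     headers={"Authorization": f"Bearer {key}"},
                     timeout=30)
    r.raise_for_status()
    return r.json()
```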
The scraping API is one of 39 services on the same gateway, and the same key works across all of them — including the screenshot and LLM endpoints used above.
Browse everything: api-catalog-three.vercel.app
Next time you need data from a webpage, try the one-liner before reaching for Puppeteer. You might not need it.