If you've optimized for Google with robots.txt and sitemap.xml, you already understand the idea: give crawlers a structured entry point to your content. The llms.txt file does the same thing — but for large language models.
This guide covers everything you need to implement llms.txt from scratch, validate it, and deploy it on your stack.
llms.txt is a proposed specification that provides LLMs with a machine-readable summary of your site's content. While robots.txt tells crawlers what they can access, llms.txt tells AI engines what they should read and how your site is organized.
The file lives at the root of your domain:
https://yourdomain.com/llms.txt
An llms.txt file uses a simple markdown-like format:
```
# Your Site Name

> A one-line description of what your site or product does.

## Docs

- [Getting Started](https://yourdomain.com/docs/getting-started): How to set up the product.
- [API Reference](https://yourdomain.com/docs/api): Complete REST API documentation.

## Blog

- [Launching v2.0](https://yourdomain.com/blog/v2-launch): Announcement of version 2.0.
- [Performance Guide](https://yourdomain.com/blog/performance): Benchmarks and optimization tips.

## Optional

- [Pricing](https://yourdomain.com/pricing): Plans and pricing details.
- [About](https://yourdomain.com/about): Company background and team.
```
Breaking that down:

- A single `#` heading with your site/product name.
- A blockquote (`>`) with a one-sentence summary.
- `##` headings that group your links.
- Link lines in the form `- [Label](URL): Description`.
- An `## Optional` section for pages that are useful but not critical.

Focus on pages with high informational density. Avoid thin pages, login screens, terms of service, or paginated list views.
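Since the format is just structured markdown, it is easy to generate programmatically. A minimal Python sketch that assembles a valid file from a dictionary of sections (all titles, URLs, and descriptions below are placeholders):

```python
def build_llms_txt(title, description, sections):
    """Assemble an llms.txt string: H1 title, blockquote summary,
    then one H2 per section with '- [Label](URL): Description' lines."""
    parts = [f"# {title}", "", f"> {description}"]
    for section, links in sections.items():
        parts += ["", f"## {section}"]
        for label, url, desc in links:
            parts.append(f"- [{label}]({url}): {desc}")
    return "\n".join(parts) + "\n"

content = build_llms_txt(
    "Your Site Name",
    "A one-line description of what your site does.",
    {"Docs": [("Getting Started",
               "https://yourdomain.com/docs/getting-started",
               "How to set up the product.")]},
)
print(content)
```

The same function can be wired to your CMS or content collection so the file never drifts out of date.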
Place the file in your public/ directory (llms-full.txt is an optional companion file from the same proposal that inlines full page content rather than just links):

```
your-nextjs-project/
  public/
    llms.txt
    llms-full.txt
```
To generate the file dynamically instead, add a route handler (Next.js App Router):

```typescript
// app/llms.txt/route.ts
import { getAllPosts } from '@/lib/content';

export async function GET() {
  const posts = await getAllPosts();
  const blogLinks = posts
    .map((post) => `- [${post.title}](https://yourdomain.com/blog/${post.slug}): ${post.description}`)
    .join('\n');

  const content = `# Your Site Name\n\n> One-line description.\n\n## Blog\n\n${blogLinks}\n`;

  return new Response(content, {
    headers: {
      'Content-Type': 'text/plain; charset=utf-8',
      'Cache-Control': 'public, max-age=86400',
    },
  });
}
```
Simple Python validator:

```python
import re
import sys

def validate_llms_txt(filepath):
    errors = []
    with open(filepath, 'r', encoding='utf-8') as f:
        lines = f.read().strip().split('\n')

    if not lines or not lines[0].startswith('# '):
        errors.append("Missing title heading.")

    has_blockquote = any(l.strip().startswith('> ') for l in lines[:5])
    if not has_blockquote:
        errors.append("Missing description blockquote.")

    sections = [l for l in lines if l.startswith('## ')]
    if not sections:
        errors.append("Missing sections.")

    link_pattern = re.compile(r'^- \[.+\]\(https?://.+\)')
    links = [l for l in lines if link_pattern.match(l.strip())]
    if not links:
        errors.append("No valid links found.")

    return errors

if __name__ == '__main__':
    fp = sys.argv[1] if len(sys.argv) > 1 else 'llms.txt'
    errors = validate_llms_txt(fp)
    if errors:
        print(f"Failed with {len(errors)} error(s):")
        for e in errors:
            print(f"  - {e}")
        sys.exit(1)
    else:
        print("llms.txt is valid.")
```
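The validator above only checks structure. As an optional extension (a sketch, not part of any spec), you could also flag duplicate URLs and links with no description, both of which waste an AI crawler's attention:

```python
import re

def extra_checks(text):
    """Warn about duplicate link URLs and links missing a description."""
    warnings = []
    seen = set()
    for line in text.splitlines():
        m = re.match(r'^- \[(.+?)\]\((https?://\S+?)\)(?::\s*(.*))?$',
                     line.strip())
        if not m:
            continue
        label, url, desc = m.groups()
        if url in seen:
            warnings.append(f"Duplicate URL: {url}")
        seen.add(url)
        if not (desc or "").strip():
            warnings.append(f"Missing description for: {label}")
    return warnings

sample = ("- [A](https://x.com/a): ok\n"
          "- [B](https://x.com/a): dup\n"
          "- [C](https://x.com/c)")
print(extra_checks(sample))
```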
Add a reference in robots.txt:
```
# llms.txt — structured content map for AI engines
# https://yourdomain.com/llms.txt
```
Add a `<link>` tag in your HTML head:

```html
<link rel="alternate" type="text/plain" href="/llms.txt" title="LLMs Content Map" />
```
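If you want to confirm the tag actually ships in your rendered HTML, the Python standard library's HTML parser is enough. A small sketch (the `rel` and `href` values match the tag above; the sample HTML is a stand-in for a fetched page):

```python
from html.parser import HTMLParser

class LlmsLinkFinder(HTMLParser):
    """Detect a <link rel="alternate" href="/llms.txt"> tag in HTML."""
    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also route through this handler.
        a = dict(attrs)
        if (tag == "link" and a.get("rel") == "alternate"
                and a.get("href") == "/llms.txt"):
            self.found = True

finder = LlmsLinkFinder()
finder.feed('<head><link rel="alternate" type="text/plain" '
            'href="/llms.txt" title="LLMs Content Map" /></head>')
print(finder.found)  # True
```

In practice you would feed it the HTML returned from your deployed homepage rather than a literal string.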
llms.txt is a lightweight, zero-dependency way to help AI engines understand your site. It takes 15 minutes to implement and can meaningfully improve how LLMs reference your content.
The specification is still evolving. Track the latest at llmstxt.org.
I'm Alexandre Caramaschi, CEO of Brasil GEO. I write about Generative Engine Optimization — the practice of making your content visible to AI search. More at alexandrecaramaschi.com.