
Jon Ibañez del campo
You just finished a statistics assignment.
The code works. The results are correct. Then you read the submission requirements and see it: "all functions must be properly documented."
You have twenty functions. No docstrings. Twenty minutes left.
That's the problem this tutorial solves. A Python script that takes any function, sends it to Claude, and gets back a complete docstring — parameters, return values, exceptions, and a usage example. All automatically.
This is part three of a series on building real tools with the Claude API. Parts one and two cover the basics and a code review chatbot. This one is immediately useful.
A script that does one thing well: takes a Python function as input and returns a complete, properly formatted docstring.
Input:
def calculate_pca(data, n_components):

Output:
"""
Performs Principal Component Analysis on the input dataset.

Args:
    data: Input matrix of shape (n_samples, n_features).
    n_components: Number of principal components to return.

Returns:
    Transformed data of shape (n_samples, n_components).

Raises:
    ValueError: If n_components exceeds the number of features.

Example:
    result = calculate_pca(X, n_components=2)
"""
Same environment from the previous tutorials. If you're starting fresh:
mkdir doc-generator
cd doc-generator
python -m venv venv
Activate:
# Mac/Linux
source venv/bin/activate
# Windows
venv\Scripts\activate
Install:
pip install anthropic python-dotenv
Create your .env:
ANTHROPIC_API_KEY=your-key-here
The idea is simple: send Claude your function and ask for a docstring back. The system prompt does most of the work.
from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()
client = Anthropic()


def generate_docstring(function_code: str) -> str:
    """Generate a docstring for a given Python function."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=(
            "You are a Python documentation assistant. "
            "When given a Python function, return only a Google-style docstring for it. "
            "Include: a one-line summary, Args, Returns, Raises (if applicable), and Example. "
            "Return only the docstring text inside triple quotes. No explanation, no extra text."
        ),
        messages=[
            {"role": "user", "content": function_code}
        ]
    )
    return response.content[0].text
The key instruction is "Return only the docstring text inside triple quotes. No explanation, no extra text." Without that, Claude adds commentary around the docstring and you'd need to parse it out.
sample_function = """
def calculate_mean(values):
    total = sum(values)
    return total / len(values)
"""

docstring = generate_docstring(sample_function)
print(docstring)

Output:
"""
Calculate the arithmetic mean of a list of values.

Args:
    values: A list of numeric values.

Returns:
    The arithmetic mean as a float.

Raises:
    ZeroDivisionError: If the input list is empty.
    TypeError: If the list contains non-numeric values.

Example:
    mean = calculate_mean([1, 2, 3, 4, 5])
    # Returns: 3.0
"""
Claude infers the failure modes, an empty list and non-numeric values, even though the function never checks for them explicitly. That's useful: the docstring records how the function behaves on bad input, not just on the happy path.
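The exceptions listed in a generated docstring are worth checking against the function itself before you trust them. A minimal sketch, using the same calculate_mean as above; raised_exception is a hypothetical helper written for this illustration, not part of the tutorial's script:

```python
def calculate_mean(values):
    total = sum(values)
    return total / len(values)


def raised_exception(func, *args):
    """Return the type of exception func raises, or None if it succeeds."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc)
    return None


print(raised_exception(calculate_mean, []))          # empty list
print(raised_exception(calculate_mean, [1, "two"]))  # non-numeric value
print(raised_exception(calculate_mean, [1, 2, 3]))   # valid input
```

If the first two calls report ZeroDivisionError and TypeError, the Raises section matches what the code really does.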
Getting the docstring is half the job. Inserting it automatically into the original function is the other half.
def insert_docstring(function_code: str, docstring: str) -> str:
    """Insert a generated docstring into a function definition."""
    lines = function_code.split("\n")
    # Find the line with the function definition
    for i, line in enumerate(lines):
        if line.strip().startswith("def "):
            # Insert the docstring after the def line
            indent = "    "  # Standard 4-space indent (assumes a top-level function)
            docstring_lines = docstring.strip().split("\n")
            indented = [indent + doc_line for doc_line in docstring_lines]
            lines = lines[:i+1] + indented + lines[i+1:]
            break
    return "\n".join(lines)
Test it:
sample_function = """
def calculate_mean(values):
    total = sum(values)
    return total / len(values)
"""

docstring = generate_docstring(sample_function)
documented = insert_docstring(sample_function, docstring)
print(documented)

Output:
def calculate_mean(values):
    """
    Calculate the arithmetic mean of a list of values.

    Args:
        values: A list of numeric values.

    Returns:
        The arithmetic mean as a float.

    Raises:
        ZeroDivisionError: If the input list is empty.
        TypeError: If the list contains non-numeric values.

    Example:
        mean = calculate_mean([1, 2, 3, 4, 5])
        # Returns: 3.0
    """
    total = sum(values)
    return total / len(values)
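Because the docstring is spliced in as plain text, a wrong indent or a stray quote can leave you with a file that no longer runs. One possible guard, sketched here with a hypothetical helper named is_valid_python, is to compile the result before saving it:

```python
def is_valid_python(source: str) -> bool:
    """Return True if source compiles as Python, False on a syntax error."""
    try:
        compile(source, "<documented>", "exec")
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


good = 'def f(x):\n    """Return x."""\n    return x\n'
bad = 'def f(x):\n"""Return x."""\n    return x\n'  # docstring not indented

print(is_valid_python(good))
print(is_valid_python(bad))
```

Calling this on the output of insert_docstring and falling back to the original function when it returns False keeps a bad insertion from reaching the saved file.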
One function at a time works for quick fixes. For a full assignment with twenty functions, you want to process the whole file at once.
import re


def extract_functions(file_content: str) -> list[str]:
    """Extract all function definitions from a Python file."""
    # Matches a def line (with an optional "-> type" annotation) followed by
    # its body: lines indented by at least four spaces, or blank lines.
    pattern = r"(def \w+\(.*?\)(?:\s*->\s*[^:\n]+)?:(?:\n(?:    .+|\s*))*)"
    return re.findall(pattern, file_content, re.MULTILINE)


def document_file(input_path: str, output_path: str) -> None:
    """Read a Python file, document all functions, and save the result."""
    with open(input_path, "r") as f:
        content = f.read()

    functions = extract_functions(content)
    print(f"Found {len(functions)} functions. Generating docstrings...\n")

    documented_content = content
    for i, function in enumerate(functions):
        print(f"Processing function {i+1}/{len(functions)}...")
        # Skip functions that already have docstrings
        if '"""' in function or "'''" in function:
            print("  Already documented, skipping.")
            continue
        docstring = generate_docstring(function)
        documented_function = insert_docstring(function, docstring)
        documented_content = documented_content.replace(function, documented_function)

    with open(output_path, "w") as f:
        f.write(documented_content)
    print(f"\nDone. Documented file saved to: {output_path}")
Usage:
document_file("statistics_assignment.py", "statistics_assignment_documented.py")
It reads the original file, skips functions that already have docstrings, generates the missing ones, and saves a new file. The original stays untouched.
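Regex-based extraction is sensitive to formatting: multi-line signatures, unusual annotations, and nested definitions are easy to miss. A sketch of an alternative that lets the standard library's ast module find the functions instead; extract_functions_ast is a hypothetical name, and this is one way to do it rather than the script's own method:

```python
import ast


def extract_functions_ast(file_content: str) -> list[str]:
    """Return the source text of every function found in file_content."""
    tree = ast.parse(file_content)
    sources = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_source_segment returns the text from "def" to the end of the body
            segment = ast.get_source_segment(file_content, node)
            if segment is not None:
                sources.append(segment)
    return sources


sample = '''
def add(a: int, b: int) -> int:
    return a + b


def greet(name):
    if name:
        return f"Hello, {name}"
    return "Hello"
'''

functions = extract_functions_ast(sample)
print(len(functions))
print(functions[0])
```

The trade-off is that ast.parse needs the whole file to be valid Python, and the returned segments do not include decorators.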
The complete script, with basic error handling around the API call:

import re
from dotenv import load_dotenv
from anthropic import Anthropic, APIError, RateLimitError, APIConnectionError

load_dotenv()
client = Anthropic()


def generate_docstring(function_code: str) -> str:
    try:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            system=(
                "You are a Python documentation assistant. "
                "When given a Python function, return only a Google-style docstring for it. "
                "Include: a one-line summary, Args, Returns, Raises (if applicable), and Example. "
                "Return only the docstring text inside triple quotes. No explanation, no extra text."
            ),
            messages=[{"role": "user", "content": function_code}]
        )
        return response.content[0].text
    # On failure, return a placeholder docstring so the output file stays valid Python
    except RateLimitError:
        return '"""Rate limit reached. Document this function manually."""'
    except APIConnectionError:
        return '"""Connection failed. Document this function manually."""'
    except APIError as e:
        # Read the status code defensively in case this error type does not carry one
        status = getattr(e, "status_code", "unknown")
        return f'"""API error {status}."""'


def insert_docstring(function_code: str, docstring: str) -> str:
    lines = function_code.split("\n")
    for i, line in enumerate(lines):
        if line.strip().startswith("def "):
            indent = "    "  # Standard 4-space indent
            docstring_lines = docstring.strip().split("\n")
            indented = [indent + doc_line for doc_line in docstring_lines]
            lines = lines[:i+1] + indented + lines[i+1:]
            break
    return "\n".join(lines)


def extract_functions(file_content: str) -> list[str]:
    # def line (optional "-> type" annotation), then indented or blank body lines
    pattern = r"(def \w+\(.*?\)(?:\s*->\s*[^:\n]+)?:(?:\n(?:    .+|\s*))*)"
    return re.findall(pattern, file_content, re.MULTILINE)


def document_file(input_path: str, output_path: str) -> None:
    with open(input_path, "r") as f:
        content = f.read()

    functions = extract_functions(content)
    print(f"Found {len(functions)} functions. Generating docstrings...\n")

    documented_content = content
    for i, function in enumerate(functions):
        print(f"Processing function {i+1}/{len(functions)}...")
        if '"""' in function or "'''" in function:
            print("  Already documented, skipping.")
            continue
        docstring = generate_docstring(function)
        documented_function = insert_docstring(function, docstring)
        documented_content = documented_content.replace(function, documented_function)

    with open(output_path, "w") as f:
        f.write(documented_content)
    print(f"\nDone. Saved to: {output_path}")


if __name__ == "__main__":
    document_file("your_file.py", "your_file_documented.py")
The system prompt needs to be strict
If you don't tell Claude to return only the docstring, it adds sentences like "Here's the docstring for your function:" — and that ends up in your code. The instruction "No explanation, no extra text" is not optional.
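Even with a strict prompt, it can be worth cleaning the response defensively before it goes into your file. A minimal sketch, assuming the docstring is the only triple-quoted span in the reply; clean_docstring is a hypothetical helper, not something the script above defines:

```python
def clean_docstring(raw: str) -> str:
    """Reduce a model response to a single triple-quoted docstring."""
    text = raw.strip()
    start = text.find('"""')
    end = text.rfind('"""')
    if start != -1 and end > start:
        # Keep only the span from the first to the last triple quote,
        # dropping any commentary before or after it.
        return text[start:end + 3]
    # No usable pair of triple quotes: remove any stray one and wrap the text.
    body = text.replace('"""', "").strip()
    return f'"""\n{body}\n"""'


chatty = 'Here is the docstring for your function:\n"""Add two numbers."""\nLet me know if you need changes.'
print(clean_docstring(chatty))
print(clean_docstring("Add two numbers."))
```

It could be applied to response.content[0].text before generate_docstring returns.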
Functions with existing docstrings
The script checks for """ before calling the API. Remove that check and it overwrites documentation you already wrote.
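The substring check is deliberately simple, and it also skips any function that merely contains a triple-quoted string, such as an SQL query. A stricter check, sketched here with a hypothetical has_docstring helper built on the standard ast module, asks Python whether a docstring is actually attached:

```python
import ast
import textwrap


def has_docstring(function_code: str) -> bool:
    """Return True if the first function in function_code has a docstring."""
    # dedent so that an indented snippet (for example a method) still parses
    tree = ast.parse(textwrap.dedent(function_code))
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            return ast.get_docstring(node) is not None
    return False


undocumented = '''
def load_query():
    query = """SELECT * FROM results"""
    return query
'''

print('"""' in undocumented)        # substring check: True, so it would be skipped
print(has_docstring(undocumented))  # AST check: False, so it still needs a docstring
```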
Large files hit rate limits
If you have a file with fifty functions, the script can send requests fast enough to trigger rate limiting. Add a small delay between calls, inside the loop in document_file:
import time
time.sleep(0.5) # Half a second between calls
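A fixed delay slows every call, including the ones that would have succeeded. Another option is to wait only after a failure and to wait longer each time. The sketch below is generic: call_with_backoff and its parameters are hypothetical names, and it assumes the wrapped call raises on a rate limit rather than returning a placeholder string.

```python
import time


def call_with_backoff(func, retries=3, base_delay=0.5,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays on failure."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # 0.5s, then 1s, then 2s


# Demo with a fake call that fails twice; delays are recorded instead of slept.
attempts = []
delays = []


def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("rate limited")
    return "ok"


result = call_with_backoff(flaky, retry_on=(RuntimeError,), sleep=delays.append)
print(result, delays)
```

With the real client, retry_on would be (RateLimitError,) and func would wrap the client.messages.create call.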
The script handles the common case well. A few natural extensions if you want to take it further:
Support for NumPy-style docstrings instead of Google-style — change the system prompt and the format changes with it.
A command-line interface so you can run python doc_generator.py myfile.py directly from the terminal without editing the script.
Batch processing for entire directories — walk through a project folder and document every .py file at once.
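The three extensions fit together naturally. A rough sketch of the plumbing, using only argparse and pathlib from the standard library: every name here (STYLE_PROMPTS, build_parser, iter_python_files, output_path_for, plan) is hypothetical, and the NumPy prompt wording is an assumption modelled on the Google-style one.

```python
import argparse
from pathlib import Path

# Hypothetical prompt table: one system prompt per supported docstring style.
STYLE_PROMPTS = {
    "google": "Return only a Google-style docstring. No extra text.",
    "numpy": "Return only a NumPy-style docstring. No extra text.",
}


def build_parser() -> argparse.ArgumentParser:
    """Define the command line: a path plus an optional --style flag."""
    parser = argparse.ArgumentParser(description="Generate docstrings for Python files.")
    parser.add_argument("path", type=Path, help="a .py file or a directory to scan")
    parser.add_argument("--style", choices=sorted(STYLE_PROMPTS), default="google")
    return parser


def iter_python_files(path: Path) -> list[Path]:
    """Return the file itself, or every .py file below a directory."""
    if path.is_file():
        return [path]
    # Skip files produced by an earlier run so they are not documented twice.
    return sorted(p for p in path.rglob("*.py") if not p.stem.endswith("_documented"))


def output_path_for(source: Path) -> Path:
    """Follow the naming used above: name.py becomes name_documented.py."""
    return source.with_name(f"{source.stem}_documented{source.suffix}")


def plan(argv: list[str]) -> list[tuple[Path, Path, str]]:
    """Parse argv and list (input, output, system prompt) for each file."""
    args = build_parser().parse_args(argv)
    prompt = STYLE_PROMPTS[args.style]
    return [(src, output_path_for(src), prompt) for src in iter_python_files(args.path)]
```

A main block could pass sys.argv[1:] to plan and then call document_file for each pair, with the chosen prompt handed to generate_docstring as an extra argument.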
Full Claude API documentation at platform.claude.com/docs.
# One function. One API call. One docstring.
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="Return only a Google-style docstring. No extra text.",
    messages=[{"role": "user", "content": your_function_code}]
)
print(response.content[0].text)