Building Your First MCP Server in Python

We have reached a pivotal moment in the evolution of Large Language Models (LLMs). For a long time, we treated them as "brains in a jar"—brilliant reasoning engines disconnected from the world, capable only of generating text based on training data. The Model Context Protocol (MCP) changes this paradigm entirely. It provides a standardized interface for these models to perceive (Resources), act (Tools), and function within specific interaction patterns (Prompts).

If you are a senior engineer or a developer looking to move beyond basic chatbot wrappers, building an MCP server is the critical next step. It allows you to expose local files, databases, and executable logic to clients like Claude Desktop or IDEs like Cursor.

This guide explores the architecture of a full-featured MCP server using the Python SDK. We will prioritize the FastMCP framework for its developer ergonomics, enabling us to implement tools, resources, and prompts in a cohesive system.

Why Construct a Server from Scratch?

Before writing a single line of code, we must ask: Is this necessary? If a server for your specific need (e.g., a PostgreSQL interface) already exists in the open-source ecosystem, the senior engineering decision is often to use the existing solution. Redundancy is technical debt.

However, custom business logic, specific local file manipulation, or unique workflow automations require bespoke servers. The Python SDK offers the most streamlined path for this, particularly for data-heavy operations. While TypeScript is a valid option, Python’s dominance in the AI space makes it the natural choice for integration.

We will build a "Multitool" server—a comprehensive implementation demonstrating the three core pillars of MCP:

  1. Tools: Executable functions (e.g., a calculator).
  2. Resources: Read-only context (e.g., documentation files).
  3. Prompts: Templated interactions (e.g., meeting summaries).

The Trinity of Logic: Tools, Resources, and Prompts

To understand the architecture, we must visualize the user flow. An LLM client connects to your server. It queries the capabilities. Your server responds with a list of tools it can call, resources it can read, and prompt templates the user can trigger.
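
Under the hood, this discovery happens over JSON-RPC 2.0. As a rough sketch (field names follow the MCP specification; exact payloads vary by SDK version), a tools/list exchange for the calculator we are about to build looks like this. The client sends:

{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

and the server responds with its capability manifest:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "add",
        "description": "Add two numbers together.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "a": { "type": "integer" },
            "b": { "type": "integer" }
          },
          "required": ["a", "b"]
        }
      }
    ]
  }
}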

1. The Execution Layer: Tools
Tools are the hands of the model. They allow the LLM to perform calculations, execute scripts, or fetch dynamic data.

2. The Context Layer: Resources
Resources are the eyes of the model. They expose data—logs, code files, or database records—as direct context. Unlike tools, resources are generally read-only and passive.

3. The Interaction Layer: Prompts
Prompts are predefined workflows. Instead of typing "You are an executive assistant, please summarize this..." every time, the server exposes a "Meeting Summary" template with dynamic arguments.

Step-by-Step Implementation Guide

We will use uv for dependency management and fastmcp to scaffold the server. This modern Python stack avoids the bloat of traditional virtual environment management.

Phase 1: Environment and Initialization
First, verify your Python version. We are targeting Python 3.12+.

python --version

Next, install uv, a high-performance Python package installer and resolver.

# Install uv (macOS/Linux; see the uv docs for Windows installers)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify installation
uv --version

Initialize your project structure. This creates a clean workspace without the "dependency hell" often associated with Python projects.

uv init .
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv add "mcp[cli]"

Phase 2: The Core Logic (server.py)
We create server.py as our entry point. We import FastMCP and instantiate our server.

from mcp.server.fastmcp import FastMCP
import math

# Initialize the server
mcp = FastMCP("Calculator Server")

Implementing Mathematical Tools

The most reliable way to test an MCP server is with deterministic logic. We will implement a calculator. Note that for the LLM to use these tools effectively, type hints and docstrings are mandatory. The detailed description in the docstring is what the model uses to decide when to call the tool.

We define basic arithmetic and slightly more complex operations. For example, a square root function must handle edge cases, such as negative numbers, by returning descriptive strings rather than raising unhandled exceptions.

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers together."""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers together."""
    return a * b

@mcp.tool()
def sqrt(x: float) -> str:
    """Calculate the square root of a number."""
    if x < 0:
        return "Cannot calculate square root of a negative number"
    return str(math.sqrt(x))

In practice, this lets the LLM interpret natural-language requests like "What is the square root of 144?" while our code performs the actual computation deterministically.
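
The same defensive pattern applies to division. Here is a minimal sketch of a hypothetical divide tool (not part of the listing above) that reports a zero denominator as a readable string instead of letting ZeroDivisionError propagate:

@mcp.tool()
def divide(a: float, b: float) -> str:
    """Divide a by b."""
    if b == 0:
        # Return a descriptive message the model can relay to the user,
        # rather than raising an unhandled ZeroDivisionError.
        return "Cannot divide by zero"
    return str(a / b)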

Phase 3: Integrating Resources
Resources allow us to attach static content. For this implementation, let’s assume we have a Markdown file on disk containing technical documentation (e.g., a TypeScript SDK guide) that we want the LLM to reference.

We use the @mcp.resource decorator. The critical component here is the URI schema (resource://...).

@mcp.resource("resource://docs/typescript-sdk")
def get_typescript_sdk_docs() -> str:
    """Reads the TypeScript SDK documentation."""
    # Implementation reading local .md file
    path = "./docs/typescript_sdk.md" 
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return "Documentation file not found."

This acts as a retrieval pipe. When the user asks a question about the SDK, the model can pull this resource into its context window automatically.
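
Resources can also be parameterized. The Python SDK supports URI templates, where placeholders in the URI map to function arguments. A sketch, assuming a local docs/ directory keyed by file name (the resource://docs/{name} scheme here is illustrative):

@mcp.resource("resource://docs/{name}")
def get_doc(name: str) -> str:
    """Read a named Markdown file from the local docs directory."""
    # Note: a production implementation should sanitize `name`
    # to prevent path traversal outside ./docs.
    path = f"./docs/{name}.md"
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return f"No documentation found for '{name}'."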

Phase 4: Constructing Dynamic Prompts
Prompts are perhaps the most underutilized aspect of MCP. They act as "saved searches" or "macros" for the LLM.

We will create a Meeting Summary prompt. This requires dynamic arguments: date, title, and transcript. Instead of raw text manipulation, we define the prompt to accept these variables and inject them into a structured narrative.

@mcp.prompt()
def meeting_summary(date: str, title: str, transcript: str) -> str:
    """Generates a structured meeting summary."""
    return f"""
    You are an executive assistant. Analyze the following meeting.

    Date: {date}
    Title: {title}

    Transcript:
    {transcript}

    Provide a comprehensive analysis including participants and key decisions.
    """

In the UI (like Claude Desktop), this renders as a form. The user enters the variables, and the server constructs the final prompt sent to the model.
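
Prompts are not limited to returning a single string. The SDK also lets a prompt return a list of typed messages to seed a multi-turn structure; a brief sketch using the base message classes from mcp.server.fastmcp.prompts (the review_transcript function is illustrative):

from mcp.server.fastmcp.prompts import base

@mcp.prompt()
def review_transcript(transcript: str) -> list[base.Message]:
    """Seeds a short review conversation around a transcript."""
    return [
        base.UserMessage("Here is a meeting transcript:"),
        base.UserMessage(transcript),
        base.AssistantMessage("Understood. What should I focus on first?"),
    ]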

Debugging: The MCP Inspector

Developing strictly via "trial and error" inside a chat interface is inefficient. You need to restart the host application repeatedly to pick up code changes.

The MCP Inspector is a browser-based debugging suite. It allows you to simulate a client connection, list capabilities, and invoke tools manually.

To launch the inspector against your local code:

uv run mcp dev server.py
# Or directly naming the inspector if installed via npm/npx:
npx @modelcontextprotocol/inspector uv run server.py

The inspector runs a proxy server on a specific port (e.g., localhost:3000). It provides specific tabs for:

  1. Tools: You can input a=10, b=2 and execute add.
  2. Resources: View the loaded text content of your resources.
  3. Prompts: Fill in the template fields and view the generated text payload.

Crucial Insight: When you open the Inspector via the provided URL (e.g., localhost:5173), check the terminal for the generated session token. The Inspector requires this token to authorize local connections, so open the pre-authorized link printed in the terminal (which includes the token) rather than navigating to the bare address.

Configuration: Connecting to Claude Desktop

Once the server passes the Inspector's tests, we integrate it into a production-style environment like Claude Desktop. This requires a configuration file located at:

  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

The configuration uses the stdio transport layer by default. This connects the standard input/output of the Python script to the host application.

{
  "mcpServers": {
    "calculator-server": {
      "command": "uv",
      "args": [
        "run",
        "server.py"
      ],
      "env": {
        "PYTHONUTF8": "1"
      }
    }
  }
}

Developer Note: Claude Desktop does not launch your server from the project directory, so relative paths like server.py or ./docs/... will fail. Either use absolute paths in the configuration or handle the working directory explicitly inside your script.
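
One way to make the configuration location-independent is uv's --directory flag, which switches to the project directory before running the script. A sketch (substitute the absolute path for your machine):

{
  "mcpServers": {
    "calculator-server": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "/absolute/path/to/your/project",
        "server.py"
      ],
      "env": {
        "PYTHONUTF8": "1"
      }
    }
  }
}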

Transport Layers: Stdio vs. SSE vs. Streamable HTTP

The protocol supports different transport mechanisms.

  1. Stdio (Standard Input/Output): The default for local processes. It is fast, relies on process piping, and requires no networking ports. This is ideal for local desktop integration.
  2. SSE (Server-Sent Events): Previously a standalone transport, it is now often wrapped within streamable HTTP. It allows for unidirectional updates from server to client.
  3. Streamable HTTP: The modern standard for remote MCP servers. It uses POST requests for client-to-server messages and SSE for server-to-client messages.

If you wish to expose your server over a network (e.g., controlling a remote virtual machine), you must switch from stdio to an HTTP-based transport.

# In server.py
if __name__ == "__main__":
    # Default: local process piping for desktop hosts
    mcp.run(transport="stdio")
    # Or, for network exposure (the modern remote standard):
    # mcp.run(transport="streamable-http")
    # The legacy SSE transport is also still supported:
    # mcp.run(transport="sse")

When debugging http/sse via the Inspector, the connection URL changes. You must reference the specific endpoint (often ending in /mcp or /sse), and you generally need to handle CORS and authentication if moving beyond localhost.

The "Vibe Coding" Workflow

A recurring theme in modern development is LLM-assisted coding—or "vibe coding." When building this MCP server, we do not write every boilerplate line manually. Instead, we act as architects.

We feed the context documentation (the MCP SDK README.md or llms.txt) into an IDE like Cursor. We then prompt the model:

"I want to create an MCP server with a calculator tool. Use the provided docs. Structure it with FastMCP."

The LLM generates the scaffold. We then iterate:

"Add a resource that reads a markdown file from the desktop."
"Add a prompt template for meeting summaries."

However, LLMs make mistakes. For instance, they might hallucinate methods like mcp.list_prompts() which don't exist in the high-level FastMCP abstraction, or confuse mcp.tool with mcp.resource.

The Senior Developer's Role: Your job shifts from syntax generation to verification. You must verify that the decorators match the intent and that the logic inside the functions handles exceptions (like dividing by zero). You use the Inspector to validate the LLM's output. If the implementation fails, you revert to a clean state (make backups of your server.py!) and refine the prompt.
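
A lightweight way to perform that verification outside the Inspector is ordinary unit testing. Assuming the @mcp.tool() decorator leaves the underlying functions callable (as it does in current SDK versions) and server.py is importable, a minimal pytest sketch:

# test_server.py
from server import add, multiply, sqrt

def test_add():
    assert add(2, 3) == 5

def test_multiply():
    assert multiply(4, 5) == 20

def test_sqrt_negative_is_handled():
    # The tool reports the edge case as a string instead of raising
    assert "Cannot" in sqrt(-1.0)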

Final Thoughts

We have built a server that calculates, reads documentation, and structures textual analysis. We have moved from a static chat interface to a dynamic, integrated system.

The power of MCP lies in its extensibility. Today, it is a calculator; tomorrow, it is a server that queries your internal SQL database, fetches live stock data via HTTP APIs, or manages your Docker containers.

The ecosystem is shifting. We are no longer just prompting models; we are architecting the environments in which they operate. By mastering the Python SDK, creating robust tools, and debugging effectively with the Inspector, you position yourself at the forefront of this agentic shift.

Go build something that does the work, not something that merely talks about it.