I wanted to tell Claude to turn off my bedroom light. Not just from my laptop at home — but from anywhere.
What started as "let's try MCP" became "why does OAuth keep failing" and "how do cloud services reach devices behind my router".
This post walks through the architecture decisions and the stuff that tripped me up. Not every choice was obvious, and some things only made sense after I'd already built them wrong once.
The full code is here.
I recently attended a session organized by the Berlin AWS User Group about Amazon Bedrock AgentCore (shoutout to the organizers — the session was really helpful!!). I'd been using MCP for a while but never built one myself. When I looked around, most posts covered the concept and building local MCP servers — not much about deploying remote MCP servers to the cloud or what actually trips you up when you build one. So I picked a concrete use case: control my TAPO smart light bulb via Claude, and build the whole thing on AWS.
The goal was simple and deliberately small. One bulb. A handful of tools. But with enough real infrastructure to actually learn from.
I started with FastMCP and Claude Desktop. Getting a working local MCP server took one afternoon.
The developer experience is genuinely impressive. You define a tool like this:
```python
from fastmcp import FastMCP

app = FastMCP("tapo-light")  # server name is illustrative

@app.tool()
async def turn_on() -> str:
    """Turn on the TAPO smart light bulb."""
    bulb = await get_bulb()  # helper that connects to the bulb on the local network
    await bulb.turn_on()
    return "Light turned on"
```
That's it. FastMCP reads your function and auto-generates everything Claude needs — the tool name, description, and input schema. You write the function. FastMCP handles the rest.
Claude Desktop runs the FastMCP server as a local subprocess and talks to it over stdio. The full path is: Claude Desktop → FastMCP server (local subprocess) → TAPO bulb over the home Wi-Fi.
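On the Claude Desktop side, the wiring is a few lines in `claude_desktop_config.json`. The server name and script path below are placeholders, not the ones from my repo:

```json
{
  "mcpServers": {
    "tapo-light": {
      "command": "python",
      "args": ["/path/to/tapo_server.py"]
    }
  }
}
```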
It worked. I could chat with Claude and control my light. But then I left home.
Claude Desktop's MCP only works with local subprocesses. The moment you close your laptop or step outside, it's gone. I wanted the Claude web app to work too — partly because it's more convenient, partly because building the remote version is where the real learning happens.
That's a fundamentally different problem.
This is where AWS comes in.
Each component has a specific job: Claude speaks MCP to AgentCore Gateway, the Gateway invokes a Lambda function, Lambda publishes to AWS IoT Core, and a small bridge running at home holds the connection to IoT Core and drives the bulb.
AWS IoT Core is a managed cloud service that acts as a message broker between cloud services and physical devices. It uses MQTT — a lightweight protocol designed for devices with unreliable connections — to route messages through a publish/subscribe model.
Think of it like a radio station: Lambda broadcasts on a channel, and any device tuned to that channel receives the message.
The obvious question: why not have Lambda call the local bridge directly over HTTP?
Lambda lives in AWS. Your home bridge lives behind your router. Your router blocks all inbound connections — Lambda has no address to call. To make direct HTTP work, you'd need a static IP and port forwarding (security risk), a VPN tunnel (operational overhead), or a reverse tunnel like ngrok (fragile, costs money).
IoT Core flips the direction. Your local bridge reaches out to AWS and holds a persistent MQTT connection open. Lambda publishes to a topic, IoT Core delivers it over that already-open connection. Your home network never needs a public address.
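On the Lambda side, publishing a command is a couple of lines with boto3. A minimal sketch; the topic name here is illustrative:

```python
import json

import boto3

iot = boto3.client("iot-data")

def publish_command(command: str) -> None:
    # QoS 1: IoT Core keeps retrying delivery until the bridge acknowledges it
    iot.publish(
        topic="smarthome/bulb/commands",
        qos=1,
        payload=json.dumps({"command": command}),
    )
```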
With HTTP, if Claude wants to know the current brightness, Lambda would need to make a request every time — or poll regularly to keep state fresh. That's an HTTP call (and cost) for every status check.
With MQTT, the bridge reports state changes to IoT Core's Device Shadow automatically. Claude asks for brightness? The Shadow answers instantly from cache. No new request needed. The bridge only sends updates when something actually changes.
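Answering a status question then looks something like this sketch (the thing name and the reported fields are placeholders):

```python
import json

import boto3

iot = boto3.client("iot-data")

def get_brightness() -> int:
    # Reads the cached state IoT Core maintains; no message to the bridge or bulb
    response = iot.get_thing_shadow(thingName="tapo-bulb")
    shadow = json.loads(response["payload"].read())
    return shadow["state"]["reported"]["brightness"]
```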
HTTP assumes both sides are reliably reachable. If your bridge restarts or your internet hiccups during a Lambda call, the request just fails.
MQTT is designed for unreliable connections. If your bridge goes offline, IoT Core queues QoS 1 messages for its persistent session. When the bridge reconnects, pending commands are delivered automatically. No retry logic to write, no state to track manually.
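You can see this in the bridge code. Here's a minimal sketch of the home-bridge side using the AWS IoT Device SDK v2 (`awsiotsdk`); the endpoint, certificate paths, and topic are placeholders:

```python
from awscrt import mqtt
from awsiot import mqtt_connection_builder

connection = mqtt_connection_builder.mtls_from_path(
    endpoint="xxxxxxxx-ats.iot.eu-central-1.amazonaws.com",
    cert_filepath="bridge.cert.pem",
    pri_key_filepath="bridge.private.key",
    client_id="home-bridge",
    clean_session=False,  # persistent session: IoT Core queues QoS 1 messages while offline
)
connection.connect().result()

def on_command(topic, payload, **kwargs):
    # Parse the command and drive the bulb here
    print(f"received on {topic}: {payload}")

subscribe_future, _ = connection.subscribe(
    topic="smarthome/bulb/commands",
    qos=mqtt.QoS.AT_LEAST_ONCE,
    callback=on_command,
)
subscribe_future.result()
```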
Amazon Bedrock AgentCore Gateway is a fully managed service that turns your backend functions into an MCP-compliant server that AI clients like Claude can talk to. It handles OAuth authentication, protocol translation, and tool discovery — so you only write business logic.
Think of it as the bouncer and translator standing between Claude and your Lambda: it checks credentials, speaks MCP fluently, and routes the right instructions inward.
Here's how it compares to FastMCP as an MCP hosting approach:
| Aspect | Winner | Why |
|---|---|---|
| Tool Schema | Tie🤝 | FastMCP auto-generates from decorators (better DX). AgentCore requires explicit JSON (~70 lines for 4 tools; one entry is sketched below the table), but makes the contract reviewable. |
| Hosting | AgentCore🏆 | FastMCP needs a persistent runtime (container, EC2, Fargate) — costs money even when idle. AgentCore is serverless — Lambda only runs when invoked. |
| Auth Support | Tie🤝 | Both handle OAuth well now (FastMCP 2.11+ added JWT, OAuth proxy, full OAuth server). |
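To make the verbosity point concrete, here's roughly what a single tool entry in the Gateway's tool schema looks like. The `set_brightness` tool and its parameter are illustrative, and the exact wrapper around these entries is in the AgentCore docs:

```json
{
  "name": "set_brightness",
  "description": "Set the bulb brightness as a percentage.",
  "inputSchema": {
    "type": "object",
    "properties": {
      "brightness": {
        "type": "integer",
        "description": "Brightness level from 1 to 100"
      }
    },
    "required": ["brightness"]
  }
}
```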
If I'm honest, the infrastructure was the easy part. Authentication is where I spent most of my debugging time.
The problem started with how I set up my Cognito configuration. It defaulted to the client_credentials flow — machine-to-machine (M2M) auth where a service exchanges a client ID and secret directly for a token. No login page, no user interaction.
That works fine for service-to-service communication. But the Claude web app is a browser-based client. It needs to redirect the user to a login page, have them authenticate, and receive an authorization code back — the authorization_code flow. These are fundamentally different OAuth patterns. It took me a while to figure out I was using the wrong flow entirely.
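The difference is easiest to see at Cognito's token endpoint. A sketch with `requests`; the domain and every value here are placeholders:

```python
import requests

# Placeholder Cognito hosted-UI domain
TOKEN_URL = "https://my-domain.auth.eu-central-1.amazoncognito.com/oauth2/token"

# client_credentials: a service trades its ID and secret directly for a token.
# No browser, no user involved.
requests.post(TOKEN_URL, data={
    "grant_type": "client_credentials",
    "client_id": "CLIENT_ID",
    "client_secret": "CLIENT_SECRET",
})

# authorization_code: this request only works *after* a user signed in through
# the login page and the browser came back with ?code=... attached.
requests.post(TOKEN_URL, data={
    "grant_type": "authorization_code",
    "code": "CODE_FROM_REDIRECT",
    "client_id": "CLIENT_ID",
    "redirect_uri": "https://example.com/auth_callback",
})
```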
Again, think of it like a nightclub with a strict bouncer. The bouncer (AgentCore Gateway) doesn't know you personally, but trusts the ID checker down the street (Cognito). You walk down to Cognito, prove who you are, and Cognito gives you a wristband. You bring that wristband back to the bouncer at /auth_callback, and now you're in.
That callback address is the key. The authorization_code flow exists precisely because Claude (the browser client) needs a human to authenticate interactively. The code is the club's way of receiving confirmation from Cognito without the user handing over their password directly.
Once I understood that distinction, I knew what to fix: create a separate Cognito app client configured for authorization_code flow with the correct callback URL.
There was a second, subtler issue. When the AgentCore Gateway's resource metadata (/.well-known/oauth-protected-resource) doesn't specify a scope, Claude falls back to Cognito's OIDC discovery endpoint (/.well-known/openid-configuration), which advertises standard scopes: openid, email, phone, profile. But my Cognito app client only allowed openid, smarthome-gateway/read, and smarthome-gateway/write. Claude requesting email and phone caused an invalid_scope error. The fix: explicitly configure the allowed scopes on the client to match exactly what Claude will request.
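Both fixes end up in the same place: the app client definition. A sketch with boto3; the pool ID and client name are placeholders, and you should use the callback URL Claude shows you in its connector settings:

```python
import boto3

cognito = boto3.client("cognito-idp")

cognito.create_user_pool_client(
    UserPoolId="eu-central-1_XXXXXXXXX",
    ClientName="claude-web-client",
    GenerateSecret=True,
    AllowedOAuthFlowsUserPoolClient=True,
    AllowedOAuthFlows=["code"],  # authorization_code, not client_credentials
    # Scopes must exactly match what Claude will request
    AllowedOAuthScopes=["openid", "smarthome-gateway/read", "smarthome-gateway/write"],
    CallbackURLs=["https://<claude-callback>/auth_callback"],
    SupportedIdentityProviders=["COGNITO"],
)
```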
Neither of these issues had anything to do with MCP itself. They were pure OAuth configuration problems. But you can only diagnose them if you understand the handshake well enough to know which step is failing.
Once everything was wired up, I measured actual round-trip times from Claude's perspective:
| Tool | Round-trip |
|---|---|
| `get_status` | 1,131 ms |
| `turn_on` | 3,367 ms |
`get_status` only hits the IoT Device Shadow — no trip to the physical bulb. `turn_on` goes the full path: Lambda → IoT Core → bridge → bulb → confirmation back. Three seconds is noticeable but acceptable for a chat-driven light switch. For anything latency-sensitive, you'd want to think harder about this.

The monthly cost for 1,000 tool calls across all services: $0.07.
A few things I want to clean up:
BaseDevice is an abstract interface, and the DeviceRegistry manages multiple devices. Adding a smart plug or thermostat is mostly a new implementation, not a new architecture.

I started this project to understand MCP better. I ended up learning more about AWS IoT, OAuth flows, and serverless architecture than I expected. That's usually the sign of a good learning project — the stated goal was an excuse to dig into something real.
A few things I'd tell someone starting this from scratch:
Get the local FastMCP version working first. It takes an afternoon and gives you immediate feedback. Only then add the AWS layer.
The OAuth debugging will take longer than the infrastructure. Learn the authorization_code vs client_credentials distinction before you start configuring Cognito — it'll save you hours.
AgentCore Gateway is genuinely easy to set up compared to what I expected. The tool schema verbosity is real, but it's a one-time cost.
IoT Core is the right tool for this specific problem. The NAT traversal alone justifies it.
If you're building something similar or have questions about any of the architectural decisions, I'd love to hear from you in the comments. And if you spot something I got wrong — even better.