pramodmisra

Insurance claims are broken. The average property claim takes 25 minutes on the phone, costs insurers $40 to process, and the industry loses $80 billion annually to fraud — now supercharged by AI-generated deepfakes. I built ClaimSight to fix this: a multi-agent AI system that handles claims from first contact to submission in one live conversation using voice, vision, and real-time fraud detection.
Here's how I built it with Google AI models and Google Cloud.
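ClaimSight uses Google ADK (Agent Development Kit) to orchestrate 6 specialized agents, each powered by Gemini 2.5 Flash: a triage agent and its five sub-agents (maya_property, alex_auto, jordan_liability, fraud_sentinel, and weather_verifier).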
ADK made multi-agent orchestration surprisingly straightforward. Each agent is defined with its own system instructions, tools, and personality. The ADK runner handles agent-to-agent transfers, tool execution routing, and session state — I just had to define the graph.
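# Agent definition with ADK
from google.adk.agents import Agent

# TRIAGE_INSTRUCTIONS, policy_lookup, and the five sub-agents are defined elsewhere in the project.
triage_agent = Agent(
    name="claimsight_triage",
    model="gemini-2.5-flash",
    instruction=TRIAGE_INSTRUCTIONS,
    tools=[policy_lookup],
    sub_agents=[maya_property, alex_auto, jordan_liability, fraud_sentinel, weather_verifier],
)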
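ADK accepts plain Python functions as tools and uses the signature and docstring to describe them to the model. As a rough sketch of what a tool like policy_lookup could look like (the in-memory store, the sample policy number, and the return shape are all hypothetical; the post does not show the real implementation):

```python
from typing import Any

# Hypothetical in-memory policy store; a real deployment would query a database.
_POLICIES: dict[str, dict[str, Any]] = {
    "HO-1001": {"holder": "Sample Holder", "line": "property", "active": True},
}


def policy_lookup(policy_number: str) -> dict[str, Any]:
    """Look up a policy by number and report whether it was found.

    Args:
        policy_number: The policy identifier supplied by the claimant.

    Returns:
        A dict with "status" ("found" or "not_found") and, when found,
        the policy record under "policy".
    """
    normalized = policy_number.strip().upper()
    record = _POLICIES.get(normalized)
    if record is None:
        return {"status": "not_found", "policy_number": normalized}
    return {"status": "found", "policy_number": normalized, "policy": record}
```

Returning a status field instead of raising lets the agent explain a failed lookup to the user in conversation.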
ClaimSight isn't a chatbot with extras — it's a fundamentally different interaction paradigm.
The frontend captures frames from the user's camera at 1 FPS and sends them to the backend via WebSocket. When an agent needs to document damage, it calls photo_capture which grabs the latest frame — complete with a camera shutter flash effect. The captured photo is then sent to Gemini's image generation for AI damage annotation with severity-coded markers.
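The "grab the latest frame" step can be sketched on the backend as a small buffer that keeps only the newest frame per session. This is an illustration under assumptions, not the project's code: the LatestFrameBuffer class, the Frame fields, and the 3-second staleness cutoff are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    jpeg_bytes: bytes
    received_at: float  # seconds, from a monotonic clock


class LatestFrameBuffer:
    """Keeps only the most recent camera frame for one session."""

    def __init__(self, max_age_seconds: float = 3.0) -> None:
        self._max_age = max_age_seconds
        self._frame: Optional[Frame] = None

    def push(self, jpeg_bytes: bytes, now: Optional[float] = None) -> None:
        """Store a frame received over the WebSocket, replacing the previous one."""
        timestamp = time.monotonic() if now is None else now
        self._frame = Frame(jpeg_bytes, timestamp)

    def capture(self, now: Optional[float] = None) -> Optional[Frame]:
        """Return the latest frame, or None if there is none or it is stale."""
        if self._frame is None:
            return None
        current = time.monotonic() if now is None else now
        if current - self._frame.received_at > self._max_age:
            return None
        return self._frame
```

At 1 FPS the buffer never holds more than one frame, and the staleness check keeps a photo_capture-style tool from documenting damage with an image from before the camera was moved or turned off.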
The browser's Speech Recognition API provides continuous speech-to-text with interim results. Users see their words appear in real time as they speak and can switch between voice and text input at any point.
Each agent has its own natural voice via Google Cloud Text-to-Speech, using Journey and Neural2 voice models. Maya sounds different from Alex, who sounds different from Jordan. Text is split into sentences and spoken progressively, and the chat text is revealed in sync with the speech.
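The sentence-by-sentence reveal can be sketched as follows. The regex-based splitter and the function names are assumptions made for illustration; a naive splitter like this one will also break on abbreviations such as "Dr.", so real text may need something more careful.

```python
import re
from typing import Iterator

# Split after ., ! or ? when followed by whitespace (naive; see note above).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Break a reply into sentences suitable for one TTS request each."""
    return [s.strip() for s in _SENTENCE_END.split(text.strip()) if s.strip()]


def progressive_reveal(text: str) -> Iterator[tuple[str, str]]:
    """Yield (sentence_to_speak, chat_text_revealed_so_far) pairs."""
    revealed: list[str] = []
    for sentence in split_sentences(text):
        revealed.append(sentence)
        yield sentence, " ".join(revealed)


for sentence, shown in progressive_reveal("Hi there. How are you? Fine!"):
    print(f"speak: {sentence!r:18} show: {shown!r}")
```

Each pair gives the caller one sentence to synthesize and the chat text that should be visible once that sentence starts playing.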
The hardest part was echo prevention: the agent's TTS voice would get picked up by the user's microphone and fed back as input. I solved this by pausing speech recognition during TTS playback and resuming with a delay after playback ends.
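The pause-and-resume logic is small enough to show as a state machine. The real version lives in the browser next to the speech recognition code; the Python below only illustrates the gating rule, and the EchoGate name and the 0.5-second resume delay are hypothetical.

```python
class EchoGate:
    """Decides whether microphone input should be accepted right now."""

    def __init__(self, resume_delay_seconds: float = 0.5) -> None:
        self._resume_delay = resume_delay_seconds
        self._tts_active = False
        self._resume_at = 0.0

    def tts_started(self) -> None:
        """Call when agent audio begins playing: stop listening."""
        self._tts_active = True

    def tts_ended(self, now: float) -> None:
        """Call when playback ends: resume only after a short delay."""
        self._tts_active = False
        self._resume_at = now + self._resume_delay

    def accepts_input(self, now: float) -> bool:
        """True when recognition results should be treated as user speech."""
        return not self._tts_active and now >= self._resume_at
```

The delay after playback covers the tail of the audio that is still reaching the microphone when the playback-ended event fires.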
This is the innovation I'm most proud of. The Fraud Sentinel runs 11 tools in parallel behind every claim:
Layer 1 — Visual Analysis: Detect AI-generated and manipulated damage photos. Checks for GAN artifacts, inconsistent lighting, and synthetic patterns using detect_ai_generated_image.
Layer 2 — Content Provenance: Verify image origin through C2PA content credentials, check for stripped or forged metadata, and analyze narrative consistency across the claimant's statement.
Layer 3 — Financial Verification: Cross-reference claim details with financial records through Plaid API integration. Detect inflated claims, staged losses, and suspicious patterns.
All three layers feed into a unified calculate_fraud_risk_score that the system uses for triage decisions.
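One way to picture the fan-out and the unified score is to run the layer checks concurrently and combine the results with a weighted average. Everything in this sketch is an assumption: the stub layer functions, their fixed placeholder scores, and the weights are invented for illustration, and only the name calculate_fraud_risk_score comes from the project.

```python
import asyncio
from typing import Awaitable, Callable

Claim = dict
LayerCheck = Callable[[Claim], Awaitable[float]]


# Placeholder layer checks: each returns a risk score in [0, 1].
# The fixed 0.5 values are stand-ins, not real detector output.
async def visual_analysis(claim: Claim) -> float:
    await asyncio.sleep(0)  # stands in for a model or API call
    return 0.5


async def content_provenance(claim: Claim) -> float:
    await asyncio.sleep(0)
    return 0.5


async def financial_verification(claim: Claim) -> float:
    await asyncio.sleep(0)
    return 0.5


LAYERS: dict[str, LayerCheck] = {
    "visual": visual_analysis,
    "provenance": content_provenance,
    "financial": financial_verification,
}

# Hypothetical weights; they sum to 1 so the result stays in [0, 1].
LAYER_WEIGHTS = {"visual": 0.4, "provenance": 0.3, "financial": 0.3}


def calculate_fraud_risk_score(layer_scores: dict[str, float]) -> float:
    """Weighted average of per-layer scores, each clamped to [0, 1]."""
    total = 0.0
    for name, weight in LAYER_WEIGHTS.items():
        score = min(max(layer_scores.get(name, 0.0), 0.0), 1.0)
        total += weight * score
    return round(total, 3)


async def assess_claim(claim: Claim) -> float:
    """Run every layer concurrently, then combine into one score."""
    names = list(LAYERS)
    scores = await asyncio.gather(*(LAYERS[name](claim) for name in names))
    return calculate_fraud_risk_score(dict(zip(names, scores)))
```

Because asyncio.gather waits on all checks at once, total latency is roughly that of the slowest layer instead of the sum of all three.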
The backend is a Python FastAPI server with a persistent WebSocket connection per session. Every message flows through a typed protocol:
- transcript: Chat messages (user and agent)
- tool_call / tool_result: Real-time tool execution visibility
- image: AI-generated images (annotations, visualizations, infographics)
- agent_transfer: Agent handoff animations
- thinking: Processing indicators

The frontend Agent Brain panel shows every tool call as it happens, with a 9-step progress tracker that fills in as the claim advances. Users can watch the AI think in real time, which makes it both a demo feature and a trust mechanism.
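A typed WebSocket protocol like this one is often modeled as a small envelope with a validated type field. The sketch below uses the message types named in the post, but the envelope shape (a "type" plus a free-form "payload") and the helper names are assumptions, not the project's actual wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any

# Message types named in the post.
MESSAGE_TYPES = {
    "transcript",
    "tool_call",
    "tool_result",
    "image",
    "agent_transfer",
    "thinking",
}


@dataclass
class Message:
    type: str
    payload: dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject unknown types early so bad messages never reach the UI.
        if self.type not in MESSAGE_TYPES:
            raise ValueError(f"unknown message type: {self.type!r}")

    def to_json(self) -> str:
        """Serialize for sending over the WebSocket."""
        return json.dumps(asdict(self))


def parse_message(raw: str) -> Message:
    """Parse and validate an incoming JSON message."""
    data = json.loads(raw)
    return Message(type=data["type"], payload=data.get("payload", {}))
```

Validating the type on both send and receive means the frontend can switch on a closed set of message kinds without defensive checks in every handler.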
The entire system runs on Google Cloud Run with a multi-stage Docker build.
Infrastructure is managed with Terraform — Cloud Run, Firestore, Artifact Registry, and Cloud Storage are all defined as code and reproducible from the repo.
gcloud run deploy claimsight --source . --region us-central1
Google Cloud services used:
I didn't just build a demo. I also built 13 comprehensive test scenarios covering the most complex claim disputes in US insurance.
Each scenario tests different agent capabilities, tool combinations, and edge cases.
Built for the Gemini Live Agent Challenge. #GeminiLiveAgentChallenge