Socket.IO Explained: Why It Exists, How It Works Internally, and How It Scales to Millions

#backend #javascript #networking #systemdesign

By Munna Thakur

You've probably used or heard of Socket.IO. Most tutorials just show you socket.emit() and socket.on() and call it a day.

But if you actually want to understand what's happening under the hood — why Socket.IO exists, why it's more reliable than raw WebSockets, and how apps like WhatsApp handle millions of live connections — this post is for you.

Let's go through the whole thing from the beginning.


The Problem That Started All of This

Normal web apps work on HTTP. You send a request, the server sends back a response, the connection closes. That's it.

But some apps need the server to push data to the client at any time:

  • Chat apps
  • Live notifications
  • Live sports scores
  • Multiplayer games
  • Real-time dashboards

HTTP alone can't do this well. So developers got creative.


What People Did Before WebSockets

Before WebSockets became standard, real-time was painful. Here's what developers actually used:

HTTP Polling — Client asks the server every few seconds: "Any new data?"

setInterval(() => {
  fetch('/messages').then(res => res.json()).then(console.log)
}, 5000)

Works, but wasteful. 10,000 users = 10,000 requests every 5 seconds hitting your server constantly.

Long Polling — Client sends a request, server holds it open until there's new data, then responds. Client immediately sends another request.

Client ──────→ request
Server ─────── waiting...
Server ──────→ here's your data
Client ──────→ new request

Better than polling, but still HTTP overhead on every cycle. Gmail and early Facebook chat used this.

Server-Sent Events (SSE) — Server pushes data to the client as a stream. It's one-way only, though: the client can't send data back over the same stream.
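An SSE endpoint is just a long-lived HTTP response with a text/event-stream content type. A minimal sketch (the /scores route and the 2-second interval are illustrative):

```javascript
// Server: a minimal SSE stream with Node's built-in http module.
import { createServer } from 'node:http'

createServer((req, res) => {
  if (req.url === '/scores') {
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      Connection: 'keep-alive',
    })
    res.write(': connected\n\n') // comment frame, flushes headers immediately
    // Push an update every 2s; each frame is "data: ...\n\n"
    const timer = setInterval(() => {
      res.write(`data: ${JSON.stringify({ score: Date.now() })}\n\n`)
    }, 2000)
    req.on('close', () => clearInterval(timer))
  }
}).listen(3000)

// Browser side: EventSource reconnects automatically if the stream drops.
// const source = new EventSource('/scores')
// source.onmessage = (e) => console.log(JSON.parse(e.data))
```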

Comet — A collection of hacks from the 2000s (hidden iframes, streaming, long polling combined). Gmail used it. It was messy.

Then in 2011, the WebSocket protocol was standardized (RFC 6455) and changed everything.


WebSocket: The Real Foundation

WebSocket gives you a persistent, two-way connection between browser and server.

Normal HTTP:
Client → request → Server → response → connection closed

WebSocket:
Client ←————————— persistent connection —————————→ Server
       ←— data anytime —→ ←— data anytime —→

One connection stays open. Server can push data whenever it wants. Client can send whenever it wants. No repeated handshakes.

This is what powers WhatsApp Web, Slack, Discord, live trading platforms, and multiplayer games today.
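On the server side, a raw WebSocket endpoint can be sketched with the popular ws package (an npm dependency, not part of Node core; the broadcast-to-everyone logic is just for illustration):

```javascript
// A minimal WebSocket broadcast server using the 'ws' package (npm install ws)
import { WebSocketServer, WebSocket } from 'ws'

const wss = new WebSocketServer({ port: 3000 })

wss.on('connection', (ws) => {
  ws.on('message', (data) => {
    // Push to every connected client, any time — no request/response cycle
    for (const client of wss.clients) {
      if (client.readyState === WebSocket.OPEN) client.send(data.toString())
    }
  })
})
```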


So Why Does Socket.IO Exist?

Raw WebSocket is powerful, but it has rough edges:

  • Some corporate firewalls and proxies block WebSocket connections
  • Older browsers don't support it
  • If the connection drops, you have to handle reconnection yourself
  • There's no built-in concept of rooms, events, or broadcasting

Socket.IO is a library built on top of WebSocket that handles all of this for you.

Raw WebSocket:

const ws = new WebSocket('ws://localhost:3000')
ws.onopen = () => ws.send('Hello') // sending before the connection opens would throw
ws.onmessage = (event) => console.log(event.data)

Socket.IO:

import { io } from 'socket.io-client'
const socket = io('http://localhost:3000')

socket.on('newMessage', (data) => console.log(data))
socket.emit('newMessage', 'Hello')

Socket.IO gives you named events, rooms, automatic reconnection, and fallback transport — all built in.
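The reconnection behavior is configurable on the client. These are real socket.io-client options; the values shown are just illustrative:

```javascript
import { io } from 'socket.io-client'

const socket = io('http://localhost:3000', {
  reconnection: true,          // auto-reconnect on drop (the default)
  reconnectionAttempts: 10,    // give up after 10 tries (default: Infinity)
  reconnectionDelay: 1000,     // start with a 1s gap between attempts...
  reconnectionDelayMax: 5000,  // ...backing off up to a 5s gap
})
```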


How Socket.IO Actually Works Internally

This is the part most developers skip. Socket.IO has an internal engine called Engine.IO, and a connection is established in four phases.

Phase 1 — HTTP Handshake

When you call io('http://localhost:5000'), Socket.IO does NOT open a WebSocket immediately. It starts with a plain HTTP request:

GET /socket.io/?EIO=4&transport=polling

Server responds with session info:

{
  "sid": "abc123",
  "upgrades": ["websocket"],
  "pingInterval": 25000,
  "pingTimeout": 20000
}

A session ID is established. Server confirms WebSocket upgrade is possible.

Phase 2 — Long Polling as Fallback

Before upgrading to WebSocket, Engine.IO starts with HTTP long polling. This ensures the connection works even on networks that block WebSockets (corporate proxies, firewalls).

Client ──→ request
Server ─── hold...
Server ──→ sends data when available
Client ──→ immediately sends next request

Phase 3 — Transport Upgrade

Once the polling works, Socket.IO tries to upgrade to WebSocket:

GET /socket.io/?EIO=4&transport=websocket&sid=abc123

Server responds: HTTP 101 Switching Protocols

Polling stops. WebSocket takes over. You now have a persistent connection.
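If you control the infrastructure and know nothing on the path blocks WebSocket, you can skip the polling phase entirely with the client's transports option (a real socket.io-client option):

```javascript
import { io } from 'socket.io-client'

// Connect over WebSocket directly, skipping the long-polling phase.
// Only do this if no proxy/firewall on the path blocks WebSocket —
// otherwise you lose the fallback that makes Socket.IO reliable.
const socket = io('http://localhost:3000', { transports: ['websocket'] })
```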

Phase 4 — Real-Time Communication

Now socket.emit() and socket.on() work over WebSocket frames.

// Client
socket.emit('newMessage', 'Hello everyone')

// Server
io.on('connection', (socket) => {
  socket.on('newMessage', (msg) => {
    io.emit('newMessage', msg) // broadcast to all
  })
})

The full flow looks like this:

io() called
    │
HTTP Handshake (get session ID)
    │
Long Polling starts (fallback)
    │
WebSocket upgrade attempt
    │
WebSocket connected
    │
Real-time events ↔

Heartbeat (Ping/Pong)

Socket.IO keeps the connection alive using a heartbeat:

Server ──→ ping
Client ──→ pong

If the pong doesn't come back within pingTimeout, the connection is marked dead and auto-reconnect kicks in.
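You can observe this lifecycle from the client. These are real socket.io-client events; disconnect fires on the socket itself, while the reconnection events fire on the underlying manager (socket.io):

```javascript
import { io } from 'socket.io-client'

const socket = io('http://localhost:3000')

socket.on('disconnect', (reason) => {
  // reason is e.g. 'ping timeout' when the heartbeat fails
  console.log('disconnected:', reason)
})

// Reconnection events live on the manager, not the socket
socket.io.on('reconnect_attempt', (attempt) => console.log('retry #', attempt))
socket.io.on('reconnect', () => console.log('back online'))
```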


Socket.IO in React

Install:

npm install socket.io-client

Basic usage:

import { useEffect, useState } from 'react'
import { io } from 'socket.io-client'

const socket = io('http://localhost:5000')

function Chat() {
  const [messages, setMessages] = useState([])

  useEffect(() => {
    socket.on('newMessage', (msg) => {
      setMessages((prev) => [...prev, msg])
    })

    return () => socket.off('newMessage')
  }, [])

  const sendMessage = () => {
    socket.emit('newMessage', 'Hello!')
  }

  return (
    <div>
      {messages.map((msg, i) => <p key={i}>{msg}</p>)}
      <button onClick={sendMessage}>Send</button>
    </div>
  )
}

Rooms (Important for Group Chats)

Rooms let you target specific groups of users:

// Server
socket.join('cricket-room')
io.to('cricket-room').emit('scoreUpdate', { score: '45/2' })

Only users in cricket-room receive that event. This is how group chats, channels, and game lobbies work.
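Since socket.join() exists only on the server, clients typically ask to join via an event. A sketch (the joinRoom/leaveRoom event names are illustrative, not a Socket.IO convention):

```javascript
import { Server } from 'socket.io'

const io = new Server(3000)

io.on('connection', (socket) => {
  // Clients can't join rooms directly — they ask the server via an event
  socket.on('joinRoom', (room) => {
    socket.join(room)
    // Notify the room, excluding the sender
    socket.to(room).emit('userJoined', socket.id)
  })

  socket.on('leaveRoom', (room) => socket.leave(room))
})

// Client side:
// socket.emit('joinRoom', 'cricket-room')
```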


The Scaling Problem (This is Where It Gets Interesting)

Socket.IO on a single server works great. But production apps have multiple servers behind a load balancer.

Here's the problem:

           Load Balancer
                │
      ┌─────────┼─────────┐
      │         │         │
   Server1   Server2   Server3

User A → Server1
User B → Server3

User A sends a message to "cricket-room". Server1 knows who's in that room on its own connections. But it has no idea about User B sitting on Server3.

Result: User B never gets the message. ❌


The Fix: Redis Adapter

Redis acts as a message broker between all your Socket.IO servers.

           Load Balancer
                │
      ┌─────────┼─────────┐
      │         │         │
   Server1   Server2   Server3
      │         │         │
      └──────── Redis ────┘

When Server1 emits an event, it publishes to Redis. Redis broadcasts it to all other servers. Each server delivers the message to its own connected clients.

Setup:

npm install @socket.io/redis-adapter redis

import { createClient } from 'redis'
import { Server } from 'socket.io'
import { createAdapter } from '@socket.io/redis-adapter'

const io = new Server(3000)

const pubClient = createClient({ url: 'redis://localhost:6379' })
const subClient = pubClient.duplicate()

await Promise.all([pubClient.connect(), subClient.connect()])

io.adapter(createAdapter(pubClient, subClient))

That's it. Now all your Socket.IO servers relay events to each other through Redis pub/sub, so rooms and broadcasts work across the whole cluster.

Message flow with Redis:

User A → Server1
Server1 → publish to Redis
Redis → broadcast to Server2, Server3
Server2 → deliver to User B
Server3 → deliver to User C

Everyone gets the message regardless of which server they're on.


How Apps Like WhatsApp Handle Millions of Connections

This is the big question. One server can handle maybe 50k–100k WebSocket connections. How do you get to millions?

1. Horizontal Scaling

Simple math: 100 servers × 50k connections = 5 million connections.

2. Load Balancer Routes Traffic

Users
  │
Load Balancer (Nginx / AWS ELB)
  │
WebSocket Server Pool

3. Stateless Servers + Message Broker

Servers don't store state. They just receive and forward. Redis or Kafka sits in the middle.

User A → Server1 → Kafka → Server7 → User B
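That hand-off can be sketched with the kafkajs client. This is an assumption on my part (the pattern doesn't prescribe a specific Kafka library); the topic, group ID, and broker address are illustrative, and io stands for each server's local Socket.IO instance:

```javascript
// Sketch of the stateless-forwarding pattern using the 'kafkajs' client
// (npm install kafkajs). Assumes a broker at localhost:9092 and a local
// Socket.IO instance named `io` on each server.
import { Kafka } from 'kafkajs'

const kafka = new Kafka({ brokers: ['localhost:9092'] })
const producer = kafka.producer()
const consumer = kafka.consumer({ groupId: 'ws-server-7' })

await producer.connect()
await consumer.connect()
await consumer.subscribe({ topic: 'chat-messages' })

// Server1: a user's message goes straight onto the bus
async function onUserMessage(msg) {
  await producer.send({
    topic: 'chat-messages',
    messages: [{ value: JSON.stringify(msg) }],
  })
}

// Server7: consume the bus and deliver to locally connected sockets
await consumer.run({
  eachMessage: async ({ message }) => {
    const msg = JSON.parse(message.value.toString())
    io.to(msg.room).emit('newMessage', msg)
  },
})
```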

4. Event Loop Servers (Node.js / Erlang / Go)

Traditional servers use one thread per connection. That doesn't scale.

Node.js uses a single-threaded event loop with non-blocking I/O. One process can handle tens of thousands of concurrent connections.

WhatsApp famously used Erlang — a language designed for millions of lightweight concurrent processes. It's literally built for this.

5. Edge / Regional Clusters

Users connect to the nearest data center. India users hit Asia servers. US users hit US servers. Latency drops, load distributes.

6. Idle Connection Optimization

Most WebSocket connections are idle most of the time. The server is just sending tiny ping/pong packets (10–20 bytes). Millions of idle connections are actually manageable if your server is event-driven.
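The heartbeat traffic itself is tunable on the server. pingInterval and pingTimeout are real Socket.IO options (the values below are the v4 defaults, matching the handshake response shown earlier); stretching them means fewer packets across a huge idle fleet, at the cost of slower dead-connection detection:

```javascript
import { Server } from 'socket.io'

const io = new Server(3000, {
  pingInterval: 25000, // send a ping every 25s (v4 default)
  pingTimeout: 20000,  // drop the connection if no pong within 20s (v4 default)
})
```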

Full production architecture:

Users
  │
Global Load Balancer
  │
Regional Load Balancers
  │
WebSocket Server Pool (stateless)
  │
Redis / Kafka (message bus)
  │
Database

Quick Comparison: What to Use When

| Scenario | Use This |
| --- | --- |
| Simple real-time, reliability matters | Socket.IO |
| You need full control, minimal overhead | Raw WebSocket |
| One-way server → client stream (feeds, logs) | Server-Sent Events |
| Multiple servers in production | Socket.IO + Redis Adapter |
| Millions of connections, serious scale | Kafka + stateless servers |

Summary

Here's the full picture in one place:

  • Before WebSockets: Polling, long polling, Comet hacks — all workarounds
  • WebSocket: Real persistent two-way connection, changed everything in 2011
  • Socket.IO: Library on top of WebSocket with events, rooms, reconnection, and fallback transport
  • Internally: HTTP handshake → polling → upgrade → WebSocket → heartbeat
  • Scaling problem: Multiple servers can't share room state by default
  • Redis adapter: Acts as pub/sub bus between servers, solves the problem cleanly
  • Millions of connections: Horizontal scaling + stateless servers + message broker + edge infrastructure

If this helped, drop a reaction. If you've hit a Socket.IO issue in production that wasn't covered here, drop it in the comments.