Real-Time at Scale: How Knotos Delivers Messages Instantly

Van Thuong Dao · May 3, 2026 · 7 min read

Real-time messaging sounds simple until you have to actually build it. In this post I'll walk through the message delivery pipeline in Knotos — from the moment you tap send to when the other person sees the read receipt.

The architecture in one sentence

A NestJS API server writes messages to MongoDB and publishes events to Redis; a Socket.IO gateway subscribes to Redis and fans those events out to connected clients.

Why separate the API from the gateway?

The first temptation is to handle everything in one Socket.IO server. It works — until you need to scale horizontally. Knotos separates concerns from day one:

  • REST API handles durable operations: write message to DB, update last message preview on the Chat document.
  • Redis pub/sub decouples the write path from delivery. Any number of gateway instances can subscribe to the same channel.
  • The Socket.IO gateway is stateless except for in-memory socket connections.
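To make the decoupling concrete, here is a minimal sketch that swaps Redis for an in-memory bus. `InMemoryBus` and `FakeGateway` are hypothetical stand-ins invented for illustration, not Knotos code or the Redis client; the point is only that the publisher sends once and every subscribed gateway receives its own copy.

```typescript
// In-memory stand-in for Redis pub/sub. Hypothetical: it only illustrates the
// fan-out behaviour and is not the real Redis client.
type BusListener = (payload: string) => void;

class InMemoryBus {
  private listeners = new Map<string, BusListener[]>();

  subscribe(channel: string, listener: BusListener): void {
    const existing = this.listeners.get(channel) ?? [];
    this.listeners.set(channel, [...existing, listener]);
  }

  // Delivers the payload to every subscriber and returns how many received it.
  publish(channel: string, payload: string): number {
    const subscribers = this.listeners.get(channel) ?? [];
    subscribers.forEach((listener) => listener(payload));
    return subscribers.length;
  }
}

// Hypothetical gateway instance: its only state is what it holds in memory.
class FakeGateway {
  readonly delivered: string[] = [];

  constructor(bus: InMemoryBus, channel: string) {
    bus.subscribe(channel, (payload) => {
      this.delivered.push(payload);
    });
  }
}

const demoBus = new InMemoryBus();
const demoChannel = 'channel:chat:42';
const gatewayA = new FakeGateway(demoBus, demoChannel);
const gatewayB = new FakeGateway(demoBus, demoChannel);

// The write path publishes once; it does not know how many gateways exist.
const receivers = demoBus.publish(
  demoChannel,
  JSON.stringify({ event: 'new_message', data: { content: 'Hey!' } }),
);
```

Adding a third `FakeGateway` changes nothing on the publishing side, which is the property the real Redis channel is meant to provide.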

Sending a message: step by step

// 1. Client emits over Socket.IO
socket.emit('send_message', {
  chatId: '...',
  content: 'Hey!',
  type: 'text'
});

// 2. Gateway validates + calls MessagesService
@SubscribeMessage('send_message')
async handleSendMessage(client: Socket, dto: SendMessageDto) {
  // Assumption: an auth step stored the user's id on the socket at connect time
  const userId = client.data.userId;
  const message = await this.messagesService.send(userId, dto);
  // publish to the chat's Redis channel
  await this.redis.publish(
    `channel:chat:${dto.chatId}`,
    JSON.stringify({ event: 'new_message', data: message })
  );
}

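// 3. Redis subscriber fans out to all members
// node-redis needs a dedicated connection for subscriptions;
// `subscriber` is assumed here to be a duplicate of the publishing client.
this.subscriber.pSubscribe('channel:chat:*', (raw, channel) => {
  const chatId = channel.split(':')[2];
  // unwrap the { event, data } envelope published in step 2
  const { event, data } = JSON.parse(raw);
  this.server.to(`chat:${chatId}`).emit(event, data);
});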
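Steps 2 and 3 only work if the publisher and the subscriber agree on two things: the channel name and the shape of the JSON payload. The sketch below pulls both into small helpers so they cannot drift apart. The helper names (`channelForChat`, `chatIdFromChannel`, `decodeEnvelope`) and the `Envelope` type are hypothetical, introduced here for illustration; only the `channel:chat:<chatId>` format and the `{ event, data }` shape come from the snippets above.

```typescript
// Shape of the JSON payload published in step 2.
interface Envelope {
  event: string;
  data: unknown;
}

const CHANNEL_PREFIX = 'channel:chat:';

// Builds the channel name the publisher writes to.
function channelForChat(chatId: string): string {
  return `${CHANNEL_PREFIX}${chatId}`;
}

// Recovers the chat id on the subscriber side; returns null for any channel
// that does not follow the chat naming scheme.
function chatIdFromChannel(channel: string): string | null {
  if (!channel.startsWith(CHANNEL_PREFIX)) return null;
  const chatId = channel.slice(CHANNEL_PREFIX.length);
  return chatId.length > 0 ? chatId : null;
}

// Parses a raw pub/sub payload; returns null instead of throwing so one
// malformed message cannot crash the subscriber callback.
function decodeEnvelope(raw: string): Envelope | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const candidate = parsed as { event?: unknown; data?: unknown };
  if (typeof candidate.event !== 'string' || !('data' in candidate)) return null;
  return { event: candidate.event, data: candidate.data };
}
```

With helpers like these, the subscriber callback can skip any message where either function returns null and emit `envelope.event` with `envelope.data` to the matching room otherwise.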

The Chat document denormalisation trick

Fetching the full last message for every chat in the list would require joining messages to chats on every load. Instead, Knotos denormalises a lastMessagePreview object directly onto the Chat document whenever a message is sent:

// MessagesService.send()
await this.chatModel.findByIdAndUpdate(chatId, {
  lastMessagePreview: {
    content: message.content,
    type: message.type,
    senderId: message.senderId,
    createdAt: message.createdAt,
  },
  updatedAt: new Date(),
});

The chat list reads the preview straight from the Chat collection: no join against the messages collection and no N+1 queries, so the list loads quickly.

Per-user state without polluting the Chat document

Archive, mute, pin, and unread count are per-user, not per-chat. If they lived on the Chat document, User A's pin would show up as pinned for User B too. Knotos stores this in a separate UserChatState collection, keyed by (userId, chatId). The chat list query joins these two collections for the requesting user only.
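The per-user join can be sketched in memory as follows. The field names on `Chat` and `UserChatState`, the default values for a missing state row, and the `buildChatList` function are all assumptions made for illustration; the source only says that state is keyed by (userId, chatId) and joined for the requesting user.

```typescript
interface Chat {
  id: string;
  title: string; // hypothetical field, for readability only
}

interface UserChatState {
  userId: string;
  chatId: string;
  pinned: boolean;
  muted: boolean;
  archived: boolean;
  unreadCount: number;
}

interface ChatListItem extends Chat {
  pinned: boolean;
  muted: boolean;
  archived: boolean;
  unreadCount: number;
}

// Joins chats with the requesting user's state rows only. Chats without a
// state row fall back to defaults (an assumption, not confirmed by the source).
function buildChatList(
  userId: string,
  chats: Chat[],
  states: UserChatState[],
): ChatListItem[] {
  const stateByChatId = new Map<string, UserChatState>();
  states
    .filter((state) => state.userId === userId)
    .forEach((state) => stateByChatId.set(state.chatId, state));

  return chats.map((chat) => {
    const state = stateByChatId.get(chat.id);
    return {
      ...chat,
      pinned: state?.pinned ?? false,
      muted: state?.muted ?? false,
      archived: state?.archived ?? false,
      unreadCount: state?.unreadCount ?? 0,
    };
  });
}

// User A has pinned the chat; User B has no state row for it at all.
const sampleChats: Chat[] = [{ id: 'c1', title: 'Weekend plans' }];
const sampleStates: UserChatState[] = [
  { userId: 'userA', chatId: 'c1', pinned: true, muted: false, archived: false, unreadCount: 2 },
];

const listForA = buildChatList('userA', sampleChats, sampleStates);
const listForB = buildChatList('userB', sampleChats, sampleStates);
```

Here `listForA[0].pinned` is true while `listForB[0].pinned` is false, even though both users see the same Chat document.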

The Redis pub/sub pattern means adding a second API server instance requires zero code changes — just spin up another container pointing at the same Redis instance.

Typing indicators and presence

Typing events are ephemeral and never touch MongoDB. The client emits typing_start and typing_stop; the gateway publishes them directly to the Redis channel, and connected clients receive them instantly. Typing state expires after 3 seconds, so if a client disconnects mid-type, the indicator clears automatically.
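Redis pub/sub messages carry no expiry of their own, so the 3-second cleanup has to live somewhere else, such as a Redis key with an expiry or a timestamp check. The sketch below shows the timestamp variant in memory. `TypingTracker` is a hypothetical class, not Knotos code, and it takes the current time as a parameter so the expiry logic can be exercised without real timers.

```typescript
const TYPING_TTL_MS = 3000;

// Tracks who is typing in each chat and treats entries older than the TTL
// as stale, which covers clients that disconnect without sending typing_stop.
class TypingTracker {
  // chatId -> (userId -> timestamp of the last typing_start, in ms)
  private lastSeen = new Map<string, Map<string, number>>();
  private ttlMs: number;

  constructor(ttlMs: number = TYPING_TTL_MS) {
    this.ttlMs = ttlMs;
  }

  // Call on typing_start; repeated calls refresh the timestamp.
  start(chatId: string, userId: string, now: number): void {
    const users = this.lastSeen.get(chatId) ?? new Map<string, number>();
    users.set(userId, now);
    this.lastSeen.set(chatId, users);
  }

  // Call on typing_stop.
  stop(chatId: string, userId: string): void {
    this.lastSeen.get(chatId)?.delete(userId);
  }

  // Returns the users still typing at `now`, dropping expired entries lazily.
  typingUsers(chatId: string, now: number): string[] {
    const users = this.lastSeen.get(chatId);
    if (!users) return [];
    const active: string[] = [];
    users.forEach((seenAt, userId) => {
      if (now - seenAt < this.ttlMs) {
        active.push(userId);
      } else {
        users.delete(userId);
      }
    });
    return active;
  }
}
```

A client that keeps typing would re-send typing_start before the window closes; one that vanishes simply ages out the next time the list is read.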