
WebRTC Voice & Video Calls in SwiftUI: What We Learned

Van Thuong Dao·May 3, 2026·8 min read

WebRTC on iOS is powerful but poorly documented for Swift. After two weeks of fighting ICE candidates, NAT traversal, and audio session conflicts, here's everything I wish I'd known before starting.

The package

Knotos uses stasel/WebRTC (version 147.0.0) — a pre-compiled WebRTC framework distributed as a Swift Package. Building WebRTC from source takes hours; this package gives you a drop-in binary. Add it in Xcode via File → Add Package Dependencies.

The call state machine

Before writing any WebRTC code, model the state machine. Knotos uses:

enum CallState {
  case idle
  case calling        // outgoing, waiting for answer
  case ringing        // incoming, showing IncomingCallView
  case connecting     // ICE negotiation in progress
  case connected      // media flowing
  case ended(reason: EndReason)
}

enum EndReason {
  case localHangup, remoteHangup, declined, missed, failed
}

Every state transition goes through CallService, which owns WebRTCClient and coordinates with SocketService for signalling.
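Routing every transition through one place also makes it cheap to reject impossible ones. The following is a minimal sketch of that idea, not the Knotos implementation: canTransition(to:) and the table of allowed moves are assumptions for illustration, and the enums are repeated only so the snippet compiles on its own.

```swift
enum EndReason {
    case localHangup, remoteHangup, declined, missed, failed
}

enum CallState {
    case idle, calling, ringing, connecting, connected
    case ended(reason: EndReason)
}

extension CallState {
    /// Returns true if moving from `self` to `next` is allowed.
    /// The allowed set below is an assumption, not taken from the app.
    func canTransition(to next: CallState) -> Bool {
        switch (self, next) {
        case (.idle, .calling), (.idle, .ringing):
            return true   // place an outgoing call or receive an incoming one
        case (.calling, .connecting), (.ringing, .connecting):
            return true   // the call was answered, ICE negotiation starts
        case (.connecting, .connected):
            return true   // media is flowing
        case (.calling, .ended), (.ringing, .ended),
             (.connecting, .ended), (.connected, .ended):
            return true   // any active call can end
        case (.ended, .idle):
            return true   // reset for the next call
        default:
            return false
        }
    }
}
```

A service could check canTransition(to:) before assigning a new state and log anything that returns false, which tends to surface out-of-order signalling events early.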

Signalling over Socket.IO

WebRTC needs a signalling channel to exchange SDP offers/answers and ICE candidates. Knotos uses Socket.IO events for this — no extra infrastructure needed:

// Caller sends offer
socket.emit("call_offer", [
  "toUserId": recipientId,
  "offer": sdp.sdp,
  "callType": "video"
])

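// Callee receives and answers
socket.on("call_offer") { data, _ in
  // Assumes the payload mirrors the dictionary emitted above
  guard let payload = data.first as? [String: Any],
        let offerSdp = payload["offer"] as? String else { return }
  let sdp = RTCSessionDescription(type: .offer, sdp: offerSdp)
  Task {
    // Errors thrown here are dropped; handle them in real code
    await webRTCClient.set(remoteSdp: sdp)
    let answer = try await webRTCClient.answer()
    socket.emit("call_answer", ["answer": answer.sdp])
  }
}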

// ICE candidates trickled in parallel
socket.on("ice_candidate") { data, _ in
  let candidate = RTCIceCandidate(...)
  webRTCClient.set(remoteCandidate: candidate)
}
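Socket.IO hands each event to the handler as loosely typed values, so it can help to decode payloads into structs before touching WebRTC. This is a sketch under assumptions: the field names in IceCandidatePayload and the decodePayload helper are hypothetical and only need to match whatever your server actually relays.

```swift
import Foundation

// Hypothetical payload shapes. "toUserId", "offer" and "callType" come from
// the emit shown earlier; the ICE candidate fields are assumed.
struct CallOfferPayload: Codable, Equatable {
    let toUserId: String
    let offer: String      // raw SDP text
    let callType: String   // e.g. "video"
}

struct IceCandidatePayload: Codable, Equatable {
    let candidate: String      // the candidate line itself
    let sdpMLineIndex: Int32
    let sdpMid: String?
}

/// Converts a JSON-compatible object (such as the first element of a
/// Socket.IO event's data array) into a typed payload.
/// Returns nil when the object is not valid JSON or does not match the type.
func decodePayload<T: Decodable>(_ type: T.Type, from object: Any) -> T? {
    guard JSONSerialization.isValidJSONObject(object),
          let data = try? JSONSerialization.data(withJSONObject: object) else {
        return nil
    }
    return try? JSONDecoder().decode(type, from: data)
}
```

With something like this in place, a handler can bail out early on a malformed event instead of force-casting dictionary values.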

The STUN/TURN problem

On the same LAN, calls work perfectly with just Google's STUN servers. On mobile networks with strict NAT (most carriers), STUN alone fails — ICE can't find a reachable path and the connection times out at the connecting state.

The fix is a TURN server. Knotos runs Coturn in Docker. TURN relays media through the server when peer-to-peer fails — slightly higher latency, but the call connects.

// ICE server config in WebRTCClient
let iceServers = [
  RTCIceServer(urlStrings: ["stun:stun.l.google.com:19302"]),
  RTCIceServer(
    urlStrings: ["turn:\(turnHost):\(turnPort)"],
    username: username,   // HMAC-generated, time-limited
    credential: credential
  )
]
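For context on the "HMAC-generated, time-limited" comment: coturn supports a shared-secret mode (use-auth-secret) where the username is an expiry timestamp plus a user ID and the credential is a base64-encoded HMAC-SHA1 of that username. The sketch below assumes that mode is what is in use; makeTurnCredentials and its parameters are hypothetical names. In practice this usually runs on the server so the shared secret never ships inside the app. It is written in Swift only to match the rest of the post and relies on CryptoKit, so it targets Apple platforms.

```swift
import Foundation
import CryptoKit

/// Builds time-limited TURN credentials in the shared-secret format:
///   username   = "<unix expiry>:<userId>"
///   credential = base64(HMAC-SHA1(sharedSecret, username))
func makeTurnCredentials(
    userId: String,
    sharedSecret: String,
    ttl: TimeInterval = 3600,
    now: Date = Date()
) -> (username: String, credential: String) {
    // The TURN server rejects the credentials once this timestamp has passed.
    let expiry = Int(now.timeIntervalSince1970 + ttl)
    let username = "\(expiry):\(userId)"

    let key = SymmetricKey(data: Data(sharedSecret.utf8))
    let mac = HMAC<Insecure.SHA1>.authenticationCode(
        for: Data(username.utf8),
        using: key
    )
    return (username, Data(mac).base64EncodedString())
}
```

The client would then fetch the resulting pair over an authenticated API call just before starting a call and pass it to RTCIceServer as username and credential.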

Audio session headaches

The most frustrating bug: audio worked in the simulator but was silent on a real device. The issue was AVAudioSession configuration. WebRTC needs the .playAndRecord category with the .voiceChat mode, and both must be set before creating the peer connection:

let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord,
  mode: .voiceChat,
  options: [.allowBluetooth, .allowBluetoothA2DP])
try session.setActive(true)

Always test WebRTC on a physical device over mobile data, not just the simulator on Wi-Fi. That's where the real issues hide.

Video tracks

Adding video on top of audio required three additions: creating an RTCVideoSource and RTCVideoTrack, starting RTCCameraVideoCapturer with the front camera, and exposing localVideoTrack and remoteVideoTrack to the SwiftUI view via RTCMTLVideoView wrapped in a UIViewRepresentable.
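The UIViewRepresentable wrapper is the piece that is easiest to get subtly wrong, because the renderer has to be detached when the track changes or the view goes away. Here is a minimal sketch of one way to do it; VideoTrackView and its Coordinator are hypothetical names, not the Knotos code, and it needs the WebRTC package on an iOS target to compile.

```swift
import SwiftUI
import WebRTC

/// Renders an optional RTCVideoTrack in SwiftUI using the Metal-backed view.
struct VideoTrackView: UIViewRepresentable {
    let track: RTCVideoTrack?

    /// Remembers which track the view is attached to so it can be detached later.
    final class Coordinator {
        var attachedTrack: RTCVideoTrack?
    }

    func makeCoordinator() -> Coordinator { Coordinator() }

    func makeUIView(context: Context) -> RTCMTLVideoView {
        let view = RTCMTLVideoView(frame: .zero)
        view.videoContentMode = .scaleAspectFill
        return view
    }

    func updateUIView(_ view: RTCMTLVideoView, context: Context) {
        let coordinator = context.coordinator
        // Nothing to do if SwiftUI re-renders with the same track.
        guard coordinator.attachedTrack !== track else { return }
        coordinator.attachedTrack?.remove(view)
        track?.add(view)
        coordinator.attachedTrack = track
    }

    static func dismantleUIView(_ view: RTCMTLVideoView, coordinator: Coordinator) {
        // Stop delivering frames to a view that is leaving the hierarchy.
        coordinator.attachedTrack?.remove(view)
        coordinator.attachedTrack = nil
    }
}
```

A call screen could then place VideoTrackView(track: remoteVideoTrack) full-screen and overlay a smaller VideoTrackView(track: localVideoTrack) for the self-preview.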