Audio codecs negotiation and support

Learn how audio codecs are negotiated during SIP calls and which codecs LiveKit supports.

Overview

An audio codec defines how voice audio is compressed, encoded, transmitted, and decoded during a call. In SIP calls, codecs determine the audio quality, bandwidth usage, and compatibility between endpoints.

Different codecs trade off quality, latency, and network efficiency. For example, uncompressed or lightly compressed codecs offer higher quality but use more bandwidth, while highly compressed codecs conserve bandwidth at the cost of audio fidelity.

Choosing the right codec matters because both sides must support the same codec to exchange audio. If the two sides can't negotiate a common codec, the call might connect, but no audio flows between the endpoints unless an external service transcodes it.

SDP offer and answer

As part of the SIP handshake, codecs are negotiated by exchanging Session Description Protocol (SDP) messages. SDP is a text-based format that describes media capabilities, including codecs, ports, and other associated properties, so that endpoints can negotiate how they communicate.

SDP offer

The SDP offer lists all audio codecs the caller supports, in order of preference. The caller frequently includes an SDP offer in the initial INVITE. This message lets the callee know the codecs the caller supports and allows the callee to select the best codec to use for the call.

Early offer

There are multiple types of SDP offers, including early offer, delayed offer, re-INVITE, UPDATE, and more. The steps outlined in this guide are for an early offer where the SDP offer is sent in the initial INVITE. LiveKit only supports early offers.

SDP offer example

The following example is a simplified SDP offer in an INVITE:

m=audio 49170 RTP/AVP 0 8 96
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:96 opus/48000/2

What it means:

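  • The caller is offering PCMU (G.711 µ-law), PCMA (G.711 A-law), and Opus.
  • The numbers on the m=audio line (0, 8, 96) are RTP payload types, and each a=rtpmap line maps one of them to a codec name and clock rate.
  • The order of the payload types indicates preference, so PCMU is the caller's first choice.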
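
To make the structure of the offer concrete, the following minimal Python sketch reads the m=audio and a=rtpmap lines from the example above and returns the codecs in preference order. The function name `parse_audio_codecs` is hypothetical and used only for illustration. It is not part of any LiveKit API, and it handles only the simplified SDP shown here, not a full SDP body.

```python
def parse_audio_codecs(sdp: str) -> list[tuple[int, str]]:
    """Return (payload_type, encoding) pairs in the order listed on the m=audio line.

    Hypothetical helper for illustration; handles only m=audio and a=rtpmap lines.
    """
    payload_types: list[int] = []
    rtpmap: dict[int, str] = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m=audio "):
            # Layout: m=audio <port> <transport> <payload type> ...
            payload_types = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            # Layout: a=rtpmap:<payload type> <encoding>/<clock rate>[/<channels>]
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            rtpmap[int(pt)] = encoding
    # Preference order comes from the m= line, not from the order of rtpmap lines.
    return [(pt, rtpmap.get(pt, "unknown")) for pt in payload_types]


OFFER = """\
m=audio 49170 RTP/AVP 0 8 96
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:96 opus/48000/2
"""

codecs = parse_audio_codecs(OFFER)
for payload_type, encoding in codecs:
    print(payload_type, encoding)
```

The first entry in the returned list is the caller's most preferred codec.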

SDP answer

The callee selects one or more codecs from the offer list, in order of preference, and returns an SDP answer in the 200 OK response. This message lets the caller know the codec the callee selected and allows the caller to confirm the codec selection.

The following is an example SDP answer in a 200 OK response:

m=audio 53000 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000

What it means:

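  • The callee accepted PCMU and PCMA from the offer and did not accept Opus.
  • PCMU is listed first in the answer, so PCMU/8000 is the codec chosen for the call.
  • Both sides now send and receive audio using the PCMU/8000 codec.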
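
The selection step can be sketched as a simple intersection between the offered codecs and the codecs the callee supports. This is a minimal illustration under stated assumptions, not LiveKit's actual selection logic: the function name `select_codecs` and the `CALLEE_SUPPORTED` set are hypothetical, and the sketch assumes the callee keeps the caller's order of preference.

```python
def select_codecs(offered: list[str], supported: set[str]) -> list[str]:
    """Return the offered codecs that the callee also supports, in offer order.

    Hypothetical helper for illustration. Names are compared case-insensitively.
    """
    supported_lower = {name.lower() for name in supported}
    return [codec for codec in offered if codec.lower() in supported_lower]


# Codecs from the example offer, in the caller's order of preference.
OFFERED = ["PCMU/8000", "PCMA/8000", "opus/48000/2"]

# Assumed callee capabilities, chosen so the result matches the example answer.
CALLEE_SUPPORTED = {"PCMU/8000", "PCMA/8000", "G722/8000"}

answer = select_codecs(OFFERED, CALLEE_SUPPORTED)
if answer:
    print("Answer codecs:", answer, "preferred:", answer[0])
else:
    # No common codec: audio cannot flow without transcoding.
    print("No common codec")
```

An empty result corresponds to the failure case described in the overview, where the two sides have no codec in common.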

When does media start?

Codec negotiation completes when the caller sends an ACK for the 200 OK. RTP audio typically starts flowing immediately after the ACK is received. For handshake details, see Caller acknowledges the final response.

Early media

If early media is established, RTP audio starts flowing immediately after the 183 Session Progress response is received. To learn more, see Alerting / early media in the SIP handshake topic.

Supported audio codecs

LiveKit supports the audio codecs in the following sections.

PCMU (G.711 µ-law)

A lightly compressed, narrowband codec that samples audio at 8 kHz. It offers high compatibility and low latency but uses more bandwidth than heavily compressed codecs (about 64 kbps plus packet overhead). PCMU is widely used in North America and is a baseline codec for PSTN interoperability.
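
As a rough worked example of the overhead, the following sketch estimates the on-the-wire bitrate of a 64 kbps stream. The numbers are assumptions for illustration, not LiveKit settings: a 20 ms packetization interval and 40 bytes of headers per packet (20 bytes IPv4, 8 bytes UDP, 12 bytes RTP), with no link-layer framing or encryption overhead counted. The function name `rtp_bandwidth_kbps` is hypothetical.

```python
def rtp_bandwidth_kbps(
    codec_bitrate_bps: int,
    ptime_ms: int = 20,      # assumed packetization interval
    header_bytes: int = 40,  # assumed IPv4 (20) + UDP (8) + RTP (12)
) -> float:
    """Estimate one-way bandwidth in kbps, including per-packet header overhead."""
    packets_per_second = 1000 / ptime_ms
    payload_bytes = codec_bitrate_bps / 8 * ptime_ms / 1000
    return (payload_bytes + header_bytes) * 8 * packets_per_second / 1000


# 64 kbps codec payload at 20 ms: 160 payload bytes + 40 header bytes, 50 packets/s.
g711_kbps = rtp_bandwidth_kbps(64_000)
print(f"{g711_kbps:.0f} kbps")
```

Under these assumptions, each direction of the call uses roughly 80 kbps rather than 64 kbps. Shorter packetization intervals raise the overhead because the fixed header cost is paid more often.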

PCMA (G.711 A-law)

Similar to PCMU in quality, sampling rate, and bandwidth, but uses A-law companding instead of µ-law.

G.722

A wideband codec that samples audio at 16 kHz, providing higher audio quality (HD voice) while using a similar bitrate to G.711.

Additional resources

The following resources provide additional details about the topics covered in this guide.