Module livekit.rtc.apm
Classes
class AudioProcessingModule(*,
    echo_cancellation: bool = False,
    noise_suppression: bool = False,
    high_pass_filter: bool = False,
    auto_gain_control: bool = False)
Source code
class AudioProcessingModule:
    """
    Provides WebRTC audio processing capabilities including echo cancellation,
    noise suppression, high-pass filtering, and gain control.
    """

    def __init__(
        self,
        *,
        echo_cancellation: bool = False,
        noise_suppression: bool = False,
        high_pass_filter: bool = False,
        auto_gain_control: bool = False,
    ) -> None:
        """
        Initialize an AudioProcessingModule instance with the specified audio
        processing features.

        Args:
            echo_cancellation (bool, optional): Whether to enable echo cancellation.
            noise_suppression (bool, optional): Whether to enable noise suppression.
            high_pass_filter (bool, optional): Whether to enable a high-pass filter.
            auto_gain_control (bool, optional): Whether to enable auto gain control.
        """
        req = proto_ffi.FfiRequest()
        req.new_apm.echo_canceller_enabled = echo_cancellation
        req.new_apm.noise_suppression_enabled = noise_suppression
        req.new_apm.high_pass_filter_enabled = high_pass_filter
        req.new_apm.gain_controller_enabled = auto_gain_control

        resp = FfiClient.instance.request(req)
        self._ffi_handle = FfiHandle(resp.new_apm.apm.handle.id)

    def process_stream(self, data: AudioFrame) -> None:
        """
        Process the provided audio frame using the configured audio processing features.

        The input audio frame is modified in-place (if applicable) by the underlying
        audio processing module (e.g., echo cancellation, noise suppression, etc.).

        Important:
            Audio frames must be exactly 10 ms in duration.
        """
        # View the frame's samples as raw bytes so the native side can edit them in place.
        bdata = data.data.cast("b")

        req = proto_ffi.FfiRequest()
        req.apm_process_stream.apm_handle = self._ffi_handle.handle
        req.apm_process_stream.data_ptr = get_address(memoryview(bdata))
        req.apm_process_stream.size = len(bdata)
        req.apm_process_stream.sample_rate = data.sample_rate
        req.apm_process_stream.num_channels = data.num_channels

        resp = FfiClient.instance.request(req)
        if resp.apm_process_stream.error:
            raise RuntimeError(resp.apm_process_stream.error)

    def process_reverse_stream(self, data: AudioFrame) -> None:
        """
        Process the reverse audio frame (typically used for echo cancellation in a
        full-duplex setup).

        In an echo cancellation scenario, this method is used to process the "far-end"
        audio prior to mixing or feeding it into the echo canceller. Like
        `process_stream`, the input audio frame is modified in-place by the underlying
        processing module.

        Important:
            Audio frames must be exactly 10 ms in duration.
        """
        bdata = data.data.cast("b")

        req = proto_ffi.FfiRequest()
        req.apm_process_reverse_stream.apm_handle = self._ffi_handle.handle
        req.apm_process_reverse_stream.data_ptr = get_address(memoryview(bdata))
        req.apm_process_reverse_stream.size = len(bdata)
        req.apm_process_reverse_stream.sample_rate = data.sample_rate
        req.apm_process_reverse_stream.num_channels = data.num_channels

        resp = FfiClient.instance.request(req)
        # Read the error from the reverse-stream response that matches this request.
        if resp.apm_process_reverse_stream.error:
            raise RuntimeError(resp.apm_process_reverse_stream.error)

    def set_stream_delay_ms(self, delay_ms: int) -> None:
        """
        This must be called if and only if echo processing is enabled.

        Sets the `delay` in ms between `process_reverse_stream()` receiving a far-end
        frame and `process_stream()` receiving a near-end frame containing the
        corresponding echo. On the client-side this can be expressed as

            delay = (t_render - t_analyze) + (t_process - t_capture)

        where,
            - t_analyze is the time a frame is passed to `process_reverse_stream()` and
              t_render is the time the first sample of the same frame is rendered by
              the audio hardware.
            - t_capture is the time the first sample of a frame is captured by the
              audio hardware and t_process is the time the same frame is passed to
              `process_stream()`.
        """
        req = proto_ffi.FfiRequest()
        req.apm_set_stream_delay.apm_handle = self._ffi_handle.handle
        req.apm_set_stream_delay.delay_ms = delay_ms

        resp = FfiClient.instance.request(req)
        if resp.apm_set_stream_delay.error:
            raise RuntimeError(resp.apm_set_stream_delay.error)
Provides WebRTC audio processing capabilities including echo cancellation, noise suppression, high-pass filtering, and gain control.
Initialize an AudioProcessingModule instance with the specified audio processing features.
Args
echo_cancellation (bool, optional): Whether to enable echo cancellation.
noise_suppression (bool, optional): Whether to enable noise suppression.
high_pass_filter (bool, optional): Whether to enable a high-pass filter.
auto_gain_control (bool, optional): Whether to enable auto gain control.
Methods
def process_reverse_stream(self, data: AudioFrame) -> None
Source code
def process_reverse_stream(self, data: AudioFrame) -> None:
    """
    Process the reverse audio frame (typically used for echo cancellation in a
    full-duplex setup).

    In an echo cancellation scenario, this method is used to process the "far-end"
    audio prior to mixing or feeding it into the echo canceller. Like
    `process_stream`, the input audio frame is modified in-place by the underlying
    processing module.

    Important:
        Audio frames must be exactly 10 ms in duration.
    """
    bdata = data.data.cast("b")

    req = proto_ffi.FfiRequest()
    req.apm_process_reverse_stream.apm_handle = self._ffi_handle.handle
    req.apm_process_reverse_stream.data_ptr = get_address(memoryview(bdata))
    req.apm_process_reverse_stream.size = len(bdata)
    req.apm_process_reverse_stream.sample_rate = data.sample_rate
    req.apm_process_reverse_stream.num_channels = data.num_channels

    resp = FfiClient.instance.request(req)
    # Read the error from the reverse-stream response that matches this request.
    if resp.apm_process_reverse_stream.error:
        raise RuntimeError(resp.apm_process_reverse_stream.error)
Process the reverse audio frame (typically used for echo cancellation in a full-duplex setup).
In an echo cancellation scenario, this method is used to process the "far-end" audio prior to mixing or feeding it into the echo canceller. Like process_stream, the input audio frame is modified in-place by the underlying processing module.
Important
Audio frames must be exactly 10 ms in duration.
def process_stream(self, data: AudioFrame) -> None
Source code
def process_stream(self, data: AudioFrame) -> None:
    """
    Process the provided audio frame using the configured audio processing features.

    The input audio frame is modified in-place (if applicable) by the underlying
    audio processing module (e.g., echo cancellation, noise suppression, etc.).

    Important:
        Audio frames must be exactly 10 ms in duration.
    """
    # View the frame's samples as raw bytes so the native side can edit them in place.
    bdata = data.data.cast("b")

    req = proto_ffi.FfiRequest()
    req.apm_process_stream.apm_handle = self._ffi_handle.handle
    req.apm_process_stream.data_ptr = get_address(memoryview(bdata))
    req.apm_process_stream.size = len(bdata)
    req.apm_process_stream.sample_rate = data.sample_rate
    req.apm_process_stream.num_channels = data.num_channels

    resp = FfiClient.instance.request(req)
    if resp.apm_process_stream.error:
        raise RuntimeError(resp.apm_process_stream.error)
Process the provided audio frame using the configured audio processing features.
The input audio frame is modified in-place (if applicable) by the underlying audio processing module (e.g., echo cancellation, noise suppression, etc.).
Important
Audio frames must be exactly 10 ms in duration.
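Since frames must be exactly 10 ms long, longer capture buffers have to be cut into 10 ms pieces before processing. The sketch below shows the arithmetic only. It assumes interleaved 16-bit PCM (2 bytes per sample), and `samples_per_10ms` and `split_into_10ms_chunks` are hypothetical helper names, not part of livekit.rtc.apm.

```python
from typing import Iterator

BYTES_PER_SAMPLE = 2  # assumption: 16-bit PCM samples


def samples_per_10ms(sample_rate: int) -> int:
    """Samples per channel in one 10 ms frame (sample_rate / 100)."""
    if sample_rate <= 0 or sample_rate % 100 != 0:
        raise ValueError("sample_rate must be a positive multiple of 100 Hz")
    return sample_rate // 100


def split_into_10ms_chunks(
    pcm: bytes, sample_rate: int, num_channels: int
) -> Iterator[bytes]:
    """Yield consecutive 10 ms chunks of interleaved PCM; a trailing partial chunk is dropped."""
    chunk_bytes = samples_per_10ms(sample_rate) * num_channels * BYTES_PER_SAMPLE
    for start in range(0, len(pcm) - chunk_bytes + 1, chunk_bytes):
        yield pcm[start : start + chunk_bytes]
```

At 48 kHz mono, for example, each chunk holds 480 samples (960 bytes). A caller would typically keep the dropped remainder and prepend it to the next capture buffer.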
def set_stream_delay_ms(self, delay_ms: int) -> None
Source code
def set_stream_delay_ms(self, delay_ms: int) -> None:
    """
    This must be called if and only if echo processing is enabled.

    Sets the `delay` in ms between `process_reverse_stream()` receiving a far-end
    frame and `process_stream()` receiving a near-end frame containing the
    corresponding echo. On the client-side this can be expressed as

        delay = (t_render - t_analyze) + (t_process - t_capture)

    where,
        - t_analyze is the time a frame is passed to `process_reverse_stream()` and
          t_render is the time the first sample of the same frame is rendered by
          the audio hardware.
        - t_capture is the time the first sample of a frame is captured by the
          audio hardware and t_process is the time the same frame is passed to
          `process_stream()`.
    """
    req = proto_ffi.FfiRequest()
    req.apm_set_stream_delay.apm_handle = self._ffi_handle.handle
    req.apm_set_stream_delay.delay_ms = delay_ms

    resp = FfiClient.instance.request(req)
    if resp.apm_set_stream_delay.error:
        raise RuntimeError(resp.apm_set_stream_delay.error)
This must be called if and only if echo processing is enabled.
Sets the delay in ms between process_reverse_stream() receiving a far-end frame and process_stream() receiving a near-end frame containing the corresponding echo. On the client side this can be expressed as
delay = (t_render - t_analyze) + (t_process - t_capture)
where
- t_analyze is the time a frame is passed to process_reverse_stream() and t_render is the time the first sample of the same frame is rendered by the audio hardware.
- t_capture is the time the first sample of a frame is captured by the audio hardware and t_process is the time the same frame is passed to process_stream().
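The delay formula above translates directly into code. The following is a minimal sketch, assuming all four timestamps come from the same clock and are expressed in seconds (for example from time.monotonic()); `compute_stream_delay_ms` is a hypothetical helper, not part of the module.

```python
def compute_stream_delay_ms(
    t_analyze: float, t_render: float, t_capture: float, t_process: float
) -> int:
    """Return (t_render - t_analyze) + (t_process - t_capture) in whole milliseconds.

    All arguments are timestamps in seconds taken from one shared clock (assumption).
    """
    render_delay = t_render - t_analyze    # far-end frame waiting to be played out
    capture_delay = t_process - t_capture  # near-end frame waiting to be processed
    return round((render_delay + capture_delay) * 1000)
```

With a render delay of 20 ms and a capture delay of 15 ms, the helper returns 35, which is the value that would then be passed to set_stream_delay_ms().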