# LiveKit docs

> LiveKit is a platform for building voice and realtime AI applications. LiveKit Cloud is the hosted commercial offering based on the open-source LiveKit project.

## Overview

LiveKit and [LiveKit Cloud](https://cloud.livekit.io) contain these primary components:

- LiveKit Agents, a framework for realtime voice AI agents in [Python](https://github.com/livekit/agents) or [Node.js](https://github.com/livekit/agents-js)
  - Includes a [deployment environment](https://docs.livekit.io/agents/ops/deployment.md) for running agents on LiveKit Cloud
  - And a hosted voice AI [inference service](https://docs.livekit.io/agents/models.md#inference) and extensive [plugin system](https://docs.livekit.io/agents/models.md#plugins) for connecting to a wide range of AI providers
- A global WebRTC-based realtime media server with [realtime SDKs](https://docs.livekit.io/home/client/connect.md) for:
  - [Web](https://github.com/livekit/client-sdk-js)
  - [Swift](https://github.com/livekit/client-sdk-swift)
  - [Android](https://github.com/livekit/client-sdk-android)
  - [Flutter](https://github.com/livekit/client-sdk-flutter)
  - [React Native](https://github.com/livekit/client-sdk-react-native)
  - [Unity](https://github.com/livekit/client-sdk-unity)
  - [Python](https://github.com/livekit/client-sdk-python)
  - [Node.js](https://github.com/livekit/client-sdk-node)
  - [Rust](https://github.com/livekit/client-sdk-rust)
  - [ESP32](https://github.com/livekit/client-sdk-esp32)
  - and more
- [Telephony integration](https://docs.livekit.io/sip.md) built on SIP for integrating telephony into LiveKit rooms

For greater detail, see [Intro to LiveKit](https://docs.livekit.io/home/get-started/intro-to-livekit.md).

## Home

### Get Started

---

## Intro to LiveKit

LiveKit is an open source platform for developers building realtime media applications. It makes it easy to integrate audio, video, text, data, and AI models while offering scalable realtime infrastructure built on top of WebRTC.

## Why choose LiveKit?

LiveKit provides a complete solution for realtime applications with several key advantages:

- **Developer-friendly**: Consistent APIs across platforms with comprehensive and well-documented SDKs.
- **Open source**: No vendor lock-in with complete transparency and flexibility.
- **AI-native**: First-class support for integrating AI models into realtime experiences.
- **Scalable**: Supports anywhere from a handful of users to thousands of concurrent participants, or more.
- **Deployment flexibility**: Choose between fully-managed cloud or self-hosted options.
- **Private and secure**: End-to-end encryption, HIPAA compliance, and more.
- **Built on WebRTC**: The most robust realtime media protocol for peak performance in any network condition.

### What is WebRTC?

[WebRTC](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API) provides significant advantages over other options for building realtime applications, such as WebSockets:

- **Optimized for media**: Purpose-built for audio and video with advanced codecs and compression algorithms.
- **Network resilient**: Performs reliably even in challenging network conditions due to UDP, adaptive bitrate, and more.
- **Broad compatibility**: Natively supported in all modern browsers.

LiveKit handles all of the complexity of running production-grade WebRTC infrastructure while extending support to mobile apps, backends, and telephony.
## LiveKit ecosystem The LiveKit platform consists of these core components: - **LiveKit Server**: An open-source media server that enables realtime communication between participants. Use LiveKit's fully-managed global cloud, or self-host your own. - **LiveKit SDKs**: Full-featured web, native, and backend SDKs that make it easy to join rooms and publish and consume realtime media and data. - **LiveKit Agents**: A framework for building realtime multimodal AI agents, with an extensive collection of plugins for nearly every AI provider. - **Telephony**: A flexible SIP integration for inbound or outbound calling into any LiveKit room or agent session. - **Egress**: Record and export realtime media from LiveKit rooms. - **Ingress**: Ingest external streams (such as RTMP and WHIP) into LiveKit rooms. - **Server APIs**: A REST API for managing rooms, and more. Includes SDKs and a CLI. ## Deployment options LiveKit offers two deployment options for LiveKit Server to fit your needs: - **LiveKit Cloud**: A fully-managed, globally distributed service with automatic scaling and high reliability. Trusted by companies of all sizes, from startups to enterprises. - **Self-hosted**: Run the open source LiveKit server on your own infrastructure for maximum control and customization. Both options provide the same core platform features and use the same SDKs. ## What can you build with LiveKit? - **AI assistants**: Voice and video agents powered by any AI model. - **Video conferencing**: Secure, private meetings for teams of any size. - **Interactive livestreaming**: Broadcast to audiences with realtime engagement. - **Robotics**: Integrate realtime video and powerful AI models into real-world devices. - **Healthcare**: HIPAA-compliant telehealth with AI and humans in the loop. - **Customer service**: Flexible and observable web, mobile, and telephone support options. Whatever your use case, LiveKit makes it easy to build innovative, intelligent realtime applications without worrying about scaling media infrastructure. [Get started with LiveKit today](https://docs.livekit.io/home.md). --- --- ## Rooms, participants, and tracks ## Overview LiveKit has only three core constructs: a room, participant, and track. A room is simply a realtime session between one or more participants. A participant can publish one or more tracks and/or subscribe to one or more tracks from another participant. ## Room A `Room` is a container object representing a LiveKit session. Each participant in a room receives updates about changes to other participants in the same room. For example, when a participant adds, removes, or modifies the state (for example, mute) of a track, other participants are notified of this change. This is a powerful mechanism for synchronizing state and fundamental to building any realtime experience. A room can be created manually via [server API](https://docs.livekit.io/home/server/managing-rooms.md#create-a-room), or automatically, when the first participant joins it. Once the last participant leaves a room, it closes after a short delay. ## Participant A `Participant` is a user or process that is participating in a realtime session. They are represented by a unique developer-provided `identity` and a server-generated `sid`. A participant object also contains metadata about its state and tracks they've published. > ❗ **Important** > > A participant's identity is unique per room. 
> Thus, if participants with the same identity join a room, only the most recent one to join will remain; the server automatically disconnects other participants using that identity.

There are two kinds of participant objects in the SDKs:

- A `LocalParticipant` represents the current user who, by default, can publish tracks in a room.
- A `RemoteParticipant` represents a remote user. The local participant, by default, can subscribe to any tracks published by a remote participant.

A participant may also [exchange data](https://docs.livekit.io/home/client/data.md) with one or many other participants.

### Hidden participants

A participant is hidden if their participant [permissions](https://docs.livekit.io/reference/server/server-apis.md#participantpermission) have `hidden` set to `true`. You can set this field in the participant's [access token](https://docs.livekit.io/home/get-started/authentication.md#video-grant). A hidden participant is not visible to other participants in the room.

### Participant fields

| Field | Type | Description |
| --- | --- | --- |
| sid | string | A UID for this particular participant, generated by LiveKit server. |
| identity | string | Unique identity of the participant, as specified when connecting. |
| name | string | Optional display name. |
| state | ParticipantInfo.State | JOINING, JOINED, ACTIVE, or DISCONNECTED. |
| tracks | List<[TrackInfo](https://docs.livekit.io/reference/server/server-apis.md#trackinfo)> | Tracks published by the participant. |
| metadata | string | User-specified metadata for the participant. |
| joined_at | int64 | Timestamp when the participant joined the room. |
| kind | ParticipantInfo.Kind | [Type](#types-of-participants) of participant. |
| kind_detail | ParticipantInfo.KindDetail | Additional details about participant type. Valid values are `CLOUD_AGENT` or `FORWARDED`. |
| attributes | string | User-specified [attributes](https://docs.livekit.io/home/client/data.md) for the participant. |
| permission | [ParticipantPermission](https://docs.livekit.io/reference/server/server-apis.md#participantpermission) | Permissions granted to the participant. |

### Types of participants

In a realtime session, a participant could represent an end-user as well as a server-side process. It's possible to distinguish between them with the `kind` field:

- `STANDARD`: A regular participant, typically an end-user in your application.
- `AGENT`: An agent spawned with the [Agents framework](https://docs.livekit.io/agents.md).
- `SIP`: A telephony user connected via [SIP](https://docs.livekit.io/sip.md).
- `EGRESS`: A server-side process that is recording the session using [LiveKit Egress](https://docs.livekit.io/home/egress/overview.md).
- `INGRESS`: A server-side process that is ingesting media into the session using [LiveKit Ingress](https://docs.livekit.io/home/ingress/overview.md).

## Track

A `Track` represents a stream of information, be it audio, video, or custom data. By default, a participant in a room may publish tracks, such as their camera or microphone streams, and subscribe to one or more tracks published by other participants. In order to model a track which may not be subscribed to by the local participant, all track objects have a corresponding `TrackPublication` object:

- `Track`: a wrapper around the native WebRTC `MediaStreamTrack`, representing a playable track.
- `TrackPublication`: a track that's been published to the server.
If the track is subscribed to by the local participant and available for playback locally, it will have a `.track` attribute representing the associated `Track` object. We can now list and manipulate tracks (via track publications) published by other participants, even if the local participant is not subscribed to them. ### TrackPublication fields A `TrackPublication` contains information about its associated track: | Field | Type | Description | | sid | string | A UID for this particular track, generated by LiveKit server. | | kind | Track.Kind | The type of track, whether it be audio, video or arbitrary data. | | source | Track.Source | Source of media: Camera, Microphone, ScreenShare, or ScreenShareAudio. | | name | string | The name given to this particular track when initially published. | | subscribed | boolean | Indicates whether or not this track has been subscribed to by the local participant. | | track | Track | If the local participant is subscribed, the associated `Track` object representing a WebRTC track. | | muted | boolean | Whether this track is muted or not by the local participant. While muted, it won't receive new bytes from the server. | ### Track subscription When a participant is subscribed to a track (which hasn't been muted by the publishing participant), they continuously receive its data. If the participant unsubscribes, they stop receiving media for that track and may resubscribe to it at any time. When a participant creates or joins a room, the `autoSubscribe` option is set to `true` by default. This means the participant automatically subscribes to all existing tracks being published and any track published in the future. For more fine-grained control over track subscriptions, you can set `autoSubscribe` to `false` and instead use [selective subscriptions](https://docs.livekit.io/home/client/receive.md#selective-subscription). > ℹ️ **Note** > > For most use cases, muting a track on the publisher side or unsubscribing from it on the subscriber side is typically recommended over unpublishing it. Publishing a track requires a negotiation phase and consequently has worse time-to-first-byte performance. --- --- ## Authentication ## Overview For a LiveKit SDK to successfully connect to the server, it must pass an access token with the request. This token encodes the identity of a participant, name of the room, capabilities and permissions. Access tokens are JWT-based and signed with your API secret to prevent forgery. Access tokens also carry an expiration time, after which the server will reject connections with that token. Note: expiration time only impacts the initial connection, and not subsequent reconnects. 
## Creating a token

**LiveKit CLI**:

```shell
lk token create \
  --api-key <key> \
  --api-secret <secret> \
  --identity <identity> \
  --room <room> \
  --join \
  --valid-for 1h
```

---

**Node.js**:

```typescript
import { AccessToken, VideoGrant } from 'livekit-server-sdk';

const roomName = 'name-of-room';
const participantName = 'user-name';

const at = new AccessToken('api-key', 'secret-key', {
  identity: participantName,
});
const videoGrant: VideoGrant = {
  room: roomName,
  roomJoin: true,
  canPublish: true,
  canSubscribe: true,
};
at.addGrant(videoGrant);

const token = await at.toJwt();
console.log('access token', token);
```

---

**Go**:

```go
import (
	"time"

	"github.com/livekit/protocol/auth"
)

func getJoinToken(apiKey, apiSecret, room, identity string) (string, error) {
	canPublish := true
	canSubscribe := true

	at := auth.NewAccessToken(apiKey, apiSecret)
	grant := &auth.VideoGrant{
		RoomJoin:     true,
		Room:         room,
		CanPublish:   &canPublish,
		CanSubscribe: &canSubscribe,
	}
	at.SetVideoGrant(grant).
		SetIdentity(identity).
		SetValidFor(time.Hour)

	return at.ToJWT()
}
```

---

**Ruby**:

```ruby
require 'livekit'

token = LiveKit::AccessToken.new(api_key: 'yourkey', api_secret: 'yoursecret')
token.identity = 'participant-identity'
token.name = 'participant-name'
token.video_grant = LiveKit::VideoGrant.from_hash(roomJoin: true, room: 'room-name')
puts token.to_jwt
```

---

**Java**:

```java
import io.livekit.server.*;

public String createToken() {
  AccessToken token = new AccessToken("apiKey", "secret");
  token.setName("participant-name");
  token.setIdentity("participant-identity");
  token.setMetadata("metadata");
  token.addGrants(new RoomJoin(true), new RoomName("room-name"));

  return token.toJwt();
}
```

---

**Python**:

```python
from livekit import api
import os

token = api.AccessToken(os.getenv('LIVEKIT_API_KEY'), os.getenv('LIVEKIT_API_SECRET')) \
    .with_identity("identity") \
    .with_name("name") \
    .with_grants(api.VideoGrants(
        room_join=True,
        room="my-room",
    )).to_jwt()
```

---

**Rust**:

```rust
use livekit_api::access_token;
use std::env;

fn create_token() -> Result<String, access_token::AccessTokenError> {
    let api_key = env::var("LIVEKIT_API_KEY").expect("LIVEKIT_API_KEY is not set");
    let api_secret = env::var("LIVEKIT_API_SECRET").expect("LIVEKIT_API_SECRET is not set");

    let token = access_token::AccessToken::with_api_key(&api_key, &api_secret)
        .with_identity("identity")
        .with_name("name")
        .with_grants(access_token::VideoGrants {
            room_join: true,
            room: "my-room".to_string(),
            ..Default::default()
        })
        .to_jwt();

    return token
}
```

---

**Other**:

For other platforms, you can either implement token generation yourself or use the `lk` command. Token signing is fairly straightforward; see the [JS implementation](https://github.com/livekit/node-sdks/blob/main/packages/livekit-server-sdk/src/AccessToken.ts) as a reference, or the sketch below.
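If you do roll your own, a LiveKit access token is an ordinary HS256-signed JWT whose claims mirror the decoded example in the next section. A minimal sketch, assuming the third-party `PyJWT` library (not part of the LiveKit SDKs) and hypothetical key, secret, and room values:

```python
import time
import jwt  # pip install PyJWT (assumed dependency, not a LiveKit SDK)

def sign_join_token(api_key: str, api_secret: str, identity: str, room: str, ttl_s: int = 3600) -> str:
    now = int(time.time())
    claims = {
        "iss": api_key,      # API key used to issue the token
        "sub": identity,     # participant identity
        "nbf": now,          # not valid before
        "exp": now + ttl_s,  # expiration time
        "video": {"room": room, "roomJoin": True},
    }
    # LiveKit access tokens are signed with the API secret using HS256.
    return jwt.encode(claims, api_secret, algorithm="HS256")

token = sign_join_token("APIxxxxxxx", "your-api-secret", "myidentity", "myroom")
```

In practice, prefer the official server SDKs above, which keep claim names and defaults in sync with the server.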
LiveKit CLI is available at [https://github.com/livekit/livekit-cli](https://github.com/livekit/livekit-cli).

### Token example

Here's an example of the decoded body of a join token:

```json
{
  "exp": 1621657263,
  "iss": "APIMmxiL8rquKztZEoZJV9Fb",
  "sub": "myidentity",
  "nbf": 1619065263,
  "video": {
    "room": "myroom",
    "roomJoin": true
  },
  "metadata": ""
}
```

| field | description |
| --- | --- |
| exp | Expiration time of token |
| nbf | Start time that the token becomes valid |
| iss | API key used to issue this token |
| sub | Unique identity for the participant |
| metadata | Participant metadata |
| attributes | Participant attributes (key/value pairs of strings) |
| video | Video grant, including room permissions (see below) |
| sip | SIP grant |

## Video grant

Room permissions are specified in the `video` field of a decoded join token. It may contain one or more of the following properties:

| field | type | description |
| --- | --- | --- |
| roomCreate | bool | Permission to create or delete rooms |
| roomList | bool | Permission to list available rooms |
| roomJoin | bool | Permission to join a room |
| roomAdmin | bool | Permission to moderate a room |
| roomRecord | bool | Permission to use the Egress service |
| ingressAdmin | bool | Permission to use the Ingress service |
| room | string | Name of the room, required if join or admin is set |
| canPublish | bool | Allow participant to publish tracks |
| canPublishData | bool | Allow participant to publish data to the room |
| canPublishSources | string[] | Requires `canPublish` to be true. When set, only the listed sources can be published (camera, microphone, screen_share, screen_share_audio). |
| canSubscribe | bool | Allow participant to subscribe to tracks |
| canUpdateOwnMetadata | bool | Allow participant to update its own metadata |
| hidden | bool | Hide participant from others in the room |
| kind | string | Type of participant (standard, ingress, egress, sip, or agent). This field is typically set by LiveKit internals. |
| destinationRoom | string | Name of the room a participant can be [forwarded](https://docs.livekit.io/home/server/managing-participants.md#forwardparticipant) to. |

### Example: subscribe-only token

To create a token where the participant can only subscribe, and not publish into the room, you would use the following grant:

```json
{
  ...
  "video": {
    "room": "myroom",
    "roomJoin": true,
    "canSubscribe": true,
    "canPublish": false,
    "canPublishData": false
  }
}
```

### Example: camera-only

Allow the participant to publish their camera, but disallow other sources:

```json
{
  ...
  "video": {
    "room": "myroom",
    "roomJoin": true,
    "canSubscribe": true,
    "canPublish": true,
    "canPublishSources": ["camera"]
  }
}
```

## SIP grant

In order to interact with the SIP service, permission must be granted in the `sip` field of the JWT. It may contain the following properties:

| field | type | description |
| --- | --- | --- |
| admin | bool | Permission to manage SIP trunks and dispatch rules. |
| call | bool | Permission to make SIP calls via `CreateSIPParticipant`. |
### Creating a token with SIP grants

**Node.js**:

```typescript
import { AccessToken, SIPGrant, VideoGrant } from 'livekit-server-sdk';

const roomName = 'name-of-room';
const participantName = 'user-name';

const at = new AccessToken('api-key', 'secret-key', {
  identity: participantName,
});

const sipGrant: SIPGrant = {
  admin: true,
  call: true,
};

const videoGrant: VideoGrant = {
  room: roomName,
  roomJoin: true,
};

at.addGrant(sipGrant);
at.addGrant(videoGrant);

const token = await at.toJwt();
console.log('access token', token);
```

---

**Go**:

```go
import (
	"time"

	"github.com/livekit/protocol/auth"
)

func getJoinToken(apiKey, apiSecret, room, identity string) (string, error) {
	at := auth.NewAccessToken(apiKey, apiSecret)

	videoGrant := &auth.VideoGrant{
		RoomJoin: true,
		Room:     room,
	}

	sipGrant := &auth.SIPGrant{
		Admin: true,
		Call:  true,
	}

	at.SetSIPGrant(sipGrant).
		SetVideoGrant(videoGrant).
		SetIdentity(identity).
		SetValidFor(time.Hour)

	return at.ToJWT()
}
```

---

**Ruby**:

```ruby
require 'livekit'

token = LiveKit::AccessToken.new(api_key: 'yourkey', api_secret: 'yoursecret')
token.identity = 'participant-identity'
token.name = 'participant-name'
token.video_grant = LiveKit::VideoGrant.from_hash(roomJoin: true, room: 'room-name')
token.sip_grant = LiveKit::SIPGrant.from_hash(admin: true, call: true)
puts token.to_jwt
```

---

**Java**:

```java
import io.livekit.server.*;

public String createToken() {
  AccessToken token = new AccessToken("apiKey", "secret");

  // Fill in token information.
  token.setName("participant-name");
  token.setIdentity("participant-identity");
  token.setMetadata("metadata");

  // Add room and SIP privileges.
  token.addGrants(new RoomJoin(true), new RoomName("room-name"));
  token.addSIPGrants(new SIPAdmin(true), new SIPCall(true));

  return token.toJwt();
}
```

---

**Python**:

```python
from livekit import api
import os

token = api.AccessToken(os.getenv('LIVEKIT_API_KEY'), os.getenv('LIVEKIT_API_SECRET')) \
    .with_identity("identity") \
    .with_name("name") \
    .with_grants(api.VideoGrants(
        room_join=True,
        room="my-room")) \
    .with_sip_grants(api.SIPGrants(
        admin=True,
        call=True)).to_jwt()
```

---

**Rust**:

```rust
use livekit_api::access_token;
use std::env;

fn create_token() -> Result<String, access_token::AccessTokenError> {
    let api_key = env::var("LIVEKIT_API_KEY").expect("LIVEKIT_API_KEY is not set");
    let api_secret = env::var("LIVEKIT_API_SECRET").expect("LIVEKIT_API_SECRET is not set");

    let token = access_token::AccessToken::with_api_key(&api_key, &api_secret)
        .with_identity("rust-bot")
        .with_name("Rust Bot")
        .with_grants(access_token::VideoGrants {
            room_join: true,
            room: "my-room".to_string(),
            ..Default::default()
        })
        .with_sip_grants(access_token::SIPGrants {
            admin: true,
            call: true
        })
        .to_jwt();

    return token
}
```

## Room configuration

You can create an access token for a user that includes room configuration options. When a room is created for that user, it uses the configuration stored in the token. For example, you can use this to [explicitly dispatch an agent](https://docs.livekit.io/agents/server/agent-dispatch.md) when a user joins a room.

For the full list of `RoomConfiguration` fields, see [RoomConfiguration](https://docs.livekit.io/reference/server/server-apis.md#roomconfiguration).

### Creating a token with room configuration

**Node.js**:

For a full example of explicit agent dispatch, see the [example](https://github.com/livekit/node-sdks/blob/main/examples/agent-dispatch/index.ts) in GitHub.
```typescript
import { AccessToken, SIPGrant, VideoGrant } from 'livekit-server-sdk';
import { RoomAgentDispatch, RoomConfiguration } from '@livekit/protocol';

const roomName = 'name-of-room';
const participantName = 'user-name';
const agentName = 'my-agent';

const at = new AccessToken('api-key', 'secret-key', {
  identity: participantName,
});
const videoGrant: VideoGrant = {
  room: roomName,
  roomJoin: true,
};
at.addGrant(videoGrant);

at.roomConfig = new RoomConfiguration({
  agents: [
    new RoomAgentDispatch({
      agentName: 'test-agent',
      metadata: 'test-metadata',
    }),
  ],
});

const token = await at.toJwt();
console.log('access token', token);
```

---

**Go**:

```go
import (
	"time"

	"github.com/livekit/protocol/auth"
	"github.com/livekit/protocol/livekit"
)

func getJoinToken(apiKey, apiSecret, room, identity string) (string, error) {
	at := auth.NewAccessToken(apiKey, apiSecret)

	videoGrant := &auth.VideoGrant{
		RoomJoin: true,
		Room:     room,
	}

	roomConfig := &livekit.RoomConfiguration{
		Agents: []*livekit.RoomAgentDispatch{{
			AgentName: "test-agent",
			Metadata:  "test-metadata",
		}},
	}

	at.SetVideoGrant(videoGrant).
		SetRoomConfig(roomConfig).
		SetIdentity(identity).
		SetValidFor(time.Hour)

	return at.ToJWT()
}
```

---

**Ruby**:

```ruby
require 'livekit'

token = LiveKit::AccessToken.new(api_key: 'yourkey', api_secret: 'yoursecret')
token.identity = 'participant-identity'
token.name = 'participant-name'
token.video_grant = LiveKit::VideoGrant.new(roomJoin: true, room: 'room-name')
token.room_config = LiveKit::Proto::RoomConfiguration.new(
  max_participants: 10,
  agents: [LiveKit::Proto::RoomAgentDispatch.new(
    agent_name: "test-agent",
    metadata: "test-metadata",
  )]
)
puts token.to_jwt
```

---

**Python**:

For a full example of explicit agent dispatch, see the [example](https://github.com/livekit/python-sdks/blob/main/examples/agent_dispatch.py) in GitHub.

```python
from livekit import api
import os

token = api.AccessToken(os.getenv('LIVEKIT_API_KEY'), os.getenv('LIVEKIT_API_SECRET')) \
    .with_identity("identity") \
    .with_name("name") \
    .with_grants(api.VideoGrants(
        room_join=True,
        room="my-room")) \
    .with_room_config(
        api.RoomConfiguration(
            agents=[
                api.RoomAgentDispatch(
                    agent_name="test-agent",
                    metadata="test-metadata"
                )
            ],
        ),
    ).to_jwt()
```

---

**Rust**:

```rust
use livekit_api::access_token;
use std::env;

fn create_token() -> Result<String, access_token::AccessTokenError> {
    let api_key = env::var("LIVEKIT_API_KEY").expect("LIVEKIT_API_KEY is not set");
    let api_secret = env::var("LIVEKIT_API_SECRET").expect("LIVEKIT_API_SECRET is not set");

    let token = access_token::AccessToken::with_api_key(&api_key, &api_secret)
        .with_identity("rust-bot")
        .with_name("Rust Bot")
        .with_grants(access_token::VideoGrants {
            room_join: true,
            room: "my-room".to_string(),
            ..Default::default()
        })
        .with_room_config(livekit::RoomConfiguration {
            agents: [livekit::AgentDispatch{ name: "my-agent" }]
        })
        .to_jwt();

    return token
}
```

## Token refresh

LiveKit server proactively issues refreshed tokens to connected clients, ensuring they can reconnect if disconnected. These refreshed access tokens have a 10-minute expiration. Additionally, tokens are refreshed when there are changes to a participant's name, permissions, or metadata.

## Updating permissions

A participant's permissions can be updated at any time, even after they've already connected. This is useful in applications where the participant's role could change during the session, such as in a participatory livestream.
It's possible to issue a token with `canPublish: false` initially, and then updating it to `canPublish: true` during the session. Permissions can be changed with the [UpdateParticipant](https://docs.livekit.io/home/server/managing-participants.md#updating-permissions) server API. --- --- ## LiveKit Docs MCP Server ## Overview LiveKit includes a free [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) server with tools for AI coding assistants to browse and search the docs site. The following instructions cover installation of the MCP server and advice for writing an [AGENTS.md file](#agents-md) to get the most out of your coding agent. The server is available at the following URL: ```text https://docs.livekit.io/mcp ``` ## Installation The following sections cover installation instructions for various coding assistants. ### Cursor Click the button below to install the MCP server in [Cursor](https://www.cursor.com/): ![Install MCP Server in Cursor](https://cursor.com/deeplink/mcp-install-dark.svg) Or add it manually with the following JSON: ```json { "livekit-docs": { "url": "https://docs.livekit.io/mcp" } } ``` ### Claude Code Run the following command in your terminal to install the MCP server in [Claude Code](https://claude.com/product/claude-code): ```shell claude mcp add --transport http livekit-docs https://docs.livekit.io/mcp ``` ### Codex Run the following command in your terminal to install the server in [OpenAI Codex](https://openai.com/codex/): ```shell codex mcp add --url https://docs.livekit.io/mcp livekit-docs ``` ### Gemini CLI Run the following command in your terminal to install the server in [Gemini CLI](https://github.com/google-gemini/gemini-cli): ``` gemini mcp add --transport http livekit-docs https://docs.livekit.io/mcp ``` ### Manual installation The MCP server is available at the following URL. Add this server to your MCP client of choice. If prompted, set the transport to `http` or "Streamable HTTP". ```text https://docs.livekit.io/mcp ``` ## AGENTS.md To get the most out of the MCP server, LiveKit recommends that you include an [AGENTS.md](https://agents.md) or similar file in your repository, with instructions specific to the LiveKit Docs MCP Server. This file ensures that your agent always consults LiveKit docs to produce reliable, working code. Many of LiveKit's starter repositories already include a robust `AGENTS.md` file which is optimized for that specific project and environment. - **[Python starter project](https://github.com/livekit-examples/agent-starter-python)**: Includes an `AGENTS.md` file optimized for building agents in Python. - **[Node.js starter project](https://github.com/livekit-examples/agent-starter-node)**: Includes an `AGENTS.md` file optimized for building agents in Node.js. Or add the following instructions to your own `AGENTS.md` file: ```markdown ## LiveKit Documentation LiveKit Agents is a fast-evolving project, and the documentation is updated frequently. You should always refer to the latest documentation when working with this project. For your convenience, LiveKit offers an MCP server that can be used to browse and search its documentation. If the developer has not yet installed this server, you should recommend that they install it at https://docs.livekit.io/mcp. ``` ## Markdown docs Each page on the LiveKit docs site is available in Markdown format, optimized for pasting into AI assistants when MCP is unavailable. To access the Markdown version of any page on the site, append `.md` to the end of the URL. 
For example, this page is available at [https://docs.livekit.io/home/get-started/mcp-server.md](https://docs.livekit.io/home/get-started/mcp-server.md). You can also use the "Copy page" button on the top right of any docs page. ### LLMs.txt A complete Markdown-based index of the docs site is available at [https://docs.livekit.io/llms.txt](https://docs.livekit.io/llms.txt). This file includes a table of contents along with brief page descriptions. An expanded version is available at [https://docs.livekit.io/llms-full.txt](https://docs.livekit.io/llms-full.txt), but this file is quite large and may not be suitable for all use cases. For more about how to use LLMs.txt files, see [llmstxt.org](https://llmstxt.org/). --- ### LiveKit SDKs --- ## Overview ## Overview LiveKit provides a comprehensive ecosystem of SDKs for building realtime applications, including **realtime SDKs** for building user-facing applications, and **server-side SDKs** for backend operations and media processing. The SDKs are designed to work together, and support multiple platforms and languages. ## Realtime SDKs Realtime SDKs let you build applications that connect to LiveKit rooms and participate in realtime communication. These SDKs handle WebRTC connections, media capture, and room management. ### Web and mobile platforms These are the primary client platforms used for building realtime applications. Each SDK is optimized for its target platform and provides native integration capabilities. - **[JavaScript SDK](https://github.com/livekit/client-sdk-js)**: JavaScript/TypeScript SDK for web browsers. Supports all major browsers and provides React hooks for easy integration. - **[iOS/macOS/visionOS](https://github.com/livekit/client-sdk-swift)**: Native Swift SDK for Apple platforms including iOS, macOS, and visionOS. Optimized for Apple's ecosystem. - **[Android](https://github.com/livekit/client-sdk-android)**: Native Kotlin SDK for Android applications. Provides comprehensive media handling and room management. - **[Flutter](https://github.com/livekit/client-sdk-flutter)**: Cross-platform SDK for Flutter applications. Write once, run on iOS, Android, web, and desktop. - **[React Native](https://github.com/livekit/client-sdk-react-native)**: React Native SDK for building cross-platform mobile applications with JavaScript/TypeScript. - **[Unity](https://github.com/livekit/client-sdk-unity)**: Unity SDK for game development and virtual reality applications. Supports both native and WebGL builds. ### Additional client platforms LiveKit also supports specialized platforms and use cases beyond the main web and mobile platforms: - **[Rust SDK](https://github.com/livekit/rust-sdks)**: For systems programming and embedded applications. - **[Unity WebGL](https://github.com/livekit/client-sdk-unity-web)**: For web-based Unity applications. - **[ESP32](https://github.com/livekit/client-sdk-esp32)**: For IoT and embedded devices. ## Server-side SDKs Server-side SDKs provide backend integration capabilities, enabling you to create programmatic participants, manage rooms, and process media streams. They can also generate access tokens, call server APIs, and receive webhooks. The Go SDK additionally offers client capabilities, allowing you to build automations that act like end users. ### Core server SDKs - **[Node.js](https://github.com/livekit/node-sdks)**: JavaScript SDK for Node.js applications. Includes room management, participant control, and webhook handling. 
- **[Python](https://github.com/livekit/python-sdks)**: Python SDK for backend applications. Provides comprehensive media processing and room management capabilities. - **[Golang](https://github.com/livekit/server-sdk-go)**: Go SDK for high-performance server applications. Optimized for scalability and low latency. Includes client capabilities. - **[Ruby](https://github.com/livekit/server-sdk-ruby)**: Ruby SDK for Ruby on Rails and other Ruby applications. Full-featured server integration. - **[Java/Kotlin](https://github.com/livekit/server-sdk-kotlin)**: Java and Kotlin SDK for JVM-based applications. Enterprise-ready with comprehensive features. - **[Rust](https://github.com/livekit/rust-sdks)**: Rust SDK for systems programming and high-performance applications. Memory-safe and fast. ### Community SDKs - **[PHP](https://github.com/agence104/livekit-server-sdk-php)**: Community-maintained SDK for PHP applications. - **[.NET](https://github.com/pabloFuente/livekit-server-sdk-dotnet)**: Community-maintained SDK for .NET applications. ## UI Components LiveKit provides pre-built UI components to accelerate development: - **[React Components](https://github.com/livekit/components-js)**: React components for video, audio, and chat interfaces. Drop-in components for rapid development. - **[Android Compose](https://github.com/livekit/components-android)**: Jetpack Compose components for Android applications. Modern UI components for Android development. - **[SwiftUI](https://github.com/livekit/components-swift)**: SwiftUI components for iOS and macOS applications. Native UI components for Apple platforms. - **[Flutter](https://github.com/livekit/components-flutter)**: Flutter widgets for cross-platform applications. Reusable UI components for Flutter apps. ## Agents Framework LiveKit provides the Agents Framework for building AI agents and programmatic participants: - **[Agents docs](https://docs.livekit.io/agents.md)**: Learn how to build voice AI agents using the Agents Framework. - **[Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai.md)**: Voice AI agent quickstart guide. The fastest way to get an agent up and running. - **[Agents Framework](https://github.com/livekit/agents)**: Python framework for building AI agents and programmatic participants. Production-ready with comprehensive AI integrations. - **[AgentsJS](https://github.com/livekit/agents-js)**: JavaScript/TypeScript framework for building AI agents. Modern architecture with TypeScript support. ## Telephony Integration LiveKit's SIP integration enables your applications to connect with traditional phone systems and telephony infrastructure. Server-side SDKs include SIP capabilities for building telephony applications. To learn more, see [SIP](https://docs.livekit.io/sip.md). ## Key features across SDKs LiveKit SDKs provide a consistent set of features across all platforms, ensuring that your applications work reliably regardless of the target platform. These core capabilities are designed to handle the complexities of realtime communication while providing a simple, unified API. ### Realtime capabilities Realtime SDKs focus on connecting users to LiveKit rooms and managing realtime communication. These capabilities enable applications to capture, transmit, and receive media streams with minimal latency. - **Media capture**: Camera, microphone, and screen sharing. - **Room management**: Join, leave, and manage room participants. - **Track handling**: Subscribe to and publish audio and video tracks. 
- **Data channels**: Realtime messaging between participants. - **Connection management**: Automatic reconnection and quality adaptation. ### Server-side capabilities Server-side SDKs provide the infrastructure and control needed to manage LiveKit rooms and participants. These capabilities enable backend applications to orchestrate realtime sessions and process media streams. - **Room control**: Create, manage, and monitor rooms. - **Participant management**: Control participant permissions and behavior. - **Media processing**: Subscribe to and process media streams. - **Webhook handling**: Respond to room and participant events. - **Recording**: Capture and store room sessions. ### Cross-platform consistency All SDKs provide consistent APIs and features across platforms: - **Unified room model**: Same room concepts across all platforms. - **Consistent track handling**: Standardized audio and video track management. - **Shared data APIs**: Common data channel and messaging patterns. - **Quality adaptation**: Automatic quality adjustment based on network conditions. ## Getting started To get started with LiveKit SDKs: 1. **Choose your platform**: Select the appropriate client and server SDKs for your use case. 2. **Set up LiveKit**: Deploy LiveKit server or use [LiveKit Cloud](https://livekit.io/cloud). 3. **Build your app**: Use the SDKs to create your realtime application. 4. **Add UI components**: Integrate pre-built components for faster development. 5. **Deploy and scale**: Use LiveKit's production-ready infrastructure. To get started with LiveKit Agents, see the [Voice AI quickstart](https://docs.livekit.io/agents/start/voice-ai.md). --- --- ## Connecting to LiveKit ## Overview Your application will connect to LiveKit using the `Room` object, which is the base construct in LiveKit. Think of it like a conference call — multiple participants can join a room and share realtime audio, video, and data with each other. Depending on your application, each participant might represent a user, an AI agent, a connected device, or some other program you've created. There is no limit on the number of participants in a room and each participant can publish audio, video, and data to the room. ## Installing the LiveKit SDK LiveKit includes open-source SDKs for every major platform including JavaScript, Swift, Android, React Native, Flutter, and Unity. LiveKit also has SDKs for realtime backend apps in Python, Node.js, Go, and Rust. These are designed to be used with the [Agents framework](https://docs.livekit.io/agents.md) for realtime AI applications. **JavaScript**: Install the LiveKit SDK and optional React Components library: ```shell npm install livekit-client @livekit/components-react @livekit/components-styles --save ``` The SDK is also available using `yarn` or `pnpm`. Also check out the dedicated quickstart for [React](https://docs.livekit.io/home/quickstarts/react.md). --- **Swift**: Add the Swift SDK and the optional Swift Components library to your project using Swift Package Manager. The package URLs are: - [https://github.com/livekit/client-sdk-swift](https://github.com/livekit/client-sdk-swift) - [https://github.com/livekit/components-swift](https://github.com/livekit/components-swift) See [Adding package dependencies to your app](https://developer.apple.com/documentation/xcode/adding-package-dependencies-to-your-app) for more details. You must also declare camera and microphone permissions, if needed in your `Info.plist` file: ```xml ... 
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) uses your camera</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) uses your microphone</string>
...
```

For more details, see the [Swift quickstart](https://docs.livekit.io/home/quickstarts/swift.md).

---

**Android**:

The LiveKit SDK and components library are available as Maven packages.

```groovy
dependencies {
  implementation "io.livekit:livekit-android:2.+"
  implementation "io.livekit:livekit-android-compose-components:1.+"
}
```

See the [releases page](https://github.com/livekit/client-sdk-android/releases) for information on the latest version of the SDK.

You'll also need JitPack as one of your repositories. In your `settings.gradle` file:

```groovy
dependencyResolutionManagement {
  repositories {
    //...
    maven { url 'https://jitpack.io' }
  }
}
```

---

**React Native**:

Install the React Native SDK with NPM:

```shell
npm install @livekit/react-native @livekit/react-native-webrtc livekit-client
```

Check out the dedicated quickstart for [Expo](https://docs.livekit.io/home/quickstarts/expo.md) or [React Native](https://docs.livekit.io/home/quickstarts/react-native.md) for more details.

---

**Flutter**:

Install the latest version of the Flutter SDK and components library.

```shell
flutter pub add livekit_client livekit_components
```

You'll also need to declare camera and microphone permissions. See the [Flutter quickstart](https://docs.livekit.io/home/quickstarts/flutter.md) for more details.

If your SDK is not listed above, check out the full list of [platform-specific quickstarts](https://docs.livekit.io/home/quickstarts.md) and [SDK reference docs](https://docs.livekit.io/reference.md) for more details.

## Connecting to a room

Rooms are identified by their name, which can be any unique string. The room itself is created automatically when the first participant joins, and is closed when the last participant leaves.

You must use a participant identity when you connect to a room. This identity can be any string, but must be unique to each participant.

Connecting to a room always requires two parameters:

- `wsUrl`: The WebSocket URL of your LiveKit server.
  - LiveKit Cloud users can find theirs on the [Project Settings page](https://cloud.livekit.io/projects/p_/settings/project).
  - Self-hosted users who followed [this guide](https://docs.livekit.io/home/self-hosting/local.md) can use `ws://localhost:7880` while developing.
- `token`: A unique [access token](https://docs.livekit.io/concepts/authentication.md) which each participant must use to connect.
  - The token encodes the room name, the participant's identity, and their permissions.
  - For help generating tokens, see [this guide](https://docs.livekit.io/home/server/generating-tokens.md).

**JavaScript**:

```js
const room = new Room();
await room.connect(wsUrl, token);
```

---

**React**:

```js
<LiveKitRoom serverUrl={wsUrl} token={token} connect={true}>
  {/* your components here */}
</LiveKitRoom>
```

---

**Swift**:

```swift
RoomScope(url: wsURL, token: token, connect: true, enableCamera: true) {
  // your components here
}
```

---

**Android**:

```kotlin
RoomScope(
  url = wsURL,
  token = token,
  audio = true,
  video = true,
  connect = true,
) {
  // your components here
}
```

---

**React Native**:

```js
<LiveKitRoom serverUrl={wsUrl} token={token} connect={true}>
  {/* your components here */}
</LiveKitRoom>
```

---

**Flutter**:

```dart
final room = Room();
await room.connect(wsUrl, token);
```

Upon successful connection, the `Room` object will contain two key attributes: a `localParticipant` object, representing the current user, and `remoteParticipants`, an array of other participants in the room.
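The backend SDKs follow the same connect pattern. A minimal sketch using the Python realtime SDK (`livekit-rtc`), assuming `ws_url` and `token` are obtained as described above; the commented-out server URL is a hypothetical placeholder:

```python
import asyncio
from livekit import rtc

async def main(ws_url: str, token: str) -> None:
    room = rtc.Room()

    # Connect with the server URL and a participant access token.
    await room.connect(ws_url, token)
    print("connected as", room.local_participant.identity)

    # remote_participants holds everyone else currently in the room.
    for participant in room.remote_participants.values():
        print("already in room:", participant.identity)

    await asyncio.sleep(60)  # stay connected for a while
    await room.disconnect()

# asyncio.run(main("wss://your-project.livekit.cloud", "<token>"))
```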
Once connected, you can [publish](https://docs.livekit.io/home/client/tracks/publish.md) and [subscribe](https://docs.livekit.io/home/client/tracks/subscribe.md) to realtime media tracks or [exchange data](https://docs.livekit.io/home/client/data.md) with other participants. LiveKit also emits a number of events on the `Room` object, such as when new participants join or tracks are published. For details, see [Handling Events](https://docs.livekit.io/home/client/events.md). ## Disconnection Call `Room.disconnect()` to leave the room. If you terminate the application without calling `disconnect()`, your participant disappears after 15 seconds. > ℹ️ **Note** > > On some platforms, including JavaScript and Swift, `Room.disconnect` is called automatically when the application exits. ### Automatic disconnection Participants might get disconnected from a room due to server-initiated actions. This can happen if the room is closed using the [DeleteRoom](https://docs.livekit.io/home/server/managing-rooms.md#Delete-a-room) API or if a participant is removed with the [RemoveParticipant](https://docs.livekit.io/home/server/managing-participants.md#remove-a-participant) API. In such cases, a `Disconnected` event is emitted, providing a reason for the disconnection. Common [disconnection reasons](https://github.com/livekit/protocol/blob/main/protobufs/livekit_models.proto#L333) include: - DUPLICATE_IDENTITY: Disconnected because another participant with the same identity joined the room. - ROOM_DELETED: The room was closed via the `DeleteRoom` API. - PARTICIPANT_REMOVED: Removed from the room using the `RemoveParticipant` API. - JOIN_FAILURE: Failure to connect to the room, possibly due to network issues. - ROOM_CLOSED: The room was closed because all [Standard and Ingress participants](https://docs.livekit.io/home/get-started/api-primitives.md#types-of-participants) left. ## Connection reliability LiveKit enables reliable connectivity in a wide variety of network conditions. It tries the following WebRTC connection types in descending order: 1. ICE over UDP: ideal connection type, used in majority of conditions 2. TURN with UDP (3478): used when ICE/UDP is unreachable 3. ICE over TCP: used when network disallows UDP (i.e. over VPN or corporate firewalls) 4. TURN with TLS: used when firewall only allows outbound TLS connections **Cloud**: LiveKit Cloud supports all of the above connection types. TURN servers with TLS are provided and maintained by LiveKit Cloud. --- **Self-hosted**: ICE over UDP and TCP works out of the box, while TURN requires additional configurations and your own SSL certificate. ### Network changes and reconnection With WiFi and cellular networks, users may sometimes run into network changes that cause the connection to the server to be interrupted. This could include switching from WiFi to cellular or going through spots with poor connection. When this happens, LiveKit will attempt to resume the connection automatically. It reconnects to the signaling WebSocket and initiates an [ICE restart](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Session_lifetime#ice_restart) for the WebRTC connection. This process usually results in minimal or no disruption for the user. However, if media delivery over the previous connection fails, users might notice a temporary pause in video, lasting a few seconds, until the new connection is established. In scenarios where an ICE restart is not feasible or unsuccessful, LiveKit will execute a full reconnection. 
As full reconnections take more time and might be more disruptive, a `Reconnecting` event is triggered. This allows your application to respond, possibly by displaying a UI element, during the reconnection process.

The sequence is as follows:

1. `ParticipantDisconnected` is fired for other participants in the room
2. If any local tracks are unpublished, you receive `LocalTrackUnpublished` events for them
3. Emits `Reconnecting`
4. Performs full reconnect
5. Emits `Reconnected`
6. For everyone currently in the room, you receive `ParticipantConnected`
7. Local tracks are republished, emitting `LocalTrackPublished` events

In essence, the full reconnection sequence looks the same as if every other participant had left the room and then rejoined.

---

#### Realtime media

---

## Overview

## Overview

LiveKit provides realtime media exchange between participants using tracks. Each participant can [publish](https://docs.livekit.io/home/client/tracks/publish.md) and [subscribe](https://docs.livekit.io/home/client/tracks/subscribe.md) to as many tracks as makes sense for your application.

### Audio tracks

Audio tracks are typically published from your microphone and played back on the other participants' speakers. You can also produce custom audio tracks, for instance to add background music or other audio effects.

AI agents can consume an audio track to perform speech-to-text, and can publish their own audio track with synthesized speech or other audio effects.

### Video tracks

Video tracks are usually published from a webcam or other video source, and rendered on the other participants' screens within your application's UI. LiveKit also supports screen sharing, which commonly results in two video tracks from the same participant.

AI agents can subscribe to video tracks to perform vision-based tasks, and can publish their own video tracks with synthetic video or other visual effects.

## Example use cases

The following examples demonstrate how to model your application for different use cases.

### AI voice agent

Each room has two participants: an end-user and an AI agent. They can have a natural conversation with the following setup:

- **End-user**: publishes their microphone track and subscribes to the AI agent's audio track
- **AI agent**: subscribes to the user's microphone track and publishes its own audio track with synthesized speech

The UI may be a simple audio visualizer showing that the AI agent is speaking.

### Video conference

Each room has multiple users. Each user publishes audio and/or video tracks and subscribes to all tracks published by others. In the UI, the room is typically displayed as a grid of video tiles.

### Livestreaming

Each room has one broadcaster and a significant number of viewers. The broadcaster publishes audio and video tracks. The viewers subscribe to the broadcaster's tracks but do not publish their own. Interaction is typically performed with a chat component. An AI agent may also join the room to publish live captions.

### AI camera monitoring

Each room has one camera participant that publishes its video track, and one agent that monitors the camera feed and calls out to an external API to take action based on the contents of the video feed (for example, to send an alert). Alternatively, one room can have multiple cameras and an agent that monitors all of them, or an end-user could also optionally join the room to monitor the feeds alongside the agent. A brief sketch of this pattern is shown below.
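The camera-monitoring pattern maps directly onto the track subscription APIs covered later in this guide. A rough sketch using the Python realtime SDK, where `looks_suspicious()` and the alerting step are hypothetical stand-ins for your own detection logic and external API:

```python
import asyncio
from livekit import rtc

def looks_suspicious(frame: rtc.VideoFrame) -> bool:
    # Hypothetical placeholder: swap in your own model or heuristic.
    return False

async def watch(track: rtc.Track, camera_id: str) -> None:
    video_stream = rtc.VideoStream(track)
    async for event in video_stream:
        if looks_suspicious(event.frame):
            # Call out to your external API here (for example, an HTTP POST to an alerting service).
            print(f"alert: camera {camera_id} flagged a frame")
    await video_stream.aclose()

def monitor_cameras(room: rtc.Room) -> None:
    # Watch every video track published to the room, one task per camera.
    @room.on("track_subscribed")
    def on_track_subscribed(track: rtc.Track, publication: rtc.RemoteTrackPublication,
                            participant: rtc.RemoteParticipant):
        if track.kind == rtc.TrackKind.KIND_VIDEO:
            asyncio.create_task(watch(track, participant.identity))
```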
---

---

## Camera & microphone

## Overview

LiveKit includes a simple and consistent method to publish the user's camera and microphone, regardless of the device or browser they are using. In all cases, LiveKit displays the correct indicators when recording is active and acquires the necessary permissions from the user.

```typescript
// Enables the camera and publishes it to a new video track
room.localParticipant.setCameraEnabled(true);

// Enables the microphone and publishes it to a new audio track
room.localParticipant.setMicrophoneEnabled(true);
```

## Device permissions

In native and mobile apps, you typically need to acquire consent from the user to access the microphone or camera. LiveKit integrates with the system privacy settings to record permission and display the correct indicators when audio or video capture is active.

For web browsers, the user is automatically prompted to grant camera and microphone permissions the first time your app attempts to access them, and no additional configuration is required.

**Swift**:

Add these entries to your `Info.plist`:

```xml
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) uses your camera</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) uses your microphone</string>
```

To enable background audio, you must also add the "Background Modes" capability with "Audio, AirPlay, and Picture in Picture" selected. Your `Info.plist` should have:

```xml
<key>UIBackgroundModes</key>
<array>
  <string>audio</string>
</array>
```

---

**Android**:

Add these permissions to your `AndroidManifest.xml`:

```xml
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

Request permissions at runtime:

```kotlin
private fun requestPermissions() {
    val requestPermissionLauncher = registerForActivityResult(
        ActivityResultContracts.RequestMultiplePermissions()
    ) { grants ->
        for (grant in grants.entries) {
            if (!grant.value) {
                Toast.makeText(
                    this,
                    "Missing permission: ${grant.key}",
                    Toast.LENGTH_SHORT
                ).show()
            }
        }
    }

    val neededPermissions = listOf(
        Manifest.permission.RECORD_AUDIO,
        Manifest.permission.CAMERA
    ).filter {
        ContextCompat.checkSelfPermission(
            this,
            it
        ) == PackageManager.PERMISSION_DENIED
    }.toTypedArray()

    if (neededPermissions.isNotEmpty()) {
        requestPermissionLauncher.launch(neededPermissions)
    }
}
```

---

**React Native**:

For iOS, add to `Info.plist`:

```xml
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) uses your camera</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) uses your microphone</string>
```

For Android, add to `AndroidManifest.xml`:

```xml
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

You'll need to request permissions at runtime using a permissions library like `react-native-permissions`.

---

**Flutter**:

For iOS, add to `Info.plist`:

```xml
<key>NSCameraUsageDescription</key>
<string>$(PRODUCT_NAME) uses your camera</string>
<key>NSMicrophoneUsageDescription</key>
<string>$(PRODUCT_NAME) uses your microphone</string>
```

For Android, add to `AndroidManifest.xml`:

```xml
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

Request permissions using the `permission_handler` package:

```dart
import 'package:permission_handler/permission_handler.dart';

// Request permissions
await Permission.camera.request();
await Permission.microphone.request();
```

## Mute and unmute

You can mute any track to stop it from sending data to the server. When a track is muted, LiveKit will trigger a `TrackMuted` event on all participants in the room. You can use this event to update your app's UI and reflect the correct state to all users in the room.

Mute/unmute a track using its corresponding `LocalTrackPublication` object.

## Track permissions

By default, any published track can be subscribed to by all participants.
However, publishers can restrict who can subscribe to their tracks using Track Subscription Permissions: **JavaScript**: ```typescript localParticipant.setTrackSubscriptionPermissions(false, [ { participantIdentity: 'allowed-identity', allowAll: true, }, ]); ``` --- **Swift**: ```swift localParticipant.setTrackSubscriptionPermissions( allParticipantsAllowed: false, trackPermissions: [ ParticipantTrackPermission(participantSid: "allowed-sid", allTracksAllowed: true) ] ) ``` --- **Android**: ```kotlin localParticipant.setTrackSubscriptionPermissions(false, listOf( ParticipantTrackPermission(participantIdentity = "allowed-identity", allTracksAllowed = true), )) ``` --- **Flutter**: ```dart localParticipant.setTrackSubscriptionPermissions( allParticipantsAllowed: false, trackPermissions: [ const ParticipantTrackPermission('allowed-identity', true, null) ], ); ``` --- **Python**: ```python from livekit import rtc local_participant.set_track_subscription_permissions( all_participants_allowed=False, participant_permissions=[ rtc.ParticipantTrackPermission( participant_identity="allowed-identity", allow_all=True, ), ], ) ``` ## Publishing from backend You may also publish audio and video tracks from a backend process, which can be consumed just like any camera or microphone track. The [LiveKit Agents](https://docs.livekit.io/agents.md) framework makes it easy to add a programmable participant to any room, and publish media such as synthesized speech or video. LiveKit also includes complete SDKs for server environments in [Go](https://github.com/livekit/server-sdk-go), [Rust](https://github.com/livekit/rust-sdks), [Python](https://github.com/livekit/python-sdks), and [Node.js](https://github.com/livekit/node-sdks). You can also publish media using the [LiveKit CLI](https://github.com/livekit/livekit-cli?tab=readme-ov-file#publishing-to-a-room). ### Publishing audio tracks You can publish audio by creating an `AudioSource` and publishing it as a track. Audio streams carry raw PCM data at a specified sample rate and channel count. Publishing audio involves splitting the stream into audio frames of a configurable length. An internal buffer holds 50 ms of queued audio to send to the realtime stack. The `capture_frame` method, used to send new frames, is blocking and doesn't return control until the buffer has taken in the entire frame. This allows for easier interruption handling. In order to publish an audio track, you need to determine the sample rate and number of channels beforehand, as well as the length (number of samples) of each frame. 
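For example, at 48 kHz mono with 10 ms frames, each frame carries 48000 × 0.010 = 480 samples per channel, or 960 bytes of 16-bit PCM.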
In the following example, the agent transmits a constant 16-bit sine wave at 48kHz in 10 ms long frames: **Python**: ```python import numpy as np from livekit import agents,rtc from livekit.agents import AgentServer SAMPLE_RATE = 48000 NUM_CHANNELS = 1 # mono audio AMPLITUDE = 2 ** 8 - 1 SAMPLES_PER_CHANNEL = 480 # 10 ms at 48kHz server = AgentServer() @server.rtc_session() async def my_agent(ctx: agents.JobContext): source = rtc.AudioSource(SAMPLE_RATE, NUM_CHANNELS) track = rtc.LocalAudioTrack.create_audio_track("example-track", source) # since the agent is a participant, our audio I/O is its "microphone" options = rtc.TrackPublishOptions(source=rtc.TrackSource.SOURCE_MICROPHONE) # ctx.agent is an alias for ctx.room.local_participant publication = await ctx.agent.publish_track(track, options) frequency = 440 async def _sinewave(): audio_frame = rtc.AudioFrame.create(SAMPLE_RATE, NUM_CHANNELS, SAMPLES_PER_CHANNEL) audio_data = np.frombuffer(audio_frame.data, dtype=np.int16) time = np.arange(SAMPLES_PER_CHANNEL) / SAMPLE_RATE total_samples = 0 while True: time = (total_samples + np.arange(SAMPLES_PER_CHANNEL)) / SAMPLE_RATE sinewave = (AMPLITUDE * np.sin(2 * np.pi * frequency * time)).astype(np.int16) np.copyto(audio_data, sinewave) # send this frame to the track await source.capture_frame(audio_frame) total_samples += SAMPLES_PER_CHANNEL await _sinewave() ``` > ⚠️ **Warning** > > When streaming finite audio (for example, from a file), make sure the frame length isn't longer than the number of samples left to stream, otherwise the end of the buffer consists of noise. #### Audio examples For audio examples using the LiveKit SDK, see the following in the GitHub repository: - **[Speedup Output Audio](https://github.com/livekit/agents/blob/main/examples/voice_agents/speedup_output_audio.py)**: Use the [TTS node](https://docs.livekit.io/agents/build/nodes.md#tts-node) to speed up audio output. - **[Echo Agent](https://github.com/livekit/agents/blob/main/examples/primitives/echo-agent.py)**: Echo user audio back to them. - **[Sync TTS Transcription](https://github.com/livekit/agents/blob/main/examples/other/text-to-speech/sync_tts_transcription.py)**: Uses manual subscription, transcription forwarding, and manually publishes audio output. ### Publishing video tracks Agents publish data to their tracks as a continuous live feed. Video streams can transmit data in any of [11 buffer encodings](https://github.com/livekit/python-sdks/blob/main/livekit-rtc/livekit/rtc/_proto/video_frame_pb2.pyi#L93). When publishing video tracks, you need to establish the frame rate and buffer encoding of the video beforehand. In this example, the agent connects to the room and starts publishing a solid color frame at 10 frames per second (FPS). Copy the following code into your entrypoint function: **Python**: ```python from livekit import rtc from livekit.agents import JobContext WIDTH = 640 HEIGHT = 480 source = rtc.VideoSource(WIDTH, HEIGHT) track = rtc.LocalVideoTrack.create_video_track("example-track", source) options = rtc.TrackPublishOptions( # since the agent is a participant, our video I/O is its "camera" source=rtc.TrackSource.SOURCE_CAMERA, simulcast=True, # when modifying encoding options, max_framerate and max_bitrate must both be set video_encoding=rtc.VideoEncoding( max_framerate=30, max_bitrate=3_000_000, ), video_codec=rtc.VideoCodec.H264, ) publication = await ctx.agent.publish_track(track, options) # this color is encoded as ARGB. when passed to VideoFrame it gets re-encoded. 
#### Audio examples

For audio examples using the LiveKit SDK, see the following in the GitHub repository:

- **[Speedup Output Audio](https://github.com/livekit/agents/blob/main/examples/voice_agents/speedup_output_audio.py)**: Use the [TTS node](https://docs.livekit.io/agents/build/nodes.md#tts-node) to speed up audio output.
- **[Echo Agent](https://github.com/livekit/agents/blob/main/examples/primitives/echo-agent.py)**: Echo user audio back to them.
- **[Sync TTS Transcription](https://github.com/livekit/agents/blob/main/examples/other/text-to-speech/sync_tts_transcription.py)**: Uses manual subscription, transcription forwarding, and manually publishes audio output.

### Publishing video tracks

Agents publish data to their tracks as a continuous live feed. Video streams can transmit data in any of [11 buffer encodings](https://github.com/livekit/python-sdks/blob/main/livekit-rtc/livekit/rtc/_proto/video_frame_pb2.pyi#L93). When publishing video tracks, you need to establish the frame rate and buffer encoding of the video beforehand.

In this example, the agent connects to the room and starts publishing a solid color frame at 10 frames per second (FPS). Copy the following code into your entrypoint function:

**Python**:

```python
import asyncio

from livekit import rtc
from livekit.agents import JobContext

WIDTH = 640
HEIGHT = 480

source = rtc.VideoSource(WIDTH, HEIGHT)
track = rtc.LocalVideoTrack.create_video_track("example-track", source)
options = rtc.TrackPublishOptions(
    # since the agent is a participant, our video I/O is its "camera"
    source=rtc.TrackSource.SOURCE_CAMERA,
    simulcast=True,
    # when modifying encoding options, max_framerate and max_bitrate must both be set
    video_encoding=rtc.VideoEncoding(
        max_framerate=30,
        max_bitrate=3_000_000,
    ),
    video_codec=rtc.VideoCodec.H264,
)
publication = await ctx.agent.publish_track(track, options)

# this color is encoded as ARGB. when passed to VideoFrame it gets re-encoded.
COLOR = [255, 255, 0, 0]  # FFFF0000 RED

async def _draw_color():
    argb_frame = bytearray(WIDTH * HEIGHT * 4)
    while True:
        await asyncio.sleep(0.1)  # 10 fps
        argb_frame[:] = COLOR * WIDTH * HEIGHT
        frame = rtc.VideoFrame(WIDTH, HEIGHT, rtc.VideoBufferType.RGBA, argb_frame)

        # send this frame to the track
        source.capture_frame(frame)

asyncio.create_task(_draw_color())
```

> ℹ️ **Note**
>
> - Although the published frame is static, it's still necessary to stream it continuously for the benefit of participants joining the room after the initial frame is sent.
> - Unlike audio, video `capture_frame` doesn't keep an internal buffer.

LiveKit can translate between video buffer encodings automatically. `VideoFrame` provides the current video buffer type and a method to convert it to any of the other encodings:

**Python**:

```python
async def handle_video(track: rtc.Track):
    video_stream = rtc.VideoStream(track)
    async for event in video_stream:
        video_frame = event.frame
        current_type = video_frame.type
        frame_as_bgra = video_frame.convert(rtc.VideoBufferType.BGRA)
        # [...]
    await video_stream.aclose()

@ctx.room.on("track_subscribed")
def on_track_subscribed(
    track: rtc.Track,
    publication: rtc.TrackPublication,
    participant: rtc.RemoteParticipant,
):
    if track.kind == rtc.TrackKind.KIND_VIDEO:
        asyncio.create_task(handle_video(track))
```

### Audio and video synchronization

> ℹ️ **Note**
>
> `AVSynchronizer` is currently only available in Python.

While WebRTC handles A/V sync natively, some scenarios require manual synchronization, such as when synchronizing generated video with voice output. The [`AVSynchronizer`](https://docs.livekit.io/reference/python/v1/livekit/rtc/index.html.md#livekit.rtc.AVSynchronizer) utility helps maintain synchronization by aligning the first audio and video frames. Subsequent frames are automatically synchronized based on the configured video FPS and audio sample rate.

- **[Audio and video synchronization](https://github.com/livekit/python-sdks/tree/main/examples/video-stream)**: Examples that demonstrate how to synchronize video and audio streams using the `AVSynchronizer` utility.
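The linked examples show complete pipelines. As a rough sketch only, and assuming the constructor arguments and `push()` usage shown in those examples, pairing an audio source with a video source might look like this:

```python
from livekit import rtc

SAMPLE_RATE = 48000
NUM_CHANNELS = 1
WIDTH, HEIGHT = 1280, 720
VIDEO_FPS = 30

audio_source = rtc.AudioSource(SAMPLE_RATE, NUM_CHANNELS)
video_source = rtc.VideoSource(WIDTH, HEIGHT)

# assumption: the synchronizer wraps both sources and is told the video
# frame rate up front; frames are then pushed through it rather than
# captured on the sources directly
av_sync = rtc.AVSynchronizer(
    audio_source=audio_source,
    video_source=video_source,
    video_fps=VIDEO_FPS,
)

async def push_frames(video_frame: rtc.VideoFrame, audio_frame: rtc.AudioFrame):
    # the first audio and video frames are aligned; subsequent frames
    # follow the configured FPS and sample rate
    await av_sync.push(video_frame)
    await av_sync.push(audio_frame)
```

As in the earlier examples, the tracks themselves are still created from `audio_source` and `video_source` and published to the room; the synchronizer coordinates the timing of frame capture.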
---

---

## Screen sharing

## Overview

LiveKit supports screen sharing natively across all platforms. Your screen is published as a video track, just like your camera. Some platforms support local audio sharing as well.

The steps are somewhat different for each platform:

**JavaScript**:

```typescript
// The browser will prompt the user for access and offer a choice of screen, window, or tab
await room.localParticipant.setScreenShareEnabled(true);
```

---

**Swift**:

On iOS, LiveKit integrates with ReplayKit in two modes:

1. **In-app capture (default)**: For sharing content within your app
2. **Broadcast capture**: For sharing screen content even when users switch to other apps

#### In-app capture

The default in-app capture mode requires no additional configuration, but shares only the current application.

```swift
localParticipant.setScreenShare(enabled: true)
```

#### Broadcast capture

To share the full screen while your app is running in the background, you'll need to set up a Broadcast Extension. This allows the user to "Start Broadcast"; you can prompt this from your app, or the user can start it from Control Center. The full steps are described in our [iOS screen sharing guide](https://github.com/livekit/client-sdk-swift/blob/main/Docs/ios-screen-sharing.md), but a summary is included below:

1. Add a new "Broadcast Upload Extension" target with the bundle identifier `.broadcast`.
2. Replace the default `SampleHandler.swift` with the following:

```swift
import LiveKit

#if os(iOS)
@available(macCatalyst 13.1, *)
class SampleHandler: LKSampleHandler {
    override var enableLogging: Bool { true }
}
#endif
```

3. Add both your main app and broadcast extension to a common App Group, named `group.`.
4. Present the broadcast dialog from your app:

```swift
localParticipant.setScreenShare(enabled: true)
```

---

**Android**:

On Android, screen capture is performed using `MediaProjectionManager`:

```kotlin
// Create an intent launcher for screen capture
// This *must* be registered prior to onCreate(), ideally as an instance val
val screenCaptureIntentLauncher = registerForActivityResult(
    ActivityResultContracts.StartActivityForResult()
) { result ->
    val resultCode = result.resultCode
    val data = result.data
    if (resultCode != Activity.RESULT_OK || data == null) {
        return@registerForActivityResult
    }
    lifecycleScope.launch {
        room.localParticipant.setScreenShareEnabled(true, data)
    }
}

// When it's time to enable the screen share, perform the following
val mediaProjectionManager = getSystemService(MEDIA_PROJECTION_SERVICE) as MediaProjectionManager
screenCaptureIntentLauncher.launch(mediaProjectionManager.createScreenCaptureIntent())
```

---

**Flutter**:

```dart
room.localParticipant.setScreenShareEnabled(true);
```

On Android, you must define a foreground service in your AndroidManifest.xml:

```xml
...
```

On iOS, follow [this guide](https://github.com/flutter-webrtc/flutter-webrtc/wiki/iOS-Screen-Sharing#broadcast-extension-quick-setup) to set up a Broadcast Extension.

---

**Unity (WebGL)**:

```csharp
yield return currentRoom.LocalParticipant.SetScreenShareEnabled(true);
```

## Sharing browser audio

> ℹ️ **Note**
>
> Audio sharing is only possible in certain browsers. Check browser support on the [MDN compatibility table](https://developer.mozilla.org/en-US/docs/Web/API/Screen_Capture_API/Using_Screen_Capture#browser_compatibility).

To share audio from a browser tab, you can use the `createScreenTracks` method with the audio option enabled:

```js
const tracks = await localParticipant.createScreenTracks({
  audio: true,
});

tracks.forEach((track) => {
  localParticipant.publishTrack(track);
});
```

### Testing audio sharing

#### Publisher

When sharing audio, make sure you select a **Browser Tab** (not a Window) and check ☑️ **Share tab audio**; otherwise no audio track is generated when calling `createScreenTracks`:

![Popup window for choosing to share entire screen, a specific window, or a Chrome tab, with options to share audio and action buttons.](/images/client/share-browser-audio-screen.png)

#### Subscriber

On the receiving side, you can use [`RoomAudioRenderer`](https://github.com/livekit/components-js/blob/main/packages/react/src/components/RoomAudioRenderer.tsx) to play all audio tracks of the room automatically, [`AudioTrack`](https://github.com/livekit/components-js/blob/main/packages/react/src/components/participant/AudioTrack.tsx) or your own custom `