Overview
Use the LiveKit CLI to configure, deploy, and manage your agent deployments. This guide covers deployment configuration, deploying new versions, rolling back, and understanding cold starts.
Configuration
The livekit.toml file contains your agent's deployment configuration. The CLI automatically looks for this file in the current directory and uses it for any lk agent command run from that directory.
[project]
subdomain = "<my-project-subdomain>"

[agent]
id = "<agent-id>"
To generate a new livekit.toml file, run:
lk agent config
Deploying new versions
To deploy a new version of your agent, run the following command:
lk agent deploy
LiveKit Cloud builds a container image that includes your agent code. The new version is pushed to production using a rolling deployment strategy. The rolling deployment allows new instances to serve new sessions, while existing instances are given up to 1 hour to complete active sessions. This ensures your new version is deployed without user interruptions or service downtime.
When you run lk agent deploy, LiveKit Cloud follows this process:
- Build: The CLI uploads your code and builds a container image from your Dockerfile. See Builds and Dockerfiles for more information.
- Deploy: New agent instances with your updated code are deployed alongside existing instances.
- Route new sessions: New agent requests are routed to new instances.
- Graceful shutdown: Old instances stop accepting new sessions, while remaining active for up to 1 hour to complete any active sessions.
- Autoscale: New instances are automatically scaled up and down to meet demand.
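The routing and graceful-shutdown steps above can be illustrated with a toy model. This is a sketch for building intuition only, not LiveKit Cloud's implementation: the `Instance` class and the `deploy`, `route`, and `advance` functions are hypothetical names, the build and autoscale steps are left out, and the only figure taken from this guide is the 1-hour limit for completing active sessions.

```python
from dataclasses import dataclass, field

DRAIN_LIMIT_S = 3600  # "up to 1 hour" for old instances to finish sessions


@dataclass
class Instance:
    version: str
    sessions: list[int] = field(default_factory=list)  # seconds left per session
    draining: bool = False
    drained_for_s: int = 0


def deploy(fleet: list[Instance], new_version: str) -> None:
    """Mark existing instances as draining and add one running new_version."""
    for inst in fleet:
        inst.draining = True
    fleet.append(Instance(new_version))


def route(fleet: list[Instance], duration_s: int) -> Instance:
    """Assign a new session to the first instance that still accepts sessions."""
    target = next(inst for inst in fleet if not inst.draining)
    target.sessions.append(duration_s)
    return target


def advance(fleet: list[Instance], seconds: int) -> list[Instance]:
    """Move time forward and return the instances that are still running."""
    survivors = []
    for inst in fleet:
        # Drop sessions that finish within this step; shorten the rest.
        inst.sessions = [s - seconds for s in inst.sessions if s > seconds]
        if inst.draining:
            inst.drained_for_s += seconds
            # A draining instance stops once idle or once the limit is reached.
            if not inst.sessions or inst.drained_for_s >= DRAIN_LIMIT_S:
                continue
        survivors.append(inst)
    return survivors


if __name__ == "__main__":
    fleet = [Instance("v1")]
    route(fleet, 1800)                 # a 30-minute session lands on v1
    deploy(fleet, "v2")
    print(route(fleet, 600).version)   # new sessions go to the new version
    fleet = advance(fleet, 900)
    print([i.version for i in fleet])  # v1 is still finishing its session
    fleet = advance(fleet, 900)
    print([i.version for i in fleet])  # v1 has drained and shut down
```

In this model, a session that is still active on an old instance when the limit is reached ends with that instance, which is why the 1-hour window matters for long-running sessions.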
Rolling back
You can quickly roll back to a previous version of your agent, without a rebuild, by running the following command:
lk agent rollback
Rollback operates in the same rolling manner as a normal deployment.
Instant rollback is available only on paid LiveKit Cloud plans. Users on free plans should revert their code to an earlier version and then redeploy.
Cold start
On certain plans, agents can be scaled down to zero replicas. When a new user connects to an agent that has no running replicas, an instance must perform a "cold start" before it can serve them, so that first connection can take somewhat longer than usual. For more information, see the Quotas and limits guide.