Google Realtime Model for real-time voice conversations with Gemini models

Constructors

  • Parameters

    • options: {
          apiKey?: string;
          apiVersion?: string;
          candidateCount?: number;
          connOptions?: APIConnectOptions;
          contextWindowCompression?: ContextWindowCompressionConfig;
          enableAffectiveDialog?: boolean;
          frequencyPenalty?: number;
          geminiTools?: LLMTools;
          httpOptions?: HttpOptions;
          imageEncodeOptions?: {
              format: "JPEG";
              quality: number;
              resizeOptions: {
                  height: number;
                  strategy: "scale_aspect_fit";
                  width: number;
              };
          };
          inputAudioTranscription?: null | AudioTranscriptionConfig;
          instructions?: string;
          language?: string;
          location?: string;
          maxOutputTokens?: number;
          modalities?: Modality[];
          model?: string;
          outputAudioTranscription?: null | AudioTranscriptionConfig;
          presencePenalty?: number;
          proactivity?: boolean;
          project?: string;
          realtimeInputConfig?: RealtimeInputConfig;
          temperature?: number;
          topK?: number;
          topP?: number;
          vertexai?: boolean;
          voice?: string;
      } = {}
      • Optional apiKey?: string

        Google Gemini API key. If omitted, the key is read from the GOOGLE_API_KEY environment variable.

      • Optional apiVersion?: string

        API version to use

      • Optional candidateCount?: number

        The number of candidate responses to generate

      • Optional connOptions?: APIConnectOptions

        The configuration for the API connection

      • Optional contextWindowCompression?: ContextWindowCompressionConfig

        The configuration for context window compression

      • Optional enableAffectiveDialog?: boolean

        Whether to enable affective dialog

      • Optional frequencyPenalty?: number

        The frequency penalty for response generation

      • Optional geminiTools?: LLMTools

        Gemini-specific tools to use for the session

      • Optional httpOptions?: HttpOptions

        HTTP options for API requests

      • Optional imageEncodeOptions?: {
            format: "JPEG";
            quality: number;
            resizeOptions: {
                height: number;
                strategy: "scale_aspect_fit";
                width: number;
            };
        }

        The configuration for image encoding

        • format: "JPEG"
        • quality: number
        • resizeOptions: {
              height: number;
              strategy: "scale_aspect_fit";
              width: number;
          }
          • height: number
          • strategy: "scale_aspect_fit"
          • width: number
      • Optional inputAudioTranscription?: null | AudioTranscriptionConfig

        The configuration for input audio transcription

      • Optional instructions?: string

        Initial system instructions for the model

      • Optional language?: string

        The language to use for the API, given as a BCP-47 code. See https://ai.google.dev/gemini-api/docs/live#supported-languages for the supported languages.

      • Optional location?: string

        The location to use for the API (Vertex AI only)

      • Optional maxOutputTokens?: number

        Maximum number of tokens in the response

      • Optional modalities?: Modality[]

        Modalities to use, such as [Modality.TEXT, Modality.AUDIO]

      • Optional model?: string

        The name of the model to use

      • Optional outputAudioTranscription?: null | AudioTranscriptionConfig

        The configuration for output audio transcription

      • Optional presencePenalty?: number

        The presence penalty for response generation

      • Optional proactivity?: boolean

        Whether to enable proactive audio

      • Optional project?: string

        The project ID to use for the API (Vertex AI only)

      • Optional realtimeInputConfig?: RealtimeInputConfig

        The configuration for realtime input

      • Optional temperature?: number

        Sampling temperature for response generation

      • Optional topK?: number

        The top-k value for response generation

      • Optional topP?: number

        The top-p value for response generation

      • Optional vertexai?: boolean

        Whether to use Vertex AI for the API

      • Optional voice?: string

        Voice setting for audio outputs
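
    The imageEncodeOptions.resizeOptions fields listed above name a "scale_aspect_fit" strategy without describing it. The sketch below shows one plausible reading: scale the source image uniformly so that it fits inside the width x height box while keeping its aspect ratio. The types ResizeOptionsSketch and Size and the function fitWithinBox are hypothetical names introduced for illustration; they are not part of the library, and the library's actual rounding or upscaling behavior may differ.

    ```typescript
    // Hypothetical local mirror of the documented resizeOptions shape.
    interface ResizeOptionsSketch {
      width: number;
      height: number;
      strategy: "scale_aspect_fit";
    }

    interface Size {
      width: number;
      height: number;
    }

    // Assumed meaning of "scale_aspect_fit": apply a single scale factor to both
    // dimensions so the result fits inside the box. This sketch does not clamp
    // the factor, so a small source image would be scaled up.
    function fitWithinBox(source: Size, box: ResizeOptionsSketch): Size {
      const scale = Math.min(box.width / source.width, box.height / source.height);
      return {
        width: Math.round(source.width * scale),
        height: Math.round(source.height * scale),
      };
    }

    // A 1920x1080 frame fitted into a 1024x1024 box keeps its 16:9 shape.
    const fitted = fitWithinBox(
      { width: 1920, height: 1080 },
      { width: 1024, height: 1024, strategy: "scale_aspect_fit" },
    );
    ```

    The limiting dimension (here the width) determines the scale factor, so the other dimension ends up smaller than the box rather than being stretched to fill it.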

    Returns beta.realtime.RealtimeModel
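
    As a rough illustration of how the options above fit together, the sketch below assembles an options object and resolves the API key with the documented fallback to GOOGLE_API_KEY. To stay self-contained it declares a hypothetical local type, RealtimeOptionsSketch, that mirrors only a few of the documented fields; resolveApiKey and Env are likewise illustrative names, and the real constructor may resolve credentials differently in detail.

    ```typescript
    // Hypothetical subset of the documented constructor options.
    // This is not the library's own type.
    interface RealtimeOptionsSketch {
      apiKey?: string;
      instructions?: string;
      language?: string; // BCP-47 code
      temperature?: number;
      vertexai?: boolean;
      project?: string; // documented as used with Vertex AI
      location?: string; // documented as used with Vertex AI
    }

    // Environment passed in explicitly so the sketch has no runtime dependencies.
    type Env = Record<string, string | undefined>;

    // Documented fallback: use the explicit apiKey if one is defined,
    // otherwise read GOOGLE_API_KEY from the environment.
    function resolveApiKey(
      options: RealtimeOptionsSketch,
      env: Env,
    ): string | undefined {
      return options.apiKey ?? env["GOOGLE_API_KEY"];
    }

    // Every field is optional, so an empty object is also a valid argument.
    const options: RealtimeOptionsSketch = {
      instructions: "You are a concise voice assistant.",
      language: "en-US",
      temperature: 0.7,
    };

    const key = resolveApiKey(options, { GOOGLE_API_KEY: "key-from-env" });
    ```

    In a Node.js process, process.env could be passed as the second argument. Because the documented default for the options parameter is {}, calling the constructor with no argument at all relies entirely on such environment-based configuration.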

Accessors

Methods