Multilingual Transcription

Meeting participants sometimes switch between languages within a single conversation (known as "code switching"). Standard transcription configurations interpret the entire conversation as taking place in a single language, so producing an accurate transcript of such a meeting requires a special configuration. Some of our supported transcription providers offer code-switching configurations to handle these situations.

Code switching is supported for both async and real-time models, but quality will generally be higher for async models, so we recommend using them unless you require transcription data in real time.

Transcription providers that support code-switching

Provider	Real-time	Async
Recall AI Transcription	✅ for `prioritize_accuracy` by default, ❌ for `prioritize_low_latency`	✅ (by default)
Deepgram	✅ (docs)	✅ (docs)
Gladia	✅ (docs)	✅ (docs)
Speechmatics	🚧 Only supported for specific language pairs, which must be specified ahead of time (docs)	🚧 Only supported for specific language pairs, which must be specified ahead of time (docs)
AWS Transcribe	🚧 Must provide a list of potential languages ahead of time (docs)	❌ Async model currently not supported by the Recall.ai API
AssemblyAI	🚧 Streaming supports code-switching, but only among 6 languages (docs)	✅ (docs)

To get started with any of these providers, sign up for an account, get an API key, and add it to the Recall transcription dashboard.

View the Transcription Dashboard for your region:
(US) us-east-1
(Pay-as-you-go) us-west-2
(EU) eu-central-1
(JP) ap-northeast-1

Real-time transcription code-switching configs

Add the following parameters to your Create Bot or Create Desktop SDK Upload request to enable real-time multilingual transcription. Note that some transcription providers are only supported for meeting bots. For an up-to-date list of supported providers, see this page.

Deepgram

{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "deepgram_streaming": {
          "model": "nova-3",
          "language": "multi"
        }
      }
    }
  }
}

Gladia

{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "gladia_v2_streaming": {
          "language_config": { "code_switching": true }
        }
      }
    }
  }
}

Speechmatics

{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "speechmatics_streaming": {
          "language": "cmn_en"   // example: Mandarin↔English bilingual model
        }
      }
    }
  }
}

AWS Transcribe

{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "aws_transcribe_streaming": {
          "language_identification": true,
          "language_options": ["en-US", "es-US"],
          "preferred_language": "en-US"
        }
      }
    }
  }
}
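
If you assemble these request bodies in code, the four real-time configs above can be kept in one lookup table. The following is a minimal sketch rather than an official client: `REALTIME_CODE_SWITCHING` and `build_create_bot_body` are hypothetical names, the provider settings are copied from the examples above, and sending the request (endpoint, authentication) is left out. Standard JSON does not allow `//` comments, so the annotation from the Speechmatics example is kept as a Python comment instead.

```python
import copy
import json

# Real-time provider settings copied from the examples above.
# This table is a hypothetical convenience, not part of the Recall API.
REALTIME_CODE_SWITCHING = {
    "deepgram_streaming": {"model": "nova-3", "language": "multi"},
    "gladia_v2_streaming": {"language_config": {"code_switching": True}},
    # Example pair only: Mandarin/English bilingual model.
    "speechmatics_streaming": {"language": "cmn_en"},
    "aws_transcribe_streaming": {
        "language_identification": True,
        "language_options": ["en-US", "es-US"],
        "preferred_language": "en-US",
    },
}


def build_create_bot_body(meeting_url: str, provider: str) -> dict:
    """Return a Create Bot request body with code switching enabled."""
    if provider not in REALTIME_CODE_SWITCHING:
        raise ValueError(f"no real-time code-switching config for {provider!r}")
    # Copy so callers can modify the result without touching the shared table.
    settings = copy.deepcopy(REALTIME_CODE_SWITCHING[provider])
    return {
        "meeting_url": meeting_url,
        "recording_config": {"transcript": {"provider": {provider: settings}}},
    }


if __name__ == "__main__":
    body = build_create_bot_body("https://example.com/meeting", "deepgram_streaming")
    # json.dumps produces comment-free JSON suitable for a request body.
    print(json.dumps(body, indent=2))
```

Keeping the provider name as the dictionary key mirrors the shape of the request, where the provider identifier is itself the key under `provider`.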

Async transcription code-switching configs

Add the following parameters to your Create Async Transcript request to enable async multilingual transcription.

Deepgram

{
  "provider": {
    "deepgram_async": {
      "model": "nova-3",
      "language": "multi"
    }
  }
}

Gladia

{
  "provider": {
    "gladia_v2_async": {
      "language_config": {
        "languages": [],
        "code_switching": true
      }
    }
  }
}

Speechmatics

{
  "provider": {
    "speechmatics_async": {
      "language": "cmn_en" // example: Mandarin↔English bilingual model
    }
  }
}

AssemblyAI

{
  "provider": {
    "assembly_ai_async": {
      "speech_model": "universal",
      "language_code": "es"   // set the non-English language (e.g., "es" for Spanish, "de" for German)
    }
  }
}

FAQ

Why is my recording transcribed in the wrong language?

If your transcript is coming back in the wrong language, it’s usually due to one of the following:

The transcription language was set incorrectly - Check your transcription configuration and confirm the language code/setting matches the language spoken in the recording.
The audio quality made language detection unreliable - Background noise, echo, low volume, overlapping speakers, or audio coming from a mobile device can cause the model to misidentify the language.
Speech characteristics reduced language recognition accuracy - Very heavy accents, unclear speech, or fast/colloquial speech can increase the chance of the model selecting the wrong language.
Your transcription provider may have weaker support for that language - Some providers perform better in certain languages than in others, so check their documentation to see which languages are supported.

If the configuration and audio quality look correct, you can try using a different transcription provider. Transcription quality differs between providers and use cases, so it's best to test a few to find the one that works for you!

What languages are supported by each provider?

Transcription providers are constantly updating and adding support for more languages, so the most reliable source of information is their own documentation.