Multilingual transcription

Configure speech-to-text providers for transcription in multiple languages, including language detection and code-switching support.

In some meetings, participants may speak in more than one language, and you may not know ahead of time which language or languages will be used. In those cases, you can use a transcription provider that supports multilingual transcription. Multilingual transcription generally includes two distinct features:

Feature | Description
Language detection | Detects the spoken language without requiring you to set a language explicitly.
Code-switching | Handles conversations where speakers switch between two or more languages during the same meeting.

Multilingual transcription lets you generate transcripts for recordings where the spoken language is not English, where the language should be detected automatically, or where multiple languages may be spoken in the same recording.

Provider support varies by transcription workflow. Some providers support automatic language detection, some support code-switching, and some require you to specify the language or possible languages ahead of time.

🚧

Multilingual transcription does not natively translate transcripts

Multilingual transcription generates transcripts in the language that was spoken. If you need translated transcripts in a specific language, use a translation feature from your speech-to-text provider or translate the transcript after it is generated.

❗️

Real-time multilingual transcription with Recall.ai Transcription requires mode: "prioritize_accuracy"

For recallai_streaming, multilingual transcription currently requires mode: "prioritize_accuracy" with language_code: "auto". This mode sends transcript.data and transcript.partial_data events every 3 to 10 minutes.

mode: "prioritize_low_latency" currently supports English only.
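
As a rough illustration of the two modes described above, the following Python sketch builds the provider options for recallai_streaming. Only the recallai_streaming key and the mode and language_code values for the multilingual case come from this page. The helper name build_recallai_streaming_config, the recording_config / transcript / provider nesting, and the "en" language code for the low-latency case are assumptions made for the example; check the provider guide for the exact request body.

```python
import json


def build_recallai_streaming_config(multilingual: bool) -> dict:
    """Build a hypothetical transcription config for recallai_streaming.

    The outer nesting (recording_config -> transcript -> provider) is an
    assumption for illustration and may not match the real request body.
    """
    if multilingual:
        # From this page: multilingual transcription requires
        # prioritize_accuracy together with language_code "auto".
        provider_options = {
            "mode": "prioritize_accuracy",
            "language_code": "auto",
        }
    else:
        # From this page: prioritize_low_latency supports English only.
        # The "en" value is an assumed language code for English.
        provider_options = {
            "mode": "prioritize_low_latency",
            "language_code": "en",
        }

    return {
        "recording_config": {
            "transcript": {
                "provider": {"recallai_streaming": provider_options}
            }
        }
    }


if __name__ == "__main__":
    # Print the multilingual variant as JSON for inspection.
    print(json.dumps(build_recallai_streaming_config(multilingual=True), indent=2))
```

Keeping the mode and language code in one helper makes it harder to pair prioritize_low_latency with automatic language detection by accident, a combination this page describes as unsupported.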


Multilingual transcription provider guides

Use the provider guides below for provider-specific multilingual transcription support, configuration options, supported workflows, and example request bodies.


FAQ

Why is my recording transcribed in the wrong language?

If your transcript is returned in the wrong language, it is usually due to one of the following:

  • The transcription language was configured incorrectly - Check your transcription configuration and confirm the language code or language detection setting matches the language spoken in the recording.
  • The audio quality made language detection unreliable - Background noise, echo, low volume, overlapping speakers, or audio from a mobile device can cause the model to misidentify the language.
  • Speech characteristics reduced language recognition accuracy - Heavy accents, unclear speech, or fast and colloquial speech can increase the chance of the model selecting the wrong language.
  • The provider may have weaker support for that language - Some providers perform better in certain languages than others.

If the configuration and audio quality look correct, try testing a different speech-to-text provider. Transcription quality varies by provider, language, audio quality, and use case.

What languages are supported by each provider?

Supported languages change over time and vary by provider, model, and transcription workflow. For the most accurate information, see the provider-specific multilingual transcription guide for the provider you want to use.