Multilingual Transcription
Transcribing conversations with more than one spoken language
Your customers may switch between languages ("code switch") within a single meeting. Standard transcription configurations interpret the entire conversation as taking place in one language, so these meetings need a special configuration to produce an accurate transcript. Some of our supported transcription providers offer code-switching configurations for these situations.
Code switching is supported for both async and real-time models, but quality will generally be higher for async models, so we recommend using them unless you require transcription data in real time.
Transcription providers that support code-switching
Provider | Real-time | Async |
---|---|---|
Deepgram | ✅ (docs) | ✅ (docs) |
Gladia | ✅ (docs) | ✅ (docs) |
Speechmatics | 🚧 Only supported for specific language pairs, which must be specified ahead of time (docs) | 🚧 Only supported for specific language pairs, which must be specified ahead of time (docs) |
AWS Transcribe | 🚧 Must provide a list of potential languages ahead of time (docs) | ❌ Async model currently not supported by the Recall.ai API |
AssemblyAI | ❌ Streaming is currently English-only | 🚧 Requires setting a primary language_code (usually the non-English one) (docs) |
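As a rough illustration, the table above can be encoded as a small lookup so application code can filter providers by mode. This is only a sketch: the `Support` enum, the `CODE_SWITCHING_SUPPORT` dictionary, and the `providers_for` helper are hypothetical names invented for this example and are not part of the Recall API. The support levels are copied from the table and may go out of date.

```python
from enum import Enum


class Support(Enum):
    FULL = "full"        # ✅ in the table
    PARTIAL = "partial"  # 🚧 in the table: works, with caveats
    NONE = "none"        # ❌ in the table


# Hypothetical lookup mirroring the table above (not an API response).
CODE_SWITCHING_SUPPORT = {
    "Deepgram": {"real_time": Support.FULL, "async": Support.FULL},
    "Gladia": {"real_time": Support.FULL, "async": Support.FULL},
    "Speechmatics": {"real_time": Support.PARTIAL, "async": Support.PARTIAL},
    "AWS Transcribe": {"real_time": Support.PARTIAL, "async": Support.NONE},
    "AssemblyAI": {"real_time": Support.NONE, "async": Support.PARTIAL},
}


def providers_for(mode, include_partial=True):
    """Return provider names that support code switching in the given mode.

    mode is "real_time" or "async". Set include_partial=False to keep only
    providers without caveats.
    """
    allowed = {Support.FULL}
    if include_partial:
        allowed.add(Support.PARTIAL)
    return [
        name
        for name, modes in CODE_SWITCHING_SUPPORT.items()
        if modes[mode] in allowed
    ]
```

For example, `providers_for("real_time", include_partial=False)` keeps only the providers marked ✅ in the Real-time column.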
To get started with any of these providers, sign up for an account with that provider, get an API key, and add the key to the Recall transcription dashboard.
View the Transcription Dashboard
Real-time transcription code-switching configs
Add the following parameters to your Create Bot or Create Desktop SDK Upload request to enable real-time multilingual transcription. Note: some transcription providers are only supported for meeting bots. For an up-to-date list of supported providers, see this page.
Deepgram
{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "deepgram_streaming": {
          "language": "multi"
        }
      }
    }
  }
}
Gladia
{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "gladia_v2_streaming": {
          "language_config": { "code_switching": true }
        }
      }
    }
  }
}
Speechmatics
{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "speechmatics_streaming": {
          "language": "cmn_en" // example: Mandarin↔English bilingual model
        }
      }
    }
  }
}
AWS Transcribe
{
  "meeting_url": "...",
  "recording_config": {
    "transcript": {
      "provider": {
        "aws_transcribe_streaming": {
          "language_identification": true,
          "language_options": ["en-US", "es-US"],
          "preferred_language": "en-US"
        }
      }
    }
  }
}
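The real-time configs above all share the same nesting, so the request body can be assembled programmatically. The following is a minimal sketch, assuming the body shape shown in the Create Bot examples above. The helper name `build_realtime_body`, the `REALTIME_CODE_SWITCHING` dictionary, and the example meeting URL are hypothetical and exist only for this illustration. It builds the JSON body only and does not send any request.

```python
import copy
import json

# Provider settings copied from the real-time examples above.
REALTIME_CODE_SWITCHING = {
    "deepgram_streaming": {"language": "multi"},
    "gladia_v2_streaming": {"language_config": {"code_switching": True}},
    "speechmatics_streaming": {"language": "cmn_en"},  # example language pair
    "aws_transcribe_streaming": {
        "language_identification": True,
        "language_options": ["en-US", "es-US"],
        "preferred_language": "en-US",
    },
}


def build_realtime_body(meeting_url, provider_key, overrides=None):
    """Build a request body with code switching enabled (hypothetical helper).

    overrides replaces top-level provider settings, for example a different
    Speechmatics language pair or a different AWS language list.
    """
    if provider_key not in REALTIME_CODE_SWITCHING:
        raise ValueError(f"No real-time code-switching config for {provider_key!r}")
    # Deep copy so callers can mutate the result without touching the defaults.
    settings = copy.deepcopy(REALTIME_CODE_SWITCHING[provider_key])
    settings.update(overrides or {})
    return {
        "meeting_url": meeting_url,
        "recording_config": {
            "transcript": {"provider": {provider_key: settings}}
        },
    }


if __name__ == "__main__":
    body = build_realtime_body(
        "https://example.com/your-meeting",  # placeholder meeting URL
        "aws_transcribe_streaming",
        overrides={"language_options": ["en-US", "fr-FR"]},
    )
    print(json.dumps(body, indent=2))
```

Keeping the per-provider settings in one dictionary makes it easy to switch providers without changing the surrounding request structure.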
Async transcription code-switching configs
Add the following parameters to your Create Async Transcript request to enable async multilingual transcription.
Deepgram
{
  "provider": {
    "deepgram_async": {
      "language": "multi"
    }
  }
}
Gladia
{
  "provider": {
    "gladia_async": {
      "language_config": { "code_switching": true }
    }
  }
}
Speechmatics
{
  "provider": {
    "speechmatics_async": {
      "language": "cmn_en" // example: Mandarin↔English bilingual model
    }
  }
}
AssemblyAI
{
  "provider": {
    "assembly_ai_async": {
      "speech_model": "universal",
      "language_code": "es" // set the non-English language (e.g., "es" for Spanish, "de" for German)
    }
  }
}
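The async configs above can be generated the same way. This is a sketch under the assumption that the Create Async Transcript body contains only the `provider` object shown in the examples. The `build_async_body` helper and the `ASYNC_CODE_SWITCHING` dictionary are hypothetical names for this illustration, and no request is sent.

```python
import copy
import json

# Provider settings copied from the async examples above.
ASYNC_CODE_SWITCHING = {
    "deepgram_async": {"language": "multi"},
    "gladia_async": {"language_config": {"code_switching": True}},
    "speechmatics_async": {"language": "cmn_en"},  # example language pair
    "assembly_ai_async": {
        "speech_model": "universal",
        "language_code": "es",  # the non-English language in the meeting
    },
}


def build_async_body(provider_key, **overrides):
    """Build an async transcript body with code switching enabled.

    Keyword overrides replace top-level provider settings, for example
    language_code="de" for a German and English meeting with AssemblyAI.
    """
    if provider_key not in ASYNC_CODE_SWITCHING:
        raise ValueError(f"No async code-switching config for {provider_key!r}")
    # Deep copy so the defaults above are never mutated by callers.
    settings = copy.deepcopy(ASYNC_CODE_SWITCHING[provider_key])
    settings.update(overrides)
    return {"provider": {provider_key: settings}}


if __name__ == "__main__":
    print(json.dumps(build_async_body("assembly_ai_async", language_code="de"), indent=2))
```

Because `json.dumps` emits standard JSON, the output contains no `//` comments like the annotated examples above, so it can be sent as a request body as-is.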
FAQ
What languages are supported by each provider?
Transcription providers are constantly updating and adding support for more languages, so the most reliable source of information is their own documentation.