Asynchronous Transcription
Create transcripts after a meeting ends: learn when to start transcription, how to handle completion or failure, and how to fetch the final result.
Overview
Async transcription is used to generate a transcript for a completed recording, after a meeting concludes. You should use async transcription when:
- You don't require transcripts in real-time
- You require certain features that real-time transcription can't provide (dependent on the transcription provider)
Once you have a transcript, typical use cases include:
- Summarizing the content of the transcript
- Extracting information to pre-fill forms
- Extracting action items
For a working example, you can refer to this sample app, which creates and retrieves a transcript after a meeting has concluded via a meeting bot.
Supported Async Transcription Providers
| Transcription Provider | Async Transcription (Bots + DSDK) | Provider Field Name in Create Async Transcript |
|---|---|---|
| Recall.ai | ✅ Yes | recallai_async |
| Eleven Labs | ✅ Yes | elevenlabs_async |
| Assembly AI | ✅ Yes | assembly_ai_async |
| Deepgram | ✅ Yes | deepgram_async |
| AWS Transcribe | ✅ Yes | aws_transcribe_async |
| Rev | ✅ Yes | rev_async |
| Speechmatics | ✅ Yes | speechmatics_async |
| Gladia | ✅ Yes | gladia_v2_async |
| Google Cloud STT | ✅ Yes | google_speech_v2_async |
You can view the full list of supported fields for each of the providers in the Create Async Transcript endpoint under the provider body params.
Prerequisites
Before implementing async transcription, ensure the required prerequisite setup is complete. You should have:
- A stable public URL for your application. In development, this is usually a static ngrok URL or something similar.
- Your Recall API key and workspace verification secret.
- A webhook endpoint configured in the Recall dashboard and subscribed to the following events:
`recording.done`, `transcript.done`, and `transcript.failed`
A human must complete quick one-time setup tasks in the Recall dashboard. If an agent is guiding setup, it should treat this section as human-owned setup and confirm with the human that each item is complete before continuing.
Ensure the backend has a stable public URL
Ensure the application has a stable public URL that Recall can reach for webhooks, callbacks, websockets, and other real-time endpoints.
For local development, this should be a static ngrok URL rather than a temporary URL that changes between sessions. See the Local Webhook Development Guide for how to set this up.
Create the Recall API key and workspace verification secrets
Ensure the required Recall API credentials and verification secrets have been created in the Recall dashboard for the selected region.
The Recall API key and workspace verification secrets are required to interact with the Recall API and to secure your application.
Configure a webhook endpoint to receive artifact status change events
Ensure that you have configured a Recall webhook endpoint in the webhooks dashboard that points to either:
- a static ngrok URL for local development, or
- a public server that is ready to receive and process webhook events
Also ensure that this endpoint is subscribed to the required webhook events for this feature.
Implementation Guide
To implement post-meeting transcription correctly, first create a recording artifact using the Meeting Bots API or the Desktop Recording SDK. After that point, you can create a transcript for the recording by:
- Waiting for the `recording.done` webhook event, which is triggered after the meeting has ended and the recording is available.
- Calling the Create Async Transcript endpoint for that recording, configuring a transcription provider.
- Waiting for the `transcript.done` or `transcript.failed` webhook event to know when the transcript job has completed, then querying the result, found on the transcript artifact.
Important: Requirements for a reliable integration

To integrate with Recall reliably, your application must:
- Secure your Recall endpoints - Do not trust incoming Recall.ai requests by default. Your application must verify every webhook, websocket, and callback request before accepting or processing it. See How to verify webhooks, websockets and callback requests from Recall.ai.
- Schedule bots in advance whenever possible - Creating bots at the last minute increases the chance of `507` errors. See the Creating and scheduling bots guide for more details.
- Retry Create Bot requests that return `507` status codes - Retry any `507` responses returned by the Create Bot request every 30 seconds, for up to 10 attempts. Otherwise, a bot will not be created.
- Process webhook work asynchronously - Acknowledge Recall webhook requests quickly, then handle downstream work asynchronously. Otherwise, the request may time out and Recall may retry it.
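The `507` retry rule above can be sketched as follows. This is a minimal example, not a Recall SDK function: `send` stands in for your own Create Bot HTTP call, and `sleep` is injectable so the loop can be exercised without waiting.

```python
import time

def create_bot_with_retry(send, max_attempts=10, delay=30, sleep=time.sleep):
    """Call `send()` (your own Create Bot request, returning a
    (status_code, body) tuple) and retry on 507 every `delay` seconds,
    up to `max_attempts` attempts, as the checklist above requires."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status != 507:
            return status, body
        if attempt < max_attempts:
            sleep(delay)  # Recall recommends retrying 507s every 30 seconds
    # All attempts exhausted; surface the final 507 to the caller.
    return status, body
```

Injecting `send` and `sleep` keeps the retry policy separate from your HTTP client, so the same loop works with `requests`, `urllib`, or a queued job runner.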
Step 1: Wait for Recall to notify your webhook that recording is complete
Recall will notify your application when a recording is ready for transcription by sending your app a recording.done recording artifact status change webhook event to the webhook endpoint you configured in your Recall dashboard.
You will only receive the webhook event if the webhook endpoint has subscribed to the `recording.done` event in the dashboard. See Async Transcription for details on how to subscribe to this event.
When your application receives the recording.done webhook event, it can start an async transcript job for that recording, as seen in the next step.
Recording artifact status change webhook
You can receive a webhook notifying you when the recording artifact status changes (e.g., from processing to done).
{
"event": string,
"data": {
"data": {
"code": string,
"sub_code": string | null,
"updated_at": string // ISO 8601
},
"recording": {
"id": string,
"metadata": object
},
"bot": {
"id": string,
"metadata": object
}
}
}
You must explicitly subscribe to each event you want to receive from the dashboard, as defined in the table below.
Recording artifact webhook events and codes
| Event | Code | Description |
|---|---|---|
| `recording.processing` | processing | The recording has started |
| `recording.done` | done | The recording has successfully completed. All data for media objects on the recording is now available |
| `recording.failed` | failed | The recording failed to be captured. The data.data.sub_code will contain a machine-readable code for the failure |
| `recording.deleted` | deleted | The recording has been deleted from Recall systems. |
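As a sketch of how a webhook endpoint might dispatch these events, the function below routes a parsed recording status-change payload using the schema shown above. The names are illustrative: `start_transcription` is your own callback that kicks off Step 2, and heavy work should be queued rather than done inline so the webhook can be acknowledged quickly.

```python
def handle_recording_event(payload, start_transcription):
    """Dispatch a recording artifact status-change webhook payload
    (already parsed from JSON). `start_transcription` is a hypothetical
    application callback that queues the Create Async Transcript call."""
    event = payload["event"]
    recording_id = payload["data"]["recording"]["id"]
    if event == "recording.done":
        # All media for the recording is available; safe to create a transcript.
        start_transcription(recording_id)
        return "queued"
    if event == "recording.failed":
        # data.data.sub_code carries the machine-readable failure reason.
        sub_code = payload["data"]["data"]["sub_code"]
        return f"failed:{sub_code}"
    # recording.processing, recording.deleted, etc. need no action here.
    return "ignored"
```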
Step 2: Start a transcription job
After your application receives a recording.done event for a recording, call the Create Async Transcript endpoint for that recording.
For example, the following request will create a transcript using the Recall.ai async transcription provider.
curl --request POST \
--url https://RECALL_REGION.recall.ai/api/v1/recording/{RECORDING_ID}/create_transcript/ \
--header "Authorization: RECALLAI_API_KEY" \
--header "accept: application/json" \
--header "content-type: application/json" \
--data '
{
"provider": {
"recallai_async": {
"language_code": "auto"
}
},
"diarization": {
"use_separate_streams_when_available": true
}
}
'
For the full list of provider-specific options, see the Create Async Transcript API reference.
Only 10 Successful Transcripts Allowed Per Recording

To prevent accidental loops from creating excessive transcripts for the same recording (which could incur usage costs with the transcription provider), Recall limits each recording to 10 successful transcripts.
If you reach this limit, delete an existing transcript for that recording before retrying by calling the Delete Transcript API endpoint.
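The curl request above can be assembled in application code. The sketch below builds the URL, headers, and JSON body for the Create Async Transcript call; the region string, recording id, and helper name are illustrative assumptions, not part of the Recall API.

```python
import json

def build_create_transcript_request(region, recording_id, api_key,
                                    provider="recallai_async",
                                    provider_options=None):
    """Build the URL, headers, and JSON body for the Create Async
    Transcript call, mirroring the curl example above. Pass any
    provider-specific options (e.g. {"language_code": "auto"})
    through `provider_options`."""
    url = (f"https://{region}.recall.ai/api/v1/recording/"
           f"{recording_id}/create_transcript/")
    headers = {
        "Authorization": api_key,
        "accept": "application/json",
        "content-type": "application/json",
    }
    body = {
        "provider": {provider: provider_options or {}},
        # Perfect Diarization, as recommended later in this guide.
        "diarization": {"use_separate_streams_when_available": True},
    }
    return url, headers, json.dumps(body)
```

You would then POST `body` to `url` with the HTTP client of your choice; only one successful transcript is needed per recording, so gate this call behind your own idempotency check.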
Step 3: Query the result of the transcription job
Upon completing the transcription job, Recall will send your webhook endpoint one of the following events depending on the outcome of the transcription job:
- `transcript.done` - the transcript was created successfully and is now available to query. The result is available at the transcript artifact's download URL.
- `transcript.failed` - the transcription provider failed to generate a transcript for the recording. More details are available in the transcript artifact's status.
Transcript artifact status change webhook
You can receive a webhook notifying you when the transcript artifact status changes (e.g., from processing to done).
{
"event": string,
"data": {
"data": {
"code": string,
"sub_code": string | null,
"updated_at": string // ISO 8601
},
"transcript": {
"id": string,
"metadata": object
},
"recording": {
"id": string,
"metadata": object
},
"bot": {
"id": string,
"metadata": object
}
}
}
You must explicitly subscribe to each event you want to receive from the webhooks dashboard, as defined in the table below.
Transcript artifact webhook events and codes
| Event | Code | Description |
|---|---|---|
| `transcript.processing` | processing | The transcript job has started processing |
| `transcript.done` | done | The transcript has successfully completed. The transcript data is now available to query |
| `transcript.failed` | failed | The transcript failed to be generated. The data.data.sub_code will contain a machine-readable code for the failure. See below for the list of sub codes |
| `transcript.deleted` | deleted | The transcript has been deleted from Recall systems. |
Transcript artifact webhook sub codes
Below is a list of `sub_code` values that can be found on a `transcript.failed` webhook event.
| Sub Code | Reason |
|---|---|
| `provider_connection_failed` | Recall was not able to connect to the third-party transcription provider. Common reasons include: insufficient funds in the transcription provider account for which the API key is provided, using paid features on a free account, or temporary service unavailability from the transcription provider |
| `zoom_global_captions_disabled` | Meeting captions are disabled by the Zoom account |
| `zoom_host_disabled_meeting_captions` | The host of the Zoom meeting has disabled meeting captions |
| `zoom_captions_failure` | There was an error enabling meeting captions for the Zoom call |
Case 1: The transcript was created successfully
In this case, the transcript was successfully created and is available for you to query, so Recall sent you a `transcript.done` transcript artifact status change webhook event. You can get the transcript ID from the webhook event at `data.transcript.id` and call the Retrieve Transcript endpoint like so:
curl --request GET \
--url https://RECALL_REGION.recall.ai/api/v1/transcript/{TRANSCRIPT_ID}/ \
--header "Authorization: RECALLAI_API_KEY" \
  --header "accept: application/json"
The response schema can be found in the Retrieve Transcript API reference. The response payload will contain the data.download_url field, which you can use to query the transcript that was created for this recording.
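A minimal sketch of following `data.download_url` in Python. The injectable `opener` parameter is a testing convenience and an assumption of this example, not part of the Recall API; by default it performs a real HTTP GET.

```python
import json
import urllib.request

def fetch_transcript_json(retrieve_response, opener=urllib.request.urlopen):
    """Given the parsed Retrieve Transcript response, follow
    data.download_url and return the transcript JSON (the
    participant/words structure documented below)."""
    download_url = retrieve_response["data"]["download_url"]
    if download_url is None:
        # The transcript data may not be ready yet, or the URL expired.
        raise ValueError("Transcript download URL is not available")
    with opener(download_url) as resp:
        return json.load(resp)
```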
Transcript download url data schema
The resulting data from querying the data.download_url will be returned as follows:
[
{
"participant": {
"id": number, // Id of the participant in the meeting. This id is not unique across meetings.
"name": string | null, // Display name of the participant.
"is_host": boolean | null, // Whether the participant is the host of the meeting.
"platform": string | null, // Meeting platform constant
"extra_data": json | null, // Extra data about the participant from the meeting platform.
"email": string | null, // Email, if participant identification is turned on
},
"language_code": string, // The language code from the transcription provider, normalized to BCP-47.
// The simple code is .split('-')[0], and beware that some languages require
// 3-character codes (e.g. yue and haw)
"words": [
{
"text": string, // The text of the word.
"start_timestamp": {
"absolute": string, // ISO 8601, will return null for async transcription
"relative": number // seconds
},
"end_timestamp": {
"absolute": string, // ISO 8601, will return null for async transcription
"relative": number // seconds
}
}
]
}
]
Case 2: The transcript failed to generate
In this case, the transcription provider failed to generate a transcript, so Recall sent you a `transcript.failed` event. You can get the machine-readable code from the `data.data.sub_code` field of the `transcript.failed` webhook event.
If the transcription job fails, you can retry the transcription with a backup transcription provider.
You can also check the bot logs in the dashboard to see why it failed.
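One way to sketch the backup-provider retry: keep an ordered fallback chain and advance through it each time a `transcript.failed` event arrives. The chain below is an assumption for illustration; use providers your account actually has credentials configured for.

```python
# Hypothetical fallback order; any providers from the table above work here.
FALLBACK_CHAIN = ["recallai_async", "assembly_ai_async", "deepgram_async"]

def pick_backup_provider(failed_provider, chain=FALLBACK_CHAIN):
    """Return the next provider to try after `failed_provider`, or None
    when the chain is exhausted. Unknown providers restart the chain."""
    try:
        i = chain.index(failed_provider)
    except ValueError:
        return chain[0] if chain else None
    return chain[i + 1] if i + 1 < len(chain) else None
```

On each `transcript.failed` event, call `pick_backup_provider` with the provider you just used and, if it returns a provider name, issue a new Create Async Transcript request with that provider. Remember the 10-successful-transcripts-per-recording limit when retrying.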
Additional transcription configurations
Improving diarization and speaker attribution to transcript utterances
For the best diarization with speaker names, you should use Perfect Diarization by setting diarization.use_separate_streams_when_available: true in the Create Async Transcript request.
For the best diarization when multiple participants are speaking from the same device, you can also use Hybrid Diarization which is a combination of Perfect Diarization and Machine Diarization.
To see all diarization configurations, see the diarization guide.
Perfect diarization transcription cost

Using `diarization.use_separate_streams_when_available: true` can increase transcription cost when participants speak concurrently or when background conversation is present, although this usually affects only a small portion of the total recording.
Language detection for async transcription
If you don’t know ahead of time which language the conversation will be in, you can set up automatic language detection. Automatically detecting languages is broken up into two types:
- Language Detection - Detecting the primary spoken language within a recording, without needing to explicitly set it
- Code switching - Alternating between two or more languages or language varieties within a single conversation or speech
Most of the third-party transcription providers that we integrate with support language detection.
The table below covers each of these, and their corresponding parameters in the Create Async Transcript provider configuration.
| Provider | Supported Languages | Language Detection | Code Switching |
|---|---|---|---|
| `recallai_async` | Docs | `language_code: "auto"` | Not Supported |
| `assembly_ai_async` | Docs | `language_detection: true` (docs) | `language_detection_options.code_switching: true` (docs) |
| `deepgram_async` | Docs | `model: "nova-2"` or `"nova-3"` and `language: "multi"` (docs) | Same as language detection (docs) |
Accessing raw data from the transcription provider
Raw transcription provider data is not exposed for Recall.ai transcription or when `diarization.use_separate_streams_when_available: true` is set.
If you want to extract a specific field from your transcription provider that is not exposed through the transcript generated by Recall, you can fetch the raw provider transcription response by:
- Fetching the transcript artifact via the Retrieve Transcript endpoint
- Querying the `data.provider_data_download_url` field in the response payload, which contains the raw response the provider returned for this recording
The response of the provider_data_download_url varies by provider.
Convert the transcript to a sentence-by-sentence transcript
You can convert the transcript JSON to a more human-readable transcript with the following function.
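The docs don't prescribe a specific implementation here, so the following is a sketch of such a function, based on the download-url schema shown earlier: it renders the word-level JSON as one `[m:ss] Speaker: text` line per participant turn.

```python
def to_readable_transcript(transcript):
    """Render the transcript download-url JSON (a list of participant
    turns, each with word-level timestamps) as a human-readable string,
    one line per turn. Formatting choices here are illustrative."""
    lines = []
    for turn in transcript:
        words = turn.get("words", [])
        if not words:
            continue
        # Relative timestamps are seconds from the start of the recording.
        start = words[0]["start_timestamp"]["relative"]
        minutes, seconds = divmod(int(start), 60)
        speaker = (turn.get("participant") or {}).get("name") or "Unknown"
        text = " ".join(w["text"] for w in words)
        lines.append(f"[{minutes}:{seconds:02d}] {speaker}: {text}")
    return "\n".join(lines)
```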
FAQs
How long does it take to transcribe a recording using async transcription?
The time to generate a transcript varies with recording length and the transcription provider chosen. Refer to the provider's documentation for specifics.
As a benchmark, a 1-hour recording takes about 1 minute to transcribe when using Recall.ai Transcription (mixed audio, regardless of language specified).
How do I get a transcript summary?
To get a transcript summary, you will need to pass the transcript to a third-party LLM to analyze. Recall doesn't support transcript summaries out of the box.