Send AI Agents to Meetings

Use Recall.ai's Output Media API to create meeting bots that can listen and respond in real time

Recall lets you take full control over your bot's camera and microphone through the Output Media API. This API allows you to create interactive AI agents that can listen to a meeting and react in real time. Developers building on Recall use this functionality to power AI sales agents, coaches, recruiters, interviewers, and more.

For example implementations and use cases, see our demo repos, such as the Real-Time Translator and Voice Agent samples referenced below.

Quickstart


Streaming a webpage's audio/video to a meeting

📘

Why Use a Webpage?

A webpage gives developers an easy and familiar interface to create real-time audio and visual responses: you can update charts, render an avatar, or play synthesized speech, all using standard HTML/CSS/JavaScript.
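
For instance, a minimal page for the bot to stream could be as simple as the following sketch (the greeting and the use of the browser's built-in speechSynthesis API are illustrative assumptions, not requirements):

<!doctype html>
<html>
  <body style="margin: 0; background: #111; color: #fff; font-family: sans-serif">
    <h1>Hello from the bot!</h1>
    <script>
      // Illustrative only: any audio this page plays is what the meeting hears.
      speechSynthesis.speak(new SpeechSynthesisUtterance('Hello, everyone!'));
    </script>
  </body>
</html>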

You can use the output_media configuration in the Create Bot endpoint to stream the audio and video contents of a webpage to your meeting. The bot can display the webpage either as a screen-share, or directly through its own camera.

output_media accepts a camera or screenshare object, each of which takes the following parameters:

  • kind: The type of media to stream (currently only webpage is supported)
  • config: The webpage configuration (currently only supports url)

Let's look at an example call to the Create Bot endpoint:

// POST /api/v1/bot/
{
  "meeting_url": "https://us02web.zoom.us/j/1234567890",
  "bot_name": "Recall.ai Notetaker",
  "output_media": {
    "camera": {
      "kind": "webpage",
      "config": {
        "url": "https://www.recall.ai"
      }
    }
  }
}

The example above tells Recall to create a bot that will continuously stream the contents of the recall.ai homepage to the provided meeting URL.
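
If you're calling the API directly, the same request as a curl command looks like this (assuming the us-east-1 region used in the examples below):

curl --request POST \
     --url https://us-east-1.recall.ai/api/v1/bot/ \
     --header 'Authorization: ${RECALL_API_KEY}' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data-raw '
{
  "meeting_url": "https://us02web.zoom.us/j/1234567890",
  "bot_name": "Recall.ai Notetaker",
  "output_media": {
    "camera": {
      "kind": "webpage",
      "config": { "url": "https://www.recall.ai" }
    }
  }
}
'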

Stopping Output Media

You can stop the bot's media output at any point while the bot is streaming media by calling the Stop Output Media endpoint.

curl --request DELETE \
     --url https://us-east-1.recall.ai/api/v1/bot/{bot_id}/output_media/ \
     --header 'Authorization: ${RECALL_API_KEY}' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data-raw '{ "camera": true }'

Starting Output Media

You can also choose to start streaming a webpage by calling the Output Media endpoint at any time while the bot is in a call.

The parameters for the request are the same as the output_media configuration.

curl --request POST \
     --url https://us-east-1.recall.ai/api/v1/bot/{bot_id}/output_media/ \
     --header 'Authorization: ${RECALL_API_KEY}' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data-raw '
{
    "camera": {
      "kind": "webpage", 
      "config": {
        "url": "https://recall.ai"
      }
    }
}
'

Making the bot interactive


So far you've seen how to stream any webpage to a meeting. But to build a real AI agent, you need a webpage that can listen to the information coming from the meeting and respond dynamically. This means creating a webpage that can receive live meeting data and update in real time.

Setting up your development environment

For development, you'll want to be able to iterate quickly on your webpage and see changes reflected immediately in the bot. The easiest way to do this is:

  1. Run a local development server with your webpage. You can either create your own or clone one of our sample repos (Real-Time Translator, Voice Agent)
  2. Expose it publicly using a tunneling service like Ngrok (see the example command after this list)
  3. Point your bot to the public tunnel URL
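
With ngrok, for example, forwarding a local server on port 3000 to a static domain looks like this (the port and domain are placeholders for your own setup; a static domain keeps your bot's URL stable across restarts):

ngrok http 3000 --domain=my-static-domain.ngrok-free.app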

This lets you edit your code locally and instantly see the results in your meeting bot. Once everything is configured, here's what your Create Bot request should look like:

// POST /api/v1/bot/
{
  "meeting_url": "https://us02web.zoom.us/j/1234567890",
  "bot_name": "Recall.ai Notetaker",
  "output_media": {
    "camera": {
      "kind": "webpage",
      "config": {
        // traffic to this URL will be forwarded to your localhost
        "url": "https://my-static-domain.ngrok-free.app"
      }
    }
  }
}

Listening to meeting audio from the webpage

When your bot starts streaming your webpage, the webpage automatically gets access to the live audio from the meeting.

📘

No User Interaction Required

Normally, accessing microphone audio requires user permission and interaction (like a button click). However, the bot automatically grants microphone permissions, so your webpage will be able to access audio immediately without user prompts or click events.

You can access a MediaStream object and its audio track from the webpage running inside the bot. The following example shows how to read samples of the meeting audio as AudioData objects:


// The bot grants microphone permissions automatically, so this resolves immediately.
const mediaStream = await navigator.mediaDevices.getUserMedia({ audio: true });
const meetingAudioTrack = mediaStream.getAudioTracks()[0];

// Expose the audio track as a readable stream of AudioData frames.
const trackProcessor = new MediaStreamTrackProcessor({ track: meetingAudioTrack });
const trackReader = trackProcessor.readable.getReader();

while (true) {
  const { value: audioData, done } = await trackReader.read();
  if (done) break;

  // Do something with the AudioData frame, then release its memory.
  audioData.close();
}

From here, you can process the audio however you need. For example, pipe it to OpenAI's Realtime API for speech-to-speech processing, then output the AI's audio response back to the meeting participants through your webpage's audio elements. This creates a fully interactive voice agent that can have natural conversations with meeting attendees.
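
As a minimal sketch, here's one way to pull raw samples out of each AudioData frame and forward them over a websocket to a relay server you host (the relay URL is a hypothetical placeholder, and a real integration would also convert the sample rate and format to whatever your speech API expects):

// Hypothetical relay that proxies audio to your speech-to-speech backend.
const relay = new WebSocket('wss://your-relay.example.com/audio');

async function forwardAudio(trackReader) {
  while (true) {
    const { value: audioData, done } = await trackReader.read();
    if (done) break;

    // Copy the first channel's raw samples out of the AudioData frame.
    const buffer = new ArrayBuffer(audioData.allocationSize({ planeIndex: 0 }));
    audioData.copyTo(buffer, { planeIndex: 0 });

    if (relay.readyState === WebSocket.OPEN) {
      relay.send(buffer);
    }
    audioData.close(); // Release the frame's memory.
  }
}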

Accessing real-time meeting data from the webpage

The bot exposes a websocket endpoint to retrieve real-time meeting data while the webpage is streaming audio and video to the call. Right now, only real-time transcripts are supported. You can connect to the real-time API from your webpage with the following example:


const ws = new WebSocket('wss://meeting-data.bot.recall.ai/api/v1/transcript');

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  const text = message.transcript?.words?.map((w) => w.text).join(' ');

  // ... your logic to handle real-time transcripts
};

ws.onopen = () => {
  console.log('Connected to WebSocket server');
};

ws.onclose = () => {
  console.log('Disconnected from WebSocket server');
};

The websocket messages coming from the /api/v1/transcript endpoint have the same shape as the data object in Real-time transcription.
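
Because the bot streams whatever the page displays, reacting to this data is just a DOM update. For example, to render live captions through the bot's camera (the caption element is a hypothetical part of your page's HTML):

const captionEl = document.getElementById('caption'); // hypothetical element in your page

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  const text = message.transcript?.words?.map((w) => w.text).join(' ');

  // Show the latest utterance; meeting participants see it on the bot's feed.
  if (text) captionEl.textContent = text;
};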

Debugging Your Webpage


Accessing Chrome Devtools

During development, you will need to debug issues with your Output Media bot's webpage. Recall provides an easy way to connect to the webpage's Chrome Devtools while the bot is running. Follow the steps below to access your bot's Devtools.


  1. Send an output media bot to your meeting, and wait for its output media stream
  2. Log in to your Recall.ai dashboard
  3. Select Bot Explorer in the sidebar
  4. In the Bot Explorer app, search for your bot by ID
  5. Open the "Debug Data" tab for your bot. Then under CPU Metrics, click the "Open Remote Devtools" button. A devtools inspector connected to your live bot will open in a new tab.

This opens a full Chrome inspector connected to your bot's browser. You can inspect elements, check console logs, monitor network requests, and debug just like you would locally.

📘

Bot must be alive

Since Output Media Devtools are exposed by the bot itself and CPU metrics are collected in real time, both are only available while the bot is actively in a call.

Profiling CPU usage

You can also view an individual bot's CPU usage in the "Bot Details" section. Use this graph to uncover performance bottlenecks that might be causing your webpage to lag or perform poorly.

Addressing audio and video issues: bot variants

While we expose CPU metrics to help you identify and address any performance issues on your end, sometimes this is out of your control and you just need more CPU power or hardware acceleration. Below is a breakdown of the compute resources available to the instance running your webpage:

| Variant | CPU | Memory | WebGL |
| --- | --- | --- | --- |
| web (default) | 250 millicores | 750MB | ❌ Unsupported |
| web_4_core | 2250 millicores | 5250MB | ❌ Unsupported |
| web_gpu | 6000 millicores | 13250MB | ✅ Supported |
| native | ❌ Unsupported | ❌ Unsupported | ❌ Unsupported |

To use these configurations, specify the variant in your Create Bot request. For example, this is how you can specify that your bot should use the web_4_core variant on all platforms:

{
  ...
  "variant": {
    "zoom": "web_4_core",
    "google_meet": "web_4_core",
    "microsoft_teams": "web_4_core"
  }
}

These bots run on larger machines, which can help address any CPU bottlenecks hindering the audio & video quality of your Output Media feature.

❗️

Important

Due to the inherent cost of running larger machines, the prices for some variants are higher:

| Variant | Pay-as-you-go plan | Monthly plans |
| --- | --- | --- |
| web_4_core | $1.10/hour | standard bot usage rate + $0.10/hour |
| web_gpu | $2.00/hour | standard bot usage rate + $1.00/hour |

Platform Support


The output_media bot configuration is supported on the following platforms:

  • Zoom*
  • Google Meet
  • Microsoft Teams
  • Cisco Webex
  • Slack Huddles

*Zoom native bots not supported

FAQ


Why is the bot's audio / video output choppy?

If the audio or video output from your bot is choppy, it's likely that your bot's instance doesn't have enough CPU power to handle your use case. You can test this by upgrading the bot to a larger, more powerful instance. Typically the web_4_core instance is sufficient for most Output Media use cases. To switch to 4 core bots, include this in your Create Bot request:

{
  ...
  "variant": {
    "zoom": "web_4_core",
    "google_meet": "web_4_core",
    "microsoft_teams": "web_4_core"
  }
}

What are the browser dimensions of the webpage?

1280x720px
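
If your layout depends on exact dimensions, you can size your page to that fixed viewport (a minimal sketch):

/* Match the bot's fixed 1280x720 browser viewport and avoid scrollbars. */
html, body {
  width: 1280px;
  height: 720px;
  margin: 0;
  overflow: hidden;
}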

Why doesn't the bot's video/screenshare show in the recording?

It currently isn't possible to include the bot's video or screen-share output in the final recording. That said, you can still include the bot's audio by setting recording_config.include_bot_in_recording.audio = true in your Create Bot request.
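
For example, in your Create Bot request:

{
  ...
  "recording_config": {
    "include_bot_in_recording": {
      "audio": true
    }
  }
}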

Can I use the Automatic Audio Output or Automatic Video Output parameters while using Output Media?

No. Output Media cannot be used with the automatic_video_output or automatic_audio_output parameters; these features are mutually exclusive.

Similarly, the Output Video and Output Audio endpoints cannot be used while your bot is actively using the Output Media feature.