Send AI Agents to Meetings
Use Recall.ai's Output Media API to create meeting bots that can listen and respond in real time
Recall lets you take full control over your bot's camera and microphone through the Output Media API. This API allows you to create interactive AI agents that can listen to a meeting and react in real time. Developers building on Recall use this functionality to power AI sales agents, coaches, recruiters, interviewers, and more.
For example implementations and use cases, see our demo repos (e.g., the Real-Time Translator and Voice Agent samples referenced below).
Quickstart
Streaming a webpage's audio/video to a meeting
Why Use a Webpage?
A webpage gives developers an easy and familiar interface to create real-time audio and visual responses: you can update charts, render an avatar, or play synthesized speech, all using standard HTML/CSS/JavaScript.
You can use the `output_media` configuration in the Create Bot endpoint to stream the audio and video contents of a webpage to your meeting. The bot can display the webpage either as a screen-share or directly through its own camera.

`output_media` takes the following parameters:

- `kind`: The type of media to stream (currently only `webpage` is supported)
- `config`: The webpage configuration (currently only supports `url`)
Let's look at an example call to the Create Bot endpoint:
```json
// POST /api/v1/bot/
{
  "meeting_url": "https://us02web.zoom.us/j/1234567890",
  "bot_name": "Recall.ai Notetaker",
  "output_media": {
    "camera": {
      "kind": "webpage",
      "config": {
        "url": "https://www.recall.ai"
      }
    }
  }
}
```
The example above tells Recall to create a bot that will continuously stream the contents of the recall.ai homepage to the provided meeting URL.
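If you're calling the endpoint from JavaScript, the same request might look like the following sketch. The `us-east-1` base URL is taken from the curl examples later on this page; reading the key from `process.env.RECALL_API_KEY` is an assumption about your setup:

```javascript
// Minimal sketch: create a bot that streams a webpage through its camera.
// Assumes Node 18+ (built-in fetch) and an API key in RECALL_API_KEY.
const response = await fetch('https://us-east-1.recall.ai/api/v1/bot/', {
  method: 'POST',
  headers: {
    authorization: process.env.RECALL_API_KEY,
    'content-type': 'application/json',
  },
  body: JSON.stringify({
    meeting_url: 'https://us02web.zoom.us/j/1234567890',
    bot_name: 'Recall.ai Notetaker',
    output_media: {
      camera: { kind: 'webpage', config: { url: 'https://www.recall.ai' } },
    },
  }),
});

const bot = await response.json();
console.log(bot.id); // keep the bot ID for the Output Media endpoints below
```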
Stopping Output Media
You can stop the bot's media output at any point while the bot is streaming media by calling the Stop Output Media endpoint.
```bash
curl --request DELETE \
  --url https://us-east-1.recall.ai/api/v1/bot/{bot_id}/output_media/ \
  --header "Authorization: ${RECALL_API_KEY}" \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --data-raw '{ "camera": true }'
```
Starting Output Media
You can also start streaming a webpage at any time while the bot is in a call by calling the Output Media endpoint. The request takes the same parameters as the `output_media` configuration.
```bash
curl --request POST \
  --url https://us-east-1.recall.ai/api/v1/bot/{bot_id}/output_media/ \
  --header "Authorization: ${RECALL_API_KEY}" \
  --header 'accept: application/json' \
  --header 'content-type: application/json' \
  --data-raw '
{
  "camera": {
    "kind": "webpage",
    "config": {
      "url": "https://recall.ai"
    }
  }
}
'
```
Making the bot interactive
So far you've seen how to stream any webpage to a meeting. But to build a real AI agent, you need a webpage that can listen to the information coming from the meeting and respond dynamically. This means creating a webpage that can receive live meeting data and update in real time.
Setting up your development environment
For development, you'll want to be able to iterate quickly on your webpage and see changes reflected immediately in the bot. The easiest way to do this is:
- Run a local development server with your webpage (a minimal server sketch follows this list). You can either create your own or clone one of our sample repos (Real-Time Translator, Voice Agent)
- Expose it publicly using a tunneling service like Ngrok
- Point your bot to the public tunnel URL
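For the first step, any static file server will do. Here's a minimal Node sketch for local iteration only; the `serve.js`/`index.html` file names and port 3000 are assumptions:

```javascript
// serve.js: a bare-bones static server for local development only
// (no path sanitization). Run `node serve.js`, then tunnel port 3000.
const http = require('http');
const fs = require('fs');
const path = require('path');

http.createServer((req, res) => {
  // Serve index.html for the root path; otherwise serve the requested file.
  const file = req.url === '/' ? 'index.html' : req.url.slice(1);
  fs.readFile(path.join(__dirname, file), (err, data) => {
    if (err) {
      res.writeHead(404);
      res.end('Not found');
      return;
    }
    res.writeHead(200);
    res.end(data);
  });
}).listen(3000);
```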
This lets you edit your code locally and instantly see the results in your meeting bot. Once everything is configured, here's what your Create Bot request should look like:
```json
// POST /api/v1/bot/
{
  "meeting_url": "https://us02web.zoom.us/j/1234567890",
  "bot_name": "Recall.ai Notetaker",
  "output_media": {
    "camera": {
      "kind": "webpage",
      "config": {
        // traffic to this URL will be forwarded to your localhost
        "url": "https://my-static-domain.ngrok-free.app"
      }
    }
  }
}
```
Listening to meeting audio from the webpage
When your bot starts streaming your webpage, the webpage automatically gets access to the live audio from the meeting.
No User Interaction Required
Normally, accessing microphone audio requires user permission and interaction (like a button click). However, the bot automatically grants microphone permissions, so your webpage will be able to access audio immediately without user prompts or click events.
You can access a `MediaStream` object and its audio track from the webpage running inside the bot. The following example shows how to get samples of the meeting audio as `AudioData` objects:
```javascript
// The bot auto-grants microphone permission, so this resolves immediately.
const mediaStream = await navigator.mediaDevices.getUserMedia({ audio: true });
const meetingAudioTrack = mediaStream.getAudioTracks()[0];

// Expose the track as a readable stream of AudioData chunks.
const trackProcessor = new MediaStreamTrackProcessor({ track: meetingAudioTrack });
const trackReader = trackProcessor.readable.getReader();

while (true) {
  const { value: audioData, done } = await trackReader.read();
  if (done) break;
  // Do something with the AudioData, then release it.
  audioData.close();
}
```
From here, you can process the audio however you need. For example, pipe it to OpenAI's Realtime API for speech-to-speech processing, then output the AI's audio response back to the meeting participants through your webpage's audio elements. This creates a fully interactive voice agent that can have natural conversations with meeting attendees.
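As a rough illustration, here's one way to turn each `AudioData` frame into 16-bit PCM for sending to a speech provider. The relay URL and backend are hypothetical; the `AudioData` handling uses the standard WebCodecs API:

```javascript
// Hypothetical relay to your own backend, which forwards audio to your
// AI provider (e.g. a speech-to-speech API).
const relay = new WebSocket('wss://your-backend.example.com/audio');

function audioDataToPcm16(audioData) {
  // Copy the first channel out as 32-bit floats...
  const samples = new Float32Array(audioData.numberOfFrames);
  audioData.copyTo(samples, { planeIndex: 0, format: 'f32-planar' });
  // ...then quantize to signed 16-bit PCM.
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    pcm[i] = Math.max(-32768, Math.min(32767, Math.round(samples[i] * 32767)));
  }
  return pcm;
}

// Inside the read loop above, before closing each frame:
//   relay.send(audioDataToPcm16(audioData).buffer);
```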
Accessing real-time meeting data from the webpage
The bot exposes a websocket endpoint for retrieving real-time meeting data while the webpage is streaming audio and video to the call. Right now, only real-time transcripts are supported. You can connect to the real-time API from your webpage as shown in the following example:
```javascript
const ws = new WebSocket('wss://meeting-data.bot.recall.ai/api/v1/transcript');

ws.onmessage = (event) => {
  const text = JSON.parse(event.data).transcript?.words?.map((w) => w.text).join(' ');
  // ... your logic to handle real-time transcripts
};

ws.onopen = () => {
  console.log('Connected to WebSocket server');
};

ws.onclose = () => {
  console.log('Disconnected from WebSocket server');
};
```
The websocket messages coming from the `/api/v1/transcript` endpoint have the same shape as the `data` object in Real-time transcription.
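For instance, a captioning or translator page might render each transcript update into the DOM so participants see it through the bot's camera. A minimal sketch, reusing the `ws` connection above (the `captions` element is an assumption about your page's markup):

```javascript
const captions = document.getElementById('captions'); // assumed element in your HTML

// Replaces the onmessage handler from the example above.
ws.onmessage = (event) => {
  const text = JSON.parse(event.data).transcript?.words?.map((w) => w.text).join(' ');
  if (text) {
    const line = document.createElement('p');
    line.textContent = text;
    captions.appendChild(line); // whatever renders here is what the meeting sees
  }
};
```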
Debugging Your Webpage
Accessing Chrome Devtools
During development, you will need to debug issues with your Output Media bot's webpage. Recall provides an easy way to connect to the webpage's Chrome Devtools while the bot is running. Follow the steps below to access your bot's Devtools:
- Send an output media bot to your meeting, and wait for its output media stream
- Log in to your Recall.ai dashboard
- Select Bot Explorer in the sidebar
- In the Bot Explorer app, search for your bot by ID
- Open the "Debug Data" tab for your bot. Then under CPU Metrics, click the "Open Remote Devtools" button. A devtools inspector connected to your live bot will open in a new tab.

This opens a full Chrome inspector connected to your bot's browser. You can inspect elements, check console logs, monitor network requests, and debug just like you would locally.
Bot must be alive
Since Output Media Devtools are exposed by the bot itself and CPU metrics are reported in real time, they are only available while the bot is actively in a call.
Profiling CPU usage
You can also view the CPU usage for an individual bot in the "Bot Details" section. Use this graph to uncover performance bottlenecks that might be causing your webpage to lag or perform poorly.

Addressing audio and video issues: bot variants
While we expose CPU metrics to help you identify and address performance issues on your end, sometimes the bottleneck is out of your control and you simply need more CPU power or hardware acceleration. Below is a breakdown of the compute resources available to the instance running your webpage:
| Variant | CPU | Memory | WebGL |
|---|---|---|---|
| `web` (default) | 250 millicores | 750MB | ❌ Unsupported |
| `web_4_core` | 2250 millicores | 5250MB | ❌ Unsupported |
| `web_gpu` | 6000 millicores | 13250MB | ✅ Supported |
| `native` | ❌ Unsupported | ❌ Unsupported | ❌ Unsupported |
To use these configurations, specify the `variant` in your Create Bot request. For example, this is how you can specify that your bot should use the 4-core variant on all platforms:
```json
{
  ...
  "variant": {
    "zoom": "web_4_core",
    "google_meet": "web_4_core",
    "microsoft_teams": "web_4_core"
  }
}
```
These bots run on larger machines, which can help address any CPU bottlenecks hindering the audio & video quality of your Output Media feature.
Important
Due to the inherent cost of running larger machines, the prices for some variants are higher:
| Variant | Pay-as-you-go plan | Monthly plans |
|---|---|---|
| `web_4_core` | $1.10/hour | standard bot usage rate + $0.10/hour |
| `web_gpu` | $2.00/hour | standard bot usage rate + $1.00/hour |
Platform Support
| Platform | Bot Configuration (`output_media`) |
|---|---|
| Zoom* | ✅ |
| Google Meet | ✅ |
| Microsoft Teams | ✅ |
| Cisco Webex | ✅ |
| Slack Huddles | ❌ |

*Zoom native bots are not supported.
FAQ
Why is the bot's audio / video output choppy?
If the audio or video output from your bot is choppy, your bot's instance likely doesn't have enough CPU power to handle your use case. You can test this by upgrading the bot to a larger, more powerful instance. The `web_4_core` instance is typically sufficient for most Output Media use cases. To switch to 4-core bots, include this in your Create Bot request:
```json
{
  ...
  "variant": {
    "zoom": "web_4_core",
    "google_meet": "web_4_core",
    "microsoft_teams": "web_4_core"
  }
}
```
What are the browser dimensions of the webpage?
1280x720px
Why doesn't the bot's video/screenshare show in the recording?
It currently isn't possible to include the bot's video or screenshare in the final recording. That said, you can still include the bot's audio by setting `recording_config.include_bot_in_recording.audio = true`.
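In a Create Bot request, that setting would look something like this (a sketch based on the field path above):

```json
{
  ...
  "recording_config": {
    "include_bot_in_recording": {
      "audio": true
    }
  }
}
```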
Can I use the Automatic Audio Output or Automatic Video Output parameters while using Output Media?
No. Output Media cannot be used with the `automatic_video_output` or `automatic_audio_output` parameters; these features are mutually exclusive.
Similarly, the Output Video and Output Audio endpoints cannot be used while your bot is actively using the Output Media feature.