Receive Real-Time Video: Websockets
Receive real-time video streams from a bot via websockets.
Video websockets are optimized for real-time AI video analysis, providing 720p PNG image frames at 2 fps.
If you want real-time video for human viewing instead, use RTMP by specifying create_bot.real_time_media.rtmp_destination_url, which delivers standard 30 fps video.
Quickstart
Setup
To receive real-time video from a bot, include your websocket URL in the Create Bot request by specifying real_time_media.websocket_video_destination_url:
curl --request POST \
     --url https://us-east-1.recall.ai/api/v1/bot/ \
     --header "Authorization: $RECALL_API_KEY" \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data '
{
  "meeting_url": "https://meet.google.com/kbw-rphc-zsc",
  "real_time_media": {
    "websocket_video_destination_url": "wss://expert-puma-further.ngrok-free.app"
  }
}
'
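For callers who prefer Python over curl, the same request can be sketched with the standard library. The endpoint, headers, and field names are copied from the curl example; the helper name `build_create_bot_request` is hypothetical, and the API key is passed through unchanged as the `Authorization` value, as the curl example does.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
CREATE_BOT_URL = "https://us-east-1.recall.ai/api/v1/bot/"


def build_create_bot_request(meeting_url, websocket_url, api_key):
    """Build (but do not send) a Create Bot request. Hypothetical helper."""
    body = {
        "meeting_url": meeting_url,
        "real_time_media": {
            "websocket_video_destination_url": websocket_url,
        },
    }
    return urllib.request.Request(
        CREATE_BOT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumption: the key is sent as-is, mirroring the curl example.
            "Authorization": api_key,
            "accept": "application/json",
            "content-type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; building it separately makes the payload easy to inspect first.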
ws:// vs wss://
While WebSockets can connect over either an unencrypted connection (ws) or an SSL/TLS-encrypted one (wss), we highly recommend using wss, since the encrypted connection is much more secure.
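One way to enforce this recommendation on your side is to check the URL scheme before sending it in the Create Bot request. This is a minimal sketch; the function name `is_encrypted_websocket_url` is hypothetical and not part of any API.

```python
from urllib.parse import urlsplit


def is_encrypted_websocket_url(url):
    """Return True only for wss:// URLs that include a host. Hypothetical helper."""
    parts = urlsplit(url)
    return parts.scheme == "wss" and bool(parts.hostname)
```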
Message Format
Each video stream will connect to your websocket server as its own connection.
The first message on each websocket connection is a JSON object containing the bot ID:
{"protocol_version": 1, "bot_id": "<BOT_ID>"}
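A server will usually want to read the bot ID out of this first message before handling frames. The sketch below parses it with the standard library. The function name `parse_handshake` is hypothetical, and rejecting any `protocol_version` other than 1 is an assumption based on the single example shown above.

```python
import json

# Assumption: 1 is the only version shown in the example message.
SUPPORTED_PROTOCOL_VERSION = 1


def parse_handshake(message):
    """Return the bot ID from the first (text) message. Hypothetical helper."""
    if not isinstance(message, str):
        raise ValueError("expected a text message as the first message")
    payload = json.loads(message)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    if payload.get("protocol_version") != SUPPORTED_PROTOCOL_VERSION:
        raise ValueError("unsupported protocol_version")
    bot_id = payload.get("bot_id")
    if not isinstance(bot_id, str) or not bot_id:
        raise ValueError("missing bot_id")
    return bot_id
```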
All subsequent websocket messages are binary, laid out as follows:
- The first 32 bits are a little-endian unsigned integer representing the "participant_id". This participant ID is the same as the ID on the corresponding participant in the Bot's meeting_participants list.
- The second 32 bits are a little-endian unsigned integer representing the millisecond timestamp of this frame. The timestamp is relative to the start of the video (not a unix timestamp).
- The remaining data in the websocket packet is the PNG encoded frame. See below for dimensions.
The following is sample code to decode these messages:
import asyncio
import os

import websockets


async def echo(websocket):
    async for message in websocket:
        if isinstance(message, str):
            # Text message: the initial JSON message containing the bot ID.
            print(message)
        else:
            # Binary message: 4-byte participant ID, 4-byte timestamp (ms), then PNG data.
            stream_id = int.from_bytes(message[0:4], byteorder='little')
            timestamp = int.from_bytes(message[4:8], byteorder='little')
            with open(f'output/{stream_id}-{timestamp}.png', 'wb') as f:
                f.write(message[8:])
            print("wrote message")


async def main():
    os.makedirs('output', exist_ok=True)  # the handler writes frames here
    async with websockets.serve(echo, "0.0.0.0", 8765):
        await asyncio.Future()  # run forever


asyncio.run(main())
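For analysis pipelines it can help to separate decoding from file I/O. The following is a sketch of the same binary layout decoded with `struct` into a small record. The names `VideoFrame` and `parse_frame` are hypothetical, and the check against the PNG signature is an added sanity check, not something the protocol requires.

```python
import struct
from typing import NamedTuple

# Every PNG file starts with these 8 bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Two little-endian unsigned 32-bit integers: participant ID, then timestamp.
HEADER = struct.Struct("<II")


class VideoFrame(NamedTuple):
    participant_id: int
    timestamp_ms: int  # relative to the start of the video
    png: bytes


def parse_frame(message):
    """Split one binary websocket message into its fields. Hypothetical helper."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than the 8-byte header")
    participant_id, timestamp_ms = HEADER.unpack_from(message)
    png = bytes(message[HEADER.size:])
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not start with the PNG signature")
    return VideoFrame(participant_id, timestamp_ms, png)
```

Inside a websocket handler, `parse_frame(message)` would be called on each non-text message.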
Image Frame Dimensions
The dimensions for the PNG images are the same for all meeting platforms.
Video stream | Image Dimensions |
---|---|
Participant - Default | 1280x720 |
Participant - While screensharing | 256x144 |
Screenshare | 1024x576 |
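Because each stream type has a distinct size, the dimensions in a frame's PNG header can be used to tell the streams apart without decoding the image. The sketch below reads the width and height from the IHDR chunk, which the PNG format places directly after the 8-byte signature. The function names and the labels in `STREAM_KINDS` are hypothetical; only the dimensions come from the table above.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Dimensions from the table above; the labels are illustrative.
STREAM_KINDS = {
    (1280, 720): "participant (default)",
    (256, 144): "participant (while screensharing)",
    (1024, 576): "screenshare",
}


def png_dimensions(png):
    """Return (width, height) read from the PNG's IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(png) < 24 or png[:8] != PNG_SIGNATURE or png[12:16] != b"IHDR":
        raise ValueError("not a PNG with a leading IHDR chunk")
    return struct.unpack(">II", png[16:24])


def classify_frame(png):
    """Map a frame to a stream kind by its dimensions, or 'unknown'."""
    return STREAM_KINDS.get(png_dimensions(png), "unknown")
```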
Known Issues
- If your bot is configured with recording_mode: speaker_view, you will always get stream_id=0, and you will receive only a single stream of video, corresponding to the active speaker. You must set recording_mode to gallery_view or gallery_view_v2 for this feature to work properly.
FAQ
What is the retry behavior?
If we are unable to connect to your endpoint, or the connection is dropped, we will retry the connection every 3 seconds for as long as the bot is alive.