Best practice in integrating Cloud Recording

To improve application robustness, Agora recommends that you do the following when integrating Cloud Recording RESTful APIs:

Use dual domain names

If a Cloud Recording RESTful API request to api.agora.io fails, retry once with the same domain name. If it fails again, replace the domain name with api.sd-rtn.com and retry. Best practice is to first try the regional domain name closest to your server. See the domain name table for the list of regional domain names.

Agora recommends that you use a backoff strategy, for example, retry after 1, 3, and 6 seconds successively, to avoid exceeding the Queries Per Second (QPS) limits.
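
The retry order described above can be sketched as follows. This is a minimal illustration, not an official client: `send` is a hypothetical callable that performs one request against the given domain and returns the parsed response, or None on failure. Treating None as "failed" and using the secondary domain for every retry after the first are assumptions.

```python
import time
from typing import Callable, Optional, Sequence

PRIMARY = "api.agora.io"
SECONDARY = "api.sd-rtn.com"


def send_with_failover(
    send: Callable[[str], Optional[dict]],
    delays: Sequence[float] = (1, 3, 6),
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Try the primary domain, retry it once, then switch to the secondary.

    `send(domain)` is a hypothetical callable: it performs one request and
    returns the parsed response, or None if the request failed.
    """
    # Attempt 0 and the first retry use the primary domain;
    # every later retry uses the secondary domain.
    domains = [PRIMARY] + [
        PRIMARY if i == 0 else SECONDARY for i in range(len(delays))
    ]
    for attempt, domain in enumerate(domains):
        if attempt > 0:
            sleep(delays[attempt - 1])  # back off before each retry
        result = send(domain)
        if result is not None:
            return result
    return None
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.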

Get service status

Use the Cloud Recording RESTful APIs to get the status of the recording service.

Agora recommends that core apps do not rely on Notifications (NCS). If your apps already rely heavily on NCS, Agora recommends that you contact support@agora.io to enable the redundant message notification function, which doubles the received notifications and reduces the probability of message loss. After enabling the redundant message notification function, you need to deduplicate messages based on sid. Redundant message notification still cannot guarantee a 100% arrival rate.

The initial QPS limit is 10 per App ID when you register. You can estimate the QPS quota your project needs from your Peak Concurrent Worker (PCW) quota and your query frequency. The initial PCW limit is 50 per App ID when you register. If the RESTful API returns error code 429 (QPS limit exceeded) or error code 406 (PCW quota exceeded), retry, or contact support@agora.io to increase your QPS or PCW quota.
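
As a rough illustration of that estimate, the sketch below divides the number of concurrent recordings by the query interval. The formula is an assumption, not an official one: it presumes queries are spread evenly over time and ignores other API calls and bursts.

```python
def estimated_query_qps(concurrent_recordings: int, query_interval_s: float) -> float:
    """Average query requests per second if each task is polled once per interval.

    Assumption: polls are evenly spread and no other API calls are counted.
    """
    return concurrent_recordings / query_interval_s


# 50 concurrent recordings (the initial PCW limit), each queried every 120 seconds
print(round(estimated_query_qps(50, 120), 2))  # prints 0.42
```

Under these assumptions, polling at the initial PCW limit stays well below the initial QPS limit of 10. Leave headroom for start and retry traffic.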

Ensure the recording service starts successfully

Take the following steps to ensure that the recording service starts successfully:

  1. Ensure that the start request is successful, that is, you receive a sid (recording ID) from the response. If the request fails, take measures according to the HTTP status code:

    • If the returned HTTP status code is always 40x, check the parameter values in your request.
    • If the returned HTTP status code is 50x, you can retry several times with the same parameters until you receive a sid. Agora recommends that you use a backoff strategy, for example, retry after 3, 6, and 9 seconds successively, to avoid exceeding the QPS limits. If you retry three times and still do not get a sid, retry with a new user ID.
    • If you receive error code 65, retry with the same parameters. Agora recommends that you use a backoff strategy, for example, retry after 3 and 6 seconds successively.
  2. Five seconds after you receive a sid, use a backoff strategy to call query. Agora recommends that the interval between two query calls is shorter than maxIdleTime, which you set in the start call. If the query call succeeds and status in serverResponse is 4 or 5, the recording service has started successfully. Otherwise, treat the recording service as either not started or as having quit partway after starting.

  • Agora recommends that you prepare a backup user ID, which you can use when restarting the recording service, to avoid two identical user IDs kicking each other out of the channel. You can alternate between the backup user ID and the original one.
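
The start-up checks above can be sketched as two small helpers. This is only an illustration: the action labels, the `query_status` callable (which returns the status value from serverResponse, or None if the query failed), and the polling delays after the first 5 seconds are assumptions. The delays are kept below the default maxIdleTime of 30 seconds.

```python
import time
from typing import Callable, Optional, Sequence


def next_action_after_start(http_status: int, error_code: Optional[int] = None) -> str:
    """Map a start response to the follow-up suggested in the steps above."""
    if error_code == 65:
        return "retry_same_params"  # back off, for example 3 s then 6 s
    if http_status in (406, 429):
        return "retry_or_request_quota"  # PCW or QPS limit reached
    if 400 <= http_status < 500:
        return "check_params"
    if 500 <= http_status < 600:
        return "retry_same_params"  # back off, for example 3 s, 6 s, 9 s
    return "ok"


def wait_until_started(
    query_status: Callable[[], Optional[int]],
    delays: Sequence[float] = (5, 10, 15, 20, 25),
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll query with backoff; True once the reported status is 4 or 5."""
    for delay in delays:
        sleep(delay)  # the first wait is the 5 seconds after receiving the sid
        if query_status() in (4, 5):
            return True
    return False  # treat as not started, or quit after starting
```

If `wait_until_started` returns False, one option is to restart the recording with the backup user ID.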

Monitor service status during a recording

You can periodically call query to ensure that the recording service is in progress and in a normal state. Apart from query, you can use the NCS as a complementary method to monitor the service status. See Comparison Between the NCS and the query Method for detailed comparison between the two methods.

Periodically query service status

If the reliability of the status of a cloud recording is a high priority, Agora strongly recommends using the query method to periodically query the recording service status. The interval between two calls can be around two minutes. Take the corresponding measure based on the received HTTP status code:

  • If the returned HTTP status code is always 40x, check the parameter values in your request.
  • If the returned HTTP status code is 404, and the request parameters are confirmed to be correct, the recording has either not started successfully, or the recording quit after starting. Agora recommends that you use a backoff strategy, for example, retry after 5, 10, and 15 seconds successively.
  • If the returned HTTP status code is 50x, the query request failed, but it is not clear whether the recording has quit. The 50x error is rare. You can continue to call query with a backoff strategy, for example, waiting 5, 10, 15, and 30 seconds successively.
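
One way to apply these rules is sketched below. It is an illustration under assumptions: `query_http_status` is a hypothetical callable that performs one query call and returns its HTTP status code, 200 is taken to mean success, and the outcome labels are made up for this example.

```python
import time
from typing import Callable


def recheck(
    query_http_status: Callable[[], int],
    first_status: int,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Re-run query with backoff after a non-200 result and report the outcome."""
    if first_status == 404:
        delays = (5, 10, 15)
    elif 500 <= first_status < 600:
        delays = (5, 10, 15, 30)
    elif 400 <= first_status < 500:
        return "check_params"  # other 40x codes: fix the request first
    else:
        return "ok"
    status = first_status
    for delay in delays:
        sleep(delay)
        status = query_http_status()
        if status == 200:  # assumption: 200 means the query succeeded
            return "ok"
    # Still failing after all retries.
    return "not_running" if status == 404 else "unknown"
```

A result of "not_running" suggests the recording never started or has quit. "unknown" means the server errors persisted, so the recording state is still unclear.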

Redundant message notifications

After enabling the redundant message notification function, you need to deduplicate messages based on sid. For example, if you need to start recording again after a recording session times out and quits, the process is:

  1. Your server receives the notifications of event 31, 32, or 11, which means that the recording service quits normally.
  2. After receiving the notifications, your application calls acquire to restart the recording service.
  3. During this period, your server receives notifications of event 31, 32, or 11 again. If the sid in these notifications is identical to the sid in the earlier ones, you can ignore them as redundant notifications.
  4. Call query if you need to be certain that the recording service has started successfully.

When the recording service enables the high availability mechanism, some notifications may be sent twice. You can distinguish them by the user ID in the notification. If the user ID is the same as the one you use when you start the recording, the notification belongs to the original recording session; otherwise, the notification belongs to the recording session initiated by the high availability mechanism.
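
A minimal sketch of both checks follows. The notification layout used here (a top-level eventType and a payload object holding sid and uid) and the choice to deduplicate on the pair (sid, eventType) are assumptions for illustration. Adjust them to the notification body your server actually receives.

```python
from typing import Set, Tuple


class NotificationDeduplicator:
    """Remember (sid, eventType) pairs and flag repeats as redundant."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, int]] = set()

    def is_new(self, notification: dict) -> bool:
        # Field names are assumed, not taken from the text above.
        key = (notification["payload"]["sid"], notification["eventType"])
        if key in self._seen:
            return False  # redundant copy: same sid and same event
        self._seen.add(key)
        return True


def from_original_session(notification: dict, start_uid: str) -> bool:
    """True if the notification carries the user ID used in the start call."""
    return str(notification["payload"]["uid"]) == str(start_uid)
```

For example, the second copy of an event 31 notification with the same sid returns False from `is_new` and can be dropped before any restart logic runs.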

Obtain the M3U8 file name

You can obtain the M3U8 file name in two ways: by splicing it together according to the file naming rules, or by calling the query method. Agora recommends that you use splicing to obtain the M3U8 file name.

Obtain file name by splicing

In composite recording mode, the format of the M3U8 file name is <sid>_<cname>.m3u8. Therefore, you can predict the M3U8 file name by splicing. See Naming conventions for details.
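
For illustration, the splicing rule can be written as a one-line helper. The sid and channel name below are made-up sample values.

```python
def m3u8_file_name(sid: str, cname: str) -> str:
    """Build the composite-mode M3U8 file name: <sid>_<cname>.m3u8."""
    return f"{sid}_{cname}.m3u8"


# Hypothetical sample values
print(m3u8_file_name("sid123", "classroom"))  # prints sid123_classroom.m3u8
```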

Obtain file name via the query call

The M3U8 file name is generated after the first slice file is generated. Therefore, you should call query after the first slicing completes. See Slicing for details.

In composite recording mode, call query 15 seconds after the cloud recording starts; in individual recording mode, call query one minute after the cloud recording starts. If the first query call fails, you can try again after one minute.

Avoid frequent quits of the recording service

The default value of maxIdleTime in the start method is 30 seconds. If the host frequently goes online and offline, a short maxIdleTime value causes the recording service to join and exit the channel frequently. For scenarios that require the recording service to stay in the channel all the time, increase maxIdleTime so that the recording does not quit after a short idle period.

For example, if there is a fixed 5-minute break in each class, you can set maxIdleTime to 10 minutes to ensure uninterrupted recording of the entire class.
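
The class example might translate into the following fragment of a start request body. The nesting under clientRequest and recordingConfig is an assumption about the request layout, so check it against the start API reference. Only the maxIdleTime value comes from the example above.

```python
BREAK_SECONDS = 5 * 60      # the fixed 5-minute break
MAX_IDLE_SECONDS = 10 * 60  # maxIdleTime is expressed in seconds

# Fragment of a start request body (assumed layout); other fields omitted.
start_body_fragment = {
    "clientRequest": {
        "recordingConfig": {
            "maxIdleTime": MAX_IDLE_SECONDS,
        }
    }
}

# The idle timeout must outlast the longest expected gap with no users.
assert MAX_IDLE_SECONDS > BREAK_SECONDS
```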

Fault recovery

Network failures and potential risks may occur due to factors such as cloud and network software, infrastructure, and other elements outside of Agora's control. To enhance the user experience, Cloud Recording offers automatic high availability task migration for failure recovery. When a failure is detected, the recording task will be migrated within 90 seconds. During this time, the recording may be disrupted and recorded files may be lost.

To guarantee high availability of important scenes with a large audience, best practice is to:

  1. Monitor recording tasks with calls to the query method.

    If the call returns a 404 error, create a new recording task with a different UID.

  2. Use Notifications to handle notifications for specific events. After starting the recording, if you don't receive event 13 (High availability register success) within 10 seconds, create a new recording task with a different UID.

These fault recovery methods may result in multiple recording tasks. You are charged separately for each task. For more information, see Pricing.
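
The two recovery rules can be combined into a small decision helper, sketched below. The function names and parameters are hypothetical. The UID helper simply alternates between an original and a backup user ID.

```python
from typing import Optional


def needs_new_task(
    query_http_status: Optional[int],
    seconds_since_start: float,
    event_13_received: bool,
) -> bool:
    """Decide whether to create a new recording task with a different UID."""
    if query_http_status == 404:
        return True  # query reports that the task is not running
    if not event_13_received and seconds_since_start > 10:
        return True  # event 13 did not arrive within 10 seconds
    return False


def next_uid(current: str, original: str, backup: str) -> str:
    """Alternate between the original and the backup user ID."""
    return backup if current == original else original
```

Because each new task is charged separately, it may be worth recording why a restart was triggered.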

Integration requirements checklist

To ensure reliability of the cloud recording service, refer to the following checklist to confirm that your solution meets the integration requirements:

Serial | Importance | Item | Description
1 | required | Subscribe to a service | Make sure you have activated the cloud recording service.
2 | required | Request method |
  • Use the POST request method for the acquire and start calls; use GET to query the recording status.
  • Request URLs and request body content are case-sensitive.
3 | required | Get recording resources |
  • The uid you pass in must not duplicate any UID in the current channel.
  • For page recording, make sure an appid + cname + uid corresponds to a resource ID.
  • Make sure that a resource ID is only used for one cloud recording service.
  • Make sure to call the start method.
4 | required | Channel scene | Make sure that the channel scene (channelType) is consistent with the settings of the Video SDK.
5 | required | Recording parameters |
  • Make sure that the type, case, and value range of all the parameters passed in when starting the recording are correct, and that the required parameters are filled in; otherwise, error code 2 is returned.
  • Set the layout and video bit rate of the composite recording by referring to the composite layout document and the recording bit rate comparison table.
6 | required | Confirm that the recording service has started successfully |
  • Make sure that the start request is successful, that is, the sid (recording ID) is successfully obtained.
  • Call the query method 5 seconds after getting the sid, using the backoff strategy. If the returned status is still not 4 or 5 after 90 seconds, consider the recording as either not started or as having exited after a timeout.
7 | required | PCW & QPS limits |
  • Make sure that each App ID does not exceed 10 requests per second (QPS).
  • To increase the QPS and PCW limits, contact technical support.
8 | optional | NCS service activation | Activate the cloud recording callback service and subscribe to the following events as an auxiliary means of monitoring the recording service status:
  • 40 recorder_started: The recording service has started.
  • 11 session_exit: The recording service ended the task and exited.
  • 1 cloud_recording_error: An error occurred in the recording service.
  • 12 session_failover: The recording service enabled the high availability mechanism.
  • 31 uploaded: All recorded files have been uploaded to the specified third-party cloud storage.
9 | optional | Use dual domain names | If the request fails with the primary domain name api.agora.io, retry with the same domain name. If it fails again, switch to the secondary domain name api.sd-rtn.com and send the request again.
10 | optional | Timeout logic | Make sure that the maxIdleTime setting is reasonable. The recommended value is 300 seconds.

Reference

Domain name table

Primary domain name | Region domain name | Region
api.sd-rtn.com | api-us-west-1.sd-rtn.com | Western United States
api.sd-rtn.com | api-us-east-1.sd-rtn.com | Eastern United States
api.sd-rtn.com | api-ap-southeast-1.sd-rtn.com | Southeast Asia Pacific
api.sd-rtn.com | api-ap-northeast-1.sd-rtn.com | Northeast Asia Pacific
api.sd-rtn.com | api-eu-west-1.sd-rtn.com | Western Europe
api.sd-rtn.com | api-eu-central-1.sd-rtn.com | Central Europe
api.sd-rtn.com | api-cn-east-1.sd-rtn.com | East China
api.sd-rtn.com | api-cn-north-1.sd-rtn.com | North China
api.agora.io | api-us-west-1.agora.io | Western United States
api.agora.io | api-us-east-1.agora.io | Eastern United States
api.agora.io | api-ap-southeast-1.agora.io | Southeast Asia Pacific
api.agora.io | api-ap-northeast-1.agora.io | Northeast Asia Pacific
api.agora.io | api-eu-west-1.agora.io | Western Europe
api.agora.io | api-eu-central-1.agora.io | Central Europe
api.agora.io | api-cn-east-1.agora.io | East China
api.agora.io | api-cn-north-1.agora.io | North China
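
The regional names in the table follow a regular pattern, so a helper can build the pair of domains for a given region. This is a sketch only: the region keys are the middle part of the domain names above, and trying the agora.io name before the sd-rtn.com name is an assumption that mirrors the dual domain advice.

```python
from typing import List

REGIONS = (
    "us-west-1", "us-east-1", "ap-southeast-1", "ap-northeast-1",
    "eu-west-1", "eu-central-1", "cn-east-1", "cn-north-1",
)


def regional_domains(region: str) -> List[str]:
    """Return the two regional domain names for a region, in try order."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region}")
    return [f"api-{region}.agora.io", f"api-{region}.sd-rtn.com"]


print(regional_domains("eu-west-1"))
```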