How to Set Up Marsview Conversation Intelligence APIs with Agora Cloud Recordings

Rahul B Prakash
Marsview.ai Inc.
Aug 31, 2021
Get conversation insights using Marsview on Agora RTE

Introduction

This guide shows how to set up Marsview Conversation Intelligence APIs with Agora Cloud Recordings. By integrating Agora with the Marsview Conversation Intelligence APIs, you can extract rich metadata and insights from WebRTC conversations.

What is Marsview Conversation Intelligence API Platform?

The Marsview Conversation Intelligence API platform offers a suite of proprietary APIs and developer tools for automatic speech recognition, speaker separation, emotion and sentiment recognition, intent recognition, time-sequenced visual recognition, and more. Marsview APIs provide end-to-end workflows, from call listening and recording to insights generation and Voice of Customer insights.

How do Marsview Conversation Intelligence APIs work?

Shown above is the process of extracting Conversation Intelligence Insights on an input audio/video file.

Marsview APIs connect with various input sources, such as telephony streams, audio/video streams, or an audio/video file.

In this article, we will connect the Marsview Speech Analytics APIs to a private AWS S3 bucket. This S3 bucket will be configured to store the call recordings made in the Agora app.

The process flow can be split into four API calls.

  1. Upload Recording: A recording link (an AWS S3 pre-signed URL in this case) is uploaded. On successful upload, this method returns a Transaction ID.
  2. Compute Insights: A compute request is sent for the Transaction ID, listing the models to enable. This method returns a unique Request ID for each enabled model.
  3. Long Polling for Status: Computing the insights takes some time. The ‘get_request_status’ method can be used to check the computation status of these models.
  4. Fetch Insights: Once the compute status is set to ‘completed’, the ‘Get Metadata’ route can be used to fetch the generated insights for that Transaction ID.

How Does Agora Cloud Recording work?

Agora Cloud Recording is a component provided by Agora to record and save voice calls, video calls, and interactive streaming on your cloud storage. In this article, we will configure the Agora app to save these recordings on a private AWS S3 Bucket.

Shown above is the complete process of how cloud recording works. (Source: Agora)

The recording process flow can be split up into four API calls.

  1. Acquire: This method is used to acquire a cloud recording resource on which the recording will be saved. This method returns a Resource ID pointing to the cloud recording resource.
  2. Start: This method is used to start a cloud recording. If the recording starts successfully, a Recording ID is returned.
  3. Query: This method can be used to check the recording status while the recording is in progress.
  4. Stop and Upload: This method is used to stop the cloud recording. Once the recording is stopped, the recording file is saved to the configured AWS S3 bucket.
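
The four calls above differ mainly in their URL path. As a rough sketch in Python, the endpoint for each step can be assembled as follows. The helper name `agora_recording_url` is made up for illustration, and the path layout is copied from the curl requests later in this article rather than from an official client library.

```python
from typing import Optional

AGORA_BASE = "https://api.agora.io/v1/apps"


def agora_recording_url(app_id: str, action: str,
                        resource_id: Optional[str] = None,
                        sid: Optional[str] = None,
                        mode: Optional[str] = None) -> str:
    """Build the endpoint URL for one cloud recording step.

    Hypothetical helper: the path layout mirrors the curl requests in this
    article. `sid` is the Recording ID returned by the start call.
    """
    base = f"{AGORA_BASE}/{app_id}/cloud_recording"
    if action == "acquire":
        return f"{base}/acquire"
    if action not in ("start", "query", "stop"):
        raise ValueError(f"unknown action: {action}")
    if resource_id is None or mode is None:
        raise ValueError(f"{action} needs a resource ID and a mode")
    if action == "start":
        return f"{base}/resourceid/{resource_id}/mode/{mode}/start"
    if sid is None:
        raise ValueError(f"{action} needs the Recording ID (sid)")
    return f"{base}/resourceid/{resource_id}/sid/{sid}/mode/{mode}/{action}"
```

Keeping the paths in one place makes it harder to mix up the start URL (no `sid` yet) with the query and stop URLs (which need it).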

Project Setup

Part 1: Agora App

  1. Sign up on Agora and create a new project with an App ID and a temporary token.
Create a new project on the Agora Console (Source: Agora)

2. Ensure that a third-party cloud storage service has been enabled. In this case, we will be using a private AWS S3 Bucket.

3. Select a project from the drop-down list in the upper-left corner, and click Duration under Cloud Recording. Enable Cloud Recording and apply the changes.

Menu > Products & Usage > Duration > Enable Cloud Recording (Source: Agora)

For more information on setting up a project on Agora, including the prerequisites, refer to the Agora documentation under the References section.

Part 2. Marsview APIs

  1. Create an account on https://app.marsview.ai/ and choose the Speech Analytics API Bundle.
  2. Get your API Key and Secret from the portal (as shown below).
Marsview Portal homepage (Source: Marsview)

Implementing the workflow

The UML sequence diagram below shows us the API Call sequence from your App server to Agora Cloud and Marsview Insights APIs.

Sequence Diagram. (Reference: Agora, Marsview)

CURL requests for each of these steps are given below:

1. Acquire Resource

Call the acquire method to request a Resource ID for cloud recording. The Resource ID is valid for 5 minutes, so the recording must start before it expires, and each Resource ID can be used for only one recording. (Source: Agora)

curl --location \
--request POST \
'https://api.agora.io/v1/apps/<appid>/cloud_recording/acquire' \
--header 'Authorization: Basic MjdiZjhjMmRkNTNhNGQwZGEwXXXXXXXXXE5Yzc6YjM2N2NiMjRiOTExNDQyYTg5YjU5YTdmN2Y0YjM1OWM=' \
--header 'Content-Type: application/json' \
--data-raw '{ "cname": "<YourChannelName>",
"uid": "<YourRecordingUID>",
"clientRequest":{ }
}'
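
The value after `Basic` in the Authorization header is the Base64 encoding of a credential pair in the form `id:secret`, as in any HTTP Basic authentication header. Here is a minimal Python sketch, assuming the pair is the Customer ID and Customer Secret issued in the Agora Console. The helper name `basic_auth_header` is hypothetical.

```python
import base64


def basic_auth_header(customer_id: str, customer_secret: str) -> dict:
    """Return an Authorization header for HTTP Basic authentication.

    Assumption: the credentials are the Agora Customer ID and Customer Secret.
    """
    raw = f"{customer_id}:{customer_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

The same header can then be reused for the start, query, and stop requests.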

2. Start Recording

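Call the start method to start recording with the Resource ID. Choose composite recording as the recording mode by setting <mode> to mix. In storageConfig, vendor and region must match your storage provider and bucket region; check Agora's documentation for the values. (Source: Agora)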

curl --location \
--request POST \
'https://api.agora.io/v1/apps/<appid>/cloud_recording/resourceid/<resourceid>/mode/<mode>/start' \
--header 'Authorization: Basic MjdiZjhjMmRkNTNhNGQwZGEwXXXXXXXXXE5Yzc6YjM2N2NiMjRiOTExNDQyYTg5YjU5YTdmN2Y0YjM1OWM=' \
--header 'Content-Type: application/json' \
--data-raw '{ "cname": "<YourChannelName>",
"uid": "<YourRecordingUID>",
"clientRequest": {
"token": "<YourToken>",
"storageConfig": {
"secretKey": "<YourSecretKey>",
"vendor": 1,
"region": 0,
"bucket": "<YourBucketName>",
"accessKey": "<YourAccessKey>"
},
"recordingConfig": {"channelType": 0}
}
}'
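
From an app server, the same start request can be prepared in code instead of curl. The sketch below uses only Python's standard library and builds the request without sending it. The function name `build_start_request` is made up for illustration. `vendor` and `region` are left as parameters because their values depend on your storage provider, and `cname` is included on the assumption that the start call expects the channel name just as the acquire and stop calls do.

```python
import json
import urllib.request


def build_start_request(url: str, auth_header: str, cname: str, uid: str,
                        token: str, bucket: str, access_key: str,
                        secret_key: str, vendor: int,
                        region: int) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that starts a recording."""
    body = {
        "cname": cname,  # assumption: same channel name as in acquire/stop
        "uid": uid,
        "clientRequest": {
            "token": token,
            "storageConfig": {
                "vendor": vendor,   # storage provider code, see Agora docs
                "region": region,   # bucket region code, see Agora docs
                "bucket": bucket,
                "accessKey": access_key,
                "secretKey": secret_key,
            },
            "recordingConfig": {"channelType": 0},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": auth_header,
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately makes the payload easy to inspect or log (minus the secrets) before anything goes over the network.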

3. Query recording status

Call the query method to check the recording status. It can be called repeatedly while the recording is in progress. (Source: Agora)

curl --location \
--request GET \
'https://api.agora.io/v1/apps/<appid>/cloud_recording/resourceid/<resourceid>/sid/<sid>/mode/<mode>/query' \
--header 'Authorization: Basic MjdiZjhjMmRkNTNhNGQwZGEwXXXXXXXXXE5Yzc6YjM2N2NiMjRiOTExNDQyYTg5YjU5YTdmN2Y0YjM1OWM=' \
--header 'Content-Type: application/json'

4. Stop Recording

Call the stop method to stop the recording. After this call succeeds, the response body contains the upload status of the recording file and information about the file. (Source: Agora)

Once the file is in your S3 bucket, generate a pre-signed URL for the S3 object and submit it to the Marsview Conversation Insights APIs (shown in the next step).

curl --location --request POST \
'https://api.agora.io/v1/apps/<appid>/cloud_recording/resourceid/<resourceid>/sid/<sid>/mode/<mode>/stop' \
--header 'Content-Type: application/json;charset=utf-8' \
--header 'Authorization: Basic MjdiZjhjMmRkNTNhNGQwZGEwXXXXXXXXXE5Yzc6YjM2N2NiMjRiOTExNDQyYTg5YjU5YTdmN2Y0YjM1OWM=' \
--data-raw '{
"uid": "<YourRecordingUID>",
"cname": "<YourChannelName>",
"clientRequest":{}
}'

5. Upload Recording

Using your apiKey and apiSecret, you can generate the accessToken as shown below.

curl --location --request POST 'https://api.marsview.ai/cb/v1/auth/create_access_token' \
--header 'Content-Type: application/json' \
--data-raw '{
"apiKey": "{{Insert API Key}}",
"apiSecret": "{{Insert API Secret}}",
"userId": "demo@marsview.ai"
}'

Create a pre-signed URL for the Agora Cloud Recording file in your S3 bucket, using its object name. For more information on how to create an S3 pre-signed URL, refer to the AWS documentation.

Keep an expiry of at least 20 minutes on the pre-signed URL to improve the chance of successful processing. If the URL has already expired by the time Marsview picks up the request, the compute API returns a failure status with status code AIRDOW002.
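
One way to generate such a URL from Python is boto3's `generate_presigned_url` method on an S3 client. The sketch below takes the client as a parameter, so it assumes you have already created one with `boto3.client("s3")` using credentials that can read the bucket. The helper name `presign_recording` and the guard on the expiry are illustrative, not part of either API.

```python
MIN_EXPIRY_SECONDS = 20 * 60  # the 20-minute minimum recommended above


def presign_recording(s3_client, bucket: str, object_key: str,
                      expires_in: int = MIN_EXPIRY_SECONDS) -> str:
    """Return a pre-signed GET URL for a recording stored in S3.

    `s3_client` is expected to be a boto3 S3 client; `object_key` is the
    name of the recording file that Agora uploaded to the bucket.
    """
    if expires_in < MIN_EXPIRY_SECONDS:
        raise ValueError("expiry should be at least 20 minutes")
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": object_key},
        ExpiresIn=expires_in,
    )
```

Rejecting short expiries up front is a design choice: it surfaces the problem in your own code rather than as a failed compute request later.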

Submit the pre-signed URL to the Marsview API platform as shown below. On success, the route returns a Transaction ID.

curl --location --request POST \
'https://api.marsview.ai/cb/v1/conversation/save_file_link' \
--header 'Content-Type: application/json' \
--header 'authorization: {{Authentication Token}}' \
--data-raw '{
"title":"A sample Call",
"description":"A sample interview call",
"link": "{{Presigned URL}}"
}'

6. Upload a Compute Request

Using the Transaction ID, you can now select models with enableModels, configure them with modelConfig, and submit the request using POST Compute Request.

POST a Compute request as shown below.

curl --location --request POST 'https://api.marsview.ai/cb/v1/conversation/compute' \
--header 'Content-Type: application/json' \
--header 'authorization: <Your access token>' \
--data-raw '{
"txnId": "your txn id",
"enableModels":[{
"modelType":"speech_to_text",
"modelConfig":{
"custom_vocabulary":["Marsview", "Communication"],
"speaker_seperation":{"num_speakers":2},
"topics":true}}]}'

7. Long Poll for Compute request Status

Using the Get Processing State method, the status of the processing request can be fetched periodically (every 300 seconds in this case).

  • If the process is in the uploaded state, continue polling until it reaches either the processed or the error state.
  • If the process is in the processed or error state, move on to Step 8.
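
The polling rule above can be sketched as a small loop. This is an illustration, not Marsview client code: `get_state` is a hypothetical callable that wraps the Get Processing State request and returns the state string, and the cap on the number of polls is an added safeguard that the article does not specify.

```python
import time
from typing import Callable

TERMINAL_STATES = {"processed", "error"}  # state names as used in this step


def wait_for_compute(get_state: Callable[[], str],
                     interval_seconds: float = 300.0,
                     max_polls: int = 50,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the request reaches 'processed' or 'error'.

    Any other state (for example 'uploaded') means "keep waiting".
    Raises TimeoutError if no terminal state is seen within max_polls.
    """
    for attempt in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"still not finished after {max_polls} polls")
```

The `sleep` parameter exists so the wait can be replaced in tests. In production the default `time.sleep` applies, and the caller moves to Step 8 only when the returned state is `processed`.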

8. Fetch Metadata

Once a particular request is in the processed state, the metadata for that model can be fetched using the Get Request Metadata method.

Shown below is a CURL request for fetching the metadata.

curl --location --request GET 'https://api.marsview.ai/cb/v1/conversation/fetch_metadata/{{Transaction_ID}}' \
--header 'authorization: Bearer {{AUTH_TOKEN}}'

References and Documentation

Given below are the API Documentation pages and Reference articles used to write this article.

Agora:

Marsview:
