Turn Alexa into Your Personal Messaging Assistant

Published in RingCentral Developers, Sep 7, 2017

Do you want an assistant that checks for unread SMS messages, reads them aloud, and lets you reply to the sender without picking up your phone or touching a single button? If so, read on: this article shows how to develop an application that does exactly that.

In this tutorial, we will walk through the steps needed to develop an Alexa skill that links to your RingCentral account and performs the following tasks:

Ask the user to log in to their RingCentral account.
Fetch the user’s name and phone number.
Fetch unread SMS messages.
Read aloud each SMS message and let the user decide to:
a. Reply to the sender with a text message.
b. Listen to the next message.
Mark each unread message as read after reading it aloud.

Prerequisites

You must have a RingCentral developer account. If you don’t have one, click here to create a free developer account.
You must have an AWS developer account. If you don’t have one, click here to create a free account.

Pre-knowledge requirements

You must have basic knowledge of how to create an Alexa skill with the Alexa Skills Kit and how to create an AWS Lambda function. If you are new to Alexa skill development, click here to get started before reading on.

We use Node.js as the programming language to build our skill. If you want the source code in another programming language, please send your request to devsupport@ringcentral.com.

Note that the code snippets shown in this article are just for illustrations and may not work directly with copy and paste. We recommend you download the entire project from here.

Create a RingCentral app

Enter the application name and description

Specify the application type and choose the platform type as shown below:

Add required permissions and provide the OAuth redirect URI. For now, we can leave the OAuth redirect URI field blank. The redirect URI will be obtained from the Account Linking section below.

Finally, click the Create button to complete the app creation process. At the application dashboard, copy the App Key and App Secret credentials and keep them in a safe place so we can use them later.

If you want to learn more about creating a RingCentral application, please refer to this getting started document for details.

Create an AWS Lambda function for Alexa Skill

We assume that you already know how to create a Lambda function for an Alexa skill, so we won’t discuss that in detail in this tutorial. If you don’t know where to start, please refer to this document for detailed instructions.

When creating the Lambda function, provide all the mandatory information and choose Node.js 6.10 for the Runtime option.

You will also need to add the following key/value parameters to the function’s Environment variables.

RC_APP_SERVER_URL: https://platform.devtest.ringcentral.com
AppID: the value will be obtained when we create an Alexa Custom Skill in the next section.
RC_APP_SECRET: the RingCentral App Secret obtained from the previous section.
RC_APP_KEY: the RingCentral App Key obtained from the previous section.
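
Because the skill depends on all four variables, it can help to fail fast when one of them is missing. The following is a minimal sketch and is not part of the original project; the helper name findMissingEnv and the REQUIRED_ENV list are hypothetical.

```javascript
// Hypothetical helper (not part of the original project): returns the names
// of the required environment variables that are missing or empty.
var REQUIRED_ENV = ['RC_APP_SERVER_URL', 'RC_APP_KEY', 'RC_APP_SECRET', 'AppID'];

function findMissingEnv(env) {
  return REQUIRED_ENV.filter(function (name) {
    return !env[name];
  });
}

// Example usage near the top of index.js: log which variables are missing.
var missing = findMissingEnv(process.env);
if (missing.length > 0) {
  console.error('Missing environment variables: ' + missing.join(', '));
}
```

Logging the missing names to CloudWatch this way makes a misconfigured function easier to diagnose than a failed API call later on.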

In the Lambda function code section, select “Upload a Zip file” from the Code entry type dropdown list. We will implement the code and upload the files later.

Create a new Alexa Custom Skill

Log in to your Amazon Developer account and open your skills list. Click the Add a New Skill button in the upper right corner of the page. On the Create New Skill page, set the Skill Type radio button to “Custom Interaction Model”. Set the Name to “RingCentral Messaging Skill”, and the Invocation Name to “my assistant”.

Save the app and click Next to move on to the Interaction Model form, then copy the following code block and paste it into the Intent Schema field.

{
  "intents": [
    {
      "intent": "GetUnreadTextMessageIntent"
    },
    {
      "intent": "ReadTextMessageIntent"
    },
    {
      "intent": "ReplyTextMessageIntent"
    },
    {
      "slots": [
        {
          "name": "MessageBody",
          "type": "AMAZON.LITERAL"
        }
      ],
      "intent": "TextMessageIntent"
    },
    {
      "intent": "DoneIntent"
    },
    {
      "intent": "AMAZON.HelpIntent"
    },
    {
      "intent": "AMAZON.YesIntent"
    },
    {
      "intent": "AMAZON.NoIntent"
    }
  ]
}

Then define a set of sample utterances. Simply copy the following code block and paste it into the Sample Utterances field.

GetUnreadTextMessageIntent get unread message
GetUnreadTextMessageIntent get text message
ReadTextMessageIntent next
ReadTextMessageIntent next message
ReplyTextMessageIntent reply
ReplyTextMessageIntent reply message
DoneIntent I'm done
DoneIntent I am done
TextMessageIntent message body {short message|MessageBody}
TextMessageIntent message body {this is for a long text message|MessageBody}

Link your RingCentral account

The best way to let each user log in to RingCentral with their own account credentials is to enable account linking. Continuing from the step above, select the Configuration option from the Alexa skill dashboard. Under the Account Linking section, select the Yes radio button and provide the required information as shown and explained below:

Authorization URL: Use RingCentral authorization endpoint https://platform.devtest.ringcentral.com/restapi/oauth/authorize
Client Id: Use the RingCentral app’s AppKey for this field.

Follow the steps below to complete the account linking settings:

Copy the Redirect URL and paste it into the OAuth redirect URI field of the RingCentral app, as mentioned in the last step of the Create a RingCentral app section. Assuming the user’s device is registered in the U.S., the https://pitangui.amazon.com/api/skill/link/XXXXXXXX URL will be used.
Set the Auth Code Grant radio button for the Authorization Grant Type
Copy this URI and paste it into the Access Token URI field: https://platform.devtest.ringcentral.com/restapi/oauth/token
Copy your RingCentral app’s AppSecret and paste it into the Client Secret field.
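
For orientation: with the Auth Code Grant type, Alexa sends the user to the Authorization URL with the standard OAuth 2.0 query parameters and later exchanges the returned code at the Access Token URI. Alexa builds these requests for you, so the sketch below only illustrates the general shape of the first one. The function name and the placeholder values are assumptions, and Alexa may send additional parameters.

```javascript
// Illustration only: Alexa constructs this request itself during account
// linking. The parameters shown are the standard OAuth 2.0 ones.
function buildAuthorizeUrl(authorizationUrl, clientId, redirectUri, state) {
  var query = [
    'response_type=code',
    'client_id=' + encodeURIComponent(clientId),
    'redirect_uri=' + encodeURIComponent(redirectUri),
    'state=' + encodeURIComponent(state)
  ].join('&');
  return authorizationUrl + '?' + query;
}

// Placeholder values; substitute your own App Key and Redirect URL.
console.log(buildAuthorizeUrl(
  'https://platform.devtest.ringcentral.com/restapi/oauth/authorize',
  'YOUR_APP_KEY',
  'https://pitangui.amazon.com/api/skill/link/XXXXXXXX',
  'opaque-state-value'
));
```

This is why the Redirect URL registered in the RingCentral app has to match the one shown on the Alexa configuration page.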

Complete the Alexa skill creation process by providing the publishing information and enabling skill beta testing, so that other users can test the skill.

Also remember to copy the Alexa skill’s App Id (found on the Skill Information form) and paste it into the Lambda function’s environment variables (the AppID key), as discussed earlier in the Create an AWS Lambda function for Alexa Skill section.

Implement Code for a Lambda Function

From a local machine, create a new project named rc-alexa-skill.

Note: The complete source code for this skill is available for download from here.

$ mkdir rc-alexa-skill
$ cd rc-alexa-skill

Then install the Alexa and the RingCentral Node JS SDKs. We save the SDKs locally so that we can zip them and upload the SDKs’ files to AWS Lambda server later.

$ npm install alexa-sdk --save
$ npm install ringcentral --save

To keep things simple, we create a single file named index.js and then complete the code step by step as discussed in the following sections.

'use strict';

const Alexa = require('alexa-sdk');
const RC = require('ringcentral');

// Create the RingCentral SDK instance from the Lambda environment variables
var rcsdk = new RC({
    server: process.env.RC_APP_SERVER_URL,
    appKey: process.env.RC_APP_KEY,
    appSecret: process.env.RC_APP_SECRET
});
var platform = rcsdk.platform();
var speech_output = "";
var reprompt_text = "";

exports.handler = function(event, context){
    var alexa = Alexa.handler(event, context);
    alexa.appId = process.env.AppID;
    alexa.registerHandlers(handlers);
    alexa.execute();
};

From the code above, we import the SDKs and create an instance rcsdk of the RingCentral SDK. We will use the RingCentral app’s AppKey and AppSecret specified in the lambda function’s environment variables.

Important: When you publish your RingCentral app, don’t forget to change the Lambda function’s environment variables with the production server (https://platform.ringcentral.com) and app’s credentials for production!

We also get the platform instance from the SDK, so we can use it to call RingCentral APIs later.

Then we create and export a Lambda function handler. The next step is to define the handlers object and implement functions to handle Alexa’s requests.

var handlers = {
    'LaunchRequest': function () {    },
    'GetUnreadTextMessageIntent': function () {    },
    'ReadTextMessageIntent': function () {    },
    'ReplyTextMessageIntent': function () {    },
    'TextMessageIntent': function () {    },
    'AMAZON.YesIntent': function () {    },
    'AMAZON.NoIntent': function () {    },
    'DoneIntent': function () {    },
    'AMAZON.HelpIntent': function () {    },
    'Unhandled': function () {    }
};
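
The keys of the handlers object must match the intent names in the Intent Schema exactly, because each request is routed by name and a request without a matching key falls through to Unhandled. The sketch below is a simplified model of that routing, not alexa-sdk’s actual implementation; dispatch and sampleHandlers are hypothetical names.

```javascript
// Simplified model of intent routing (not alexa-sdk's real implementation):
// pick the handler whose key equals the intent name, else use 'Unhandled'.
function dispatch(handlers, intentName) {
  var handler = handlers[intentName] || handlers['Unhandled'];
  return handler();
}

var sampleHandlers = {
  'DoneIntent': function () { return 'Good bye'; },
  'Unhandled': function () { return 'Sorry, I did not understand.'; }
};

console.log(dispatch(sampleHandlers, 'DoneIntent'));    // prints Good bye
console.log(dispatch(sampleHandlers, 'UnknownIntent')); // prints Sorry, I did not understand.
```

A misspelled key therefore does not raise an error; the request simply lands in Unhandled.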

Let us now implement each intent function and explain the essential parts of the code. Remember that the code snippets in this section may be incomplete. For a working implementation, use the project’s code available on GitHub instead.

'LaunchRequest': function () {
  if (this.event.session.user.accessToken == undefined) {
    // no access token yet: ask the user to link the account
    this.emit(':tellWithLinkAccountCard', 'To start using my ' +
              'assistant skill, please use the companion app to ' +
              'authenticate on RingCentral.');
  } else {
    // hand the access token provided by Alexa to the RingCentral SDK
    var data = platform.auth().data();
    data.token_type = "bearer";
    data.expires_in = 86400;
    data.refresh_token_expires_in = 86400;
    data.access_token = this.event.session.user.accessToken;
    platform.auth().setData(data);
    ...
  }
}

When a user invokes the skill (by saying “Alexa open my assistant”), the LaunchRequest function is called. First, we check whether the user has authorized Amazon to request an OAuth access token for our RingCentral app.

If the access token does not exist, we return a “LinkAccount” card, which is displayed in the Alexa app. The card contains a link that lets the user log in to RingCentral and authorize the skill.

If the access token exists, we will set the access token using the platform.auth().setData(data) method so that we can use the platform instance to call RingCentral APIs.
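
The token handling can also be isolated in a small function, which makes it easier to see exactly which fields are set. This is a minimal sketch; withAccessToken is a hypothetical helper, and the 86400-second lifetimes simply mirror the values used in the snippet above rather than values read from the token.

```javascript
// Hypothetical helper: returns a copy of the SDK's auth data with the
// Alexa-provided access token filled in. The lifetimes are assumptions
// copied from the LaunchRequest snippet, not values read from the token.
function withAccessToken(data, accessToken) {
  var copy = Object.assign({}, data);
  copy.token_type = 'bearer';
  copy.expires_in = 86400;
  copy.refresh_token_expires_in = 86400;
  copy.access_token = accessToken;
  return copy;
}

// Usage inside LaunchRequest (platform comes from the RingCentral SDK):
// platform.auth().setData(
//   withAccessToken(platform.auth().data(),
//                   this.event.session.user.accessToken));
```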

We continue implementing the LaunchRequest function to fetch the user’s name and direct phone number, and we keep them in the session attributes for later use. You can save this information to a data store such as AWS DynamoDB if you want to avoid making these calls every time the user invokes the skill.

var thisHandler = this;
// Retrieve the user's name and extension number from the RC account
platform.get('/account/~/extension/~/')
  .then(function (response) {
    var obj = response.json();
    thisHandler.attributes['extNumber'] = obj.extensionNumber;
    thisHandler.attributes['userName'] = obj.name;

    // Retrieve the user's phone numbers
    platform.get('/account/~/extension/' + obj.id + '/phone-number')
      .then(function (response) {
        var obj = response.json();
        for (var record of obj.records) {
          // check if the user has a direct number
          if (record.usageType == "DirectNumber") {
            thisHandler.attributes['ownPhoneNumber'] =
                        record.phoneNumber.replace("+", "");
            break;
          }
        }
        // if there is no direct number
        if (!thisHandler.attributes['ownPhoneNumber']) {
          speech_output = "Hi ";
          speech_output += thisHandler.attributes['userName'];
          speech_output += ". Unfortunately, your account does not " +
                           "support SMS messages.";
          thisHandler.emit(':tell', speech_output);
        } else {
          speech_output = "Hi ";
          speech_output += thisHandler.attributes['userName'];
          speech_output += ". How can I help you?";
          reprompt_text = "How can I help you?";
          thisHandler.emit(':ask', speech_output, reprompt_text);
        }
      })
      .catch(function (e) {
        thisHandler.emit(':tell', "Failed to read your account. " +
                         "Please try again.");
      });
  })
  .catch(function (e) {
    thisHandler.emit(':tell', "Failed to read your account. " +
                     "Please try again.");
  });
}
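
The direct-number lookup in the loop above can be tried on its own. The sketch below is a standalone version under the same assumptions about the record shape (usageType and phoneNumber fields); findDirectNumber is a hypothetical name and the phone numbers are made up.

```javascript
// Hypothetical helper: returns the first direct number without the leading
// "+", or null if the extension has no direct number.
function findDirectNumber(records) {
  for (var i = 0; i < records.length; i++) {
    if (records[i].usageType === 'DirectNumber') {
      return records[i].phoneNumber.replace('+', '');
    }
  }
  return null;
}

// Sample data shaped like the phone-number records used above (made-up numbers).
var sampleRecords = [
  { usageType: 'MainCompanyNumber', phoneNumber: '+16505550100' },
  { usageType: 'DirectNumber', phoneNumber: '+16505550101' }
];
console.log(findDirectNumber(sampleRecords)); // prints 16505550101
```

Returning null for the missing case keeps the "no direct number" branch explicit in the caller.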

The GetUnreadTextMessageIntent function is called when the user says e.g. “get unread message”. Inside the function, we implement code to fetch unread messages from the user’s RingCentral account. We define the params variable and specify the messageType, readStatus and the direction parameters to fetch only inbound and unread SMS messages.

'GetUnreadTextMessageIntent': function () {
  this.attributes['index'] = -1;
  this.attributes['textMsgs'] = [];
  var params = {};
  params['messageType'] = "SMS";
  params['readStatus'] = "Unread";
  params['direction'] = "Inbound";
  var thisHandler = this;
  // Call the message store to fetch SMS messages
  // (the sms endpoint is used for sending, see AMAZON.YesIntent)
  platform.get('/account/~/extension/~/message-store', params)
    .then(function (response) {
      var obj = response.json();
      var count = obj.records.length;
      if (count > 0) {
        // iterate the records array to read each message's details,
        // last record first
        for (var i = count - 1; i >= 0; i--) {
          var record = obj.records[i];
          var message = {};
          message['id'] = record.id;
          // check if the sender's name exists in the message
          if ("name" in record.from) {
            message['from'] = record.from.name;
          } else {
            // sender's name is not defined. Convert the sender's
            // number to a string with a space between each digit
            // so Alexa reads the digits instead of one large number
            message['from'] =
                   getNumberAsString(record.from.phoneNumber);
          }
          // keep the number so we can reply to the sender if needed
          message['fromNumber'] = record.from.phoneNumber;
          // store the message body
          message['subject'] = record.subject;
          // add the message object to the 'textMsgs' array
          thisHandler.attributes['textMsgs'].push(message);
        }
        // emit the ReadTextMessageIntent to read the first message
        thisHandler.emit('ReadTextMessageIntent');
      } else {
        thisHandler.emit(':tell', "You have no unread messages.");
      }
    })
    .catch(function (e) {
      console.error(e);
      thisHandler.emit(':tell', "Failed to read your messages. " +
                       "Please try again.");
    });
}
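
The getNumberAsString helper used above is not shown in this article. A minimal sketch of what such a helper might look like follows; this is an assumed implementation, not necessarily the one in the project’s source code.

```javascript
// Assumed implementation of getNumberAsString: keeps only the digits and
// separates them with spaces so a speech engine reads them one by one.
function getNumberAsString(phoneNumber) {
  return String(phoneNumber).replace(/\D/g, '').split('').join(' ');
}

console.log(getNumberAsString('+16505550100')); // prints 1 6 5 0 5 5 5 0 1 0 0
```

Without the spaces, a text-to-speech engine may read the number as one very large integer.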

The ReadTextMessageIntent function is called when the user says “next” or “next message”. Inside the function, we implement code to read an unread message from the “textMsgs” array. We also set the message status to read when reading an unread message.

'ReadTextMessageIntent': function () {
  // check if there is any message in the array
  if (!this.attributes['textMsgs'] ||
      this.attributes['textMsgs'].length == 0) {
    return this.emit(':ask', "Please say get unread message to check " +
           "for new messages.", "How can I help you?");
  }
  var count = this.attributes['textMsgs'].length;
  var index = this.attributes['index'];
  if (index >= count - 1) {
    // no more messages
    return this.emit(':ask', "There are no more unread messages. You " +
             "can say reply, or say get unread message to check for " +
             "new messages.", "How can I help you?");
  }
  // increase the index and retrieve a message object from the array
  this.attributes['index']++;
  var msg = this.attributes['textMsgs'][this.attributes['index']];
  var prefix = "";
  if (this.attributes['index'] == 0) {
    if (count == 1)
      prefix = "You have 1 unread message ";
    else
      prefix = "You have " + count + " unread messages. First message ";
  } else {
    if (this.attributes['index'] == count - 1)
      prefix = "Last message ";
    else {
      prefix = convertNumtoOrder(this.attributes['index']);
      prefix += " unread message ";
    }
  }
  speech_output = prefix;
  speech_output += "from " + msg['from'];
  speech_output += ". Message. " + msg['subject'] + ". ";
  // offer "next message" only if more messages remain
  if (this.attributes['index'] < count - 1) {
    speech_output += "You can say reply or next message. ";
    reprompt_text = "You can say reply or next message.";
  } else {
    speech_output += "You can say reply or I am done.";
    reprompt_text = "How can I help you?";
  }
  // keep a reference to the handler: "this" does not refer to the
  // handler inside the promise callbacks below
  var thisHandler = this;
  // call to set this message's status on the server to "Read"
  platform.put('/account/~/extension/~/message-store/' + msg['id'], {
      readStatus: "Read"
    })
    .then(function (response) {
      // ask Alexa to read the message
      thisHandler.emit(':ask', speech_output, reprompt_text);
    })
    .catch(function (e) {
      console.log("Failed to set readStatus");
      console.error(e);
      // still read the message so the skill always responds
      thisHandler.emit(':ask', speech_output, reprompt_text);
    });
}
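
The convertNumtoOrder helper is also not shown in this article. Since it receives the zero-based index of the current message, index 1 should be spoken as “Second”. The following is an assumed implementation for illustration, not necessarily the one in the project’s source code; the ORDINALS table and the “Number N” fallback are hypothetical choices.

```javascript
// Assumed implementation of convertNumtoOrder: maps a zero-based index to a
// spoken ordinal, falling back to "Number N" for long lists.
var ORDINALS = ['First', 'Second', 'Third', 'Fourth', 'Fifth',
                'Sixth', 'Seventh', 'Eighth', 'Ninth', 'Tenth'];

function convertNumtoOrder(index) {
  if (index >= 0 && index < ORDINALS.length) {
    return ORDINALS[index];
  }
  return 'Number ' + (index + 1);
}

console.log(convertNumtoOrder(1) + ' unread message'); // prints Second unread message
```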

The ReplyTextMessageIntent function is called when the user says “reply” or “reply message”. Inside the function, we confirm the recipient and prepare to receive the text message from the user.

'ReplyTextMessageIntent': function () {
  // check if there is any message in the message array
  if (!this.attributes['textMsgs'] ||
      this.attributes['textMsgs'].length == 0) {
    return this.emit(':ask', "Please say get unread message to check " +
         "for new messages, then say reply.", "How can I help you?");
  }
  // retrieve the current message object from the array
  var msg = this.attributes['textMsgs'][this.attributes['index']];
  speech_output = "Reply to " + msg['from'];
  speech_output += ". Now you can say message body, followed by the " +
                   "message you want to send.";
  this.attributes['message'] = "";
  this.attributes['toNumber'] = msg['fromNumber'];
  this.emit(':ask', speech_output, speech_output);
}

The TextMessageIntent function is called when the user says “message body” followed by the words to be sent. Inside the function, we ask Alexa to repeat the user’s text message so the user can confirm that Alexa heard all the words correctly. We then wait for the user to say “yes” to send the message or “no” to cancel the action.

'TextMessageIntent': function () {
  var intent = this.event.request.intent;
  var message = intent.slots.MessageBody.value
  this.attributes['message'] = message
  speech_output = "I repeat your message. " 
  speech_output += message + ". Do you want to send it now?"
  reprompt_text = "Say yes to send the message or say no to cancel."
  this.emit(':ask', speech_output, reprompt_text);
}
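
A slot object can arrive without a value, for example when Alexa matches the intent but captures no words. A defensive way to read the slot is sketched below; getSlotValue is a hypothetical helper and the sample intent object is made up for illustration.

```javascript
// Hypothetical helper: returns a slot's trimmed value, or an empty string
// if the intent, the slot, or its value is missing.
function getSlotValue(intent, slotName) {
  var slot = intent && intent.slots && intent.slots[slotName];
  return slot && slot.value ? String(slot.value).trim() : '';
}

// Made-up intent object shaped like this.event.request.intent
var sampleIntent = {
  name: 'TextMessageIntent',
  slots: { MessageBody: { name: 'MessageBody', value: ' see you at noon ' } }
};
console.log(getSlotValue(sampleIntent, 'MessageBody')); // prints see you at noon
```

With a helper like this, the handler could re-prompt for the message body when the returned string is empty instead of repeating "undefined" back to the user.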

The AMAZON.YesIntent function is called when the user says “yes”. Inside the function, we check the values of the recipient’s phone number and the text message before sending the message.

'AMAZON.YesIntent': function () {
  // check if we have the phone number to reply to
  // check also if we've captured the text message
  if (this.attributes['toNumber']) {
    if (this.attributes['message']) {
      var thisHandler = this;
      platform.post('/account/~/extension/~/sms', {
          from: {'phoneNumber': this.attributes['ownPhoneNumber']},
          to: [{'phoneNumber': this.attributes['toNumber']}],
          text: this.attributes['message']
        })
        .then(function (response) {
          speech_output = "Message is sent. ";
          var count = thisHandler.attributes['textMsgs'].length;
          if (thisHandler.attributes['index'] < count - 1) {
            speech_output += "You can say next to listen to the next " +
                             "message.";
            reprompt_text = "You can say next to listen to the next " +
                            "message.";
          } else {
            speech_output += "No more unread messages. You can say " +
                    "get unread message to check for new unread messages.";
            reprompt_text = "How can I help you?";
          }
          thisHandler.emit(':ask', speech_output, reprompt_text);
        })
        .catch(function (e) {
          console.error(e);
          thisHandler.emit(':ask', "Failed to send message. Please " +
                "try again.", "Say message body, followed by the text " +
                "message you want to send.");
        });
    } else {
      // the message is missing, prompt the user to say the message
      speech_output = "Say message body, followed by the text " +
                      "message you want to send.";
      this.emit(':ask', speech_output, speech_output);
    }
  } else {
    speech_output = 'Sorry, I don\'t understand what you want me to ' +
                    'do. Please say help to hear what you can say.';
    this.response.speak(speech_output).listen(speech_output);
    this.emit(':responseReady');
  }
}
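
The request body sent to the sms endpoint above has three parts: the sender, a list of recipients, and the text. The sketch below builds the same object in a standalone function so the shape is easy to inspect; buildSmsPayload is a hypothetical helper and the phone numbers are made up.

```javascript
// Hypothetical helper: builds the request body passed to platform.post()
// in the handler above, or returns null when a required value is missing.
function buildSmsPayload(fromNumber, toNumber, text) {
  if (!fromNumber || !toNumber || !text) {
    return null;
  }
  return {
    from: { phoneNumber: fromNumber },
    to: [{ phoneNumber: toNumber }],
    text: text
  };
}

// Made-up numbers for illustration
console.log(JSON.stringify(buildSmsPayload('16505550101', '+16505550199', 'On my way')));
```

Note that "to" is an array, even though the skill only ever replies to a single sender.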

The AMAZON.NoIntent function is called when the user says “no”. Inside the function, we check whether the user is cancelling because the reply message was captured incorrectly. If so, we ask the user to provide the message body again.

'AMAZON.NoIntent': function () {
  if (this.attributes['message'] &&
      this.attributes['message'].length > 0) {
    speech_output = "If the message is incorrect, you can say " +
                    "message body, followed by the text message you " +
                    "want to send.";
  } else {
    speech_output = 'How can I help you?';
  }
  this.emit(':ask', speech_output, 'How can I help you?');
}

The DoneIntent function is called when the user says “I am done”. Inside the function, we say goodbye to the user and terminate the session. We also implement the AMAZON.HelpIntent function to tell the user how to use the assistant skill. Finally, we implement the Unhandled function to handle unexpected commands.

'DoneIntent': function () {
  this.emit(':tell', 'Good bye');
},
'AMAZON.HelpIntent': function () {
  speech_output = 'Say get unread message to fetch new unread text ' +
                  'messages.';
  reprompt_text = 'How can I help you?';
  this.emit(':ask', speech_output, reprompt_text);
},
'Unhandled': function () {
  speech_output = 'Sorry, I don\'t understand what you want me to ' +
                  'do. Please say help to hear what you can say.';
  this.response.speak(speech_output).listen(speech_output);
  this.emit(':responseReady');
}

When you are done with the code, compress the index.js file and the node_modules folder into a single .zip file, then upload it to your AWS Lambda function as described in the Create an AWS Lambda function for Alexa Skill section.

Congratulations! You have just completed the implementation of a simple Alexa skill for RingCentral. There are many more features you can add to the skill to enhance both usability and functionality. For example, you could implement handler states, use the dialog interface, and add new intents to check for voicemails, play back voicemails, or make a RingOut call. I will leave it to your own innovation and imagination to make your messaging assistant more useful.