From Amazon Alexa to Samsung Bixby Development

Dyung Ngo
Published in Voice Tech Podcast · Oct 30, 2019

I’m a big believer in voice and conversational AI. I’ve developed a number of Amazon Alexa skills (voice apps), started the Melbourne Amazon Alexa meetup (Australia) to connect and educate developers, and built a corporate chatbot that answers human resources and payroll questions for staff at one of the major hospitals in Victoria. But mostly I enjoy building voice experiences, and I’ve done so primarily on the Amazon Alexa platform. Recently I was invited to join the Bixby Premier Developer program and build a capsule (voice app) on the Samsung Bixby platform. From my personal perspective, I would sum up the experience in one sentence.

“It’s different but fundamentally the same.”

Below is a slide from the Bixby video tutorial series presented by Roger Kibbe. It’s a nice visual that maps the differences in terminology.

Alexa

In designing for voice on Alexa, you start with the interaction model. This maps the users’ spoken input to the intents your cloud-based service can handle.

  • Intents: An intent represents an action that fulfils a user’s spoken request. Intents can optionally have arguments called slots.
  • Sample utterances: A set of likely spoken phrases mapped to the intents. This should include as many representative phrases as possible.
  • Custom slot types: A representative list of possible values for a slot. Custom slot types are used for lists of items that are not covered by one of Amazon’s built-in slot types.
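As a minimal sketch of how these three pieces fit together (the intent, utterance and slot names below are made up for illustration, not from the actual skill), an interaction model fragment can be pictured as a plain object:

```javascript
// Hypothetical fragment of an Alexa interaction model, expressed as a
// plain JavaScript object. All names here are illustrative only.
const interactionModel = {
  intents: [
    {
      name: "BlockPunchIntent", // an intent: one action the skill can fulfil
      slots: [{ name: "block", type: "BLOCK_MOVES" }], // optional slot argument
      samples: [
        // sample utterances mapped to the intent, with the slot in braces
        "block {block}",
        "I block {block}",
      ],
    },
  ],
  types: [
    {
      name: "BLOCK_MOVES", // custom slot type with representative values
      values: ["block left", "block right", "weave left", "weave right"],
    },
  ],
};
```

The real model lives in the skill’s JSON interaction model file, but the shape is the same: utterances point at intents, and braces in an utterance mark where a slot value is expected.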

Bixby

When developing for Bixby, intents, utterances and slots map to the following components.

  • Concept (equivalent to slots)

According to the Bixby Developer Centre guide on Modelling Concepts:

A concept describes any “thing.” It could represent a concrete object, such as coffee, flowers, or an airport. A concept can also be abstract, such as a day, flight, or order.

There are two types of concepts.

Primitive concepts represent simple types, like text or numbers. Structure concepts are more complex, representing records with named properties.

  • Training (equivalent to utterances)

According to the Bixby Developer Centre guide on Training for Natural Language:

Bixby learns how to handle NL utterances by example. You provide training examples that consist of sample utterances annotated to connect words and phrases to your capsule’s concepts and actions, aligning them to an intent.

  • Action (equivalent to intents)

According to the Bixby Developer Centre guide on Modelling Actions:

An action defines an operation that Bixby can perform on behalf of a user. If concepts are nouns, actions are verbs. Examples of actions include:

FindRestaurants: search for restaurants.

ConvertTemperature: perform a temperature conversion computation.

BookHotel: perform a transactional operation that books a hotel room.

How to handle data persistence

When moving from one platform to another, in this case Alexa to Bixby, inevitably you bring your Alexa hat. I certainly did! The most glaring need I had while developing my first capsule was persistence: both session and persistent attributes. Session attributes carry variables for the current session, and persistent attributes are stored in DynamoDB, so the developer can access those attributes to provide a better voice experience.

For a great resource on building for memory on Alexa, click here. The example below demonstrates why you need to remember things about the user, or where they are in the voice experience, and how frustrating the experience can be when the app simply doesn’t remember anything you’ve said previously.

For my first capsule, I re-created my Alexa skill, Boxing Legends. It’s essentially a memory game, so it’s critical that the capsule persists parameters (or attributes).

Boxing Legends lets you block and move, duck and weave would-be knockout punches, and then counter-attack with jabs, hooks, crosses and uppercuts.

The sequence looks like this:

Computer (turn): “He attacks. Left jab, right uppercut.”

Player (turn): “Block left, weave right.”

What I need to remember is that the computer attacks with a [left] jab and a [right] uppercut. Then I can take the player’s input of a [left] block and [right] weave and match it against the attack I remembered from the computer. If the directions match, the player blocked the punches; otherwise, whack.
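The matching step can be sketched in plain JavaScript. The skill’s actual helper is called `answersMoves` (shown later); the version below is a simplified stand-in that only compares directions, ignoring the punch and defence types:

```javascript
// Simplified stand-in for the skill's answersMoves helper: compare the
// directions of the computer's attack with the directions of the player's
// defence, position by position.
function directionsOf(sequence) {
  // "left jab, right uppercut" -> ["left", "right"]
  return sequence.toLowerCase().match(/left|right/g) || [];
}

function answersMoves(attack, block) {
  const attackDirs = directionsOf(attack); // e.g. ["left", "right"]
  const blockDirs = directionsOf(block);   // e.g. ["left", "right"]
  return (
    attackDirs.length === blockDirs.length &&
    attackDirs.every((dir, i) => dir === blockDirs[i])
  );
}

answersMoves("left jab, right uppercut", "block left, weave right"); // true: blocked
answersMoves("left jab, right uppercut", "block right, weave left"); // false: whack
```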


Alexa

In Alexa development, when I needed to remember something during the session, I used session attributes. For Boxing Legends:

  • Generate sessionAttributes variable:

const sessionAttributes = handlerInput.attributesManager.getSessionAttributes();

  • Generate custom code for the random attack sequence:

var attack = randomAttackGenerator("PRO");

e.g. Attack left jab, right uppercut

  • Assign the sessionAttributes to the desired name:

sessionAttributes.lastAttack = attack;

  • Save the session attribute within the intent:

handlerInput.attributesManager.setSessionAttributes(sessionAttributes);

The above code saves a parameter called “lastAttack” that persists to the player’s turn so when the player says “Block left, weave right” I’m able to determine whether the player blocked the attack correctly. Now in code:

  • Get the block from the block slot

var block = handlerInput.requestEnvelope.request.intent.slots.block.value;

  • Run some custom code to compare the computer’s attack to the player’s block

var counterSuccess = answersMoves(sessionAttributes.lastAttack, block);
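Putting those steps together, the round trip looks like the snippet below. The real `attributesManager` comes from the ASK SDK; here it is replaced with a minimal in-memory mock purely to show how a value saved on the computer’s turn survives to the player’s turn within the same session:

```javascript
// Minimal in-memory mock of the ASK SDK attributesManager, for
// illustration only: it shows the save/read pattern, not the real SDK.
function makeHandlerInput() {
  let store = {};
  return {
    attributesManager: {
      getSessionAttributes: () => store,
      setSessionAttributes: (attrs) => { store = attrs; },
    },
  };
}

const handlerInput = makeHandlerInput();

// Computer's turn: remember the attack in session attributes.
const sessionAttributes = handlerInput.attributesManager.getSessionAttributes();
sessionAttributes.lastAttack = "left jab, right uppercut";
handlerInput.attributesManager.setSessionAttributes(sessionAttributes);

// Player's turn (a later request in the same session): read it back.
const later = handlerInput.attributesManager.getSessionAttributes();
console.log(later.lastAttack); // "left jab, right uppercut"
```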

Bixby

In Bixby, there is no notion of session attributes, so you need to pass a concept (or parameter) to each action (or intent). This mimicked session attributes, as I would pass parameters such as the computer’s last attack and the computer and player points for each turn (when the computer attacked and when the player blocked or attacked). I passed a Match concept to keep track of what I wanted to remember. Below is a snippet of that structure concept, Match, with a Turn concept within it.

structure (Match) {
  description (A structure to manage a boxing match)
  property (turn) {
    description (Turn in the fight)
    type (Turn)
    min (Required) max (Many)
    visibility (Private)
  }
  property (playerPoints) {
    description (Player's score)
    type (core.Integer)
    min (Required) max (One)
    visibility (Private)
  }
  property (computerPoints) {
    description (Computer's score)
    type (core.Integer)
    min (Required) max (One)
    visibility (Private)
  }
  features {
    transient
  }
}

structure (Turn) {
  description (A turn in the fight (attack or block sequence))
  property (previousAttack) {
    description (Previous attack)
    type (core.Text)
    min (Optional) max (One)
    visibility (Private)
  }
}

How to handle persistent parameters

Alexa

When building a skill, it’s often useful to save parameters that extend beyond the current session. For example, I wanted to maintain the player’s win/loss record. Typically, an Alexa skill maintains this kind of information using persistent attributes.
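As a sketch of that persistent side: in a real skill the store below would be DynamoDB, accessed through the ASK SDK’s persistence adapter; the in-memory object here is an illustrative stand-in, and `recordResult` is a hypothetical helper, not part of the SDK:

```javascript
// Illustrative stand-in for persistent attributes: a store keyed by user
// id that survives across sessions (DynamoDB in a real skill).
const persistentStore = {};

async function recordResult(userId, playerWon) {
  // getPersistentAttributes equivalent: load or initialise the record.
  const attrs = persistentStore[userId] || { wins: 0, losses: 0 };
  if (playerWon) {
    attrs.wins += 1;
  } else {
    attrs.losses += 1;
  }
  // savePersistentAttributes equivalent: write the record back.
  persistentStore[userId] = attrs;
  return attrs;
}

// Two matches for the same user, possibly in different sessions:
recordResult("user-123", true)
  .then(() => recordResult("user-123", false))
  .then((record) => console.log(record)); // { wins: 1, losses: 1 }
```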

Bixby

For the sake of convenience, I used the sample provided in the Bixby GitHub repository under User Data Persistence. It has a working example of persisting data to restdb.io, which offers a free tier of up to 2,500 records.

If you have any follow up questions, please email dyung@fivefifteen.com.au or check out some of my Alexa skills here.
