Introduction to testing your chatbot (with code examples)
You’ve made the most amazing chatbot ever. It’s getting great feedback, users are flocking to it, and then you get a bug report… It’s a simple fix, so you quickly fix the bug and deploy a new version of your bot. Now all of a sudden you get 10 bug reports about things not working. Uh oh, what happened?
Testing not only makes sure you write correct software, it also makes sure you don’t break previously correct software.
This is part 2 of our series From a Thought to Your First Chatbot. Part 1 is about writing your first bot; we’ll be building on top of that bot in this post.
This is pulled from the deck we presented on April 19, 2017 at our meetup in NYC.
Have any questions? Join our slack or email us
Setup
We’ll be using the alana bot platform to set up our bot and test it. The testing framework grew out of our previous project messenger-bot-tester and follows the same general test design. If you’re not using alana, you should be able to use that project to test Facebook messenger bots in a similar way.
Go to github.com/adamjuhasz/spacebot to grab the pre-made bot or use your bot from part 1.
What is testing?
- Unit testing: verify the functionality of a specific section of code (a script or a dialog)
- Integration testing: verify the interfaces between the bot and chat platform
- Regression testing: finding defects after a code change, especially degraded or lost features, including old bugs that have come back.
When to write tests
As often as you can and then write a few more tests…
- Every time you finish a module or a script, add a test while it’s fresh in your mind.
- Add tests for the hard parts of your bot that you struggled to develop.
- Practice TDD (test-driven development): write your tests first, then develop your bot until all your tests pass.
- Fix a bug? Add a test for the bug to make sure you don't bring the bug back.
How do we test?
Unit testing & Regression testing
Alana supports testing by mimicking interactions between the user and the bot. Each test is run as a new user.
At this time we don’t support testing separate functions outside of a chat flow. For alana.cloud, we’re working on integrating coverage reports so you know how much of your bot you’re testing.
Integration testing
If you’re using the alana bot platform, we’ve taken care of the most important elements of integrating with chat platforms like Facebook Messenger and Slack. It’s still important to check the general behavior of your bot, since some issues can still crop up. For example, are you sending 5 buttons when the platform only supports 3, so your buttons get truncated? We recommend going over your bot live on the platform just after you deploy. Sadly, it’s difficult to automate integration testing to a high degree, so you’ll probably have to do it manually.
How to design your test plan
- Focus on risk not coverage
- Write tests for parts of your bot you don’t use frequently, such as onboarding or error handling.
- Write a test every time you fix a bug; you would be surprised how many bugs come back again.
- Create multiple tests with slightly different wording or phrasing to make sure you’re testing your NLP engine. For example, if your test has the user saying “yes”, make two more tests saying “yep” and “ya” (see the sketch after this list). This way you won’t be surprised if your ML NLP model changes.
- Write tests for incorrect inputs, think like a hacker. How could you break the bot?
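As a rough sketch of the phrasing idea above, you might generate one test per wording variant. The flow below is hypothetical (a simple yes/no confirmation rather than spacebot itself), but it uses the same test calls explained in the rest of this post.
// Hypothetical confirmation flow: check that several phrasings of "yes"
// all land on the same reply. Each variant gets its own uniquely named test.
['yes', 'yep', 'ya'].forEach(function(phrase) {
  test('confirmation accepts "' + phrase + '"', function() {
    return newTest()
      .expectText('Do you want to continue?')   // hypothetical bot prompt
      .sendText(phrase)                         // the wording variant under test
      .expectText('Great, let\'s keep going!')  // hypothetical bot reply
      .run();
  });
});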
When to run tests
Run your unit tests constantly as you change your code. In order to better enforce this best practice, every time you save your code alana.cloud will run all your tests automatically.
If you’re not using our cloud platform and can’t run your tests constantly, then test at least right before you deploy your code. It’s also good to test after finishing a bigger module or after touching old code.
Structure of a test
Below you can see an example test for alana. Here we expect 2 messages from the bot as a greeting, mimic a user sending a response, and expect a certain response from the bot in return.
- Alana is an opinionated framework; each test needs to go into the test/ directory. Multiple tests can live in a single file, or you can put a single test in each of multiple files.
- Each test has a name representing the test. Be as descriptive as you can. Test names need to be unique bot-wide.
- Each test is run as a new user.
- Declare a test using test(...).
- Inside of a declared test(...) we create a new test using the newTest() global function. This test needs to be returned to the test framework.
- After creating the test, chain multiple configuration options onto the current test. In the example, we set checkForTrailingDialogs to true so that the test makes sure no extra messages are sent beyond what the test specified (see the sketch after the structure example below).
- Chain the test together by telling it what messages we expect from the bot and what messages to send in response.
- Call run() to compile the test and create a test promise the testing framework can run.
Every test will have the following structure
test('test-name', function() {
  return newTest()
    /* test code here */
    .run();
});
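As a rough sketch, a test that turns on the trailing-dialog check mentioned in the list above might look like this. We’re assuming checkForTrailingDialogs is chained like any other option and takes a boolean, and the expected text here is hypothetical; check the alana docs for the exact name and signature.
test('strict greeting', function() {
  return newTest()
    // Assumed option from the list above: fail if the bot sends any message beyond the ones we expect.
    .checkForTrailingDialogs(true)
    .expectText('Hello!') // hypothetical greeting; swap in your bot's actual messages
    .run();
});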
Crafting a simple test
When we designed spacebot, we created a greeting for it similar to the one below. We know that when a new user connects to the bot, we send the two phrases “Welcome to spacebot” and “I love space!” to the user first.
This means that we need to write a test that expects these two messages first.
test('greeting', function() {
  return newTest()
    .expectText('Welcome to spacebot')
    .expectText('I love space')
    .run();
});
To test this, open up your bot in alana.cloud, open main.js in the tests directory, and add the test above. Be sure to open the testing pane and then click save.
Notice that we get an error about the test failing… In the testing pane, the platform lists which testing file the failed test is in and how many tests failed in that file. We then get the failing test’s name, “greeting” in this example, and the reason the test failed.
The reason the test failed is that we have a typo in the test (we forgot the “!”). Let’s fix the test and run it again. Now the test will pass.
Testing input/output
Testing chat flows is the bedrock of validating your alana bots. In our spacebot, after going through the greeting, the bot asks the user some personal questions to fill out their profile.
Since each test is run as a new user, we’ll have to copy the greeting test before we can add code to test the profile:
test('profile', function() {
  return newTest()
    .expectText('Welcome to spacebot') // from the previous test
    .expectText('I love space!')       // from the previous test
    .expectText('What should I call you?')
    .sendText('bot tester')
    .expectButtons('Hello bot tester, what do you want to do?', [
      { type: 'postback', text: 'Photo of the day', payload: 'POTD' },
      { type: 'postback', text: 'Space trivia', payload: 'TRIVIA' },
    ])
    .run();
});
- Here we used sendText() to mimic the user typing something to the bot, their name in this example.
- After that the bot responds with the main menu, which has two buttons; this is all sent as a single “button message”. At this point we need to enter raw button objects like { type: 'postback', text: '', payload: '' }, but a macro is coming soon.
Auto-applying some flow
Since each new test is also a new user, it helps to be able to automate some portions of a test when the same steps need to happen in multiple tests.
Let’s script the greeting section by creating a function that automatically applies the greeting and profile steps to a test. The function will take the test variable and apply 5 steps to it.
function doGreeting(test) {
  test.expectText('Welcome to spacebot')
    .expectText('I love space!')
    .expectText('What should I call you?')
    .sendText('bot tester')
    .expectButtons('Hello bot tester, what do you want to do?', [
      { type: 'postback', text: 'Photo of the day', payload: 'POTD' },
      { type: 'postback', text: 'Space trivia', payload: 'TRIVIA' },
    ]);
}
Now let’s make a test and make sure that our function works.
test('auto-do', function() {
  const test = newTest();
  doGreeting(test);
  return test.run();
});
Now we can just call doGreeting() and it’ll handle the onboarding of users for us! This makes our tests much more readable.
Testing APIs
The alana bot platform makes it easy to connect to external APIs; request(...) is even built right in and exposed to all bots. Many APIs will return some type of random data, maybe a timestamp or a changing photo of the day. How do we test against these?
Do you control the API?
Send a specific testing flag to the external API and have it return static content. This is also a good place to turn off rate limiting if you’re testing often or have many tests.
- If you are using a “GET” request, you could add a flag to the query string: request({...qs: { test: true }...})
- If you’re sending a “POST” request, you could add a flag to the body: request({...body: { test: true }...}) to tell the server that this is running in a test (a fuller sketch follows after this list).
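Here’s a minimal sketch of the GET variant. The endpoint and the test flag name are made up for illustration; the request options mirror the ones used for NASA’s API later in this post.
// Hypothetical endpoint you control; the `test` flag tells your server to
// return static content and skip rate limiting while tests are running.
const apiPromise = request({
  uri: 'https://api.example.com/photo-of-the-day',
  method: 'GET',
  qs: {
    test: true,
  },
  json: true,
});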
Don’t control the API?
- Use a regular expression as the test case if you know the format of the response, for example a URL.
- Mock the response based on a platform switch.
If you mock the response, be aware that the API’s response could change and your test won’t fail. This is where integration testing is important.
We will be testing against the following chat script: when the user clicks the “Photo of the day” button, they should see a photo.
Using regular expressions
First let’s see how to use a regular expression to test the Photo of the day feature. We know that we will be getting a URL back from NASA’s API, so we just check that the “url” of the image sent by the bot is actually a URL. We’ll check if it starts with “https:”.
test('potd regex', function() {
  const test = newTest();
  doGreeting(test);
  test.sendButtonClick('POTD')
    .expectImage(/^https:/);
  return test.run();
});
Mocking
What if the API does not return something where we can just check its format? Or the bot echoes back something the API returns, and we have to make sure it is a specific text in a later test? (Such as a user’s name if we just send a userid.)
We can check the user’s platform to see if we are running tests. If we are, return a specific, known payload. Let’s replace the POTD script in the scripts directory with the following. Here we’ll make a variable to hold a promise. The promise will either be a promise that immediately resolves to a known payload if we are testing, or it will be a request promise from NASA’s API.
newScript('POTD')
  .dialog((session, response) => {
    let promise = null;
    if (session.user.platform === 'testing') {
      // Running inside the test framework: resolve immediately to a known, static payload
      console.log('sending mock');
      promise = Promise.resolve({
        url: 'https://image.jpg',
      });
    } else {
      // Running for real users: fetch the actual photo of the day from NASA's API
      promise = request({
        uri: 'https://api.nasa.gov/planetary/apod',
        method: 'GET',
        qs: {
          api_key: 'DEMO_KEY',
        },
        json: true,
      });
    }
    // Either way we end up with a promise that resolves to an object with a url property
    return promise.then(apod => {
      response.sendImage(apod.url);
    })
      .delay(3000);
  });
Notice that we check if session.user.platform is equal to “testing”; this will only be true if we are running tests. If it is, we set the promise variable to the mocked object instead of request’s promise. Let’s test this:
test('potd mock', function() {
  const test = newTest();
  doGreeting(test);
  test.sendButtonClick('POTD')
    .expectImage('https://image.jpg');
  return test.run();
});