VSTS Speech-to-Task

Recently I worked on a #OneWeek hackathon project that allows creating backlog items in Visual Studio Team Services (VSTS) using speech-to-text translation, and I wanted to share some thoughts and the development process.

During a sprint retrospective we noticed our team was getting randomized by requests in emails from partner teams, and I was trying to persuade people to be more diligent about using VSTS for work tracking, to help us stay focused and better measure our progress. A co-worker mentioned that for simple, small tasks, the act of having to open the VSTS UI, find the right view, and write up the task can be enough of a mental distraction that he would often rather just start the task and finish it. Reflecting on this, he said, “Wouldn’t it be nice if we could just say, ‘Alexa! Add this to the backlog!’” We paused and both immediately concluded this would be the perfect project for Microsoft’s upcoming OneWeek Hackathon.

Now that you have the background on the motivation: after 3 days of hard work and experimentation, I give you VSTS Speech-to-Task.

Online demo: https://vsts-speech-to-task.surge.sh
Code: https://github.com/mattmazzola/vsts-speech-to-task

Walkthrough video:

Overall, the project was a lot of fun.

The good parts:

  • Learning the new prototyping features in Figma
  • Learning the Cognitive Services APIs and the VSTS APIs
  • Getting to work on design and UX, which was a refreshing experience

The bad parts:

  • VSTS apparently uses custom authentication instead of simply being another resource ID you can acquire a token for using normal AAD (Azure Active Directory) authentication. You have to register a new application, set up permissions, learn the claims and URL patterns, etc. This took longer than expected.
  • Integrating 3rd-party code with Ember.js is still a pain. I attempted to create a shim addon to allow merging the microsoft-speech-browser-sdk package into Ember's vendor folder, and the dummy app worked fine; however, when used in the parent application it did not. I had to abandon the addon and manually include the scripts, which wasted a day.
    https://github.com/mattmazzola/ember-cli-microsoft-speech-shim
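To give a sense of what that custom authentication involves: rather than a standard AAD resource token, VSTS's OAuth flow exchanges an authorization code for an access token by POSTing a JWT-bearer assertion to its own token endpoint. A minimal sketch of that exchange is below; the endpoint and grant-type URNs follow the VSTS OAuth documentation, while the secret, code, and callback URL values are placeholders, and this is not the actual code from the app:

```typescript
// Sketch of the VSTS OAuth token exchange (the part that replaces plain AAD auth).
// Endpoint and grant-type URNs are from the VSTS OAuth docs; all argument
// values are placeholders.
const TOKEN_URL = "https://app.vssps.visualstudio.com/oauth2/token";

function buildTokenRequestBody(
  clientSecret: string,
  authCode: string,
  callbackUrl: string
): string {
  const params = new URLSearchParams();
  // The registered application's secret is sent as a JWT-bearer client assertion
  params.set("client_assertion_type", "urn:ietf:params:oauth:client-assertion-type:jwt-bearer");
  params.set("client_assertion", clientSecret);
  // The authorization code is sent as a JWT-bearer grant assertion
  params.set("grant_type", "urn:ietf:params:oauth:grant-type:jwt-bearer");
  params.set("assertion", authCode);
  params.set("redirect_uri", callbackUrl);
  return params.toString();
}

async function exchangeCodeForToken(
  clientSecret: string,
  authCode: string,
  callbackUrl: string
): Promise<unknown> {
  const response = await fetch(TOKEN_URL, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: buildTokenRequestBody(clientSecret, authCode, callbackUrl),
  });
  // On success the response body contains access_token, refresh_token, and expires_in
  return response.json();
}
```

None of this is discoverable from the usual AAD samples, which is why it ate so much of the hack time.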

Remaining:

  • Use the browser’s Local Storage for persisting the preferred account, project, work item type, etc.
  • Use a custom speech model to allow mapping user intents to entities.
  • Fix the authentication issues in the application: if the token expires during normal operation, automatically acquire a new one using the refresh token.
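The Local Storage item is straightforward; a sketch of what I have in mind is below. The preference shape and storage key are hypothetical (not from the actual app), and the storage handle is passed in so the same functions work against `window.localStorage` in the browser or a stub elsewhere:

```typescript
// Sketch of persisting the user's preferred account/project/work-item type.
// The TaskPreferences shape and STORAGE_KEY are hypothetical names.
interface TaskPreferences {
  account: string;
  project: string;
  workItemType: string;
}

// Minimal subset of the Web Storage API that we actually use
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "vsts-speech-to-task/preferences";

function savePreferences(storage: KeyValueStore, prefs: TaskPreferences): void {
  storage.setItem(STORAGE_KEY, JSON.stringify(prefs));
}

function loadPreferences(storage: KeyValueStore): TaskPreferences | null {
  const raw = storage.getItem(STORAGE_KEY);
  return raw ? (JSON.parse(raw) as TaskPreferences) : null;
}
```

In the app this would be called with `window.localStorage`, so returning users skip straight to dictating the task instead of re-picking an account and project.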
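For the custom speech model item, the idea is that instead of treating the whole utterance as a task title, a language-understanding model would return an intent plus entities, and the app would map those onto a work item. A sketch of that mapping is below; the LUIS-style result shape, the `AddBacklogItem` intent, and the entity type names are all assumptions for illustration, not the real model:

```typescript
// Sketch of mapping a recognized intent + entities onto a VSTS work item draft.
// The result shape mimics a LUIS-style response; intent and entity names
// ("AddBacklogItem", "taskTitle", "workItemType") are hypothetical.
interface RecognizedEntity {
  type: string;
  value: string;
}

interface RecognitionResult {
  intent: string;
  entities: RecognizedEntity[];
}

interface WorkItemDraft {
  type: string;
  title: string;
}

function toWorkItem(result: RecognitionResult): WorkItemDraft | null {
  // Only handle the backlog-item intent; ignore anything else
  if (result.intent !== "AddBacklogItem") return null;

  const title = result.entities.find((e) => e.type === "taskTitle")?.value;
  if (!title) return null; // nothing to create without a title

  // Default to "Task" when the user didn't say "bug", "user story", etc.
  const type = result.entities.find((e) => e.type === "workItemType")?.value ?? "Task";
  return { type, title };
}
```

With something like this, "add a bug: login page times out" could land as a Bug rather than a generic Task.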