IBM Watson offers several artificial intelligence services that run models on IBM's infrastructure and can be consumed through an API. Unfortunately, if you want to use more than one service, you have to implement the integration between them yourself.
In my case, I was interested in developing a voice-enabled chatbot: an integration that makes the interaction between human and bot faster, since the human doesn't have to type, and that promotes a more natural interaction driven by voice alone. The expected experience is similar to what users get from platforms such as Google Home or Alexa; in this case, however, the assistant/bot is controlled by us and serves whatever service or guidance we program into it.
This particular bot will help you set an appointment. For this purpose, we need to consume three different services:
Speech to Text: transforms the user's voice into written text.
Chatbot / Assistant: provides answers or responses based on the context and the user's requirements.
Text to Speech: transforms the text provided by the chatbot into a voice. This voice can be changed, since it is generated by AI algorithms.
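The wiring between the three services can be sketched as a simple pipeline: audio goes in, a transcript is produced, the assistant replies, and the reply is rendered back to audio. The sketch below uses hypothetical local stubs (`transcribe`, `converse`, `synthesize`) in place of the real Watson API calls, just to show how one conversational turn composes; in a real integration each stub would be replaced by a call to the corresponding Watson service.

```python
# Minimal sketch of one voice-bot turn. The three functions below are
# hypothetical stand-ins for the IBM Watson services, not real API calls.

def transcribe(audio_bytes):
    # Stand-in for Speech to Text: audio in, transcript out.
    return "I would like to book an appointment"

def converse(user_text):
    # Stand-in for the Assistant: user text in, bot reply out.
    if "appointment" in user_text.lower():
        return "Sure, what day works best for you?"
    return "Sorry, I didn't catch that."

def synthesize(bot_text):
    # Stand-in for Text to Speech: text in, audio bytes out.
    return bot_text.encode("utf-8")

def voice_turn(audio_bytes):
    """One conversational turn: voice in, voice out."""
    user_text = transcribe(audio_bytes)
    bot_text = converse(user_text)
    return synthesize(bot_text)

reply_audio = voice_turn(b"\x00\x01")  # fake audio input for the demo
print(reply_audio.decode("utf-8"))
```

Keeping each service behind its own function also makes it easy to swap a provider or mock a stage during testing, since the pipeline only depends on the three call signatures.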
This is the final result.
We can change many features of this implementation: we can provide different services and conversations based on location, time, day, user, and, most importantly in my opinion, context. Context allows you to have a more natural conversation in which the chatbot keeps track of variables that are critical to the service being provided. For example, the chatbot keeps track of the user's name and preferences, and if the user has mentioned specific requirements, the bot can bring them back as a reminder when completing the final request.
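The idea of context can be illustrated with a toy example (this is not the Watson Assistant API, just a hypothetical sketch): the bot accumulates variables such as the user's name and requirements across turns, then reads them back when confirming the final request.

```python
# Toy illustration of context tracking across conversational turns.
# All names and rules here are made up for the example.

context = {}

def handle_turn(user_text):
    text = user_text.lower()
    if text.startswith("my name is "):
        context["name"] = user_text[11:].strip()
        return f"Nice to meet you, {context['name']}!"
    if text.startswith("i prefer "):
        # Remember each stated requirement for later.
        context.setdefault("requirements", []).append(user_text[9:].strip())
        return "Noted."
    if "confirm" in text:
        # Bring the stored variables back as a reminder.
        reqs = ", ".join(context.get("requirements", [])) or "none"
        return (f"{context.get('name', 'Friend')}, booking your appointment "
                f"with these requirements: {reqs}. Shall I proceed?")
    return "How can I help?"

print(handle_turn("My name is Ana"))
print(handle_turn("I prefer morning slots"))
print(handle_turn("Please confirm my appointment"))
```

In a real Watson Assistant dialog, this bookkeeping is done through context variables attached to the conversation session rather than a module-level dictionary.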