
Voice communication with IBM Workspace & Sia

Before an important presentation of Sia by our fantastic partner Oblong at IBM Connect17 in San Francisco, I received a call from Dr. Andre, one of the most awesome people I have had the pleasure to work with, with a short question: “Can we make Sia communicate with IBM Workspace using voice?” Well, sure we can; when do you need it? Somehow I already knew it was going to be a crunch, so I suggested delivery the following day. I was too optimistic: Andre wanted it in 2 hours 🙂

Challenge accepted. Here is how I did it in less than 2 hours (around 45 min).

The plan was pretty simple: use a service that converts speech into text, and run a script in the background that records voice, streams it to that service, and sends the resulting text to IBM Workspace. I immediately turned to the IBM Watson Speech to Text API: we had compared it with other solutions before and it performed very well. We also know Watson services intimately, so it was a natural choice.

Let’s hack! I’ll walk through the steps I took to make it work.

Creating a Python app

First, I created a Python app that would run in the background to record voice and then send it to Watson. The typical actions: create a project in your favorite IDE (PyCharm in my case) and a virtualenv for Python.

mkvirtualenv talk_with_workspace

If you are not familiar with virtualenv and its very helpful extension virtualenvwrapper, they are worth a look.

Look for libraries, solutions, services that might help

As usual, before writing any code I check whether there is a ready-made solution I can use. As expected, the Python community is one of the best in the world, and within a few minutes I found the SpeechRecognition library. Even better, the library could not only record voice but also send it to IBM Watson (I love Python!!!).

Installation is the usual routine for Python libraries. Activate your environment and run:

workon talk_with_workspace
pip install SpeechRecognition

The documentation is also well prepared, so getting started didn’t take long. My first few lines of code looked like this:

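import speech_recognition as sr
from config import IBM_USERNAME, IBM_PASSWORD

r = sr.Recognizer()


while True:
    # record from the default microphone until a pause in speech
    with sr.Microphone() as source:
        audio = r.listen(source)

    # send the recording to Watson Speech to Text and print the transcript
    text = r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD)
    print(text)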

Getting started with IBM Watson Speech To Text

Now I had to create a Speech to Text service in IBM Watson. Like all things in Bluemix, it required nothing more than a few clicks.

  • Log in to https://console.ng.bluemix.net/
  • Find Speech to Text in the Watson section of the catalog and click Create.
  • Get the credentials and copy them into our app.
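The credentials are best kept out of the source file. Here is a minimal sketch of one way to load them, assuming they are supplied as environment variables. Both that approach and the `load_settings` helper are assumptions for illustration; the setting names (`IBM_USERNAME`, `IBM_PASSWORD`) are the ones used by the code in this post.

```python
import os


def load_settings(names, environ=None):
    """Return a dict of required settings read from environment variables.

    `load_settings` is a hypothetical helper, not part of the original app.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        # Fail early with a clear message instead of an authentication error later.
        raise RuntimeError('Missing settings: ' + ', '.join(missing))
    return {name: environ[name] for name in names}
```

A `config.py` could then call `load_settings(['IBM_USERNAME', 'IBM_PASSWORD'])` once at import time and expose the values as module-level names.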

That should work

The app should detect when we start talking and when there is a pause in speech so we will get the transcript after each command. This is perfect for our use case as we can send this part of text to Workspace.
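To make the idea of pause detection concrete, here is a toy sketch of how a stream of per-frame loudness values can be cut into phrases. It is only an illustration of the principle, not the SpeechRecognition library's actual implementation; the function name and the frame-count parameter are made up for this example. In the library itself the corresponding knobs are the `energy_threshold` and `pause_threshold` attributes of `Recognizer`.

```python
def split_on_pauses(energies, energy_threshold=300, pause_frames=3):
    """Group frame indices into phrases separated by long-enough silences.

    energies: one loudness value per audio frame.
    A frame counts as speech when its energy exceeds energy_threshold;
    a phrase ends once pause_frames consecutive quiet frames follow it.
    Returns a list of (first_speech_frame, last_speech_frame) tuples.
    """
    phrases, current, quiet = [], [], 0
    for index, energy in enumerate(energies):
        if energy > energy_threshold:
            current.append(index)
            quiet = 0
        elif current:
            quiet += 1
            if quiet >= pause_frames:
                # the silence is long enough: close the current phrase
                phrases.append((current[0], current[-1]))
                current, quiet = [], 0
    if current:
        # the stream ended while a phrase was still open
        phrases.append((current[0], current[-1]))
    return phrases
```

Short gaps inside a sentence (fewer than `pause_frames` quiet frames) do not end the phrase, which is why each command arrives as one transcript.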

Send text to Workspace

The last thing we had to do was send the received text to a specific space within IBM Workspace. No worries, there were still 1.5 hours left before Andre would call me again; we could make it.

  • Go to https://developer.watsonwork.ibm.com
  • Click on Apps.
  • Click Create new app.
  • Give it a name and a short description.
  • Get the App ID and App Secret. It is very important to copy them now, because the App Secret will not be shown later.

Now that we have the required IDs and secrets, let’s write some Python code. All the information on how the Workspace API works can be found in the well-documented developer center. To communicate with the API I used the Python requests library. To install it, simply run:

pip install requests

A sample Python function to send a message to Workspace would look like this:

import json
import requests

import config


def send_message(text, space):
    """Send text to workspace"""

    # get access token
    response = requests.post(
        url='https://api.watsonwork.ibm.com/oauth/token',
        auth=(config.VOICE_APP_ID, config.VOICE_APP_SECRET),
        data={
            'grant_type': 'client_credentials'
        }
    )
    access_token = json.loads(response.content.decode('utf-8'))['access_token']

    # send message to workspace
    requests.post(
        'https://api.watsonwork.ibm.com/v1/spaces/{space}/messages'.format(space=space),
        data=json.dumps(
            {
                'version': 1.0,
                'type': 'appMessage',
                'annotations': [{'version': 1.0, 'type': 'generic', 'text': text}]
            }
        ),
        allow_redirects=False,
        headers={
            'jwt': access_token,
            'content-type': 'application/json'
        }
    )

That will allow us to communicate with Workspace.
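Note that `send_message` requests a fresh access token for every message. That is fine for a demo, but a longer-running app might want to reuse the token. A minimal sketch of such a cache follows, assuming the token response carries a lifetime in seconds (as the `expires_in` field does in standard OAuth 2.0 client-credentials responses; this has not been checked against the Workspace API here). `TokenCache` and its parameters are hypothetical names for illustration.

```python
import time


class TokenCache:
    """Cache an access token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.time, margin=60):
        # fetch_token() must return (access_token, lifetime_in_seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a cached token, fetching a new one when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch_token()
            self._expires_at = now + lifetime
        return self._token
```

The `fetch_token` callable would wrap the `requests.post` call to the token URL, and `send_message` would call `cache.get()` instead of requesting a token itself.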

Add the application to a space

One of the very last things to do was to add the application to the selected space. There is a very easy way to do it. Go into your app settings and choose “Create share link”. Grab that link and open it in your browser. You should see a list of the spaces you belong to, and a simple interface lets you add the newly created app to the selected spaces. We will also need a space ID to know where to send the message. Take a look at some screenshots of how you can do it.
[Screenshot 2017-02-24 at 18.32.50]

[Screenshot 2017-02-24 at 18.33.09]

Connect the pieces together

Now I only had to connect the pieces together and enjoy the final result.
Take a look at the video and code below:

import speech_recognition as sr
import json
import requests

import config

r = sr.Recognizer()


def send_message(text, space):
    """Send text to workspace"""

    # get access token
    response = requests.post(
        url='https://api.watsonwork.ibm.com/oauth/token',
        auth=(config.VOICE_APP_ID, config.VOICE_APP_SECRET),
        data={
            'grant_type': 'client_credentials'
        }
    )
    access_token = json.loads(response.content.decode('utf-8'))['access_token']

    # send message to workspace
    requests.post(
        'https://api.watsonwork.ibm.com/v1/spaces/{space}/messages'.format(space=space),
        data=json.dumps(
            {
                'version': 1.0,
                'type': 'appMessage',
                'annotations': [{'version': 1.0, 'type': 'generic', 'text': text}]
            }
        ),
        allow_redirects=False,
        headers={
            'jwt': access_token,
            'content-type': 'application/json'
        }
    )


def run():
    while True:
        with sr.Microphone() as source:
            audio = r.listen(source)

        text = r.recognize_ibm(audio, username=config.IBM_USERNAME, password=config.IBM_PASSWORD)
        send_message(text, config.WORKSPACE_SPACE_ID)


if __name__ == '__main__':
    run()
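For anything beyond a demo, the loop would probably need to survive phrases that cannot be transcribed. Below is a hedged sketch of one loop iteration with the three steps passed in as callables, so the logic can be tried without a microphone or network. `process_once` and its parameters are hypothetical names, not part of the original code.

```python
def process_once(listen, recognize, send, ignorable=(Exception,)):
    """Capture one phrase, transcribe it and forward it.

    Returns the text that was sent, or None when the phrase was skipped.
    """
    audio = listen()
    try:
        text = recognize(audio)
    except ignorable:
        # Unintelligible speech or a failed request: skip this phrase.
        return None
    if not text or not text.strip():
        # Nothing worth sending.
        return None
    send(text)
    return text
```

With SpeechRecognition, `ignorable` could be `(sr.UnknownValueError, sr.RequestError)`, the exceptions the library raises for unintelligible audio and failed API requests.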

Not bad for 45 minutes of work ;). This little hack proves the concept that in the very near future the work of programmers and developers will change a lot, and we will be able to build most things by adding and connecting blocks. This will make building awesome things faster and cheaper, and it is possible because of great open source communities and thanks to great partners like IBM and platforms like Bluemix and IBM Watson.

You can find full code here: https://github.com/Opentopic/ibmworkspace-voice

Cognitively, your IBM Champion Tomasz Roszko 🙂


Author: Tomasz Roszko

Tomasz leads Opentopic’s engineering team. He got his start as a web developer in 2007 working for “Netstation” and writing internet applications for one of the largest publishing companies in Poland. During this time he worked extensively with Django, Pylons, and the RED DOT CMS. In 2009 he started his business (TJ.Software), creating a team of passionate Python/Django developers. He is currently the leader of this development team, which is based in Bialystok, Poland, and focuses on Opentopic. Previously he has worked with many industry leaders such as RevSquare, NowWeComply, BlueIce, FlowLabs, and Mediapolis.

Since 2015 he has worked intensively with IBM Watson, Bluemix and Softlayer to deliver value for Opentopic clients as soon as possible.

In his (very rare) spare time Tomasz teaches at Bialystok Technical University in the area of web application development and Data Science (including Watson API training). He is married and a proud father of five.