Perlbox Voice Application Framework

Using the Perlbox voice libraries, Perlbox Voice for Tk provides a transparent interface to several open source speech systems. The goal of this project is to provide an easy-to-use, easy-to-configure application that connects spoken human words to computer commands, and computer responses to spoken human words. Perlbox-Voice lets the user easily configure vocabularies that consist of "commands" and "responses". The idea is that when the human says a "command" (such as "web browser"), the computer will execute the "response" (such as "mozilla").

Sphinx-2 Listening Agent

The Sphinx-2 listening agent was created by the Sphinx Group at Carnegie Mellon University and released under an open source license. Its aims are to stimulate the creation of speech-using tools and applications and to advance the state of the art, both directly in speech recognition and in related areas such as dialog systems and speech synthesis. The source code was released publicly to encourage the development of speech tools in common computing environments.

Festival Talking Agent

The Festival speech synthesis system is a general multi-lingual speech synthesis system. Festival was created by the Centre for Speech Technology Research at the University of Edinburgh and offers a full text-to-speech system with various APIs, as well as an environment for development and research of speech synthesis techniques.

How To Run Perlbox-Voice for Tk

Figure 0: The Splash Screen.

Step 1: Start the System
When Perlbox-Voice begins, the listening agent is not running. To start the listener, follow the two steps below.

-Click the icon labeled "Control" (see Figure 1)
-Click the button labeled "Start Listener"

To stop the listening agent, click the button labeled "Stop Listener".

Figure 1: The "Control" Pane.

The Listener takes a bit of time to wake; you will be informed when it is up and ready to listen. At that point you can begin issuing commands for your computer to execute.

If you want the speaker to say something to you, enter some text into the text box and click the button labeled "Speak this Text".

Step 2: The Computer's Vocabulary.

One of the major features of the Perlbox Voice Application Framework is that it simplifies creating new language models (vocabularies) for the Sphinx-2 listener. This section describes how you can easily create new vocabularies for use in Perlbox Voice for Tk.

Figure 2: The "Vocabulary" Pane.

The term vocabulary refers to the list of words and phrases that Perlbox-Voice understands, as well as the actions that it is to take upon hearing them. For the purpose of this document, the term speech will refer to what you say, and the term command will refer to the action that the computer should take upon hearing that speech.

A speech can be any combination of the 127,000 words listed in the Perlbox-Voice modification of the CMU Pronouncing Dictionary. This dictionary contains most nouns, verbs, adjectives and adverbs in the English language, as well as many common proper nouns and common words from non-English languages.
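
As a rough illustration of the rule above, the Python sketch below checks whether every word of a speech appears in a word list. The tiny KNOWN_WORDS set and the unknown_words helper are hypothetical stand-ins invented for this example; the real dictionary is far larger, and this is not how Perlbox-Voice itself performs the lookup.

```python
# Hypothetical stand-in for the pronouncing dictionary (the real one is far larger).
KNOWN_WORDS = {"web", "browser", "good", "morning", "night"}


def unknown_words(speech, known_words=KNOWN_WORDS):
    """Return the words of `speech` that are missing from `known_words`."""
    return [word for word in speech.lower().split() if word not in known_words]


print(unknown_words("web browser"))   # prints []
print(unknown_words("Good mornin"))   # prints ['mornin']
```

A speech whose list of unknown words is empty could be used as it stands; any word that is reported would have to be replaced with one the dictionary contains.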

A command is any instruction that your computer understands how to execute. Typically, this would be an application. The Perlbox-Voice application framework provides additional functionality in this area through pseudo commands, which are not passed on to the operating system but are performed by the Perlbox-Voice framework itself. These pseudo commands are listed below.

say Some text - this command is not passed on to the operating system; instead, "Some text" is passed to the talker. Figure 2 contains two examples of this: good morning and good night.
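
The distinction between ordinary commands and the say pseudo command can be pictured with a small Python sketch. The plan_action function is hypothetical and only classifies a command without running anything. It also assumes that say is matched case-sensitively at the start of the command and that other commands are split into arguments shell-style, neither of which this document confirms.

```python
import shlex


def plan_action(command):
    """Classify a vocabulary command (hypothetical helper, executes nothing).

    Returns ("talk", text) for the `say` pseudo command and
    ("run", argv) for anything that would go to the operating system.
    """
    stripped = command.strip()
    if stripped == "say" or stripped.startswith("say "):
        # Everything after the keyword is text for the talker.
        return ("talk", stripped[3:].strip())
    # Assumption: ordinary commands are split into arguments shell-style.
    return ("run", shlex.split(stripped))


print(plan_action("say Good morning to you"))  # ('talk', 'Good morning to you')
print(plan_action("mozilla"))                  # ('run', ['mozilla'])
```

Checking for "say " with a trailing space keeps a program whose name merely begins with those letters from being mistaken for the pseudo command.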

Reloading the Current Vocabulary
If you wish to reset the vocabulary (perhaps you deleted some fields that you did not intend to), you can reload the current vocabulary by clicking the button labeled "Reset Fields". This restores the table only as far back as the last time you entered this pane or selected "Apply Changes".

Adding an Entry
To add an entry, type the speech that you will say into the text box labeled "When You Say". Then type the command that you want the computer to execute into the text box labeled "Computer Does". If you are satisfied, click the button labeled "Add Entry". The new entry should appear in the table above.

Deleting an Entry
If you want to remove an entry from the table, highlight the entry by clicking on it, then click the button labeled "Delete Entry".

Creating Your New Vocabulary
Make sure that the table is correct. No attempt will be made by Perlbox-Voice to verify the correctness of the commands in your new vocabulary. When you are satisfied, simply click the button labeled "Apply Changes". A new vocabulary will be created and made ready for use.
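
The way "Add Entry", "Delete Entry", "Reset Fields" and "Apply Changes" interact can be modeled with a short Python sketch. The VocabularyEditor class and its method names are hypothetical and exist only for this illustration. The sketch assumes the pane keeps a working copy of the table plus a snapshot taken when the pane is opened or the changes are applied, and it does not build a language model.

```python
class VocabularyEditor:
    """Hypothetical model of the table shown in the Vocabulary pane."""

    def __init__(self, entries=None):
        # Snapshot taken when the pane is entered or changes are applied.
        self._saved = dict(entries or {})
        # Working copy that the user edits.
        self.table = dict(self._saved)

    def add_entry(self, speech, command):
        """Mirror "Add Entry": map a speech to a command."""
        self.table[speech] = command

    def delete_entry(self, speech):
        """Mirror "Delete Entry": drop the highlighted speech if present."""
        self.table.pop(speech, None)

    def reset_fields(self):
        """Mirror "Reset Fields": go back to the last snapshot."""
        self.table = dict(self._saved)

    def apply_changes(self):
        """Mirror "Apply Changes": the working copy becomes the new snapshot."""
        self._saved = dict(self.table)
        return dict(self._saved)


editor = VocabularyEditor({"web browser": "mozilla"})
editor.delete_entry("web browser")   # an accidental deletion
editor.reset_fields()                # undo it
print(editor.table)                  # {'web browser': 'mozilla'}
```

Because nothing in the sketch validates the commands, it reflects the warning above: whatever is in the table when the changes are applied becomes the vocabulary.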

Step 3: The Configuration Tab.

The configuration tab currently offers only a few options. We will discuss each of them below.

Figure 3: The "Configuration" Pane.

Sound Response
Sound response refers to the Festival speech synthesiser. For the purpose of this document, we will refer to this agent as "the talker". The talker will inform you of various pieces of information, pertaining primarily to the state and actions of the Listener and of Perlbox-Voice in general. With the slider at the top of this pane, you can set the verbosity level: the higher it is, the more the talker will chatter to you. After you have had a look around the application, 5 is probably a good setting. Be careful, however, if you are running other sound applications and your sound card does not support duplexing modes: the talker will simply wait until the sound card is free and then say everything that it had to say in one long stream.
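
One way to picture the verbosity slider is as a threshold on the messages the talker is willing to speak. The Python sketch below is only an assumption about how such a filter might work. The message texts, their levels and the messages_to_speak helper are all hypothetical and are not taken from Perlbox-Voice.

```python
def messages_to_speak(messages, verbosity):
    """Keep the messages whose required level does not exceed `verbosity`."""
    return [text for level, text in messages if level <= verbosity]


# Hypothetical messages, each tagged with the lowest verbosity at which it is spoken.
MESSAGES = [
    (1, "Listener is ready."),
    (5, "Vocabulary applied."),
    (9, "Heard: web browser."),
]

print(messages_to_speak(MESSAGES, 5))  # ['Listener is ready.', 'Vocabulary applied.']
```

Under this model, raising the slider only ever adds messages, which matches the description that a higher setting makes the talker chattier.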

Browser for Help Documents
Perlbox Voice for Tk uses HTML for its help documentation, such as this document. This field lets you change the default browser used to open help documentation. Common values for this field include: mozilla, netscape, galeon, firebird, epiphany, konqueror and opera.

Desktop Plugin
Perlbox Voice provides desktop plugins, which let you control your favorite desktop directly through voice commands. By default, no desktop plugin is enabled, and only one desktop plugin can be enabled at a time. To enable a plugin, click on the option menu under the text "Desktop Plugins", select the plugin you want to load, and click the button labeled "Apply Changes". A new language model will be created and the listener will be restarted. For more information on how to use your favorite desktop plugin, click on the link below.

Available Desktop Plugins:
KDE Desktop

Figure 4: The "Help" Pane.

Use this pane to obtain help and information about Perlbox and Perlbox Voice for Tk.

This concludes the Perlbox Voice for Tk tutorial. We hope that you have found this document useful. Please feel free to contact us with any questions at me@perlbox.org. Updated January 2004, Shane C. Mason.

Perlbox Voice Homepage