Phonological Considerations
Make sure phrases don't sound similar to the recognizer
Similar-sounding phrases can confound grammars, irrespective of orthography. Even though a word or a phrase may look different when it is written on the page, if it sounds similar to other keywords, the grammar may have difficulty recognizing the phrase.
An example: Assume the caller has gone through the process for making a payment. Before we submit the payment, we ask the caller if he would like to submit, cancel, or change. If the caller says change, consider this prompt:
- You can change the amount or the account. Which would you like to change?
“Amount” and “account” differ by a single phoneme, so the pair has a high likelihood of false accepts. Sometimes you can easily come up with substitute words, and other times you have to get more creative. This is a case where it's hard to find a good synonym for either word, so you might end up with something like this:
- Which would you like to change? You can say “the account” or “how much.”
Or possibly switch to successive yes/no questions, especially if one option tends to be far more frequent than the other.
- System: Is it the amount you need to change?
- Caller: No
- System: OK, then the account?
- Caller: Yes
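One way to catch pairs like “amount” and “account” before they reach production is to compare the phoneme sequences of every pair of phrases in a grammar. The following is a minimal sketch, not a definitive tool: the `PRONUNCIATIONS` table is a hand-written, ARPAbet-style assumption (stress marks omitted), and the function names and the `max_distance` threshold are hypothetical. In practice the transcriptions would come from a pronunciation lexicon.

```python
from itertools import combinations

# Hypothetical hand-written ARPAbet-style transcriptions (stress omitted).
PRONUNCIATIONS = {
    "amount": ["AH", "M", "AW", "N", "T"],
    "account": ["AH", "K", "AW", "N", "T"],
    "how much": ["HH", "AW", "M", "AH", "CH"],
}


def phoneme_distance(a, b):
    """Levenshtein distance computed over phoneme lists, not letters."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def confusable_pairs(phrases, max_distance=1):
    """Return (phrase, phrase, distance) for pairs within max_distance."""
    pairs = []
    for p, q in combinations(phrases, 2):
        d = phoneme_distance(PRONUNCIATIONS[p], PRONUNCIATIONS[q])
        if d <= max_distance:
            pairs.append((p, q, d))
    return pairs
```

With these transcriptions, `confusable_pairs(["amount", "account"])` reports the pair at a distance of 1, while `confusable_pairs(["account", "how much"])` reports nothing.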
Avoid repeating the same word as part of multiple menu options
- You can say, “cancellation information,” “policy information,” or “transfer information.”
Having the word “information” in each option makes them all phonetically similar. A better way to write this prompt is: You can say, “Cancel,” “Policy information,” or “Transfer.”
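A repeated word like “information” is easy to spot mechanically. The sketch below assumes menu options are plain strings split on whitespace; the function name `repeated_words` is hypothetical and the check is deliberately simple.

```python
from collections import Counter


def repeated_words(options):
    """Return words that appear in more than one menu option."""
    counts = Counter()
    for option in options:
        # Use a set so a word is counted once per option.
        counts.update(set(option.lower().split()))
    return sorted(word for word, n in counts.items() if n > 1)


before = ["cancellation information", "policy information", "transfer information"]
after = ["Cancel", "Policy information", "Transfer"]
```

Here `repeated_words(before)` returns `["information"]`, and `repeated_words(after)` returns an empty list.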
Avoid alpha input
The classic example of acoustic confusability is provided by English letter recognition (spelling). It poses a difficult recognition task due to the acoustic similarity of the letter and number “names,” e.g., “three,” “zee,” “cee,” “dee,” “tee,” etc. It is especially challenging because the main difference in the phonemes occurs at the start of the utterance, which is exactly the point at which the recognizer faces its most difficult acoustic challenges, namely endpointing and calibration.
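To illustrate why these names collide, the sketch below groups a few letter and digit names by their final phoneme. The `LETTER_NAMES` table is a small hand-written assumption using ARPAbet-style symbols and American English names (“zee”), and `group_by_final_phoneme` is a hypothetical helper, not part of any recognizer.

```python
from collections import defaultdict

# Hypothetical transcriptions of a few letter and digit names.
LETTER_NAMES = {
    "3": ["TH", "R", "IY"],
    "Z": ["Z", "IY"],
    "C": ["S", "IY"],
    "D": ["D", "IY"],
    "T": ["T", "IY"],
    "M": ["EH", "M"],
    "X": ["EH", "K", "S"],
}


def group_by_final_phoneme(names):
    """Group symbols whose spoken names end in the same phoneme."""
    groups = defaultdict(list)
    for symbol, phonemes in names.items():
        groups[phonemes[-1]].append(symbol)
    return dict(groups)
```

Calling `group_by_final_phoneme(LETTER_NAMES)` puts “3,” “Z,” “C,” “D,” and “T” in one group under `"IY"`: they share an ending and differ only in their opening sounds.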
If you must have alpha input, include a filler for the caller to say to prime the recognizer
A useful workaround is to ask the caller to say a filler phrase before spelling a word, e.g., “My account number is… B C D” etc. (There are many other ways to enhance alphanumeric recognition. See also: Chapter 8 - Grammars)
Consider conflicts between local dialog states and universal grammars
It's easy to forget to think about any universal phrases when looking for phonological conflicts. Don't look at just the phrases in the local dialog states, but remember to look at everything that is available at any given point.
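A conflict check is more reliable when it runs over everything active in a state, with local phrases compared against the universals. The sketch below is built on assumptions: the phrase lists and `PHONEMES` transcriptions are hypothetical, and `difflib.SequenceMatcher` from the Python standard library is used as a crude stand-in for a real acoustic similarity measure. The 0.7 threshold is arbitrary.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical ARPAbet-style transcriptions (stress omitted).
PHONEMES = {
    "health": ["HH", "EH", "L", "TH"],
    "dental": ["D", "EH", "N", "T", "AH", "L"],
    "vision": ["V", "IH", "ZH", "AH", "N"],
    "help": ["HH", "EH", "L", "P"],
    "operator": ["AA", "P", "ER", "EY", "T", "ER"],
    "main menu": ["M", "EY", "N", "M", "EH", "N", "Y", "UW"],
}

LOCAL_PHRASES = ["health", "dental", "vision"]       # this dialog state
UNIVERSAL_PHRASES = ["help", "operator", "main menu"]  # active everywhere


def cross_conflicts(local, universal, threshold=0.7):
    """Return (local, universal) pairs whose phoneme lists look alike."""
    conflicts = []
    for loc, uni in product(local, universal):
        ratio = SequenceMatcher(None, PHONEMES[loc], PHONEMES[uni]).ratio()
        if ratio >= threshold:
            conflicts.append((loc, uni))
    return conflicts
```

With this data, `cross_conflicts(LOCAL_PHRASES, UNIVERSAL_PHRASES)` flags “health” against the universal “help,” a clash that would be missed by reviewing the local menu alone.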
Avoid super short, low energy words
Some phonemes have higher energy than others, meaning they are easier for a recognizer (human or computer) to pick up on. And super short utterances (a single syllable) don't give the recognizer a lot of phonemes to work with. Combining the two exacerbates recognition problems. The most common failure in these situations is that something completely different, even noise or side speech, triggers a false accept of these phrases.
One of the most problematic words in this category is “help.” It has few phonemes, and they aren't high energy. Anyone who has transcribed many calls where “help” is an accepted phrase can tell you that, as often as not, the caller said something else when it was recognized.
Another challenge is actually “no.” However, there's not much you can do about a yes/no prompt. Your best bet is DTMF backup in the reprompt.
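Short phrases such as “help” and “no” can at least be flagged for review automatically. The sketch below counts syllables as the number of vowel phonemes in a hand-written, ARPAbet-style transcription. The `PHRASES` table, the function names, and the one-syllable cutoff are assumptions, and the check does not model phoneme energy at all.

```python
# ARPAbet vowel symbols (stress marks omitted).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}

# Hypothetical transcriptions for phrases active in a grammar.
PHRASES = {
    "help": ["HH", "EH", "L", "P"],
    "no": ["N", "OW"],
    "main menu": ["M", "EY", "N", "M", "EH", "N", "Y", "UW"],
    "operator": ["AA", "P", "ER", "EY", "T", "ER"],
}


def syllable_count(phonemes):
    """Approximate syllables as the number of vowel phonemes."""
    return sum(1 for p in phonemes if p in VOWELS)


def short_phrases(phrases, max_syllables=1):
    """Return phrases at or below the syllable cutoff, for manual review."""
    return [name for name, phonemes in phrases.items()
            if syllable_count(phonemes) <= max_syllables]
```

For this table, `short_phrases(PHRASES)` returns `["help", "no"]`. A flagged phrase is a prompt to look at the transcriptions from real calls; it is not proof of a problem.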