Differences
how_we_got_here_and_when [2019/07/12 11:37] admin |
how_we_got_here_and_when [2019/08/08 10:52] (current) lisa.illgen_concentrix.com |
||
---|---|---|---|
Line 4: | Line 4: | ||
AVIxD (The Association for Voice Interaction Design) has contemplated IVR (Interactive Voice Response) and VUI (voice user interface) design standards since the organization’s inception. Within the VUI community, there has been skepticism as to whether standards were even possible. A VUID’s (VUI designer) favorite answer has always been “it depends.” | AVIxD (The Association for Voice Interaction Design) has contemplated IVR (Interactive Voice Response) and VUI (voice user interface) design standards since the organization’s inception. Within the VUI community, there has been skepticism as to whether standards were even possible. A VUID’s (VUI designer) favorite answer has always been “it depends.” | ||
- | In 2011, a group of VUIDs discussing the topic again came to a realization. What was holding us back was thinking of standards as black and white. In his recent book, "The Voice in the Machine", Pieraccini (2012) pointed out that although the recent history of voice systems includes a proliferation of industrial standards such as VoiceXML, SRGS, SSML, CCXML and MRCP, it is unlikely that there will soon be rigidly enforced standards for VUI design in the form of a call-flow description language. | + | In 2011, a group of VUIDs discussing the topic again came to a realization. What was holding us back was thinking of standards as black and white. In his recent book, "The Voice in the Machine", [[references#pieraccini2012|Pieraccini]] (2012) pointed out that although the recent history of voice systems includes a proliferation of industrial standards such as VoiceXML, SRGS, SSML, CCXML and MRCP, it is unlikely that there will soon be rigidly enforced standards for VUI design in the form of a call-flow description language. |
- | "Although a call-flow description language would seem the perfect candidate for yet another standard, despite a number of attempts, no such standard has yet appeared, at least none as of this writing. And there's no industry-wide interest in creating one: a standard for an end product at the top of the speech industry food chain would fill no gap between levels of the chain. By contrast, a middle-of-the-chain standard like MRCP fills a gap between the VoiceXML browsers built by some companies and the speech recognition and text-to-speech engines built by others. Without it, vendors would have a hard time integrating different speech recognizers and text-to-speech engines into their platforms to offer their customers choice and flexibility. And companies that built speech recognition and text-to-speech engines would have a hard time selling their products to the many different platform vendors. ... Thus, there might not be a standard until the call flow becomes an intermediate representation between two levels of the speech industry and no longer a top-of-the-food-chain end product. That could happen if the flow of interaction of a dialog machine ever becomes an ingredient of a higher-level reasoning machine. But that's not the case yet" (Pieraccini, 2012, pp. 255-256). | + | "Although a call-flow description language would seem the perfect candidate for yet another standard, despite a number of attempts, no such standard has yet appeared, at least none as of this writing. And there's no industry-wide interest in creating one: a standard for an end product at the top of the speech industry food chain would fill no gap between levels of the chain. By contrast, a middle-of-the-chain standard like MRCP fills a gap between the VoiceXML browsers built by some companies and the speech recognition and text-to-speech engines built by others. 
Without it, vendors would have a hard time integrating different speech recognizers and text-to-speech engines into their platforms to offer their customers choice and flexibility. And companies that built speech recognition and text-to-speech engines would have a hard time selling their products to the many different platform vendors. ... Thus, there might not be a standard until the call flow becomes an intermediate representation between two levels of the speech industry and no longer a top-of-the-food-chain end product. That could happen if the flow of interaction of a dialog machine ever becomes an ingredient of a higher-level reasoning machine. But that's not the case yet" ([[references#pieraccini2012|Pieraccini, 2012]], pp. 255-256). |
**To Guidelines...**\\ | **To Guidelines...**\\ | ||
By changing the focus from standards to guidelines, the idea becomes more universally acceptable. Additionally, assigning a relative importance to each guideline and presenting the circumstances under which the recommendation changes gives the idea of guidelines even more credence. | By changing the focus from standards to guidelines, the idea becomes more universally acceptable. Additionally, assigning a relative importance to each guideline and presenting the circumstances under which the recommendation changes gives the idea of guidelines even more credence. | ||
- | Various people within the community have written books on VUI design (e.g., Balentine & Morgan, 2001; Cohen, Giangola, & Balogh, 2004; Lewis, 2011), but there hasn’t been an industry wide effort to define guidelines. A working group within AVIxD was established to do just this, and the result is this document. | + | Various people within the community have written books on VUI design (e.g., [[references#balentine2001|Balentine & Morgan, 2001]]; [[references#cohen|Cohen, Giangola, & Balogh, 2004]]; [[references#lewis2011|Lewis, 2011]]), but there hasn’t been an industry-wide effort to define guidelines. A working group within AVIxD was established to do just this, and the result is this document. |
**Audience**\\ | **Audience**\\ | ||
Line 17: | Line 17: | ||
**Research**\\ | **Research**\\ | ||
- | Wherever possible, the authors have shared research on the topics. Unfortunately, for many areas in VUI design rigorous research that’s applicable across systems is in short supply. As a result, this document will be a living and growing one, changing as research is done. In addition, certain recommendations may change over time as the technology changes. | + | Wherever possible, the authors have shared research on the topics. Unfortunately, for many areas in VUI design, rigorous research that’s applicable across systems is in short supply. As a result, this document will be a living and growing one, changing as research is done. In addition, certain recommendations may change over time as the technology changes. |
**Authors**\\ | **Authors**\\ | ||
- | And who are the authors? Why should you believe what you read here? Every member of the committee has been doing design for over 10 years. Many have written books or articles or white papers. All have presented at conferences. They come from a multitude of organizations and have worked in countless domains. If this group concurs, it's a safe bet that it's a good idea. And where we haven't concurred, we have put forth all sides. This is almost always in cases where observations have differed, and we don't exactly know why, so we have presented as much information as possible to help you make the decision that's right for you. | + | And who are the authors? Why should you believe what you read here? Every member of the committee has been doing design for over ten years. Many have written books or articles or white papers. All have presented at conferences. They come from a multitude of organizations and have worked in countless domains. If this group concurs, it's a safe bet that it's a good idea. And where we haven't concurred, we have put forth all sides. This is almost always in cases where observations have differed, and we don't exactly know why, so we have presented as much information as possible to help you make the decision that's right for you. |
See the contributors page for bio information on each author. | See the contributors page for bio information on each author. | ||
Line 32: | Line 32: | ||
Pieraccini, R. (2012). [[https://www.amazon.com/Voice-Machine-Building-Computers-Understand-ebook/dp/B00946TMSM/ref=sr_1_fkmr0_2?keywords=roberto+pierracini+voice+in+the+machine&qid=1556913124&s=gateway&sr=8-2-fkmr0|The voice in the machine: Building computers that understand speech.|]] Cambridge, MA: MIT Press. | Pieraccini, R. (2012). [[https://www.amazon.com/Voice-Machine-Building-Computers-Understand-ebook/dp/B00946TMSM/ref=sr_1_fkmr0_2?keywords=roberto+pierracini+voice+in+the+machine&qid=1556913124&s=gateway&sr=8-2-fkmr0|The voice in the machine: Building computers that understand speech.|]] Cambridge, MA: MIT Press. | ||
- | |||
- | [(This is a note.)] |