==== Key Effectiveness Criteria ====
From a usability metrics perspective ([[references#sauro2012|Sauro & Lewis, 2012]]), the fundamental measurements at the task level are measures of effectiveness (e.g., successful task completion rate), efficiency (e.g., successful task completion time), and satisfaction (collected at the end of a task, at the end of a session, or both). [[references#bloom2005|Bloom et al.]] (2005) provided ten key criteria, based on classical usability metrics but focused on IVRs, for measuring the effectiveness of voice user interfaces.

**// Caller satisfaction //**\\
Assessed at a minimum with one or two five-point Likert items, such as, “I was satisfied with the automated portion of this call” and “I was satisfied with the agent during this call.” For more detailed evaluation of caller satisfaction, use psychometrically qualified instruments such as the 34-item Subjective Assessment of Speech System Interfaces (SASSI) ([[references#hone|Hone & Graham, 2000]]), the 11-item Pragmatic Rating Scale for Dialogues ([[references#polkosky2002|Polkosky, 2002]]), or the 25-item Framework of SUI Service Quality ([[references#polkosky2008|Polkosky, 2008]]).

**// Perceived ease of use //**\\
Assessed at a minimum with one five-point Likert item, such as a variant of the Single Ease Question ([[references#sauro2012|Sauro & Lewis, 2012]]) -- “The application was easy to use.” For more detailed evaluation of perceived ease of use, see the items for the User Goal Orientation and Customer Service Behavior factors of the Framework of SUI Service Quality ([[references#polkosky2008|Polkosky, 2008]]).

**// Perceived quality of output //**\\
Assessed at a minimum with two five-point Likert items (“The voice was understandable” and “The voice sounded good”). For more detailed evaluation of voice quality, see the five items for the Speech Characteristics factor of the Framework of SUI Service Quality ([[references#polkosky2008|Polkosky, 2008]]) or, for a multidimensional assessment, the 15-item MOS-X ([[references#polkosky2003|Polkosky & Lewis, 2003]]).

**// Perceived first-call resolution rate //**\\
Assessed at a minimum with a yes or no answer to the question, “Did you accomplish your goal?”

**// Time to task //**\\
The time that callers spend in the IVR before they can begin the desired task. Lower values of time-to-task lead to greater caller satisfaction. Items that increase time-to-task include lengthy up-front instructions, references to a Web site, and marketing messages (see [[What Not to Include at the Beginning]]).

**// Task completion rate //**\\
The rate at which callers actually accomplish tasks (an objective measure in contrast to the subjective measure of perceived first-call resolution rate).

**// Task completion time //**\\
The time required for callers to complete tasks. Generally, shorter task times are better for both the caller and for the service provider.

**// Correct transfer rate //**\\
Percentage of transferred calls that reach the correct agent.

**// Abandonment rate //**\\
Percentage of callers who hang up before completing a task. Ideally the design of the IVR and associated logging discriminates between expected (probably not a problem) and unexpected (probably a problem) disconnections (see [[Logging Strategy]]).

**// Containment rate //**\\
Percentage of calls not transferred to human agents. Although this is a common metric, it is deeply flawed. See the discussion about this in [[Logging Strategy]].
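
The objective criteria above (time to task, task completion rate and time, correct transfer rate, abandonment rate, and containment rate) can all be derived from call logs. The following is a minimal sketch in Python, not a reference implementation: the ''CallRecord'' fields, the ''summarize'' function, and the sample data are hypothetical names and values invented for illustration, and a real logging schema will differ (see [[Logging Strategy]]).

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class CallRecord:
    """One call, as a hypothetical logging schema might record it."""
    seconds_to_task_start: Optional[float]   # None if the caller never began a task
    task_completed: bool
    task_seconds: Optional[float]            # None unless the task was completed
    transferred: bool
    reached_correct_agent: Optional[bool]    # None if the call was not transferred
    hung_up_before_completion: bool


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a proportion, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def summarize(calls: List[CallRecord]) -> dict:
    """Compute the log-based effectiveness criteria for a batch of calls."""
    total = len(calls)
    transfers = [c for c in calls if c.transferred]
    times_to_task = [c.seconds_to_task_start for c in calls
                     if c.seconds_to_task_start is not None]
    task_times = [c.task_seconds for c in calls
                  if c.task_completed and c.task_seconds is not None]
    return {
        "task_completion_rate": rate(sum(c.task_completed for c in calls), total),
        # The denominator is transferred calls only, not all calls.
        "correct_transfer_rate": rate(
            sum(1 for c in transfers if c.reached_correct_agent), len(transfers)),
        # This sketch does not separate expected from unexpected hang-ups.
        "abandonment_rate": rate(
            sum(c.hung_up_before_completion for c in calls), total),
        # Calls not transferred to an agent, whatever their outcome.
        "containment_rate": rate(total - len(transfers), total),
        "median_time_to_task": median(times_to_task) if times_to_task else None,
        "median_task_completion_time": median(task_times) if task_times else None,
    }


# Invented sample data: one success, two transfers, one hang-up.
SAMPLE_CALLS = [
    CallRecord(12.0, True, 60.0, False, None, False),
    CallRecord(30.0, False, None, True, True, False),
    CallRecord(None, False, None, True, False, False),
    CallRecord(20.0, False, None, False, None, True),
]

if __name__ == "__main__":
    for name, value in summarize(SAMPLE_CALLS).items():
        print(f"{name}: {value}")
```

In this invented sample the call that ends in a hang-up still counts as contained, because containment only checks whether a call was transferred. That is one way the metric can look better than the caller's actual experience.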

==== References ====
Bloom, J., Gilbert, J. E., Houwing, T., Hura, S., Issar, S., Kaiser, L., et al. (2005). Ten criteria for measuring effective voice user interfaces. Speech Technology, 10(9), 31–35.

Hone, K. S., & Graham, R. (2000). Towards a tool for the subjective assessment of speech system interfaces (SASSI). Natural Language Engineering, 6(3–4), 287–303.

Polkosky, M. D. (2002). Initial psychometric evaluation of the Pragmatic Rating Scale for Dialogues (Tech. Rep. 29.3634). Boca Raton, FL: IBM.

Polkosky, M. D. (2008). Machines as mediators: The challenge of technology for interpersonal communication theory and research. In E. Konijn (Ed.), Mediated interpersonal communication (pp. 34–57). New York, NY: Routledge.

Polkosky, M. D., & Lewis, J. R. (2003). Expanding the MOS: Development and psychometric evaluation of the MOS-R and MOS-X. International Journal of Speech Technology, 6, 161–182.

Sauro, J., & Lewis, J. R. (2012). Quantifying the user experience: Practical statistics for user research. Burlington, MA: Morgan Kaufmann.