Immediate explicit confirmation
This is the simplest approach to explicit confirmation – each time the caller provides an input, the system confirms it, for example:
The advantage of this one-at-a-time approach is that it is highly directive, and callers tend to have a very high success rate.
The main disadvantage is that the number of dialog turns makes the interaction feel sluggish (Boyce, 2008; Frankish & Noyes, 1990), and the more items there are to confirm, the more sluggish it feels.
On the other hand, an internal usability study conducted by Convergys showed that participants either preferred or were not troubled by the combination of step-by-step (immediate) confirmation and a final wrap-up confirmation while making a payment. It is uncertain how safely this finding extends to tasks that do not involve payments. There is little published research on this topic, and a number of variables might influence when immediate confirmation is preferable to batch confirmation: for example, how often callers perform the task (infrequent callers might prefer immediate confirmation), the specific task, or caller characteristics (younger callers might prefer batch confirmation). As described below, a key variable is the accuracy of the recognizer (Kotan & Lewis, 2006): if callers routinely need to correct misrecognitions, then much of the efficiency advantage of batch confirmation vanishes.
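The one-at-a-time flow described at the start of this section can be sketched as a simple loop. This is a minimal illustration, not a real speech platform API: the function name immediate_confirmation and the ask and confirm callbacks are hypothetical stand-ins for the recognizer and the yes/no confirmation prompt, and the caller's answers are scripted.

```python
from typing import Callable, Dict, List


def immediate_confirmation(
    fields: List[str],
    ask: Callable[[str], str],
    confirm: Callable[[str, str], bool],
) -> Dict[str, str]:
    """Collect each item and confirm it before moving on to the next one."""
    collected: Dict[str, str] = {}
    for field in fields:
        while True:
            value = ask(field)  # collection turn: the recognizer's result
            if confirm(field, value):  # confirmation turn: "You said X, right?"
                collected[field] = value
                break
            # The caller rejected the value, so ask for the same item again.
    return collected


# Scripted caller for illustration: the first amount is misrecognized.
answers = iter(["fifty dollars", "fifteen dollars", "March third"])
verdicts = iter([False, True, True])
result = immediate_confirmation(
    ["amount", "date"],
    ask=lambda field: next(answers),
    confirm=lambda field, value: next(verdicts),
)
print(result)
```

Every item costs at least two turns (collect, then confirm), which is the source of both the high success rate and the sluggish feel.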
Delayed (batch, group) explicit confirmation: A basic approach
The basic strategy for delayed confirmation is to collect all the data, then play all of it back in a single confirmation prompt. If the system recognized everything correctly, then that's it. Otherwise, the caller needs to correct any incorrect items, after which the system repeats the confirmation.
So, when there are no errors, confirmation takes just one step, no matter how many items there are.
If, however, there is a need to make multiple corrections, much of that advantage disappears.
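The basic delayed strategy can be sketched as follows. This is a minimal sketch under stated assumptions: the function and callback names (basic_delayed_confirmation, ask, confirm_all, which_item) are hypothetical, the caller corrects one item per cycle, and the full confirmation is replayed after every correction.

```python
from typing import Callable, Dict, List, Tuple


def basic_delayed_confirmation(
    fields: List[str],
    ask: Callable[[str], str],
    confirm_all: Callable[[Dict[str, str]], bool],
    which_item: Callable[[Dict[str, str]], str],
) -> Tuple[Dict[str, str], int]:
    """Collect everything, then confirm all items in one prompt.

    Returns the confirmed data and the number of full playbacks needed.
    """
    collected = {field: ask(field) for field in fields}  # collect all items first
    full_playbacks = 0
    while True:
        full_playbacks += 1  # play every item back in a single prompt
        if confirm_all(collected):
            return collected, full_playbacks
        wrong = which_item(collected)  # "Which item would you like to change?"
        collected[wrong] = ask(wrong)  # re-collect, then replay everything


# Scripted caller for illustration: the amount is misrecognized once.
answers = iter(["fifty dollars", "March third", "checking", "fifteen dollars"])
verdicts = iter([False, True])
data, playbacks = basic_delayed_confirmation(
    ["amount", "date", "account"],
    ask=lambda field: next(answers),
    confirm_all=lambda items: next(verdicts),
    which_item=lambda items: "amount",
)
print(data, playbacks)
```

With no errors the loop exits after one playback. Each correction adds another full playback, which is where the advantage starts to erode.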
Here's a sample dialog using this strategy in which there are no errors:
Here's a similar dialog, but with multiple errors:
The more items there are, the more cumbersome this can become. An exploratory, small-sample study conducted by Kotan and Lewis (2006) had three participants complete four bill-paying tasks that had different confirmation styles (immediate vs. simple delayed) and 0 or 2 recognition errors. Simple delayed confirmation was significantly faster when there were no errors, but significantly slower when there were two errors.
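To make the trade-off concrete, the following sketch counts confirmation prompts and item read-backs for the two strategies. It is an illustrative model only, not the timing measure used by Kotan and Lewis (2006): the function name confirmation_load is hypothetical, and the model assumes that every error is fixed on the first retry and that the basic delayed strategy replays every item after each single correction.

```python
from typing import Dict


def confirmation_load(num_items: int, num_errors: int) -> Dict[str, Dict[str, int]]:
    """Count confirmation prompts and items read back under a simplified model.

    Only confirmation is counted; collection turns and the
    "which item is wrong?" turn are left out.
    """
    immediate = {
        # One prompt per item, plus one more for each re-collected item.
        "prompts": num_items + num_errors,
        "items_read_back": num_items + num_errors,
    }
    basic_delayed = {
        # One full playback, plus one more after each correction.
        "prompts": 1 + num_errors,
        "items_read_back": num_items * (1 + num_errors),
    }
    return {"immediate": immediate, "basic_delayed": basic_delayed}


print(confirmation_load(4, 0))
print(confirmation_load(4, 2))
```

Under these assumptions, four items with no errors need four confirmation prompts with immediate confirmation but only one with basic delayed confirmation. With two errors, basic delayed confirmation still uses fewer prompts (three versus six) but reads back twelve items rather than six, because every playback repeats the items that were already correct.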
To streamline the basic approach to delayed confirmation, you can eliminate the repeated playback of the full confirmation message (Kotan & Lewis, 2006), for example:
This approach does not reduce the number of turns, but it avoids replaying information that was correct and that the caller has no intention of changing.
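One way to read the streamlined strategy is sketched below. It is an interpretation under stated assumptions rather than the exact design tested by Kotan and Lewis (2006): all names are hypothetical, each corrected item is confirmed on its own right after it is re-collected, and the system then asks whether anything else needs to change instead of replaying the full confirmation.

```python
from typing import Callable, Dict, List, Tuple


def streamlined_delayed_confirmation(
    fields: List[str],
    ask: Callable[[str], str],
    confirm_all: Callable[[Dict[str, str]], bool],
    which_item: Callable[[Dict[str, str]], str],
    confirm_item: Callable[[str, str], bool],
    anything_else: Callable[[], bool],
) -> Tuple[Dict[str, str], int]:
    """Play the full confirmation once, then confirm only corrected items."""
    collected = {field: ask(field) for field in fields}
    full_playbacks = 1  # the only time every item is read back
    if confirm_all(collected):
        return collected, full_playbacks
    while True:
        wrong = which_item(collected)  # "Which item would you like to change?"
        while True:
            value = ask(wrong)
            if confirm_item(wrong, value):  # confirm just the corrected item
                collected[wrong] = value
                break
        if not anything_else():  # "Is there anything else to change?"
            return collected, full_playbacks


# Scripted caller for illustration: the amount and the date both need fixing.
answers = iter(["fifty dollars", "March third", "checking",
                "fifteen dollars", "March thirteenth"])
items_to_fix = iter(["amount", "date"])
more_changes = iter([True, False])
data, playbacks = streamlined_delayed_confirmation(
    ["amount", "date", "account"],
    ask=lambda field: next(answers),
    confirm_all=lambda items: False,
    which_item=lambda items: next(items_to_fix),
    confirm_item=lambda field, value: True,
    anything_else=lambda: next(more_changes),
)
print(data, playbacks)
```

In this sketch the caller still takes a turn for every correction, but the account, which was right the first time, is read back only once.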
If it's a customer requirement to do a full confirmation before submitting the data for the transaction, then drop the immediate confirmation steps, for example:
Note that Kotan and Lewis (2006) also tested an alternative strategy in which they collected from participants a list of items to change before changing any of them (a method called “Batch Collection/Correction”). The failure of this method to conform to the probable user expectation of making a correction immediately after identifying an error apparently disrupted task performance, so we do not recommend (or illustrate) that variation – for details, see Kotan & Lewis (2006) or Lewis (2011).
Boyce, S. J. (2008). User interface design for natural language systems: From research to reality. In D. Gardner-Bonneau & H. E. Blanchard (Eds.), Human factors and voice interactive systems (2nd ed., pp. 43–80). New York, NY: Springer.
Frankish, C., & Noyes, J. (1990). Sources of human error in data entry tasks using speech input. Human Factors, 32(6), 697–716.
Kotan, C., & Lewis, J. R. (2006). Investigation of confirmation strategies for speech recognition applications. In Proceedings of the Human Factors and Ergonomics Society 50th annual meeting (pp. 728–732). Santa Monica, CA: Human Factors and Ergonomics Society.
Lewis, J. R. (2011). Practical speech user interface design. Boca Raton, FL: CRC Press, Taylor & Francis Group.