Differences

This shows you the differences between two versions of the page.

voice_talent [2019/06/27 11:51]
crispin_reedy_yahoo.com
voice_talent [2019/08/08 10:34] (current)
lisa.illgen_concentrix.com
Line 29: Line 29:
 **// Tying voice talent selection to the desired system persona //**
  
-The user persona developed by the business (see page TODO PUT PAGE HERE) should drive the system persona designed for the voice application. This system persona should then drive the selection of voice talent.
+The user persona developed by the business should drive the system persona designed for the voice application. (For more on user and system personas, see [[persona_and_brand|Persona and Brand]].) This system persona should then drive the selection of voice talent. For example, consider an application developed by a cosmetics company to remind its independent beauty consultants of the birthdays and anniversaries of their colleagues. To resonate with these consultants, the brand wishes to convey a system persona that is upbeat, fun, and feminine. The
  
 +Vocal characteristics
 +
 +Register – High pitch or low pitch
 +
 +Soprano / alto / tenor / bass
 +
 +Openness, resonance, sonority
 +
 +Associated with depth, size, seriousness
 +
 +Chest-voiced quality
 +
 +“James Earl Jones”
 +
 +Closedness, nasality
 +
 +Associated with quickness, lightness
 +
 +May be irritating
 +
 +Head-voiced quality
 +
 +“Mr. Burns”
 +
 +
 +"Read" or delivery
 +How the talent reads the lines
 +Speed, pacing
 +Spaces
 +Intonation
 +Good talents can change a lot of their characteristics through different “reads”
  
  
-TODO - see section on selecting a system persona. 
  
 Voice characteristics - need an article on this?
Line 49: Line 79:
  
 **// Involve the right stakeholders in the decision //**\\
-Make sure the highest-level executive who cares about the IVR voice is engaged in the selection process. "Trust me—you do not want to be in a meeting where you’re presenting the working version of the application (including all professional recordings) to the senior vice-president in charge of customer care who, upon hearing the voice for the first time, says, 'I hate it. We need a different voice'" (Lewis, 2011, p. 103).
+Make sure the highest-level executive who cares about the IVR voice is engaged in the selection process. "Trust me—you do not want to be in a meeting where you’re presenting the working version of the application (including all professional recordings) to the senior vice-president in charge of customer care who, upon hearing the voice for the first time, says, 'I hate it. We need a different voice'" ([[references#lewis2011|Lewis, 2011]], p. 103).
  
 === Gender ===
 **// Do not overemphasize gender //**\\
-There is no compelling research to indicate an advantage based solely on the gender of the voice talent (Couper, Singer, & Tourangeau, 2004; Lewis, 2011). For average listeners in normal channels, “…there is little evidence to suggest that one sex of speaker is more intelligible than another, if other factors are ruled out. For example, males may typically have louder voices than females, and female voices may be more high-pitched than males, but if these factors are controlled for, any sex differences usually disappear” (Edworth & Hellier, 2005).
+There is no compelling research to indicate an advantage based solely on the gender of the voice talent ([[references#couper|Couper, Singer, & Tourangeau, 2004]]; [[references#lewis2011|Lewis, 2011]]). For average listeners in normal channels, “…there is little evidence to suggest that one sex of speaker is more intelligible than another, if other factors are ruled out. For example, males may typically have louder voices than females, and female voices may be more high-pitched than males, but if these factors are controlled for, any sex differences usually disappear” ([[references#edworthy|Edworthy & Hellier, 2005]]).
  
-There is a general tendency in the US to use a female voice for IVRs (likely due to their service-provider orientation -- for a historical perspective, see Yellin, 2009), but there are numerous examples of successful use of male voices in IVRs. Find out if your client cares and, if so, take that into account when selecting a voice or set of voices to review.
+There is a general tendency in the US to use a female voice for IVRs (likely due to their service-provider orientation -- for a historical perspective, see [[references#yellin|Yellin, 2009]]), but there are numerous examples of successful use of male voices in IVRs. Find out if your client cares and, if so, take that into account when selecting a voice or set of voices to review.
  
-There is no question that we all carry conscious and unconscious stereotypes in our heads. In recent years, the psychologist most strongly associated with research in how these stereotypes affect human-computer interaction is Clifford Nass (Nass & Brave, 2005; Nass & Yen, 2010; Reeves & Nass, 2003), most notably in the book, "Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship". In that book, Nass and Brave (2005) described experiments in which different types of people used speech applications (notably, with most of the experiments using TTS rather than professional voice talents for their audio). In most of the studies they replicated classic social psychology studies of interactions between humans, replacing one of the humans with a speech-enabled computer, a variation of the “computers as social actors” (CASA) paradigm.
+There is no question that we all carry conscious and unconscious stereotypes in our heads. In recent years, the psychologist most strongly associated with research in how these stereotypes affect human-computer interaction is Clifford Nass ([[references#nass2005|Nass & Brave, 2005]]; [[references#nass2010|Nass & Yen, 2010]]; [[references#reeves|Reeves & Nass, 2003]]), most notably in the book, "Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship". In that book, [[references#nass2005|Nass and Brave]] (2005) described experiments in which different types of people used speech applications (notably, with most of the experiments using TTS rather than professional voice talents for their audio). In most of the studies they replicated classic social psychology studies of interactions between humans, replacing one of the humans with a speech-enabled computer, a variation of the “computers as social actors” (CASA) paradigm.
  
 For example, they replicated the “similarity attraction” effect, the finding that people are attracted to other people who are similar to themselves. In these laboratory experiments, extroverts preferred an extroverted user interface and males preferred to hear a male voice. It turns out, however, that it is difficult to apply many of these findings to user interface design (e.g., how would you know in advance if a caller were male or female, introvert or extrovert?). Additionally, they reported that people tend to rate male voices as more trustworthy (especially male listeners), and to expect females to be more nurturing.
  
-Despite the reliability with which these social effects appear in replications of social psychology experiments, they are not as reliable when assessed in real-world systems that are otherwise usable, that is, efficient, effective, and pleasant (Balentine, 2007). Lewis (2011), in an analysis of data from studies of the perception of the quality of TTS voices (both male and female) rated by both males and females, did not find any significant Voice Gender by Listener Gender interaction, an interaction that the similarity attraction hypothesis would have predicted (and an effect replicated by Machado et al., 2012). Couper, Singer, and Tourangeau (2004) studied the influence of male and female artificial voices on more than 1000 respondents to an IVR survey on sensitive topics. They measured respondents’ reactions to the different voices and abandoned call rates, and found no statistically significant results related to the gender of the voices. In particular, there were no significant Voice Gender by Respondent Gender interactions.
+Despite the reliability with which these social effects appear in replications of social psychology experiments, they are not as reliable when assessed in real-world systems that are otherwise usable, that is, efficient, effective, and pleasant ([[references#balentine2007|Balentine, 2007]]). [[references#lewis2011|Lewis]] (2011), in an analysis of data from studies of the perception of the quality of TTS voices (both male and female) rated by both males and females, did not find any significant Voice Gender by Listener Gender interaction, an interaction that the similarity attraction hypothesis would have predicted (and a finding replicated by [[references#machado|Machado et al., 2012]]). [[references#couper|Couper, Singer, and Tourangeau]] (2004) studied the influence of male and female artificial voices on more than 1000 respondents to an IVR survey on sensitive topics. They measured respondents’ reactions to the different voices and abandoned call rates, and found no statistically significant results related to the gender of the voices. In particular, there were no significant Voice Gender by Respondent Gender interactions.
  
-“Why such strong effects of humanizing cues are produced in laboratory studies but not in the field is an issue for further investigation. … Across these studies, little evidence is found to support the ‘computers as social actors’ thesis, at least insofar as it is operationalized in a survey setting” (Couper et al., 2004, p. 567).
+“Why such strong effects of humanizing cues are produced in laboratory studies but not in the field is an issue for further investigation. … Across these studies, little evidence is found to support the ‘computers as social actors’ thesis, at least insofar as it is operationalized in a survey setting” ([[references#couper|Couper et al., 2004]], p. 567).
  
 === Coaching, inflection ===
Line 77: Line 107:
 For any given sentence or phrase, there are many ways to speak it, only one or a few of which will be appropriate in a given context. For example, what is the correct way to record the question (appearing in a list of frequently asked questions), “What happens after I apply for cash assistance?” Should the speaker emphasize “What,” “happens,” “after,” “apply,” or “cash assistance”?
  
-The answer depends on the question's context. If the surrounding items concern other aspects of applying for and getting cash assistance, then plan to emphasize “after,” contrasting it with the things that happen before applying. If the surrounding items have to do with other types of assistance such as food stamps or health benefits, then plan to emphasize “cash assistance.” It’s critical to get the prosodic element of contrastive stress correct (Cohen, Giangola, & Balogh, 2004; Lewis, 2011).
+The answer depends on the question's context. If the surrounding items concern other aspects of applying for and getting cash assistance, then plan to emphasize “after,” contrasting it with the things that happen before applying. If the surrounding items have to do with other types of assistance such as food stamps or health benefits, then plan to emphasize “cash assistance.” It’s critical to get the prosodic element of contrastive stress correct ([[references#cohen|Cohen, Giangola, & Balogh, 2004]]; [[references#lewis2011|Lewis, 2011]]).
  
 Usage notes are also helpful, especially when recording small pieces that will later be concatenated together. Knowing that something will be an element in a list or the last thing in a fill-in-the-blank sentence makes all the difference in the world in how it's recorded.
  
 These notes in the manifest are all the more important if the designer is not the coach. They will be invaluable to the coach and voice talent.
 +
 +For considerations when selecting a voice talent for a multilingual application, see [[multilingual_applications|Multilingual Applications]].
  
 [[References]]