Differences

This shows you the differences between two versions of the page.

voice_talent [2019/06/25 18:59]
crispin_reedy_yahoo.com
voice_talent [2019/08/08 10:34] (current)
lisa.illgen_concentrix.com
Line 1: Line 1:
-{{tag>DraftIncomplete}}
+{{tag>Editing}}
 ==== Voice Talent ====
 Note: This page assumes you have chosen to use professionally recorded voice talent. If you are using Text-to-Speech, [[tts|see that page]] for more information. For more on choosing between recordings and Text-to-Speech, see the page [[Recordings vs. Text to Speech]].
Line 16: Line 16:
 Recently, talent warehouse websites have arisen on the internet; a prominent example is The Voice Realm. These talent warehouse sites should not be mistaken for full-service voice talent agencies. Talent warehouses are simply aggregators. They allow independent talents to list themselves on the website. Using a talent warehouse, you can put out a "casting call" to a wide variety of independent talents who will then record samples of your audio. These talent warehouses are usually fairly inexpensive. However, when you are using one, you are simply using the aggregator's website to find and pay an independent talent who is often running their own studio and doing their own audio engineering. Therefore, even when you are using the same aggregator website, the quality, skill and technical ability will vary widely depending on the individual talent you are working with.
  
-Your choice of audio talent should be driven by your business needs and level of expertise you have available. If you are embarking on a large project for an important customer, an experienced voice talent agency will be a reliable partner who can consistently deliver high quality audio, saving you project time. For a small demo or a concept clip, an independent talent or warehouse could be feasible, but you should expect to spend more time engaging with the talent/engineer and QAing the audio.
+Your choice of which type of talent studio to work with should be driven by your business needs and the level of expertise you have available. If you are embarking on a large project for an important customer, an experienced voice talent agency will be a reliable partner who can consistently deliver high-quality audio, saving you project time. For a small demo or a concept clip, an independent talent or warehouse could be feasible, but you should expect to spend more time engaging with the talent/engineer and QAing the audio.
  
  
 **// Maintain consistency across the brand //**
  
-If a voice talent already represents the brand in other media, you can consider using that talent for other audio. However, this is generally not practical if current branding employs a celebrity voice. Celebrities are often pricey and not readily available. Also, consider the purpose of the voice-enabled application you are creating. Alexa Skills and Google Actions are often more heavily branded as they may be based in the sales or marketing areas of an enterprise, whereas IVRs, being based in the customer service area of an organization, may be less heavily branded, because the customers may be calling for a different reason. Or, is the image the celebrity presents in conjunction with the company consistent with the customer service the IVR will be supplying? For example, a prominent insurance company has a strong existing audio brand using a duck; however, when considering extending this audio brand to the IVR, decided that the duck is a "happy duck" and therefore would not be appropriate for a phone line which is often used by customers calling in regards to claims.
+If a voice talent already represents the brand in other media (for example, advertising), you should consider using that talent for your voice-enabled application. However, this is generally not practical if current branding employs a celebrity voice. Celebrities are often pricey and not readily available.
  
-If there is professional voice talent doing branding in other media, then there's a much stronger case for using the same talent for the IVR for consistency.
+Also, consider the purpose of the voice-enabled application you are creating. Alexa Skills and Google Actions are often more heavily branded as they may be part of sales or marketing projects, whereas IVRs, being based in the customer service area of an organization, may be less heavily branded, because the customers may be calling for a different reason. So, consider whether the marketing campaign that features the celebrity is consistent with the goals of your voice-enabled application. For example, a prominent insurance company has a strong existing audio brand using a duck; however, when considering extending this audio brand to the IVR, the team decided that the duck is a "happy duck" and therefore would not be appropriate for a customer interaction over a phone line that is often used by customers calling about claims, which may be sad events.
  
-**// Give the client choices //**\\
+If a professional voice talent already does branding across several media channels, then there's a much stronger case for using the same talent for the IVR for consistency.
 + 
 +**// Tying voice talent selection to the desired system persona //** 
 + 
 +The user persona developed by the business should drive the system persona designed for the voice application. (For more on user persona and system persona, see the page [[persona_and_brand|Persona and Brand]].) This system persona should then drive the selection of voice talent. For example, consider an application developed by a cosmetics company to remind its independent beauty consultants of the birthdays and anniversaries of their colleagues and coworkers. In order to resonate with its consultants, the brand wishes to convey a system persona that is upbeat, fun, and feminine. The
 + 
 +Vocal characteristics
 +  * Register – High pitch or low pitch
 +    * Soprano / alto / tenor / bass
 +  * Open-ness, resonance, sonority
 +    * Associated with depth, size, seriousness
 +    * Chest-voiced quality
 +    * “James Earl Jones”
 +  * Closed-ness, nasalness
 +    * Associated with quickness, lightness
 +    * May be irritating
 +    * Head-voiced quality
 +    * “Mr. Burns”
 + 
 +"Read" or delivery
 +  * How the talent reads the lines
 +  * Speed, pacing
 +  * Spaces
 +  * Intonation
 +  * Good talents can change a lot of their characteristics through different “reads”
 + 
 + 
 + 
 +Voice characteristics - need an article on this? 
 + 
 + 
 + 
 +**// Give the client choices //**
  
 If seeking a new voice talent, keep the client in the loop. It usually works well to provide clients with samples of three or four voices so they can choose the voice they feel best represents their company. Letting the stakeholders vote privately makes for a fun reveal and discussion of why voices were chosen.
Line 35: Line 79:
  
 **// Involve the right stakeholders in the decision //**\\
-Make sure the highest-level executive who cares about the IVR voice is engaged in the selection process. "Trust me—you do not want to be in a meeting where you’re presenting the working version of the application (including all professional recordings) to the senior vice-president in charge of customer care who, upon hearing the voice for the first time, says, 'I hate it. We need a different voice'" (Lewis, 2011, p. 103).
+Make sure the highest-level executive who cares about the IVR voice is engaged in the selection process. "Trust me—you do not want to be in a meeting where you’re presenting the working version of the application (including all professional recordings) to the senior vice-president in charge of customer care who, upon hearing the voice for the first time, says, 'I hate it. We need a different voice'" ([[references#lewis2011|Lewis, 2011]], p. 103).
  
 === Gender ===
 **// Do not overemphasize gender //**\\
-There is no compelling research to indicate an advantage based solely on the gender of the voice talent (Couper, Singer, & Tourangeau, 2004; Lewis, 2011). For average listeners in normal channels, “…there is little evidence to suggest that one sex of speaker is more intelligible than another, if other factors are ruled out. For example, males may typically have louder voices than females, and female voices may be more high-pitched than males, but if these factors are controlled for, any sex differences usually disappear” (Edworth & Hellier, 2005).
+There is no compelling research to indicate an advantage based solely on the gender of the voice talent ([[references#couper|Couper, Singer, & Tourangeau, 2004]]; [[references#lewis2011|Lewis, 2011]]). For average listeners in normal channels, “…there is little evidence to suggest that one sex of speaker is more intelligible than another, if other factors are ruled out. For example, males may typically have louder voices than females, and female voices may be more high-pitched than males, but if these factors are controlled for, any sex differences usually disappear” ([[references#edworthy|Edworthy & Hellier, 2005]]).
  
-There is a general tendency in the US to use a female voice for IVRs (likely due to their service-provider orientation -- for a historical perspective, see Yellin, 2009), but there are numerous examples of successful use of male voices in IVRs. Find out if your client cares and, if so, take that into account when selecting a voice or set of voices to review.
+There is a general tendency in the US to use a female voice for IVRs (likely due to their service-provider orientation; for a historical perspective, see [[references#yellin|Yellin, 2009]]), but there are numerous examples of successful use of male voices in IVRs. Find out whether your client cares and, if so, take that into account when selecting a voice or set of voices to review.
  
-There is no question that we all carry conscious and unconscious stereotypes in our heads. In recent years, the psychologist most strongly associated with research in how these stereotypes affect human-computer interaction is Clifford Nass (Nass & Brave, 2005; Nass & Yen, 2010; Reeves & Nass, 2003), most notably in the book, "Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship". In that book, Nass and Brave (2005) described experiments in which different types of people used speech applications (notably, with most of the experiments using TTS rather than professional voice talents for their audio). In most of the studies they replicated classic social psychology studies of interactions between humans, replacing one of the humans with a speech-enabled computer, a variation of the “computers as social actors” (CASA) paradigm.
+There is no question that we all carry conscious and unconscious stereotypes in our heads. In recent years, the psychologist most strongly associated with research into how these stereotypes affect human-computer interaction is Clifford Nass ([[references#nass2005|Nass & Brave, 2005]]; [[references#nass2010|Nass & Yen, 2010]]; [[references#reeves|Reeves & Nass, 2003]]), most notably in the book "Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship". In that book, [[references#nass2005|Nass and Brave]] (2005) described experiments in which different types of people used speech applications (notably, with most of the experiments using TTS rather than professional voice talents for their audio). In most of the studies they replicated classic social psychology studies of interactions between humans, replacing one of the humans with a speech-enabled computer, a variation of the “computers as social actors” (CASA) paradigm.
  
 For example, they replicated the “similarity attraction” effect, the finding that people are attracted to other people who are similar to themselves. In these laboratory experiments, extroverts preferred an extroverted user interface and males preferred to hear a male voice. It turns out, however, that it is difficult to apply many of these findings to user interface design (e.g., how would you know in advance if a caller were male or female, introvert or extrovert). Additionally, they reported that people tend to rate male voices as more trustworthy (especially male listeners), and to expect females to be more nurturing.
  
-Despite the reliability with which these social effects appear in replications of social psychology experiments, they are not as reliable when assessed in real-world systems that are otherwise usable, that is, efficient, effective, and pleasant (Balentine, 2007). Lewis (2011), in an analysis of data from studies of the perception of the quality of TTS voices (both male and female) rated by both males and females, did not find any significant Voice Gender by Listener Gender interaction, an interaction that the similarity attraction hypothesis would have predicted (and an effect replicated by Machado et al., 2012). Couper, Singer, and Tourangeau (2004) studied the influence of male and female artificial voices on more than 1000 respondents to an IVR survey on sensitive topics. They measured respondents’ reactions to the different voices and abandoned call rates, and found no statistically significant results related to the gender of the voices. In particular, there were no significant Voice Gender by Respondent Gender interactions.
+Despite the reliability with which these social effects appear in replications of social psychology experiments, they are not as reliable when assessed in real-world systems that are otherwise usable, that is, efficient, effective, and pleasant ([[references#balentine2007|Balentine, 2007]]). [[references#lewis2011|Lewis]] (2011), in an analysis of data from studies of the perception of the quality of TTS voices (both male and female) rated by both males and females, did not find any significant Voice Gender by Listener Gender interaction, an interaction that the similarity attraction hypothesis would have predicted (and an effect replicated by [[references#machado|Machado et al., 2012]]). [[references#couper|Couper, Singer, and Tourangeau]] (2004) studied the influence of male and female artificial voices on more than 1000 respondents to an IVR survey on sensitive topics. They measured respondents’ reactions to the different voices and abandoned call rates, and found no statistically significant results related to the gender of the voices. In particular, there were no significant Voice Gender by Respondent Gender interactions.
  
-“Why such strong effects of humanizing cues are produced in laboratory studies but not in the field is an issue for further investigation. … Across these studies, little evidence is found to support the ‘computers as social actors’ thesis, at least insofar as it is operationalized in a survey setting” (Couper et al., 2004, p. 567).
+“Why such strong effects of humanizing cues are produced in laboratory studies but not in the field is an issue for further investigation. … Across these studies, little evidence is found to support the ‘computers as social actors’ thesis, at least insofar as it is operationalized in a survey setting” ([[references#couper|Couper et al., 2004]], p. 567).
  
 === Coaching, inflection ===
Line 63: Line 107:
 For any given sentence or phrase, there are many ways to speak it, only one or a few of which will be appropriate in a given context. For example, what is the correct way to record the question (appearing in a list of frequently asked questions), “What happens after I apply for cash assistance?” Should the speaker emphasize “What,” “happens,” “after,” “apply,” or “cash assistance”?
  
-The answer depends on the question's context. If the surrounding items concern other aspects of applying for and getting cash assistance, then plan to emphasize “after,” contrasting it with the things that happen before applying. If the surrounding items have to do with other types of assistance such as food stamps or health benefits, then plan to emphasize “cash assistance.” It’s critical to get the prosodic element of contrastive stress correct (Cohen, Giangola, & Balogh, 2004; Lewis, 2011).
+The answer depends on the question's context. If the surrounding items concern other aspects of applying for and getting cash assistance, then plan to emphasize “after,” contrasting it with the things that happen before applying. If the surrounding items have to do with other types of assistance such as food stamps or health benefits, then plan to emphasize “cash assistance.” It’s critical to get the prosodic element of contrastive stress correct ([[references#cohen|Cohen, Giangola, & Balogh, 2004]]; [[references#lewis2011|Lewis, 2011]]).
  
 Usage notes are also helpful, especially when recording small pieces that will later be concatenated together. Knowing that something will be an element in a list or the last thing in a fill-in-the-blank sentence makes all the difference in the world in how it's recorded.
  
 These notes in the manifest are all the more important if the designer is not the coach. They will be invaluable to the coach and voice talent.
 +
 +For considerations when selecting a voice talent for a multilingual application, see [[multilingual_applications|Multilingual Applications]].
  
 [[References]]