Using Touchtone and Speech Recognition
By Lizanne Kaiser, Ph.D.
Feb/Mar 2006
Chances are, the majority of people you know have experience using a speech-enabled automated telephone system (e.g., "Please say your account number"). Speech recognition technology is advancing rapidly, and today call centers are looking toward speech to provide a higher level of customer service than traditional touchtone phone systems can offer. The acceptance rate of speech recognition among callers is also growing. In fact, 85% of people say that speech is easier to use than touchtone, and 90% feel that speech adds value to phone-based transactions.
Despite that, many contact centers still rely on traditional touchtone for caller interactions, even while implementing or migrating to speech. For some, touchtone inputs are necessary for security or legal reasons; for others, it's a matter of preference. Either way, using both technologies can be beneficial, but the key is knowing how best to marry these two different modes of input into one seamless automated caller experience.
Speech and Touchtone: Different Caller Experiences:
Contact centers migrating existing touchtone applications to speech, or creating new self-service applications using speech, often assume that the same general call flow architecture and usability best practices that apply to touchtone can be applied to speech applications. In fact, a caller's experience in a speech-enabled application is very different from the experience in a touchtone-only application, and callers place different expectations on a speech system than they do on a touchtone-only system.
Having a "conversation" with a touchtone-only system is mechanical and generally non-intuitive. The caller can only interact with the system by listening to instructions and providing a mechanical response (using keys on the telephone keypad) within limited menu options.
When a caller interacts with a speech application, he or she subconsciously compares it to having a conversation with a real person - even though the caller knows it's an automated system. Because of this, the caller places a higher expectation on the speech-enabled system, similar to the expectation placed on a live agent.
Callers perceive a higher level of control with speech and even have a pre-existing mental model of how the conversation will unfold, based on prior experiences with agents. If the system's call flow does not mirror the caller's mental model of the task, callers tend to interrupt the system to ask for what they want. The best speech-enabled systems guide the caller unobtrusively, without sacrificing the caller's sense of control or conversational naturalness.
Speech allows automated systems to leverage a caller's natural ability for language and conversation to provide a better caller experience. The intent is not to fool callers into thinking they're speaking with a real person, but to avoid "deal breakers" - points where the dialog flow becomes so awkward or unnatural that the caller begins focusing more on the system than on the interaction they need to accomplish.
Marrying Speech and Touchtone Happily:
When creating a contact center system incorporating both speech and touchtone, call centers need to design for speech first, and then incorporate touchtone into the design as a secondary feature. Wherever possible, the system should support touchtone as an alternative input mode to speech. Avoid randomly switching back and forth between speech-only and touchtone-only, as this can confuse callers. For instance, if the system encourages the caller to say information, the caller should be able to either say the response or enter it on the keypad, whichever they prefer:
System: Please tell me your 10-digit home phone number.
Caller: 5554492350
or
System: Please tell me your 10-digit home phone number.
Caller: [Caller enters 5554492350 on their telephone keypad]
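The either-mode behavior in the dialog above could be sketched roughly as follows. This is a minimal illustration, not any particular platform's API: the function name `normalize_phone_number`, the `mode` labels "speech" and "dtmf", and the `SPOKEN_DIGITS` table are all assumptions made for the example, and a real recognizer would normally return digits through its own grammar.

```python
import re

# Hypothetical lookup from spoken digit words to digits (an assumption for
# this sketch; a production grammar would handle this inside the recognizer).
SPOKEN_DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}


def normalize_phone_number(raw, mode):
    """Return a 10-digit string, or None if the input is not a valid number.

    mode is "speech" (a recognized transcript) or "dtmf" (keypad input).
    """
    if mode == "dtmf":
        # Keep only the digit keys; ignore '#', '*', and whitespace.
        digits = re.sub(r"\D", "", raw)
    elif mode == "speech":
        # Split the transcript into words and single digits, then map each
        # token to a digit; tokens that are not digits are dropped.
        tokens = re.findall(r"[a-z]+|\d", raw.lower())
        digits = "".join(
            SPOKEN_DIGITS.get(tok, tok if tok.isdigit() else "")
            for tok in tokens
        )
    else:
        raise ValueError(f"unknown input mode: {mode}")
    return digits if len(digits) == 10 else None
```

Because both modes funnel into the same normalized value, the rest of the call flow does not need to know which one the caller chose.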
Touchtone offers an effective fallback for speech as well. Well-designed and fully tuned speech recognition systems can have recognition accuracy rates in the 90-percent range. Nevertheless, speech recognition systems can have difficulty accurately recognizing what the caller is saying, especially if the caller is in a noisy environment, on a cell phone, using a speaker phone, or has an accent or voice quality that is challenging for the recognition engine. When speech recognition errors occur, the dialog design should offer callers context-specific, hierarchical error messages, with each level providing new or additional prompting that helps guide the caller to use the fallback touchtone function.
System: Which type of account are you calling about - Checking, Savings, or CD?
Caller: I'm calling about my checking account. [loud background noise]
System: Sorry, I didn't get that. Please say Checking, Savings, or CD. Or if you're calling about something else, just say Other.
Caller: Checking! [loud background noise]
System: I still couldn't quite catch that. Let's try the phone keypad instead. For checking, please press one. For savings, press two. For CDs, press three. For all other questions, press four.
Caller: [Caller enters 1 on the phone keypad]
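One way to picture the hierarchical error handling in this dialog is as an ordered list of prompts, where each failed attempt moves the caller to the next, more directive level. The sketch below is an assumption-laden illustration: `collect_account_type`, the `get_input` callback, and its `(mode, value)` return shape are hypothetical names invented here, with `value` set to `None` when the recognizer finds no match.

```python
# Prompts taken from the sample dialog, ordered from open to most directive.
PROMPTS = [
    "Which type of account are you calling about - Checking, Savings, or CD?",
    "Sorry, I didn't get that. Please say Checking, Savings, or CD. "
    "Or if you're calling about something else, just say Other.",
    "I still couldn't quite catch that. Let's try the phone keypad instead. "
    "For checking, please press one. For savings, press two. "
    "For CDs, press three. For all other questions, press four.",
]

DTMF_CHOICES = {"1": "checking", "2": "savings", "3": "cd", "4": "other"}
SPEECH_CHOICES = {"checking", "savings", "cd", "other"}


def collect_account_type(get_input):
    """Play each prompt level in turn until a valid answer arrives.

    get_input(prompt) is a hypothetical callback returning (mode, value),
    where mode is "speech" or "dtmf" and value is None on a no-match.
    Touchtone is accepted at every level, not only at the last one.
    """
    for prompt in PROMPTS:
        mode, value = get_input(prompt)
        if mode == "dtmf" and value in DTMF_CHOICES:
            return DTMF_CHOICES[value]
        if mode == "speech" and value in SPEECH_CHOICES:
            return value
    # All levels exhausted; a real system might transfer to an agent here.
    return None
```

Keeping the prompts in a single ordered list makes it easy to review the escalation wording as a whole and to confirm that each level adds new guidance instead of repeating the previous one.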
Avoid wording like "For checking, say or press one" that neither sounds conversationally natural nor helps constrain the callers' possible responses effectively. Even though this prompt directs callers to just say "One," callers will still say a wide range of responses to a speech system, based on whatever seems natural in that conversational context (such as "I'm calling about my Checking account.").
For security or legal reasons, some companies encourage or require callers to input information such as Social Security Numbers, account numbers, or confirmations via touchtone. Depending on the security or legal requirements, the system can be designed to guide the caller to use touchtone, but recognize either speech or touchtone as a valid input; or guide the caller to use touchtone and accept this as the only valid input. If touchtone is required, the error handling prompts need to clearly guide a caller who attempts to use speech instead of touchtone.
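The two designs described here can be thought of as a per-field input policy. The following is only a sketch under stated assumptions: `InputPolicy`, `handle_sensitive_input`, and the reprompt wording are hypothetical and are not drawn from any specific IVR product.

```python
from enum import Enum


class InputPolicy(Enum):
    EITHER = "either"              # guide toward touchtone, accept speech too
    TOUCHTONE_ONLY = "touchtone"   # touchtone is the only valid input


# Illustrative wording only; actual prompts should be written and tested
# for the specific application.
TOUCHTONE_REPROMPT = (
    "For your security, please use your phone keypad to enter this information."
)


def handle_sensitive_input(policy, mode, value):
    """Return (accepted_value, reprompt); exactly one of the two is None.

    mode is "speech" or "dtmf". Under TOUCHTONE_ONLY, spoken input is not
    accepted and the caller is steered back to the keypad.
    """
    if mode == "dtmf" or policy is InputPolicy.EITHER:
        return value, None
    return None, TOUCHTONE_REPROMPT
```

Making the policy explicit per field keeps the stricter handling limited to the inputs that actually carry a security or legal requirement.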
Meeting Expectations:
In order to provide the expected interaction, both speech and touchtone applications must provide value to the caller. With speech, it's important to anticipate what callers will say at any given point in the dialog and build those utterances into the recognition grammar file active at that point in the application. Touchtone limits callers to a fixed set of options; with speech, if dialog prompts and recognition grammars are poorly designed, the caller may end up saying things that cannot be recognized by the automated system. Remember that deal breakers destroy the natural rhythm of the interaction and leave callers feeling that the system has not lived up to expectations.
With speech recognition, it's a balancing act between listing enough different utterances in the grammar to maximize in-grammar coverage (the likelihood that what the caller said is listed in the grammar) and, at the same time, not overloading any particular grammar with so many possibilities that it significantly compromises in-grammar accuracy (the likelihood that what the caller said, when it is in the grammar, is correctly matched by the recognition engine). With touchtone, the limited options can often remove the uncertainty of in-grammar coverage and maximize in-grammar accuracy, but at the same time provide a less satisfying caller interaction.
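To make the two measures concrete, here is a small sketch of how they might be computed from a set of transcribed test calls. It assumes coverage is the share of all utterances that appear in the grammar and accuracy is the share of those in-grammar utterances that the engine matched correctly; the function name `grammar_metrics` and the sample data are hypothetical.

```python
def grammar_metrics(samples, grammar):
    """Compute (in_grammar_coverage, in_grammar_accuracy).

    samples is a list of (what_caller_said, what_engine_returned) pairs,
    both already normalized to the same form as the grammar entries.
    """
    # Utterances the grammar could have matched at all.
    in_grammar = [(said, heard) for said, heard in samples if said in grammar]
    coverage = len(in_grammar) / len(samples) if samples else 0.0

    # Of those, how many the engine actually matched correctly.
    correct = sum(1 for said, heard in in_grammar if said == heard)
    accuracy = correct / len(in_grammar) if in_grammar else 0.0
    return coverage, accuracy


# Hypothetical tuning data: three in-grammar utterances (one misrecognized)
# and one out-of-grammar utterance the engine rejected.
grammar = {"checking", "savings", "cd", "other"}
samples = [
    ("checking", "checking"),
    ("savings", "checking"),
    ("cd", "cd"),
    ("my mortgage", None),
]
coverage, accuracy = grammar_metrics(samples, grammar)
```

Adding "my mortgage" to the grammar would raise coverage in this sample, but every added phrase is also one more candidate the engine can confuse with the others, which is the trade-off the paragraph above describes.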
When choosing to merge touchtone and speech applications, remember to evaluate your callers, their expectations, and your call center's requirements - knowing that the goal is to offer callers the best way of getting self-service through an automated phone system. All systems yield the best return on investment when they are designed to address business requirements, match the callers' needs and expectations, and offer a conversational, helpful, and intuitive interface that represents your organization's brand.
Lizanne Kaiser, Ph.D., is a Senior Principal Consultant of Voice Services for Genesys Telecommunications Laboratories, Inc.
Benefits of Speech Recognition
- Flatten call flows, compared to touchtone
- Shorten calls up to 50% vs. touchtone
- Increase self-service usage from 20-60%
- Decrease hold time by as much as 35%
- 85% more effective in routing calls vs. 54% with touchtone