
As voice recognition technology has flourished in the customer-facing environment, FST catches up with Dr Mark Randolph who explains the role VoiceXML is playing in enabling voice technology to develop.
As technology has developed, the banking landscape has been revolutionised from the customer’s point of view. As a kid I well remember my Mom rushing to her local bank branch before it closed to transfer some funds for the weekend. Nowadays she’d just log on to her internet banking and rely on her debit card for the weekend groceries.
Given the bad press that off-shore call centers and automated phone systems sometimes get, it is worth remembering that phone banking is, just like internet or mobile banking, a relatively new channel that has revolutionised customers’ lives. One of the key enablers of this growth has been the development of VoiceXML.
Industry standard
Originally developed in 1999 when AT&T, IBM, Lucent and Motorola realized they were all trying to develop essentially the same thing, VoiceXML has become the industry standard for voice applications. “It is pretty much the adopted practice for developing applications and marking up speech applications – more than 10,000 applications have been developed using it,” explains Dr Mark Randolph, Chairman of the VoiceXML Forum – a global industry organization that promotes the adoption of VoiceXML and other speech-related technologies.
In the financial services space the primary application has been in call centers and telephony, particularly self-service. However, as Randolph explains, enterprises are looking to develop their IT services to support self-service in a more holistic manner. “Financial services are trying to broaden their reach to customers,” he explains. “As banks develop their web services, VoiceXML plays in well with that. At the back-end having an application server that can serve up VoiceXML to a voice platform in the same manner that it serves up HTML to a web site is advantageous.”
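Randolph’s point about serving VoiceXML in the same manner as HTML can be illustrated with a minimal sketch. The Python below is an illustration under stated assumptions, not anything published by the Forum or a bank: the function names, the balance figure and the prompt wording are all hypothetical. It shows a single back-end lookup feeding two renderers, one for a web browser and one for a voice platform.

```python
from xml.sax.saxutils import escape


def account_summary(customer_id):
    # Hypothetical back-end lookup; a real system would query core banking.
    return {"customer_id": customer_id, "balance": "1,250.00"}


def render_html(summary):
    # Web channel: the browser displays the balance.
    return ("<html><body><p>Your balance is "
            f"{escape(summary['balance'])} dollars.</p></body></html>")


def render_vxml(summary):
    # Voice channel: the voice platform speaks the same sentence.
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">\n'
        '  <form>\n'
        '    <block>\n'
        f'      <prompt>Your balance is {escape(summary["balance"])} dollars.</prompt>\n'
        '    </block>\n'
        '  </form>\n'
        '</vxml>\n'
    )


def respond(channel, customer_id):
    # Same data, different markup and media type depending on the channel.
    summary = account_summary(customer_id)
    if channel == "voice":
        return "application/voicexml+xml", render_vxml(summary)
    return "text/html", render_html(summary)
```

Calling `respond("voice", "c-001")` returns VoiceXML for the voice platform to fetch, while any other channel value gets HTML. The business logic in `account_summary` is written once and shared by both.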
Randolph also explains that VoiceXML’s flexibility allows it to support more than just voice. “Bear in mind with a technology like speech recognition, if voice isn’t working for the user, then the user can always fall back on touchtone,” he suggests. “Typically those touchtone applications are also developed using VoiceXML. The thing to think about is how you would provide a voice interactive service regardless of whether you have voice input.”
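The touchtone fallback Randolph describes can be sketched in the same way. The Python below builds a VoiceXML menu in which every option can be spoken or keyed in, using the standard `<menu>` and `<choice>` elements and the `dtmf` attribute. The helper name `build_menu`, the phrases and the `#balance` and `#transfer` targets are hypothetical, and the dialogs those targets point to are not shown.

```python
import xml.etree.ElementTree as ET


def build_menu(prompt_text, choices):
    """Return a VoiceXML menu; each choice accepts speech or a keypad digit."""
    if len(choices) > 9:
        raise ValueError("this sketch maps choices to single digits 1-9")
    vxml = ET.Element("vxml", {"version": "2.1",
                               "xmlns": "http://www.w3.org/2001/vxml"})
    menu = ET.SubElement(vxml, "menu", {"id": "main"})
    ET.SubElement(menu, "prompt").text = prompt_text
    for key, (phrase, target) in enumerate(choices, start=1):
        # The element text is the spoken phrase; dtmf is the touchtone key.
        choice = ET.SubElement(menu, "choice",
                               {"dtmf": str(key), "next": target})
        choice.text = phrase
    # If speech is not recognised, remind the caller about the keypad.
    nomatch = ET.SubElement(menu, "nomatch")
    nomatch.text = "Sorry, I did not catch that. You can also use your keypad."
    ET.SubElement(nomatch, "reprompt")
    return ET.tostring(vxml, encoding="unicode")


document = build_menu(
    "Say balance or press 1. Say transfer or press 2.",
    [("balance", "#balance"), ("transfer", "#transfer")],
)
```

Because the speech phrase and the keypad digit sit on the same `<choice>`, the caller reaches the same dialog whichever input works for them.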
Untapped potential
While voice recognition software is now an established part of most banks’ toolkit, it is generally used for automating responses. However, the VoiceXML language can act as an enabler for more advanced uses. “If you look at things like analytics, for example, the whole industry has grown up because of VoiceXML. All the tools that are available to manage say a web-based CRM application exist because of it,” Randolph argues.
He also suggests that as the mobile space develops, VoiceXML will play a role in enabling customer applications to add functionality. “There’s a couple of scenarios in mobile,” Randolph explains. “One would be voice only, while the other would combine text, voice and graphics in a multi-modal interface. This would play well in the financial services space.”
For financial services, the security of technology applications is always front of mind, and voice certainly has potential to be used in this context. “One can imagine that speaker-verification as a biometric combined with other security measures such as passwords would be used for an application as security conscious as banking,” Randolph says.
To push this agenda, the Forum has chartered a speaker biometrics committee to extend the biometric standards that exist to support VoiceXML. “We’ve been working with ANSI, W3C and other standards bodies to help them understand how VoiceXML can be incorporated.”
This isn’t the only standards activity that the Forum has been busy with. It offers a VoiceXML platform certification program that, as Randolph explains, ensures that the different components of a voice platform will be interoperable. This lets users mix and match best-of-breed components at the best prices. Currently the Forum is certifying platforms for VoiceXML 2.1 conformance, and will launch a VoiceXML 3.0 certification program in due course. “We’ve had 22 platforms certified so far, and we’ve also published a solutions directory, as a reference for end-users.”
Going forward, from the end-user perspective Randolph thinks we can “only expect” speech recognition performance to improve. “Better accuracy, more open vocabulary; people will be able to make open-ended queries rather than being tightly scripted through a dialogue,” he predicts. The other driver in terms of deployment will be cost-savings. “VoiceXML has embraced VoIP; as call centers and the whole industry adopts IP as a way of building and delivering services, VoiceXML will fit in well.”
Randolph is sure of one thing – the industry will continue to develop and change. His final thought is an invitation to the industry to join with the Forum to help shape this change. “We want to engage with end-users more deeply – obviously it’s the customer that really drives the industry, so we look for them to join the Forum and help us guide the technology in a way that’s going to best meet their requirements.”
Dr Mark Randolph is Chair of the VoiceXML Forum, and Director of Technology and Engineering at Motorola. Dr Randolph has led teams at Motorola to develop large vocabulary speech recognition and natural language dialog systems, the industry’s first implementation of a distributed speech recognition system, and the ‘Mya’ voice portal. He was elected chairman of the Forum in December 2006 and has expanded its work to include related speech, telephony and mobile technologies.
The mission of the Forum is to promote and to accelerate the adoption of Voice Extensible Markup Language (VoiceXML) as a standard for developing speech-based applications and to cultivate and nurture an ecosystem around VoiceXML. The Forum is not a standards body but works in collaboration with standards bodies, like the W3C and the IETF.