
Sometimes a seemingly straightforward statement said another way can mean something completely different. For example, “Your company’s service is unbelievable.” It’s a simple phrase. One that could be said many times a day when customers interact with a business. But is it a positive or negative statement? It can be either, depending on the emotional context.
Emotions are a critical factor in how we, as humans, communicate with each other. We are uniquely adapted to understand a broad range of emotional indicators such as facial expression, body language, and tone of voice. Without an understanding of emotional context, conversations can be difficult to understand.
Each day in thousands of contact centers around the world, phone conversations are recorded, stored, and, generally, never heard by anyone but the agent who took the call. It is widely accepted that valuable business insights are contained in these recordings. The problem is that there are far too many interactions taking place for the center to listen to all of them.
Finding the Meaningful Interactions
Interaction analytics technology is designed to help organizations overcome the resource limitations that make it impractical to listen to every call. It does this by helping contact centers locate, among the many recordings available, the meaningful ones that are worth the time to review. One key technological component used to accomplish that is emotion detection.
Emotion detection technology analyzes and flags recorded conversations where heightened emotional levels are present in the interaction between customers and contact center agents. How does this amazing technology work?
There are two different approaches for detecting emotion in recordings. One is a variation of word-spotting technology that attempts to detect emotion by recognizing certain words or phrases that may indicate high emotional levels. However, this is a little like putting the cart before the horse. Emotional content should be used to help understand the context of the words spoken, not the other way around.
A better method of emotion detection uses physical parameters of speech, such as pitch, pacing and volume, to gauge the level and presence of emotion in a conversation. A major advantage of this approach is that it is independent of the actual words that are spoken. This provides an additional, independent source of information to be used when classifying or locating recordings of interest.
Begin with an Emotional Baseline
Without getting too deep into the actual technology, here’s how it works. During the initial portion of the conversation an ‘emotional baseline score’ consisting of the speaker’s pitch, word pacing, volume and other characteristics is calculated. Establishing a baseline at the start of each conversation minimizes the effects of variables like background noise and changes in a person’s voice due to, for example, a cold.
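To make the baseline idea concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor’s actual method. It assumes the audio has already been reduced to timed measurements of pitch, pacing and volume, and it summarizes the opening of the call as a mean and spread for each one. The function name, the field names and the 30-second window are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical feature names; real products may measure other characteristics.
FEATURES = ("pitch", "pace", "volume")


def emotional_baseline(frames, baseline_seconds=30.0):
    """Summarize the opening of a call as (mean, spread) per speech feature.

    `frames` is a list of dicts with a 'time' offset in seconds plus one
    numeric value per feature. The window length is an assumed default.
    """
    opening = [f for f in frames if f["time"] < baseline_seconds]
    if not opening:
        raise ValueError("no frames fall inside the baseline window")
    baseline = {}
    for name in FEATURES:
        values = [f[name] for f in opening]
        baseline[name] = (mean(values), pstdev(values))
    return baseline


# Made-up measurements: pitch in Hz, pace in words per minute, volume in dB.
frames = [
    {"time": 0.0, "pitch": 100.0, "pace": 140.0, "volume": 60.0},
    {"time": 10.0, "pitch": 110.0, "pace": 150.0, "volume": 60.0},
    {"time": 20.0, "pitch": 120.0, "pace": 160.0, "volume": 60.0},
    {"time": 40.0, "pitch": 300.0, "pace": 220.0, "volume": 75.0},  # outside window
]
baseline = emotional_baseline(frames)
```

Because the baseline is rebuilt from each caller’s own opening moments, a naturally loud or fast talker is measured against his or her own normal, not against an average voice.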
Establish an Emotional Score
Once the baseline score has been determined, that same calculation is repeated periodically during the conversation. This provides an ‘emotional score’ for ongoing segments of the exchange. Comparing the score of each segment to the baseline score provides information on whether the emotional level has changed, and to what extent.
If the difference exceeds a certain threshold, emotion is flagged for that segment. Making this threshold setting adjustable by the end user allows for fine tuning. A higher threshold results in fewer false alarms, but it also increases the chances of missing an emotional interaction. Separating the conversation into segments this way also allows the technology to pinpoint where in the timeline of the interaction the emotion was detected. This can then be visually depicted in a graphical view of the conversation.
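The comparison and threshold steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: each segment has already been reduced to a single number, segments are of equal length, and the function name, segment length and sample scores are all hypothetical.

```python
def flag_emotional_segments(segment_scores, baseline_score, threshold,
                            segment_seconds=10.0):
    """Return (start, end, deviation) for each segment that departs from
    the baseline by more than `threshold`.

    Times are offsets in seconds from the first scored segment.
    """
    flagged = []
    for index, score in enumerate(segment_scores):
        deviation = abs(score - baseline_score)
        if deviation > threshold:
            start = index * segment_seconds
            flagged.append((start, start + segment_seconds, deviation))
    return flagged


# Made-up scores for five consecutive 10-second segments.
scores = [1.0, 1.2, 2.5, 4.0, 1.1]

# A lower threshold flags more segments (more potential false alarms).
loose = flag_emotional_segments(scores, baseline_score=1.0, threshold=1.0)
# → [(20.0, 30.0, 1.5), (30.0, 40.0, 3.0)]

# A higher threshold flags fewer (more potential misses).
strict = flag_emotional_segments(scores, baseline_score=1.0, threshold=2.0)
# → [(30.0, 40.0, 3.0)]
```

The start and end offsets returned for each flagged segment are what would let a reviewer jump straight to the emotional part of the call, or let the software mark it on a timeline.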
Separate Speaker and Agent Audio Tracks
Speaker separation, the ability to separate both sides of a conversation into different audio tracks, is needed for this technology to work well. It makes it possible to identify which side of the conversation was emotional – the customer or the agent in the case of the contact center. Another factor affecting the accuracy of emotion detection is the presence of external audio, such as music on hold, or background noise. Fortunately, technologies exist to mask or reduce these types of interference.
Let’s take a look at how emotion detection technology delivers value to the contact center. One best practice in this area is to review all recordings flagged as containing emotion, particularly when it is detected on the customer side of the conversation. Although the technology does not differentiate between good and bad emotions, both types of interactions present opportunities for improvement.
Improve Customer and Agent Satisfaction
When reviewing emotional interactions, the center doesn’t have to listen to the entire recording. Because the specific portion of the recording is flagged, the reviewer can skip directly to that section and replay it. This saves a considerable amount of time during the initial review process. If it is discovered that an agent was unable to defuse a negative emotional situation, that customer can be contacted proactively to address the issue. Imagine the impact that has on customer satisfaction!
One contact center in the health insurance industry is doing this successfully and realizing the benefits. The center performs emotion detection analysis on every call that is recorded in its operation. Periodically throughout the day, quality assurance personnel listen to recorded interactions that have been flagged as emotional. A couple of interesting cases show how a proactive response can positively impact customer service.
In the first example a customer called in regarding a surgery that was scheduled for the next day. The agent told her that the company had not received the medical records from the provider. These records were needed before the surgery could be approved as payable under the customer’s health plan.
Understandably, the customer got quite upset at finding this out at the last minute. The call ended with the agent informing her there was nothing that could be done and that the surgery would have to be postponed.
After reviewing the recording, which had been flagged as emotional, the quality assurance specialist contacted the customer’s provider directly and requested that the medical records be faxed immediately. The records were received and reviewed. The customer was called back and informed that her surgery was approved for the next day.
In another instance a customer called in with a billing issue five minutes before the call center was scheduled to close at 8:00 p.m. He felt that he had been sent the bill by mistake and wanted to discuss this with someone in the billing department. The agent informed him that the billing area had closed three hours earlier, at 5:00 p.m., and that he would need to call back when it opened the next morning at 8:00 a.m. The customer was very upset that he had received the bill and told the agent that she had better warn the billing people as he was going to call first thing in the morning.
At 7:00 a.m. the call was reviewed by the quality assurance manager. She did some research on the bill in question. It was discovered that the company had indeed made a mistake and the bill should not have been sent. At 8:00 a.m. she called the customer at home to apologize for the error and tell him to disregard the bill he received.
Identifying problems and fixing them proactively provides obvious benefits; however, when ‘good’ types of emotion are discovered in an interaction, there are opportunities there as well. Agents can be recognized for their good performance, an important tactic in improving and maintaining agent morale. A best-practice effort can also be showcased by building a coaching package from clips of the interaction, which can then be used to improve overall performance.
Gain Valuable Business Insights
Emotional content provides one piece of the puzzle when looking for recordings that offer valuable business insights. Used in combination with other known aspects of an interaction, it allows interesting recordings to be located with very high precision. In general, the more aspects used, the better the results.
Let’s say the center wants to find captured interactions where the customer is at risk of leaving. Emotion could reasonably be expected to be evident in these types of conversations. It could also be present in many other types of interactions, however. Combining the occurrence of heightened emotional levels with the appearance of key phrases such as “cancel my service”, “cancellation policy” or the mention of a competitor’s name would provide much more targeted results.
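A combined query of this kind might look like the following Python sketch. It is illustrative only and assumes each recording already carries a customer-side emotion flag and a text transcript. The record layout, field names and sample calls are hypothetical, and a real deployment would add competitor names to the phrase list.

```python
# Phrases taken from the example above; competitor names would be added here.
AT_RISK_PHRASES = ("cancel my service", "cancellation policy")


def find_at_risk(recordings, phrases=AT_RISK_PHRASES):
    """Return ids of recordings that have BOTH customer-side emotion
    and at least one of the key phrases in the transcript."""
    matches = []
    for rec in recordings:
        text = rec["transcript"].lower()
        if rec["customer_emotion"] and any(p in text for p in phrases):
            matches.append(rec["id"])
    return matches


# Made-up sample records.
recordings = [
    {"id": "call-001", "customer_emotion": True,
     "transcript": "Thank you so much, that fixed everything!"},
    {"id": "call-002", "customer_emotion": True,
     "transcript": "I want to cancel my service today."},
    {"id": "call-003", "customer_emotion": False,
     "transcript": "Can you explain your cancellation policy?"},
]
at_risk = find_at_risk(recordings)
# → ['call-002']
```

Neither condition alone would have isolated the second call: emotion by itself also matches the happy customer in the first call, and the phrase search by itself also matches the calm inquiry in the third.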
In fact, the ability to perform these types of multi-dimensional queries when locating recordings is a key feature to look for when considering the purchase of advanced call analysis tools.
Putting it to Use
Emotion plays an important role in our interactions with each other and it conveys valuable information about the meaning of a particular conversation. Emotion detection is a sound and useful technology, and one that is offered by many vendors of call recording and analytics solutions. As with all technologies, however, it’s not the technology itself that provides benefits to the business. It’s how that technology is put to use that matters.