Diversity in Emotion AI and why it matters for business

For a long time, technological bias was a purely academic issue, but recently it has started to surface in the commercial world, with much broader implications. In a world where companies rely on analytics and data-informed decision-making, ignoring bias can cost money, customers, and jobs.

When it comes to customer analytics, Machine Learning and Affective Computing (the fields associated with AI and emotion recognition) create powerful tools for gathering behavioral insights and understanding people like never before. That’s why companies strive to adopt them widely, from customer experience management and service improvement to product development. And yet some technologies struggle not only to identify emotions correctly but to recognize a face at all unless it belongs to a person of European descent.

In this article, we will look at the negative consequences that various types of technological bias have led to, talk about avoiding bias in emotion recognition tasks, and discuss why it matters at all.

What’s the deal?

Last year MIT published a study revealing that two benchmark datasets used to train face recognition algorithms, IJB-A (2015) and Adience (2014), were largely composed of white males. As a result, the outcomes were far from fair. It turned out that Microsoft’s technology had an error rate of 20.8% when recognizing women with darker skin tones, IBM’s had 34.7%, and the technology of China-based AI giant Megvii (Face++) had 34.5%, all noticeably worse than the results for lighter-skinned men.
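Gaps like these only become visible when errors are counted separately for each group rather than averaged over the whole test set. Here is a minimal sketch of that idea; the function name, the group labels, and the toy records are made up for illustration and are not taken from the study.

```python
from collections import defaultdict


def error_rate_by_group(records):
    """Return {group: error rate} for (group, true_label, predicted_label) records."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Toy, made-up records: (group, true label, predicted label)
records = [
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("darker-skinned women", "female", "male"),
    ("darker-skinned women", "female", "female"),
    ("darker-skinned women", "female", "female"),
    ("darker-skinned women", "female", "male"),
]

rates = error_rate_by_group(records)

# The single overall number hides the gap between the two groups
overall = sum(truth != predicted for _, truth, predicted in records) / len(records)

for group, rate in rates.items():
    print(f"{group}: {rate:.0%} errors")
print(f"overall: {overall:.0%} errors")
```

In this toy example the overall error rate looks moderate, while one group sees no errors and the other sees an error on every second face.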

One more example: in 2016, Amazon disbanded the unit that was struggling to create a tool to automatically review and rank job applicants’ resumes. It turned out the algorithm gave preference to male candidates. Amazon fixed the program to avoid discrimination, but later closed the project, since there was no guarantee the tool would not “devise other ways of sorting candidates that could be discriminatory”.

Apart from gender and ethnic imbalance in recognition and ranking tasks, the problem automatically extends to emotion detection systems. The quality of data here is of supreme importance: developers need large amounts of labeled photos and videos of people expressing emotions.

Back in the 1980s, a group of researchers discovered that people from different cultures showed bias when evaluating each other’s emotions. When it comes to technology, we see similar trends. As a recent study shows, emotional analysis technology tends to assign more negative emotions to black men’s faces than to white men’s faces. In the field of emotional technologies this is a wake-up call:

“First, black faces were consistently scored as angrier than white faces for every smile. … Second, black faces were always scored as angrier if there was any ambiguity about their facial expression. … Black men’s facial expressions are scored with emotions associated with threatening behaviors more often than white men, even when they are smiling,” says Lauren Rhue, the author of the study.

Skin distribution map. Source: Encyclopedia Britannica.

Whether we need to recognize a face, determine an emotion, or measure the degree of emotional expression, we take a dataset that matches these aspects or create one specifically from the available sources. However, in the grand scheme of things, we tend to overlook that the samples we have chosen are disproportionate. They suit our needs, but are they fair? The solution only reflects what we need it to reflect and what we have trained it to do.
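A quick way to see how disproportionate a sample is: count how often each group appears before training on it. The sketch below does that and also derives per-group sample weights so that every group contributes equally during training. Inverse-frequency reweighting is one common technique, used here purely as an illustration; the group names and counts are invented, and reweighting is no substitute for collecting more diverse data.

```python
from collections import Counter


def group_shares(group_labels):
    """Return the share of the dataset that each group makes up."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def balancing_weights(group_labels):
    """Return a per-group sample weight that is inversely proportional to
    the group's frequency, so each group carries the same total weight."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    n_groups = len(counts)
    return {group: total / (n_groups * count) for group, count in counts.items()}


# Hypothetical dataset: one demographic label per image
group_labels = ["group_a"] * 80 + ["group_b"] * 15 + ["group_c"] * 5

shares = group_shares(group_labels)
weights = balancing_weights(group_labels)

for group in shares:
    print(f"{group}: {shares[group]:.0%} of samples, weight {weights[group]:.2f}")
```

With these made-up numbers, the smallest group makes up 5% of the data, so each of its samples ends up weighted sixteen times more heavily than a sample from the largest group.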

Why is balance important?

First and foremost, imbalanced data imposes limits on what the technology can do, and these limits generally go unnoticed until problems arise.

Sometimes such errors have relatively minor consequences, as in the case of the famous Chinese businesswoman Dong Mingzhu. She was fined by mistake because the system decided that her photo, printed in an ad on a bus, was a person crossing the road in the wrong place.

The photo of Ms Dong on the bus. Source: The Telegraph.

But the implications can be especially sensitive in law enforcement, where inaccurate face recognition can lead to overuse “on the segment of the population on which it underperforms”.

This concerns emotion recognition tech as well, but in a different way. Where face identification systems aim to match a face to one in a gallery of thousands (or even millions) of faces, emotion recognition software aims to detect the slightest emotional cues in that face. Emotional expression differs from person to person and depends on a complex set of factors including age, gender, and cultural background. It is subject to erroneous predictions as well.

With the fast spread of the technology, the issue is more relevant than ever: the upcoming International Conference on Affective Computing & Intelligent Interaction (ACII 2019), the biggest one in the industry, will be centered around the slogan “Affective Computing for ALL” and will focus on inclusive technology that considers “the full range of human diversity with respect to ability, language, culture, gender, age and other forms of human difference.”

Integrating emotional data into customer analytics opens up a new dimension of customer insights. Among other possible applications, these insights can be used to improve customer experience and service management: for example, to evaluate the work of sales managers based on how satisfied customers were with the interaction, or to evaluate the performance of a particular branch based on overall customer satisfaction statistics. In the case of low performance, companies can introduce staff training, adapt their business processes, and encourage better employee performance through bonuses.

However, an algorithm that calculates a customer satisfaction index from detected customer emotions may give an objective picture in the United States, yet the same algorithm brought to Italy or China, where patterns of emotional expression may differ, may turn out to be wrong, even biased. This can lead businesses to make erroneous decisions on the basis of these data.
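To make this concrete, here is a toy sketch with invented numbers. It assumes that customers in two regions are equally satisfied but that those in region B express it less intensely, and that the index is simply the share of customers whose detected positive-expression score reaches a threshold. The scores, the threshold, and the baseline shift are all hypothetical; the shift is a crude illustration of why local data matters, not a recommended calibration method.

```python
from statistics import mean


def satisfaction_index(scores, threshold):
    """Share of customers whose positive-expression score reaches the threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


# Made-up scores in [0, 1]. Assumption: both groups are equally satisfied,
# but customers in region B show it less intensely.
region_a = [0.7, 0.8, 0.6, 0.9, 0.75]
region_b = [0.5, 0.6, 0.4, 0.7, 0.55]

threshold_a = 0.65  # hypothetical threshold tuned on region A data

index_a = satisfaction_index(region_a, threshold_a)
index_b = satisfaction_index(region_b, threshold_a)

# Crude local adjustment: shift the threshold by the difference in the
# regions' average expressiveness, estimated from local data.
local_threshold_b = threshold_a - (mean(region_a) - mean(region_b))
index_b_local = satisfaction_index(region_b, local_threshold_b)

print(f"Region A index: {index_a:.0%}")
print(f"Region B index, region A threshold: {index_b:.0%}")
print(f"Region B index, locally adjusted threshold: {index_b_local:.0%}")
```

Under these assumptions, the unadjusted index makes the region B branch look four times worse than region A, although the customers are just as satisfied.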

Is it bad news for business?

In general, emotion recognition applications for business require specific training data for the AI. For instance, most companies in a particular geographic region would work with a specific range of appearances, while in some countries or locations, such as megacities, the audience can be very diverse for historical reasons.

Moreover, the accuracy of emotion recognition depends on more than the correct representation of different ethnic, age, or gender groups. Last November we took part in an event organized by one of the major banks. We found that there were more people with beards than we had expected, and in situations where audio input is missing, a beard can be a challenge for the emotion recognition algorithm, as it obscures more than a third of the face.

That is why, when adopting emotion recognition applications in a store or a service office, it is important to further train the algorithm on local data. Next-level multimodal recognition and context-based technology will also lend a helping hand, refining the results of the emotion analysis with data from voice and body movements.

Can we avoid bias at all?

Diversity is our power. We feed the machines data that is supposed to make the technology bias-free, but in fact it does not always do so. So what should we do about it?

Like Joy Buolamwini’s well-known Algorithmic Justice League, tech giants have already got on the diversity train. For example, this week IBM released the biggest labeled dataset, containing about 1,000,000 images equally distributed across skin tones, genders, and ages. Google has also been creating tools to help the community make equally fair decisions for all groups of the population. We at Neurodata Lab created Emotion Miner, a digital platform containing affective data extracted from public videos available across the globe, which can be adapted to any specific request, any cultural background, or the needs of local communities.

The world has changed and we should act accordingly. Humanity has created a new force that is changing all industries. But with great power comes great responsibility, so it is in our best interest to teach the machines better.


You are welcome to comment on this article on our Medium blog.