What Is a Multimodal Humanoid Companion for Customer Service?

Everyone knows that we live in the age of Digital Transformation, Big Data, IoT, AI, and so on. Whatever conference you attend, whatever business magazine you read, you cannot escape it. Every business is currently trying to figure out what trajectory the digital trend will take and how it will change when applied to real-life scenarios and cases.

At Neurodata Lab we develop Emotion Recognition software for high-tech industries. We also analyzed some of the trends we were interested in, mostly focusing on certain changes in robotics. Here’s why.

Let’s have a look at Gartner’s report “Top Trends in the Gartner Hype Cycle for Emerging Technologies, 2017”. At the Peak of Inflated Expectations, we find Virtual Assistants and Smart Robots. The same goes for the “Gartner Market Guide for Conversational Platforms, 20 June 2018”. There is a lot of research like that going on right now. So, what does this all mean?

Work with Big Data and “smart systems” has become more and more prominent. And just like with every other trend, the bottleneck was soon revealed: the interface. However smart the system is, however “big” the data, and whatever its quality, the result of its work must be clear to the user, and interacting with such a system must be convenient. Together with Digital Transformation, Big Data, and IoT systems, business will surely demand a new, highly intuitive, ergonomic interface for communicating with people. The problem of interface solutions will come up more often and more acutely, especially when the person involved is an ordinary user rather than a trained specialist.

We asked ourselves — so what is this interface solution exactly?

We searched for the answer in different case studies, CES 2018 in particular, and here they are:

We could go through these cases forever. Above all, they show how much these interface solutions vary and how hard it is to classify and rank them. Sometimes it is even harder to tell what business task they address and how that task is reflected in their functionality.

A document from PwC became a great step toward systematizing the current situation in the industry: “The Relevance and Emotional Lace” (PwC Digital Services, Holobotics Practice). It unites Chatbots, Virtual Assistants, and Humanoid Interactive Robots into one type of interface solution called the Humanoid Companion.

Automated Network Intelligence for Multiple Access. Credit: PWC. Available at: https://www.pwc.com/it/it/services/consulting/holobotics/docs/holobotics.pdf

Then the solution is subdivided into two subtypes: Physical Robotics and Online Robotics.

It should be noted that these market segments, although they pursue essentially the same goal (the creation of a high-quality ergonomic interface solution), are now sharply divided.

Specialized manufacturers, suppliers, and analysts often have to deal with this division. The division into subtypes in the PwC document is based on what hardware is the core of the solution: a robot, a computer, etc.

We suggest introducing another aspect that might give a clearer and more transparent picture of this type of solution: the type (modality) of human-machine interaction that the interface solution includes.

Let’s dive into the details. It’s hard to communicate with words alone! There are 4 levels of interaction that are convenient and ergonomic for a person: words, voice, face, and body. When we communicate, we engage either all of these modalities or just some of them. The same goes for interface systems: there are lots of solutions that address the same agenda, yet they differ in the modalities they include. Here are some examples:

Chatbot: only one modality is used in such solutions, namely words (text interaction).

A: adds a second modality, voice (speech interaction).

Avatar: these are more complex and rarer solutions; for now they can be seen mainly in HR.

There are also some, let’s say, intermediate solutions. For example, a user writes text, while the assistant speaks and moves:

Robot: our favorite. Humanoid robots are multimodal communication agents that work with all 4 modalities: text, voice, face, and body.


Vector by Anki. Credit: Anki.
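
To make the modality-based classification above a bit more concrete, here is a minimal Python sketch. It is only an illustration, not part of any Neurodata Lab product: the names `Modality`, `SOLUTION_TYPES`, and `classify` are hypothetical, and the modality set assigned to Avatar (words, voice, face) is our assumption, since the text above does not list it explicitly.

```python
from enum import Enum


class Modality(Enum):
    """The four interaction levels named in the article."""
    WORDS = "words"
    VOICE = "voice"
    FACE = "face"
    BODY = "body"


# Hypothetical mapping from modality sets to the labels used in the article.
# The Avatar entry is an assumption: the article does not state its modalities.
SOLUTION_TYPES = {
    frozenset({Modality.WORDS}): "Chatbot",
    frozenset({Modality.WORDS, Modality.VOICE}): "Assistant",
    frozenset({Modality.WORDS, Modality.VOICE, Modality.FACE}): "Avatar",
    frozenset(Modality): "Robot",  # all four modalities
}


def classify(modalities):
    """Return the solution label for a collection of modalities.

    Any combination that is not listed above is treated as an
    intermediate solution.
    """
    return SOLUTION_TYPES.get(frozenset(modalities), "Intermediate")


if __name__ == "__main__":
    print(classify([Modality.WORDS]))                  # Chatbot
    print(classify([Modality.WORDS, Modality.VOICE]))  # Assistant
    print(classify(Modality))                          # Robot
    # A user types text while the assistant speaks and moves:
    print(classify([Modality.WORDS, Modality.VOICE, Modality.BODY]))  # Intermediate
```

Using sets of modalities as keys keeps the classification independent of the hardware behind the solution, which is the point of the modality-based view.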

What’s important is that including new modalities opens up new opportunities on the one hand and, on the other, requires additional technology: new AI systems, notably in the field of Emotion AI, that work alongside NLP in interface solutions of this type.

This is exactly what Neurodata Lab deals with. We provide tech products in the sphere of Emotion AI for interface solutions (chatbots, virtual assistants, avatars, and robots).


You are welcome to comment on this article on our blog on Medium.