Progress Report

Construction of an AIoT-based universal emotional state space and evaluation of well-/ill-being states

1. Constructing a Human Emotional State Space by AIoT

Progress until FY2022

1. Outline of the project

In this project, we aim to develop AIoT (AI×IoT) technology that enables the objective estimation of emotional states in daily life on a cloud platform using multidimensional psycho-physiological data (e.g., voice data, physical activity, heart rate, respiratory rate, and recording situations) measured by IoT devices. The project is divided into the following three research topics:

1. Development of human emotion estimation technology with clinical validity using multidimensional psycho-physiological data
Summary: we develop technology with clinical validity that enables the objective estimation of various emotional states on a cloud platform. This is achieved by utilizing multidimensional psycho-physiological data (e.g., voice data, physical activity, heart rate, respiratory rate) measured by IoT devices in daily life.
2. Assessment of psycho-physiological data in patients
Summary: we collect psycho-physiological data from patients with mental disorders to construct a human emotional state space with clinical validity.
3. Development of a translational IoT cloud system
Summary: we develop a cloud-based IoT system capable of acquiring continuous and real-time psycho-physiological data from both humans and animals (mice and rats) in real-world settings on a large scale.

2. Outcome so far

To develop human emotion estimation technology, we first conducted data cleansing on our existing database. Utilizing this refined database, we constructed machine learning models capable of estimating self-annotated emotion scores recorded in daily life.
For emotional state estimation based on spontaneous physical activity data, we developed a multi-task learning model that simultaneously estimates four emotional states (depressive mood, anxiety, positive mood, and negative mood) from local statistics of the physical activity data (>300 individuals with approximately 7,000 recordings). The model achieved a mean absolute error of 0.2 in estimating emotion scores (an error of around 20% on the normalized scores). Additionally, we confirmed that optimizing specific layers of the constructed network for each individual via transfer learning significantly improved the estimation accuracy.
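
As a rough illustration of the error measure quoted above, the following sketch computes a per-emotion and averaged mean absolute error. It assumes that scores are normalized to the range [0, 1], which the report does not state explicitly. The function names, dictionary keys, and all numeric values are hypothetical and are not taken from the project's data or code.

```python
from statistics import mean

# The four emotional states named in the report (key names are hypothetical).
EMOTIONS = ("depressive_mood", "anxiety", "positive_mood", "negative_mood")


def mean_absolute_error(predicted, observed):
    """Mean absolute difference between paired scores."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean([abs(p - o) for p, o in zip(predicted, observed)])


def per_emotion_mae(predictions, annotations):
    """Return the MAE for each emotion plus the average over emotions."""
    errors = {
        e: mean_absolute_error(predictions[e], annotations[e]) for e in EMOTIONS
    }
    errors["average"] = mean(list(errors.values()))
    return errors


# Made-up self-annotated scores, assumed to be normalized to [0, 1].
annotations = {
    "depressive_mood": [0.0, 0.5, 1.0, 0.5],
    "anxiety": [0.5, 0.5, 0.0, 1.0],
    "positive_mood": [1.0, 0.75, 0.5, 0.25],
    "negative_mood": [0.0, 0.25, 0.5, 0.75],
}
# Made-up model outputs for the same four recordings.
predictions = {
    "depressive_mood": [0.25, 0.5, 0.75, 0.25],
    "anxiety": [0.25, 0.75, 0.25, 0.75],
    "positive_mood": [0.75, 0.75, 0.5, 0.5],
    "negative_mood": [0.25, 0.25, 0.25, 0.5],
}

mae = per_emotion_mae(predictions, annotations)
for name, value in mae.items():
    print(f"{name}: {value:.4f}")
```

Under this assumption, an averaged value of 0.2 would mean that estimates deviate from the self-annotated scores by about one fifth of the full score range on average.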

For emotion estimation based on speech data, we also developed a multi-task learning model to simultaneously estimate the nine emotions assessed by the Depression and Anxiety Mood Scale (DAMS) questionnaire: vigorous, gloomy, concerned, happy, unpleasant, anxious, cheerful, depressed, and worried. The model used high-dimensional features extracted from 10-second speech recordings (approximately 20,000 recordings) as input. By incorporating personalization layers into the network structure, we successfully estimated the nine emotions, with an average concordance correlation coefficient of 0.55 and a maximum of 0.61.
Our results are comparable to the highest accuracy achieved by models built on data acquired in well-controlled environments (e.g., laboratory settings). This highlights the significance of our findings, as we achieved a comparable level of accuracy using data from everyday life.
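
If the agreement metric reported for the speech model is Lin's concordance correlation coefficient (an assumption; the report does not give its formula), it can be computed as in the sketch below. Unlike the Pearson correlation, this coefficient also penalizes differences in mean and scale between estimated and annotated scores. The function name and the sample values are hypothetical.

```python
def concordance_correlation(predicted, observed):
    """Lin's concordance correlation coefficient, using population moments."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed)) / n

    denominator = var_p + var_o + (mean_p - mean_o) ** 2
    if denominator == 0:
        raise ValueError("undefined when both series are the same constant")
    return 2 * cov / denominator


# Made-up scores for illustration only.
annotated = [1.0, 2.0, 3.0, 4.0]
identical = [1.0, 2.0, 3.0, 4.0]   # perfect agreement
shifted = [2.0, 3.0, 4.0, 5.0]     # perfectly correlated, but biased by +1

ccc_identical = concordance_correlation(identical, annotated)
ccc_shifted = concordance_correlation(shifted, annotated)
print(ccc_identical)          # 1.0
print(round(ccc_shifted, 3))  # 0.714
```

The shifted series has a Pearson correlation of 1 with the annotations, yet its concordance is lower, because a constant bias counts against agreement.
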
Regarding the assessment of psycho-physiological data in patients, we designed a research plan and submitted an application form to the ethics committee.
For the development of a translational IoT cloud system, we developed firmware for acceleration and photoplethysmography data processing on the ring-type wearable device. Additionally, we started integrating the ring-type device with our existing IoT cloud system (both API and modem-type gateway integration). Furthermore, we conducted experiments to validate the feasibility of adapting the ring-type device for use with mice. Based on the results of these experiments, we designed a prototype tailored for animal use.

3. Future plans

  • We plan to improve human emotion estimation by constructing multi-modal learning models that leverage various physiological signals, as well as enhancing feature extraction methods.
  • We will gather clinical data and utilize them to develop emotion estimation models with clinical validity.
  • We will continue to develop and improve the integration of the ring-type device with our existing IoT cloud system.
  • We will proceed with the development of a device designed for animal use and conduct validation studies on it.

(NAKAMURA Toru: Kobe University
YAMAMOTO Yoshiharu: The University of Tokyo)