Customer Emotion Recognition
Through Facial Expression
by
Hoa T. Le
Bachelor of Information Technology
Thai Nguyen University of Information and
Communication Technology – Vietnam, 2012
A Thesis Proposal Submitted to the School of Graduate Studies
in Partial Fulfillment of the Requirements for the Degree of
Master of Science in Computer Science
Mapúa Institute of Technology
June 2016
ACKNOWLEDGEMENTS
The Author would like to express her sincere gratitude to God and to other significant
persons for giving her the opportunity to complete this study;
To the greatest Adviser, Sir Larry A. Vea, for the continuous support of this Master's
thesis study and related research, for his patience, motivation, and immense knowledge. His
guidance brought this research to completion;
To the Thesis Committee, Dean Kelly Balan, Sir Joel De Goma, and Sir Aresh
Saharkhiz, for their time, insightful comments and encouragement, and for the hard questions
which prompted the author to widen and improve her research from various perspectives;
To the School of Graduate Studies, Dr. Jonathan Salvacion, and Sir Omar Ombergado,
for their instructions on completing the format of this paper and the other requirements;
To Ms. Grace Panahon, Star Circle manager, and Ms. Rizza Faustino, for their help in
obtaining permission to gather data in the stores;
To the Editor, for the time spent patiently checking for errors and reviewing this
manuscript;
To the Family, Parents, Brother and Sister-in-law, for the support they have provided
throughout the author's life;
To the Friends and Housemates, especially Jocel Marie T. Gebora, for the support,
provision of food, and prayers that helped bring this thesis to completion.
Hoa T. Le
TABLE OF CONTENTS
TITLE PAGE
APPROVAL PAGE
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
Chapter 1: INTRODUCTION
Chapter 2: REVIEW OF RELATED LITERATURE
Emotion Typologies
Customer Emotion
Expression of Interest
Expression of Happiness
Expression of Sadness
Expression of Boredom
Expression of Surprise
Facial Affect Analysis
Microsoft Kinect SDK and Face Tracking Outputs
Kinect for Xbox 360 Face Tracking Outputs
Kinect v2 – High Definition Face Tracking
Comparison of Face Tracking Results Between Kinect v1 and Kinect v2
Piecewise Bezier Volume Deformation
Candide-3
Classifiers for Emotion Detection
Related Works
Chapter 3: CUSTOMER EMOTION RECOGNITION THROUGH FACIAL EXPRESSION
Abstract
Introduction
Methodology
Research Paradigm
Methodology Parameters
Data Collection
Gathering Setup
Feature Extraction and Annotation
Feature Selection
Annotation
Training Classifiers
Model Testing
Prototype Development
Prototype Testing
Real World Testing
Analysis of the Results
Machine Learning and Classification
Results and Discussion
Dataset Description
Animation Unit Interpretation
Annotation Results
Correlation Between the AUs
Test Machine
Definition of Terms
Model Development
Model Testing
Model Performance of thirty-three (33) customers of Kinect 2
Feature Selection
Classifier Analysis
Prototype Testing Result
Real World Testing
Conclusion
References
Chapter 4: CONCLUSIONS
Chapter 5: RECOMMENDATIONS
REFERENCES
LIST OF TABLES
Table 1: Basic Emotions
Table 2: The Angles are expressed in Degrees
Table 3: Action Units [AUs] which represent “deltas” from the neutral shape of the face
Table 4: Shape Units [SUs] which determine head shape and neutral face
Table 5: Face Shape Animations Enumeration
Table 6: Kinect v1 and Kinect v2 Face Tracking Outputs
Table 7: Emotion Behaviors
Table 8: Instances in Kinect v1 and v2 dataset
Table 9: Animation Unit Interpretation (Microsoft) for Kinect 1
Table 10: Animation Unit Interpretation (Microsoft) for Kinect 2
Table 11: AUs detected from Sample Face by Kinect 1
Table 12: AUs detected from Sample Face by Kinect 2
Table 13: Features Observed by the Dataset
Table 14: Comparison of Magnitudes of “Happy”, “Interest”, “Bored”, “Surprise” and “Sad” in the Dataset
Table 15: AUs Correlation
Table 16: Selected features using CfsSubsetEval and BestFirst
Table 17: Accuracy result by using CfsSubsetEval and BestFirst Kinect 2
Table 18: Accuracy result by using CfsSubsetEval and BestFirst Kinect 1
Table 19: Base Classifiers of the Random Committee
Table 20: Movements considered by the Classifier
Table 21: Movements considered by the Classifier
Table 22: New Patterns Discovered of Customer’s Affect via the Notable Features
Table 23: Prototype Testing Results
LIST OF FIGURES
Figure 1: Camera Space
Figure 2: Kinect-1-vs-Kinect-2-Tech-Comparison
Figure 3: Tracked Points
Figure 4: Head Pose Angles
Figure 5: Candide-3 face model
Figure 6: The Conceptual Framework
Figure 7: Research Paradigm
Figure 8: Star Circle, Starmall, Alabang
Figure 9: Camera Set-Up
Figure 10: Setup for Kinect Sensor Captures Full Body
Figure 11: Setup for Two (2) Kinect Sensors
Figure 12: 3D Face Mask
Figure 13: Tracked Face
Figure 14: Annotation of Videos
Figure 15: Sample Face captured by Kinect 1
Figure 16: Sample Face Captured by Kinect 1
Figure 17: Accuracy of Model Development Results
Figure 18: Kappa of Model Development Results
Figure 19: Accuracy of Model Testing Results
Figure 20: Kappa Statistic of Model Testing Results
Figure 21: Accuracy of Model Testing
Figure 22: Kappa of Model Testing
Figure 23: A section of one of the Random Committee base classifiers
ABSTRACT
Products evoke positive or negative emotions in customers. Customers with negative emotions
toward a product are likely to reject it, while those with positive emotions toward the
product are enticed to buy it. Some studies have already examined customer emotion through
facial expression using ordinary cameras. In this study, the Researcher aimed to develop
models that recognize customers’ emotions using the Kinect sensor v1 and the newer Kinect
sensor v2, and to compare the two sensors in terms of recognition rates. The two sensors were
placed one on top of the other and simultaneously recorded videos of the customers, from which
facial features were extracted. Each instance of the extracted features was then labeled with
the corresponding observed emotion of the customer from the recorded video. The resulting
datasets were then processed using several classifiers. Results showed that Kinect sensor v2
performed better than Kinect sensor v1. In prototype testing, Kinect v2 achieved an average
kappa statistic of 0.67371, in the “moderate” to “good” agreement range, compared with an
average of 0.5428 for Kinect v1.
Keywords: Emotion Recognition, Facial Expressions, Microsoft Kinect, Classifications,
Affective Computing
Chapter 1
INTRODUCTION
Human beings have been blessed with the ability to live according to their feelings,
emotions and rationale. Emotions are related to behavior, decision-making and relationships,
and that is how emotions affect people’s lives. The study of emotion is one of the most
important fields of research today, and it is the field this study explores.
The study of emotion, feeling and affect presents a considerable challenge for
researchers because these terms are poorly differentiated. Although feeling, emotion and
affect are routinely used interchangeably, it is important not to confuse affect with feelings and
emotions. As Massumi defined it, “feelings are person-centered, biographical and conscious
phenomena”; “emotions are social expressions of feelings and affect is influenced by culture”.
It is a preconscious phenomenon which is capable of becoming conscious upon recall; and
affects are prepersonal, that is, affect exists outside of consciousness, before personal
self-awareness develops [20].
Affective computing was first popularized by Rosalind Picard’s book “Affective
Computing”, which called for research into the automatic sensing, detection and interpretation
of affect and identified its possible uses in human-computer interaction (HCI) contexts [54].
“Affective computing is the study and development of systems and devices that can recognize,
interpret, process, and simulate human affects”.
Automatic affect sensing has attracted a lot of interest from various fields and research
groups, including psychology, cognitive science, linguistics, computer vision, speech
analysis, and machine learning. Progress in automatic affect recognition depends on
progress in all of these seemingly disparate fields. Affective computing has grown and
diversified over the past decades. It now encompasses automatic affect sensing, affect
synthesis and the design of emotionally intelligent interfaces.
Emotional factors are as important as the classic functional aspects of a product or service
[42]. Products can evoke a wide range of emotions, both negative and positive. These emotions
are expressed differently depending on the situation. Negative emotions stimulate individuals
to reject the object, whereas positive emotions stimulate individuals to accept it [38].
Positive emotions stimulate product purchase intentions. This means that products that evoke
positive emotions yield positive results from both the business and the consumer perspective.
Thus, it is indisputably worthwhile to design products that evoke positive emotions [47].
Recent advances in image analysis and pattern recognition open up the possibility of
automatic detection and classification of emotional and conversational facial signals. One
area that could use advanced facial expression recognition technology is customer
satisfaction measurement. A published study [33] proposed a system that measures
the satisfaction level of new customers during the registration process. The expression of
a customer being served at the counter is captured to evaluate that customer’s satisfaction.
Another study, by Shergill [35], described a computerized intelligent sales
assistant that gives sales personnel the ability to allocate their time. However, this is still a
conceptual framework. The authors used an in-house camera and computer software developed
by the team to classify the different facial expressions of purchasers while they shop. Based on
the result, the system gave information about shopper behavior and suggested interesting
products.
Although there are already various studies on the application of facial expression
recognition systems, one previous study focused more on academic emotions [33], while the
other is still a conceptual framework [35]; both focused on tracking a single person, although
they target the business realm. This paper addresses what is lacking in previous papers by
providing a model of a facial expression recognition system for the business field. Another
goal of this research is to define the various direct emotions of customer-product interactions.
Based on the background and the research gaps provided, the main objective of this
study is to develop a model that recognizes customers’ emotions through facial expressions
using Kinect sensor v1 and v2. The Researcher aims to extract and determine notable facial
features that can be used to recognize customers’ emotions. A suitable classification algorithm
is used to provide the highest prediction rate. The model is then validated by embedding it in
a prototype, and the performance of Kinect v1 and Kinect v2 is compared based on the results.
Being exposed every day to human interactions, the Researcher observed that emotions play
a vital role in relationships, academe and work. She was therefore curious about how emotions
play a role in the business world, or whether emotions really have a place in it. During the
development of this study, she came up with several questions that should be
answered when this study is completed:
How is facial expression related to customers’ emotions?
What classification algorithms yield the best recognition rate?
How does the prototype help both business companies and customers?
What are the differences between the facial features extracted from Kinect v1 and
v2?
Based on its target goals, this study has a significant impact on computer science
and the business world. That is, this paper incorporates affective computing by looking for a
suitable affective intelligent model and implementing it in a facial recognition system prototype,
which benefits both consumers and business companies. It also makes a significant
contribution by discovering new facial features and patterns that can be extracted from Kinect
v2. These patterns can be used in future studies. This might also pave the way for new
approaches to business marketing and advertising.
Since business companies want to optimize their resources, this study helps
them save time through its high recognition rate. That is, it allows them to find out what
customers truly feel about the products. Thus, direct sales staff can easily identify where a
sale is more likely, while at the same time providing customer satisfaction.
The research focuses on the field of facial emotion recognition for Filipino customers.
It covers positive and negative states of the customer’s face: interest, happiness, surprise,
boredom and sadness. The study planned to use two (2) parallel cameras: (1) Kinect for Xbox 360
and (2) Kinect for Xbox One v2 with a separate adapter, or Kinect for Windows v2 (Kinect v2).
Kinect for Xbox 360 can track only one face, while Kinect v2 can track one to six people
simultaneously within its range. The analysis of facial expressions in the study is limited to
thirty (30) respondents and a small area of coverage for detecting the target facial expressions.
The rest of the paper is organized as follows: Chapter 2 gives a background on facial
expressions and how it motivates the paper’s experiment. Chapter 3 discusses the methodology,
results and discussion. The conclusions are presented in Chapter 4 and the recommendations
in Chapter 5.
Chapter 2
REVIEW OF RELATED LITERATURE
Since the early 1970s, Ekman and his colleagues have performed extensive pioneering
studies of human facial expressions. They found evidence to support
universality in facial expressions. Their studies indicated that there are six (6) universally
recognized prototypes of facial expressions: happiness, anger, disgust, sadness, fear, and
surprise [6]. They studied facial expressions in different cultures, including preliterate cultures,
and found much commonality in the expression and recognition of emotions on the face.
These studies showed that the processes of expression and recognition of emotions
on the face are largely common, despite differences imposed by social rules. For example,
Japanese subjects and American subjects showed similar facial expressions while viewing the
same stimulus film. However, in the presence of authorities, the Japanese viewers were more
reluctant to show their real expressions. Babies seem to exhibit a wide range of facial
expressions without being taught, thus suggesting that these expressions are innate. Their work
on action units (AUs), described in the Facial Action Coding System (FACS), inspired
researchers in this field. They used FACS to code facial expressions as combinations of
forty-four (44) facial movements, where movements on the face are described by a set of action
units (AUs), and to manually describe facial expressions using still images of facial
expressions, usually extreme ones. Each AU has some related muscular basis. While much
progress has been made in automatically classifying according to FACS, a fully automated
FACS-based approach for video has yet to be developed. This work inspired the Researcher to
analyze facial expressions by tracking prominent facial features or measuring the amount of
facial movement, usually relying on the “universal expressions” or a defined subset of them. In
the 1990s, automatic facial expression analysis research gained much interest, mainly thanks to
progress in related fields such as image processing (face detection, tracking and recognition)
and the increasing availability of relatively cheap computational power.
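To illustrate how FACS codes an expression as a combination of action units, the short Python
sketch below compares an observed set of active AUs against a few prototype combinations. The
prototype sets (happiness as AU6 + AU12, sadness as AU1 + AU4 + AU15, surprise as
AU1 + AU2 + AU5 + AU26) are combinations commonly cited in the FACS literature, not values taken
from this study, and intensity codes are ignored. The Jaccard overlap score and the function
names are assumptions made only for illustration.

```python
# Illustrative only: prototype AU sets are commonly cited FACS combinations,
# not the features or rules used in this study.
AU_NAMES = {
    1: "inner brow raiser",
    2: "outer brow raiser",
    4: "brow lowerer",
    5: "upper lid raiser",
    6: "cheek raiser",
    12: "lip corner puller",
    15: "lip corner depressor",
    26: "jaw drop",
}

PROTOTYPES = {
    "happiness": frozenset({6, 12}),
    "sadness": frozenset({1, 4, 15}),
    "surprise": frozenset({1, 2, 5, 26}),
}


def jaccard(a, b):
    """Overlap between two AU sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_prototype(active_aus):
    """Return the prototype expression whose AU set best overlaps the observed AUs."""
    observed = frozenset(active_aus)
    return max(PROTOTYPES, key=lambda name: jaccard(observed, PROTOTYPES[name]))


if __name__ == "__main__":
    observed = {6, 12}
    print([AU_NAMES[au] for au in sorted(observed)], "->", closest_prototype(observed))
```

A set-overlap score is used here only because it keeps the example small; a trained classifier
over AU intensities, as pursued later in this study, replaces such hand-written prototypes.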
Work on computer-assisted quantification of facial expressions did not start until
the 1990s. Mase and Pentland (1990) used measurements of optical flow (OF) to recognize
facial expressions. Mase was one of the first to use image processing techniques to recognize
facial expressions. In another study, Lanitis et al. used a flexible shape and appearance
model for face identification, pose recovery, gender recognition and facial expression
recognition. Local optical flow was also the basis of Rosenblum’s work, which utilized a radial
basis function network for expression classification. The study in [11] used an optical flow
region-based method to recognize expressions. Donato et al. and Bartlett [8] tested different
features for recognizing facial AUs and inferring the facial expression in the frame.
Pantic and Rothkrantz [9] identified three (3) basic problems that a facial expression
analysis approach needs to deal with: face detection in a facial image or image sequence, facial
expression data extraction, and facial expression classification. Methods for facial
expression recognition differ in the feature extraction and representation method, the type of
classification, and whether recognition is done from a still image or video. For the facial
feature extraction and representation method, there are three (3) main types: template-based,
feature-based, and appearance-based.
With regard to the method of classification, there are two (2) major types: image-based
and sequence-based. Neural networks (NN), support vector machines (SVM), and
Bayesian networks (BN) belong to image-based classification, while Hidden Markov
Models (HMM) and Dynamic Bayesian Networks (DBN) belong to the latter. A comprehensive review
of these methods can be found in Michel [56], who described an approach to expression
recognition in live video. The results indicated that the properties of a Support Vector Machine
learning system correlate well with the constraints placed on recognition accuracy and speed by
a real-time environment. In 2006, Yu-Li Xue, Xia Mao and Fan Zhang proposed a comprehensive
video facial expression database, which covers the main human emotional facial expressions
and includes twenty-five (25) kinds of pure facial expressions [despair, grief, worry, surprise,
flurry, horror, disgust, fury, fear, doubt, impatience, hate, contempt, disparagement, sneer,
smile, plea, laugh], mixed facial expressions and complex facial expressions (recorded with an
ANC camera, 380 thousand pixels).
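The difference between image-based and sequence-based classification described above can be
sketched in a few lines of Python. The image-based function labels every frame independently,
while the sequence-based function decodes the whole frame sequence with the Viterbi algorithm
on a two-state Hidden Markov Model. The states, the per-frame observation symbols and all
probabilities are invented for illustration and are not taken from this study or from the
cited works.

```python
import math

# Hypothetical two-state HMM; every number below is an illustrative assumption.
STATES = ("neutral", "happy")
START = {"neutral": 0.5, "happy": 0.5}
TRANS = {
    "neutral": {"neutral": 0.9, "happy": 0.1},
    "happy": {"neutral": 0.1, "happy": 0.9},
}
EMIT = {
    "neutral": {"smile": 0.3, "no_smile": 0.7},
    "happy": {"smile": 0.8, "no_smile": 0.2},
}


def classify_framewise(observations):
    """Image-based: label each frame on its own, ignoring neighbouring frames."""
    return [max(STATES, key=lambda s: EMIT[s][o]) for o in observations]


def classify_sequence(observations):
    """Sequence-based: most likely state path under the HMM (Viterbi decoding)."""
    if not observations:
        return []
    # Log-probability of the best path ending in each state at the first frame.
    score = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES}
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for o in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = score[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][o])
        back.append(pointers)
        score = new_score
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    frames = ["smile", "smile", "no_smile", "smile", "smile"]
    print("frame-wise:", classify_framewise(frames))
    print("sequence:  ", classify_sequence(frames))
```

With these made-up numbers, the frame-wise labels flip to "neutral" on the single frame
without a smile, whereas the sequence decoder keeps "happy" throughout because a brief
interruption is less likely than a sustained state.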
The latest research combines base algorithms to improve system performance.
For example, in facial expression recognition based on the Hessian-regularized support vector
machine, experimental results show that the Hessian-regularized (HR) SVM (HesSVM) outperforms
both the standard SVM and the Laplacian-regularized (LR) SVM (LapSVM). As another example, [55]
presented a classifier that operates only in off-line mode, in three (3) isolated steps:
(1) feature extraction from the clip, (2) evaluation using the Candide model, and
(3) recognition of the expressed emotion.
Emotion Typologies
Table 1 illustrates that basic emotion sets typically include two (2) or three
(3) positive emotions. These can be combined to give five (5) basic positive emotions: Joy,
Love, Interest, Anticipation, and Pleasant Surprise. Working with such small sets of basic
emotions enables a shared research focus across academia, which supports comparisons
among research initiatives. The disadvantage is that these sets are an oversimplified
representation of the variety of human emotions. The emotion lexicon of most modern
languages contains hundreds of emotion names [57], and suggesting that all of these are mere
variations of basic emotions marginalizes the richness of the emotional repertoire. Some
researchers have been dissatisfied with the economy obtained with the basic emotion sets. In
agreement with this critical stance, the Researcher proposes that the small set is too
rudimentary to be useful for explaining the variety of positive emotions experienced in
human-product interactions. Each basic emotion encompasses various distinct emotions. For
example, the basic emotion of joy encompasses pride, satisfaction, relief, and inspiration. Love
encompasses sympathy, admiration, kindness, lust, and respect. These are clearly different
emotions, with different eliciting conditions, different feelings, and different behavioral
manifestations.
The set of twenty-five (25) positive emotion types was then assembled to function as a
practical balance between the conciseness of basic emotion sets and the comprehensiveness of
emotion sets.
Table 1: Basic Emotions [47]
Customer Emotion
According to Consoli [42], emotion becomes more important with the
emergence of the principle of consumer pleasure. Emotions represent another form of
language, universally spoken and understood. Emotions are a distinctive element that must be
added to enhance the basic supply of a product or service, and above all they must be designed
and managed with rigor and an ethical spirit. The consumer does not look for a product or
service that merely meets needs and rational processes, but for an object that becomes a center
of symbolic meanings, psychological and cultural, and a source of feelings, relationships and
emotions.
The purchase decisions of customers are driven by two (2) kinds of needs: functional
needs satisfied by product functions and emotional needs associated with the psychological
aspects of product ownership. The products must generate emotions but also present good
functionality (traditional attributes).
Expression of Interest
As discussed in Chapter 1, the customer’s interest in the products is very
important. The questions are: “Is interest an emotion?”, “What are the facial behaviors/features
associated with a real customer?” and “How can it be defined through facial features?” This study
defines the interested expression as a lack of all of the remaining emotions.
According to the researchers of the Changingminds Group [37], drawing on
[38], [39], and [40], interest is expressed as follows:
“Steady gaze of eyes at item of interest [may be squinting]; slightly raised eyebrows;
lips slightly pressed together; head erect or pushed forward.”
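The description quoted above can be read as a simple checklist. The Python sketch below
encodes it as one rule; the cue names, the 0-to-1 scale and the band used for “slightly” are
hypothetical and are not the features or thresholds used in this study.

```python
from dataclasses import dataclass


@dataclass
class FaceCues:
    """Hypothetical per-frame cues; magnitudes are normalised to the range 0..1."""
    gaze_steady: bool             # eyes held on the item of interest
    brow_raise: float             # 0 = neutral brows, 1 = fully raised
    lip_press: float              # 0 = relaxed lips, 1 = firmly pressed
    head_erect_or_forward: bool   # head upright or pushed toward the item


def looks_interested(cues, slight=(0.1, 0.5)):
    """Check the quoted cues: steady gaze, slightly raised brows,
    slightly pressed lips, and head erect or pushed forward.

    `slight` is an assumed band for what counts as "slightly".
    """
    low, high = slight
    return (
        cues.gaze_steady
        and low <= cues.brow_raise <= high
        and low <= cues.lip_press <= high
        and cues.head_erect_or_forward
    )


if __name__ == "__main__":
    print(looks_interested(FaceCues(True, 0.2, 0.3, True)))
    print(looks_interested(FaceCues(True, 0.9, 0.3, True)))
```

A fixed rule of this kind is only a reading aid for the quoted description; it does not
replace the learned classifiers discussed elsewhere in this study.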