VIETNAM NATIONAL UNIVERSITY, HANOI
COLLEGE OF TECHNOLOGY
ANH DUC NGUYEN
IMPROVING THE 3D TALKING HEAD
FOR USING IN AN AVATAR
OF VIRTUAL MEETING ROOM
Branch: Information Technology
Code: 1.01.10
MASTER THESIS
Supervisor: Dr. The Duy Bui
Hanoi, November 2006
Contents
List of Figures
Chapter 1 - Introduction
1.1 The avatar in the virtual meeting room
1.2 Structure of this thesis
Chapter 2 - The 3D animated talking head
2.1 A muscle based 3D face model
2.2 Combination of facial movements on a 3D talking head
2.3 From emotions to emotional facial expressions
2.4 Conclusion
Chapter 3 - OpenGL and JOGL overview
3.1 OpenGL overview
3.1.1 Immediate Mode and Retained Mode (Scene Graphs)
3.1.2 OpenGL history
3.1.3 How does OpenGL work?
3.1.4 OpenGL as a state machine
3.1.5 Drawing geometry
3.2 JOGL overview
3.2.1 Introduction
3.2.2 Developing with JOGL
3.2.3 Using JOGL
3.3 Conclusion
Chapter 4 - Improving lip-sync ability
4.1 Introduction
4.2 Previous work
4.3 FreeTTS and Mbrola
4.3.1 FreeTTS
4.3.2 Mbrola
4.4 The improved lip model
4.5 Conclusion
Chapter 5 - Adding the hair and eyelashes models
5.1 Introduction
5.2 The Hair model
5.2.1 Introduction to VRML
5.2.2 Our hair model
5.3 The Eyelashes model
5.4 Conclusion
Chapter 6 - Implementation and illustrations
6.1 Implementing the face model
6.1.1 Structure of the system
6.1.2 Some improvements
6.2 Face model illustrations
Chapter 7 - Conclusion
Future research
References
List of Figures
2.1: The original 3D face model: (a): The face mesh with muscles; (b): The face after rendering
2.2: System overview
2.3: Combination of two movements in the same channel
2.4: The activity of Zygomatic Major and Orbicularis Oris before (top) and after (bottom) applying combination algorithm
2.5: The emotion-to-expression system
2.6: Membership functions for emotion intensity (a) and muscle contraction level (b)
2.7: Basic emotions: neutral, Sadness, Happiness, Anger, Fear, Disgust, Surprise (from left to right)
3.1: Software implementation of OpenGL
3.2: Hardware implementation of OpenGL
3.3: A simplified version of OpenGL pipeline
3.4: The structure of an application using JOGL
4.1: FreeTTS Architecture
5.1: Dividing a polygon (a) to triangles (b)
5.2: Importing the hair model: (a): the original head; (b): the head with the imported hair model; (c): the head with the imported and fine tuned hair model
5.3: Some other imported and fine tuned hair models
5.4: The open (a) and close eyes (b) without and with eyelashes
5.5: The face without (a) and with (b), (c) the hair and eyelashes models
6.1: The main interface of our program
6.2: The face model displays Happiness emotion with maximum intensity
6.3: The face model displays Surprise emotion with maximum intensity
6.4: The combination of two emotions: Happiness and Surprise
6.5: The effect of left Zygomatic Major muscle’s contraction at maximum level on the face model
6.6: The face model from different view points
6.7: Increasing surprise
6.8: The hair model after being imported
6.9: The hair model after being fine tuned
6.10: Some other hair models
6.11: Closing the eyes
6.12: The face model attach to the body
6.13: Our face model embeds into other project
Chapter 1
Introduction
1.1 The avatar in the virtual meeting room
Virtual Meeting Rooms (VMRs) are 3D virtual simulations of meeting rooms in which
modalities such as speech, gaze, distance, gestures and facial expressions can be
controlled (a VMR project in Twente). Rapid progress in computer graphics and
embodied conversational agents has made VMRs feasible and useful for various
purposes, which fall into three categories [24]. First, they can serve as a virtual
environment for teleconferencing, a real-time means of communication for remote
meeting participants [18]. Using VMRs reduces the amount of data that must be
sent to and displayed on the remote client. They also overcome some features that
are problematic in real meetings or in traditional video-based conferences. For
example, participants can adapt the Virtual Environment to their own preferences
without disturbing other people, or choose a view from any seat in the VMR so that
they feel comfortable during the meeting [17]. Second, VMRs can be used to replay
the content of a recorded meeting in different ways or to present multimedia
information about it. Information can be recorded directly from participants'
behavior in real meetings (e.g. tracking of head or body movements, voice). These
presentations can be used as a 3D summary of the real meetings or for evaluating
the annotations and results obtained by machine learning methods. Third, because
Virtual Environments allow various independent factors (voice, gaze, distance,
gestures, and facial expressions) to be controlled, the influence of these factors
on features of social interaction and social behavior can be studied. Conversely,
the effect of social interaction on these factors can also be studied in Virtual
Environments.
In the VMR environment, each participant is represented by an avatar. An avatar is
an embodied conversational agent that simulates the behaviors and movements of
the participant. The avatar typically consists of a talking head, which can speak
and display lip movements during speech, emotional facial expressions and
conversational signals, and a body, which can display the participant's gestures.
Most importantly, the avatar of each participant must be believable to the other
participants. An avatar is believable if it reproduces the appearance and expresses
the characteristics of the participant, and if its actions and reactions are as true
to life as those of the person it represents.
The talking head model plays an important role in creating a believable avatar. It
is used not only to display facial movements and expressions but also to
distinguish the avatar from others and to express the personality of the
participant. Creating a talking head model suitable for avatars in VMRs raises
several problems. First, the talking head must be simple enough to be animated in
real time while still producing realistic, high-quality facial expressions. Second,
the talking head must not only create facial movements such as conversational
signals and emotional expressions, but also combine them and resolve the conflicts
between them. Third, the talking head must look like a real head, which means it
needs additional models attached to it, such as hair, tongue and eyelash models.
In this thesis, we take the talking head model from [3], improve it, and use it for
avatars in VMRs. We study the model carefully to identify its advantages as well as
its disadvantages. The advantages are retained, while missing functions are added
and weaknesses are improved. We replace the rendering method of the head with a
new one to increase the animation speed. The synchronization between audible and
visible speech is also improved. We add hair and eyelash models to make the head
look more realistic. The improved model can be used for avatars in VMRs and can
also be embedded into other projects.
1.2 Structure of this thesis
In Chapter 2, we introduce the 3D animated talking head [3] on which our work is
based. This head produces realistic facial expressions with real-time animation
on a personal computer. It can display several types of facial movement at once,
such as eye blinking, head rotation and lip movement, and, most importantly, it
can generate emotional facial expressions from emotions. We briefly describe how
this muscle based 3D face model is created, the techniques it uses to produce
animation, the combination of facial movements, and how emotional facial
expressions are generated from emotions.
In Chapter 3, we present an overview of OpenGL and JOGL (Java bindings for
OpenGL). OpenGL is the industry-standard and premier environment for developing 2D
and 3D graphics applications. Its capabilities allow developers to display
compelling graphics and to build applications that require maximum performance
(OpenGL project). JOGL is a new OpenGL interface for the Java platform. Among the
available bindings, it offers an open-source, clean and minimalist API.
In Chapter 4, we give an overview of FreeTTS and Mbrola. FreeTTS is a robust
text-to-speech system that we use to obtain phonemes and timing information from a
text. The phoneme string is used to generate lip movements during speech.
FreeTTS supports Mbrola, a speech synthesizer based on the concatenation of
diphones. We use Mbrola as an output thread of FreeTTS to produce the synthetic
audio. We also present a method to improve the lip-sync capability. The original
head can speak, but under some conditions the speech from the speaker is not
synchronized with the lip movements on the screen. In addition, we may want
the head to express different emotions depending on the sentence currently being
spoken, so we need to know exactly when each sentence is spoken in order to
generate suitable emotions.
The original head has no hair or eyelashes. We add these parts to make it look
like a real head and more attractive. In Chapter 5, we present the method for
applying a hair model to the head and the way we draw eyelashes for the eyes.
Available hair models can be attached to the head model without much human
intervention. The eyelashes are a small part of the face, but without them the
eyes may not look real. The eyelashes also improve the emotional expressiveness of
the eyes when the eyes flutter. We describe some problems in creating the
eyelashes, and how to fix them to the eyelids so that they move with the eyelids
when the eyes close or open.
In Chapter 6, we describe the implementation of the face using Java and JOGL. We
also describe our improvement to the rendering method of the talking head, which
uses the new methods and mechanisms introduced in OpenGL 1.5. This method
increases the animation speed significantly. Some illustrations of our 3D talking
head model are also presented in this chapter.
Chapter 2
The 3D animated talking head
2.1 A muscle based 3D face model
The face model consists of a polygonal face mesh and a B-spline surface for the
lips. The face mesh data was first obtained from a 3D scanner and then processed to
improve the animation performance while keeping the high quality of the model.
The processing has two phases. In the first phase, the number of vertices and
polygons was reduced in non-expressive parts but maintained in the expressive
parts, which are the areas around the eyes, the nose, the mouth and the forehead.
At the end of this phase, the face mesh contains 2,468 vertices and 4,746 polygons.
This is small enough for real-time animation but still preserves a high level of
detail in the expressive parts of the face. In the second phase, the face model was
divided into eleven regions. The five regions on the left side are the left lower
face, left middle face, left lower eyelid, left upper eyelid and left upper face.
There are five corresponding regions on the right side, and the last region is at
the back of the head. This division not only prevents unwanted artifacts caused by
the displacement of vertices in regions that should not be affected by a muscle
contraction, but also increases the animation speed.
The lip model is a B-spline surface with a 24 x 6 grid of control points. The lips
are deformed by moving the control points, and the B-spline surface is polygonalized
so that it connects with the face mesh for rendering. A B-spline surface has the
advantage of producing a smooth surface, but it cannot produce wrinkles and must be
polygonalized before rendering. If the number of control points is too large, it
requires heavy computation. Given this advantage and these disadvantages, a
B-spline surface is suitable for modeling a small part of the face such as the lips.
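As a rough illustration of how a point on a B-spline surface follows its control points, the sketch below evaluates one coordinate of a uniform cubic B-spline patch. The cubic degree, the uniform knots, and the names BSplinePatch, basis and evaluate are assumptions made for illustration; the text does not state which degree or knot vector the 24 x 6 lip grid uses.

```java
/** Minimal sketch: one patch of a uniform cubic B-spline surface (degree is an assumption). */
final class BSplinePatch {
    /** The four uniform cubic B-spline basis weights for a local parameter t in [0, 1]. */
    static double[] basis(double t) {
        double t2 = t * t;
        double t3 = t2 * t;
        return new double[] {
            (1 - 3 * t + 3 * t2 - t3) / 6.0,
            (4 - 6 * t2 + 3 * t3) / 6.0,
            (1 + 3 * t + 3 * t2 - 3 * t3) / 6.0,
            t3 / 6.0
        };
    }

    /**
     * Evaluates one coordinate (x, y or z) of the patch defined by a 4 x 4 block
     * of control values ctrl[i][j] at the local parameters (u, v).
     */
    static double evaluate(double[][] ctrl, double u, double v) {
        double[] bu = basis(u);
        double[] bv = basis(v);
        double sum = 0.0;
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                sum += bu[i] * bv[j] * ctrl[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] ctrl = new double[4][4];
        ctrl[1][1] = 1.0; // hypothetical: raise a single control value
        System.out.println(evaluate(ctrl, 0.5, 0.5));
    }
}
```

Each surface point depends on only a 4 x 4 neighborhood of control points, which is why moving a few control points deforms the lips locally, and why a larger grid increases the work needed to polygonalize the surface.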
Almost all of the 19 muscles used on the face to generate animation are vector
muscles, except the Orbicularis Oris, which drives the mouth, and the Orbicularis
Oculi, which drives the eye. The vector muscle of the face is an improved version of
the vector muscle model from [28]. In addition, a mechanism to generate wrinkles
and bulges is added to increase the realism of the facial expressions, and a
technique to reduce the computation is introduced to enhance the animation
performance. The Orbicularis Oris muscle is parameterization-based and is adopted
from [12]. The Orbicularis Oculi has two parts: the Pars Palpebralis, which opens
and closes the eyelid, is adopted from [22], and the Pars Orbitalis, which squeezes
the eye, is adopted from [28]. The jaw and eyeball rotation algorithms are improved
versions of the ones proposed in [22]. The mouth now has a natural oval look, and
the eyes can track a target. Eye movement is independent of facial muscle movements,
and the eyes cannot rotate to impossible positions. All muscles have an intensity
range from 0 to 1, and the step value between two adjacent muscle contractions is
0.2. This step value was determined by trial-and-error experiments. It is small
enough to keep the facial animation smooth and large enough to reduce the
computation time. Figure 2.1 shows the original face from [3].
Figure 2.1: The original 3D face model
(a): The face mesh with muscles; (b): The face after rendering
2.2 Combination of facial movements on a 3D talking head
The system takes marked-up text as input, in which each facial movement (except
lip movement while talking) is defined as a group of muscle contractions that share
the same function, start time, onset, offset and duration. Lip movements are
generated separately inside the system, based on the phonemic representation of the
input text.
Figure 2.2: System overview
There are several types of facial movement. They include lip movements when
talking, conversational signals, emotion displays, gaze and head movements, and
manipulators that satisfy the biological requirements of the face. All of them can
occur at the same time, and because they are driven by the muscle models, muscles
can conflict when two or more movements happen at once. Conflicting muscles are
muscles that cannot contract at the same time. For example, when we smile, the
Zygomatic Major and Minor muscles contract to pull the corners of the lips outward.
If we say “Hello” at the same time, the phoneme “@U” in the word “Hello” requires
the contraction of the Orbicularis Oris muscle, which drives the lips into a tight,
pursed shape. Zygomatic Major (and Minor) and Orbicularis Oris are therefore
conflicting muscles. The face must resolve such conflicts to produce natural
animation.
Each type of facial movement belongs to one channel. There are six channels in the
system: manipulators (eye blinking), lip movements (phoneme), conversational
signals (muscle contractions), emotion displays (expression), gaze movements (eye
movement) and head movements. The combination process has two steps. In the first
step, the movements in each channel are concatenated to generate smooth transitions
between adjacent movements. In the second step, the movements in all channels are
combined and processed to resolve conflicting muscles.
[Plot against time (in seconds) with three curves: first movement, second movement,
and combination of two movements]
Figure 2.3: Combination of two movements in the same channel
[Two plots against time (in seconds)]
Figure 2.4: The activity of Zygomatic Major and Orbicularis Oris before (top) and
after (bottom) applying combination algorithm.
Figure 2.3 shows an example of combining two movements in the same channel. The
muscle activity follows the first movement until time 3, when a stimulus for the
second movement arrives. It then stops following the first movement, releases to
the target value of the muscle in the second movement (0.5), and from there follows
the second movement.
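The same-channel behavior described here can be sketched as a function of time. This is only an illustration under stated assumptions: the linear release, the releaseDuration parameter and the names ChannelConcatenation and concatenate are hypothetical, since the text gives only the switch time (3) and the target value (0.5), not the shape of the release curve.

```java
import java.util.function.DoubleUnaryOperator;

/** Sketch of same-channel concatenation; the linear release and its duration are assumptions. */
final class ChannelConcatenation {
    /**
     * Follows the first movement until switchTime, then releases linearly to
     * target over releaseDuration seconds, then follows the second movement.
     */
    static DoubleUnaryOperator concatenate(DoubleUnaryOperator first,
                                           DoubleUnaryOperator second,
                                           double switchTime,
                                           double target,
                                           double releaseDuration) {
        // Level reached by the first movement when the stimulus arrives.
        final double startLevel = first.applyAsDouble(switchTime);
        return t -> {
            if (t < switchTime) {
                return first.applyAsDouble(t);
            }
            if (t < switchTime + releaseDuration) {
                double progress = (t - switchTime) / releaseDuration;
                return startLevel + progress * (target - startLevel);
            }
            return second.applyAsDouble(t);
        };
    }

    public static void main(String[] args) {
        // Hypothetical constant movements: level 1.0, then level 0.5 from time 3.
        DoubleUnaryOperator combined = concatenate(x -> 1.0, x -> 0.5, 3.0, 0.5, 1.0);
        for (double t = 2.0; t <= 5.0; t += 0.5) {
            System.out.println(t + " -> " + combined.applyAsDouble(t));
        }
    }
}
```

For the result to be continuous, the second movement should start from the target value at the end of the release; the sketch leaves that to the caller.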
Figure 2.4 shows an example of combining two movements in different channels.
Because Zygomatic Major and Orbicularis Oris are conflicting muscles and the
Orbicularis Oris has higher priority, the Zygomatic Major is inhibited when the
Orbicularis Oris is activated (at time 3). However, its activity is adjusted so
that it does not release too fast, which would create an unnatural movement.
The Zygomatic Major activity releases gradually to zero, and only then does the
Orbicularis Oris start contracting.
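One way to picture this inhibition is as a per-time-step update for a pair of conflicting muscles. The sketch below is an assumption-laden illustration, not the algorithm from [3]: the fixed maxRelease per step and the names ConflictResolver and step are hypothetical. It only reproduces the two properties stated in the text, namely that the lower-priority muscle releases gradually and that the higher-priority muscle starts contracting only once the other has reached zero.

```java
/** Sketch of resolving one pair of conflicting muscles; the release limit is an assumption. */
final class ConflictResolver {
    /**
     * Advances one time step.
     *
     * @param lowCurrent    current level of the lower-priority muscle
     * @param lowRequested  level its own channel asks for
     * @param highRequested level requested for the higher-priority muscle
     * @param maxRelease    largest drop allowed per step (hypothetical parameter)
     * @return {lowLevel, highLevel} after the step
     */
    static double[] step(double lowCurrent, double lowRequested,
                         double highRequested, double maxRelease) {
        if (highRequested > 0.0) {
            // Inhibit the lower-priority muscle, but never faster than maxRelease.
            double low = Math.max(0.0, lowCurrent - maxRelease);
            // The higher-priority muscle waits until the other is fully released.
            double high = (low == 0.0) ? highRequested : 0.0;
            return new double[] { low, high };
        }
        return new double[] { lowRequested, 0.0 };
    }

    public static void main(String[] args) {
        double zygomatic = 0.5;
        for (int i = 0; i < 4; i++) {
            double[] levels = step(zygomatic, 0.5, 0.6, 0.2);
            zygomatic = levels[0];
            System.out.println("Zygomatic Major " + levels[0] + ", Orbicularis Oris " + levels[1]);
        }
    }
}
```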
2.3 From emotions to emotional facial expressions
Six emotions are considered universal, meaning that they are consistently
associated with the same facial expressions across different cultures. These
emotions are Happiness, Anger, Surprise, Fear, Disgust and Sadness [34]. Other
emotions shown on the face are considered to be combinations of these six basic
emotions, but more than two emotions rarely occur at the same time. Generating
emotional facial expressions from emotions therefore concentrates on two aspects.
First, the face must display continuous changes in expression depending on the
intensity of the emotion. Second, the face must have a method for combining the
expressions of two emotions. A fuzzy rule-based system suits these requirements
because it can incorporate qualitative as well as quantitative information.
[Diagram with two blocks: Single Expression Mode FRBS and Blend Expression Mode FRBS]
Figure 2.5: The emotion-to-expression system
Two fuzzy rule-based systems are implemented to convert emotion intensities into
muscle contraction levels, which are used to generate emotional expressions on the
3D face model. The first, called “Single Expression Mode”, produces contraction
levels from a single emotion intensity. The second, called “Blend Expression
Mode”, converts two emotion intensity values into muscle contraction levels. The
mode is selected according to the intensities of the emotions felt. When a single
emotion is expressed, the Single Expression Mode is chosen. The Blend Expression
Mode is chosen when more than one emotion is expressed, but only the two highest
emotion intensity values are used (Figure 2.5).
[Plots of the membership functions: (a) for emotion intensity; (b) for muscle
contraction level]
Figure 2.6: Membership functions for emotion intensity (a) and muscle contraction
level (b)
The intensity of each emotion is modeled by five fuzzy sets: VeryLow, Low, Medium,
High and VeryHigh. Similarly, the contraction level of each muscle is described by
five fuzzy sets: VerySmall, Small, Medium, Big and VeryBig. With these fuzzy sets,
the system can encode qualitative descriptions such as “surprise then lift
eyebrows” as well as quantitative descriptions such as “if the level of sadness is
low then draw the eyebrows together; while if the level of sadness is high, then
draw the eyebrows together and draw the corners of the lips down.” The form of the
membership functions shown in Figure 2.6 and the support of each membership
function were determined experimentally.
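To make the idea of five fuzzy sets over an intensity in [0, 1] concrete, here is a minimal sketch of fuzzification. The triangular shape, the evenly spaced peaks at 0, 0.25, 0.5, 0.75 and 1, and the names FuzzySets, triangular and fuzzify are all assumptions for illustration; the actual shapes and supports were tuned experimentally and are not reproduced here.

```java
/** Sketch of fuzzifying an emotion intensity; triangular, evenly spaced sets are assumed. */
final class FuzzySets {
    static final String[] INTENSITY_SETS = { "VeryLow", "Low", "Medium", "High", "VeryHigh" };

    /** Triangular membership: 1 at center, falling to 0 at center - width and center + width. */
    static double triangular(double x, double center, double width) {
        return Math.max(0.0, 1.0 - Math.abs(x - center) / width);
    }

    /** Degree of membership of intensity x (in [0, 1]) in each of the five sets. */
    static double[] fuzzify(double x) {
        double[] degrees = new double[INTENSITY_SETS.length];
        for (int i = 0; i < degrees.length; i++) {
            degrees[i] = triangular(x, i * 0.25, 0.25);
        }
        return degrees;
    }

    public static void main(String[] args) {
        double[] degrees = fuzzify(0.6);
        for (int i = 0; i < degrees.length; i++) {
            System.out.println(INTENSITY_SETS[i] + ": " + degrees[i]);
        }
    }
}
```

With overlapping sets like these, an intensity between two peaks belongs partly to both neighboring sets, which is what lets the expression change continuously as the intensity changes.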
A rule in the Single Expression Mode has the following form:
if Sadness is VeryLow then
muscle 9’s contraction level is VerySmall
muscle 13’s contraction level is VerySmall
muscle 14’s contraction level is VerySmall
muscle 15’s contraction level is VerySmall
muscle 18’s contraction level is VerySmall
A rule in the Blend Expression Mode has the following form:
if Surprise is Low and Fear is Medium then
muscle 9’s contraction level is Small
muscle 10’s contraction level is Small
muscle 16’s contraction level is Small
muscle 3’s contraction level is Medium
muscle 4’s contraction level is Medium
muscle 5’s contraction level is Medium
muscle 17’s contraction level is Medium
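Rules of both forms can be held in one small data structure. The sketch below encodes the two example rules quoted above; the class name FuzzyRule and its methods when, then and isBlend are hypothetical names chosen for illustration, and the muscle numbers and set names are copied from the examples.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a rule: emotion -> intensity set on the left, muscle -> level on the right. */
final class FuzzyRule {
    final Map<String, String> antecedent = new LinkedHashMap<>();
    final Map<Integer, String> consequent = new LinkedHashMap<>();

    /** Adds one condition, e.g. when("Sadness", "VeryLow"). */
    FuzzyRule when(String emotion, String intensitySet) {
        antecedent.put(emotion, intensitySet);
        return this;
    }

    /** Assigns the same contraction level to several muscles. */
    FuzzyRule then(String level, int... muscles) {
        for (int muscle : muscles) {
            consequent.put(muscle, level);
        }
        return this;
    }

    /** A rule with two emotions on the left belongs to the Blend Expression Mode. */
    boolean isBlend() {
        return antecedent.size() == 2;
    }

    static FuzzyRule sadnessVeryLow() {
        return new FuzzyRule().when("Sadness", "VeryLow")
                .then("VerySmall", 9, 13, 14, 15, 18);
    }

    static FuzzyRule surpriseLowFearMedium() {
        return new FuzzyRule().when("Surprise", "Low").when("Fear", "Medium")
                .then("Small", 9, 10, 16)
                .then("Medium", 3, 4, 5, 17);
    }

    public static void main(String[] args) {
        System.out.println(surpriseLowFearMedium().consequent);
    }
}
```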
There are no rules to blend expressions of Happiness and Disgust, as well as
Sadness and Surprise, because there is no evidence that these emotions can happen
concurrently. For these expressions, only the emotion with higher intensity is
expressed.
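The mode selection described in this section can be sketched as follows: keep the emotions with non-zero intensity, take the two strongest, and fall back to the stronger one alone for the two pairs that have no blend rules. The names ExpressionModeSelector, select and canBlend are hypothetical, and treating an intensity of exactly zero as "not expressed" is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of choosing between Single and Blend Expression Mode. */
final class ExpressionModeSelector {
    /** False for the two pairs that have no blend rules. */
    static boolean canBlend(String a, String b) {
        return !(isPair(a, b, "Happiness", "Disgust") || isPair(a, b, "Sadness", "Surprise"));
    }

    private static boolean isPair(String a, String b, String x, String y) {
        return (a.equals(x) && b.equals(y)) || (a.equals(y) && b.equals(x));
    }

    /** Returns no emotion, one emotion (Single mode) or two emotions (Blend mode). */
    static List<String> select(Map<String, Double> intensities) {
        List<Map.Entry<String, Double>> active = new ArrayList<>();
        for (Map.Entry<String, Double> entry : intensities.entrySet()) {
            if (entry.getValue() > 0.0) {
                active.add(entry);
            }
        }
        // Strongest emotion first.
        active.sort((p, q) -> Double.compare(q.getValue(), p.getValue()));
        List<String> chosen = new ArrayList<>();
        if (active.isEmpty()) {
            return chosen;
        }
        String strongest = active.get(0).getKey();
        chosen.add(strongest);
        if (active.size() > 1) {
            String runnerUp = active.get(1).getKey();
            if (canBlend(strongest, runnerUp)) {
                chosen.add(runnerUp);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, Double> felt = new LinkedHashMap<>();
        felt.put("Surprise", 0.3);
        felt.put("Fear", 0.6);
        felt.put("Anger", 0.1);
        System.out.println(select(felt)); // prints [Fear, Surprise]
    }
}
```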
Figure 2.7 shows the six basic emotional facial expressions, generated from the
six corresponding emotions with every intensity set to 1 (the maximum value). The
quality of the facial expressions is improved by using psychologically based and
fairly simple fuzzy rules rather than other graphics algorithms, complicated
formulas or intensively trained neural networks.
Figure 2.7: Basic emotions: neutral, Sadness, Happiness, Anger, Fear, Disgust,
Surprise (from left to right)
2.4 Conclusion
This 3D face model is suitable for the avatar of a virtual meeting room because it
can display facial expressions with real-time animation. Besides verbal movements
(lip movements when speaking), it can display non-verbal behaviors such as eye
blinking and head rotation. It can also generate emotional facial expressions from
emotions and combine different facial movements that are displayed at the same
time. The face can express not only the six built-in basic emotions but also many
other emotions by controlling the muscle models. Thus, participants can
express their own emotions and follow the emotions of others through the faces of
the avatars. They benefit from both verbal and non-verbal communication and have a
new way to find the points of interest in the meeting. Importantly, the face helps
to make the avatar plausible to the participants, so that they feel they are in a
real meeting with real people.
Chapter 3
OpenGL and JOGL overview
3.1 OpenGL overview
3.1.1 Immediate Mode and Retained Mode (Scene Graphs)
There are two types of API for programming real-time 3D applications [32]. The
first type is called retained mode. In retained mode, a description of the objects
and the scene is given to the API, and the graphics package then creates the image
on the screen. All the programmer needs to do is issue commands that change the
position and viewing orientation of the user (also called the camera) or of other
objects in the scene. The structure that is built in this way is called a scene
graph. The scene graph is a data structure that contains all the objects in the
scene and their relationships to one another. Many high-level toolkits and "game
engines" use this approach. The programmer does not need to understand how the
scene is rendered, because the graphics library takes care of rendering the model
or database handed over to it. Java3D is one example of a scene graph API.
The second approach to 3D rendering is called immediate mode. Most retained mode
APIs or scene graphs use an immediate mode API internally to perform the actual
rendering. For example, Java3D uses OpenGL or Direct3D to render the geometry
created by the user. In immediate mode, programmers do not describe the models and
environment at as high a level as in retained mode. Instead, they issue commands
directly to the graphics processor. Each command has an immediate effect that
depends on the current state, and new commands have no effect on rendering
commands that have already been executed. This allows everything to be controlled
at a low level.
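The relationship between the two modes can be illustrated with a toy scene graph whose traversal emits immediate-mode style commands. This is a plain Java model, not Java3D or OpenGL code: the class SceneNode, its methods, and the string "draw" commands are hypothetical stand-ins for real rendering calls.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy scene graph node; rendering walks the graph and emits one command per node. */
final class SceneNode {
    private final String name;
    private final List<SceneNode> children = new ArrayList<>();

    SceneNode(String name) {
        this.name = name;
    }

    /** Retained mode: the application only describes the scene. */
    SceneNode add(SceneNode child) {
        children.add(child);
        return this;
    }

    /** The library, not the application, turns the stored graph into low-level commands. */
    void render(List<String> commands) {
        commands.add("draw " + name);
        for (SceneNode child : children) {
            child.render(commands);
        }
    }

    public static void main(String[] args) {
        SceneNode head = new SceneNode("head")
                .add(new SceneNode("hair"))
                .add(new SceneNode("eyelashes"));
        List<String> commands = new ArrayList<>();
        head.render(commands);
        System.out.println(commands); // prints [draw head, draw hair, draw eyelashes]
    }
}
```

In immediate mode the application would issue the equivalent of each "draw" command itself, in the order it chooses, instead of handing a graph to the library.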
3.1.2 OpenGL history
OpenGL is an industry-standard, cross-platform Application Programming
Interface (API). The specification for this API was finalized in 1992, and the first
implementations appeared in 1993. The forerunner of OpenGL is Iris GL (Graphics
Library), the API that was designed and supported by Silicon Graphics, Inc. To
establish an industry standard, Silicon Graphics collaborated with various graphics
hardware companies to create an open standard, which was named "OpenGL."
So far, seven revisions have added new functionality to the API. The newest
version of the OpenGL specification is 2.1. All newer versions are upward
compatible with earlier versions [4].
- Version 1.1 was finished in 1997 and added support for two important
capabilities: vertex arrays and texture objects.
- The specification for OpenGL 1.2 was released in 1998 and added support
for 3D textures and an optional set of imaging functionality.
- The OpenGL 1.3 specification was completed in 2001 and added support
for cube map textures, compressed textures, multi-textures, etc.
- OpenGL 1.4 was completed in 2002 and added automatic mipmap
generation, additional blending functions, internal texture formats for
storing depth values for use in shadow computations, support for drawing
multiple vertex arrays with a single command, more control over point
rasterization, control over stencil wrapping behavior, and various additions
to texturing capabilities.
- The OpenGL 1.5 specification was published in October 2003. It added
support for vertex buffer objects, shadow comparison functions and
occlusion queries.
- OpenGL 2.0, finalized in September 2004, opened up the processing
pipeline for user control by providing programmability for both vertex
processing and fragment processing. Other features added in 2.0 include
support for multiple render targets, nonpower-of-2 textures, point sprites,
and separate stencil functionality for front- and back-facing surfaces.
- Version 2.1, released in August 2006, added support for revision 1.20 of
the OpenGL Shading Language, non-square matrices, pixel buffer objects and
sRGB textures.
3.1.3 How does OpenGL work?
An OpenGL implementation can be a software implementation or a hardware
implementation. Windows applications can call a Windows API called the Graphics
Device Interface (GDI) to create output on screen, and graphics card vendors
usually supply a driver for the GDI to interface with. A software implementation of
OpenGL takes graphics requests from an application and constructs (rasterizes) a
color image of the 3D graphics. This image is then supplied to the GDI for display
on the monitor. Microsoft has its own OpenGL software implementation, and almost
all modern operating system products from Microsoft include support for OpenGL.
However, SGI and MESA also released software implementations of OpenGL for Windows
that greatly outperformed Microsoft's implementation.
Figure 3.1: Software implementation of OpenGL
An OpenGL hardware implementation usually takes the form of a graphics card
driver. OpenGL API calls from applications are passed to a hardware driver. This
driver does not pass its output to the Windows GDI for display; instead, it
interfaces directly with the graphics display hardware. The more components of
OpenGL are implemented in hardware, the faster the implementation processes calls
from applications and displays images on screen.
Figure 3.2: Hardware implementation of OpenGL
When an application calls OpenGL API functions, the commands are placed in a
command buffer. Vertex data, texture data, etc. are also held in this buffer. When
the buffer is flushed, the commands and data are passed to the “Transformation and
Lighting” step. In this step, the points that describe an object's geometry are
recalculated according to the object's location and orientation. Lighting
calculations are also performed to determine the brightness of the colors at each
vertex. When this stage is finished, the data is passed to the “Rasterization” step
of the pipeline. The rasterizer creates the color image from the geometric, color
and texture data and places the image into the frame buffer. The frame buffer is the
memory of the graphics display device, which means the image is displayed on the
screen. Figure 3.3 shows a simplified view of the OpenGL pipeline. At a lower
level, each box of the diagram contains many more boxes.
Figure 3.3: A simplified version of OpenGL pipeline
3.1.4 OpenGL as a state machine
OpenGL is designed as a state machine [21]. We put it into various states (or
modes), which then remain in effect until we change them. For example, the current
color is a state variable. We can set the current color to black, white, red, or
any other color, and all objects will be drawn with that color until we set the
current color to something else. The current color is only one of many state
variables that OpenGL maintains. Others include the current viewing and projection
transformations, line and polygon stipple patterns, polygon drawing modes,
pixel-packing conventions, positions and characteristics of lights, and material
properties of the objects being drawn.
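The current-color example can be mimicked with a toy state machine in plain Java. This is not OpenGL or JOGL code: the class ToyStateMachine and its methods are hypothetical, and only loosely mirror the way a color-setting call affects every later drawing call until the color is changed again.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of "state stays in effect until changed"; not an OpenGL binding. */
final class ToyStateMachine {
    // OpenGL's initial current color is white; the toy starts the same way.
    private String currentColor = "white";
    private final List<String> drawn = new ArrayList<>();

    /** Changes the state; nothing is drawn by this call. */
    void setColor(String color) {
        currentColor = color;
    }

    /** Draws with whatever color is current at the time of the call. */
    void drawPoint(int x, int y) {
        drawn.add(currentColor + " point at (" + x + ", " + y + ")");
    }

    List<String> drawn() {
        return drawn;
    }

    public static void main(String[] args) {
        ToyStateMachine gl = new ToyStateMachine();
        gl.drawPoint(0, 0);
        gl.setColor("red");
        gl.drawPoint(1, 0);
        gl.drawPoint(2, 0); // still red: the state persists
        for (String line : gl.drawn()) {
            System.out.println(line);
        }
    }
}
```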
The execution model for OpenGL can be described as client-server. An application
(the client) issues OpenGL commands that are interpreted and processed by an OpenGL
implementation (the server). Many server-side state variables have only two states,
on or off, and are enabled or disabled with the command glEnable() or glDisable().
Client-side state is enabled with glEnableClientState() and disabled with
glDisableClientState(). Each state variable or