Fig. 1. a) An MC200/30iR-gr MEA (NMI, Reutlingen, Germany), showing the 30 µm electrodes which lead to the electrode column-row arrangement b) Electrode arrays in the centre of
the MEA seen under an optical microscope (Nikon TMS, Nikon, Japan), x4 magnification c)
An MEA at x40 magnification, showing neuronal cells in close proximity to an electrode, with
visible extensions and inter-connections
Fig. 2. The Miabot robot with a cultured neural network
Fig. 3. Modular layout of the robot/MEA system
Table 1. Basic statistics from a wall avoidance experiment
Fig. 4. Analysis of the robot's activity during a simple wall-detection/right turn experiment
Fig. 2. Hamming window minimizes unwanted artifacts: (a) untapered signal; (b) Hamming window; (c) tapered signal. Note that the ends connect more smoothly than in (a).
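As a concrete illustration of the tapering shown in the figure, a minimal NumPy sketch (the tone frequency, frame length, and sampling rate are arbitrary choices, not values from the paper):

```python
import numpy as np

# (a) an untapered frame: a 440 Hz tone sampled at 44.1 kHz
frame = np.sin(2 * np.pi * 440 * np.arange(1024) / 44100)

# (b) the Hamming window: a raised cosine that forces the frame ends toward zero
window = np.hamming(1024)

# (c) the tapered frame: the ends now join smoothly, reducing spectral leakage
tapered = frame * window
```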
Fig. 6. Spectrogram of a male speaking
Fig. 5. Spectrogram of a female speaking
Fig. 8. (a) A set of points in 2D space generated by a Gaussian mixture. Each cluster can be
composed of multiple components. (b) The probability density function f of a 2-component
Gaussian mixture.
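For illustration, a minimal sketch of such data, assuming hypothetical mixture parameters (the weights, means, and covariances are invented for the example):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Hypothetical 2-component mixture in 2D
weights = [0.5, 0.5]
means = [np.array([0.0, 0.0]), np.array([4.0, 3.0])]
covs = [np.eye(2), np.array([[1.5, 0.4], [0.4, 0.8]])]

# (a) sample points: pick a component per point, then draw from its Gaussian
comp = rng.choice(2, size=500, p=weights)
points = np.array([rng.multivariate_normal(means[c], covs[c]) for c in comp])

# (b) the mixture density f(x) = sum_k w_k * N(x | mu_k, Sigma_k)
def f(x):
    return sum(w * multivariate_normal(m, c).pdf(x)
               for w, m, c in zip(weights, means, covs))
```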
Table 2. Spectrogram-based results
¹ 44.1 kHz is a common sampling rate to represent sounds perceivable by most humans.
Table 1. Number of sound files in the database
Table 4. Spectrogram-based results with linear magnitude scaling
Logarithmically scaling the frequencies produces better results, because it resembles the way humans hear more closely. Following the Weber-Fechner law, taking the logarithm of the Fourier transform of a signal simulates human hearing relatively well.
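A hedged sketch of this log-scaling step, assuming SciPy; the STFT parameters are illustrative and need not match the paper's front end:

```python
import numpy as np
from scipy.signal import spectrogram

def log_spectrogram(signal, fs=44100):
    """Short-time Fourier magnitude on a logarithmic (dB) scale.

    Taking the logarithm of the spectral power mimics the roughly
    logarithmic loudness response described by the Weber-Fechner law.
    """
    freqs, times, sxx = spectrogram(signal, fs=fs, window="hamming",
                                    nperseg=1024, noverlap=512)
    return freqs, times, 10 * np.log10(sxx + 1e-12)  # epsilon avoids log(0)
```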
Table 3. Results using collapsed classes
Table 6. MFCC-based results
The use of Mel-scale Frequency Cepstral Coefficients resulted in marginally better results for human speech for clear samples only, giving considerably worse classification rates for the other classes. When we added noisy samples into the male and female classes, the results got slightly worse, as can be seen from the tables. We conclude that MFCC is a useful tool when used appropriately, but it does not generate feature vectors suitable for all audio classes. For the classes tested in this work, our own feature vector implementation produces significantly better results.
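For reference, one common way to extract MFCC feature vectors, sketched with librosa; the file name, sampling rate, coefficient count, and per-file averaging are illustrative choices, not the settings used in this work:

```python
import librosa

y, sr = librosa.load("sample.wav", sr=44100)        # hypothetical audio file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape: (13, n_frames)
feature_vector = mfcc.mean(axis=1)                  # simple per-file summary
```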
Fig. 1. Experimental setup and the corresponding angular position of the lamps
Fig. 2. The various steps of the calculation of the regularity. From the raw image (a), the Fourier spectrum is calculated (b). Then a polar and normalized AACF representation is calculated (c), from which the regularity position and intensity (d) are obtained by applying the Chetverikov algorithm.
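Steps (a) to (c) can be sketched via the Wiener-Khinchin theorem, which gives the autocorrelation as the inverse FFT of the power spectrum; the polar resampling and the Chetverikov regularity measure itself are omitted from this sketch:

```python
import numpy as np

def normalized_acf(image):
    """Autocorrelation of a grayscale image via its Fourier spectrum."""
    img = image - image.mean()
    spectrum = np.fft.fft2(img)                  # (b) Fourier spectrum
    acf = np.fft.ifft2(np.abs(spectrum) ** 2).real
    acf = np.fft.fftshift(acf)                   # center the zero-lag peak
    return acf / acf.max()                       # normalized ACF, input to (c)
```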
Fig. 3. Pattern regularity function of the cleaning stage for canvas 1 and 2
A careful visual inspection of the canvas surfaces showed a decrease in the regularity of the weave pattern due to the scrubbing. The physical strain due to the scrubbing on the surface tended to affect the arrangement of the fibers. As a result of the gentle scrubbing, the tension of the weave was loosened and we also noticed the emergence of slub. With stronger scrubbing, defects in the structure, such as holes and snagged fibers, were observed. In order to quantify these variations in pattern regularity, we applied the algorithm described in Part 2.2.3 to the different samples considered. Figure 3 presents the results.
Fig. 4. Variation of the luminance due to gentle scrubbing for canvas 1 according to the
position of illumination
Fig. 5. GLCM contrast of untreated canvas 1 for various elevation angles for the illumination
Figure 5 shows the variation of GLCM contrast as a function of the lateral offset for
canvas 1 according to the elevation of the illumination.
By computing the GLCM contrast as a function of the lateral displacement, the weave pattern is exhibited. The offset is chosen in pixels rather than in millimetres or micrometres for the sake of simplicity, but the width of the undulation of the GLCM contrast
corresponds precisely to the actual width of the threads. These results also demon-
strate that illumination incident near grazing angle (larger elevation angle) tends to
enhance the pattern of the canvas texture. For the elevation angle of 20°, very near
normal, the undulation of the weave is barely noticeable. These results are again in
accordance with the visual perception of the canvas surface. As with brightness, the
Fig. 6. Maximum GLCM contrast on untreated canvas 2 according to the position of
illumination
GLCM contrast is deeply related to the position of the light source. In this context, a full characterization of this contrast was needed to determine the best geometric position for measuring texture change from cleaning. Figure 6 presents a
3D polar plot of the maximum GLCM contrast as a function of the illumination
position.
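A minimal sketch of the offset-dependent contrast measurement, assuming scikit-image and an 8-bit grayscale image; the offset range is an illustrative parameter:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def contrast_vs_offset(image, max_offset=64):
    """GLCM contrast as a function of horizontal pixel offset.

    image: 2-D uint8 array. The undulation of the returned curve
    follows the weave pattern of the canvas, as described above.
    """
    contrasts = []
    for d in range(1, max_offset + 1):
        glcm = graycomatrix(image, distances=[d], angles=[0],
                            levels=256, symmetric=True, normed=True)
        contrasts.append(graycoprops(glcm, "contrast")[0, 0])
    return np.array(contrasts)
```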
Fig. 8. Effect of cleaning on the GLCM contrast amplitude variation
Fig. 1. We can often recognize everyday objects by their contours
Most of the information in our daily life is redundant. Studies [20] show that photos
normally provide much more information than we need. This redundancy can be as
high as ninety percent [15]. In facial communication, dramatic reductions in the spa-
tial resolution of images can be tolerated by viewers [3]. From the point of view of
psychology and economics, the level of detail in data communication can be greatly
reduced. For example, photos in newspapers normally only have two values for each
dot (pixel): with or without ink. With grid-screen processing, the size of the smallest pixels is increased so that the number of dots per area can be greatly reduced. However, the picture is still recognizable. Increasing the level of detail of the grid screen can make the image more attractive, but not more recognizable or comprehensible.
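A minimal sketch of this two-value grid-screen reduction, with hypothetical cell size and threshold:

```python
import numpy as np

def grid_screen(gray, cell=4, threshold=128):
    """Reduce a grayscale image to ink / no-ink dots on a coarse grid.

    Each cell x cell block collapses to one binary dot; despite the
    drastic reduction, the picture usually remains recognizable.
    """
    h, w = gray.shape
    blocks = gray[:h - h % cell, :w - w % cell].reshape(
        h // cell, cell, w // cell, cell).mean(axis=(1, 3))
    return blocks >= threshold  # True = blank paper, False = ink
```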
Fig. 2. Expressions of pain in pictures, numbers and words
Fig. 3. Analogical description of noses and face type in Chinese
We developed a prototype of the interactive system for facial reconstruction on a
computer. In the system, a user selects the feature keywords in a hierarchical struc-
ture. The computer responds to the selected keyword with a pool of candidates that
are coded with labels and numbers. Once a candidate is selected, the computer will
superimpose the components and reconstruct the face. As we know, a composite sketch of a suspect is usually done by professionals. Our system enables inexperienced users to reconstruct a face with a menu-driven interaction. Moreover, the reconstruction process is reversible, so it can be used for facial description studies, robotic vision and professional training.
Fig. 4. Interactive front facial reconstruction based on image components
Fig. 1. Separating genres in a 2D projection of N dimensions by selecting and aggregating lexi-
cal clusters on an X and Y axis. This is a snapshot of one of the DocuScope interfaces used to
separate genre through the selection and aggregation of specific lexical clusters. In this figure,
the user has selected clusters (past, description, narrative) on the Y axis associated with remi-
niscences. The user has selected clusters (first person, first person personal register, personal
register, and interactivity) on the X axis associated with letters. The interface confirms that
these features are relevant to defining similarities and differences between these genres by
separating reminiscences high on the Y axis and letters to the far right on the X axis.
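The aggregation described in the caption can be sketched as follows; the function and the cluster labels are illustrative, not DocuScope's actual interface code:

```python
from typing import Dict, List, Tuple

def project_text(cluster_counts: Dict[str, float],
                 x_clusters: List[str],
                 y_clusters: List[str]) -> Tuple[float, float]:
    """Collapse N lexical-cluster dimensions onto user-chosen X and Y axes.

    A text's position is the sum of its per-cluster match counts over
    the clusters the user assigned to each axis.
    """
    x = sum(cluster_counts.get(c, 0.0) for c in x_clusters)
    y = sum(cluster_counts.get(c, 0.0) for c in y_clusters)
    return x, y

# e.g., the selection from the caption: letters spread along X,
# reminiscences along Y
# project_text(counts,
#              ["first_person", "personal_register", "interactivity"],
#              ["past", "description", "narrative"])
```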
strings we identified from radio stations that focused on news, talk, or sports.
The visualization environment allowed us to visually inspect and test new samples in
our archive. To further control quality, we built a collision detector that would warn
us if we assigned the same string to multiple categories. This helped us locate
and debug inconsistencies and ambiguities in the string data. The visualization envi-
ronment we taught with [10] also became a centerpiece in one of our graduate writ-
ing courses. As part of their continuing training in close reading, students were asked
to keep semester logs of the matched strings they found in their own writing and in
the writing of their peers. They were asked to keep systematic notes about whether
the strings matched by the software were ambiguous or incorrect. In cases where
they found errors, students proposed changes to the software’s internal dictionaries
as part of their log-work. If their proposals were verified by the course instructors,
the internal dictionaries were changed to reflect them. It is beyond the scope of this
paper to say more about these methods, but further discussion about these techniques
is available elsewhere [8, 13].
Fig. 2. Plotting Factor 1 vs. Factor 2. Factor 1 isolates student texts that include guiding the reader, sequence, prescriptive, and curiosity. Factor 1 significantly distinguishes (MANOVA, F = 24.11; p <= 0) the student-written information texts from other texts, as the boxed region to the far right indicates. Factor 2 significantly distinguishes texts at the lower end of the factor that provide present-based visual language. This factor significantly distinguishes (MANOVA, F = 13.04; p <= 0) description texts from the other student genres. We have included the names of the students who wrote the seemingly most prototypical and peripheral texts for these factors. Li wrote the most prototypic information paper and Qian the most prototypic description paper. Youfei wrote the most peripheral information paper and Peng the most peripheral description paper.
caption). We boxed off both clusters as a visual guide to the reader.
We were able to isolate one student in the economics section by the name of Li as having written the most prototypical information assignment. Note that Li's paper is farthest to the right of any other information paper. We were able to isolate another student in the telecommunication section by the name of Qian as having written the most prototypical description assignment. Please note that Qian's paper is lower than any other descriptive paper. These two students were able to produce a paper most "in range" with the restrictions of the genre assigned. Conversely, for each prototype paper, we could identify a corresponding peripheral paper, a paper whose lexical clustering was most out of range for the assignment. Youfei, an economics student assigned an information paper, wrote a text that clustered as a narrative piece. Peng, a telecommunications student assigned a descriptive paper, wrote a text with few descriptive features and also with relatively few markers of the other genres.
Fig. 3. Plotting Factor 3 vs. Factor 6. Factor 3 isolates student texts that range from the use of
temporal expression on the low end of the factor to self-disclosures, acknowledging, updating,
and questioning on the high end. This means that the further to the left a text falls, the more
temporal expressions it contains and the more it resembles a narrative. Because of its involve-
ment with temporal expression, factor 3 significantly distinguishes (MANOVA, F = 3.49; p <=
0) the student description texts from narrative texts, as the boxed regions indicate. Notice how
the narrative texts (squares) congregate to the left side of the chart and the description papers
(circles) fall on their right. Factor 6 isolates texts where students account for their actions, and
this factor also significantly distinguishes (F = 4.54; p <= 0.013) narrative texts from descrip-
tive texts. Notice how the narrative papers appear above most of the descriptive papers
(circles). We have included the names of the students who wrote the most prototypical and
peripheral texts for the narrative paper. Mao wrote the most prototypical narrative paper under
the definition of a major prototype. Han, on the other hand, wrote the most peripheral paper. As one can see, his paper falls significantly to the right of the major cluster of narrative papers, indicating that it lacks the temporal expressions so vital to standard narrative writing.
Fig. 3. Emotional expressions in embodied virtual agents. These examples are taken
from the VirtualHumans project, see http://www.virtual-human.org/ for more infor-
mation.
Feelings and emotions have been discussed in relevant psychological literature, e.g., [44]. More modern, technically grounded works speak of moods and emotions in the mind, e.g., [43]. We use the distinction between moods and emotions (figure 5). The realisation of emotions (in embodied virtual characters, such as the ones in figure 8, right) for speech and body graphics has been studied, e.g., in [46], which used more fine-grained rules to realise emotions by mimicry, gesture, and face texture. Moods are realised as body animations of posture and gestures. Affective dialogue systems have been described in [47].
Fig. 8. (Left) Intuitive soccer dialogue through context information with the help of
a mobile dialogue system. (Right) Dialogue with virtual characters in a virtual studio
(reprinted with permission). In the mobile situation, intuition is needed to un-
derstand the intentions of the user (perception). In the studio, intuition can lead to
emotional expression and emphatic group behaviour (generation).
Fig. 9. The self-reflective mechanism consists of two structures, (1) the object-level, and (2) the meta-level, whereby 1 → 2 is an asymmetric relation monitoring, and 2 → 1 is an asymmetric relation control. Both form the information flow between the two levels; monitoring informs the meta-level and allows the meta-level to be updated. Depending on the meta-level, the object-level is controlled, i.e., to initiate, maintain, or terminate object-level cognitive activities like information retrieval or other dialogue actions.
to multi-strategy dialogue management, see [64] for example.
The main point is that we can maintain a model of the dialogue environment on the meta-level which contains the context model, gets access to the multimodal sensory input, and also contains self-reflective information. We called such a model an introspective view, see figure 10. An introspective view emerges from monitoring the object level: the correct interpretation of available sensory inputs and the combination of this information with prior knowledge and experiences. We can use this knowledge to implement a dialogue reaction behaviour (cf. [34], p. 194f) that can be called intuitive since the object-level control fulfills intuitive functions, i.e., the initiation, maintenance, or termination of object-level cognitive activities; our intuition controls our object-level behaviour by formulating dialogue goals and triggering dialogue actions.
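A minimal sketch of the two-level monitoring/control loop of figure 9; the class and attribute names are illustrative, not the system's actual implementation:

```python
class ObjectLevel:
    """Object-level cognitive activities (e.g., information retrieval)."""
    def __init__(self):
        self.activities = {}

    def report(self):
        return dict(self.activities)       # monitoring: informs the meta-level


class MetaLevel:
    """Meta-level holding the introspective view of the dialogue environment."""
    def __init__(self):
        self.view = {}                     # context model + self-reflective info

    def monitor(self, obj):
        # 1 -> 2: monitoring updates the meta-level
        self.view["object_state"] = obj.report()

    def control(self, obj):
        # 2 -> 1: control initiates, maintains, or terminates activities
        if "retrieval" not in obj.activities:
            obj.activities["retrieval"] = "initiated"
```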
Fig. 2. Components of Human Performance in Virtual Environments [38]
overall workload among team members and enhancing situational awareness [34].
Endsley [34] described Situational Awareness (SA) in terms of three levels: per-
ception, comprehension, and projection. Specifically, situational awareness is the
result of achieving a comprehensive understanding of the battle space within an op-
erational context, which enables us to make effective and accurate decisions. In the
military domain, SA is further forged by awareness and understanding of an adver-
sary’s knowledge and capabilities. One of the key components in the acquisition of
SA and its implementation is the cognitive process used to evaluate information and
make a decision within a dynamic environment.
plicated to NP (57.507% of perfect identification) than P (80.00%), showing a statistically significant difference (p-value of 0.0029).
These results show that while the differentiation between triggers and conditions could be made naturally both by programmers and non-programmers, the Spanish words for "when" and "if" are semantically close (as they are in English) and may be misleading among users unfamiliar with the inflexibility of computing languages; e.g., sentences such as "when my favorite TV show begins, if I am not at home..." can also be expressed as "If I am not at home when my favorite TV show begins...". Thus, while triggers and conditions are easily differentiated, natural language does not make their identification easy to non-programmers.
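To make the trigger/condition split concrete, a hypothetical sketch (the names and the rule API are invented for illustration, not taken from the paper's language):

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Rule:
    trigger: str                           # "when": event starting evaluation
    condition: Callable[[dict], bool]      # "if": state checked at that moment
    actions: List[Callable[[], None]] = field(default_factory=list)

    def on_event(self, event: str, state: dict):
        if event == self.trigger and self.condition(state):
            for action in self.actions:
                action()

# Both natural-language orderings from the text compile to the same rule:
rule = Rule(trigger="favorite_show_begins",
            condition=lambda s: not s.get("at_home", False),
            actions=[lambda: print("record show")])
rule.on_event("favorite_show_begins", {"at_home": False})  # fires
```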
Table 3. Rule-based language's grammar
Fig. 2. Illustration of consumption policies for composite event detection. Various con-
sumption policies are compared with a “mixed” policy in which the composite event is
designed to use the first instances of the initiator and terminator events but the last
instance of the in-between events. Illustration inspired by [35].
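A minimal sketch of the "mixed" policy the caption describes; the function signature is hypothetical:

```python
def mixed_policy(initiators, in_between, terminators):
    """Select event instances for a composite event under the "mixed" policy.

    The composite event uses the FIRST initiator instance, the LAST
    in-between instance, and the FIRST terminator instance; other
    consumption policies differ only in which instances they keep.
    """
    return {
        "initiator": initiators[0],      # first instance
        "in_between": in_between[-1],    # last instance
        "terminator": terminators[0],    # first instance
    }
```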
Fig. 1. Incorporating conditions into event composition allows differentiating similar
events based on the context in which they occur
Finally, a set of on_load rules can be added after the on_running rules, as well as a concurrence factor (the number of times a TIMER can be running simultaneously) in the form TIMER ending_time concurrence { THEN on_finished rules} {DURING WHICH on_running rules} {BEFORE on_load rules}. This form, combined with the ability to modify and use the status of the TIMER (TIMER.pause, TIMER.start, TIMER.reset...) in the on_running rules, as if it
Fig. 4. Representation of an agent, properties and relations in the Blackboard
Coordinating preferences is closely related to the problem of creating hierarchies.
Multiple users inhabiting the same space make the interaction dependent on
the remaining users and their preferences. Hierarchies are the natural social
structures for establishing an order of preference, but, linked as they are to
the social group that created them, they are multiple and dynamic and their
complexity reflects the complexity of the social group they rule.
To allow both control and accessibility, we want to allow the creation of hierarchies as complex as the social structures they govern, but without imposing, especially on the simpler scenarios, a profound analysis or a priori knowledge to build them.
Fig. 5. An example of interdependence in the graph created by the connections between
people and their agents (is_owner), and the agents with the objects they affect (affects)
Interdependence is easy to see in the Blackboard model as the graph created
by all the affects relations. Depending on the scenario, this graph ranges from
pyramidal structures to unconnected graphs or entangled networks, with a per-
son or persons behind each agent, an element of the environment in every leaf of
the graph and, in between, a complex structure of conditions that, as a whole,
governs the overall automatic behavior of the environment (see Figure 5). Scale,
on the other hand, can be appreciated in the different levels in which hierarchies
can be expressed.
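A minimal sketch of this graph; the relation names mirror the text (is_owner, affects) but the data layout is illustrative:

```python
from collections import defaultdict

is_owner = defaultdict(set)   # person -> agents they own
affects = defaultdict(set)    # agent  -> environment elements they affect

is_owner["alice"] |= {"lights_agent", "heating_agent"}
affects["lights_agent"] |= {"lamp_1", "lamp_2"}
affects["heating_agent"] |= {"radiator"}

def persons_behind(element):
    """Walk the graph backwards: which persons ultimately affect an element?"""
    agents = {a for a, elems in affects.items() if element in elems}
    return {p for p, owned in is_owner.items() if owned & agents}

print(persons_behind("lamp_1"))   # {'alice'}
```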
Fig. 6. Evolution on the number of agents, rules, rules per agent and people program-
ming them in the Universidad Autónoma de Madrid's Ambient Intelligence laboratory
from 2006 to 2009. The number of agents and rules grows as new domains of automa-
tion and preferences are tackled. Nevertheless, 2009 shows a significant decrease in the
number of rules since the inhabitants begin using wildcards to group several preferences
under a general one. Complex concepts, if correctly designed, are acquired through the
use of simpler ones (not through training) and used when really needed as, in this case,
having to deal with several people with similar preferences.
Fig. 2. Interaction may be regarded as constituting both an implicit and explicit component
Various models of interaction have been proposed in computational contexts, for example, those of Norman [15] and Beale [16]. Ultimately, all frameworks coalesce around the notions of input and output, though the human's and computer's interpretations of each are not symmetrical. Obrenovic and Starcevic [17] define input modalities as being either stream-based or event-based. In the latter case, discrete events are produced in direct response to user actions, for example, clicking a mouse. In the former case, a time-stamped array of values is produced. In the case of output modalities, these are classified as either static or dynamic according to the data presented to the users. Static responses would usually be presented in modal dialog boxes. Dynamic output may present as an animation, something that must be interpreted only after a time interval has elapsed.
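The stream-based/event-based distinction can be sketched with two illustrative record types (the type names are hypothetical, not from [17]):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DiscreteEvent:
    """Event-based input: one record per user action (e.g., a mouse click)."""
    timestamp: float
    action: str

@dataclass
class StreamSample:
    """Stream-based input: a time-stamped array of values (e.g., sensor data)."""
    timestamp: float
    values: List[float]
```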
Fig. 3. Situations can be inferred from individual contexts harnessed from a suite of sensors
Fig. 4. Constituent components of the SIXTH middleware architecture