This PDF file contains the front matter associated with SPIE Proceedings volume 7527, including the Title Page, Copyright information, Table of Contents, Introduction, and the Conference Committee listing.
The relationship of music to film has only recently received the attention of experimental psychologists and quantitative musicologists. This paper outlines theory, semiotic analysis, and experimental results using relations among variables of temporally organized visuals and music. 1. A comparison and contrast is developed between ideas in semiotics and experimental research, including historical and recent developments. 2. Musicological Exploration: The resulting multidimensional structures of associative meanings, iconic meanings, and embodied meanings are applied to the analysis and interpretation of a range of films with music. 3. Experimental Verification: A series of experiments testing the perceptual fit of musical and visual patterns layered together in animations determined goodness of fit between all pattern combinations; the results confirmed aspects of the theory. However, exceptions were found when the complexity of the stratified stimuli resulted in cognitive overload.
Douglas B. Shire, Patrick Doyle, Shawn K. Kelly, Marcus D. Gingerich, Jinghua Chen, Stuart F. Cogan, William A. Drohan, Oscar Mendoza, Luke Theogarajan, et al.
This presentation concerns the engineering development of the Boston visual prosthesis for restoring useful vision to patients blinded by degenerative retinal disease. A miniaturized, hermetically-encased, 15-channel wirelessly-operated
retinal prosthetic was developed for implantation and pre-clinical studies in Yucatan mini-pig animal models. The
prosthesis conforms to the eye and drives a microfabricated polyimide stimulating electrode array having sputtered
iridium oxide electrodes. This array is implanted into the subretinal space using a specially-designed ab externo surgical
technique; the bulk of the prosthesis is on the surface of the sclera. The implanted device includes a hermetic titanium
case containing a 15-channel stimulator chip; secondary power/data receiving coils surround the cornea. Long-term in
vitro pulse testing was also performed on the electrodes to ensure their stability over years of operation. Assemblies
were first tested in vitro to verify wireless operation of the system in biological saline using a custom RF transmitter
circuit and primary coils. Stimulation pulse strength, duration and frequency were programmed wirelessly using a
computer with a custom graphical user interface. Operation of the retinal implant was verified in vivo in 3 minipigs for
more than three months by measuring stimulus artifacts on the eye surface using contact lens electrodes.
Assorted technologies such as EEG, MEG, fMRI, BEM, MRI, TMS, and BCI are being integrated to understand how human visual cortical areas interact during controlled laboratory and natural viewing conditions. Our focus is on the problem of separating signals from the spatially close early visual areas. The solution involves taking advantage of known functional anatomy to guide stimulus selection and employing principles of spatial and temporal response properties that simplify analysis. The method also unifies MEG and EEG recordings and provides a means for improving existing boundary element head models. Going beyond carefully controlled stimuli to natural viewing with scanning eye movements, assessing brain states with BCI is a most challenging task. Frequent eye movements contribute artifacts to the recordings. A linear regression method is introduced that is shown to effectively characterize these frequent artifacts and could be used to remove them. In free viewing, saccadic landings initiate visual processing epochs and could be used to trigger strictly time-based analysis methods. However, temporal instabilities indicate that frequency-based analysis would be an important adjunct. The class of Cauchy filter functions is introduced, which has narrow time and frequency properties well matched to the EEG/MEG spectrum for avoiding channel leakage.
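The regression-based removal of eye-movement artifacts can be illustrated with a minimal sketch (hypothetical array names and synthetic data, not the authors' implementation): each recording channel is regressed on simultaneously recorded eye-movement reference signals, and the fitted contribution is subtracted.

```python
import numpy as np

def remove_eye_artifacts(data, eye_refs):
    """Regress out eye-movement reference signals from each recording channel.

    data     : (n_samples, n_channels) EEG/MEG recordings
    eye_refs : (n_samples, n_refs) eye-movement reference signals (e.g., EOG)
    Returns the cleaned data and the per-channel regression weights.
    """
    # Design matrix with an intercept column
    X = np.column_stack([np.ones(len(eye_refs)), eye_refs])
    # Least-squares fit of every channel on the eye references
    beta, *_ = np.linalg.lstsq(X, data, rcond=None)
    cleaned = data - X @ beta  # subtract the fitted artifact component
    return cleaned, beta

# Example with synthetic data: 10 s at 1 kHz, 32 channels, 2 eye-reference traces
rng = np.random.default_rng(0)
eog = rng.standard_normal((10_000, 2))
eeg = rng.standard_normal((10_000, 32)) + eog @ rng.standard_normal((2, 32))
clean, weights = remove_eye_artifacts(eeg, eog)
```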
In contrast to other arts, such as music, there is very little neuroimaging research on visual art and, in particular, on drawing. Drawing, from artistic to technical, involves diverse aspects of spatial cognition, precise sensorimotor planning and control, as well as a rich set of higher cognitive functions. A new method we have developed for teaching drawing to the blind, together with the technological advances of a multisensory MR-compatible drawing system, allowed us to run for the first time a comparative fMRI study of drawing in the blind and the sighted. In each population, we identified widely distributed cortical networks, extending from the occipital and temporal cortices, through the parietal lobe, to the frontal lobe. This is the first neuroimaging study of drawing in blind novices, as well as the first study of learning to draw in either population.
We sought to determine the cortical reorganization taking place as a result of learning to draw, despite the lack of visual input to the brains of the blind. Remarkably, we found massive recruitment of the visual cortex upon learning to draw, although our subjects had no previous experience, only a short training period with our new drawing method. This finding implies a rapid, learning-based plasticity mechanism.
We further proposed that the functional level of brain reorganization in the blind may still differ from that in the sighted, even in areas that overlap between the two populations, such as the visual cortex. We tested this idea in the framework of saccadic suppression. A methodological innovation allowed us to estimate the locations of retinotopic regions in the blind brain. Although the visual cortex of both groups was strongly recruited, only the sighted showed dramatic suppression in hMT+ and V1, while there was no sign of an analogous process in the blind. This finding has important implications and suggests that recruitment of the visual cortex in the blind does not ensure a full functional parallel.
Non-invasive and dynamic imaging of brain activity in the sub-millisecond time-scale is enabled by measurements on or near the scalp surface using an array of sensors that measure magnetic fields
(magnetoencephalography (MEG)) or electric potentials (electroencephalography (EEG)). Algorithmic reconstruction of
brain activity from MEG and EEG data is referred to as electromagnetic brain imaging (EBI). Reconstructing the actual
brain response to external events and distinguishing unrelated brain activity has been a challenge for many existing
algorithms in this field. Furthermore, even under conditions where there is very little interference, accurately
determining the spatial locations and timing of brain sources from MEG and EEG data is a challenging problem because it involves solving for unknown brain activity across thousands of voxels from only a few hundred sensors (~300). In recent years,
my research group has developed a suite of novel and powerful algorithms for EBI that we have shown to be
considerably superior to existing benchmark algorithms. Specifically, these algorithms can solve for many brain sources,
including sources located far from the sensors, in the presence of large interference from unrelated brain sources. Our
algorithms efficiently model interference contributions to the sensors and accurately estimate sparse brain source activity using fast and robust probabilistic inference techniques. Here, we review some of these algorithms and illustrate their
performance in simulations and real MEG/EEG data.
Research in our laboratory focuses on understanding the neural mechanisms that serve at the crossroads of perception,
memory and attention, specifically exploring how brain region interactions underlie these abilities. To accomplish this,
we study top-down modulation, the process by which we enhance neural activity associated with relevant information
and suppress activity for irrelevant information, thus establishing a neural basis for all higher-order cognitive operations.
We also study alterations in top-down modulation that occur with normal aging. Our experiments are performed on
human participants, using a multimodal approach that integrates functional MRI (fMRI), transcranial magnetic
stimulation (TMS) and electroencephalography (EEG).
The use of functional magnetic resonance imaging (fMRI) to measure functional connectivity among brain areas has the
potential to identify neural networks associated with particular cognitive processes. However, fMRI signals are not a
direct measure of neural activity but rather represent blood oxygenation level-dependent (BOLD) signals. Correlated
BOLD signals between two brain regions are therefore a combination of neural, neurovascular, and vascular coupling.
Here, we describe a procedure for isolating brain functional connectivity associated with a specific cognitive process.
Coherency magnitude (measuring the strength of coupling between two time series) and phase (measuring the temporal
latency differences between two time series) are computed during performance of a particular cognitive task and also for
a control condition. Subtraction of the coherency magnitude and phase differences for the two conditions removes
sources of correlated BOLD signals that do not modulate as a function of cognitive task, resulting in a more direct
measure of functional connectivity associated with changes in neuronal activity. We present two applications of this task
subtraction procedure, one to measure changes in strength of coupling associated with sustained visual spatial attention,
and one to measure changes in temporal latencies between brain areas associated with voluntary visual spatial attention.
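As an illustration of the task-subtraction idea (a sketch with hypothetical signals, not the authors' analysis code), coherency between two BOLD time series can be computed from the cross-spectral density, and its magnitude and phase compared across a task and a control condition:

```python
import numpy as np
from scipy.signal import csd, welch

def coherency(x, y, fs, nperseg=64):
    """Complex coherency between two time series: Sxy / sqrt(Sxx * Syy)."""
    f, sxy = csd(x, y, fs=fs, nperseg=nperseg)
    _, sxx = welch(x, fs=fs, nperseg=nperseg)
    _, syy = welch(y, fs=fs, nperseg=nperseg)
    c = sxy / np.sqrt(sxx * syy)
    return f, np.abs(c), np.angle(c)  # frequency, magnitude, phase (radians)

# Hypothetical BOLD series from two regions, task vs. control runs (TR = 2 s)
fs = 0.5
rng = np.random.default_rng(1)
roi1_task, roi2_task = rng.standard_normal((2, 300))
roi1_ctrl, roi2_ctrl = rng.standard_normal((2, 300))

f, mag_task, ph_task = coherency(roi1_task, roi2_task, fs)
_, mag_ctrl, ph_ctrl = coherency(roi1_ctrl, roi2_ctrl, fs)

# Task-minus-control differences isolate coupling that modulates with the task
d_mag = mag_task - mag_ctrl
d_phase = ph_task - ph_ctrl
```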
Measuring preferences for moving video quality is harder than for static images due to
the fleeting and variable nature of moving video. Subjective preferences for image
quality can be tested by observers indicating their preference for one image over another.
Such pairwise comparisons can be analyzed using Thurstone scaling (Farrell, 1999).
Thurstone (1927) scaling is widely used in applied psychology, marketing, food tasting
and advertising research. Thurstone analysis constructs an arbitrary perceptual scale for
the items that are compared (e.g. enhancement levels). However, Thurstone scaling does
not determine the statistical significance of the differences between items on that
perceptual scale. Recent papers have provided inferential statistical methods that produce
an outcome similar to Thurstone scaling (Lipovetsky and Conklin, 2004). Here, we
demonstrate that binary logistic regression can analyze preferences for enhanced video.
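A minimal sketch of the binary logistic regression approach to pairwise preference data (simulated observers and hypothetical enhancement levels, not the authors' data or code): each comparison is coded with +1/-1 indicators for the two items shown, and the fitted coefficients play the role of a Thurstone-style perceptual scale, with standard errors available for inference.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_levels = 4
true_scale = np.array([0.0, 0.8, 1.5, 2.1])  # hypothetical "true" enhancement quality

# Simulate 60 observers, each judging every pair of enhancement levels once
rows, outcomes = [], []
for _ in range(60):
    for a in range(n_levels):
        for b in range(a + 1, n_levels):
            p_choose_a = 1.0 / (1.0 + np.exp(-(true_scale[a] - true_scale[b])))
            x = np.zeros(n_levels)
            x[a], x[b] = 1.0, -1.0  # difference coding of the pair shown
            rows.append(x)
            outcomes.append(rng.random() < p_choose_a)

X, y = np.asarray(rows), np.asarray(outcomes, dtype=float)

# Drop one column so the model is identified (level 0's scale value is fixed at 0)
fit = sm.Logit(y, X[:, 1:]).fit(disp=0)
scale_values = np.concatenate([[0.0], fit.params])  # Thurstone-like perceptual scale
print(scale_values)  # estimated scale values for the 4 enhancement levels
print(fit.bse)       # standard errors support significance tests between levels
```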
The merit of an objective quality estimator for either still images or video is gauged by its ability to accurately
estimate the perceived quality scores of a collection of stimuli. Encounters with radically different distortion types
that arise in novel media representations require that researchers collect perceived quality scores representative
of these new distortions to confidently evaluate a candidate objective quality estimator. Two common methods
used to collect perceived quality scores are absolute categorical rating (ACR)1 and subjective assessment for
video quality (SAMVIQ).2, 3
The choice of a particular test method affects the accuracy and reliability of the data collected. An awareness
of the potential benefits and/or costs attributed to the ACR and SAMVIQ test methods can guide researchers
to choose the more suitable method for a particular application. This paper investigates the tradeoffs of these
two subjective testing methods using three different subjective databases that have scores corresponding to each
method. The subjective databases contain either still-images or video sequences.
This paper has the following organization: Section 2 summarizes the two test methods compared in this
paper, ACR and SAMVIQ. Section 3 summarizes the content of the three subjective databases used to evaluate
the two test methods. An analysis of the ACR and SAMVIQ test methods is presented in Section 4. Section 5 concludes this paper.
Many visual difference predictors (VDPs) have used basic psychophysical data (such as ModelFest) to calibrate the
algorithm parameters and to validate their performance. However, basic psychophysical data often do not contain a sufficient number of stimuli and variations to test the more complex components of a VDP. In this paper we calibrate the
Visual Difference Predictor for High Dynamic Range images (HDR-VDP) using radiologists' experimental data for
JPEG2000 compressed CT images which contain complex structures. Then we validate the HDR-VDP in predicting the
presence of perceptible compression artifacts. 240 CT-scan images were encoded and decoded using JPEG2000
compression at four compression ratios (CRs). Five radiologists independently determined whether each image
pair (original and compressed images) was indistinguishable or distinguishable. A threshold CR for each image, at which
50% of radiologists would detect compression artifacts, was estimated by fitting a psychometric function. The CT
images compressed at the threshold CRs were used to calibrate the HDR-VDP parameters and to validate its prediction
accuracy. Our results showed that the HDR-VDP calibrated for the CT image data gave much better predictions than the
HDR-VDP calibrated to the basic psychophysical data (ModelFest + contrast masking data for sine gratings).
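The threshold-CR estimation can be sketched as fitting a logistic psychometric function to the proportion of radiologists who judged each image pair distinguishable, then reading off the 50% point (hypothetical numbers; not the authors' exact fitting procedure):

```python
import numpy as np
from scipy.optimize import curve_fit

def psychometric(cr, cr50, slope):
    """Logistic psychometric function: P(detect artifacts) as a function of CR."""
    return 1.0 / (1.0 + np.exp(-slope * (cr - cr50)))

# Hypothetical data for one CT image: four compression ratios,
# fraction of the five radiologists who judged the pair distinguishable.
crs = np.array([4.0, 8.0, 16.0, 32.0])
p_detect = np.array([0.0, 0.2, 0.6, 1.0])

params, _ = curve_fit(psychometric, crs, p_detect, p0=[12.0, 0.5])
cr50, slope = params
print(f"threshold CR (50% detection): {cr50:.1f}")
```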
Automatic methods to evaluate the perceptual quality of a digital video sequence have widespread applications
wherever the end-user is a human. Several objective video quality assessment (VQA) algorithms exist, whose
performance is typically evaluated using the results of a subjective study performed by the video quality experts
group (VQEG) in 2000. There is a great need for a free, publicly available subjective study of video quality that
embodies state-of-the-art in video processing technology and that is effective in challenging and benchmarking
objective VQA algorithms. In this paper, we present a study and a resulting database, known as the LIVE
Video Quality Database, in which 150 distorted video sequences obtained from 10 different source videos
were subjectively evaluated by 38 human observers. Our study includes videos that have been compressed by
MPEG-2 and H.264, as well as videos obtained by simulated transmission of H.264 compressed streams through
error prone IP and wireless networks. The subjective evaluation was performed using a single stimulus paradigm
with hidden reference removal, where the observers were asked to provide their opinion of video quality on
a continuous scale. We also present the performance of several freely available objective, full reference (FR)
VQA algorithms on the LIVE Video Quality Database. The recent MOtion-based Video Integrity Evaluation
(MOVIE) index emerges as the leading objective VQA algorithm in our study, while the performance of the
Video Quality Metric (VQM) and the Multi-Scale Structural SIMilarity (MS-SSIM) index is noteworthy. The
LIVE Video Quality Database is freely available for download1 and we hope that our study provides researchers
with a valuable tool to benchmark and improve the performance of objective VQA algorithms.
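Performance of an objective VQA algorithm against such a database is typically summarized by rank and linear correlation with the subjective scores; a minimal sketch, with made-up arrays standing in for DMOS values and algorithm outputs:

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr

rng = np.random.default_rng(3)
dmos = rng.uniform(20, 80, size=150)                  # stand-in subjective scores
objective = 0.8 * dmos + rng.normal(0, 8, size=150)   # stand-in objective VQA scores

srocc, _ = spearmanr(objective, dmos)  # monotonic (rank) agreement
plcc, _ = pearsonr(objective, dmos)    # linear agreement (often after a logistic fit)
print(f"SROCC = {srocc:.3f}, PLCC = {plcc:.3f}")
```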
This paper reports on a series of experiments designed to examine the subjective quality of HDTV
encoded video. A common set of video sequences varying in spatial and temporal complexity will be
encoded at different H.264 encoding profiles. Each video clip used in the test will be available in 720p,
1080i and 1080p HDTV formats. Each video clip will then be encoded in CBR mode using H.264 at bitrates
ranging from 1 Mbit/s up to 15 Mbit/s with the other encoding parameters identical (e.g., identical
key frame intervals and CABAC entropy mode). This approach has been chosen to enable direct
comparison across bit-rates and formats most relevant to HDTV broadcast services. Three subjective
tests will be performed, one test for each HDTV format. The single-stimulus subjective test method with
the ACR rating scale will be employed. A total of 15 non-expert subjects will participate in each test. A
different sample of subjects will participate in each test, making 45 subjects in total. The test results will
be available by the end of summer 2009.
In his book "Understanding Media" social theorist Marshall McLuhan declared: "The medium is the message." The
thesis of this paper is that with respect to image quality, imaging system developers have taken McLuhan's dictum too
much to heart. Efforts focus on improving the technical specifications of the media (e.g. dynamic range, color gamut,
resolution, temporal response) with little regard for the visual messages the media will be used to communicate. We
present a series of psychophysical studies that investigate the visual system's ability to "see through" the limitations of
imaging media to perceive the messages (object and scene properties) the images represent. The purpose of these studies
is to understand the relationships between the signal characteristics of an image and the fidelity of the visual information
the image conveys. The results of these studies provide a new perspective on image quality, showing that images that may be very different in "quality" can be visually equivalent as realistic representations of objects and scenes.
This paper presents the results of two psychophysical experiments and an associated computational analysis
designed to quantify the relationship between visual salience and visual importance. In the first experiment,
importance maps were collected by asking human subjects to rate the relative visual importance of each object
within a database of hand-segmented images. In the second experiment, experimental saliency maps were
computed from visual gaze patterns measured for these same images by using an eye-tracker and task-free
viewing. By comparing the importance maps with the saliency maps, we found that the maps are related, but
perhaps less than one might expect. When coupled with the segmentation information, the saliency maps were
shown to be effective at predicting the main subjects. However, the saliency maps were less effective at predicting
the objects of secondary importance and the unimportant objects. We also found that the vast majority of early
gaze position samples (0-2000 ms) were made on the main subjects, suggesting that a possible strategy of early
visual coding might be to quickly locate the main subject(s) in the scene.
We define experiments to measure the vernier acuity associated with synchronization mismatch in moving images. The experiments are used to obtain synchronization mismatch acuity thresholds as a function of object velocity and as a function of occlusion or gap width. Our main motivation for measuring synchronization mismatch vernier acuity is its relevance to tiled display systems, which create a single contiguous image from individual discrete panels arranged in a matrix, with each panel using a distributed synchronization algorithm to display its part of the overall image. We also propose a subjective assessment method for evaluating the perception of synchronization mismatch on large, ultra-high-resolution tiled displays. For this purpose, we design a synchronization mismatch measurement test video set covering various tile configurations and inter-panel synchronization mismatch values. The proposed method can evaluate tiled displays with or without tile bezels. The results from this work can inform the design of low-cost tiled display systems that use distributed synchronization mechanisms for contiguous or bezeled image display.
We present an empirical study of properties of the remarkable
Normalized Compression Distance (NCD), a mathematical formulation by
M. Li et al. that quantifies similarity between two binary strings as
the additional amount of algorithmic information required to transform
the description of one to the other. In particular, we are interested
in the NCD values between an image and its modified versions by common
image processing techniques. Experimental data obtained indicate that
the NCD is symmetric and transitive, and that it can be used as a
reasonable perceptual measure of similarity, but only in the spatial
domain. Further, the NCD clusters the common image processing
techniques into three groups in a manner consistent with the human
perception of similarity.
We also introduce two independent modifications to the NCD and study
their properties. The first modification applies a median filter followed by thresholding to the input images before the NCD is
computed, which results in a more uniform distribution of the NCD
values. The second modification modifies the NCD formula to reflect
the running time as well as the size of the shortest program that
transforms one input string to the other. Obtained data show that
using this modified NCD, it is possible to classify subtle changes
(e.g., watermarking) to a font character image as similar to the
original and drastic changes (e.g., rotation by 90 degrees) as
different.
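The NCD itself can be illustrated with a few lines of code, using a standard compressor as a stand-in for the (uncomputable) Kolmogorov complexity; this is a generic sketch, not the authors' experimental setup:

```python
import zlib

def c(data: bytes) -> int:
    """Compressed length of a byte string, a proxy for algorithmic information."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Toy example: a string, a lightly modified copy, and a scrambled version
a = b"the quick brown fox jumps over the lazy dog" * 20
b = a.replace(b"fox", b"cat")
d = bytes(reversed(a))
print(ncd(a, b))  # small: the strings share most of their information
print(ncd(a, d))  # larger: more additional information is needed
```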
In a previous study we investigated the roughness of real world textures taken from the CUReT database. We showed
that people could systematically judge the subjective roughness of these textures. However, we did not determine which
objective factors relate to these perceptual judgments of roughness. In the present study we take the first step in this
direction using a subband decomposition of the CUReT textures. This subband decomposition is used to predict the
subjective roughness judgments of the previous study. We also generated synthetic textures with uniformly distributed
white noise of the same variance in each subband, and conducted a perceptual experiment to determine the perceived
roughness of both the original and synthesized texture images. The participants were asked to rank-order the images
based on the degree of perceived roughness. It was found that the synthesis method produces images that are similar in
roughness to the original ones except for a small but systematic deviation.
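One simple way to realize a subband analysis of this kind (an illustrative sketch, not the authors' exact decomposition) is a difference-of-Gaussians band-pass pyramid whose per-band variances serve as predictors of judged roughness and as targets for matched-variance noise synthesis:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def subband_variances(image, n_bands=5):
    """Variance of each band of a simple difference-of-Gaussians decomposition."""
    image = image.astype(float)
    variances = []
    prev = image
    for level in range(n_bands):
        blurred = gaussian_filter(image, sigma=2.0 ** (level + 1))
        band = prev - blurred  # band-pass residual between successive scales
        variances.append(band.var())
        prev = blurred
    return np.array(variances)

# Hypothetical texture; in the study these would be CUReT texture images.
rng = np.random.default_rng(4)
texture = rng.random((256, 256))
print(subband_variances(texture))

# A synthetic counterpart: white noise filtered to match each band's variance
# could then be summed to give a texture with the same subband energy profile.
```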
The term "auditory roughness" was first introduced in the 19th century to describe the buzzing, rattling auditory
sensation accompanying narrow harmonic intervals (i.e. two tones with frequency difference in the range of ~15-150Hz,
presented simultaneously). A broader definition and an overview of the psychoacoustic correlates of the auditory
roughness sensation, also referred to as sensory dissonance, is followed by an examination of efforts to quantify it over
the past one hundred and fifty years and leads to the introduction of a new roughness calculation model and an
application that automates spectral and roughness analysis of sound signals. Implementation of spectral and roughness
analysis is briefly discussed in the context of two pilot perceptual experiments, designed to assess the relationship among
cultural background, music performance practice, and aesthetic attitudes towards the auditory roughness sensation.
Consistent product experience requires congruity between product properties such as visual appearance and sound.
Therefore, for designing appropriate product sounds by manipulating their spectral-temporal structure, product sounds
should preferably not be considered in isolation but as an integral part of the main product concept. Because visual
aspects of a product are considered to dominate the communication of the desired product concept, sound is usually
expected to fit the visual character of a product. We argue that this can be accomplished successfully only on the basis of a thorough understanding of the impact of audio-visual interactions on product sounds. Two experimental studies are reviewed to show audio-visual interactions, on both perceptual and cognitive levels, that influence the way people encode, recall, and attribute meaning to product sounds. Implications for sound design are discussed that challenge the natural tendency of product designers to analyze the "sound problem" in isolation from other product properties.
When evaluating the surface appearance of real objects, observers engage in complex behaviors involving active
manipulation and dynamic viewpoint changes that allow them to observe the changing patterns of surface reflections.
We are developing a class of tangible display systems to provide these natural modes of interaction in computer-based
studies of material perception. A first-generation tangible display was created from an off-the-shelf laptop computer
containing an accelerometer and webcam as standard components. Using these devices, custom software estimated the
orientation of the display and the user's viewing position. This information was integrated with a 3D rendering module
so that rotating the display or moving in front of the screen would produce realistic changes in the appearance of virtual
objects. In this paper, we consider the design of a second-generation system to improve the fidelity of the virtual surfaces
rendered to the screen. With a high-quality display screen and enhanced tracking and rendering capabilities, a second-generation system will be better able to support a range of appearance perception applications.
Art and Science Render the High-Dynamic-Range World I
We tend to think of digital imaging and the tools of Photoshop™ as a new phenomenon in imaging. We are also familiar with multiple-exposure HDR techniques intended to capture a wider range of scene information than conventional film photography. We know about tone-scale adjustments to make better pictures. We tend to think of everyday, consumer,
silver-halide photography as a fixed window of scene capture with a limited, standard range of response. This description
of photography is certainly true, between 1950 and 2000, for instant films and negatives processed at the drugstore.
These systems had fixed dynamic range and fixed tone-scale response to light. All pixels in the film have the same
response to light, so the same light exposure from different pixels was rendered as the same film density.
Ansel Adams, along with Fred Archer, formulated the Zone System, starting in 1940. It predates the trillions of consumer photos made in the second half of the 20th century, yet it was much more sophisticated than today's digital
techniques. This talk will describe the chemical mechanisms of the zone system in the parlance of digital image
processing. It will describe the Zone System's chemical techniques for image synthesis. It also discusses dodging and
burning techniques to fit the HDR scene into the LDR print. Although current HDR imaging shares some of the Zone
System's achievements, it usually does not achieve all of them.
Ansel Adams (1902-1984), photographer, musician, naturalist, explorer, critic, and teacher, was a giant in the field of
landscape photography. In his images of the unspoiled Western landscape, he strove to capture the sublime: the
transcendentalist concept that nature can generate the experience of awe for the viewer. Many viewers are familiar with
the heroic, high-contrast prints on high-gloss paper that Adams made to order beginning in the 1970s; much less well
known are the intimate prints that the artist crafted earlier in his career. This exhibition focuses on these masterful small
prints from the 1920s into the 1950s. During this time period, Adams's printing style changed dramatically. The
painterly, soft-focus, warm-toned style of the Parmelian Prints of the High Sierras from the 1920s evolved into the
sharp-focus style of the f/64 school of photography that Adams co-founded in the 1930s with Edward Weston and
Imogen Cunningham. After World War II, Adams opted for a cooler, higher-contrast look for his prints. Throughout the
various styles in which he chose to work, Adams explored the power of nature and succeeded in establishing landscape
photography as a legitimate form of modern art.
Art and Science Render the High-Dynamic-Range World II
For many centuries artists have considered and depicted illumination in art, from the effect of sunlight on objects at different times of the day, and shadows and highlights cast by the moon, through indirect light, such as that entering through an open window, to the artificial light of a candle or firelight. The presentation will consider artists who were fascinated by the phenomena of natural and artificial illumination and how they were able to render the natural world as a form of dynamic range through pigment. Artists have long been aware of the psychological aspects of the juxtaposition of colour in exploiting optical qualities and arranging visual effects in paintings and prints. Artists in the 16th century were
attempting to develop an extended dynamic range through multi-colour, wood-block printing. Artists working at the
height of naturalist realism in the 17th through the 19th century were fascinated by the illusory nature of light on objects.
The presentation will also consider the interpretation of dynamic range through the medium of mezzotint, possibly the
most subtle of printing methods, which was used by printers to copy paintings, and to create highly original works of art
containing a dynamic range of tones.
Contrast has always been appreciated as a significant factor in image quality, but it is less widely recognized that
it is a key factor in the representation of depth, solidity and three-dimensionality in images in general, and in
paintings in particular. This aspect of contrast was a key factor in the introduction of oil paint as a painting
medium at the beginning of the fifteenth century, as a practical means of contrast enhancement. However, recent
conservatorship efforts have established that the first oil paintings were not, as commonly supposed, by van Eyck
in Flanders in the 1430s, but by Masolino da Panicale in Italy in the 1420s. These developments led to the use of
chiaroscuro technique in various forms, all of which are techniques for enhanced shadowing.
The overall image quality benefits substantially from good reproduction of black tones. Modern displays feature relatively low black levels, making them capable of rendering good dark tones. However, it is not clear whether the black level of those displays is sufficient to produce an "absolute black" color, which appears no brighter than an arbitrarily dark surface. To find the luminance necessary to invoke the perception of absolute black, we conduct an experiment in which we measure the highest luminance that cannot be discriminated from the lowest luminance achievable in our laboratory conditions (0.003 cd/m2). We measure these thresholds under varying surround luminance (up to 900 cd/m2), which simulates a range of ambient illumination conditions. We also analyze our results in the context of actual display devices. We conclude that the black level of an LCD display with no backlight dimming is not only insufficient for producing an absolute black color, but may also appear grayish under low ambient light levels.
Recent interest in HDR scene capture and display has stimulated measurements of the usable range of contrast information for human vision. These experiments have led to a model that calculates the retinal contrast image. A fraction of the light from each scene pixel is scattered to all retinal pixels, and the amount of scattered light decreases with the distance between pixels. By summing the light falling on each retinal pixel from all the scene pixels, we can calculate the retinal image contrast. As objects, such as text letters, get smaller, their retinal contrast gets lower, even though the scene contrast is constant. This paper studies the Landolt C, a commonly used test target for measuring visual acuity, using three frameworks. First, it compares visual acuity measurements with the dimensions of the receptor mosaic. Second, it discusses Campbell and Robson's experiments and the limits of the Contrast Sensitivity Function (CSF). Third, it reports the calculated retinal stimulus after intraocular scatter for both the Landolt C and Campbell and Robson's stimuli. These three frameworks are useful in understanding the limits of human vision; each approach gives only one piece of the puzzle. Retinal contrast, the CSF, and retinal cone spacing all inform our understanding of the limits of human vision. We have analyzed the Landolt C and the CSF using retinal contrast. The effect of glare on the Landolt C shows that retinal images are significantly different from target images. Veiling glare of the sine-wave stimuli used by Campbell and Robson to measure the CSF results in a decrease in retinal contrast which, above 3-4 cpd, correlates well with their reported data.
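The scatter calculation described above amounts to convolving the scene luminance with a glare-spread kernel that falls off with distance and summing the contribution at every retinal pixel. A minimal sketch with a hypothetical inverse-square kernel follows; the actual model uses measured intraocular scatter functions.

```python
import numpy as np
from scipy.signal import fftconvolve

def retinal_image(scene, scatter_fraction=0.1, kernel_radius=50):
    """Approximate retinal image: direct light plus scattered light from all pixels."""
    # Hypothetical glare-spread kernel falling off with distance (normalized to sum to 1)
    y, x = np.mgrid[-kernel_radius:kernel_radius + 1, -kernel_radius:kernel_radius + 1]
    kernel = 1.0 / (x ** 2 + y ** 2 + 1.0)
    kernel /= kernel.sum()

    scattered = fftconvolve(scene, kernel, mode="same")
    return (1.0 - scatter_fraction) * scene + scatter_fraction * scattered

# A small dark letter-like patch on a bright surround: its retinal (Michelson)
# contrast is lower than the scene contrast because of the scattered light.
scene = np.full((200, 200), 100.0)
scene[95:105, 95:105] = 1.0
ret = retinal_image(scene)
target, surround = ret[95:105, 95:105].mean(), ret[0:50, 0:50].mean()
print((surround - target) / (surround + target))  # retinal contrast < scene contrast
```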
The performance of the MaxRGB illumination-estimation method for color constancy and
automatic white balancing has been reported in the literature as being mediocre at best;
however, MaxRGB has usually been tested on images of only 8 bits per channel. The question
arises as to whether the method itself is inadequate, or rather whether it has simply been
tested on data of inadequate dynamic range. To address this question, a database of sets of
exposure-bracketed images was created. The image sets include exposures ranging from very
underexposed to slightly overexposed. The color of the scene illumination was determined by
taking an extra image of the scene containing 4 Gretag Macbeth mini Colorcheckers placed at
an angle to one another. MaxRGB was then run on the images of increasing exposure. The
results clearly show that its performance drops dramatically when the 14-bit exposure range of
the Nikon D700 camera is exceeded, thereby resulting in clipping of high values. For those
images exposed such that no clipping occurs, the median error in MaxRGB's estimate of the
color of the scene illumination is found to be relatively small.
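The MaxRGB estimator itself is only a few lines; the following is a generic sketch with a hypothetical clipping threshold, not the camera-specific handling used in the study:

```python
import numpy as np

def maxrgb_illuminant(image, clip_level=None):
    """Estimate the scene illuminant as the per-channel maximum of an RGB image.

    image      : (H, W, 3) array of linear RGB values
    clip_level : values at or above this level are treated as clipped and ignored
    """
    estimate = np.empty(3)
    for ch in range(3):
        channel = image[..., ch]
        if clip_level is not None:
            channel = channel[channel < clip_level]  # discard clipped pixels
        estimate[ch] = channel.max()
    return estimate / np.linalg.norm(estimate)  # normalized illuminant direction

# White balancing divides each channel by the estimated illuminant component.
rng = np.random.default_rng(5)
img = rng.uniform(0, 1, (480, 640, 3)) * np.array([0.9, 0.7, 0.5])  # hypothetical cast
illum = maxrgb_illuminant(img)
balanced = img / illum
```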
Human observers are able to make fine discriminations of surface gloss. What cues are they using to perform this task? In
previous studies, we identified two reflection-related cues, the contrast of the reflected image (c, contrast gloss) and the sharpness of the reflected image (d, distinctness-of-image gloss), but these were for objects rendered in standard dynamic range (SDR) images with
compressed highlights. In ongoing work, we are studying the effects of image dynamic range on perceived gloss, comparing high
dynamic range (HDR) images with accurate reflections and SDR images with compressed reflections. In this paper, we first present
the basic findings of this gloss discrimination study then present an analysis of eye movement recordings that show where observers
were looking during the gloss discrimination task. The results indicate that: 1) image dynamic range has significant influence on
perceived gloss, with surfaces presented in HDR images being seen as glossier and more discriminable than their SDR counterparts;
2) observers look at both light source highlights and environmental interreflections when judging gloss; and 3) both of these results
are modulated by surface geometry and scene illumination.
In this work we simulate the effect of the human eye's maladaptation on visual perception over time through a
supra-threshold contrast perception model that comprises adaptation mechanisms. Specifically, we attempt to
visualize maladapted vision on a display device. Given the scene luminance, the model computes a measure of
perceived multi-scale contrast by taking into account spatially and temporally varying contrast sensitivity in a
maladapted state, which is then processed by the inverse model and mapped to a desired display's luminance
assuming perfect adaptation. Our system simulates the effect of maladaptation locally, and models the shifting of
peak spatial frequency sensitivity in maladapted vision in addition to the uniform decrease in contrast sensitivity
among all frequencies. Through our GPU implementation we demonstrate the visibility loss of scene details due
to maladaptation over time at an interactive speed.
Perceptual and Cognitive Experiments in Virtual Environments
In this paper, we explored the use of low fidelity Synthetic Environments (SE; i.e., a combination of simulation
techniques) for product design. We explored the usefulness of low fidelity SE to make design problems explicit. In
particular, we were interested in the influence of interactivity on user experience. For this purpose, an industrial design
case was taken: the innovation of an airplane galley. A virtual airplane was created in which an interactive model of the
galley was placed. First, three groups of participants explored the SE in different conditions: Participants explored the SE
interactively (Interactive condition), watched a recording (Passive Dynamic condition), or watched static images
(Passive Static condition). Afterwards, participants were tested in a questionnaire on how accurately they had memorized
the spatial layout of the SE. The results revealed that interactive SE does not necessarily provoke participants to
memorize spatial layouts more accurately. However, the effect of interactive learning is dependent on the participants'
Visual Spatial Ability (VSA). Consequently, this finding supports use of interactive exploration of prototypes through
low fidelity SE for the product design cycle when taking the individual's characteristics into account.
Cognition, Attention, and Eye Movements in Image Analysis
This work addresses the problem of document image analysis, and more particularly the topic of document structure recognition in old, damaged, and handwritten documents. The goal of this paper is to present the relevance of human perceptive vision for document analysis. We focus on two aspects of the model of perceptive vision: the perceptive cycle and visual attention. We present the key elements of perceptive vision that can be used for document analysis.
We then introduce perceptive vision into an existing method for document structure recognition, which enables us both to show how we used the properties of perceptive vision and to compare the results obtained with and without it. We apply our method to the analysis of several kinds of documents (archive registers, old newspapers, incoming mail, etc.) and show that perceptive vision significantly improves their recognition. Moreover, the use of perceptive vision simplifies the description of complex documents. Finally, the running time is often reduced.
In the current study we examine how letter permutation affects the visual recognition of words in two orthographically dissimilar languages, Urdu and German. We present the hypothesis that recognition or reading of permuted and non-permuted words are two distinct mental processes, and that people use different strategies in handling permuted words as compared to normal words. A comparison between the reading behavior of people in these two languages is also presented. We frame our study in the context of dual-route theories of reading and observe that dual-route theory is consistent with our hypothesis of distinct underlying cognitive behavior for reading permuted and non-permuted words. We conducted three lexical decision experiments to analyze how reading is degraded or affected by letter permutation. We performed analysis of variance (ANOVA), a distribution-free rank test, and t-tests to determine significant differences in response time latencies for the two classes of data. Results showed that recognition accuracy for permuted words decreased by 31% for Urdu and 11% for German. We also found a considerable difference in reading behavior between cursive and alphabetic scripts, with reading of Urdu comparatively slower than reading of German due to the characteristics of its cursive script.
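The statistical comparison of response-time latencies can be sketched with standard tests (hypothetical latency arrays, not the authors' data):

```python
import numpy as np
from scipy.stats import f_oneway, mannwhitneyu, ttest_ind

rng = np.random.default_rng(6)
rt_normal = rng.normal(650, 80, size=200)     # hypothetical latencies, non-permuted words (ms)
rt_permuted = rng.normal(780, 110, size=200)  # hypothetical latencies, permuted words (ms)

t_stat, t_p = ttest_ind(rt_normal, rt_permuted)      # parametric comparison
u_stat, u_p = mannwhitneyu(rt_normal, rt_permuted)   # distribution-free rank test
f_stat, f_p = f_oneway(rt_normal, rt_permuted)       # one-way ANOVA (two groups)
print(t_p, u_p, f_p)
```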
Smooth pursuit eye movements align the retina with moving targets, ideally stabilizing the retinal image. At steady state, eye movements typically reach an approximately constant velocity which depends on, and is usually lower than, the target velocity. Experiment 1 investigated the effect of target size and velocity on smooth pursuit induced by realistic
images (color photographs of an apple and flower subtending 2° and 17°, respectively), in comparison with a small dot
subtending a fraction of a degree. The extended stimuli were found to enhance smooth pursuit gain. Experiment 2
examined the absolute velocity limit of smooth pursuit elicited by the small dot and the effect of the extended targets on
the velocity limit. The eye velocity for tracking the dot was found to be saturated at about 63 deg/sec while the saturation
velocity occurred at higher velocities for the extended images. The difference in gain due to target size was significant between the dot and the two extended stimuli, while no statistical difference was found between the apple and flower stimuli of wider angular extent. Detailed knowledge of smooth pursuit eye movements is important for several areas of
electronic imaging, in particular, assessing perceived motion blur of displayed objects.
Color preference is an important aspect of human behavior, but little is known about why people like some colors more
than others. Recent results from the Berkeley Color Project (BCP) provide detailed measurements of preferences among
32 chromatic colors as well as other relevant aspects of color perception. We describe the fit of several color preference
models, including ones based on cone outputs, color-emotion associations, and Palmer and Schloss's ecological valence
theory. The ecological valence theory postulates that color serves an adaptive "steering" function, analogous to taste
preferences, biasing organisms to approach advantageous objects and avoid disadvantageous ones. It predicts that people
will tend to like colors to the extent that they like the objects that are characteristically that color, averaged over all such
objects. The ecological valence theory predicts 80% of the variance in average color preference ratings from the
Weighted Affective Valence Estimates (WAVEs) of correspondingly colored objects, much more variance than any of
the other models. We also describe how hue preferences for single colors differ as a function of gender, expertise,
culture, social institutions, and perceptual experience.
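The WAVE computation behind the ecological valence theory can be sketched as a match-weighted average of object valence ratings for each color, which is then correlated with mean color preference. The numbers below are toy stand-ins; the BCP used 32 chromatic colors and empirically collected ratings.

```python
import numpy as np

def wave_scores(match, valence):
    """Weighted Affective Valence Estimate for each color.

    match   : (n_colors, n_objects) color-object match ratings
    valence : (n_objects,) affective valence ratings of the objects
    """
    return (match * valence).sum(axis=1) / match.sum(axis=1)

# Toy stand-ins: 8 colors, 30 object categories
rng = np.random.default_rng(7)
match = rng.uniform(0, 1, (8, 30))
valence = rng.uniform(-100, 100, 30)
preference = wave_scores(match, valence) + rng.normal(0, 5, 8)  # hypothetical ratings

r = np.corrcoef(wave_scores(match, valence), preference)[0, 1]
print(f"variance explained: {r**2:.2f}")  # the BCP reports about 0.80 for WAVEs
```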
The previous literature on the aesthetics of color combinations has produced confusing and conflicting claims. For
example, some researchers suggest that color harmony increases with increasing hue similarity whereas others say it
increases with hue contrast. We argue that this confusion is best resolved by considering three distinct judgments about
color pairs: (a) preference for the pair as a whole, (b) perceived harmony of the two colors, and (c) preference for the
figural color when viewed against the background color. Empirical support for this distinction shows that pair
preference and harmony ratings both increase as hue similarity increases, but preference correlates more strongly with
component color preferences and lightness contrast than does harmony. Although ratings of both pair preference and
harmony decrease as hue contrast increases, ratings of figural color preference increase as hue contrast with the
background increases. Our results refine and clarify well-known and often contradictory claims of artistic color theory.
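To make the predictor variables concrete, the sketch below computes a generic hue-difference and lightness-contrast measure for a figure/background color pair in CIELCh-style coordinates; these formulas are illustrative stand-ins, not the measures used in the study.

    def hue_difference(h1_deg, h2_deg):
        # Smallest angular difference between two hue angles, in degrees (0-180).
        d = abs(h1_deg - h2_deg) % 360.0
        return min(d, 360.0 - d)

    def lightness_contrast(l1, l2):
        # Absolute difference in CIE L* (0-100).
        return abs(l1 - l2)

    # Hypothetical figure/background pair, given as (L*, hue angle in degrees).
    figure, background = (70.0, 40.0), (40.0, 220.0)
    print("hue contrast:", hue_difference(figure[1], background[1]))
    print("lightness contrast:", lightness_contrast(figure[0], background[0]))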
Factors governing human preference for artwork have long been studied, but many gaps remain in our understanding. Bearing in mind the contextual factors (both the conditions under which art is viewed and the knowledge viewers bring to it) that play some role in preference, we address three questions in this paper. First, what is the relationship between perceived similarity and preference for different types of art? Second, are we naturally drawn to certain qualities in art, and perhaps to certain image statistics? And third, do social and economic forces tend to select preferred stimuli, or are these forces governed by non-aesthetic factors such as age, rarity, or artist notoriety? To address the first question, we tested the notion that perceived similarity predicts preference for three
classes of paintings: landscape, portrait/still-life, and abstract works. We find that preference is significantly correlated
with (a) the first principal component of similarity in abstract works; and (b) the second principal component for
landscapes. However, portrait/still-life images did not show a significant correlation between similarity and preference,
perhaps due to effects related to face perception. The preference data were then compared to a wide variety of image
statistics relevant to early visual system coding. For landscapes and abstract works, nonlinear spatial and intensity
statistics relevant to visual processing explained surprisingly large portions of the variance of preference. For abstract
works, a quarter of the variance of preference rankings could be explained by a statistic gauging pixel sparseness. For
landscape paintings, spatial frequency amplitude spectrum statistics explained one fifth of the variance of preference
data. Consistent with results for similarity, image statistics for portrait/still-life works did not correlate significantly with
preference. Finally, we addressed the role of value. If there are shared "rules" of preference, one might expect "free
markets" to value art in proportion to its aesthetic appeal, at least to some extent. To assess the role of value, a further
test of preference was performed on a separate set of paintings recently sold at auction. Results showed that the selling
price of these works showed no correlation with preference, while basic statistics were significantly correlated with
preference. We conclude that selling price, which could be seen as a proxy for a painting's "value," is not predictive of
preference, while shared preferences may to some extent be predictable based on image statistics. We also suggest that
contextual and semantic factors play an important role in preference given that image content appears to lead to greater
divergence between similarity and preference ratings for representational works, and especially for artwork that
prominently depicts faces. The present paper paves the way for a more complete understanding of the relationship
between shared human preferences and image statistical regularities, and it outlines the basic geometry of perceptual
spaces for artwork.
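The kinds of statistics mentioned here are straightforward to illustrate. The sketch below computes two generic examples for a grayscale image: the slope of the radially averaged Fourier amplitude spectrum and the excess kurtosis of pixel intensities as a crude sparseness index. These are stand-ins chosen for illustration, not reproductions of the statistics used in the study.

    import numpy as np

    def amplitude_spectrum_slope(img):
        # Slope of log amplitude vs. log spatial frequency (radially averaged).
        f = np.fft.fftshift(np.fft.fft2(img - img.mean()))
        amp = np.abs(f)
        h, w = img.shape
        yy, xx = np.indices((h, w))
        r = np.hypot(yy - h / 2, xx - w / 2).astype(int)
        counts = np.bincount(r.ravel())
        radial = np.bincount(r.ravel(), weights=amp.ravel()) / np.maximum(counts, 1)
        freqs = np.arange(1, min(h, w) // 2)          # skip DC, stay inside Nyquist
        slope, _ = np.polyfit(np.log(freqs), np.log(radial[freqs]), 1)
        return slope

    def pixel_sparseness(img):
        # Excess kurtosis of pixel intensities: higher values = sparser, heavier tails.
        z = (img.ravel() - img.mean()) / img.std()
        return (z ** 4).mean() - 3.0

    img = np.random.rand(256, 256)                    # placeholder for a digitized painting
    print(amplitude_spectrum_slope(img), pixel_sparseness(img))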
This paper examines the modernist phenomenon of seeking subject material outside the realm of the purely physical.
Starting with the Impressionist project to paint light, or at least its effects, the paper traces a trajectory of related experimentation through Man Ray's "rayographs" and solarization images of the 1920s and 1930s, to the work of conceptual artists of the sixties and seventies who focused their attention on ephemeral media: for example, Michael Asher's site-specific pressurized-air works, or Robert Barry's pioneering use of unorthodox materials such as inert gases and carrier waves. More recently, Dan Flavin and James Turrell have focused their artistic endeavors on light and its color components, and the Belgian artist Ann Veronica Janssens has used light, color, and fog to create interactive sculptures. Today, what remains invisible to the human eye has taken on a more sinister, and often political, connotation of deeper forces at work, as evinced most clearly by Trevor Paglen's The Other Night Sky (2008) series, for which he tracked and photographed the movements of classified aircraft and geostationary satellites in the California night skies. All these works take the subject of art to be aesthetic perception, and consequently they favor the means by which the experience of viewing is directed by the artist, rather than the traditional art object.
The Haboku Landscape of Sesshu Toyo is perhaps one of the finest surviving examples of Japanese and Chinese monk landscape painting. We analyze the factors that went into this painting from an artistic and aesthetic perspective, and we model the painting using an MPEG-7 description. We examine prior work on rendering ink landscapes with computer-generated non-photorealistic rendering (NPR). Finally, we make some observations about measuring aesthetics in Chinese and Japanese ink painting.
Some eighteen portraits of Leonardo in old age are now recognized, consolidating the impression from his best-established self-portrait of an old man with long white hair and beard. However, his appearance when younger is generally regarded as unknown, although he was described as very beautiful as a youth. Application of the principles of metric iconography, the quantitative analysis of painted images, provides an avenue for identifying other works that may be proposed as valid portraits of Leonardo at various stages of his life, both by himself and by his contemporaries. Overall, this approach identifies portraits of Leonardo by Verrocchio, Raphael, Botticelli, and others. Beyond this physiognomic analysis, Leonardo's first known drawing provides further insight into his core motivations. Topographic considerations make clear that the drawing is of the hills behind Vinci, with a view overlooking the rocky promontory of the town and the plain stretching out before it. The outcroppings in the foreground bear a striking resemblance to those of his unique composition, 'The Virgin of the Rocks', suggesting a deep childhood appreciation of this wild terrain and an identification with that religious man of the mountains, John the Baptist, who was also the subject of Leonardo's last known painting. Following this trail leads to a line of possible self-portraits continuing the age-regression concept back to a self-view at about two years of age.
This paper presents a novel system that employs an adaptive neural network for the no-reference assessment of perceived
quality of JPEG/JPEG2000 coded images. The adaptive neural network simulates the human visual system as a black
box, avoiding its explicit modeling. It uses image features and the corresponding subjective quality score to learn the
unknown relationship between an image and its perceived quality. Related approaches in the literature extract a considerable number of features to form the input to the neural network. This potentially increases the system's complexity and, consequently, may affect its prediction accuracy. Our proposed method optimizes the feature-extraction stage by selecting the most relevant features: we show that the number of features needed for the neural network can be greatly reduced when gradient-based information is used. Additionally, the proposed method demonstrates that a common adaptive framework can support quality estimation for both compression methods. The performance of the method is evaluated on a publicly available database of images and their quality scores. The results show that our proposed no-reference method for quality prediction of JPEG and JPEG2000 coded images performs comparably to the leading metrics in the literature, but at considerably lower complexity.
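As an illustration of this kind of pipeline (not the authors' implementation), the sketch below extracts a handful of gradient-based features and trains a small scikit-learn regressor on hypothetical subjective scores; the feature set, network size, and training data are all assumptions.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def gradient_features(img):
        # A few gradient-magnitude statistics as a compact feature vector.
        gy, gx = np.gradient(img.astype(float))
        mag = np.hypot(gx, gy)
        return np.array([mag.mean(), mag.std(), np.percentile(mag, 95),
                         (mag > mag.mean()).mean()])

    # Hypothetical training set: decoded images with subjective quality scores (e.g. MOS).
    rng = np.random.default_rng(0)
    train_imgs = [rng.random((64, 64)) for _ in range(50)]
    train_mos = rng.random(50) * 5.0

    X = np.stack([gradient_features(im) for im in train_imgs])
    model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    model.fit(X, train_mos)

    # Predict the quality of a new (here random) image from its gradient features.
    print(model.predict(gradient_features(rng.random((64, 64)))[None, :]))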
The current widespread use of webcams for personal video communication over the Internet suggests that
opportunities exist to develop video communication systems optimized for domestic use. We discuss both prior and existing technologies, along with results of user studies that indicate users' needs and expectations for personal video communication. In particular, users expect an easy-to-use, high-image-quality video system that enables multitasking communication during real-world activities and provides appropriate privacy controls. To address these needs, we propose an approach premised on automated capture of user activity. We
then describe a method that adapts cinematography principles, with a dual-camera videography system, to automatically
control image capture relative to user activity, using semantic or activity-based cues to determine user position and
motion. In particular, we discuss an approach to automatically manage shot framing, shot selection, and shot transitions,
with respect to one or more local users engaged in real-time, unscripted events, while transmitting the resulting video to
a remote viewer. The goal is to tightly frame subjects (to provide more detail), while minimizing subject loss and
repeated abrupt shot framing changes in the images as perceived by a remote viewer. We also discuss some aspects of
the system and related technologies that we have experimented with thus far. In summary, the method enables users to
participate in interactive video-mediated communications while engaged in other activities.
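A minimal sketch of the kind of rule-based shot selection described, with hysteresis to avoid repeated abrupt framing changes; the thresholds, hold time, and the use of face size as the activity cue are assumptions for illustration, not the authors' parameters.

    def select_shot(face_box, frame_w, current_shot, frames_held, min_hold=15):
        # Choose between a tight "close" shot and a "wide" shot from detected face size,
        # holding each shot for at least `min_hold` frames to avoid abrupt switching.
        if frames_held < min_hold:
            return current_shot, frames_held + 1
        face_fraction = face_box[2] / frame_w                  # face width relative to frame
        if current_shot == "wide" and face_fraction < 0.15:    # assumed threshold
            return "close", 0                                  # subject small: cut to tight framing
        if current_shot == "close" and face_fraction > 0.45:   # assumed threshold
            return "wide", 0                                   # subject fills frame: cut to wide shot
        return current_shot, frames_held + 1

    shot, held = "wide", 0
    for box in [(100, 80, 40, 40)] * 30:     # placeholder face detections (x, y, w, h)
        shot, held = select_shot(box, frame_w=640, current_shot=shot, frames_held=held)
    print(shot)                              # "close" for this small, steady face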
Recently, Seshadrinathan and Bovik proposed the Motion-based Video Integrity Evaluation (MOVIE) index for video quality assessment (VQA) [1, 2]. MOVIE uses a multi-scale spatio-temporal Gabor filter bank to decompose the videos and to compute motion vectors. Apart from its psychovisual inspiration, MOVIE is an interesting option for VQA owing to its performance. However, using MOVIE in a practical setting may prove difficult because of its multi-scale optical flow computation. To bridge the gap between the conceptual elegance of MOVIE and a practical VQA algorithm, we propose a new VQA algorithm, spatio-temporal video SSIM, based on the essence of MOVIE. Spatio-temporal video SSIM uses motion information computed by a block-based motion-estimation algorithm and measures quality using a localized set of oriented spatio-temporal filters. In this paper we explain the algorithm and demonstrate its conceptual similarity to MOVIE; we explore its computational complexity and evaluate its performance on the popular VQEG dataset. We show that the proposed algorithm allows efficient full-reference (FR) VQA without compromising performance while retaining the conceptual elegance of MOVIE.
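A highly simplified sketch of the underlying idea (not the authors' oriented spatio-temporal filter implementation): motion vectors are estimated on the reference video by exhaustive block matching and reused to compare motion-compensated frame differences between reference and distorted video with SSIM. Frames are assumed to be grayscale floats in [0, 1]; block size and search range are arbitrary choices.

    import numpy as np
    from skimage.metrics import structural_similarity as ssim

    def block_motion(prev, curr, y, x, bs=16, search=4):
        # Exhaustive-search motion vector for the block of `curr` at (y, x).
        block = curr[y:y+bs, x:x+bs]
        best, best_mv = np.inf, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy and yy + bs <= prev.shape[0] and 0 <= xx and xx + bs <= prev.shape[1]:
                    sad = np.abs(block - prev[yy:yy+bs, xx:xx+bs]).sum()
                    if sad < best:
                        best, best_mv = sad, (dy, dx)
        return best_mv

    def st_ssim_frame(ref_prev, ref_curr, dist_prev, dist_curr, bs=16):
        # Toy spatio-temporal score: spatial SSIM plus SSIM of motion-compensated
        # frame differences; motion is estimated on the reference and reused for
        # the distorted video.
        spatial, temporal = [], []
        for y in range(0, ref_curr.shape[0] - bs + 1, bs):
            for x in range(0, ref_curr.shape[1] - bs + 1, bs):
                dy, dx = block_motion(ref_prev, ref_curr, y, x, bs)
                ref_diff = ref_curr[y:y+bs, x:x+bs] - ref_prev[y+dy:y+dy+bs, x+dx:x+dx+bs]
                dist_diff = dist_curr[y:y+bs, x:x+bs] - dist_prev[y+dy:y+dy+bs, x+dx:x+dx+bs]
                temporal.append(ssim(ref_diff, dist_diff, data_range=2.0))
                spatial.append(ssim(ref_curr[y:y+bs, x:x+bs], dist_curr[y:y+bs, x:x+bs],
                                    data_range=1.0))
        return float(np.mean(spatial)), float(np.mean(temporal))

    rng = np.random.default_rng(0)
    ref_prev = rng.random((64, 64)); ref_curr = np.roll(ref_prev, 2, axis=1)
    dist_prev = ref_prev + 0.02 * rng.normal(size=ref_prev.shape)
    dist_curr = ref_curr + 0.02 * rng.normal(size=ref_curr.shape)
    print(st_ssim_frame(ref_prev, ref_curr, dist_prev, dist_curr))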
Television delivered over Internet Protocol networks (IPTV) is moving towards high definition (HDTV). There has been considerable work on how HDTV is affected by different codecs and bitrates, but the impact of transmission errors over IP networks has been studied less.
This study focused on the H.264-encoded 1280x720 progressive HDTV format and compared three concealment methods under different packet loss rates: one included in a proprietary decoder, one that is part of FFmpeg, and freezing of different durations. The target is to simulate what typical IPTV set-top boxes do when encountering packet loss. A further aim is to study whether presenting the video upscaled to the full HDTV screen, or pixel-mapped in a smaller area at the center of the screen, has an effect on quality.
The results show differences between the two packet loss concealment methods, in FFmpeg and in the proprietary codec. Freezing appeared to have an effect similar to that reported previously. At low rates of transmission errors, coding impairments affect quality, but at higher rates they no longer do, since they are overshadowed by the transmission errors. An interesting effect was found in which higher-bitrate videos go from having higher quality at low packet loss rates to having lower quality than the lower-bitrate videos at higher packet loss rates. The presentation mode, upscaled or not upscaled, was significant at the 95% level, but only marginally so.
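A minimal sketch of the freezing strategy discussed: frames hit by simulated packet loss are replaced by the last correctly received frame. The loss model (independent per-frame loss) and parameters are assumptions for illustration only.

    import random

    def freeze_conceal(frames, loss_rate, seed=0):
        # Replace "lost" frames with the last correctly received frame.
        rng = random.Random(seed)
        output, last_good = [], frames[0]
        for frame in frames:
            if rng.random() < loss_rate:   # this frame was hit by packet loss
                output.append(last_good)   # freeze: repeat the last good frame
            else:
                last_good = frame
                output.append(frame)
        return output

    frames = list(range(10))               # stand-in for a sequence of decoded frames
    print(freeze_conceal(frames, loss_rate=0.3))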
Channelized Hotelling observers have been evaluated for signal detection in medical images. However, these model observers tend to overestimate detection performance relative to human observers. Here, we present a modified channelized Hotelling observer with a divisive normalization mechanism. In this model, images first pass through a series of Gabor filters of different orientations, phases, and spatial frequencies. The channel outputs then pass through a nonlinearity and are pooled into several channels. Finally, the channel responses are normalized through a divisive operation. Human performance on nodule detection in chest radiographs is evaluated in 2AFC experiments, and the same detection tasks are performed by the model observers. The results show that the modified channelized Hotelling observer predicts human performance well.
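For readers unfamiliar with this model class, the sketch below shows the generic computations involved: Gabor channel responses, a squaring nonlinearity, divisive normalization, and a Hotelling template, applied to randomly generated stand-in images. Filter parameters, the normalization constant, and the toy signal are assumptions, not the authors' settings.

    import numpy as np

    def gabor_bank(size=64, freqs=(0.05, 0.1, 0.2), n_orient=4):
        # A small bank of Gabor-like channel filters (illustrative parameters only).
        y, x = np.mgrid[-size//2:size//2, -size//2:size//2]
        filters = []
        for f in freqs:
            for th in np.linspace(0, np.pi, n_orient, endpoint=False):
                carrier = np.cos(2 * np.pi * f * (x * np.cos(th) + y * np.sin(th)))
                envelope = np.exp(-(x**2 + y**2) / (2 * (0.5 / f)**2))
                filters.append(carrier * envelope)
        return filters

    def channel_responses(img, filters, sigma=1.0):
        # Channel outputs with a squaring nonlinearity and divisive normalization.
        raw = np.array([(img * f).sum() for f in filters])
        energy = raw ** 2
        return energy / (sigma**2 + energy.sum())

    def hotelling_observer(signal_imgs, noise_imgs, filters):
        # Hotelling template and per-image decision variables in channel space.
        vs = np.array([channel_responses(im, filters) for im in signal_imgs])
        vn = np.array([channel_responses(im, filters) for im in noise_imgs])
        cov = np.cov(np.vstack([vs, vn]).T) + 1e-6 * np.eye(vs.shape[1])
        template = np.linalg.solve(cov, vs.mean(0) - vn.mean(0))
        return vs @ template, vn @ template

    filters = gabor_bank()
    rng = np.random.default_rng(0)
    noise = [rng.normal(size=(64, 64)) for _ in range(40)]
    signal = [n + 0.5 * filters[0] for n in noise]   # toy "nodule" = a faint Gabor patch
    t_signal, t_noise = hotelling_observer(signal, noise, filters)
    print("2AFC proportion correct:", (t_signal[:, None] > t_noise[None, :]).mean())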
Perceptual image distortion measures can play a fundamental role in evaluating and optimizing imaging systems
and image processing algorithms. Many existing measures are formulated to represent "just noticeable differences"
(JNDs), as measured in psychophysical experiments on human subjects. But some image distortions,
such as those arising from small changes in the intensity of the ambient illumination, are far more tolerable to
human observers than those that disrupt the spatial structure of intensities and colors. Here, we introduce a
framework in which we quantify these perceptual distortions in terms of "just intolerable differences" (JIDs).
As in (Wang & Simoncelli, Proc. ICIP 2005), we first construct a set of spatio-chromatic basis functions to
approximate (as a first-order Taylor series) a set of "non-structural" distortions that result from changes in
lighting/imaging/viewing conditions. These basis functions are defined on local image patches, and are adaptive,
in that they are computed as functions of the undistorted reference image. This set is then augmented with a
complete basis arising from a linear approximation of the CIELAB color space. Each basis function is weighted
by a scale factor to convert it into units corresponding to JIDs. Each patch of the error image is represented
using this weighted overcomplete basis, and the overall distortion metric is computed by summing the squared
coefficients over all such (overlapping) patches. We implement an example of this metric, incorporating invariance
to small changes in the viewing and lighting conditions, and demonstrate that the resulting distortion values
are more consistent with human perception than those produced by CIELAB or S-CIELAB.
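A schematic sketch of the patch-level computation described: each error patch is expanded in a weighted (over)complete basis by least squares and the squared coefficients are summed over overlapping patches. The basis, weights, and patch size here are random placeholders, not the spatio-chromatic functions of the paper.

    import numpy as np

    def patch_distortion(error_patch, basis, jid_weights):
        # Squared-coefficient distortion of one error patch in a weighted basis.
        # basis: (n_pixels, n_basis) matrix whose columns are basis functions;
        # jid_weights: per-basis scale factors converting coefficients to JID units.
        coeffs, *_ = np.linalg.lstsq(basis, error_patch.ravel(), rcond=None)
        return float(np.sum((coeffs / jid_weights) ** 2))

    def image_distortion(ref, dist, basis, jid_weights, patch=8):
        # Sum patch distortions over overlapping patches (stride = half a patch).
        err, total = dist - ref, 0.0
        for y in range(0, ref.shape[0] - patch + 1, patch // 2):
            for x in range(0, ref.shape[1] - patch + 1, patch // 2):
                total += patch_distortion(err[y:y+patch, x:x+patch], basis, jid_weights)
        return total

    rng = np.random.default_rng(1)
    basis = rng.normal(size=(64, 80))     # placeholder overcomplete basis for 8x8 patches
    weights = np.ones(80)                 # placeholder JID scale factors
    ref = rng.random((32, 32))
    dist = ref + 0.01 * rng.normal(size=ref.shape)
    print(image_distortion(ref, dist, basis, weights))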
Recognizing hand gestures from video sequences acquired by a moving camera could provide a useful interface between humans and mobile robots. We develop a state-based approach to extract and recognize hand gestures from moving-camera images. We improved the Human-Following Local Coordinate (HFLC) system, a simple and stable method for extracting hand motion trajectories, which is obtained from the located human face, body parts, and the hand blob changing factor. The Condensation algorithm and a PCA-based algorithm were used to recognize the extracted hand trajectories. In previous work, the Condensation-based method was applied only to one person's hand gestures. In this paper, we propose a principal component analysis (PCA) based approach to improve recognition accuracy. For further improvement, temporal changes in the observed hand area changing factor are used as additional image features, analyzed by PCA and stored in the database. Every hand gesture trajectory in the database is classified as a one-hand gesture, a two-hand gesture, or a temporal change in the hand blob. We demonstrate the effectiveness of the proposed method through experiments on 45 kinds of gestures based on Japanese and American Sign Language, obtained from 5 people. Our experimental results show that the PCA-based approach performs better than the Condensation-based method.
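A minimal sketch of PCA-based trajectory matching of the kind described: resampled (x, y) hand trajectories are projected onto principal components learned from the gesture database and classified by nearest neighbor; the resampling length, component count, and classifier are assumptions.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier

    def trajectory_vector(points, n=32):
        # Resample an (x, y) hand trajectory to n points and flatten it to a vector.
        points = np.asarray(points, dtype=float)
        t = np.linspace(0, 1, len(points))
        ti = np.linspace(0, 1, n)
        return np.concatenate([np.interp(ti, t, points[:, 0]),
                               np.interp(ti, t, points[:, 1])])

    # Hypothetical gesture database: variable-length trajectories with gesture labels.
    rng = np.random.default_rng(0)
    trajectories = [rng.random((int(rng.integers(20, 60)), 2)) for _ in range(30)]
    labels = rng.integers(0, 3, size=30)   # three placeholder gesture classes

    X = np.stack([trajectory_vector(tr) for tr in trajectories])
    pca = PCA(n_components=8).fit(X)
    clf = KNeighborsClassifier(n_neighbors=1).fit(pca.transform(X), labels)

    query = rng.random((45, 2))            # a newly extracted hand trajectory
    print("predicted gesture:", clf.predict(pca.transform(trajectory_vector(query)[None, :])))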