This PDF file contains the front matter associated with SPIE
Proceedings Volume 6764, including the Title Page, Copyright
information, Table of Contents, and the Conference Committee listing.
We propose to use new SVM-type classifiers in a binary hierarchical tree classification structure to efficiently address the multi-class classification problem. A new hierarchical design method, WSV (weighted support vector) K-means Clustering, is presented; it automatically selects the classes to be separated at each node in the hierarchy. Our method is able to visualize and cluster high-dimensional support vector data; therefore, it improves upon prior hierarchical classifier designs. At each node in the hierarchy, we apply an SVRDM (support vector representation and discrimination machine) classifier, which offers generalization and good rejection of unseen false objects; rejection is not achieved with the standard SVM classifier. We provide the theoretical basis and insight into the choice of the Gaussian kernel to provide the SVRDM's rejection ability. New classification and rejection test results are presented on a real IR (infra-red) database.
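A minimal sketch of the node-level idea, using standard scikit-learn components in place of the authors' SVRDM and WSV K-means clustering: the classes at a node are split into two groups by clustering their means, a Gaussian-kernel SVM separates the groups, and a threshold on the decision value rejects unfamiliar (false) inputs. All parameter values are illustrative.

```python
# Hedged sketch: scikit-learn stand-ins for the hierarchical SVM-type design.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def split_classes(X, y):
    """Group the classes at a node into two super-classes by K-means on class means."""
    labels = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in labels])
    side = KMeans(n_clusters=2, n_init=10).fit_predict(means)
    return {c: s for c, s in zip(labels, side)}

def train_node(X, y, reject_thresh=0.1):
    """Train a Gaussian-kernel SVM at one tree node; return (model, split, threshold)."""
    split = split_classes(X, y)
    y_bin = np.array([split[c] for c in y])
    clf = SVC(kernel="rbf", gamma="scale").fit(X, y_bin)
    return clf, split, reject_thresh

def classify_node(clf, x, reject_thresh):
    """Return the 0/1 branch decision, or None to reject an unseen false object."""
    score = clf.decision_function(x.reshape(1, -1))[0]
    if abs(score) < reject_thresh:      # low confidence -> reject
        return None
    return int(score > 0)
```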
The purpose of this paper is to introduce a concept of eclecticism for the design, development, simulation
and implementation of a real-time controller for intelligent, vision-guided robots. The use of an eclectic
perceptual, creative controller that can select its own tasks and perform autonomous operations is
illustrated. This eclectic controller is a new paradigm for robot controllers and is an attempt to simplify the
application of intelligent machines in general and robots in particular. The idea is to use a task control
center and dynamic programming approach. However, the information required for an optimal solution
may only partially reside in a dynamic database, so some tasks are impossible to accomplish. A decision must therefore be made about the feasibility of a solution to a task before the task is attempted. Even when
tasks are feasible, an iterative learning approach may be required. The learning could go on forever. The
dynamic database stores both global environmental information and local information including the
kinematic and dynamic models of the intelligent robot. The kinematic model is very useful for position
control and simulations. However, models of the dynamics of the manipulators are needed for tracking
control of the robot's motions. Such models are also necessary for sizing the actuators, tuning the
controller, and achieving superior performance. Simulations of various control designs are shown. Much of
the model has also been used for the actual prototype Bearcat Cub mobile robot. This vision guided robot
was designed for the Intelligent Ground Vehicle Contest. A novel feature of the proposed approach lies in
the fact that it is applicable to both robot arm manipulators and mobile robots such as wheeled mobile
robots. This generality should encourage the development of more mobile robots with manipulator
capability since both models can be easily stored in the dynamic database. The multi-task controller also permits wide applications. The use of manipulators and mobile bases with high-level control is potentially useful for space exploration, certain rescue robots, defense robots, medical robotics, and robots that aid older people in daily living activities.
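The feasibility gate described above can be pictured with a small sketch: before a task is attempted, the task control center checks whether the dynamic database holds the information an optimal solution would need. All names here are illustrative, not the authors' implementation.

```python
# Hedged sketch of a task control center consulting a dynamic database
# before dispatching a task; illustrative names only.
class DynamicDatabase:
    def __init__(self):
        self.store = {}          # e.g. "kinematic_model", "dynamic_model", "obstacle_map"

    def has(self, *keys):
        return all(k in self.store for k in keys)

class TaskControlCenter:
    def __init__(self, db):
        self.db = db

    def dispatch(self, task, required_info):
        if not self.db.has(*required_info):
            return f"task '{task}' rejected: required data missing"
        # here: decompose into sub-tasks and run an iterative (learning) solver
        return f"task '{task}' scheduled"

db = DynamicDatabase()
db.store["kinematic_model"] = "..."
tcc = TaskControlCenter(db)
print(tcc.dispatch("track trajectory", ["kinematic_model", "dynamic_model"]))
```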
We overview National Aeronautics and Space Administration (NASA) objectives for future robotic exploration of lunar,
planetary and small bodies of the Solar System, and present several examples of supporting robotics R&D. The scope of
development spans autonomous surface exploration typified by the Mars Exploration Rovers (MER) and sequel Mars
surface missions, autonomous aerial and subsurface robotic exploration of the outer planet moons, and recently initiated
efforts under the Vision for Space Exploration (VSE) toward a sustained human-robotic presence at the Earth's Moon.
This paper presents the Embedded Object Concept (EOC) and a telepresence robot system which is a test case for the
EOC. The EOC utilizes common object-oriented methods used in software by applying them to combined Lego-like
software-hardware entities. These entities represent objects in object-oriented design methods, and they are the building
blocks of embedded systems. The goal of the EOC is to make the designing of embedded systems faster and easier. This
concept enables people without comprehensive knowledge in electronics design to create new embedded systems, and
for experts it shortens the design time of new embedded systems.
We present the current status of a telepresence robot created with Atomi-objects, which is the name for our
implementation of the embedded objects. The telepresence robot is a relatively complex test case for the EOC. The
robot has been constructed using incremental device development, which is made possible by the architecture of the
EOC. The robot contains video and audio exchange capability and a controlling system for driving with two wheels.
The robot consists of Atomi-objects, demonstrating the suitability of the EOC for prototyping and easy modifications,
and proving the capabilities of the EOC by realizing functionality that normally requires a computer. The computer counterpart is a regular PC with audio and video capabilities running a robot control application. The robot is functional and has been successfully tested.
A better understanding of intelligent information processing in human vision can be reached through a closer look at the
macro- and micro-hardware available in the hierarchy of cortical processors along the main visual pathway connecting
the retina, the CGL (corpus geniculatum laterale) and area V1 (cortical visual area 17). The building of the eye is driven
by the brain and the engineering of the main visual pathway back to V1 seems to be driven by the eyes.
The human eye offers the brain much more intelligent information about the outer visible world than a camera producing flat 2D images on a CCD. Intelligent processing of visual information in human vision - a strong cooperation between eyes and brain - relies on axis-related symmetry operations relevant for navigation in 4D spectral space-times, on a hierarchy of dynamically balanced equilibrium states, on diffractive-optical transformation of the Visible into RGB space, on range mapping based on RGB data (monocular and binocular 3D vision), on illuminant-adaptive optical correlations of local onto global RGB data (color constancy performances) and on invariant Fourier-optical log-polar processing of image data (generic object classification; identification of objects). These performances are more
compatible with optical processing of modern diffractive-optical sensors and interference-optical correlators than with
cameras. The R+D project NAMIROS (Nano- and Micro-3D gratings for Optical Sensors) [8], coordinated by an
interdisciplinary team of specialists at Corrsys 3D Sensors AG, describes the roadmap towards a technical realization of
outstanding high-tech performances corresponding to human eye-brain co-processing.
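As a loose digital analogue of the log-polar processing mentioned above, the following NumPy sketch resamples an image onto a log-polar grid, in which rotation and scaling about the center become translations; it is an illustration only, not the diffractive-optical processing described in the paper.

```python
# Hedged sketch: nearest-neighbour log-polar resampling of a grayscale image.
import numpy as np

def log_polar(img, n_rho=64, n_theta=128):
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = np.hypot(cy, cx)
    rho = np.exp(np.linspace(0, np.log(max_r), n_rho))        # log-spaced radii
    theta = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return img[ys, xs]                                         # shape (n_rho, n_theta)
```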
The Intelligent Ground Vehicle Competition (IGVC) is one of three unmanned-systems student competitions founded by the Association for Unmanned Vehicle Systems International (AUVSI) in the 1990s. The IGVC is
a multidisciplinary exercise in product realization that challenges college engineering student teams to integrate
advanced control theory, machine vision, vehicular electronics, and mobile platform fundamentals to design and build an
unmanned system. Teams from around the world focus on developing a suite of dual-use technologies to equip ground
vehicles of the future with intelligent driving capabilities. Over the past 15 years, the competition has challenged
undergraduate, graduate and Ph.D. students with real world applications in intelligent transportation systems, the military
and manufacturing automation. To date, teams from over 50 universities and colleges have participated. This paper
describes some of the applications of the technologies required by this competition and discusses the educational
benefits. The primary goal of the IGVC is to advance engineering education in intelligent vehicles and related
technologies. The employment and professional networking opportunities created for students and industrial sponsors
through a series of technical events over the four-day competition are highlighted. Finally, an assessment of the
competition based on participation is presented.
This paper describes the implementation of a pedestrian detection system which is based on the Histogram of Oriented
Gradients (HOG) principle and which tries to improve the overall detection performance by combining several part
based detectors in a simple voting scheme. The HOG feature based part detectors are specifically trained for head, head-left,
head-right, and left/right sides of people, assuming that these parts should be recognized even in very crowded
environments like busy public transportation platforms. The part detectors are trained on the INRIA people image
database using a polynomial Support Vector Machine. Experiments are undertaken with completely different test
samples which have been extracted from two imaging campaigns in an outdoor setup and in an underground station. Our
results demonstrate that the performance of pedestrian detection degrades drastically in very crowded scenes, but that
through the combination of part detectors a gain in robustness and detection rate can be achieved at least for classifier
settings which yield very low false positive rates.
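A hedged sketch of the voting scheme, with skimage HOG features and scikit-learn SVMs standing in for the paper's detectors trained on the INRIA data; for brevity every part detector is trained and evaluated on the same fixed-size window, whereas the paper uses part-specific sub-windows.

```python
# Hedged sketch: several part-specific HOG+SVM detectors vote on one window.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

PARTS = ["head", "head_left", "head_right", "side_left", "side_right"]

def hog_feat(window):
    return hog(window, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

def train_part_detector(windows, labels):
    """windows: fixed-size grayscale crops for one body part; labels: 1 (part) / 0 (background)."""
    feats = np.array([hog_feat(w) for w in windows])
    return SVC(kernel="poly", degree=2).fit(feats, labels)

def detect(window, detectors, min_votes=2):
    """Accept the window as a pedestrian if at least `min_votes` part detectors fire."""
    f = hog_feat(window).reshape(1, -1)
    votes = sum(int(detectors[p].predict(f)[0]) for p in PARTS if p in detectors)
    return votes >= min_votes
```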
This paper presents a vision-based tracking system developed for very crowded situations like underground or railway stations. Our system consists of two main parts: searching for people candidates in single frames, and tracking them frame to frame across the scene. This paper concentrates mostly on the tracking part and describes its core components in detail. These are trajectory prediction using KLT vectors or a Kalman filter, adaptive active shape model adjustment, and texture matching. We show that the combination of the presented algorithms leads to robust people tracking even in complex scenes with persistent occlusions.
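A hedged sketch of the trajectory-prediction step using OpenCV's pyramidal KLT tracker: the bounding-box center of a tracked person is propagated by the median optical-flow vector. The Kalman prediction, active shape model adjustment and texture matching from the paper are not shown.

```python
# Hedged sketch: propagate a person's bounding box with sparse KLT flow.
import cv2
import numpy as np

def predict_with_klt(prev_gray, curr_gray, box):
    """box = (x, y, w, h); returns the box translated by the median KLT flow."""
    x, y, w, h = box
    pts = cv2.goodFeaturesToTrack(prev_gray[y:y+h, x:x+w], maxCorners=50,
                                  qualityLevel=0.01, minDistance=3)
    if pts is None:
        return box
    pts = pts.reshape(-1, 2).astype(np.float32) + np.array([x, y], dtype=np.float32)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    good = status.reshape(-1) == 1
    if not good.any():
        return box
    dx, dy = np.median(nxt[good] - pts[good], axis=0)
    return (int(x + dx), int(y + dy), w, h)
```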
As part of the U.S. Department of Transportation's Intelligent Vehicle Initiative (IVI) program, the Federal
Highway Administration (FHWA) is conducting R&D in vehicle safety and driver information systems.
There is an increasing number of applications where pedestrian monitoring is of high importance. Vision-based pedestrian detection in outdoor scenes is still an open challenge. People dress in very different colors
that sometimes blend with the background, wear hats or carry bags, and stand, walk and change directions
unpredictably. The background is varied, containing buildings, moving or parked cars, bicycles, street signs,
signals, etc. Furthermore, existing pedestrian detection systems perform only during daytime, making it
impossible to detect pedestrians at night. Under FHWA funding, we are developing a multi-pedestrian detection system using an IR LED stereo camera. This system, without using any templates, detects the
pedestrians through statistical pattern recognition utilizing 3D features extracted from the disparity map. A
new IR LED stereo camera is being developed, which can help detect pedestrians during both daytime and nighttime. Using image differencing and denoising, we have also developed new methods to estimate the
disparity map of pedestrians in near real time. Our system will have a hardware interface with the traffic
controller through wireless communication. Once pedestrians are detected, traffic signals at the street
intersections will change phases to alert the drivers of approaching vehicles. The initial test results using
images collected at a street intersection show that our system can detect pedestrians in near real time.
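A hedged illustration of the disparity-plus-differencing idea built from standard OpenCV blocks (block-matching stereo, frame differencing, median denoising); the paper's IR LED stereo hardware and statistical 3D-feature classifier are not reproduced.

```python
# Hedged sketch: disparity restricted to moving (candidate pedestrian) pixels.
import cv2
import numpy as np

stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)

def pedestrian_disparity(left, right, prev_left, motion_thresh=20):
    """left/right/prev_left: 8-bit grayscale frames from a rectified stereo pair."""
    disparity = stereo.compute(left, right).astype(np.float32) / 16.0
    moving = cv2.absdiff(left, prev_left) > motion_thresh            # image differencing
    moving = cv2.medianBlur(moving.astype(np.uint8) * 255, 5) > 0    # denoising
    disparity[~moving] = 0        # keep disparity only where motion was detected
    return disparity
```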
This work presents the implementation of the Kanade-Lucas-Tomasi tracking algorithm on a Digital Signal Processor
with a 40-bit fixed-point Arithmetic Logic Unit built into a smart camera. The main goal of this work was to obtain real-time frame processing performance while losing as little tracking accuracy as possible. This task was motivated by
increasing demand for the application of smart cameras as main data processing units in large surveillance systems,
where factors like cost and demand of space are excluding PCs from this role.
In a first effort, the Kanade-Lucas-Tomasi algorithm was converted to integer arithmetic, and in the next step the influence of this modification on stability and accuracy was investigated. It is demonstrated how changing the
numeric data type of intermediate results within the algorithm from float to integer, and decreasing the number of bits
used to store variables, affects tracking accuracy. Nevertheless, the DSP implementation can be used where the computation of optical flow based on a tracking algorithm needs to be done in real time on an embedded platform and where limited subpixel accuracy can be tolerated. As a further result of this implementation, we can conclude that a DSP with a fixed-point arithmetic logic unit can be applied very effectively to complex computer vision tasks and is able to deliver good performance even compared to high-end PC architectures.
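To make the float-to-integer conversion concrete, the following sketch shows Q-format fixed-point arithmetic of the kind a fixed-point DSP ALU performs; the number of fractional bits is an illustrative choice, not the paper's, and it directly limits the attainable subpixel accuracy.

```python
# Hedged sketch: Q-format fixed-point arithmetic vs. floating point.
FRAC_BITS = 8                      # illustrative choice, not the paper's value
SCALE = 1 << FRAC_BITS

def to_fixed(x):  return int(round(x * SCALE))
def to_float(q):  return q / SCALE
def fx_mul(a, b): return (a * b) >> FRAC_BITS   # keep the Q-format after a multiply

# e.g. one term of the KLT 2x2 gradient matrix, G_xx = sum(Ix * Ix)
ix = [0.37, -1.25, 0.80]
g_xx_float = sum(v * v for v in ix)
g_xx_fixed = sum(fx_mul(to_fixed(v), to_fixed(v)) for v in ix)
print(g_xx_float, to_float(g_xx_fixed))   # small quantisation error vs. float
```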
This paper presents an object detection and tracking algorithm that can adapt to object color shift. In this algorithm, we train and build multiple target models using colors observed under different illumination conditions. Each model is called a Color Distinctiveness lookup Table (CDT). The color distinctiveness is a value integrating 1) similarity with target colors and 2) dissimilarity with non-target colors, which represents how distinctively a color can be classified as a target pixel. Color distinctiveness can be used for pixel-wise target detection, because it takes the value 0.5 for colors on the decision boundary of a nearest neighbor classifier in color space. It can also be used for target tracking by continuously finding the most distinctive region. By selecting the most suitable CDT for the camera direction, lighting condition, and camera parameters, the system can adapt to target and background color changes. We implemented this algorithm on a pan-tilt stereo camera system. Through experiments using this system, we confirmed that the algorithm is robust against color shift caused by illumination change and that it can measure the target's 3D position at video rate.
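A hedged sketch of one plausible way to build such a table: for every quantized RGB bin, store d_nontarget / (d_target + d_nontarget), which equals 0.5 exactly on the nearest-neighbor decision boundary; the paper's exact definition may differ.

```python
# Hedged sketch: a quantised colour-distinctiveness lookup table.
import numpy as np

def build_cdt(target_colors, nontarget_colors, bins=32):
    """target/nontarget_colors: (N, 3) RGB samples taken under one illumination."""
    t = np.asarray(target_colors, float)
    n = np.asarray(nontarget_colors, float)
    step = 256 // bins
    centers = np.arange(bins) * step + step / 2
    grid = np.stack(np.meshgrid(centers, centers, centers, indexing="ij"),
                    axis=-1).reshape(-1, 3)                            # (bins^3, 3)
    d_t = np.min(np.linalg.norm(grid[:, None] - t[None], axis=2), axis=1)
    d_n = np.min(np.linalg.norm(grid[:, None] - n[None], axis=2), axis=1)
    return (d_n / (d_t + d_n + 1e-9)).reshape(bins, bins, bins)

def distinctiveness(cdt, pixel, bins=32):
    r, g, b = np.asarray(pixel, dtype=int) * bins // 256
    return cdt[r, g, b]          # ~1 for target-like colours, ~0 for background
```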
This paper describes progress toward a street-crossing system for an outdoor mobile robot. The system can detect and track vehicles in real time. It reasons about extracted motion regions to decide when it is safe to cross.
This paper presents a novel method for human head tracking using multiple cameras. Most existing methods estimate the 3D target position from 2D tracking results at different viewpoints. This framework can easily be affected by inconsistent tracking results on the 2D images, which leads to 3D tracking failure. To solve this problem, an extension of Condensation using multiple images has been proposed. The method generates many hypotheses on a target (human head) in 3D space and estimates the likelihood of each hypothesis by integrating viewpoint-dependent likelihood values of the 2D hypotheses projected onto the image planes. In theory, viewpoint-dependent likelihood values should be integrated by multiplication; however, multiplication is easily affected by occlusions. We therefore investigate this problem, propose a novel likelihood integration method, and implement a prototype system consisting of six PC-and-camera pairs. We confirmed the method's robustness against occlusions.
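A hedged sketch of the integration step: a 3D hypothesis is projected into each camera, per-view likelihoods are fused, and a robust alternative that discards the worst (possibly occluded) views is shown next to plain multiplication; the paper's actual integration rule is not reproduced.

```python
# Hedged sketch: project a 3D hypothesis into each view and fuse likelihoods.
import numpy as np

def project(P, X):
    """P: 3x4 camera matrix, X: 3D point -> 2D pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def fuse_likelihoods(view_likelihoods, drop_worst=1):
    logs = np.log(np.asarray(view_likelihoods) + 1e-12)
    product_score = logs.sum()                         # multiplicative fusion
    robust_score = np.sort(logs)[drop_worst:].sum()    # ignore the worst views
    return product_score, robust_score

# One occluded view (likelihood ~0) ruins the product but not the robust score.
print(fuse_likelihoods([0.8, 0.7, 0.75, 1e-6, 0.9, 0.85]))
```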
Human motion analysis is one of the active research areas in computer vision. The trend shifts from computing motion
fields to determining actions. We present an action coding scheme based on a trajectory of features defined with respect
to a part-based coordinate system. The method does not require a prior human model or special motion capture hardware.
The features are extracted from images segmented in the form of silhouettes. The feature extraction step ignores 3D
effects such as self occlusions or motion perpendicular to the viewing plane. These effects are later revealed in the
trajectory analysis. We demonstrate preliminary experiments.
This paper describes a fully automatic quality system for injection molding. The proposed system includes an
on-line measurement platform with a digital camera, a methodology for adaptive design of experiments (DOE),
statistical modeling, process monitoring, and a closed loop process control. The system has been tested in the
manufacturing of plastic parts for mobile phones.
A better understanding of the color constancy mechanism in human color vision [7] can be reached through analyses of
photometric data of all illuminants and patches (Mondrians or other visible objects) involved in visual experiments. In
Part I [3] and in [4, 5 and 6] the integration in the human eye of the geometrical-optical imaging hardware and the
diffractive-optical hardware has been described and illustrated (Fig.1). This combined hardware represents the main
topic of the NAMIROS research project (nano- and micro- 3D gratings for optical sensors) [8] promoted and coordinated
by Corrsys 3D Sensors AG. The hardware relevant to (photopic) human color vision can be described as a diffractive or
interference-optical correlator transforming incident light into diffractive-optical RGB data and relating local RGB onto
global RGB data in the near-field behind the 'inverted' human retina. The relative differences in local/global RGB interference-optical contrasts become available to photoreceptors (cones and rods) only after this optical pre-processing.
This paper describes a methodology for creative learning that applies to both humans and machines. Creative learning is a general approach used to solve optimal control problems. The creative controller for intelligent machines integrates a dynamic database and a task control center into the adaptive critic learning model. The task control center can function as a command center to decompose tasks into sub-tasks with different dynamic models and criteria functions, while the dynamic database can act as an information system. To illustrate the theory of creative control, several experimental simulations for robot arm manipulators and mobile wheeled vehicles were included. The simulation results showed that the adaptive critic controller achieved the best performance among all the controllers tested. By changing the paths of the robot arm manipulator in the simulation, it was demonstrated that the learning component of the creative controller adapted to a new set of criteria. The Bearcat Cub robot was another experimental example used for testing creative control learning.
The significance of this research is to generalize adaptive control theory in a direction toward the highest level of human learning - imagination. In doing so, it is hoped to better understand adaptive learning theory and to build more human-intelligence-like components and capabilities into the intelligent robot. It is also hoped that a greater understanding of machine learning will motivate similar studies to improve human learning.
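A minimal sketch of one adaptive-critic iteration on a generic discrete-time system, with linear function approximators standing in for the networks used in the creative controller; all names and step sizes are illustrative.

```python
# Hedged sketch: critic TD update plus a finite-difference actor step.
import numpy as np

def critic_update(w, phi_x, phi_x_next, utility, gamma=0.95, lr=0.05):
    """w: critic weights, phi_*: state features, utility: one-step cost."""
    td_error = utility + gamma * (w @ phi_x_next) - (w @ phi_x)
    return w + lr * td_error * phi_x, td_error

def actor_update(theta, x, w, dynamics, features, lr=0.01, eps=1e-3):
    """Nudge the (linear) policy theta toward lower critic-predicted cost."""
    def predicted_cost(a):
        return w @ features(dynamics(x, a))     # cost-to-go of the next state
    a = theta @ x
    grad = (predicted_cost(a + eps) - predicted_cost(a - eps)) / (2 * eps)
    return theta - lr * grad * x                # chain rule: d cost/d theta = grad * x
```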
In this paper a concept for industrial ubiquitous robotics is presented. The concept combines two different approaches to manage agile, adaptable production: firstly, the human operator is strongly in the production loop, and secondly, the robot workcell is made more autonomous and smarter to manage production. This kind of autonomous robot cell can be called a production island. Communication with the human operator working in this kind of smart industrial environment can be divided into two levels: body area communication and operator-infrastructure communication, including devices, machines and infrastructure. Body area communication can be supportive in two directions: data can be recorded by measuring physical actions such as hand movements and body gestures, or information such as guides or manuals for operation can be provided to the user. Body area communication can be carried out using short-range communication technologies such as NFC (Near Field Communication), which is an RFID type of communication. In operator-infrastructure communication, WLAN or Bluetooth communication can be used. Beyond current Human-Machine Interaction (HMI) systems, the presented system concept is designed to fulfill the requirements for hybrid, knowledge-intensive manufacturing in the future, where humans and robots operate in close co-operation.
In computer vision, many algorithms have been developed for image registration based on image pattern matching. However, there might be no universal method for all applications, because each method has its own advantages and disadvantages. Therefore, we have to select the method best suited for each task. A representative sub-pixel registration method uses one-dimensional parabola fitting over the similarity measurements at three positions. The parabola fitting method can be applied in two dimensions by assuming that horizontal and vertical displacements are independent. Although this method has been widely used because of its simplicity and practical usability, large errors are involved. To avoid these errors, which depend on the spatial structure of the image pattern, "two-dimensional simultaneous sub-pixel estimation" was proposed. However, it needs conditional branching control procedures such as scan field expansion and exception handling. These conditional branching procedures make the estimation unstable and slow down processing. Therefore, the authors employ paraboloid fitting: using the least squares method, a paraboloid is fitted to the image similarity values at nine points and the best matching point is obtained with sub-pixel accuracy. It is robust against the image pattern and enables speed-up, but it still has an error margin. The authors analyzed the error characteristics of the sub-pixel estimation using paraboloid fitting. The error can be characterized by "a bias; a systematic error" and "dispersion; a random error." It was found that the magnitude of each error differed according to the sub-pixel value of the best matching position. In this paper, based on this analysis, the authors propose a novel, accurate algorithm for 2D sub-pixel matching. The method needs neither iterative processes nor exception handling at runtime. Therefore, it is easy to implement the method in software and hardware. Experimental results demonstrated the advantage of the proposed algorithm.
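The paraboloid fit itself is compact; the following sketch fits z = ax^2 + by^2 + cxy + dx + ey + f to the nine similarity values around the best integer match by least squares and returns the closed-form extremum, with no iteration or exception handling. Values are illustrative.

```python
# Hedged sketch: 2D sub-pixel peak by least-squares paraboloid fitting.
import numpy as np

def subpixel_peak(sim3x3):
    """sim3x3: 3x3 array of similarity values centred on the best integer match."""
    ys, xs = np.mgrid[-1:2, -1:2]
    x, y, z = xs.ravel(), ys.ravel(), np.asarray(sim3x3, float).ravel()
    A = np.column_stack([x * x, y * y, x * y, x, y, np.ones(9)])
    a, b, c, d, e, _ = np.linalg.lstsq(A, z, rcond=None)[0]
    # extremum: solve  [2a  c][dx] = [-d]
    #                  [ c 2b][dy]   [-e]
    dx, dy = np.linalg.solve([[2 * a, c], [c, 2 * b]], [-d, -e])
    return dx, dy    # sub-pixel offset from the centre pixel

# A synthetic quadratic similarity surface peaking at (0.3, -0.2):
g = lambda dx, dy: 1.0 - 0.3 * (dx * dx + dy * dy)
sim = np.array([[g(x - 0.3, y + 0.2) for x in (-1, 0, 1)] for y in (-1, 0, 1)])
print(subpixel_peak(sim))   # recovers approximately (0.3, -0.2)
```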
Image matching is a common procedure in computer vision. Usually the size of the image template is fixed. If the matching is done repeatedly, as e.g. in stereo vision, object tracking, and strain measurements, it is beneficial, in terms of computational cost, to use as small templates as possible. On the other hand, larger templates usually give more reliable matches, unless e.g. projective distortions become too great. If the template size is controlled locally and dynamically, both computational efficiency and reliability can be achieved simultaneously. Adaptive template size requires, though, that a larger template can be sampled at any time.
This paper introduces a method to adaptively control the template size in a digital image correlation based strain measurement algorithm. The control inputs are measures of confidence of match. Some new measures are proposed in this paper, and the ones found in the literature are reviewed. The measures of confidence are tested and compared with each other as well as with a reference method using templates of fixed size. The comparison is done with respect to computational complexity and accuracy of the algorithm. Due to complex interactions of the free parameters of the algorithm, random search is used to find an optimal parameter combination to attain a more reliable comparison. The results show that with some confidence measures the dynamic scheme outperforms the static reference method. However, in order to benefit from the dynamic scheme, optimization of the parameters is needed.
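A minimal sketch of the dynamic control loop, assuming a single scalar confidence measure (e.g. the correlation peak value): the template grows when the match is unreliable and shrinks when it is very confident. Thresholds and step sizes are illustrative, not the optimized parameters from the paper.

```python
# Hedged sketch: confidence-driven template-size control for repeated matching.
def adapt_template_size(size, confidence,
                        low=0.6, high=0.9, step=8,
                        min_size=16, max_size=96):
    """Return the template size to use for the next correlation."""
    if confidence < low:            # unreliable match -> sample a larger template
        size = min(size + step, max_size)
    elif confidence > high:         # very confident -> save computation next time
        size = max(size - step, min_size)
    return size

# usage after each match: size = adapt_template_size(size, peak_correlation_value)
```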
The problem of recognizing gestures from images using computers can be approached by closely understanding
how the human brain tackles it. A full-fledged gesture recognition system could substitute for the mouse and keyboard
completely. Humans can recognize most gestures by looking at the characteristic external shape or the silhouette of the
fingers. Many previous techniques for recognizing gestures dealt with motion and geometric features of the hands. In this work, gestures are recognized by the Codon-list pattern extracted from the object contour. All edges of an image are described
in terms of sequence of Codons. The Codons are defined in terms of the relationship between maxima, minima and
zeros of curvature encountered as one traverses the boundary of the object. We have concentrated on a catalog of 24
gesture images from the American Sign Language alphabet (letters J and Z are ignored as they are represented using
motion) [2]. The query image given as an input to the system is analyzed and tested against the Codon-lists, which are
shape descriptors for the external parts of a hand gesture. We have used the Weighted Frequency Indexing Transform (WFIT) approach, used in DNA sequence matching, to match the Codon-lists. The matching algorithm
consists of two steps: 1) the query sequences are converted to short sequences and are assigned weights and, 2) all the
sequences of query gestures are pruned into match and mismatch subsequences by the frequency indexing tree based on
the weights of the subsequences. The Codon sequences with the most weight are used to determine the most precise
match. Once a match is found, the identified gesture and corresponding interpretation are shown as output.
We have studied the use of cellular automata and cellular genetic algorithms for the object recognition, pose recognition, and image classification problems. The cellular genetic algorithm is a genetic algorithm that has some similarities with cellular automata. The preliminary results seem to support the hypothesis that, in principle, this kind of object and pose recognition and image classification method works relatively well. The problem with the proposed method is the large amount of computation needed when testing an unknown object against the objects in the comparison set.
This paper studies the applicability of genetic algorithms and imaging to measuring deformations. Genetic algorithms are used to search for the strain field parameters of images from a uniaxial tensile test. The non-deformed image is artificially deformed according to the estimated strain field parameters, and the resulting image is compared with the true deformed image. The mean difference of intensities is used as the fitness function. Results are compared with a node-based strain measurement algorithm developed by Koljonen et al. The reference method slightly outperforms the genetic algorithm in terms of mean difference of intensities. The root-mean-square difference of the displacement fields is less than one pixel. However, with some improvements suggested in this paper, the genetic algorithm based method may be worth considering in other similar applications as well: surface matching instead of individual landmarks can be used in camera calibration and image registration. Searching for deformation parameters with genetic algorithms could be applied to pattern recognition tasks, e.g., in robotics, object tracking and remote sensing, if the objects are subject to deformation. In addition, other transformation parameters could be searched for simultaneously.
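A hedged sketch of the fitness evaluation, simplified to a uniform affine strain warped with OpenCV; the paper's strain-field parameterization is richer, and the GA itself (selection, crossover, mutation over the parameters) is not shown.

```python
# Hedged sketch: fitness = mean intensity difference after warping the
# reference image with a candidate (here uniform) strain.
import cv2
import numpy as np

def fitness(params, ref_img, deformed_img):
    """params = (exx, eyy, exy): candidate uniform strain components."""
    exx, eyy, exy = params
    h, w = ref_img.shape
    M = np.array([[1 + exx, exy, 0.0],
                  [exy, 1 + eyy, 0.0]], dtype=np.float32)   # affine strain map
    warped = cv2.warpAffine(ref_img, M, (w, h), flags=cv2.INTER_LINEAR)
    return float(np.mean(np.abs(warped.astype(np.float32)
                                - deformed_img.astype(np.float32))))

# A GA would evolve `params`, keeping candidates with low fitness values.
```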
Computer vision has been an active field of research for many decades; it has also become widely used for airborne
applications in the last decade or two. Much airborne computer vision research has focused on navigation for Unmanned
Air Vehicles; this paper presents a method to estimate the full 3D position information of a UAV by integrating visual
cues from one single image with data from an Inertial Measurement Unit under the Kalman Filter formulation. Previous
work on visual 3D position estimation for UAV landing has been achieved by using two or more frames of image data with feature-enriched information in the image; however, raw vision state estimates are highly susceptible to image noise. This
paper uses a rather conventional type of landing pad with visual features extracted for use in the Kalman filter to obtain
optimal 3D position estimates. This methodology promises to provide state estimates that are better suited for guidance
and control of a UAV. It also promises to enable autonomous landing of UAVs to be conducted without GPS information. The results of this implementation, tested with flight images, are presented.
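A hedged sketch of the fusion idea: a constant-velocity Kalman filter is propagated with IMU acceleration and corrected with the 3D position inferred from the landing-pad features in a single image. Matrices and noise levels are illustrative, not the paper's.

```python
# Hedged sketch: Kalman filter with IMU-driven prediction and vision position update.
import numpy as np

dt = 0.02
F = np.block([[np.eye(3), dt * np.eye(3)],
              [np.zeros((3, 3)), np.eye(3)]])          # state: [position, velocity]
B = np.vstack([0.5 * dt**2 * np.eye(3), dt * np.eye(3)])
H = np.hstack([np.eye(3), np.zeros((3, 3))])           # vision measures position only
Q = 1e-3 * np.eye(6)                                   # process noise (illustrative)
R = 0.05 * np.eye(3)                                   # vision noise (illustrative)

def predict(x, P, accel_imu):
    x = F @ x + B @ accel_imu
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, pos_vision):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (pos_vision - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P
```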
Unlike the navigation problem of Earth operations, the precise navigation of a vehicle in a remote planetary environment
presents a challenging problem for either absolute or relative navigation. There exist no GPS/INS solutions due to a lack
of a GPS constellation, few or no accurately surveyed markers for use in terminal sensing measurements, and highly
uncertain terrain elevation maps used by a TERCOM system. These, and other, issues prompted the investigation of the
potential use of a visual navigation aid to supplement an Inertial Navigation System (INS) and radar altimeter suite of a
planetary airplane, for the purpose of identifying the potential benefit of visual measurements to the overall
navigation solution.
The mission objective used in the study, described herein, requires the precise relative navigation of the airplane over an
uncertain terrain. Unlike the previously successful employment of vision-aided navigation on the MER landing vehicle, the mission objectives require that the airplane traverse a precise flight pattern over the objective terrain at relatively low altitudes for hundreds of kilometers; the problem is thus more akin to a velocity correlator application than a terminal fix problem.
The results of the investigation indicate that a good knowledge of aircraft altitude is required in order to obtain the
desired performance for velocity estimate accuracy. However, it was determined that the direction of the velocity vector
can be obtained without a high accuracy height estimate. The characterization of the dependency of velocity estimate
accuracy upon the variety of factors involved in the process is the primary focus of this report.
This report describes the approach taken in this investigation to both define the architecture of the solution for minimal
impact upon payload requirements, and the analysis of the potential gains to the overall navigation problem. Also
described as part of the problem definition are the initially assumed contribution sources of visual measurement errors
and some additional constraints which limit the choices of solutions.
This paper presents a modified virtual-force-based obstacle avoidance approach suited for a laser range finder. The modified method takes advantage of the polar-coordinate data sent by the laser sensor by mapping the environment in a polar coordinate system. The method also utilizes Gaussian-function-based certainty values to detect obstacles. The method successfully navigates through complex obstacle fields and reaches target GPS waypoints.
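A minimal sketch of the modified virtual-force idea under stated assumptions: each laser return, already in polar coordinates, contributes a repulsive force weighted by a Gaussian certainty that decays with range, and the GPS waypoint contributes an attraction; gains and the Gaussian width are illustrative.

```python
# Hedged sketch: virtual-force steering from a polar laser scan plus a waypoint.
import numpy as np

def steering_force(ranges, angles, goal_bearing, goal_dist,
                   sigma=1.5, k_rep=1.0, k_att=0.5):
    """ranges/angles: laser scan in the robot frame; returns a 2D force vector."""
    certainty = np.exp(-(ranges ** 2) / (2 * sigma ** 2))   # near obstacles matter most
    fx = -np.sum(k_rep * certainty * np.cos(angles))        # repulsion, away from returns
    fy = -np.sum(k_rep * certainty * np.sin(angles))
    fx += k_att * goal_dist * np.cos(goal_bearing)          # attraction toward waypoint
    fy += k_att * goal_dist * np.sin(goal_bearing)
    return np.array([fx, fy])   # drive along this vector's heading
```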
This paper presents the vision processing solution used for lane detection by the Insight Racing team for the DARPA Grand Challenge 2007. The problem involves detecting the lane markings for maintaining the position of the autonomous vehicle within the lane, at a usable frame rate. This paper describes a method based on color interpretation and scanning-based edge detection for quick and reliable results. First, the color information is extracted from the image using an RGB-to-HSV transform and mapped to the Munsell color system. Next, the
regions of useful color are scanned adaptively to do an equivalent of single pixel edge detection in one stage.
These edges are then processed using Hough Transform to yield lines, which are then segmented, grouped and
approximated to reduce the number of lines representing straight and curved lane markings. The final lines
are then numbered and sent to the master controller for each frame. This allows the master controller to pick
the bounding lane markings and center the vehicle accordingly and navigate autonomously. OpenGL is used to
display the results. The solution has been tested and is being used by the Insight Racing team for their entry in the DARPA Grand Challenge 2007.
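A hedged sketch of the edge-to-line stage with OpenCV's probabilistic Hough transform; the Munsell-based color interpretation and the adaptive single-pixel edge scan from the paper are replaced here by a plain HSV mask and Canny edges.

```python
# Hedged sketch: HSV masking, edge detection and Hough line extraction.
import cv2
import numpy as np

def detect_lane_lines(bgr_frame):
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    # keep bright, low-saturation pixels (typical of painted lane markings)
    mask = cv2.inRange(hsv, (0, 0, 180), (180, 60, 255))
    edges = cv2.Canny(mask, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=40,
                            minLineLength=40, maxLineGap=10)
    return [] if lines is None else lines.reshape(-1, 4)   # (x1, y1, x2, y2) per line
```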
Stereo vision is attractive for autonomous mobile robot navigation, but the cost and complexity of stereo camera systems and the computational requirements make full stereo vision impractical. A novel optical system allows the capture of a pair of short, wide stereo images from a single camera, which are then processed to detect vertical edges and infer obstacle positions and locations within the planar field of view, providing real-time obstacle detection. Our optical system involves a pair of right-angle prisms stacked vertically, splitting the camera field of view vertically in half. Right angle mirrors on either side redirect the image forward but at a horizontally displaced location, creating two virtual cameras. Tilting these mirrors provides an overlapping image area. Alternately, tilting the prisms produces the same effect. This image area is wide but not very tall. However, in a mobile robot scenario the majority of obstacles of interest intersect this field of view.
Omni-directional vision navigation for AGVs is of definite significance owing to its advantage of panoramic sight within a single compact visual scene. This unique guidance technique involves target recognition, vision tracking, object positioning, and path programming. An algorithm for omni-vision based global localization which utilizes two overhead features as a beacon pattern is proposed in this paper. An approach for geometric restoration of omni-vision images has to be considered since an inherent distortion exists. The mapping between image coordinates and physical space parameters of the targets can be obtained by means of the imaging principle of the fisheye lens. The localization of the robot can then be achieved by geometric computation.
Dynamic localization employs a beacon tracker to follow the landmarks in real time during the arbitrary movement of the vehicle. The coordinate transformation is devised for path programming based on time-sequence image analysis. Beacon recognition and tracking are key procedures for an omni-vision guided mobile unit. Conventional image processing techniques such as shape decomposition, description and matching are not directly applicable in omni-vision. The particle filter (PF) has been shown to be successful for several nonlinear estimation problems. A beacon tracker based on a particle filter, which offers a probabilistic framework for dynamic state estimation in visual tracking, has been developed. We independently use two particle filters to track the two landmarks, and a composite algorithm for multiple-object tracking is conducted for vehicle localization. We have implemented the tracking and localization system and demonstrated the relevance of the algorithm.
This paper studies the testing of imaging systems and algorithms with genetic algorithms. We test whether there are inherent natural weaknesses in an image processing algorithm or system and whether they can be searched for and found with evolutionary algorithms. In this paper, we test the weaknesses of error diffusion halftoning methods. We also take a closer look at the method and identify why these weaknesses appear and why they are relatively easy to identify with synthetic test images. Moreover, we discuss the importance of comprehensive testing before the results of some image processing methods can be trusted. The results seem to suggest that the error diffusion methods do not have as apparent inherent problems as, e.g., the dispersed dot method, but the GA testing does reveal some other problems, like delayed response to image tone changes. The different error diffusion methods have similar problems, but with different intensity.
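For reference, a plain Floyd-Steinberg error diffusion implementation, the kind of halftoning method whose weaknesses (such as delayed response to tone changes) the GA-generated synthetic test images are meant to expose.

```python
# Hedged reference implementation: Floyd-Steinberg error diffusion halftoning.
import numpy as np

def floyd_steinberg(gray):
    """gray: 2D float array in [0, 1]; returns a binary halftone."""
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        for x in range(w):
            out[y, x] = 1.0 if img[y, x] >= 0.5 else 0.0
            err = img[y, x] - out[y, x]
            if x + 1 < w:               img[y, x + 1]     += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x]     += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return out

# A GA-based test would evolve small input patches that maximise some error
# metric between `gray` and a low-pass filtered version of floyd_steinberg(gray).
```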
A growing number of modern applications such as position determination, online object recognition and collision
prevention depend on accurate scene analysis. A low-cost and fast alternative to standard techniques like laser scanners or stereo vision is distance measurement with modulated, incoherent infrared light based on the Photo Mixing Device (PMD) technique. This paper describes an enhanced calibration approach for PMD-based distance sensors, for which highly accurate calibration techniques have not been widely investigated yet. Compared to other known methods, our approach incorporates additional deviation errors related to the variation of the active illumination incident on the sensor pixels. The resulting calibration yields significantly more precise distance information. Furthermore, we present a simple-to-use, vision-based approach for the acquisition of the reference data required by any distance calibration scheme, yielding a lightweight, on-site calibration system with little expenditure in terms of equipment.
Kernel techniques have been used in support vector machines (SVMs), feature spaces, etc. In kernel methods, the well-known kernel trick is used to implicitly map the input data to a higher-dimensional feature space. If all terms can be written as a kernel function, one can then use data in the higher-dimensional space without actually computing the higher-dimensional features or knowing the mapping function Φ. In this paper, we address kernel distortion-invariant filters (DIFs). Standard DIFs are synthesized in a linear feature space (in the image or Fourier domain). They are fast since they use FFT-based correlations. If the data is mapped to a higher-dimensional feature space before filter synthesis and before performing correlations, kernel filters result and performance can be improved. Kernel versions of several DIFs (OTF, SDF, and MACE) have been presented in prior work. However, several key issues were ignored in all prior work. These include: the unrealistic assumption of centered data in tests; the significantly larger storage and on-line computation time required; and the proper type of energy minimization in filter synthesis needed to reduce false peaks when the filters are applied to target scenes, which has yet to be addressed. In addition, prior kernel DIF work used
test set data to select the value of the kernel parameter. In this paper, we analyze these issues, present supporting test
results on two face databases, and present several improvements to prior kernel DIF work.
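A hedged sketch of a kernel SDF-type filter: the linear SDF solution h = X(X^T X)^{-1} u becomes, via the kernel trick, a response r(y) = u^T K^{-1} k(X, y) computed entirely from Gaussian kernel values, without ever forming the mapping Φ explicitly. The centering and energy-minimization issues raised above are not addressed here.

```python
# Hedged sketch: kernel SDF-type filter synthesis and response via the kernel trick.
import numpy as np

def gaussian_kernel(a, b, sigma=5.0):
    """a: (n, d), b: (m, d) -> (n, m) Gaussian kernel matrix."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def train_kernel_sdf(X, u, sigma=5.0, reg=1e-6):
    """X: (n, d) training images as flattened rows, u: (n,) desired filter outputs."""
    K = gaussian_kernel(X, X, sigma) + reg * np.eye(len(X))
    return np.linalg.solve(K, u)          # coefficients alpha = K^{-1} u

def kernel_sdf_response(alpha, X, y, sigma=5.0):
    """Response of the kernel filter to a test image y (flattened, same dimension)."""
    return float(alpha @ gaussian_kernel(X, y[None, :], sigma).ravel())
```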
A new method for classifying fruit shape using a shape descriptor and a support vector machine is developed. The image is first subjected to a normalization process using its regular moments to obtain scale and translation invariance. The rotation-invariant Zernike features are then extracted from the scale- and translation-normalized images, and the number of features is decided by principal component analysis (PCA). Finally, these features are input to a support vector machine (SVM) classifier, which is compared to different classifiers. Experiments verify that this method, using a support vector machine as the classifier, performs better than traditional approaches.
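A hedged sketch of the pipeline using common stand-ins (mahotas Zernike moments, scikit-learn PCA and SVC); radius, polynomial degree and component count are illustrative, and the moment-based scale/translation normalization is assumed to have already been applied to the input silhouettes.

```python
# Hedged sketch: Zernike features -> PCA -> SVM classification of fruit shapes.
import numpy as np
import mahotas
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def zernike_features(binary_img, radius=60, degree=8):
    """binary_img: centred, scale-normalised silhouette of the fruit."""
    return mahotas.features.zernike_moments(binary_img, radius, degree=degree)

def train(images, labels, n_components=10):
    X = np.array([zernike_features(im) for im in images])
    model = make_pipeline(PCA(n_components=n_components), SVC(kernel="rbf"))
    return model.fit(X, labels)   # model.predict(...) classifies new silhouettes
```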
Object segmentation and extraction play an important role in computer vision and recognition problems. Unfortunately, with current computing technologies, fully automatic object segmentation is not possible; human intervention is needed to outline the rough boundary of the object to be segmented. The goal of this paper is to make the object extraction automatic after the first semi-automatic segmentation. That is, once a semantically meaningful object such as a house or a human body is extracted from the image under human guidance, an image manipulation technique is applied. There is no noticeable difference between the original and the manipulated images. However, the signature embedded by the image manipulation can be detected automatically and used to differentiate the object from the background. The manipulated images, which are called automatic-object-extractible images, can be used to provide training images containing the same object against various backgrounds.