It is well known from multi-view geometry that, given a single point in one image, its corresponding point in a second image can be determined only up to a one-dimensional ambiguity, and that, given a pair of corresponding points in two images, their corresponding point in a third image is uniquely determined. These relationships have been widely used in the computer vision community for applications such as correspondence matching, stereo and motion analysis. Real images, however, are noisy, and how to apply the exact mathematical relationships of multi-view geometry to noisy data, and which numerical algorithms do so stably and accurately, remains an active topic of research. In this paper, some major methods currently available for computing two- and three-view geometry for both calibrated and uncalibrated cameras are analysed, a novel method of calculating the trifocal tensor for calibrated cameras is deduced, and a quantitative evaluation of the influence of noise at different levels on the different methods of computing two- and three-view geometry is carried out through experiments on synthetic data. Based on the experimental results, several novel algorithms are introduced which improve the performance of searching for correspondences across two or three views in real images.
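The two relations referred to above can be written compactly. The following is a minimal NumPy sketch, assuming a fundamental matrix F between views 1 and 2 and a trifocal tensor T indexed as T[i, j, k] = T_i^{jk}; the function names and conventions are illustrative, not the paper's notation.

```python
import numpy as np

def epipolar_line(F, x1):
    """Given the fundamental matrix F (3x3) and a homogeneous point x1 in
    image 1, return the epipolar line l2 = F @ x1 in image 2 on which the
    corresponding point must lie (the one-dimensional ambiguity)."""
    return F @ x1

def transfer_point(T, x1, l2):
    """Point transfer with a trifocal tensor T (3x3x3): given a point x1 in
    view 1 and a line l2 passing through its match in view 2 (e.g. the line
    through x2 perpendicular to the epipolar line), the corresponding point
    in view 3 is x3^k = x1^i * l2_j * T[i, j, k] (summation over i, j)."""
    x3 = np.einsum('i,j,ijk->k', x1, l2, T)
    return x3 / x3[2]  # normalise the homogeneous coordinates
```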
Image morphing has proved to be a powerful tool for generating compelling and pleasing visual effects and has been widely used in the entertainment industry. However, traditional image morphing methods suffer from a number of drawbacks: feature specification between images is tedious, and the reliance on 2D information ignores the possible advantages to be gained from 3D knowledge. In this paper, we exploit recent advances in computer vision to diminish these drawbacks. Drawing on multi-view geometry, we propose a processing pipeline based on three reference images. We first seek a few seed correspondences using robust methods and then recover the two- and three-view geometries from these seeds through bundle adjustment. Guided by the recovered geometries, a novel line matching algorithm across the three views is derived, combining edge growing, line fitting, and two- and three-view geometric constraints. Corresponding lines in a novel image are then obtained by an image transfer method; finally, the matched lines are fed into traditional morphing methods to generate novel images. Images generated by this pipeline have advantages over those from traditional morphing: they have an inherent 3D foundation and are therefore physically close to the real scene; not only images whose viewpoints lie on the baseline connecting the two reference camera centers, but also extrapolated images away from the baseline, can be generated; and the whole process can be either fully automatic, or at least the tedious feature specification of traditional morphing methods can be greatly relieved.
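As an illustration of the kind of geometric constraint such a pipeline can exploit (not the paper's actual implementation), a matched line can be transferred into the first view from its correspondences in the other two views using the trifocal tensor. A minimal sketch, assuming the same T[i, j, k] = T_i^{jk} indexing as above:

```python
import numpy as np

def transfer_line(T, l2, l3):
    """Line transfer with a trifocal tensor T (3x3x3): given a line l2 in
    view 2 and the corresponding line l3 in view 3 (homogeneous 3-vectors),
    the corresponding line in view 1 is l1_i = l2_j * l3_k * T[i, j, k]."""
    l1 = np.einsum('j,k,ijk->i', l2, l3, T)
    return l1 / np.linalg.norm(l1)  # scale is arbitrary for a homogeneous line
```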
Retinal blurring resulting from the human eye's depth of focus has been shown to assist visual perception. Infinite focal depth within stereoscopically displayed virtual environments may cause undesirable effects; for instance, objects positioned well in front of or behind the observer's fixation point are perceived in sharp focus but with large disparities, thereby causing diplopia. Although published research on incorporating synthetically generated depth of field (DoF) suggests that it might enhance perceived image quality, no quantitative evidence of perceptual performance gains exists. This may be due to the difficulty of generating synthetic DoF dynamically, with the focal distance actively linked to the fixation distance. In this paper, such a system is described. A desktop stereographic display is used to project a virtual scene in which synthetically generated DoF is actively controlled from vergence-derived distance. A performance evaluation experiment was undertaken in which subjects carried out observations in a spatially complex virtual environment consisting of components interconnected by pipes on a distracting background; each subject was tasked with making an observation based on the connectivity of the components. The effects of focal depth variation under static and actively controlled focal distance conditions were investigated. The results and analysis presented show that performance gains may be achieved by the addition of synthetic DoF, and the merits of applying synthetic DoF are discussed.
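To give a sense of how fixation distance can be derived from vergence and used to drive synthetic blur, here is a minimal sketch under simple assumptions (symmetric convergence, a thin-lens blur model, illustrative eye parameters); it is not the described system's actual implementation.

```python
import math

def fixation_distance(vergence_deg, ipd_m=0.065):
    """Estimate fixation distance from the total vergence angle of the eyes,
    assuming symmetric convergence: d = (IPD / 2) / tan(vergence / 2)."""
    return (ipd_m / 2.0) / math.tan(math.radians(vergence_deg) / 2.0)

def blur_circle(object_dist, focal_dist, focal_length=0.017, aperture=0.004):
    """Thin-lens blur-circle diameter for a point at object_dist when the eye
    is focused at focal_dist; this drives the synthetic depth-of-field blur
    applied to scene points away from the fixation distance."""
    return abs(aperture * focal_length * (object_dist - focal_dist) /
               (object_dist * (focal_dist - focal_length)))
```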
A visual telepresence system has been developed at the University of Reading which utilizes eye tracking to adjust the horizontal orientation of the cameras and display system according to the convergence state of the operator's eyes. Slaving the cameras to the operator's direction of gaze enables the object of interest to be centered on the displays. The advantage of this is that the camera field of view may be decreased to maximize the achievable depth resolution. An active camera system requires an active display system if appropriate binocular cues are to be preserved. For applications which critically depend upon veridical perception of an object's location and dimensions, it is imperative that the contribution of binocular cues to these judgements be ascertained, because they are directly influenced by camera and display geometry. Using the active telepresence system, we investigated the contribution of ocular convergence information to judgements of size, distance and shape. Participants performed an open-loop reach and grasp of a virtual object under reduced cue conditions in which the orientations of the cameras and the displays were either matched or unmatched. Inappropriate convergence information produced weak perceptual distortions and caused problems in fusing the images.
Visual telepresence systems which utilize virtual-reality-style helmet-mounted displays have a number of limitations. The geometry of the camera positions and of the display is fixed, and is most suitable only for viewing elements of a scene at a particular distance. In such a system, the operator's ability to gaze around without head movement is severely limited, and a trade-off must be made between poor viewing resolution and a narrow field of view. To address these limitations, a prototype system has been developed in which the geometry of the displays and cameras is dynamically controlled by the eye movements of the operator. This paper explores why it is necessary to actively adjust both the display system and the cameras, and justifies the use of mechanical adjustment of the displays as an alternative to adjustment by electronic or image processing methods. The electronic and mechanical design is described, including the optical arrangements and control algorithms. An assessment of the system's performance against a fixed camera/display system is presented for operators assigned basic tasks involving depth and distance/size perception, and the sensitivity to variations in the transient performance of the display and camera vergence is also assessed.
Intelligent viewing systems are required if efficient and productive teleoperation is to be applied to dynamic manufacturing environments. These systems must automatically provide the operator with remote views which assist in completing the task. This assistance increases the productivity of the teleoperation task if the robot controller is responsive to the unpredictable dynamic evolution of the workcell. Behavioral controllers can be utilized to give reactive 'intelligence'; however, the inherently complex structure of current systems places considerable time overheads on any redesign of the emergent behavior. In industry, where the remote environment and task frequently change, this continual redesign process becomes inefficient. We introduce a novel behavioral controller, based on an 'ego-behavior' architecture, to command an active camera (a camera mounted on a robot) within a remote workcell. Using this ego-behavioral architecture, the responses from individual behaviors are rapidly combined to produce an 'intelligent' responsive viewing system. The architecture is single-layered, each behavior being autonomous with no explicit knowledge of the number, description or activity of any other behaviors present. This lack of imposed structure decreases development time, as it allows each behavior to be designed and tested independently before insertion into the architecture. The fusion mechanism allows each behavior to compete and/or co-operate with the others for full or partial control of the active camera, and each behavior continually reassesses its degree of competition or co-operation by measuring its own success in controlling the camera against pre-defined constraints. The ego-behavioral architecture is demonstrated through simulation and experimentation.
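A minimal sketch of what such self-assessed fusion might look like, with a hypothetical Behavior interface and a simple weighted combination of camera demands (the actual ego-behavior architecture is more elaborate than this):

```python
import numpy as np

class Behavior:
    """One autonomous behavior: proposes a camera velocity demand and scores
    its own recent success at controlling the camera in [0, 1].  This
    interface is hypothetical, not the authors' implementation."""
    def propose(self, workcell_state):
        raise NotImplementedError
    def self_assessment(self):
        raise NotImplementedError

def fuse_demands(behaviors, workcell_state):
    """Combine the demands of all behaviors, weighting each by its
    self-assessed success, so behaviors effectively compete or co-operate
    for full or partial control of the active camera."""
    demands = np.array([b.propose(workcell_state) for b in behaviors])
    weights = np.array([b.self_assessment() for b in behaviors])
    if weights.sum() == 0.0:
        return np.zeros(demands.shape[1])   # no behavior claims control
    return (weights[:, None] * demands).sum(axis=0) / weights.sum()
```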
KEYWORDS: Data modeling, Neural networks, Modeling, Reliability, Control systems, Algorithm development, Process control, Statistical modeling, Failure analysis, Systems modeling
This paper presents a comparison of three differing methods applied to the analysis of control data from a high-speed machinery application. The source data, and the pre-processing applied to improve the suitability of the data for the analysis techniques, are discussed. The methods compared are cluster analysis, multi-layer perceptron neural networks and self-organizing feature maps. The aim of the work is to determine the merits of these techniques in separating normal running operation from faulty operation. The methodology used with each technique is explained, and results are computed so as to give the fairest comparison of their respective abilities. Additionally, the ways in which such techniques could be integrated into a final system for the analysis, diagnosis and control of a high-speed machine, to give improved reliability, are discussed.
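As an illustration of two of the three techniques (cluster analysis and a multi-layer perceptron) applied to pre-processed feature vectors, a minimal scikit-learn sketch is given below; the feature extraction, the self-organizing map stage and all parameter values are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

def unsupervised_separation(X, n_clusters=2):
    """Cluster pre-processed control-signal features and inspect whether
    normal and faulty machine cycles fall into different clusters."""
    X = StandardScaler().fit_transform(X)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)

def supervised_separation(X_train, y_train, X_test):
    """Train a multi-layer perceptron on labelled normal/fault cycles and
    predict the condition of unseen cycles."""
    scaler = StandardScaler().fit(X_train)
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)
    clf.fit(scaler.transform(X_train), y_train)
    return clf.predict(scaler.transform(X_test))
```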
Dynamic multi-user interaction in a single networked virtual environment suffers from abrupt state transitions caused by communication delays arising from network latency: an action by one user only becomes apparent to another user after the communication delay. This results in a temporal suspension of the environment for the duration of the delay (the virtual world 'hangs'), followed by an abrupt jump to make up for the lost time so that the current state of the virtual world is displayed. These discontinuities appear unnatural and disconcerting to users. This paper proposes a novel method of warping the times associated with users to ensure that each user views a continuous version of the virtual world, with no hangs or jumps despite the interactions of other users. Objects passed between users within the environment are parameterized not by real time but by a virtual local time, generated by continuously warping real time; this virtual time periodically realigns itself with real time as the virtual environment evolves. The concept of a local user dynamically warping their own local time is also introduced. As a result, users are shielded from viewing discontinuities within their virtual worlds, enhancing the realism of the virtual environment.
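A minimal sketch of the local-time-warping idea, assuming a fixed recovery period and a simple linear warp; the paper's actual warping and realignment scheme may differ.

```python
class VirtualClock:
    """Maps real time to a local virtual time.  When a remote update arrives
    `delay` seconds late, the local clock falls behind rather than letting the
    world hang and then jump, and then runs slightly fast until it realigns
    with real time.  Illustrative sketch only."""
    def __init__(self, start_time):
        self.last_real = start_time
        self.offset = 0.0   # virtual time minus real time (always <= 0)
        self.rate = 1.0     # warp rate (1.0 means tracking real time)

    def on_delayed_update(self, delay, recovery_period):
        self.offset -= delay                        # fall behind by the delay
        self.rate = 1.0 + delay / recovery_period   # catch up over this period

    def virtual_time(self, real_time):
        dt = real_time - self.last_real
        self.last_real = real_time
        self.offset = min(0.0, self.offset + (self.rate - 1.0) * dt)
        if self.offset == 0.0:
            self.rate = 1.0                         # realigned with real time
        return real_time + self.offset
```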
In this paper we describe how to cope with the delays inherent in a real-time control system for a steerable stereo head/eye platform. A purposive and reactive system requires fast vision algorithms to provide the controller with the error signals that drive the platform. A time-critical implementation of these algorithms is necessary, not only to enable short-latency reaction to real-world events, but also to provide results at a sufficiently high frequency and with small enough delays that the controller remains stable. However, even with precise knowledge of the delay, nonlinearities in the plant make accurate modelling of the plant impossible, thus precluding the use of a Smith Regulator. Moreover, the major delay in the system lies in the feedback path (image capture and vision processing) rather than in the feedforward (controller) path. Delays of between 40 ms and 80 ms are common for simple 2D processes, but may extend to several hundred milliseconds for more sophisticated 3D processes. The strategy presented gives precise control over the gaze direction of the cameras despite the lack of a priori knowledge of the delays involved. The resulting controller is shown to have a structure similar to the Smith Regulator, but with essential modifications.
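To illustrate the general idea of compensating for a feedback delay in a gaze loop, the following is a minimal sketch in the spirit of a Smith-predictor-like structure: recent motor demands are buffered and the motion commanded since the delayed image was captured is subtracted from the measured visual error. All parameters and the simple integrator plant assumption are illustrative; this is not the controller derived in the paper.

```python
from collections import deque

class DelayCompensatedGazeController:
    """Sketch of delay compensation for a camera gaze loop whose visual error
    arrives with a (possibly varying) feedback delay."""
    def __init__(self, gain=0.5, max_delay_steps=20):
        self.gain = gain
        self.demands = deque(maxlen=max_delay_steps)  # recent velocity demands

    def step(self, visual_error, delay_steps):
        # Motion commanded since the delayed image was captured, i.e. motion
        # that the delayed measurement has not yet "seen".
        recent = list(self.demands)[-delay_steps:] if delay_steps > 0 else []
        predicted_error = visual_error - sum(recent)
        demand = self.gain * predicted_error
        self.demands.append(demand)
        return demand
```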