KEYWORDS: Diseases and disorders, Image segmentation, Object detection, Systems modeling, Data modeling, Color, Education and training, Image classification, Correlation coefficients, Deep learning
This paper addresses the detection of leaf diseases using deep learning networks that have learned the color and shape parameters of leaf diseases. It considers the color distribution and shape information of leaf diseases, and exploits two deep learning networks trained on normal and diseased leaves. The input color image is partitioned into small segments using color clustering, and the color information of each segment is inspected by the Color Network. When a segment is determined to be abnormal (that is, a disease segment), the shape parameters of the segment are inspected by the Shape Network to classify the disease type. The method uses the HSV color space for the Color Network and proposes 24 parameters for the Shape Network, such as the boundary length ratio, densities of subregions, and correlation coefficients of the x-y coordinates within disease segments. In experiments with three types of diseases (types A, B, and C) on images of iceberg, strawberry, coffee, sunflower, chinar, blackgram, citrus, and apple leaves, leaf diseases are detected with 97.9% recall per segment and 99.3% recall per input image where there are more than two disease segments.
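As a rough illustration of the first stage (not the paper's implementation), the sketch below partitions a leaf image by k-means color clustering in HSV space and extracts per-segment HSV statistics of the kind a color classifier could consume; the function names and cluster count are assumptions.

```python
# Minimal sketch: color-clustering segmentation + per-segment HSV features.
import cv2
import numpy as np
from sklearn.cluster import KMeans

N_CLUSTERS = 8  # assumed number of color clusters, not from the paper

def segment_by_color(bgr_image):
    """Partition the image into segments via k-means clustering in HSV."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    pixels = hsv.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=N_CLUSTERS, n_init=4).fit_predict(pixels)
    return labels.reshape(hsv.shape[:2]), hsv

def hsv_features(hsv, mask):
    """Mean/std of H, S, V inside one segment: input for a color classifier."""
    seg = hsv[mask]
    return np.concatenate([seg.mean(axis=0), seg.std(axis=0)])
```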
Computer-generated holography (CGH), the process of generating digital holograms, is computationally expensive. Recently, several methods and systems that parallelize the process using graphics processing units (GPUs) have been proposed; indeed, the use of multiple GPUs or a personal computer (PC) cluster (each PC equipped with GPUs) has greatly improved the processing speed. However, the extant literature has rarely explored systems for rapidly generating multiple digital holograms, or systems specialized for rapidly generating a digital video hologram. This study proposes a system that uses a PC cluster to generate a video hologram more efficiently. The proposed system is designed to generate multiple frames simultaneously and accelerates generation by parallelizing the CGH computations across frames, as opposed to generating each frame separately while parallelizing the CGH computations within each frame. The proposed system also enables the subprocesses for generating each frame to execute in parallel through multithreading. With these two schemes, the proposed system significantly reduces the data communication time for generating a digital hologram compared with the state-of-the-art system.
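The frame-level parallelism described above can be sketched in a few lines: whole frames are mapped onto worker processes rather than splitting one frame's computation. The per-frame kernel below is a generic point-source CGH superposition with assumed constants, not the paper's cluster implementation.

```python
# Sketch of frame-level parallelism: distribute whole frames to workers
# instead of parallelizing the CGH computation inside a single frame.
from multiprocessing import Pool

import numpy as np

PITCH = 8e-6      # assumed SLM pixel pitch [m]
WAVELEN = 532e-9  # assumed wavelength [m]
H, W = 512, 512   # assumed hologram resolution

def compute_cgh_frame(points):
    """Generic point-source CGH for one frame: sum spherical-wave phases."""
    y, x = np.mgrid[0:H, 0:W].astype(np.float64)
    field = np.zeros((H, W))
    for px, py, pz, amp in points:  # object point (meters) and amplitude
        r = np.sqrt((x * PITCH - px) ** 2 + (y * PITCH - py) ** 2 + pz ** 2)
        field += amp * np.cos(2 * np.pi * r / WAVELEN)
    return field

def generate_video_hologram(frames, workers=4):
    """Each element of `frames` is one frame's object-point list."""
    with Pool(workers) as pool:
        return pool.map(compute_cgh_frame, frames)
```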
It is difficult to visually track a user’s hand because of the many degrees of freedom (DOF) a hand has. For this reason, most model-based hand pose tracking methods have relied on multiview images or RGB-D images. This paper proposes a model-based method that accurately tracks three-dimensional hand poses in real time using monocular RGB images. The main idea of the proposed method is to reduce hand tracking ambiguity by adopting a step-by-step estimation scheme consisting of three steps performed in consecutive order: palm pose estimation, finger yaw motion estimation, and finger pitch motion estimation. In addition, this paper proposes highly effective algorithms for each step. Under the assumption that a human hand can be considered an assemblage of articulated planes, the proposed method uses a piecewise planar hand model that enables hand model regeneration. The hand model regeneration modifies the hand model to fit the current user’s hand and improves the accuracy of the hand pose estimation results. Above all, the proposed method can operate in real time using only CPU-based processing; consequently, it can be applied to various platforms, including egocentric vision devices such as wearable glasses. The results of several experiments verify the efficiency and accuracy of the proposed method.
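Purely as a structural sketch (the estimators here are stubs, not the paper's algorithms), the three-step order looks like the following; each step commits part of the hand state before the next runs, which is what shrinks the search space.

```python
# Structural sketch of the sequential three-step estimation scheme.
import numpy as np

def estimate_palm_pose(frame, model):
    return np.eye(4)        # stub: 6-DOF palm pose as a 4x4 transform

def estimate_finger_yaw(frame, model, palm_pose):
    return np.zeros(5)      # stub: one yaw angle per finger

def estimate_finger_pitch(frame, model, palm_pose, yaw):
    return np.zeros((5, 3)) # stub: pitch angles per finger joint

def track_hand_pose(frame, model):
    palm = estimate_palm_pose(frame, model)                 # step 1
    yaw = estimate_finger_yaw(frame, model, palm)           # step 2
    pitch = estimate_finger_pitch(frame, model, palm, yaw)  # step 3
    return palm, yaw, pitch
```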
We describe and evaluate a practical approach to implementing computer-generated holography (CGH) using multiple graphics processing units (GPUs). The proposed method can generate high-definition (HD) resolution (1920×1080) digital holograms in real time. To demonstrate the plausibility of our method, we present experimental results. First, we discuss the advantage of GPUs over central processing units (CPUs) for CGH by comparing the performance of both; our results show that using GPUs can shorten the CGH computation time by a factor of 2791. Then, we discuss the potential of multiple GPUs for generating HD resolution digital holograms in real time by measuring and analyzing the CGH computation time as a function of the number of GPUs. Our results show that the CGH computation time decreases nonlinearly, following a logarithmic-like curve, as the number of GPUs increases; we can therefore choose the number of GPUs that maximizes efficiency. Consequently, our implementation can generate HD resolution digital holograms at a rate of more than 66 hps (holograms per second) using two NVIDIA GTX 590 cards.
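A hedged sketch of the GPU side is shown below: the same generic point-source CGH summation as before, vectorized with CuPy so every hologram pixel is evaluated on the GPU. The constants and the use of CuPy are illustrative assumptions; the paper's kernels are not reproduced here.

```python
# Generic point-source CGH, vectorized on the GPU with CuPy.
import cupy as cp

PITCH, WAVELEN = 8e-6, 532e-9  # assumed pixel pitch / wavelength [m]
H, W = 1080, 1920              # HD resolution, as in the abstract

def cgh_gpu(points):
    """points: iterable of (px, py, pz, amplitude), coordinates in meters."""
    y, x = cp.mgrid[0:H, 0:W].astype(cp.float64)
    field = cp.zeros((H, W))
    for px, py, pz, amp in points:
        r = cp.sqrt((x * PITCH - px) ** 2 + (y * PITCH - py) ** 2 + pz ** 2)
        field += amp * cp.cos(2 * cp.pi * r / WAVELEN)
    return cp.asnumpy(field)  # copy the hologram back to host memory
```

Splitting the point list (or disjoint hologram regions) across devices with cupy.cuda.Device would be the natural multi-GPU extension of this sketch.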
Natural feature-based approaches are still challenging for mobile applications (e.g., mobile augmented reality), because they are feasible only in limited environments such as highly textured and planar scenes/objects, and they need powerful mobile hardware for fast and reliable tracking. In many cases where conventional approaches are not effective, three-dimensional (3-D) knowledge of target scenes would be beneficial. We present a well-established framework for real-time visual tracking of less textured 3-D objects on mobile platforms. Our framework is based on model-based tracking that efficiently exploits partially known 3-D scene knowledge such as object models and a background’s distinctive geometric or photometric knowledge. Moreover, we elaborate on implementation in order to make it suitable for real-time vision processing on mobile hardware. The performance of the framework is tested and evaluated on recent commercially available smartphones, and its feasibility is shown by real-time demonstrations.
We propose a system that displays a three-dimensional (3D) scene on a 3D display device consisting of a flat-panel display and a slanted parallax barrier, and that can directly manipulate virtual 3D objects in real time. The proposed system consists of a 3D scene display part and a user interaction part. First, we propose a multi-view image interleaving method for a slanted parallax barrier-based 3D display that considers the distance between the user and the display device. Second, we define hand motion parameters for convenient hand motion analysis, and we propose a hand gesture recognition method for user interaction that applies dynamic time warping (DTW) to the specified hand motion parameters. The proposed system thus allows viewing from any position, and viewers can trigger event actions regardless of the shape or timing of their hand motions.
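As an illustration of the recognition step (the distance metric and template set are assumptions, not the paper's), a minimal DTW matcher over motion-parameter sequences could look like this:

```python
# Minimal DTW matching of a motion-parameter sequence against templates.
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) dynamic time warping over parameter vectors."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def recognize(motion, templates):
    """Return the gesture name whose template warps closest to the motion."""
    return min(templates, key=lambda name: dtw_distance(motion, templates[name]))
```

Because DTW warps the time axis, the same gesture performed faster or slower still matches its template, which is what lets recognition ignore the timing of hand motions.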
Augmented reality (AR) has recently gained significant attention. Previous AR techniques usually need a fiducial marker with known geometry, or objects whose structure can be easily estimated, such as a cube; however, placing a marker in the user's workspace can be intrusive. To overcome this limitation, we present an AR system using invisible markers that are created/drawn with an infrared (IR) fluorescent pen. Two cameras are used, an IR camera and a visible camera, positioned on either side of a cold mirror so that their optical centers coincide. We track the invisible markers using the IR camera and visualize AR in the view of the visible camera. Additional algorithms are employed so that the system performs reliably against cluttered backgrounds. Experimental results demonstrate the viability of the proposed system. As an application of the proposed system, the invisible marker can act as a Vision-Based Identity and Geometry (VBIG) tag, which can significantly extend the functionality of RFID. The invisible tag resembles RFID in that it is not perceivable, while being more powerful in that the tag information can be presented to the user by direct projection with a mobile projector or by visualizing AR on the screen of a mobile PDA.
A linear method that calibrates a camera from a single view of two concentric semicircles of known radii is presented. Using the estimated centers of the projected semicircles and the four corner points on the projected semicircles, the focal length and the pose of the camera are accurately estimated in real-time. Our method is applied to augmented reality applications and its validity is verified.
In stereoscopic television, there is a trade-off between visual comfort and 3D impact with respect to the baseline-stretch of the 3D camera. It has been reported that an optimal condition is reached when the baseline-stretch is set to about the distance between the human pupils [1]. However, such a distance cannot be achieved when the lens and CCD module are large. To overcome this limitation, we attempt to control the baseline-stretch of a stereoscopic camera by synthesizing virtual views at the desired interval between the two cameras. The proposed technique is based on stereo matching and view synthesis. We first obtain a dense disparity map using hierarchical stereo matching with an edge-adaptive shifted window, and we then synthesize virtual views using the disparity map. Simulation results with various stereoscopic images demonstrate the effectiveness of the proposed technique.
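A hedged sketch of the synthesis stage under simple assumptions (forward warping with a dense disparity map; hole filling and occlusion handling omitted): a virtual view at a fractional baseline position alpha between the two cameras is produced by shifting each left-image pixel by alpha times its disparity.

```python
# Forward-warping view synthesis from a dense disparity map.
import numpy as np

def synthesize_view(left, disparity, alpha):
    """Shift each left-image pixel by alpha * disparity toward the right view."""
    h, w = disparity.shape
    virtual = np.zeros_like(left)
    for y in range(h):
        for x in range(w):
            xv = int(round(x - alpha * disparity[y, x]))
            if 0 <= xv < w:
                virtual[y, xv] = left[y, x]
    return virtual  # alpha=0 reproduces the left view; alpha=1 approximates the right
```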
This paper focuses on correspondence field estimation and applies it to the compression of stereoscopic images. It proposes dense correspondence estimation with a new probabilistic diffusion algorithm based on maximum a posteriori (MAP) estimation. The MAP-based correspondence field estimation, including occlusion and line fields, is derived so as to reflect the probability distribution of the neighborhoods, and is applied to the compression of stereoscopic images. The proposed probabilistic diffusion algorithm considers the neighborhoods in a Markov random field through their joint probability density, which is the main difference from previous MAP-based algorithms. The joint probability density of the neighborhood system is implemented using a probabilistic plane-configuration model. The paper also derives the upper and lower bounds of the probabilistic diffusion for analysis, and the algorithm is applied to quadtree-decomposed blocks.
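For orientation only, here is a generic MAP-style refinement of a correspondence field on an MRF. This is plain iterated conditional modes with a quadratic smoothness prior, named as such because the paper's probabilistic diffusion and plane-configuration model are not reproduced here.

```python
# Generic ICM refinement of a disparity field on a 4-connected MRF:
# each update minimizes (matching cost) + lam * (smoothness vs. neighbors).
import numpy as np

def icm_refine(cost_volume, disp, lam=0.1, iters=20):
    """cost_volume[y, x, d]: matching cost; disp: initial integer disparities."""
    h, w, dmax = cost_volume.shape
    d = np.arange(dmax)
    for _ in range(iters):
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                neighbors = (disp[y - 1, x], disp[y + 1, x],
                             disp[y, x - 1], disp[y, x + 1])
                smooth = sum((d - dn) ** 2 for dn in neighbors)
                disp[y, x] = np.argmin(cost_volume[y, x] + lam * smooth)
    return disp
```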