This PDF file contains the front matter associated with SPIE Proceedings Volume 12644, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access. Shibboleth/OpenAthens users: please sign in to access your institution's subscriptions. To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Verification of image-processing-algorithm IP cores is important for SoC and FPGA applications in the field of machine vision. This paper proposes a general-purpose, real-time, agile verification framework for image-processing IP cores built on a heterogeneous platform composed of ARM and FPGA. In the framework, Gigabit Ethernet communication is established between the PC and the ARM processor. The FPGA implements a data bus compatible with multiple image types and uses partial reconfiguration to enable fast iteration of the algorithm IP cores under verification. The framework is reusable across algorithm IP cores, and deployment of an IP core under test is 25 times faster than with global reconfiguration. Compared with existing FPGA verification techniques, it offers better reusability, a shorter verification cycle, more targeted test stimuli, and faster deployment of the IP cores under verification.
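The deployment speedup from partial reconfiguration follows directly from bitstream size: a partial bitstream covers only the reconfigurable region holding the IP under test, not the whole device. A minimal sketch, with all sizes and the configuration-port bandwidth assumed purely for illustration (the abstract gives only the final ratio):

```python
def reconfig_time_s(bitstream_bytes, bandwidth_bytes_per_s):
    """Time to load a configuration bitstream at a given port bandwidth."""
    return bitstream_bytes / bandwidth_bytes_per_s

# Illustrative (assumed) figures: full-device vs. partial bitstream sizes
# and the configuration-port bandwidth are NOT taken from the paper.
FULL_BITSTREAM = 25_000_000     # bytes, assumed
PARTIAL_BITSTREAM = 1_000_000   # bytes, assumed
CONFIG_BW = 400_000_000         # bytes/s, assumed

speedup = (reconfig_time_s(FULL_BITSTREAM, CONFIG_BW)
           / reconfig_time_s(PARTIAL_BITSTREAM, CONFIG_BW))
print(f"deployment speedup: {speedup:.0f}x")  # 25x with these assumptions
```

With a fixed configuration bandwidth the ratio reduces to the bitstream-size ratio, which is why shrinking the reconfigurable region shortens the verify-iterate loop.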
With the development of the mobile internet, real-time target detection on mobile devices has wide application prospects, but the limited computing power of terminals greatly constrains detection speed and accuracy. Edge-cloud collaborative computing is the main approach to compensating for this shortfall, yet current methods do not solve the problem of computation scheduling in edge-cloud collaboration systems. To address these problems, this paper proposes: pruning of classical deep-learning target-detection networks; a training and inference offloading strategy between edge and cloud; and a dynamic load-balancing migration strategy based on changes in CPU, memory, bandwidth, and disk state across the cluster. In tests, the edge-to-cloud deep-learning method reduces inference delay by 50% and increases system throughput by 40%, and the maximum job waiting time is reduced by about 20%. The efficiency and accuracy of target detection are thus effectively improved.
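The migration strategy weighs CPU, memory, bandwidth, and disk state when choosing where to run work. The abstract does not give the scoring function, so the weighted-sum score and the weights below are illustrative assumptions, not the paper's method:

```python
def node_load(cpu, mem, bw, disk, weights=(0.4, 0.3, 0.2, 0.1)):
    """Composite load score from normalized resource utilizations in [0, 1].
    The weights are hypothetical; a real policy would tune or learn them."""
    w_cpu, w_mem, w_bw, w_disk = weights
    return w_cpu * cpu + w_mem * mem + w_bw * bw + w_disk * disk

def pick_migration_target(nodes):
    """Choose the least-loaded node in the cluster as the migration target."""
    return min(nodes, key=lambda n: node_load(**n["usage"]))

cluster = [
    {"name": "edge-1",  "usage": {"cpu": 0.9, "mem": 0.8, "bw": 0.5, "disk": 0.4}},
    {"name": "cloud-1", "usage": {"cpu": 0.3, "mem": 0.4, "bw": 0.6, "disk": 0.2}},
]
print(pick_migration_target(cluster)["name"])  # cloud-1 (lower composite load)
```

Re-evaluating this score as resource state changes is what makes the balancing "dynamic": a job migrates when another node's composite load drops below the current host's by some margin.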
This paper, originally published on 3 May 2023, was retracted from the SPIE Digital Library on 18 March 2024 upon verification that it plagiarized significant content from the two papers below without appropriate citation:
(1) Celebi, "Improving the performance of k-means for color quantization," published in Image and Vision Computing, Vol. 29, No. 4 (March 2011): https://www.sciencedirect.com/science/article/abs/pii/S0262885610001411
(2) Huang, “An Efficient Palette Generation Method for Color Image Quantization,” published in Applied Sciences, 2021, 11: https://www.mdpi.com/2076-3417/11/3/1043
Object grasping is a very challenging problem in computer vision and robotics. Existing algorithms generally have large numbers of training parameters, which leads to long training times and demands high-performance hardware. In this paper, we present a lightweight neural network for object grasping. Our network generates grasps at real-time speeds (~30 ms) and can therefore be used on mobile devices. The main idea of GhostNet is to reduce the parameter count by generating some feature maps cheaply from others during convolution; we adopt this idea and extend it to the deconvolution process, then construct a lightweight grasp network from these two building blocks. Extensive experiments on grasping datasets demonstrate that our network performs well: we achieve 94% accuracy on the Cornell grasp dataset and 91.8% on the Jacquard dataset, while requiring only 15% of the parameters and 47% of the training time of traditional models.
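The parameter saving behind the Ghost idea can be seen from a simple count: a standard (de)convolution pays for every output channel, while a Ghost-style layer pays full price for only a fraction of "intrinsic" maps and generates the rest with cheap depthwise operations. A counting sketch, assuming the common ratio of 2 and a 3x3 depthwise kernel (the paper's exact layer shapes are not given in the abstract):

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard (de)convolution, ignoring bias."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, ratio=2, dw_k=3):
    """Ghost-style layer: a (de)convolution produces c_out//ratio intrinsic
    maps; the remaining 'ghost' maps come from cheap depthwise operations."""
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k          # full-price (de)convolution
    cheap = intrinsic * (ratio - 1) * dw_k * dw_k  # depthwise ghost generation
    return primary + cheap

std = conv_params(128, 128, 3)
ghost = ghost_params(128, 128, 3)
print(f"ghost/standard parameter ratio: {ghost / std:.2f}")  # ~0.50
```

Roughly, the layer shrinks by a factor close to the ratio, since the depthwise term is negligible next to the primary convolution for realistic channel counts.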
Video frame interpolation (VFI), which aims to synthesize intermediate frames from bidirectional reference frames, has made remarkable progress with the development of deep convolutional neural networks (CNNs) over the past years. Existing CNNs generally face challenges in handling large motions due to the locality of convolution operations, resulting in slow inference. We introduce the Real-time Video Frame Interpolation Transformer (RVFIT), a novel framework that overcomes this limitation. Unlike traditional CNN-based methods, RVFIT does not process video frames separately through different network modules in the spatial domain; instead, it batches adjacent frames through a single end-to-end UNet-style Transformer architecture. Moreover, we introduce two-stage interpolation sampling before and after the end-to-end network to maximize the contribution of classical CV algorithms. Experimental results show that, compared with the SOTA TMNet, RVFIT has only 50% of the network size (6.2M vs. 12.3M parameters) with comparable performance, while inference speed increases by over 80% (26.1 fps vs. 14.3 fps at a frame size of 720x576).
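The abstract does not specify which classical CV algorithm the two-stage sampling uses; as an illustration of the simplest such baseline, an intermediate frame can be formed by linearly blending the two reference frames at a normalized time t, with the learned network left to refine that initial estimate:

```python
def blend_midframe(frame_a, frame_b, t=0.5):
    """Classical baseline: linear blend of two reference frames at time
    t in [0, 1]. Frames are 2-D lists of grayscale values; a learned
    network would refine this estimate rather than output it directly."""
    return [[(1.0 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

mid = blend_midframe([[0, 100]], [[100, 200]])
print(mid)  # [[50.0, 150.0]]
```

Linear blending ghosts under large motion, which is exactly the regime where the Transformer's non-local attention is claimed to help over convolutional locality.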
Naked-eye 3D imaging calls the cell phone's front camera to detect the position of the viewer's eyes. Because the front camera sits at some distance from the center of the phone screen, there is an offset between the eye position detected by the camera and the position relative to the screen; the 3D display switches views using, as its origin, the midpoint of the two eyes detected when the viewer looks directly at the center of the screen. This paper therefore solves the eye-positioning offset problem by a direct measurement method and a formula derivation method.
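Under the simplest model, the correction is a pure translation by the camera's displacement from the screen center; the paper's formula derivation presumably refines this with viewing distance and angle, which the abstract does not detail. A hypothetical sketch of the translation-only case, with sign conventions and units assumed:

```python
def to_screen_coords(eye_x, eye_y, cam_dx, cam_dy):
    """Map the detected eye midpoint from camera-centered coordinates to
    screen-centered coordinates. (cam_dx, cam_dy) is the camera's offset
    from the screen center in the same units (e.g. mm); a translation-only
    model, valid when the viewer faces the screen head-on."""
    return eye_x + cam_dx, eye_y + cam_dy

# Assumed example: camera 5 mm right of and 60 mm above the screen center.
print(to_screen_coords(0.0, 0.0, 5.0, 60.0))  # (5.0, 60.0)
```

At close viewing distances the parallax between the camera axis and the screen normal adds a distance-dependent term, which is where a measured calibration or derived formula would replace this constant shift.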
By analyzing data from the Sloan Digital Sky Survey (SDSS) Data Release 16, in which quasar spectra are the major samples, we investigate possible variations of the fine-structure constant on cosmological time scales. We analyzed 14,495 quasar samples (redshift z < 1) using the emission-line method on the [O III] doublet and obtained Δα/α = (0.70 ± 1.6) × 10^-5. We investigated the precision limit of fine-structure-constant measurements from SDSS spectra by designing simulations of the three main sources of systematics: noise, gas outflow, and skylines. In addition, we performed a cross-correlation analysis on a high-resolution spectrum from MagE (MagE observations at the Magellan II Clay telescope), J131651.29+055646.9, and obtained Δα/α = (-9.16 ± 11.38) × 10^-7. Better constraints (e.g., a skyline-subtraction algorithm) may improve the SDSS precision slightly; a more promising and efficient approach is to constrain Δα/α with high-resolution spectroscopy and large active-galaxy/QSO surveys.
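In the emission-line method, the doublet separation ratio R = (λ2 - λ1)/(λ2 + λ1) scales as α² while the common redshift factor cancels, so to first order Δα/α ≈ (R_obs/R_lab - 1)/2. A sketch of that relation (rest wavelengths from standard line lists; the paper's fitting pipeline is of course far more involved):

```python
# [O III] doublet rest wavelengths in Angstrom (standard line-list values).
L1_LAB, L2_LAB = 4958.91, 5006.84

def doublet_ratio(l1, l2):
    """Separation ratio R = (l2 - l1) / (l2 + l1); invariant under redshift."""
    return (l2 - l1) / (l2 + l1)

def delta_alpha_over_alpha(l1_obs, l2_obs):
    """Since R scales as alpha^2, Delta-alpha/alpha ~ (R_obs/R_lab - 1) / 2."""
    r_lab = doublet_ratio(L1_LAB, L2_LAB)
    return (doublet_ratio(l1_obs, l2_obs) / r_lab - 1.0) / 2.0

# A pure redshift rescales both lines equally, so the inferred shift is zero.
z = 0.5
print(delta_alpha_over_alpha(L1_LAB * (1 + z), L2_LAB * (1 + z)))  # 0.0
```

This redshift invariance is what makes the doublet a clean probe: only a genuine change in α (or a systematic like skyline contamination distorting one line centroid) moves the ratio.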
In order to avoid errors in distance measurement caused by physical characteristics, environmental influences, human factors, and so on during the measurement process, visual measurement and modern digital image processing techniques are used. This requires building an image acquisition system and detecting the edges of the acquired images with specialised procedures; the differences between key points on the image edges then enter the measurement equation. The values obtained show that the final results of this method are consistent with the actual values, demonstrating the accuracy of digital image correlation processing in distance measurement. Digital image processing technology is a branch of advanced manufacturing technology, and the rapid development of computer technology has advanced digital image recognition and analysis capabilities. Engineering surveying is mainly applied in engineering construction and management, where it can greatly reduce manufacturing error and shorten inspection cycles. Based on digital image processing technology, this paper proposes an engineering displacement measurement method, an industrial part dimension measurement method, and an industrial thread standard measurement method. Compared with traditional manual measurement, digital image techniques shorten the working cycle and improve working efficiency.
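The edge-based measurement idea above reduces, in its simplest 1-D form, to finding threshold crossings along an intensity profile and converting the pixel distance between them with a calibrated scale factor. A minimal sketch, with the profile, threshold, and mm-per-pixel calibration all synthetic:

```python
def edge_positions(profile, threshold):
    """Pixel indices where a 1-D intensity profile crosses the threshold
    (rising or falling edges, nearest-pixel precision)."""
    return [i for i in range(1, len(profile))
            if (profile[i - 1] < threshold) != (profile[i] < threshold)]

def measure_mm(profile, threshold, mm_per_px):
    """Distance between the outermost detected edges, converted to mm by
    a calibration factor obtained from a reference target."""
    edges = edge_positions(profile, threshold)
    return (edges[-1] - edges[0]) * mm_per_px

# Synthetic profile: a 30-px-wide bright feature on a dark background.
profile = [0] * 10 + [255] * 30 + [0] * 10
print(measure_mm(profile, 128, 0.1))  # 3.0 (mm at 0.1 mm/px)
```

Practical systems refine this with sub-pixel edge interpolation and 2-D edge operators, but the measurement equation keeps the same shape: edge separation in pixels times a calibrated scale.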
Due to the increasing number of private cars, traffic management departments are paying more and more attention to vehicle traffic problems. In daily management, video imagery is often the most intuitive, effective, and fast way to obtain information. In practice, vehicle access management in parking lots is complex and laborious, as most lots still rely on manual scanning to record vehicles entering and leaving. To solve this problem, this paper applies image recognition and analysis technology to record the personnel and vehicle information of a parking-lot entry system effectively and quickly, so that the information can be statistically analyzed and a response plan formulated promptly when unexpected situations arise. This improves road traffic efficiency and safety performance, and provides the relevant staff with a simple, fast, and easy-to-operate method of certain practical value.
Focusing on how to obtain high-quality and sufficient synthetic aperture radar (SAR) data for deep learning, this paper proposes a new method named SARCUT (Self-Attention Relativistic Contrastive Learning for Unpaired Image-to-Image Translation) to translate optical images into SAR images. To improve the coherence of generated images and stabilize the training process, we construct a generator with a self-attention mechanism and spectral normalization. Meanwhile, a relativistic discriminator adversarial loss function is designed to accelerate model convergence and improve the authenticity of the generated images. Experiments on open datasets with six quantitative image-evaluation metrics show that our model learns the deeper internal relations and main features shared across the source images. Compared with classical methods, SARCUT better establishes the real image-domain mapping, and both the quality and the authenticity of the generated images are significantly improved.
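A relativistic discriminator scores how much more realistic a real sample looks than a generated one, rather than judging each in isolation. A minimal sketch of the discriminator-side loss under the usual logistic formulation (the paper's exact loss terms may differ):

```python
import math

def relativistic_d_loss(d_real, d_fake):
    """-log(sigmoid(D(real) - D(fake))): the discriminator is rewarded for
    rating real samples as more realistic than generated ones; d_real and
    d_fake are raw (pre-sigmoid) discriminator outputs."""
    return -math.log(1.0 / (1.0 + math.exp(-(d_real - d_fake))))

print(round(relativistic_d_loss(2.0, -2.0), 4))  # near 0: real clearly wins
print(round(relativistic_d_loss(0.0, 0.0), 4))   # log(2): undecided
```

Because the loss depends only on the difference of scores, gradients flow even when the discriminator saturates on absolute judgments, which is the usual argument for faster, more stable convergence.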
Rapid economic and social development has sharply increased demand for water resources, yet at present half of the world's rivers are greatly diminished or seriously polluted, accelerating the consumption of water resources. Building a shared smart water-resource recycling system is therefore of great significance to China's current water-resources management. The rapid development of computer technology provides important conditions for changing traditional concepts and behaviors, and offers new solutions to the social conflicts brought about by traditional development models. Image processing technology replaces the work of the human eye with computers, making automated and intelligent image recognition possible; after continuous development, digital image processing has formed a relatively complete discipline, and the growing demands of its application fields keep pushing the technology toward higher levels. Addressing the serious problems of water shortage, pollution, waste, and uneven distribution, this paper applies image processing to water-quality analysis and detection: image processing algorithms perform preprocessing, image segmentation, and contour extraction; parameters such as region area and perimeter are statistically analyzed; and big data, deep learning, and new image-analysis technologies are integrated into an intelligent platform system. On this basis, the system can not only improve the utilization efficiency of water resources but also safeguard national water security and sustainable economic and social development.
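The area and perimeter statistics mentioned above can be computed from a binary segmentation mask: area as the count of foreground pixels, perimeter as the count of foreground pixels touching the background. A minimal pure-Python sketch (real pipelines would use a contour-tracing library; the 4-connected boundary rule here is one common convention):

```python
def region_stats(mask):
    """Area = number of foreground pixels; perimeter = number of foreground
    pixels with at least one 4-connected background (or out-of-image)
    neighbour. mask is a 2-D list of 0/1 values from segmentation."""
    h, w = len(mask), len(mask[0])
    area = perim = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            area += 1
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    perim += 1  # boundary pixel: counts once
                    break

    return area, perim

# A 3x3 solid region: 9 pixels, all but the centre lie on the boundary.
print(region_stats([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))  # (9, 8)
```

Such shape descriptors, fed into statistical or learned classifiers, are what link the low-level segmentation step to the water-quality judgments the platform makes.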