This PDF file contains the front matter associated with SPIE Proceedings Volume 11884, including the Title Page, Copyright information, and Table of Contents.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access. Shibboleth/Open Athens users, please sign in to access your institution's subscriptions. To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
International Symposium on Artificial Intelligence and Robotics 2021
Remaining useful life (RUL) prediction of lithium-ion batteries (LIBs) plays an important role in the battery management system, and accurate prediction can ensure the safe and stable operation of the battery. However, accurate RUL prediction is difficult to achieve. In this paper, a method based on grey wolf optimization (GWO) and support vector regression (SVR) is proposed, which effectively improves the accuracy of LIB remaining useful life prediction. Since the kernel parameters of SVR are difficult to select, the GWO algorithm is employed to optimize them. The method is verified on the battery datasets provided by the NASA Prognostics Center of Excellence (PCoE). Compared with the plain SVR method, the RUL prediction accuracy of GWO-SVR is significantly improved; compared with the advanced ALO-SVR method, the average relative error of GWO-SVR is reduced by 7.16%. The accuracy of RUL prediction is thus effectively improved.
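The GWO search loop can be sketched as follows. This is a minimal, illustrative grey wolf optimizer in plain Python; the `toy_error` objective is only a stand-in for the SVR cross-validation error over the kernel parameters (C, gamma), and the bounds, population size and iteration count are assumed values, not the paper's.

```python
import random

def gwo_minimize(objective, bounds, n_wolves=10, n_iters=50, seed=0):
    """Minimal grey wolf optimizer (GWO) sketch: wolves move toward the
    three best solutions (alpha, beta, delta), and the coefficient `a`
    decays linearly from 2 to 0 to shift exploration into exploitation."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    for it in range(n_iters):
        wolves.sort(key=objective)
        alpha, beta, delta = wolves[0], wolves[1], wolves[2]
        a = 2.0 * (1 - it / n_iters)
        for i in range(3, n_wolves):           # keep the three leaders in place
            new = []
            for d in range(dim):
                pulls = []
                for leader in (alpha, beta, delta):
                    A = a * (2 * rng.random() - 1)
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - wolves[i][d])
                    pulls.append(leader[d] - A * D)
                lo, hi = bounds[d]
                new.append(min(hi, max(lo, sum(pulls) / 3)))
            wolves[i] = new
    return min(wolves, key=objective)

# Placeholder objective standing in for SVR validation error over (C, gamma);
# a real run would fit an SVR here and return its cross-validation error.
def toy_error(params):
    c, gamma = params
    return (c - 10.0) ** 2 + (gamma - 0.1) ** 2

best = gwo_minimize(toy_error, bounds=[(0.1, 100.0), (1e-4, 1.0)])
```

The found `best` would then be the (C, gamma) pair handed to the final SVR fit.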
Predicting the population density in key areas of a city is of great importance: it helps us rationally deploy urban resources, initiate regional emergency plans, reduce the spread risk of infectious diseases such as COVID-19, predict individuals' travel needs, and build intelligent cities. Although current research focuses on using point-of-interest (POI) data and unsupervised clustering to predict the population density of neighboring cities in order to define metropolitan areas, there is almost no discussion of using spatial-temporal models to predict the population density in key areas of a city without using actual regional images. We abstract 997 key areas in Beijing and their regional connections into a graph structure and propose a model called the Word Embedded Spatial-temporal Graph Convolutional Network (WE-STGCN). WE-STGCN is mainly composed of three parts: the Spatial Convolution Layer, the Temporal Convolution Layer, and the Feature Component. Based on the data set provided by the Data Fountain platform, we evaluate the model and compare it with several typical models. Experimental results show that the Spatial Convolution Layer can merge features of nodes and edges to reflect spatial correlation, the Temporal Convolution Layer can extract temporal dependence, and the Feature Component can enhance the importance of other attributes that affect the population density of an area. Overall, WE-STGCN outperforms the baselines and can predict population density in key areas.
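A spatial convolution layer of this kind typically rests on the standard graph-convolution propagation rule H' = D^(-1/2)(A + I)D^(-1/2)H. A minimal sketch, assuming a dense 0/1 adjacency matrix and omitting the learned weight matrix and nonlinearity a real layer would apply:

```python
import math

def gcn_layer(adj, feats):
    """One symmetric-normalized graph-convolution step over dense inputs:
    each node's output is the degree-normalized average of its own and its
    neighbors' features (self-loops added via A + I)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    out = []
    for i in range(n):
        acc = [0.0] * len(feats[0])
        for j in range(n):
            if a_hat[i][j]:
                norm = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(len(acc)):
                    acc[k] += norm * feats[j][k]
        out.append(acc)
    return out

# Tiny 3-node path graph 0-1-2 with one scalar feature per node:
# the layer smooths each node's value toward its neighbors'.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0], [2.0], [3.0]]
smoothed = gcn_layer(adj, feats)
```

Stacking such layers (with weights and activations) is what lets spatial correlation between the 997 areas propagate across the graph.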
Recently, boundary information has gained more attention as a means of improving the performance of semantic segmentation. This paper presents a novel symmetrical network, called BASNet, which contains four components: a pre-trained ResNet-101 backbone, a semantic segmentation branch (SSB), a boundary detection branch (BDB), and an aggregation module (AM). More specifically, the BDB focuses solely on processing boundary-related information using a series of spatial attention blocks (SABs), while a set of global attention blocks (GABs) is used in the SSB to capture more accurate object boundary and semantic information. Finally, the outputs of the SSB and BDB are fed into the AM, which merges their features to boost performance. Exhaustive experimental results show that our method not only predicts object boundaries more accurately but also improves semantic segmentation performance.
As a core component of the vehicle transmission system, the wet clutch is widely used in cars, heavy vehicles and tracked vehicles, and its dynamic engagement characteristics directly affect the safety, comfort and durability of the vehicle. In this paper, the engagement process is simulated by the finite element method. For fluid friction, the average Reynolds equation proposed by Patir and Cheng is improved and cast in dimensionless form, and is applied to calculate the viscous torque. For boundary friction, a surface elastic contact model is established to calculate the rough contact torque. In mixed friction, the total torque consists of the viscous torque and the rough contact torque. Simulations and bench tests are completed to verify the validity of the proposed model, which can be used to guide wet clutch design in the early stages of product development.
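As a rough illustration of the viscous torque term, the sketch below integrates Newtonian shear stress over an annular friction surface. This is a textbook flat-plate simplification, not the paper's Patir-Cheng average Reynolds model with roughness corrections, and the numeric values are made up:

```python
import math

def viscous_torque(mu, omega, r_in, r_out, h):
    """Viscous drag torque of a flat annular plate pair separated by a
    Newtonian oil film of thickness h (m) at relative speed omega (rad/s).
    Integrating shear stress tau = mu*omega*r/h over the annulus gives
    T = pi * mu * omega * (r_out^4 - r_in^4) / (2 * h)."""
    return math.pi * mu * omega * (r_out ** 4 - r_in ** 4) / (2.0 * h)

# Illustrative (invented) values: 0.08 Pa.s oil, 100 rad/s slip speed,
# a 60-90 mm friction ring, and a 0.1 mm oil film.
t_visc = viscous_torque(mu=0.08, omega=100.0, r_in=0.060, r_out=0.090, h=1e-4)
```

Note the fourth-power radius dependence: most of the viscous torque comes from the outer edge of the friction ring, and the torque falls as the film thins less (larger h).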
A trajectory tracking method based on sliding mode control (SMC) and disturbance observers is proposed for wheeled mobile robots (WMRs) suffering from unknown disturbances. First, a continuous sliding mode control (CSMC) method is designed for an uncertain WMR. However, when the system confronts strong internal or external disturbances, a large control action must be designed, which leads to a large steady-state tracking error. To improve robustness and reduce the tracking error, two nonlinear disturbance observers (NDOs) are designed to estimate the various unpredictable disturbances in the kinematic and dynamic descriptions of the system, such as skidding, slipping, and parameter uncertainties. Then, with the aid of the disturbance estimates, a new trajectory tracking method consisting of the CSMC and the NDOs is constructed for the WMR. Moreover, the stability of the entire closed-loop system under the proposed control strategy is proved in detail. Finally, the tracking performance of the proposed controller is verified by simulation.
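The pairing of continuous SMC with a disturbance observer can be illustrated on a toy first-order plant x' = u + d. The gains, disturbance signal and observer structure below are illustrative assumptions, not the paper's WMR design:

```python
import math

def sat(s, eps=0.05):
    """Boundary-layer saturation used in place of sign() to reduce chattering."""
    return max(-1.0, min(1.0, s / eps))

def simulate(k=2.0, lam=5.0, l_obs=20.0, dt=1e-3, t_end=10.0):
    """Toy 1-D tracking sketch: plant x' = u + d with unknown d(t),
    continuous SMC u = xr' - lam*e - d_hat - k*sat(e), and a nonlinear
    disturbance observer realised through the auxiliary state z
    (d_hat = z + l_obs*x, so that d_hat' = l_obs*(d - d_hat))."""
    x, z, t = 0.0, 0.0, 0.0
    errs = []
    while t < t_end:
        xr, xr_dot = math.sin(t), math.cos(t)      # reference trajectory
        d = 0.5 * math.sin(2.0 * t) + 0.2          # unknown disturbance
        d_hat = z + l_obs * x
        e = x - xr
        u = xr_dot - lam * e - d_hat - k * sat(e)  # feed the estimate back
        x_dot = u + d
        z += dt * (-l_obs * (z + l_obs * x) - l_obs * u)
        x += dt * x_dot
        t += dt
        errs.append(abs(e))
    return errs

errs = simulate()
steady_err = max(errs[-1000:])   # worst tracking error over the last second
```

Because the observer cancels most of d(t), the switching gain k can stay small, which is exactly the mechanism the abstract describes for shrinking the steady-state error.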
Starting from studies on the performance indexes of mobile apps and the characteristics of video apps, this paper designs test scenarios to monitor the main performance indexes of three common video apps, aiming to provide a reference for related studies and for users through analysis of the test results. A comparative test of the performance of iQIYI, BiliBili and QQlive is conducted in terms of CPU, memory, GPU and data consumption, using an automated test method together with the PerfDog, iTest and ADB tools. According to the test results, all three video apps perform excellently in CPU utilization, remaining far below the expected value. As for GPU, BiliBili delivers the highest video quality; it also occupies the least memory and consumes the least data. iQIYI performs worst in frame rate, memory occupation and data consumption.
To realize fault diagnosis of power transmission systems, this paper first introduces online monitoring methods based on sensors. Then, traveling wave methods and wavelet methods for fault location on transmission lines are summarized with respect to real-time operational data and environmental variation data. Furthermore, neural networks and genetic algorithms for fault recognition based on online monitoring data are summarized. Finally, the paper discusses the principles, advantages, and shortcomings of the different methods, establishing a foundation for the intelligent diagnosis of power transmission systems.
At present, functionally distributed computer network interconnection is an important component of LAN applications; regional power grid monitoring centers, aerospace survey ship center systems, and airport ground monitoring systems mostly use this model. It is conceptually different from a distributed computer system within a single organization: each connected computer still operates autonomously, and the distribution is only functional. Therefore, traditional LAN interconnection methods such as Ethernet are still adopted, although the structure differs from the traditional one: in such a local area network, the interconnected microcomputers generally have no hierarchical affiliation and are functionally equivalent.
In recent years, with the increase in world trade volume, multimodal transport has become the main mode of international trade transportation, and its development is entering a new stage. As an important part of multimodal transport, the invoice affects the efficiency of multimodal transport. Electronic invoices are replacing traditional paper invoices as the main form of the bill of lading, but because they are paperless and intangible, electronic invoices, unlike traditional paper invoices, cannot be physically possessed. Since blockchain technology is immutable and traceable, it can be applied to the electronic invoice. It provides shippers and carriers with freight information and pre-agreed service contract rates, enables them to conduct secure and transparent transactions, improves electronic invoice processing efficiency, and reduces shipper and carrier costs.
The advent of blockchain technology has transformed traditional business processes from centralized to decentralized. By eliminating unnecessary intervention by middlemen, it can reduce the overall cost of patient medication by turning the drug supply chain into an entirely peer-to-peer, decentralized business. This paper presents a reliable, incentive-driven P2P drug trading blockchain design and four related smart contracts, covering consumer agreements and the supply, bidding and trading of drugs, which have been deployed on the Ethereum blockchain for decentralized drug trading. We use the approach of real cost descending (RCD) to achieve incentivized transactions for suppliers and patients. The method provides P2P transactions and ensures the safety and transparency of drug data, as well as the anonymity of users during transactions. Finally, according to the requirements of Good Supplying Practice (GSP), the effectiveness of the proposed model is evaluated and analyzed.
The Drosophila visual system is extremely sensitive to moving targets, which provides a wealth of biological inspiration for research on target motion perception in complex scenes and lays a biological and theoretical foundation for building artificial Drosophila visual neural networks. Drosophila vision has been extensively studied in physiology, anatomy, and behavior, but our understanding of its underlying neural computation is still insufficient. To gain insight into the neural mechanisms of Drosophila vision and to take better advantage of its strength in motion perception, we propose a Drosophila vision-inspired model, which constructs a complete Drosophila visual motion perception system by integrating successive computing layers. Our hybrid model can fully demonstrate the motion perception process in Drosophila vision. In addition, the model can be exploited for salient object detection in dynamic scenes. This novel salient object detection model differs from previous ones in that it can accurately identify the motion of interest (MOI) while suppressing background disturbances and ego-motion. Comprehensive evaluations on standard benchmarks demonstrate the superiority of our model in salient object detection compared with state-of-the-art methods.
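The classic correlation model of fly motion vision, the Hassenstein-Reichardt elementary motion detector, illustrates the kind of layered motion computation such models build on. A minimal sketch (the paper's actual network is far richer):

```python
import math

def reichardt_response(stimulus_a, stimulus_b, delay):
    """Hassenstein-Reichardt elementary motion detector (EMD): each arm
    multiplies one photoreceptor signal by a delayed copy of the other,
    and the two arms are subtracted, so the mean output is signed by
    motion direction (positive for motion from A toward B)."""
    out = []
    for t in range(delay, len(stimulus_a)):
        left = stimulus_a[t - delay] * stimulus_b[t]
        right = stimulus_b[t - delay] * stimulus_a[t]
        out.append(left - right)
    return sum(out) / len(out)

# A sinusoidal grating drifting from A to B: photoreceptor B sees the
# same signal as A, shifted later in time by `shift` samples.
n, shift, delay = 200, 5, 5
a = [math.sin(0.2 * t) for t in range(n)]
b = [math.sin(0.2 * (t - shift)) for t in range(n)]
forward = reichardt_response(a, b, delay)    # preferred direction
backward = reichardt_response(b, a, delay)   # null direction
```

The opponent subtraction is what gives direction selectivity without any explicit velocity computation.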
For vibration signals measured at different speeds or loads, the different states of bearings exhibit considerable internal variability, which further increases the difficulty of extracting consistent fault features. We believe that improving the performance of feature learning and classification requires a more comprehensive and extensive extraction and fusion of signals. However, existing multiscale multi-stream architectures rely on concatenating features at the deepest layers, which stacks multiscale features by brute force and does not allow complete fusion. This paper proposes a novel multiscale shared learning network (MSSLN) architecture to extract and classify the fault features inherent in the multiscale factors of the vibration signal. The merits of the proposed MSSLN are as follows: 1) a multistream architecture is used to learn and fuse multiscale features from raw signals in parallel; 2) the shared learning architecture can fully exploit the shared representation that is consistent across multiscale factors. These two characteristics help the MSSLN make a more faithful diagnosis than existing single-scale and multiscale methods. Extensive experimental results on the Case Western Reserve dataset demonstrate that the proposed method achieves high accuracy and excellent generalization.
Stock price prediction has always been a challenging problem due to its complex, volatile nature. We are inspired by the observation that traders prefer to judge stock trends from charts and have developed useful chart-based techniques for predicting them. In this paper, we try to predict stock price trends from stock images using deep convolutional neural networks (DCNNs), focusing mainly on how to generate the stock images and how to label them to train the DCNN model. Specifically, on one hand, we transfer the stock time series data into five types of stock images, exploring effective visual representations for stock data; on the other hand, we label the images in a more meaningful way to train an attention-based DCNN model and build a practical system. Experimental results on the S&P 500 stocks show that the MACD image performs better than the other types, and that predicting a stock's next-day trend remains hard. However, the convolutional block attention module (CBAM) can improve the performance of the DCNN.
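For reference, the MACD indicator underlying the best-performing image type is computed from two exponential moving averages. A sketch with the conventional 12/26/9 periods (the paper's exact settings are not stated here):

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1),
    seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def macd(prices, fast=12, slow=26, signal=9):
    """MACD line (fast EMA minus slow EMA), signal line (EMA of the MACD
    line), and histogram (their difference)."""
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    hist = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, hist

# Steadily rising synthetic prices: the fast EMA tracks the trend more
# closely than the slow EMA, so the MACD line turns positive.
prices = [100.0 + 0.5 * t for t in range(60)]
macd_line, signal_line, hist = macd(prices)
```

Rendering `macd_line`, `signal_line` and `hist` as a chart is the kind of image the DCNN would then consume.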
This paper studies the problems of the newly introduced incentive system upgrade and management funding input at a company, establishes a production function model, and observes whether the introduction of the new management system can effectively help traditional mining enterprises improve production efficiency and production safety. Through empirical research, the paper finds that effective improvement of the enterprise management and incentive systems can improve the production efficiency of the enterprise, increase worker participation, and promote the effective implementation of projects. Finally, based on the conclusions of the empirical research, the paper further verifies the reliability of the data and gives relevant suggestions.
Aiming at the situation that retrieval-based chatbots rely too much on predefined responses while generative chatbots have excessive training requirements, a hybrid retrieval-generative text chatbot based on LSTM and an attention model is designed. A retrieval model can only handle scenarios with predefined responses, while a generative model with strong learning ability produces grammatical errors in certain scenarios. Therefore, text processing is first performed on the corpus; the retrieval model then generates a candidate data set, which is used to train the generative model to obtain the final model. Experimental comparisons show that the hybrid retrieval-generative chatbot can effectively improve response quality compared with a single-model chatbot, with accuracy improved by thirty percent.
Reflection is a kind of image noise frequently produced by windows, glass and similar surfaces when taking photos or videos. Reflections not only degrade image quality but also affect computer vision tasks such as object detection and segmentation. In single image reflection removal (SIRR), learning models are often used because many patterns of reflection are possible and a versatile model is required. Conventional SIRR with deep learning models faces two problems: the assumed reflection scenes vary widely, and there is little training data because ground truth is difficult to obtain. In this study, we focus on the latter and propose an SIRR method based on meta-learning, adopting MAML, one of the representative meta-learning methods, together with the Iterative Boost Convolutional LSTM Network (IBCLN) as the deep learning model. The proposed method improves accuracy compared with a conventional state-of-the-art SIRR method.
With the rapid development of social networks and the increasing use of mobile devices, the scale of digital image data has increased sharply, and dynamic detection of object categories has gradually become a research hotspot in the field of computer vision. This paper summarizes the key problems of dynamic detection of object categories. It first introduces the research background, then reviews object category detection techniques around four core technical points (source code compilation, functional testing, model training and model verification), together with the training of different data sets and evaluation standards. Finally, it lists test results of dynamic object category detection algorithms and summarizes the main difficulties and future research directions of object category detection.
In recent years, energy minerals have become more important due to rapid industrialization worldwide, which has caused a shortage of mineral resources and increased reliance on alternative energy sources. The exploration of the abundant resources in the ocean is therefore being promoted. However, it is dangerous and impractical for humans to dive and search for marine resources by hand, so robots, which can carry out underwater exploration safely, have become the mainstream search tool in hazardous underwater environments. Several problems are associated with robot control in underwater environments, one of which is poor visibility in the water. To improve visibility, we attempt to increase the resolution of underwater images using super-resolution technology. In this paper, we conduct experiments on underwater images using SRCNN, a basic super-resolution technique. In addition, we investigate the effectiveness on SRCNN of "Mish", an activation function that has attracted attention in recent years for its potential to surpass the performance of "ReLU", the typical activation function of neural networks.
Existing appearance-based gaze estimation methods mostly regress gaze direction from eye images, neglecting facial information and head pose, which can be very helpful. In this paper, we propose a robust appearance-based gaze estimation method that regresses gaze direction jointly from the human face and eyes. The face and eye regions are located based on detected landmark points, and representations of the two modalities are modeled with convolutional neural networks (CNNs), which are finally combined for gaze estimation by a fusion network. Furthermore, considering the varying impact of different facial regions on gaze, spatial weights for the facial area are learned automatically with an attention mechanism and applied to refine the facial representation. Experimental results on the Eyediap benchmark dataset validate the benefits of fusing multiple modalities in gaze estimation, and the proposed method yields better performance than previous advanced methods.
In-vehicle cameras and surveillance cameras are used in many situations in our daily lives. Visibility degradation in foggy environments is caused by the scattering of light reflected from real objects by minute water droplets in the medium through which the light passes. The degree of degradation depends on the density of suspended microparticles between the observed object and the observation point; in general, the farther an object is from the camera, the more it is affected by fog. The purpose of image de-fogging is to improve the clarity of objects by removing the effects of fog from the image.
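The degradation process described here is commonly written as the atmospheric scattering model I = J*t + A*(1 - t) with transmission t = exp(-beta*d). The sketch below fogs and then restores a pixel, assuming depth, airlight and beta are known; real de-fogging must estimate all three:

```python
import math

def transmission(depth, beta):
    """Koschmieder-style transmission t(d) = exp(-beta * d): light from
    farther objects is attenuated more."""
    return math.exp(-beta * depth)

def add_fog(radiance, depth, airlight=1.0, beta=0.8):
    """Atmospheric scattering model I = J*t + A*(1 - t): scene radiance J
    fades with depth while airlight A bleeds in."""
    t = transmission(depth, beta)
    return radiance * t + airlight * (1.0 - t)

def defog(observed, depth, airlight=1.0, beta=0.8, t_min=0.1):
    """Invert the model: J = (I - A) / max(t, t_min) + A. Clamping t
    avoids amplifying noise at very distant pixels."""
    t = max(transmission(depth, beta), t_min)
    return (observed - airlight) / t + airlight

# A dark pixel (J = 0.3) seen through increasing depths: the observed
# value drifts toward the airlight, and inversion recovers J.
depths = [0.5, 1.0, 2.0]
foggy = [add_fog(0.3, d) for d in depths]
restored = [defog(i, d) for i, d in zip(foggy, depths)]
```

The clamp `t_min` is why distant regions can never be perfectly restored: as t approaches 0 the observed pixel carries almost no scene information.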
The transmission line is critical to the security and stability of the power system and is the basis for ensuring power supply. It is very important to keep transmission lines operating normally and to find and repair faulty lines in time, which requires accurate identification of the fault type and fault distance when a fault occurs. Therefore, this paper comprehensively summarizes transmission line fault diagnosis technology and its research status, analyzing and organizing the fault analysis method, the traveling wave method, intelligent positioning methods and others. On this basis, the research and application prospects of artificial intelligence positioning methods based on deep learning are surveyed.
Wireless communication and the cloud are extremely popular because of their convenience, portability and immediacy. As the first line of defense for mobile communication with intelligent devices, identity authentication plays an increasingly important role in system security and communication privacy. To enhance the security and privacy of wireless communication, this paper investigates an anonymous password-based remote user authentication and key exchange scheme for intelligent devices. First, the scheme achieves secure mutual authentication while preserving user privacy. Second, a client puzzle is employed to resist denial-of-service attacks, and session keys are established with forward secrecy. Finally, the analysis demonstrates that our scheme works effectively and resists common known attacks while maintaining efficient performance for mobile users.
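A client puzzle of the kind mentioned can be sketched as a hash partial-preimage search: the server issues a random nonce, the client burns CPU finding a solution, and verification costs the server only one hash, which is what makes connection floods expensive for the attacker but cheap to police. This is a generic construction, not the paper's exact scheme:

```python
import hashlib
import os

def make_puzzle(difficulty_bits=16):
    """Server side: issue a fresh random nonce and a difficulty level.
    The client must find s such that sha256(nonce || s) starts with
    `difficulty_bits` zero bits."""
    return os.urandom(16), difficulty_bits

def solve_puzzle(nonce, difficulty_bits):
    """Client side: brute-force search, ~2^difficulty_bits hashes expected."""
    target = 1 << (256 - difficulty_bits)
    s = 0
    while True:
        digest = hashlib.sha256(nonce + s.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return s
        s += 1

def verify_puzzle(nonce, difficulty_bits, s):
    """Server side: a single hash, so verification stays cheap under load."""
    digest = hashlib.sha256(nonce + s.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce, bits = make_puzzle(difficulty_bits=12)   # small difficulty for the demo
solution = solve_puzzle(nonce, bits)
```

In practice the server would bind the nonce to the connection attempt and raise `difficulty_bits` under attack, throttling adversaries without a per-client state table.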
Virtual reality (VR) systems have become popular in recent years, and the capture of 3D objects from the real world has been widely studied. 3D objects involve large and complex data. In this paper, we propose a novel method that uses shadow information and photometric stereo to recover the 3D shapes of surfaces from a single viewpoint. The experimental results show that the proposed method achieves good accuracy.
The artificial bee colony (ABC) algorithm shows relatively powerful exploration capability but a slow convergence rate, especially on unimodal functions. In this paper, an improved artificial bee colony algorithm is introduced to shorten its computation time. In the proposed algorithm, two novel update equations, utilizing the social experience of the whole population, are proposed to boost the performance of the employed bees and onlooker bees respectively. The effectiveness of our algorithm is validated on basic benchmark functions. Furthermore, a feed-forward artificial neural network model is also employed to verify the effectiveness of our algorithm. The experimental results show that the IUABC algorithm achieves better performance than the other compared algorithms.
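A minimal sketch of an ABC-style search whose candidate update adds a global-best attraction term as a stand-in for the "social experience of the whole population" (the coefficient ranges and the simplified single-phase loop are assumptions; the paper's two actual update equations are not reproduced here):

```python
import random

def gbest_update(x, partner, gbest, dim):
    """Mix the classic ABC neighbour term with a global-best term.
    Coefficient ranges are illustrative assumptions."""
    j = random.randrange(dim)          # perturb one randomly chosen dimension
    v = list(x)
    phi = random.uniform(-1.0, 1.0)    # classic neighbour coefficient
    psi = random.uniform(0.0, 1.5)     # attraction toward the global best
    v[j] = x[j] + phi * (x[j] - partner[j]) + psi * (gbest[j] - x[j])
    return v

def sphere(x):
    """Unimodal benchmark: f(x) = sum(x_i^2), minimum 0 at the origin."""
    return sum(xi * xi for xi in x)

def run_abc(n_bees=10, dim=5, iters=300, seed=1):
    random.seed(seed)
    pop = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_bees)]
    best = min(pop, key=sphere)
    for _ in range(iters):
        for i, x in enumerate(pop):
            partner = pop[random.randrange(n_bees)]
            v = gbest_update(x, partner, best, dim)
            if sphere(v) < sphere(x):  # greedy selection, as in standard ABC
                pop[i] = v
        cur = min(pop, key=sphere)
        if sphere(cur) < sphere(best):
            best = cur
    return best
```

Pulling candidates toward the population's best solution is what speeds convergence on unimodal functions, at some cost to exploration.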
With the development of computer vision and deep learning, convolutional neural networks have been widely used in image processing tasks such as object detection and semantic segmentation, and have achieved breakthrough results. However, when training samples are insufficient, a conventional neural network usually has unsatisfactory robustness. To solve this problem, we improve the generalization performance of few-shot detectors by focusing on the object center, enabling the identification of novel categories. The paper proposes a new attention mechanism based on an auxiliary circle feature map of the object center: a circle feature map centered on the object center, with the smaller of the object's height and width as its diameter, is added to the anchor-free CenterNet network as soft attention to promote network training. Several experiments on the PASCAL VOC 2007/2012 datasets show that the proposed method achieves state-of-the-art accuracy and standard deviation for few-shot object detection, which indicates the algorithm's effectiveness.
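The auxiliary circle feature map can be sketched as follows: a map centred on the object centre with min(height, width) of the box as diameter. The linear decay from the centre is an assumption for illustration; the paper specifies the circle, not the weight profile inside it:

```python
def circle_attention(height, width, box):
    """Soft-attention map for one object. `box` is (x0, y0, x1, y1);
    the circle is centred on the box centre with radius
    min(box width, box height) / 2. Weights decay linearly from 1 at
    the centre to 0 at the rim (assumed profile)."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    radius = min(x1 - x0, y1 - y0) / 2.0
    att = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
            if d <= radius:
                att[y][x] = 1.0 - d / radius
    return att
```

Such a map, multiplied element-wise into a feature tensor, concentrates the network's training signal around the object centre that CenterNet regresses.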
Many animals, including pets and assistance dogs for the physically disabled, are all around us. However, contact with them carries the risk of zoonotic diseases. One route of infection is through excrement, so we have developed a device that automatically folds pet sheets. In addition, we have developed a system that can remotely control this device so that it can be used in a car. The system consists of seven servo motors, a DC motor, and a pair of communication devices. When a button on the communication device is pressed while driving, the system automatically folds the pet sheets along the creases and throws them into the trash. This allows pet owners to dispose of waste without touching the pet sheet, thus preventing zoonotic diseases. Experiments on the disposal of pet sheets soiled with urine and feces showed a 100% success rate.
Telecare medicine information systems (TMIS) can provide remote users with high-quality services at home. However, it is not easy to ensure user information security in a complex network environment. This paper proposes a new biometric-based, mutually authenticated certificateless key agreement protocol for TMIS to protect user information security. The protocol is based on the difficulty of the discrete logarithm problem under the extended Canetti-Krawczyk (eCK) security model, and it uses biometrics to provide additional security for users. Meanwhile, the protocol can resist all kinds of existing attacks. Compared with existing key agreement protocols, the protocol has higher computing and communication efficiency.
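The discrete-logarithm hardness assumption underlying such key agreement can be illustrated with a toy Diffie-Hellman exchange (this is a generic textbook construction, not the paper's certificateless protocol; the tiny prime is for demonstration only, and real systems use vetted large groups or elliptic curves):

```python
import secrets

# Toy group parameters: prime modulus P and generator G.
P, G = 23, 5

def keypair():
    """Pick a private exponent and publish g^priv mod p. Recovering priv
    from the public value is the discrete logarithm problem."""
    priv = secrets.randbelow(P - 2) + 1   # private exponent in [1, P-2]
    pub = pow(G, priv, P)                 # public value
    return priv, pub

def shared_secret(my_priv, their_pub):
    """Both sides compute g^(ab) mod p without ever sending a or b."""
    return pow(their_pub, my_priv, P)
```

Authenticated protocols such as the one described above layer identity binding (here, via biometrics and certificateless keys) on top of this basic exchange so the shared secret cannot be established with an impostor.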
In the past few years, the development of natural language processing has made it possible to deal with many issues such as sentiment analysis, semantic analysis, and so on. This review first introduces the development of natural language processing, and then summarizes its applications in financial technology, focusing mainly on public opinion analysis, financial prediction and analysis, risk assessment, intelligent question answering, and automatic document generation. The analysis shows that natural language processing can bring its advantages into full play in the financial field. Moreover, this paper also discusses the problems and challenges for financial technology developed on the basis of natural language processing. Finally, this paper presents two developing trends of natural language processing in financial technology: deep learning and knowledge graphs.
As is well known, the lives of patients with Parkinson's disease (PD), which cannot be fundamentally cured, are thoroughly changed by the disease. Automatic identification of early Parkinson's disease from feature datasets has therefore attracted many medical researchers. At present, machine learning, and especially deep learning algorithms, have been widely adopted in classification and regression tasks. But labeled datasets are rare and expensive to label in many areas, e.g., aerospace and medicine. Transfer learning is often employed to solve problems with small training datasets. In this paper, we propose a parameter-based transfer learning algorithm to enhance the generalization ability of the network and avoid overfitting. A new method is then utilized to accelerate the training of the network, which helps the algorithm achieve results quickly. Finally, the Earth Mover's Distance (EMD), a distance metric between two probability distributions of images, is introduced into our transfer learning algorithm to enhance the precision of measurement. Experimental results compared with other modern algorithms on common Parkinson's datasets show the effectiveness of our algorithm.
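For 1-D histograms the Earth Mover's Distance has a simple closed form: the L1 distance between the cumulative distributions. A sketch (the paper applies EMD to image feature distributions, which this toy 1-D version does not attempt to model):

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms on the same
    unit-spaced bins, computed as the L1 distance between their
    normalized cumulative distributions (standard 1-D closed form)."""
    assert len(p) == len(q)
    sp, sq = sum(p), sum(q)
    cdf_p = cdf_q = 0.0
    dist = 0.0
    for pi, qi in zip(p, q):
        cdf_p += pi / sp
        cdf_q += qi / sq
        dist += abs(cdf_p - cdf_q)  # mass that must still be moved past this bin
    return dist
```

Unlike a bin-wise distance, EMD accounts for how far mass must travel: moving all mass from the first bin to the last of three bins costs 2 units, not a fixed mismatch penalty.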
Optical motion capture systems have been widely used in biomechanics, medical treatment, UAV control, and other fields. The accuracy of optical motion capture systems has been studied, but the relationship between the layout of the cameras in the detection area, the detection space boundary, and the stability of the calibration accuracy within the detection space has not. In this paper, the optical motion capture system is composed of 8 OptiTrack Prime 13 cameras arranged in a single ring at equal height and equal spacing. According to the size of the detection object, the pitch angle of the cameras is adjusted through calculation, and the maximum detection volume is determined. The system is calibrated within the detection volume to improve the calibration accuracy. Quantifying the layout data of the optical motion capture system makes the system's test data stable and reproducible, and consistency of the test system parameters is realized.
The quality of aluminum profiles is the most important evaluation criterion in industrial production. To perform quality control of aluminum profiles, strict defect detection must be carried out. Traditional machine learning methods need hand-crafted features designed in advance, while deep learning methods need anchor parameters preset according to all defects, which is inefficient and inaccurate. In this paper, we propose an adaptive anchor network with an attention-based refinement mechanism for defect detection. The network has learnable parameters to generate anchors adaptively. Meanwhile, to better represent the different defects, we design a refinement module with channel and spatial attention mechanisms and deformable convolution at the feature extraction stage. Besides, we also use a cascade detection architecture to retain more defect information. The proposed method achieves an AP of 62.4 and an AP50 of 86.1 on an industrial dataset, improving AP by 12.8 and AP50 by 17.8 over conventional methods and outperforming several state-of-the-art methods.
As information technology develops rapidly, large-scale personal data from sensors and IoT (Internet of Things) equipment is kept in clouds or data centers, and the data owner sometimes needs to publish the data. In the face of the risk of personal information leakage, how to take full advantage of the data has become a hot research topic: when data is published multiple times, personal privacy can be disclosed. Thus, this paper first puts forward a new clustering algorithm based on singular value decomposition, in which the ideas of distance and information entropy are used to flexibly adjust the trade-off between data availability and privacy protection. Secondly, this paper puts forward a dynamic update mechanism to ensure that personal data will not be leaked after multiple releases while minimizing information loss. Finally, the effectiveness and superiority of this method are verified by experiments.
Software Defined Networking is a popular network architecture that separates the control and data planes: control plane functions are implemented in software and moved from the network hardware to a controller node. To optimize the consistency of the network, the controller placement problem (CPP) can be treated as an optimization problem that minimizes the latency between nodes by choosing controller locations. In this paper, the grey wolf optimization algorithm, a modern evolutionary computation algorithm, is successfully applied to solving the CPP. The experimental results demonstrate the effectiveness of the proposed algorithm.
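A minimal sketch of grey wolf optimization applied to a toy controller placement instance, with latency proxied by Euclidean distance (the node layout, wolf count, and the simplified leader-update variant below are all illustrative assumptions, not the paper's formulation):

```python
import random, math

def gwo(fitness, dim, bounds, n_wolves=12, iters=100, seed=0):
    """Grey wolf optimizer sketch: non-leader wolves move toward the three
    best solutions (alpha, beta, delta); coefficient `a` decays from 2 to 0."""
    random.seed(seed)
    lo, hi = bounds
    wolves = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=fitness)
        alpha, beta, delta = wolves[0], wolves[1], wolves[2]
        a = 2.0 * (1 - t / iters)
        for i in range(3, n_wolves):
            new = []
            for j in range(dim):
                xs = []
                for leader in (alpha, beta, delta):
                    r1, r2 = random.random(), random.random()
                    A, C = a * (2 * r1 - 1), 2 * r2
                    xs.append(leader[j] - A * abs(C * leader[j] - wolves[i][j]))
                new.append(min(hi, max(lo, sum(xs) / 3.0)))  # clamp to bounds
            wolves[i] = new
    return min(wolves, key=fitness)

# Toy CPP: place k controllers so the worst node-to-nearest-controller
# distance (a latency proxy) is minimized.
NODES = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]

def cpp_cost(flat, k=2):
    ctrls = [(flat[2 * i], flat[2 * i + 1]) for i in range(k)]
    return max(min(math.dist(n, c) for c in ctrls) for n in NODES)
```

Real CPP instances are defined on graph topologies with shortest-path latencies; the continuous plane above merely shows how a latency objective plugs into the optimizer.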
Underwater images are an important carrier of marine information. However, low contrast, color degradation, uneven illumination, and detail loss degrade image quality. In this paper, we design a novel underwater image enhancement pipeline based on Retinex. Firstly, we use a bilateral filter instead of the traditional Gaussian filter to estimate the illumination image and obtain the reflectance image. Then, we design an attenuation-map-guided gray world method to overcome color distortion and use gamma correction to improve the illumination image. Finally, the fused image is post-processed to further improve the contrast and obtain the final enhanced image. Qualitative and quantitative performance analyses show that the proposed method outperforms other underwater image enhancement methods.
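The gray world and gamma correction steps can be sketched on raw pixel values (the paper's attenuation-map guidance and bilateral-filter illumination estimate are omitted; this is the classic unguided form of each operation):

```python
def gray_world(pixels):
    """Classic gray-world white balance on a list of (R, G, B) tuples:
    scale each channel so its mean matches the overall gray mean."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    return [tuple(min(255.0, p[c] * gray / means[c]) for c in range(3))
            for p in pixels]

def gamma_correct(value, gamma=0.7):
    """Gamma correction on a [0, 255] intensity; gamma < 1 brightens
    the darker parts of the illumination component."""
    return 255.0 * (value / 255.0) ** gamma
```

In the described pipeline, the gray-world correction would be applied to the reflectance image (guided by the attenuation map) and the gamma curve to the bilateral-filtered illumination image before fusion.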
Medical imaging, used for both diagnosis and therapy planning, is evolving towards multi-modality acquisition protocols. Manual segmentation of 3D images is a tedious task and prone to intra- and inter-expert variability, while automatic segmentation exploiting the characteristics of multi-modal images is still a difficult problem. Towards this end, positron emission tomography (PET) and computed tomography (CT) are widely used. PET imaging has high contrast but often blurry tumor edges due to its limited spatial resolution, while CT imaging has high resolution but low contrast between a tumor and its surrounding normal soft tissues, so tumor segmentation from either a single PET or CT image is difficult. It is known that co-segmentation methods utilizing the complementary information between PET and CT can improve segmentation accuracy. This complementary information can be either consistent or inconsistent at the image level, and correctly localizing tumor edges under inconsistent information is one major challenge for co-segmentation. Aiming to solve this problem, a novel joint level set model is proposed that combines the evidence of PET and CT in a unified energy form, achieving co-segmentation in the two modalities. The convergence of the co-segmentation model corresponds to the optimal tradeoff between PET and CT. The different characteristics of the two imaging modalities are considered in an adaptive convergence process, which starts mostly from the PET evidence to constrain the tumor location and ends mostly with the CT evidence to delineate boundary details. The adaptability of the proposed model is realized automatically by stepwise moderation of the joint weights during the convergence process. The performance of the proposed model is validated on 20 non-small cell lung tumor PET-CT images. It achieves an average Dice similarity coefficient (DSC) of 0.846±0.064 and positive predictive value (PPV) of 0.889±0.079, demonstrating the high accuracy of the proposed model for PET-CT lung tumor co-segmentation.
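The two reported metrics can be computed as follows for binary segmentation masks (a generic sketch, not the authors' evaluation code):

```python
def dice_coefficient(a, b):
    """Dice similarity coefficient between two binary masks given as
    flattened 0/1 sequences: DSC = 2|A intersect B| / (|A| + |B|)."""
    inter = sum(x and y for x, y in zip(a, b))
    size = sum(a) + sum(b)
    return 2.0 * inter / size if size else 1.0

def positive_predictive_value(pred, truth):
    """PPV (precision): |A intersect B| / |A| for predicted mask A
    and ground-truth mask B."""
    inter = sum(x and y for x, y in zip(pred, truth))
    total = sum(pred)
    return inter / total if total else 0.0
```

DSC rewards overall overlap symmetrically, while PPV specifically penalizes over-segmentation, which is why the two are usually reported together.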
Different from object tracking on the ground, underwater object tracking is challenging due to image attenuation and distortion. The challenges are compounded by the high-freedom motion of targets under water: target rotation, scale change, and occlusion significantly degrade the performance of various tracking methods. Aiming to solve the above problems, this paper proposes a multi-scale underwater object tracking method based on adaptive feature fusion. Gray, HOG (Histogram of Oriented Gradients), and CN (Color Names) features are adaptively fused in the background-aware correlation filter (BACF) model. Moreover, a novel scale estimation method and a high-confidence model update strategy are proposed to comprehensively address the problems caused by scale changes and background noise. Experimental results demonstrate a success ratio of 64.1% on the AUC criterion, better than the classic BACF and other methods, especially in challenging conditions.
Because of the openness of the cloud storage architecture and the sharing of resources, data owners have lost control of their stored data, leading to frequent leakage of private user data; security issues have become a significant factor restricting the development of cloud storage. In this paper, three key algorithms, for key distribution, hybrid encryption, and dynamic key update, are proposed for the key issues of data sharing security in the cloud storage environment. Based on these algorithms, a data security sharing model for the cloud storage environment is proposed to solve the trust dependence, user collusion attack, and dynamic data security issues in that environment, protecting private data during storage and sharing. The analysis of the experimental results shows that the data security sharing technology can resist chosen-plaintext and collusion attacks. Therefore, a cloud storage system that encrypts shared data using this technology can effectively protect data confidentiality in the cloud storage context.
Hybrid encryption algorithms are flexible tools for modeling the correlation of random variables. They cover the range from completely negative correlation to positive correlation, include independent cases, and contain both asymmetric correlation and the broadly employed Gaussian correlation structure. The pair-encryption construction of the hybrid encryption algorithm takes advantage of the ease of use of the two-variable encryption algorithm, and it is recommended to decompose the hybrid encryption algorithm into a set of two-variable encryption algorithms. We have successfully applied this method to spatial data and established a powerful interpolation method on the basis of the spatial logarithm.
Interactive image segmentation can improve segmentation performance through manual intervention. Traditional interactive segmentation methods have unsatisfactory accuracy for images with complex backgrounds, while deep learning-based methods depend on large, accurately annotated datasets. In this paper, we propose an online interactive segmentation method based on a graph convolutional network (GCN), which combines the advantages of these two types of methods. We present a pre-segmentation stage to obtain an initial segmentation of the image, then propose an interactive GCN (iGCN) module to further improve the accuracy of the initial segmentation. Moreover, the iGCN module is trained online without any pre-training burden. Experimental results show that our method outperforms several state-of-the-art methods on the GrabCut and Berkeley datasets.
As sophisticated network attack methods increase, decentralized security event processing based on a single device can no longer meet the current needs of network security management. Security event correlation analysis technology analyzes various security events through correlation, so that meaningful security events can be accurately judged and extracted. This paper proposes an information security event correlation analysis method based on an adaptive optimization algorithm that imitates the formation of clusters in a two-dimensional gas, where particles do not come to rest until they irreversibly collide and "stick" together. A simulation is established to analyze the aggregation of information security events.
This paper addresses the Bayesian estimation of the inverted Beta-Liouville mixture model (IBLMM), which has a fairly flexible capability for modeling positive data. This problem does not usually admit an analytically tractable solution. Sampling approaches (e.g., Markov chain Monte Carlo (MCMC)) can be used, but they are usually computationally demanding and may therefore be impractical for real-world applications. We instead adopt the recently proposed extended variational inference (EVI) framework to address the problem in an elegant way. First, some lower bound approximations are introduced to the evidence lower bound (ELBO) (i.e., the original objective function) in the conventional variational inference (VI) framework, which yields a computationally tractable lower bound. Then, a closed-form analytical solution can be derived by taking this bound as the new objective function and optimizing it with respect to the individual variational factors. We verify the effectiveness of this method in two real applications, namely text categorization and face detection.
Visual relation detection (VRD) aims to describe images with relation triplets like &lt;subject, predicate, object&gt;, paying attention to the interaction between every two instances. To detect the visual relations that express the main content of a given image, visual relation of interest detection (VROID) has been proposed as an extension of the traditional VRD task. Existing methods for the general VRD task are mostly based on instance-level features, and the methods that adopt detailed information use only part-level attention or human body parts; none take advantage of general semantic parts. Therefore, on the basis of the IPNet for VROID, we further propose an interest propagation from part (IPFP) method, which propagates interest along "part-instance-pair-triplet" to detect visual relations of interest. The IPFP method consists of five modules: the Panoptic Object-Part Detection (POPD) module, the Part Interest Prediction (PartIP) module, the Instance Interest Prediction (InstIP) module, the Pair Interest Prediction (PairIP) module, and the Predicate Interest Prediction (PredIP) module. The POPD module extracts instances with instance features and instance parts with part features; the PartIP module predicts interest for every single part; the InstIP module predicts interest for every single instance; the PairIP module predicts interest for each pair of instances; and the PredIP module predicts possible predicates for each instance pair. The interest score of a visual relation is the product of the pair interest score and the predicate possibility for that pair. We evaluate the performance of the IPFP method and the effectiveness of its important components on the ViROI dataset for VROID.
Infrared pedestrian detection often suffers from two problems: 1) the weak features of infrared images result in false alarms; 2) the generalization ability of infrared pedestrian detection methods is not satisfactory, since infrared images are similar to one another due to the limited acquisition methods. To solve these problems, we propose a multi-task infrared pedestrian detection method. Firstly, domain adaptation is introduced to align the features of visible light images and infrared images, so that visible light images can be used as additional data to improve scene diversity and generalization ability. Secondly, a U-Net segmentation network is used to predict the pedestrian activity area, and detected objects in non-pedestrian regions are filtered out to reduce false alarms. The experimental results show that, compared with EfficientDet, our method improves the average precision (AP) by 1.4% on the XDU-NIR2020 dataset.
Information on flower number per grapevine inflorescence is critical for grapevine genetic improvement, early yield estimation, and vineyard management. Previous approaches to automating this process with traditional image processing techniques have not succeeded in producing a universal system that can be applied to inflorescences with diverse morphology. In this paper, we validated the efficiency of advanced deep learning-based approaches for automatic flower number estimation. The results were analyzed on an inflorescence dataset of 204 images from four different cultivars during various growth stages. The algorithm based on patch-wise instance segmentation using Mask R-CNN produced counting results highly correlated with manual counts (R2 = 0.96), with practically constant MAPE values across the different cultivars (from 5.50% to 8.45%), implying high robustness. Achieving the fastest counting (0.33 seconds per image of size 512 × 512) with slightly lower counting accuracy (R2 = 0.91), the method based on object density maps using U-Net turned out to be suitable for real-time flower counting systems.
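The two reported accuracy measures, R2 against manual counts and MAPE, can be computed as follows (a generic sketch, not the authors' evaluation code):

```python
def r_squared(pred, obs):
    """Coefficient of determination between predicted and observed counts:
    R2 = 1 - SS_res / SS_tot."""
    mean = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def mape(pred, obs):
    """Mean absolute percentage error, in percent; sensitive to relative
    rather than absolute counting errors."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)
```

Reporting both is informative here: R2 captures correlation with manual counts across inflorescences, while per-cultivar MAPE exposes whether the relative error stays stable across morphologies.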
At present, the most commonly used and simplest physical isolation scheme equips users with two sets of computers to access the internal network and the external network respectively, but this brings great inconvenience to information exchange. Based on this, this paper designs and implements a system for file transfer between internal and external network devices under physical isolation based on visual recognition. On the basis of QR-code picture transmission, it also realizes a QR-code video transmission mode for large files, which further improves transmission efficiency. In addition, the system adds the ability to fetch pictures or database information from the other computer on instruction and return them to the originating computer, which is more in line with actual usage demands.
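The chunked transfer underlying such a QR video mode can be sketched as follows: split the file into sequence-numbered frames small enough for one QR code each, and reassemble on the receiving side (the header format and chunk size are assumptions; actual QR capacity depends on the code version and error-correction level, and the paper's frame format is not specified):

```python
import base64

def to_qr_payloads(data: bytes, chunk_size: int = 1000):
    """Split a file into ordered payload strings, one per QR frame.
    A 'seq/total:' header lets the receiver reorder frames and detect
    missing ones."""
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    total = len(chunks)
    return [f"{i}/{total}:" + base64.b64encode(c).decode()
            for i, c in enumerate(chunks)]

def from_qr_payloads(payloads):
    """Reassemble the original bytes from possibly out-of-order payloads."""
    decoded = {}
    for p in payloads:
        header, b64 = p.split(":", 1)
        seq, total = map(int, header.split("/"))
        decoded[seq] = base64.b64decode(b64)
    assert len(decoded) == total, "missing frames"
    return b"".join(decoded[i] for i in range(total))
```

Base64 keeps the payload in the character set QR encoders handle efficiently, and the sequence header is what makes a video of successive codes reassemble deterministically even if frames are captured out of order.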
The OptiTrack system is one of the most advanced optical motion capture systems of recent years. Camera calibration is an indispensable part of optical motion capture; its function is to convert the 2D data obtained by each camera into spatial 3D data across multiple cameras. The mean error reflects the numerical relationship between the system coordinates and the world coordinates of the measurement area, so the mean error is directly related to the measurement accuracy of the OptiTrack system. In this paper, the influence of the average calibration error on measurement stability and accuracy was obtained by repeated calibration and sampling under controlled variables.
The traditional tag generation method for text resources relies only on information in the text itself. It ignores words with low frequency but high topic relevance, resulting in low tag generation accuracy. Therefore, building on the traditional TextRank model, this paper uses the document-topic distribution and the word distributions under the corresponding topics to measure the importance of words in a document and to adjust the random jump probability of nodes. Then, the similarity between word vectors and statistical feature information are used to iteratively update the weights of word nodes. As a result, a new word graph model is constructed to generate text tags. Compared with traditional TF-IDF, TextRank, and other related algorithms, the experimental results of our model on real datasets demonstrate the effectiveness of the proposed method, which improves accuracy, recall, and F-measure.
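Standard TextRank spreads the random-jump probability uniformly over nodes; the idea above replaces it with a topic-biased distribution so that topically relevant but rare words gain rank. A minimal sketch of that one change, with graph, damping factor, and topic weights as illustrative assumptions:

```python
def topic_textrank(neighbors, topic_weight, d=0.85, iters=50):
    """neighbors: word -> list of linked words; topic_weight: word -> topic relevance."""
    words = list(neighbors)
    total = sum(topic_weight[w] for w in words)
    jump = {w: topic_weight[w] / total for w in words}   # biased jump term
    score = {w: 1.0 / len(words) for w in words}
    for _ in range(iters):
        new = {}
        for w in words:
            # Rank flowing in from neighbors, as in plain TextRank.
            rank = sum(score[v] / len(neighbors[v])
                       for v in words if w in neighbors[v])
            new[w] = (1 - d) * jump[w] + d * rank
        score = new
    return score
```

With uniform `topic_weight` this reduces to ordinary TextRank, which makes the effect of the bias easy to isolate in experiments.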
Image denoising is an important topic in the field of image processing. With the application of nonlocal similarity in sparse representation, image denoising began to be performed on groups of similar patches, where the sparse representations of the patches in a group are learned jointly. In this paper, we propose a novel image denoising model that combines the group sparsity residual with low-rankness. Firstly, motivated by the relationship between low rank and sparsity, a low-rank constraint is imposed on the sparse coefficient matrix of each similar patch group to enhance sparsity. Secondly, since the 𝛾-norm most closely matches the true rank of a matrix, it is applied for rank approximation in our model. Finally, in view of the fact that the group sparse representation (GSR) model requires numerous iterations, we develop an efficient algorithm based on Majorize-Minimization (MM) optimization, which greatly reduces the computational complexity and the number of iterations. Experimental results show that our model makes great improvements in image denoising and outperforms many state-of-the-art methods.
With the construction of the smart grid, a large amount of user-side power data has accumulated. This paper proposes a method for analyzing users' power consumption behavior based on a clustering algorithm. Firstly, the user load data are classified by season, and the user's seasonal power characteristics are analyzed according to the typical daily load curve of each season. Then, using average temperature together with load data as features, the K-means clustering algorithm is applied to explore the influence of temperature and holidays on users' electricity behavior in summer and winter, respectively. The proposed method for classifying and analyzing the different power consumption modes of a single user provides data support for subsequent similar-day load prediction model training, as well as for fine-grained management and demand-side management decisions for the power grid.
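The clustering step described above can be sketched with a plain K-means loop over daily feature vectors (for instance, average temperature prepended to load readings); the feature construction, initial centers, and number of clusters here are assumptions for illustration.

```python
import math

def kmeans(points, centers, iters=20):
    """points, centers: lists of equal-length tuples; returns (centers, labels)."""
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda j: math.dist(p, centers[j])) for p in points]
        # Move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels
```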
The analysis of magnetic resonance (MR) images plays an important role in medical diagnosis. Localizing the anatomical structure of lesions or organs is a very important preprocessing step in clinical treatment planning, and the accuracy of localization directly affects the diagnosis. We propose a multi-agent deep reinforcement learning-based method for prostate localization in MR images. We construct a collaborative communication environment for multi-agent interaction by sharing the parameters of the convolution layers across all agents. Because each agent needs to make action decisions independently, the fully connected layers are kept separate for each agent. In addition, we present a coarse-to-fine multi-scale image representation method to further improve the accuracy of prostate localization. The experimental results show that our method outperforms several state-of-the-art methods on the PROMISE12 test dataset.
Fully convolutional Siamese networks for object tracking formulate tracking as cross-correlation between the convolutional features of a target template and a search region, which enables real-time tracking. However, when interference factors similar to the target object are present, Siamese trackers still show an accuracy gap compared with state-of-the-art algorithms. Therefore, we propose an object tracker based on a response-map-fusion Siamese network (Siam-RMF). Unlike the fully convolutional Siamese tracker, Siam-RMF does not perform similarity learning on the features of the last network layer alone, but extracts the features of the last three layers. Moreover, we propose a new model architecture that performs layer-wise and depth-wise aggregation: depth-wise separable convolution is used to learn similarity for each layer separately, yielding an effective fusion of the corresponding depth-wise cross-correlation response maps. The fused response maps effectively avoid the loss of spatial information after multi-layer feature extraction. Experimental results on TB50 and UAV123L demonstrate the effectiveness of the proposed tracker without decreasing tracking speed, and show stronger robustness and better tracking performance in complex environments.
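The cross-correlation at the heart of Siamese trackers slides the template over the search region and records a similarity score at each offset; the peak of the resulting response map locates the target. A toy version on raw 2D grids (real trackers apply this per channel on deep features):

```python
def cross_correlate(search, template):
    """Naive 2D cross-correlation: response map over all valid offsets."""
    th, tw = len(template), len(template[0])
    sh, sw = len(search), len(search[0])
    out = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            s = sum(search[y + i][x + j] * template[i][j]
                    for i in range(th) for j in range(tw))
            row.append(s)
        out.append(row)
    return out
```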
Aiming at the problems of object tracking in real scenes, such as complex backgrounds, illumination changes, fast motion, and object rotation, this paper proposes an object tracking algorithm via adaptive multi-feature fusion. The HOG feature of the object is extracted, and convolutional neural networks are used to extract high-level and low-level convolutional features; an adaptive threshold segmentation method is then used to evaluate the effectiveness of each feature and obtain the feature fusion weights. The response map of each feature is fused according to its weight coefficient to obtain the new estimated position of the object, and the object scale is calculated by a scale correlation filter to complete the tracking. Experiments were conducted on the OTB-2013 dataset. The two convolutional features and the HOG feature are adaptively fused so that the more discriminative feature receives a greater fusion weight, which better expresses the appearance model of the object and yields strong tracking accuracy in scenes with complex backgrounds, object disappearance, illumination change, fast motion, and rotation.
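The fusion step can be sketched as follows: each feature's response map receives a weight from a confidence score (here a simple peak-to-mean ratio, an assumed stand-in for the paper's adaptive threshold evaluation), and the maps are combined as a weighted sum before locating the new position.

```python
def fuse_response_maps(maps):
    """Weighted fusion of same-size 2D response maps; returns (fused, weights)."""
    def confidence(m):
        flat = [v for row in m for v in row]
        return max(flat) / (sum(flat) / len(flat) + 1e-9)  # peak-to-mean ratio
    weights = [confidence(m) for m in maps]
    total = sum(weights)
    weights = [w / total for w in weights]
    h, w_ = len(maps[0]), len(maps[0][0])
    fused = [[sum(wt * m[y][x] for wt, m in zip(weights, maps))
              for x in range(w_)] for y in range(h)]
    return fused, weights
```

A sharply peaked map (one confident feature) dominates the sum, while a flat, uninformative map contributes little, which is the intuition behind weighting by discriminativeness.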
Depression is a mood disorder characterized by significant and lasting low mood, which seriously affects people's physical and mental health. In recent years, the number of people suffering from depression has gradually increased. To improve the recognition rate of depression and reduce the workload of doctors, this paper applies the deep learning components BiLSTM (Bi-directional Long Short-Term Memory) and Attention to depression recognition. BiLSTM is used to extract the contextual temporal information of text features and facial features, while Attention is used to learn the correlation between the visual and text modalities. Extensive experiments demonstrate the network's effect, and the results show that this method has practical application value for depression recognition.
The analysis of the microscopic pore characteristics of geological reservoirs is a key part of unconventional oil and gas exploration. Current analysis methods still have problems such as small analysis areas, poor analysis accuracy, and weak quantitative control over human factors. To address these problems, this research adopts an artificial intelligence image analysis method: using image processing and computer vision technology, it proposes a pore contour fitting and extraction algorithm, a pore aggregation degree discrimination algorithm, and a convex hull calculation algorithm to analyze the images. A large number of scanning electron microscope images of shale gas reservoirs were used in the experiments. The results show that, compared with known qualitative methods for micropore characteristics, this method reduces the intervention of human factors, improves analysis efficiency, and obtains more accurate analysis results. Keywords: image processing, shale matrix, pore structure analysis, cracks.
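The convex hull calculation mentioned above can be done with Andrew's monotone-chain algorithm; this standalone 2D version (input would be pore contour pixel coordinates in the paper's setting) returns the hull vertices in counter-clockwise order:

```python
def convex_hull(points):
    """Andrew's monotone chain; points: iterable of (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):  # z-component of (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]
```

The ratio of a pore's area to its hull's area is one simple convexity measure that such an algorithm enables.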
The detection of mobile robot performance often requires costly, high-precision trajectory tracking equipment, which testing laboratories often cannot afford, so finding equipment that is inexpensive yet meets the detection requirements is a difficult problem. HTC's Vive Tracker, an inexpensive virtual reality (VR) accessory, provides good measurement accuracy. In this study, the Vive Tracker is used for mobile robot performance detection, focusing on the device's ability to capture and analyze the mobile robot's movement trajectory. Tests and analysis verify that the Vive Tracker meets the accuracy requirements of mobile robot detection, captures the robot's movement trajectory well, and can be used for mobile robot performance analysis.
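One simple way to quantify whether a low-cost tracker meets an accuracy requirement is the RMSE between the captured trajectory and a reference trajectory sampled at the same instants; the sketch below assumes the two streams have already been time-aligned and paired, which in practice is a separate step.

```python
import math

def trajectory_rmse(captured, reference):
    """Root-mean-square error between paired trajectory samples."""
    assert len(captured) == len(reference)
    se = [math.dist(c, r) ** 2 for c, r in zip(captured, reference)]
    return math.sqrt(sum(se) / len(se))
```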
A cruise ship cabin has a complex structure and dense occupancy. Once a fire occurs, it seriously threatens personal safety, causes immeasurable property loss, and greatly complicates rescue. Aiming at the practical problem of frequent cruise ship fires, this paper uses the FDS fire simulation software to conduct a numerical simulation of a cruise ship cabin. The fire source is set in the cabin, and detectors are installed at different positions along the corridor away from the fire source. Through the simulation experiments, diagrams of the CO concentration change, temperature change, and smoke spread distribution are obtained, which provide a theoretical basis for the effective control of fire spread and for personnel evacuation.
Nowadays, medical image fusion serves as a significant aid for precise diagnosis and surgical navigation. In this paper, we propose a novel tensor-factorization-based fusion strategy that combines the multimodal, multiscale nature of medical images with the multiway structure of tensors. Since our model adopts a sparse representation (SR) prior, the L1-norm regularization term causes systematic underestimation of the true solution. To address this problem, we introduce the generalized minimax-concave (GMC) penalty into our framework; although it is a non-convex regularization term itself, it allows the whole cost function to remain convex. Furthermore, we combine the alternating direction method of multipliers (ADMM) algorithm and the forward-backward (FB) method to carry out the optimization. We conduct extensive experiments on five kinds of practical medical image fusion problems with 96 pairs of images in total. The results confirm that our model achieves great improvements in visual performance and objective metrics over existing methods.
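For reference, the GMC penalty as commonly defined in the literature (the exact variant used in the paper may differ) is built by subtracting a smoothed envelope from the L1 norm; the matrix parameter B is chosen so that the overall cost stays convex:

```latex
% GMC penalty (Selesnick-style definition; assumed, not taken from the paper)
\psi_B(x) \;=\; \|x\|_1 \;-\; \min_{v}\Bigl\{ \|v\|_1 + \tfrac{1}{2}\,\|B(x - v)\|_2^2 \Bigr\}

% The regularized cost
% F(x) \;=\; \tfrac{1}{2}\,\|y - Ax\|_2^2 \;+\; \lambda\,\psi_B(x)
% remains convex provided  B^{\mathsf{T}} B \preceq \tfrac{1}{\lambda} A^{\mathsf{T}} A.
```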
Internet-based online monitoring technology has been widely used for temperature monitoring of high-voltage transmission lines. However, due to data volume and other factors, multiple lines may experience delays at a given time, and a given line may repeatedly report anomalies; manual processing is slow and inefficient. To improve data transmission efficiency, speed up line temperature prediction, and grasp the line temperature more accurately, we compare the accuracy and training time of line temperature prediction under several existing neural networks. We then propose a temperature prediction method based on an LSTM-ELM network under broadband and narrowband fusion, which unifies the broadband and narrowband structure and transmits data of different sizes through a unified frequency band. The LSTM-ELM network is established to extract data features, analyze temperature data, and realize rapid prediction of the line temperature trend. The experimental results show that the prediction accuracy of the LSTM-ELM network reaches 92.02% while the prediction time is greatly reduced to 863.68 s; compared with the traditional LSTM network, the prediction time is improved by nearly 2000%, which can provide a reliable basis for back-end management in engineering practice.
Due to the nonlinear and underactuated characteristics of unmanned surface vehicle (USV) systems and the uncertainty of the environmental model, it is hard to establish an accurate dynamic model, and the control laws obtained by traditional algorithms are too complex for practical engineering realization. In this paper, based on the deep reinforcement learning algorithm of deep deterministic policy gradients (DDPG), the line-of-sight algorithm is first used to obtain the expected heading angle of the USV from its current position and the desired trajectory. Meanwhile, we adopt a double Gaussian reward function to evaluate the training actions, so as to obtain the optimal control action and realize accurate tracking control. Finally, compared with an explicit model predictive controller and a linear quadratic regulator, the designed DDPG-based track controller has a shorter settling time and smaller overshoot.
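The line-of-sight guidance step can be sketched as computing the bearing from the USV's current position to a look-ahead point on the desired path segment; the look-ahead distance is an assumed tuning parameter, and this is an illustration of the general LOS idea rather than the paper's exact formulation.

```python
import math

def los_heading(pos, path_pt, next_pt, lookahead=1.0):
    """Expected heading angle (rad) toward a look-ahead point on the path."""
    dx, dy = next_pt[0] - path_pt[0], next_pt[1] - path_pt[1]
    seg = math.hypot(dx, dy) or 1.0
    # Advance along the path segment by the look-ahead distance.
    target = (path_pt[0] + lookahead * dx / seg,
              path_pt[1] + lookahead * dy / seg)
    return math.atan2(target[1] - pos[1], target[0] - pos[0])
```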
Deep neural networks are frequently used to automate the examination of radiographic images in medicine. These approaches may be trained on huge datasets, or pre-trained networks may be used to extract features from small datasets. Given the lack of large pulmonary tuberculosis datasets, tuberculosis can be diagnosed using pre-trained deep convolutional neural networks. Thus, this article aims to detect and diagnose tuberculosis in chest X-rays by combining a pre-trained deep convolutional neural network with a machine learning model. We combine the pre-trained DenseNet201 network with the XGBoost machine learning classifier to create a hybrid model for classifying patients as tuberculosis-infected or not. The proposed model extracts features using the pre-trained DenseNet201 network and classifies them with the XGBoost classifier. We performed extensive experiments to assess the performance of the proposed DenseNet201-XGBoost model using tuberculosis chest X-ray images. A comparative study shows that the proposed DenseNet201-XGBoost-based tuberculosis classification model outperforms other competing approaches.
Traditional coal preparation methods include jigging, dry separation, and γ-ray coal preparation. Although these methods achieve coal separation, they have problems such as low accuracy, high cost, long processing time, and serious health hazards. Aiming at these problems, an improved threshold recognition method based on X-ray images is developed. First, images of the coal and the gangue are obtained using an X-ray scanner, and the gray values are extracted. Second, the thickness of the coal and the gangue is calculated. Third, the gray value and thickness information of the coal and the gangue are combined to determine the separation threshold. Finally, recognition of the coal and the gangue is realized. The experimental results show that the recognition accuracy can reach about 98%.
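The final decision rule can be pictured as combining the measured gray value with the estimated thickness and comparing the result against a separation threshold. In the toy version below, the linear combination and threshold value are invented for illustration; the paper derives the actual relationship from X-ray calibration data.

```python
def classify(gray, thickness, a=1.0, b=-0.5, threshold=100.0):
    """Classify a region as coal or gangue from gray value and thickness."""
    score = a * gray + b * thickness   # thickness-corrected attenuation (assumed form)
    return "gangue" if score > threshold else "coal"
```

Correcting for thickness is the key point: a thick piece of coal and a thin piece of gangue can produce similar raw gray values, so gray value alone cannot separate them.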
For the current problem of small target detection, this paper first reviews the development and current state of target detection algorithms and systematically summarizes research progress on target detection against complex ground backgrounds. Secondly, we cover two major categories, hyperspectral small target detection and infrared small target detection, analyzing the different methods within each category. We then take representative algorithms as examples to analyze their detection performance and their application under realistic complex ground background conditions. Finally, we offer prospects and predictions for each type of algorithm in complex-ground-background target detection, providing a reference for future research on small target detection.
To better suppress the reflection layer in images shot through glass, we propose a reflection suppression model that highlights the main information of the photographed scene. We combine the local linear model of a guided filter with a gradient threshold to enhance the boundary contours of the image and achieve the effect of suppressing reflections, and we efficiently solve the resulting partial differential equations using the discrete cosine transform. Experiments on images taken in different scenes demonstrate the superiority of this method for the problem of single-image reflection suppression.
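The gradient-threshold idea can be illustrated in 1D: gradients with small magnitude (assumed to come from the faint reflection layer) are zeroed, and the signal is rebuilt by cumulative summation. The threshold value here is an assumption; the paper works in 2D, where reintegration leads to the PDE solved with the discrete cosine transform.

```python
def suppress_small_gradients(signal, threshold):
    """Zero gradients below threshold, then reintegrate the 1D signal."""
    grads = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
    grads = [g if abs(g) > threshold else 0.0 for g in grads]
    out = [signal[0]]
    for g in grads:
        out.append(out[-1] + g)
    return out
```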
Person re-identification (ReID) is an important task in computer vision. Most methods based on supervised strategies have achieved high performance. However, performance cannot be maintained when these methods are applied without labels because styles in different scenes exhibit considerable discrepancy. To address this problem, we propose an attention mutual teaching (AMT) network for unsupervised domain adaptation person ReID. The AMT method improves the performance of a model through iterative clustering and retraining. Meanwhile, two attention modules can teach each other to reduce clustering noise. We conduct extensive experiments on the Market-1501 and DukeMTMC-reID datasets. The experiments show that our approach performs better than state-of-the-art unsupervised methods.
Pedestrian detection is a popular and challenging topic in the computer vision field. The Histograms of Oriented Gradients (HOG) feature is widely used in pedestrian detection because of its high accuracy; nonetheless, its descriptive capacity needs further improvement, so I-HOG (Improved HOG) is proposed. I-HOG makes two major improvements. First, it enhances the description of edge features: by computing block histograms at different scales over a set of correlation graphs, it establishes correlations between pieces of feature information. Second, I-HOG uses multi-scale feature extraction to capture broader edge description information, making up for the deficiency that HOG features are extracted only at a fixed block size. The experimental results on the INRIA database show that, using I-HOG, the detection rate increases by 5.4% and 4.3% respectively, and combined with the CSS feature the detection rate increases by 2.8% and 4.0% respectively, compared with HOG.
A passive video surveillance system is designed in this paper. It captures images through a camera sensor and displays them in a client program without batteries or complicated wiring. The system consists of three modules: energy harvesting, data transmission, and data processing. A signal transmitter emits UHF radio frequency signals into the environment; the system converts the received radio frequency signal into an electrical signal and stores it in an on-board supercapacitor to power each device in the system. A low-power microprocessor schedules the image capture task, and the data are transmitted to a host on the same local network through a Wi-Fi module.
Hampered by low-resolution images and straightforward features, existing R-CNN-based methods for pulmonary nodule detection usually fail to detect objects at small scales. In this paper, we propose a novel context-aware network that takes the pulmonary regions and their neighbors for joint learning. The contextual cues of these regions reinforce each other, which benefits the detection of small regions. Moreover, a set of redesigned anchors is used to adapt to pulmonary nodules of various sizes. To avoid dilution by redundant samples of large nodules, a data enhancement strategy that identifies hard samples is implemented in the training stage. We test the proposed network on a dataset of 2000 lung images and demonstrate that it performs well in detecting lung nodules of various sizes, improving on the original Faster R-CNN by 7%.
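Redesigning anchors amounts to generating candidate boxes whose scales and aspect ratios match the objects of interest; a minimal generator for one feature-map location is sketched below. The default scale and ratio lists are illustrative assumptions; the paper tunes them to typical nodule sizes.

```python
def make_anchors(cx, cy, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Centred (x1, y1, x2, y2) anchor boxes; area is preserved across ratios."""
    anchors = []
    for s in scales:
        for r in ratios:
            w, h = s * r ** 0.5, s / r ** 0.5  # w * h == s * s
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors
```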
Faster RCNN, as a classical detection algorithm, still faces a huge challenge in detecting small objects. Therefore, we introduce a multi-scale auxiliary feature fusion strategy to make sure that each layer of features contains rich semantic and spatial information. Firstly, we introduce shallow features extracted by a multi-scale auxiliary feature network into the backbone network, ensuring that even the deepest feature retains sufficient spatial information for detecting small objects. Secondly, we design a fusion module to fuse the auxiliary feature and the backbone feature. Finally, to make the positioning of object proposal boxes more precise in the ROI classification and regression network, we replace RoIPool with RoIAlign. Our experiments are conducted on the PASCAL VOC and KITTI autopilot datasets. Compared with conventional methods, the improved Faster RCNN algorithm improves mean average precision by 2.48% and 3.09% on the PASCAL VOC and KITTI datasets, respectively.
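The reason RoIAlign positions boxes more precisely than RoIPool is that it does not snap coordinates to the integer grid: it samples the feature map at fractional positions with bilinear interpolation. A single-sample-per-bin sketch of that sampling step, on a plain 2D grid for clarity:

```python
def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate fmap (list of lists) at fractional (y, x)."""
    y0, x0 = int(y), int(x)
    y1 = min(y0 + 1, len(fmap) - 1)
    x1 = min(x0 + 1, len(fmap[0]) - 1)
    dy, dx = y - y0, x - x0
    return (fmap[y0][x0] * (1 - dy) * (1 - dx) + fmap[y0][x1] * (1 - dy) * dx
            + fmap[y1][x0] * dy * (1 - dx) + fmap[y1][x1] * dy * dx)
```

RoIPool would instead round (y, x) to the nearest cell, introducing misalignments of up to half a cell that are significant for small objects.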
Massive MIMO and beamforming are primary technologies in 5G and beyond (B5G). Line-of-sight (LOS) beamforming forms a beam path for a specific user to significantly compensate for the high attenuation of millimeter-wave (mmWave) signals. This paper considers serving multicast users using beamforming based on a massive MIMO antenna array, aiming to optimize beam utilization and maximize the number of successfully served users. In particular, we divide the massive MIMO antenna array into multiple smaller sub-arrays of different scales to form flexible beams with various beam widths and transmission coverage, serving unicast or multicast users as needed. Accordingly, we propose an Integer Linear Programming (ILP) optimization model that minimizes the power consumption of the entire antenna array while maximizing the number of successfully served users via multicast and/or unicast beamforming, subject to constraints on power consumption and signal-to-interference-plus-noise ratio (SINR). The proposed optimization model is evaluated numerically, and the results show that significant power can be saved using multicast beamforming.