The accuracy of image semantic segmentation directly affects how well autonomous driving systems perceive their surroundings. Deep learning semantic segmentation models for road images often segment object edges unclearly and small target objects inaccurately. To address these problems, this paper improves the Deeplabv3+ algorithm by combining a convolutional attention mechanism module with a multiscale feature fusion module. The convolutional attention mechanism module is added to the feature extraction network to strengthen its feature extraction capability. Feature enhancement and fusion operations are introduced in the encoder so that features of different sizes become deeper and more expressive. The model was tested and validated on the Cityscapes dataset. The results show that the proposed method preserves segmentation efficiency while producing clearer object edges and more accurate segmentation of small target objects, improving the overall segmentation accuracy of the model.
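To make the attention step concrete, here is a minimal sketch of channel attention in the style of CBAM, written in plain Python. The abstract does not say which attention module is used or how it is configured, so the function names, the tiny shared MLP, and its weights `w1` and `w2` are all illustrative assumptions. The spatial attention half of CBAM is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    # Two-layer perceptron without biases: ReLU(w1 @ vec), then w2 @ hidden.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature_map, w1, w2):
    """Reweight channels of a feature map (hypothetical helper).

    feature_map: list of C channels, each an H x W nested list of floats.
    w1: (hidden x C) weights, w2: (C x hidden) weights; both are assumed.
    Returns the per-channel weights and the rescaled feature map.
    """
    # Squeeze each channel to one number with average and max pooling.
    avg_desc = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    max_desc = [max(map(max, ch)) for ch in feature_map]
    # The same MLP processes both descriptors; the outputs are summed.
    avg_out = shared_mlp(avg_desc, w1, w2)
    max_out = shared_mlp(max_desc, w1, w2)
    weights = [sigmoid(a + m) for a, m in zip(avg_out, max_out)]
    # Scale every value in a channel by that channel's attention weight.
    scaled = [[[w * v for v in row] for row in ch]
              for w, ch in zip(weights, feature_map)]
    return weights, scaled
```

In a real network the MLP weights are learned, and the reweighted features are passed on to the next layer of the feature extraction backbone.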
KEYWORDS: Optical character recognition, Deep learning, Education and training, Data modeling, Detection and tracking algorithms, Data conversion, Performance modeling, Image processing, Binary data, Feature extraction
As an important technology for office automation, document detection and recognition can improve the efficiency of business processes and the user experience, make enterprise operations more intelligent, and has very broad application scenarios. This paper builds a document detection and recognition system based on the DB detection model and the CRNN recognition model. The system detects and recognizes document images using 3.64 million samples from an image dataset cropped from ICDAR 2015 and a Chinese corpus, and displays the document information in the corresponding table in real time. The test results show that the system effectively improves model inference speed while maintaining the accuracy of document detection and recognition, and completes document information entry quickly and efficiently.
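CRNN recognizers are commonly trained with a CTC objective, in which case the per-frame scores must be collapsed into a text string. The abstract does not describe the decoding step, so the following is a minimal sketch of greedy CTC decoding under that assumption. The function name, the `alphabet` argument, and the choice of class index 0 as the blank are illustrative.

```python
BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy CTC decoding (hypothetical helper).

    frame_scores: one list of class scores per time step; class 0 is the
    blank, and class k (k >= 1) maps to alphabet[k - 1].
    """
    # Pick the best-scoring class at every time step.
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    chars = []
    prev = BLANK
    for idx in best:
        # Emit a character only when the class changes and is not the blank,
        # which collapses repeats and drops blanks in one pass.
        if idx != prev and idx != BLANK:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


frames = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank separates the two a's
    [0.1, 0.6, 0.3],    # a
    [0.1, 0.2, 0.7],    # b
    [0.0, 0.1, 0.9],    # b (repeat, collapsed)
]
print(ctc_greedy_decode(frames, "ab"))  # prints "aab"
```

The blank between the second and fourth frames is what lets the decoder output a doubled character instead of merging it into one.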
UAV technology has developed rapidly in recent years, and images captured by UAVs are widely used in urban division, crop classification, land monitoring, and similar tasks. However, UAV image segmentation suffers from image category imbalance, object scale variation, and insufficient use of contextual information. To address these problems, this paper performs image semantic segmentation with an optimized Deeplabv3+ network model and uses a cross-entropy loss function to balance the dataset samples during the experiments. The results show that the proposed algorithm achieves high accuracy for semantic segmentation of UAV images, recognizes each category of UAV image well, and produces better segmentation results.
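One common way to let a cross-entropy loss compensate for category imbalance is to weight each class by its inverse frequency. The abstract does not state how the loss is balanced, so the sketch below is an assumption rather than the paper's method. Both function names are illustrative.

```python
import math


def inverse_frequency_weights(labels, num_classes):
    # Rare classes get larger weights; classes that never occur get 0.
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    total = len(labels)
    return [total / (num_classes * c) if c else 0.0 for c in counts]


def weighted_cross_entropy(logits, labels, class_weights):
    """Class-weighted cross-entropy over a list of pixels (hypothetical helper).

    logits: one list of raw class scores per pixel.
    labels: the true class index of each pixel.
    Returns the weighted mean of the per-pixel negative log-likelihoods.
    """
    total_loss = 0.0
    total_weight = 0.0
    for scores, y in zip(logits, labels):
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        nll = log_norm - scores[y]
        total_loss += class_weights[y] * nll
        total_weight += class_weights[y]
    return total_loss / total_weight


labels = [0, 0, 0, 1]  # class 1 is the minority class
weights = inverse_frequency_weights(labels, num_classes=2)
loss = weighted_cross_entropy([[2.0, 0.5]] * 4, labels, weights)
```

With these weights, a misclassified minority-class pixel contributes more to the loss than a majority-class pixel, which pushes the model to recognize under-represented categories.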