Owing to the manufacturing process and environmental effects, steel surfaces can exhibit a variety of defects. Nonuniform surface brightness and the wide range of defect shapes make detection challenging. In this paper we propose neural networks both for recognizing new defect classes and for classifying known types. For the former we use a zero-shot approach based on a Siamese network, which learns features that allow unseen classes to be classified without a single training example. In addition, one branch (one structural part) of the same network can be used to classify previously trained defects. To evaluate performance, we carried out experiments on two benchmark datasets: the Northeastern University (NEU) and the Xsteel surface defect datasets. The results show that our method outperforms state-of-the-art solutions on the NEU dataset for both zero-shot learning and classification, with accuracies of 85.80% and 100%, respectively. On the Xsteel dataset we reached 98% for classification, which is the best known performance.
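The abstract does not state how the Siamese features are compared at inference time. As a minimal sketch, assuming that one branch maps an image to an embedding vector and that similarity is measured by Euclidean distance, an unseen defect could be assigned to the closest reference embedding. The function names, the distance measure and the toy vectors below are illustrative assumptions, not details taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_reference(query, references):
    """Return the label whose reference embedding is closest to the query.

    `references` maps a class label to one embedding vector. In a Siamese
    setup these vectors would come from the same (shared-weight) branch
    that produced `query`; here they are plain lists (assumption).
    """
    return min(references, key=lambda label: euclidean(query, references[label]))


# Toy 2-D embeddings standing in for the output of a trained branch.
references = {"scratch": [0.0, 0.0], "patch": [3.0, 4.0]}
label = nearest_reference([0.5, 0.0], references)
```

In this sketch a new class is added only by supplying another reference embedding; the network weights are left unchanged.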
In this paper we combine neural networks with Hidden Markov Models (HMMs) for multiview object recognition. Although convolutional neural networks are very efficient at object recognition, improvements are still needed in many practical cases. For example, if training is not satisfactory or the neural network does not solve object localization, fusing information from several images and from inertial sensors can still substantially improve the recognition rate. In our use case we recognize objects from several directions with the VGG16 network. We assume that objects cannot be localized in the images because bounding box annotations are lacking, so we have to recognize objects even if they occupy only about 25% of the field of view. To overcome this problem we propose a Hidden Markov Model approach in which consecutive queries (shots taken from different viewing directions) are first evaluated with VGG16 inference and then with the Viterbi algorithm. The role of the latter is to estimate the most probable sequence of poses of the candidates (from 8 predefined horizontal views in our experiments), from which we can select the most probable object. Evaluated with different numbers of queries over a set of 40 objects from the COIL-100 dataset, the approach can significantly increase the hit rate compared with one-shot recognition or with combining individual shots without the HMM.
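The pose-sequence step can be illustrated with a small Viterbi sketch. It assumes that, for every candidate object, the hidden states are the discrete poses, that the per-shot network scores act as emission probabilities, and that the camera most likely advances by a fixed number of poses between shots. The transition model, the parameter `p_expected` and the function names are hypothetical and are not taken from the paper.

```python
import math


def viterbi(log_init, log_trans, log_emit):
    """Return (best log-probability, best state path) for one HMM.

    log_init[s]     : log prior of state s
    log_trans[p][s] : log probability of moving from state p to state s
    log_emit[t][s]  : log emission score of state s at time step t
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []
    for t in range(1, len(log_emit)):
        new_score, ptr = [], []
        for s in range(n_states):
            prev = max(range(n_states), key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[prev] + log_trans[prev][s] + log_emit[t][s])
            ptr.append(prev)
        score = new_score
        back.append(ptr)
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):  # follow the back-pointers to the first shot
        path.append(ptr[path[-1]])
    path.reverse()
    return score[last], path


def pose_transitions(n_poses=8, step=1, p_expected=0.8):
    """Hypothetical model: the view most likely advances by `step` poses."""
    other = (1.0 - p_expected) / (n_poses - 1)
    rows = []
    for p in range(n_poses):
        row = [math.log(other)] * n_poses
        row[(p + step) % n_poses] = math.log(p_expected)
        rows.append(row)
    return rows


def recognize(scores, n_poses=8, step=1):
    """Pick the object whose best pose sequence scores highest.

    scores[obj][t][pose] is the network score for `obj` seen from `pose`
    in shot t (assumed layout).
    """
    log_init = [math.log(1.0 / n_poses)] * n_poses
    log_trans = pose_transitions(n_poses, step)
    results = {}
    for obj, shots in scores.items():
        log_emit = [[math.log(max(p, 1e-12)) for p in shot] for shot in shots]
        results[obj] = viterbi(log_init, log_trans, log_emit)
    best = max(results, key=lambda obj: results[obj][0])
    return best, results[best][1]


# Toy example with 4 poses and 3 shots: "cup" is seen in a consistent
# pose order (0, 1, 2), while the scores for "duck" jump between poses.
hi, lo = 0.7, 0.1
scores = {
    "cup": [[hi, lo, lo, lo], [lo, hi, lo, lo], [lo, lo, hi, lo]],
    "duck": [[lo, lo, hi, lo], [hi, lo, lo, lo], [lo, lo, lo, hi]],
}
best, path = recognize(scores, n_poses=4, step=1)
```

In the toy data both objects receive equally strong single-shot scores, so only the consistency of the pose sequence separates them, which is the kind of information a per-shot vote would ignore.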