Surface Defect Detection in Manufactured Parts with Transfer Learning

TechLabs Aachen
8 min read · Nov 10, 2021


This project was carried out as part of the TechLabs “Digital Shaper Program” in Aachen (Summer Term 2021)

Abstract
Deep convolutional neural network models may take days or even weeks to train on very large datasets. A way to reduce the time taken for training the datasets is by Transfer learning, a method of altering a few layers of pre-trained models of a similar application.

The weights in the re-used layers serve as the starting point for the training process and are adapted to the new problem. The base model used in this work is Xception, a 71-layer convolutional neural network (CNN) that was trained on more than a million images from the ImageNet database.

Introduction
In recent years, many buildings and machines have been built to support and improve the quality of human life. Steel material is popularly used in these works because of its high resistance to wear during use and operation. Therefore, there are more and more iron being mined and steel components being manufactured for steel industries.

Although all stages of the production process are rigorous and under constant supervision, defective products are inevitable. Hence, businesses have to ensure and maintain the quality of their finished products.

With the remarkable development of computer and camera technology, steel manufacturing companies have applied machine learning into the processes, not only to increase productivity and maintain high quality by detecting and eliminating the parts with defects but also to upgrade the stages of the processes in the future.

However, creating and training a new model for each manufacturer takes a lot of time and money, because collecting and labelling data is very expensive. To detect defects without this expense, we use transfer learning.

Transfer learning is a technique that adjusts some layers of an already trained model so that it can be retrained for a different task in less time and with a smaller dataset. The goal of our project is to replace the last few layers of a pre-trained model and apply it to detecting defects on steel sheets. Furthermore, we use the new model for different defect types and different domains.
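The mechanics can be sketched in Keras: freeze a base model so its pre-trained weights serve as the fixed starting point, then attach a new head whose weights are the only ones updated during training. The toy base model below merely stands in for a real pre-trained network such as Xception.

```python
import tensorflow as tf

# Toy "pre-trained" base model standing in for a real network such as Xception.
base = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
])

# Freeze the base: its weights stay fixed and act only as the starting point.
base.trainable = False

# Attach a fresh classification head; only these layers are trained.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.build(input_shape=(None, 32, 32, 3))

n_frozen = sum(int(tf.size(w)) for w in model.non_trainable_weights)
n_trainable = sum(int(tf.size(w)) for w in model.trainable_weights)
print(n_frozen, n_trainable)  # 224 161
```

Only the head's 161 weights are updated here; the 224 frozen base weights are reused as-is.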

In this blog, we first describe the methodology, covering the dataset and the algorithm we used. We then present the results and the problems we encountered. Finally, we draw conclusions and outline a possible future improvement.

Methodology
The search for a dataset started with a vigorous hunt across many platforms. After reviewing many candidates, a few were shortlisted: some had very little data, others were highly disorganized. After this analysis, the Severstal Steel Defect Detection dataset was chosen as feasible.

Transfer learning enables using a network pre-trained on a large dataset as a generic image-classification model for a new dataset that has insufficient data. Using transfer learning also decreases the training time for the model. As the challenge was about image classification, and noting recent advances, the Xception model achieves better top-1 and top-5 accuracies than VGG16, ResNet-152, and Inception-v3.

Xception uses depthwise separable convolutions, which are less prone to overfitting and require fewer parameters. It was also trained on more than a million images from the ImageNet database. Thus, the Xception model was chosen as the base model for transfer learning.
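The parameter saving of a depthwise separable convolution can be checked with a little arithmetic (the layer sizes below are illustrative, not taken from Xception):

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k x c_in kernel per output channel.
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    # Depthwise step: one k x k kernel per input channel,
    # followed by a 1x1 pointwise convolution that mixes channels.
    return k * k * c_in + c_in * c_out

# Example layer: 3x3 kernels, 128 input channels, 256 output channels
# (bias terms omitted for clarity).
standard = conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 1))  # 294912 33920 8.7
```

For this layer shape the separable variant needs roughly 8.7 times fewer weights than a standard convolution.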

Dataset
The dataset was acquired from Severstal, one of the leading Russian companies operating in the steel and mining industry. With the onset of the fourth industrial revolution and constant growth in manufacturing, steel has played an integral role due to its high tensile strength complemented by low cost, from giant structures to small tools and products, automobiles, and even the weapons industry. The boom of the steel industry was so significant that it was often considered an indicator of economic growth.

The Severstal dataset provides two sets of images: test images (5506) and train images (12568). Each image has a unique ImageId and can be assigned to four defect classes [1, 2, 3, 4]; an image might have no defect, a defect of a single class, or defects of multiple classes. The images show monochrome camera captures of manufactured steel surfaces with normal characteristics (see Fig. 1) or with defects (Fig. 2).

Fig. 1 Normal characteristics (no defect)
Fig. 2 Steel with defects

The dataset also provides a train.csv file listing each ImageId, its ClassId, and the encoded pixels of the defect (see Fig. 3).

Fig 3. Train.csv file included in the dataset

The main idea is to segregate and organize the images into their respective classes. As the framework to train the machine learning model, we chose the Xception model, a CNN whose pre-trained weights can be loaded and reused for classification.

Algorithm

The deep learning model is based on the TensorFlow framework. The various libraries required by the model are imported first. Preprocessing starts with labelling the dataset: images with the same ImageId are deduplicated to avoid redundancy, images with defects are labelled '1', and those without defects are labelled '0'. This yields a data frame with ImageId and has_defect as columns.
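A minimal pandas sketch of this labelling step, using a tiny in-memory stand-in for train.csv (three hypothetical images; the real file's columns are shown in Fig. 3):

```python
import pandas as pd

# Tiny stand-in for train.csv: one row per defect annotation.
train_csv = pd.DataFrame({
    "ImageId": ["a.jpg", "a.jpg", "b.jpg"],
    "ClassId": [1, 3, 4],
    "EncodedPixels": ["...", "...", "..."],
})

# Full list of images; images absent from train.csv have no defect.
all_images = pd.DataFrame({"ImageId": ["a.jpg", "b.jpg", "c.jpg"]})

# Deduplicate ImageIds (images with defects of multiple classes appear
# several times), then label: 1 = at least one defect, 0 = no defect.
defective = train_csv.drop_duplicates("ImageId")[["ImageId"]]
df = all_images.copy()
df["has_defect"] = df["ImageId"].isin(defective["ImageId"]).astype(int)
print(df)
```

Here a.jpg and b.jpg receive the label 1 and c.jpg the label 0, giving the ImageId/has_defect data frame described above.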

The dataset of 12568 images is split randomly into train and test sets. The test size is chosen to be 12%, which yields 9731 images for training, 1509 for testing, and 1328 for validation.
Performance measurements such as Recall, Precision, and F1 score have been defined.
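These measures follow directly from the confusion-matrix counts; a small sketch with hypothetical counts:

```python
def precision_recall_f1(tp, fp, fn):
    # Precision: of all predicted defects, how many were real.
    precision = tp / (tp + fp)
    # Recall: of all real defects, how many were found.
    recall = tp / (tp + fn)
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts from a defect classifier's confusion matrix.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.8 0.889 0.842
```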

Data augmentation is carried out randomly on each image with different orientations (flip_left_right, flip_up_down, rot90) and with changes in brightness and saturation.

The images are decoded with 3 channels and resized to (299, 299). In the next steps, the functions for organizing the train and test data are defined.
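The preprocessing described above — decoding to 3 channels, resizing to (299, 299), and the random augmentations — can be sketched with tf.image ops; the brightness and saturation ranges below are assumptions, not the project's exact values:

```python
import tensorflow as tf

IMG_SIZE = (299, 299)  # input size expected by Xception

def preprocess(image_bytes, training=False):
    # Decode to 3 channels, resize to the model input shape, scale to [0, 1].
    image = tf.io.decode_jpeg(image_bytes, channels=3)
    image = tf.image.resize(image, IMG_SIZE)
    image = tf.cast(image, tf.float32) / 255.0
    if training:
        # Random orientations plus brightness/saturation changes.
        image = tf.image.random_flip_left_right(image)
        image = tf.image.random_flip_up_down(image)
        image = tf.image.rot90(image, k=tf.random.uniform([], 0, 4, tf.int32))
        image = tf.image.random_brightness(image, max_delta=0.1)
        image = tf.image.random_saturation(image, lower=0.9, upper=1.1)
    return image
```

In a tf.data pipeline this function would be applied with `dataset.map(...)`, with `training=True` only for the train split.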

The Xception model is chosen as the base model. It takes an image input of shape (299, 299, 3). Transfer learning is incorporated by replacing the last few layers. The details of the layers are shown in the figure.

Fig.4

Global average pooling is applied to the output of the Xception base. Two hidden dense layers follow, using ‘relu’ activation and the ‘he_uniform’ kernel initializer.

The model is compiled with the Adam optimizer and binary cross-entropy as the loss function. Logging and visualization of the results are done with TensorBoard. Training runs for 24 epochs with 580 steps per epoch. The performance scores of the model are then calculated. Lastly, a function for real-time prediction is defined and the results are predicted.
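This architecture and training setup can be sketched as follows. The hidden-layer widths 128 and 64 are assumptions (the actual configuration is in Fig. 4), and `weights=None` is used only so the sketch runs offline; the project loads `weights="imagenet"`.

```python
import tensorflow as tf

# Xception base. The project uses weights="imagenet" (pre-trained weights are
# downloaded on first use); weights=None here only keeps the sketch offline.
base = tf.keras.applications.Xception(
    weights=None, include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # keep the pre-trained weights as the starting point

inputs = tf.keras.Input(shape=(299, 299, 3))
x = base(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
# Two hidden dense layers; widths 128 and 64 are assumptions (see Fig. 4).
x = tf.keras.layers.Dense(128, activation="relu",
                          kernel_initializer="he_uniform")(x)
x = tf.keras.layers.Dense(64, activation="relu",
                          kernel_initializer="he_uniform")(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Training as described in the text (train_ds/val_ds are the data pipelines):
# tb = tf.keras.callbacks.TensorBoard(log_dir="logs")
# model.fit(train_ds, validation_data=val_ds,
#           epochs=24, steps_per_epoch=580, callbacks=[tb])
```

The single sigmoid output matches the binary has_defect label, which is why binary cross-entropy is the loss of choice.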

Project Results

The accuracy and the validation accuracy of the 24-epoch training process are illustrated in the figure below.

Fig 5. Metrics visualization during the training process

The figure shows that the validation accuracy reached its maximum of about 0.9594 at the 22nd epoch (the 21st in the figure, which counts from zero), with a training accuracy of 0.9886. Afterwards, we applied the weights from the 22nd epoch with different classification thresholds. The best confusion matrix was obtained with a threshold of 0.5.
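The threshold comparison can be sketched as follows, with hypothetical model outputs for six images:

```python
def confusion_at_threshold(probs, labels, threshold=0.5):
    # Turn sigmoid outputs into hard predictions at the given threshold,
    # then count the four confusion-matrix cells.
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Hypothetical sigmoid outputs and true labels (1 = defect).
probs = [0.92, 0.35, 0.71, 0.08, 0.55, 0.40]
labels = [1, 0, 1, 0, 0, 1]
print(confusion_at_threshold(probs, labels, 0.5))  # (2, 1, 2, 1)
```

Sweeping the threshold trades false positives against false negatives; in our case 0.5 gave the best balance.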

Fig 6. Confusion matrix

Then we applied our model to 20 random images from the dataset. The results are shown below.

Fig 7. Results of 20 random images

The model was also trained without data augmentation. The results were nearly the same as with augmentation; thus, data augmentation did not yield any additional value here.

The training process usually lasted about 6–8 hours, which caused Google Colab to disconnect after long periods without interaction. We added some code to the browser console to simulate clicks automatically during training.

Conclusion
The goal of the project was to implement transfer learning for defect detection on steel surfaces using the Severstal dataset. From the results shown above, we successfully implemented transfer learning with the Xception model, reaching a training accuracy of 0.9886 and a validation accuracy of 0.9594. The best accuracy was achieved at the 22nd epoch, which is taken as the starting point for further processing.

The Severstal images are resized to 299x299 because smaller image sizes gave less than 50% accuracy for all transfer learning models we tried. Another problem in this work was choosing the batch size and the number of epochs for transfer learning: if they are too large the model overfits, and if they are too small it underfits.

A planned further development is to apply transfer learning to similar applications in order to study its workings and limits and how diverse the applications can be, for example, different types of steel defects or defects in human bones on X-ray images.

Furthermore, we will modify the architecture of our model or find another pre-trained model to reduce the training time. The results of this project will be useful to future TechLabs techies.

Mentors

Arturo García Zendejas
Manuel Belke

Techies:

Hong Phuoc Nguyen Nguyen
Akshay Sanjay Puned
Praveen Jai Bharath Nadkarni
Ankit Bhardwaj
Abdul Aziz Abdul Sahul Hamid

TechLabs Aachen e.V. reserves the right not to be responsible for the topicality, correctness, completeness or quality of the information provided. All references are made to the best of the authors’ knowledge and belief. If, contrary to expectation, a violation of copyright law should occur, please contact journey.ac@techlabs.org so that the corresponding item can be removed.
