Smart Building: Building Exterior Wall Defect Detection Based on YOLOv7

Institute of Computer Vision

Façade defects affect not only the appearance of a building but also its function. They can also endanger pedestrians, occupants, and property. Such defects are common in old and dilapidated residential buildings, and there is an urgent need for AI-based real-time monitoring to improve residents' safety.

01

Overview

Existing deep learning-based methods face challenges in recognition speed and model complexity. To ensure both the accuracy and the speed of building exterior wall defect detection, we studied an improved YOLOv7 method called BFD-YOLO. First, the original ELAN module in YOLOv7 is replaced with a lightweight MobileOne module to reduce the number of parameters and improve inference speed. Second, a coordinate attention module is added to the model to strengthen feature extraction. Third, the SCYLLA-IoU (SIoU) loss is used to speed up convergence and increase the recall of the model. Finally, we extended an open dataset to construct a building façade damage dataset covering three typical defects. On this dataset, BFD-YOLO demonstrated superior accuracy and efficiency. Compared with YOLOv7, the precision and mAP@0.5 of BFD-YOLO improve by 2.2% and 2.9%, respectively, while efficiency remains comparable. The experimental results show that the proposed method achieves high detection accuracy while preserving real-time performance.

02

Background and motivation

Façade defects are a pressing issue in the operational phase of a building and are usually attributed to mechanical and environmental factors. Typical defects include concrete spalling, decorative spalling, component cracks, large-scale deformation, tile damage, and moisture damage. These defects can spoil the appearance of the building and shorten its lifespan. More seriously, objects falling from the exterior wall can cause safety accidents and irreparable damage. Structural damage detection is an integral part of structural health monitoring and is critical to the safe operation of buildings. As part of structural damage detection, detecting defects in building exterior walls lets government bodies and building managers understand the overall condition of the walls accurately and formulate reasonable repair plans. This is an effective way to reduce building maintenance costs, extend the life of the building, and mitigate the impact of façade damage. Many countries and regions are developing policies for regular, standardized visual inspections, and façade defect detection has become a critical component of building maintenance.

Visual inspection is a simple and reliable way to assess the external condition of a building. Traditional visual inspections typically require professionals to travel to the site with specialized tools and evaluate the façade by visual observation, hammering, and other techniques. These methods rely on the inspector's expertise and experience, which makes them subjective, dangerous, and inefficient. As the number and scale of buildings grow, manual visual inspection can no longer meet the requirements of large-scale inspections. As technology advances, many newer methods, such as laser scanning, 3D thermal imaging, and SLAM, are being used for façade damage detection on drones and robotic platforms. These methods are more convenient and safer than traditional techniques, but they are time-consuming and costly, so they also struggle to meet the need for large-scale inspections. It is therefore necessary to develop a more accurate and efficient surface defect detection method that improves inspection efficiency and reduces computational cost.

03

New ideas and practical details

Building exterior walls exhibit many types of defects, and each type calls for a different detection method. Common types include cracks, spalling, and wall hollowing. Cracks are most often studied with semantic segmentation, while wall hollowing is usually detected by percussion or infrared thermal imaging. After reviewing the literature, we selected defect types that suit object detection and for which a dataset is easy to build.

The images in the dataset come mainly from building façade photographs taken with mobile phones, video cameras, and drones, supplemented by images from the internet and public datasets. All images are between 1000 and 3000 pixels wide and between 2000 and 5000 pixels high. The dataset covers three building façade defects: delamination, spalling, and tile loss. A total of 1907 original images were collected, about 2% of which are background images. A background image is a defect-free image added to the dataset to reduce false detections. The training, validation, and test sets are split in a 7:2:1 ratio. The following figure shows an example of each defect in the dataset.

Figure: example defects in the dataset. From left to right: delamination, spalling, and tile loss.
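
The 7:2:1 split described above can be sketched in a few lines of Python. This is only an illustration, not the authors' script: the file names, the fixed random seed, and the choice to give the rounding remainder to the test set are all assumptions.

```python
import random


def split_dataset(items, ratios=(7, 2, 1), seed=0):
    """Shuffle items and split them into train/val/test by the given ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # assumption: the remainder goes to the test set
    return train, val, test


# Hypothetical file names standing in for the 1907 collected images.
image_names = [f"img_{i:04d}.jpg" for i in range(1907)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))  # 1334 381 192
```

Shuffling before slicing keeps images from the same source (phone, drone, public dataset) from clustering in one subset, although a stratified split by defect class may be preferable when classes are imbalanced.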

Data Augmentation

Training neural networks usually requires large amounts of data. However, images of building façade defects are relatively hard to obtain, and the collected data suffers from class imbalance. To mitigate this problem, we apply data augmentation to the training data. Data augmentation is a common technique that applies various transformations to the raw data, and it is widely used in deep learning to systematically generate more training samples. It helps the model learn more data variations and prevents it from over-relying on specific training samples. Supervised data augmentation techniques include geometric transformations (e.g., flipping, rotating, scaling, and cropping) and pixel transformations (e.g., noise, blurring, brightness adjustment, and saturation adjustment).
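
To make the two families of transformations concrete, the sketch below applies one geometric transformation (horizontal flip) and one pixel transformation (brightness scaling) to a toy grayscale image. It assumes YOLO-style labels of the form (class_id, cx, cy, w, h) with normalized coordinates; the function names, the flip probability, and the brightness range are illustrative choices, not values from the article.

```python
import random


def hflip(image, boxes):
    """Mirror an image left-right and update its YOLO-style boxes.

    image: list of rows, each row a list of pixel values.
    boxes: list of (class_id, cx, cy, w, h), coordinates normalized to [0, 1].
    """
    flipped = [row[::-1] for row in image]
    # Only the horizontal centre changes; width and height are preserved.
    new_boxes = [(c, 1.0 - cx, cy, w, h) for c, cx, cy, w, h in boxes]
    return flipped, new_boxes


def adjust_brightness(image, factor):
    """Scale 8-bit pixel intensities by `factor`, clamping to [0, 255]."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def augment(image, boxes, rng):
    """Randomly flip (assumed p = 0.5) and rescale brightness (assumed 0.7 to 1.3)."""
    if rng.random() < 0.5:
        image, boxes = hflip(image, boxes)
    image = adjust_brightness(image, rng.uniform(0.7, 1.3))
    return image, boxes


# Tiny 2x3 grayscale "image" with one hypothetical defect box near the left edge.
image = [[10, 20, 30], [40, 50, 60]]
boxes = [(0, 0.25, 0.5, 0.2, 0.4)]
flipped, flipped_boxes = hflip(image, boxes)
print(flipped)        # [[30, 20, 10], [60, 50, 40]]
print(flipped_boxes)  # [(0, 0.75, 0.5, 0.2, 0.4)]
```

The key point for object detection is that geometric transformations must be applied to the labels as well as the pixels, whereas pixel transformations leave the boxes untouched.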

New design framework

The network can be divided into a backbone and a head. The backbone extracts features. The original YOLOv7 backbone consists of several CBS, MP, and ELAN modules. CBS is a module consisting of a convolution layer, batch normalization, and a SiLU activation function. The MP module is made up of max pooling and CBS. The improved backbone replaces the ELAN module with a MobileOne module for increased speed and adds a coordinate attention module after each MobileOne module. With this change, the model can attend to the salient features in the input image and suppress irrelevant information, which improves detection accuracy.
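
The idea behind coordinate attention can be illustrated with a heavily simplified, single-channel sketch. A real coordinate attention block pools the feature map along each spatial direction, passes the pooled descriptors through learned 1x1 convolutions, and uses the resulting sigmoid gates to reweight the input. The sketch below keeps the directional pooling and gating but, as a stated assumption, replaces the learned convolutions with the identity, so it shows the mechanism rather than the module used in BFD-YOLO.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coordinate_gate(feature_map):
    """Reweight one channel (H x W nested list) by row-wise and column-wise gates."""
    h, w = len(feature_map), len(feature_map[0])
    # Pool along the width: one descriptor per row (keeps vertical position).
    row_desc = [sum(row) / w for row in feature_map]
    # Pool along the height: one descriptor per column (keeps horizontal position).
    col_desc = [sum(feature_map[i][j] for i in range(h)) / h for j in range(w)]
    # Assumption: the learned 1x1 convolutions are replaced by the identity.
    row_gate = [sigmoid(v) for v in row_desc]
    col_gate = [sigmoid(v) for v in col_desc]
    return [[feature_map[i][j] * row_gate[i] * col_gate[j] for j in range(w)]
            for i in range(h)]


# Toy feature map with a single strong response in the centre.
fmap = [[0.0, 0.0, 0.0],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 0.0]]
out = coordinate_gate(fmap)
print(out[1][1])
```

Because the two gates are indexed by row and by column, the reweighting retains positional information along both axes, which is what distinguishes this scheme from attention based on a single global average.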

The head of the network is a PaFPN structure that consists of an SPPCSPC module, several ELAN2 modules, CatConv, and three RepVGG blocks. ELAN is designed with a gradient path design strategy. Compared with a data path design strategy, a gradient path design strategy focuses on analyzing the source and composition of the gradients in order to design an architecture that uses its parameters efficiently, which can make the network more lightweight. ELAN and ELAN2 differ in the number of channels. The RepVGG block applies structural re-parameterization: it uses a multi-branch structure during training and a single-branch structure during inference to improve both training performance and inference speed. After outputting three feature maps, the head generates three predictions of different sizes through three RepConv modules.
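
Structural re-parameterization relies on the linearity of convolution: a 3x3 branch, a 1x1 branch, and an identity shortcut can be folded into a single 3x3 kernel that produces the same output. The following single-channel sketch demonstrates this equivalence in plain Python. It is a simplification under stated assumptions: batch normalization is omitted (in practice it is folded into the kernels first), and the kernel values are arbitrary.

```python
def conv2d_same(x, k):
    """3x3 cross-correlation with zero padding on a single-channel H x W grid."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += x[ii][jj] * k[di + 1][dj + 1]
            out[i][j] = acc
    return out


def multi_branch(x, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch + identity shortcut."""
    y3 = conv2d_same(x, k3)
    return [[y3[i][j] + k1 * x[i][j] + x[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]


def fuse(k3, k1):
    """Fold the 1x1 weight and the identity into the centre of the 3x3 kernel."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0
    return fused


# Arbitrary example input and weights (assumptions for illustration only).
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, 0.0, -0.5], [1.0, 0.25, -1.0], [0.5, 0.0, -0.5]]
k1 = 2.0

y_train = multi_branch(x, k3, k1)          # three branches
y_infer = conv2d_same(x, fuse(k3, k1))     # one fused 3x3 convolution
max_diff = max(abs(a - b) for ra, rb in zip(y_train, y_infer) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

At inference time only the fused kernel is kept, so the network pays for one convolution instead of three branches while computing the same function.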

04

Experiments and results

An experimental platform was set up to train the model and test it. The hardware and software configuration of the experimental platform is shown in the following table.

Environment    Configuration
System         Windows 10
CPU            i7-11700
GPU            RTX 3090
RAM            32 GB
Language       Python 3.7
Framework      PyTorch 1.11.0

In training, SGD is used for model training with a momentum of 0.937 and a weight decay