
YOLO algorithm improvement, backbone series (1): FcaFormer

Author: Nuist Object Detection

At present, one of the main research directions for designing more efficient vision transformers is to reduce the computational cost of the self-attention module through sparse attention or local attention windows. In contrast, we take a different approach that aims to improve the performance of transformer-based architectures through denser attention patterns. Specifically, we propose forward cross attention for FcaFormer, i.e., reusing the tokens of previous blocks within the same stage. To achieve this, FcaFormer relies on two innovative components: Learnable Scale Factors (LSFs) and a Token Merge and Enhancement module (TME). The LSFs enable efficient processing of cross tokens, while the TME generates representative cross tokens. By integrating these components, the proposed FcaFormer strengthens the interaction between tokens from blocks with potentially different semantics and encourages more information to flow downstream.
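The role of the LSFs can be illustrated with a minimal PyTorch sketch (the class name, initial value, and shapes here are assumptions for illustration, not the paper's code): cross tokens kept from an earlier block are scaled per channel by a learnable factor and appended to the current block's key/value tokens, densifying the attention pattern.

```python
import torch
import torch.nn as nn

class LearnableScaleFactors(nn.Module):
    """Hypothetical sketch: per-channel learnable scaling applied to
    cross tokens from an earlier block before they join attention."""
    def __init__(self, dim, init=1e-2):
        super().__init__()
        # one learnable factor per channel, initialized small
        self.scale = nn.Parameter(init * torch.ones(dim))

    def forward(self, cross_tokens):
        # (B, N, C) * (C,) broadcasts over batch and token dimensions
        return cross_tokens * self.scale

lsf = LearnableScaleFactors(dim=64)
prev_tokens = torch.randn(2, 49, 64)  # tokens saved from a previous block
cur_tokens = torch.randn(2, 49, 64)   # tokens of the current block
# dense attention would attend over current tokens plus scaled cross tokens
kv = torch.cat([cur_tokens, lsf(prev_tokens)], dim=1)
print(kv.shape)  # torch.Size([2, 98, 64])
```

Because the factors start small, the cross tokens initially perturb the attention only slightly and their influence is learned during training.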

Based on forward cross attention (Fca), we design a series of FcaFormer models that achieve the best trade-offs between model size, computational cost, memory cost, and accuracy. For example, without relying on knowledge distillation to boost training, our FcaFormer achieves 83.1% top-1 accuracy on ImageNet with only 16.3 million parameters and about 3.6 billion MACs. Compared with the distilled EfficientFormer, this saves nearly half the parameters and a small amount of computation while improving accuracy by 0.7%.

The overall structure of the FcaFormer model is shown below:

[Figure: overall architecture of the FcaFormer model]

Tutorial for adding the model as a backbone in a YOLOv5 project:

(1) In models/yolo.py of the YOLOv5 project, modify the parse_model function and the _forward_once function of BaseModel

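The typical reason step (1) is needed is that a whole-backbone module returns several feature maps at once, while the stock _forward_once saves one output per layer. As a rough, hypothetical sketch of that kind of change (the toy module and function below are illustrations, not YOLOv5's actual code), list outputs can be flattened so every feature map gets its own slot:

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Toy stand-in for a backbone that returns several feature maps."""
    def forward(self, x):
        return [x, x.mean(dim=-1, keepdim=True)]

def forward_once(modules, x):
    # Sketch of the _forward_once idea: if a module returns a list of
    # feature maps, store each one separately so later head layers can
    # index them by position; the last map feeds the next module.
    y = []
    for m in modules:
        x = m(x)
        if isinstance(x, list):
            y.extend(x)
            x = x[-1]
        else:
            y.append(x)
    return x, y

mods = [TinyBackbone(), nn.Identity()]
out, saved = forward_once(mods, torch.randn(1, 8))
print(len(saved))  # 3: two backbone maps plus the identity output
```

In the real BaseModel the saved outputs are additionally filtered by self.save; the sketch keeps everything for simplicity.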

(2) Create a new fcaformer.py in the models/backbone directory and add the following code:

[Screenshot: fcaformer.py model code]
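As a minimal, hypothetical sketch of the kind of block fcaformer.py would contain (class name, dimensions, and layer choices are assumptions, not the paper's implementation): a standard transformer block whose attention also consumes scaled cross tokens from earlier blocks in the stage.

```python
import torch
import torch.nn as nn

class FcaBlock(nn.Module):
    """Sketch of a forward cross attention block: tokens kept from
    earlier blocks in the same stage are scaled by learnable factors
    (LSF) and appended to the key/value sequence."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.lsf = nn.Parameter(1e-2 * torch.ones(dim))  # learnable scale factors
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x, cross_tokens=None):
        q = self.norm1(x)
        # queries come from the current block; keys/values also include
        # the scaled cross tokens, giving the denser attention pattern
        if cross_tokens is None:
            kv = q
        else:
            kv = torch.cat([q, cross_tokens * self.lsf], dim=1)
        x = x + self.attn(q, kv, kv, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x

blk = FcaBlock(dim=32)
x = torch.randn(2, 16, 32)
prev = torch.randn(2, 16, 32)  # tokens saved from an earlier block
out = blk(x, prev)
print(out.shape)  # torch.Size([2, 16, 32])
```

The paper's TME module for producing representative cross tokens is omitted here; this sketch only shows how cross tokens would enter the attention.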

(3) Import the model in models/yolo.py (import the file first) and modify the parse_model function as follows:

[Screenshot: parse_model modification]
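A hypothetical sketch of what the parse_model branch usually looks like for a custom backbone (the placeholder class and helper below are illustrations, not YOLOv5's actual code): the input channel count is prepended to the args read from the YAML before the module is instantiated.

```python
import torch
import torch.nn as nn

class FcaFormer(nn.Module):
    """Placeholder standing in for `from models.backbone.fcaformer import FcaFormer`."""
    def __init__(self, c1, c2):
        super().__init__()
        self.proj = nn.Conv2d(c1, c2, 1)

    def forward(self, x):
        return self.proj(x)

def build_layer(m, c1, args):
    """Sketch of the branch added inside parse_model: custom backbone
    modules get the input channels prepended to their YAML args."""
    if m is FcaFormer:
        c2 = args[0]        # output channels declared in the YAML
        args = [c1, *args]  # instantiated as FcaFormer(c1, c2, ...)
    return m(*args), c2

layer, c2 = build_layer(FcaFormer, 3, [64])
print(c2)  # 64
```

In the real parse_model this would be an extra `elif m is FcaFormer:` case alongside the existing Conv/C3 handling.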

(4) Create a new configuration file under the models directory: yolov5_fcaformer.yaml

[Screenshot: yolov5_fcaformer.yaml]
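The exact contents of the config are not reproduced in the text; as a hypothetical sketch following the layout of yolov5s.yaml (the FcaFormer entry and its args are assumptions), the Conv/C3 backbone table would be replaced by a single backbone entry:

```yaml
# Hypothetical sketch of yolov5_fcaformer.yaml
nc: 80               # number of classes
depth_multiple: 0.33
width_multiple: 0.50

backbone:
  # [from, number, module, args]
  [[-1, 1, FcaFormer, [512]]]   # single entry replacing the Conv/C3 stack

# head: reuse the head section from yolov5s.yaml, updating its `from`
# indices to match the feature maps returned by the modified backbone
```

The key point is that the head's `from` indices must line up with however many feature maps the modified _forward_once saves.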

(5) Run a quick check: in models/yolo.py, point the --cfg parameter at the newly created yolov5_fcaformer.yaml (or pass it on the command line, e.g. `python models/yolo.py --cfg models/yolov5_fcaformer.yaml`) and run the file.

