In this paper, an efficient object detection method YOLO-Ti is proposed to detect tiny facial markers. Our study is driven by the practical requirements of 3D face modeling, requiring the incorporation of as many facial features as possible for reference. This research can even provide information for facial expression recognition and joint deformation. To achieve this, we first present a feature fusion module called Cross-BiFPN, which incorporates additional crossconnecting branches between different network layers to utilize low-level features more effectively. Secondly, we add a high-resolution detection head and attention module to the YOLOv8 model to improve the ability of detecting tiny objects, while at the same time ensuring the lightweight detection model by reducing redundant network layers. Thirdly, we collect a dataset of facial markers with an average size much smaller than publicly available small object datasets. Ablation studies and comparison experiments are conducted to evaluate the performance of our approach. Compared with the baseline YOLOv8 model, YOLO-Ti shows a 30.4% improvement in mAP50 while reducing model parameters by 65.1%. The automatic feature extraction provided by our model facilitates the construction of digital humans, providing significant savings in manpower and time for modelers.
|