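Thermal images lack color information and have blurred edges and weak details, which makes high-quality segmentation difficult. To address this, we build on an encoder-decoder architecture and add a multi-level pixel spatial attention module (MPAM), an edge extraction module (EEM), and a tiny target extraction module (TTM). MPAM lets the network capture semantic information while fully preserving detail; EEM and TTM extract detail features, namely edges and small targets, that carry semantic information. To improve prediction accuracy for pixels where the edges of different classes meet and for small objects, we design dedicated loss functions that supervise the extracted edge and small-target features during training. We apply the method to SCUT_SEG, a thermal image dataset built by our group, to the public thermal image dataset SODA, and to a synthetic thermal infrared Cityscapes dataset. The experiments show that the method performs slightly better than five segmentation networks (FCN, PSPNet, Deeplabv3+, MCNet, and EC-CNN), with a gain of about 2.2 percentage points.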
Objective: Thermal image segmentation is a fundamental step in night-time autonomous driving and night-time intelligent monitoring and has drawn extensive attention. Despite considerable research effort, high-quality segmentation results remain hard to obtain because thermal images lack color information and have blurred edges and weak details. Building on Deeplabv3+, we propose a tiny-target and edge-enhanced network that addresses the problems related to edges and small targets.
Methods: The tiny-target and edge-enhanced network can be built on Deeplabv3+ or on other segmentation baselines; Deeplabv3+ is used here. First, we design a multi-level pixel spatial attention module (MPAM), which enables the network to make full use of the feature and context information of each layer so that details in pixel space are effectively recovered. Second, we design an edge extraction module (EEM) and a tiny target extraction module (TTM), which explicitly model edges and small targets, respectively; their outputs provide accurate edge and small-target features. Finally, we design specialized loss functions that supervise the edge and tiny-target features against the ground-truth map, improving accuracy for small targets and for pixels along edges.
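The abstract does not say how the edge supervision target is obtained from the ground-truth map. As a minimal sketch under that caveat, one common way to derive such a target is to mark every pixel whose class label differs from a horizontal or vertical neighbour. The function name `edge_map_from_labels` and the 4-neighbour rule are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def edge_map_from_labels(labels: np.ndarray) -> np.ndarray:
    """Return a boolean map that is True wherever a pixel's class differs
    from a horizontal or vertical neighbour.

    Hypothetical helper: the 4-neighbour rule is an assumption for this
    sketch, not a rule taken from the paper.
    """
    labels = np.asarray(labels)
    edges = np.zeros(labels.shape, dtype=bool)

    # Class changes between horizontally adjacent pixels mark both pixels.
    diff_h = labels[:, 1:] != labels[:, :-1]
    edges[:, 1:] |= diff_h
    edges[:, :-1] |= diff_h

    # Class changes between vertically adjacent pixels mark both pixels.
    diff_v = labels[1:, :] != labels[:-1, :]
    edges[1:, :] |= diff_v
    edges[:-1, :] |= diff_v
    return edges

# Toy 3x4 label map with three classes.
toy_labels = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1],
                       [2, 2, 2, 2]])
toy_edges = edge_map_from_labels(toy_labels)
```

A map like `toy_edges` could then serve as the binary target for an edge loss, while the original label map continues to supervise the semantic output.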
Results: In existing CNN-based semantic segmentation methods, intensity, shape, and texture features are mixed together, so small targets and edges are not segmented properly. To obtain strong semantic information, such networks generally stack layers for feature extraction, which also discards a large amount of detail. To validate the proposed method, we first run extensive comparative experiments on the thermal infrared dataset SCUT_SEG against several closely related algorithms. Detailed visualization and analysis of the results show that the proposed algorithm improves the segmentation of small targets and edges. Second, we run a series of ablation experiments to validate the contribution of each proposed module. To verify that the model adapts to different datasets in the same scenario, we also experiment on the publicly available SODA dataset and on synthetic thermal infrared Cityscapes. Across the three thermal image datasets (SCUT_SEG, SODA, and synthetic thermal infrared Cityscapes), our method yields a modest improvement of about 2.2 percentage points over other state-of-the-art algorithms in the same scenario. It also gives more satisfactory results for specific target classes and for small targets and details such as edges.
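The abstract does not name the metric behind the reported gain. Assuming it is mean intersection-over-union (mIoU), a customary metric for semantic segmentation, the sketch below shows how such a score could be computed from a confusion matrix. The function name `mean_iou` and the toy arrays are hypothetical and carry no results from the paper.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union over the classes present in pred or target.

    Assumption: mIoU is the evaluation metric; the abstract does not say so.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()

    # Confusion matrix: rows index the true class, columns the predicted class.
    cm = np.bincount(target * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)

    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection

    # Skip classes that appear in neither the prediction nor the target.
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())

# Toy example: one pixel of class 1 is mispredicted as class 0.
toy_target = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1]])
toy_pred = np.array([[0, 0, 0, 1],
                     [0, 0, 1, 1]])
toy_score = mean_iou(toy_pred, toy_target, num_classes=2)
```

Under this reading, a gain stated in percentage points is the difference between two such scores multiplied by 100, not a relative percentage change.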
Conclusions: Existing semantic segmentation algorithms for thermal infrared images lose details such as edges and small targets. The proposed MPAM makes full use of the feature and context information of each layer to recover details in pixel space. The EEM and TTM modules explicitly model edges and small targets to recover this detail information. Dedicated loss functions supervise the edge and small-target features so that both are preserved. As a result, the accuracy of small-target and edge features that carry semantic category information is improved.