IEEE Transactions on Geoscience and Remote Sensing - New TOC Alert for Publication #36
- Fully Convolutional Network-Based Fast UAV Detection in Pulse Doppler Radar (February 22, 2024, 1:18 pm)
With the growing popularity of drones, effective and fast detection of unmanned aerial vehicles (UAVs) to prevent unauthorized flights has become a hot topic. Traditional constant false alarm rate (CFAR) detection, based on statistical theory, works well on data with a uniform background, but for low-slow-small UAVs it is prone to missed detections. In recent years, data-driven deep learning (DL) methods have been shown to outperform CFAR. However, using a sliding window to convert the complex detection task into a simple classification task leads to low efficiency. In this article, we propose a fast detection method that applies a fully convolutional network (FCN) to the whole range-Doppler (RD) map. To achieve accuracy comparable to our previous work, the network is first designed on the principle that the effective receptive field (ERF) of each unit in the prediction feature map should be close to the size of the sliding window, and the best bifurcation position between the classification and regression branches is searched. Then, considering the imbalance of positive and negative samples, a new scheme for creating ground-truth (GT) data is designed to expand the positive samples, and random sampling (RS) of negative samples is further adopted. Lastly, a post-processing mechanism combining probability thresholding and minimum deviation positioning (MDP) is developed for accurate target localization. Comparison with existing methods on experimental data shows that the proposed method can increase detection speed by up to 47 times while maintaining promising accuracy.
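For context, the CFAR baseline this abstract contrasts against can be sketched as a simple 1-D cell-averaging detector run along one axis of an RD map. This is a generic textbook CA-CFAR, not the paper's implementation; the training/guard cell counts and false-alarm rate below are illustrative assumptions:

```python
import numpy as np

def ca_cfar_1d(power, num_train=8, num_guard=2, pfa=1e-3):
    """Cell-averaging CFAR over a 1-D power profile (e.g., one Doppler
    row of a range-Doppler map). Returns a boolean detection mask."""
    # total number of training cells (num_train on each side of the CUT)
    n_train = 2 * num_train
    # threshold scaling factor for the desired false-alarm rate,
    # assuming exponentially distributed noise power
    alpha = n_train * (pfa ** (-1.0 / n_train) - 1.0)
    detections = np.zeros_like(power, dtype=bool)
    half = num_train + num_guard
    for i in range(half, len(power) - half):
        # estimate local noise from training cells, skipping guard cells
        leading = power[i - half : i - num_guard]
        trailing = power[i + num_guard + 1 : i + half + 1]
        noise = np.mean(np.concatenate([leading, trailing]))
        detections[i] = power[i] > alpha * noise
    return detections
```

The uniform-background assumption is visible in the noise estimate: a mean over neighboring cells, which is exactly what degrades for low-SNR targets in clutter.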
- EFLNet: Enhancing Feature Learning Network for Infrared Small Target Detection (February 19, 2024, 1:19 pm)
Single-frame infrared small target detection is a challenging task: there is an extreme imbalance between target and background, bounding box regression is extremely sensitive to infrared small targets, and target information is easily lost in high-level semantic layers. In this article, we propose an enhancing feature learning network (EFLNet) to address these problems. First, we observe that the extreme imbalance between target and background in infrared images makes the model pay more attention to background features than to target features. To address this, we propose a new adaptive threshold focal loss (ATFL) function that decouples the target from the background and uses an adaptive mechanism to adjust the loss weight, forcing the model to allocate more attention to target features. Second, we introduce the normalized Gaussian Wasserstein distance (NWD) to alleviate the convergence difficulty caused by the extreme sensitivity of bounding box regression to infrared small targets. Finally, we incorporate a dynamic head mechanism into the network to enable adaptive learning of the relative importance of each semantic layer. Experimental results demonstrate that our method achieves better detection performance on infrared small targets than state-of-the-art (SOTA) deep-learning-based methods. The source code and bounding-box-annotated datasets are available at https://github.com/YangBo0411/infrared-small-target.
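The NWD mentioned here has a closed form worth sketching: each axis-aligned box (cx, cy, w, h) is modeled as a 2-D Gaussian, and the exponentiated Wasserstein distance between the Gaussians replaces IoU. A minimal version, with the normalizing constant `c` as a dataset-dependent assumption (not a value from this paper):

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Gaussian Wasserstein distance between two (cx, cy, w, h)
    boxes, each modeled as the Gaussian N((cx, cy), diag(w^2/4, h^2/4))."""
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # squared 2nd-order Wasserstein distance between the two Gaussians:
    # center offset plus half-size offset, all in one Euclidean norm
    w2 = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
          + ((wa - wb) / 2.0) ** 2 + ((ha - hb) / 2.0) ** 2)
    # map distance to a (0, 1] similarity; c controls the decay scale
    return math.exp(-math.sqrt(w2) / c)
```

Unlike IoU, this similarity decays smoothly even when the boxes do not overlap at all, which is why it eases regression on tiny targets.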
- ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection (February 19, 2024, 1:19 pm)
Oriented object detection in aerial images has progressed considerably in recent years and achieved notable success, but high-precision oriented object detection remains challenging. Some recent works adopt classification-based angle prediction to address the boundary problem in angle regression; however, we find that they often neglect how objects with different aspect ratios differ in their sensitivity to angle. At the same time, it is worth exploring how to adapt the emerging transformer-based approaches to oriented object detection. In this article, we propose an Aspect Ratio-Sensitive DEtection TRansformer, termed ARS-DETR, for oriented object detection in aerial images. Specifically, a new angle classification method, called aspect ratio-aware circle smooth label (AR-CSL), is proposed to smooth the angle label more reasonably and to discard the hyperparameter introduced by previous work [e.g., circular smooth label (CSL)]. Then, a rotated deformable attention (RDA) module is designed to rotate the sampling points by the corresponding angles, eliminating the misalignment between region features and sampling points. Moreover, a dynamic weight coefficient based on the aspect ratio is adopted to calculate the angle loss. Comprehensive experiments on several challenging datasets demonstrate that our method achieves competitive performance on the high-precision oriented object detection task.
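The CSL baseline that AR-CSL improves on is easy to make concrete: the one-hot angle class is replaced by a window function that wraps around circularly, so that 0° and 179° count as neighbors. A minimal Gaussian-window sketch, where `radius` is precisely the fixed hyperparameter the abstract says AR-CSL discards in favor of an aspect-ratio-aware one:

```python
import numpy as np

def circular_smooth_label(angle_cls, num_cls=180, radius=6):
    """Baseline circular smooth label (CSL): a Gaussian window centered
    on the ground-truth angle class, wrapped circularly so the angle
    boundary (class 0 vs. class num_cls-1) is continuous."""
    idx = np.arange(num_cls)
    # circular distance from every class to the ground-truth class
    d = np.minimum(np.abs(idx - angle_cls), num_cls - np.abs(idx - angle_cls))
    # Gaussian window: 1 at the true class, decaying with circular distance
    return np.exp(-(d.astype(float) ** 2) / (2.0 * radius ** 2))
```

The intuition behind AR-CSL is that a square-ish object can tolerate a wide window (its appearance barely changes with angle) while an elongated one needs a narrow window; the fixed `radius` above cannot express that.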
- Multiscale 3-D–2-D Mixed CNN and Lightweight Attention-Free Transformer for Hyperspectral and LiDAR Classification (February 19, 2024, 1:19 pm)
The effective combination of hyperspectral image (HSI) and light detection and ranging (LiDAR) data can be used for land cover classification. Recently, deep-learning-based classification methods, especially those using transformer networks, have achieved remarkable success. However, deep learning classification methods for multisource data still encounter various technical challenges, such as comprehensive utilization of multiscale information, lightweight network design, and efficient fusion strategies for heterogeneous data. To address these challenges, we propose a novel and efficient deep neural network, namely, multiscale 3-D–2-D mixed CNN feature extraction and multisource data lightweight attention-free fusion network (M2FNet) based on CNN and transformer. Through end-to-end training, this network effectively combines heterogeneous information from multiple sources, leading to improved performance in joint classification. Specifically, M2FNet uses a multiscale 3-D–2-D mixed CNN design to extract both the spatial–spectral features of HSI and the depth-based elevation features of LiDAR data. Subsequently, the extracted features are fed into a novel encoder comprising a feature enhancement (FE) module, designed with mathematical morphology and a dilated convolutional module derived from the self-attention of the conventional transformer encoder (DConvformer), which plays a crucial role in integrating multisource information within the network. The well-designed architecture enables the network to acquire multiscale depth and high-order features, significantly reducing the number of training parameters. Comparative experimental results and ablation studies demonstrate that M2FNet outperforms other advanced methods. The source code is publicly available at https://github.com/cupid6868/M2FNet.git.
- Two-Stage Domain Adaptation Based on Image and Feature Levels for Cloud Detection in Cross-Spatiotemporal Domain (February 19, 2024, 1:19 pm)
Cloud detection in high-resolution remote sensing images (HRSIs) is widely applied across cross-spatiotemporal domains with varying scenarios. However, cloud detection semantic segmentation models trained on limited samples cannot ensure consistent data distribution between the source domain (SD) and the target domain (TD), resulting in reduced cross-domain segmentation accuracy and robustness. Therefore, this article proposes a two-stage domain adaptation based on image and feature levels (TDAIF) cloud detection framework. At the image level, TDAIF designs a pseudo-TD data generator (PTDDG) to effectively fuse SD foreground and TD background information, helping the model mine invariant semantic knowledge of the TD. Then, at the feature level, a domain discriminator and self-ensembling joint (DDSEJ) framework is explored to implicitly handle the alignment of global features and the optimization of decision boundaries for local features. TDAIF ultimately weakens the impact of image radiation diversity and scale divergence and improves the adaptive processing capability for cross-spatiotemporal data. Horizontal and internal comparative experiments on TDAIF were conducted on three domain-transfer datasets. The results show that TDAIF dramatically reduces cross-domain accuracy loss; compared with CycleGAN and AdaptSegNet, IoU improves by about 30%. TDAIF outperforms state-of-the-art visual domain adaptation (DA) methods, indicating that hierarchical data alignment from the image level to the feature level is very effective.