Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3
Search results for: Shuangcheng Jia
3 Keypoint Detection Method Based on Multi-Scale Feature Fusion of Attention Mechanism
Authors: Xiaoxiao Li, Shuangcheng Jia, Qian Li
Abstract:
Keypoint detection has always been a challenge in the field of image recognition. This paper proposes a novelty keypoint detection method which is called Multi-Scale Feature Fusion Convolutional Network with Attention (MFFCNA). We verified that the multi-scale features with the attention mechanism module have better feature expression capability. The feature fusion between different scales makes the information that the network model can express more abundant, and the network is easier to converge. On our self-made street sign corner dataset, we validate the MFFCNA model with an accuracy of 97.8% and a recall of 81%, which are 5 and 8 percentage points higher than the HRNet network, respectively. On the COCO dataset, the AP is 71.9%, and the AR is 75.3%, which are 3 points and 2 points higher than HRNet, respectively. Extensive experiments show that our method has a remarkable improvement in the keypoint recognition tasks, and the recognition effect is better than the existing methods. Moreover, our method can be applied not only to keypoint detection but also to image classification and semantic segmentation with good generality.Keywords: keypoint detection, feature fusion, attention, semantic segmentation
Procedia PDF Downloads 1192 A Monocular Measurement for 3D Objects Based on Distance Area Number and New Minimize Projection Error Optimization Algorithms
Authors: Feixiang Zhao, Shuangcheng Jia, Qian Li
Abstract:
High-precision measurement of the target’s position and size is one of the hotspots in the field of vision inspection. This paper proposes a three-dimensional object positioning and measurement method using a monocular camera and GPS, namely the Distance Area Number-New Minimize Projection Error (DAN-NMPE). Our algorithm contains two parts: DAN and NMPE; specifically, DAN is a picture sequence algorithm, NMPE is a relatively positive optimization algorithm, which greatly improves the measurement accuracy of the target’s position and size. Comprehensive experiments validate the effectiveness of our proposed method on a self-made traffic sign dataset. The results show that with the laser point cloud as the ground truth, the size and position errors of the traffic sign measured by this method are ± 5% and 0.48 ± 0.3m, respectively. In addition, we also compared it with the current mainstream method, which uses a monocular camera to locate and measure traffic signs. DAN-NMPE attains significant improvements compared to existing state-of-the-art methods, which improves the measurement accuracy of size and position by 50% and 15.8%, respectively.Keywords: monocular camera, GPS, positioning, measurement
Procedia PDF Downloads 1441 DMBR-Net: Deep Multiple-Resolution Bilateral Networks for Real-Time and Accurate Semantic Segmentation
Authors: Pengfei Meng, Shuangcheng Jia, Qian Li
Abstract:
We proposed a real-time high-precision semantic segmentation network based on a multi-resolution feature fusion module, the auxiliary feature extracting module, upsampling module, and atrous spatial pyramid pooling (ASPP) module. We designed a feature fusion structure, which is integrated with sufficient features of different resolutions. We also studied the effect of side-branch structure on the network and made discoveries. Based on the discoveries about the side-branch of the network structure, we used a side-branch auxiliary feature extraction layer in the network to improve the effectiveness of the network. We also designed upsampling module, which has better results than the original upsampling module. In addition, we also re-considered the locations and number of atrous spatial pyramid pooling (ASPP) modules and modified the network structure according to the experimental results to further improve the effectiveness of the network. The network presented in this paper takes the backbone network of Bisenetv2 as a basic network, based on which we constructed a network structure on which we made improvements. We named this network deep multiple-resolution bilateral networks for real-time, referred to as DMBR-Net. After experimental testing, our proposed DMBR-Net network achieved 81.2% mIoU at 119FPS on the Cityscapes validation dataset, 80.7% mIoU at 109FPS on the CamVid test dataset, 29.9% mIoU at 78FPS on the COCOStuff test dataset. Compared with all lightweight real-time semantic segmentation networks, our network achieves the highest accuracy at an appropriate speed.Keywords: multi-resolution feature fusion, atrous convolutional, bilateral networks, pyramid pooling
Procedia PDF Downloads 149