Query:
Scholar name: 孙宏滨 (Sun, Hongbin)
Abstract :
Stereo rectification and stereo matching are two critical components for the practical application of stereo vision systems. Previous studies treat them as two individual issues. For stereo rectification, various traditional algorithms have been proposed to estimate homography transformations, but their performance and efficiency are unsatisfactory for real-time deployment. For stereo matching, disparity accuracy has been largely improved by learning-based methods. However, all previous stereo networks assume that the input is a pair of offline pre-rectified images, which makes them invalid for accurate matching when the stereo vision system suffers from mechanical misalignment due to external collisions or temperature variations. In this paper, we optimize these two components jointly and propose an end-to-end learning framework to achieve online self-rectification and self-supervised disparity prediction simultaneously. The overall network contains two cascaded subnetworks, which perform stereo rectification and stereo matching sequentially for a pair of unrectified images. The network is evaluated on both publicly available datasets and realistic scenarios. Evaluation results demonstrate that the proposed network produces state-of-the-art results for self-rectification in terms of computation accuracy and speed, and also produces disparity results competitive with previous self-supervised methods. Therefore, the proposed design provides a more practical and efficient solution for stereo vision systems deployed on mobile platforms. (c) 2022 Elsevier B.V. All rights reserved.
Keyword :
Disparity prediction End-to-end learning Self-rectification Self-supervised matching Stereo vision
Cite:
GB/T 7714 | Zhang, Xuchong , Zhao, Yongli , Wang, Hang et al. End-to-end learning of self-rectification and self-supervised disparity prediction for stereo vision [J]. | NEUROCOMPUTING , 2022 , 494 : 308-319 . |
MLA | Zhang, Xuchong et al. "End-to-end learning of self-rectification and self-supervised disparity prediction for stereo vision" . | NEUROCOMPUTING 494 (2022) : 308-319 . |
APA | Zhang, Xuchong , Zhao, Yongli , Wang, Hang , Zhai, Han , Sun, Hongbin , Zheng, Nanning . End-to-end learning of self-rectification and self-supervised disparity prediction for stereo vision . | NEUROCOMPUTING , 2022 , 494 , 308-319 . |
Abstract :
As the core of the attitude determination system, the star sensor working in "lost in space" scenarios requires the star identification algorithm to be robust and fast with limited computing and memory resources. Nevertheless, previous algorithms are not satisfactory in robustness and identification speed. Hence, motivated by the fact that the one-dimensional convolutional neural network (1D-CNN) is suitable for sequential data, this article proposes a robust and efficient star identification algorithm, in which a 1D-CNN is used to process mixed initial features from star points. Moreover, this article proposes a combined star point selection strategy and a mixed initial feature extraction technique to further improve the performance of the 1D-CNN-based algorithm. Experimental results show that, compared with the state-of-the-art algorithm, the proposed algorithm improves the average identification accuracy by 0.76% and the identification speed by 1.86x with comparable memory consumption.
Keyword :
Feature extraction Kernel mixed initial features Neural networks One-dimensional convolutional neural network (1D-CNN) Position measurement robustness Robustness Space vehicles star identification star points selection strategy Uncertainty
Cite:
GB/T 7714 | Yang, Shaofei , Liu, Longjun , Zhou, Jiantao et al. Robust and Efficient Star Identification Algorithm based on 1-D Convolutional Neural Network [J]. | IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS , 2022 , 58 (5) : 4156-4167 . |
MLA | Yang, Shaofei et al. "Robust and Efficient Star Identification Algorithm based on 1-D Convolutional Neural Network" . | IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS 58 . 5 (2022) : 4156-4167 . |
APA | Yang, Shaofei , Liu, Longjun , Zhou, Jiantao , Zhao, Yunfu , Hua, Gengxin , Sun, Hongbin et al. Robust and Efficient Star Identification Algorithm based on 1-D Convolutional Neural Network . | IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS , 2022 , 58 (5) , 4156-4167 . |
Abstract :
Efficient real-time disparity estimation is critical for the application of stereo vision systems in various areas. Recently, stereo networks based on the coarse-to-fine method have largely relieved the memory constraints and speed limitations of large-scale network models. Nevertheless, all of the previous coarse-to-fine designs employ constant offsets and three or more stages to progressively refine the coarse disparity map, still resulting in unsatisfactory computation accuracy and inference time when deployed on mobile devices. This paper claims that the coarse matching errors can be corrected efficiently with fewer stages as long as more accurate disparity candidates can be provided. Therefore, we propose a dynamic offset prediction module to meet the different correction requirements of diverse objects and design an efficient two-stage framework. In addition, a disparity-independent convolution is proposed to regularize the compact cost volume efficiently and further improve the overall performance. The disparity quality and efficiency of various stereo networks are evaluated on multiple datasets and platforms. Evaluation results demonstrate that the disparity error rate of the proposed network reaches 2.66% and 2.71% on the KITTI 2012 and 2015 test sets respectively, while the computation speed is 2x faster than the state-of-the-art lightweight models on high-end and resource-constrained GPUs.
Keyword :
dynamic offsets prediction lightweight model mobile application real-time Stereo matching network
Cite:
GB/T 7714 | Dai, He , Zhang, Xuchong , Zhao, Yongli et al. Adaptive Disparity Candidates Prediction Network for Efficient Real-Time Stereo Matching [J]. | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY , 2022 , 32 (5) : 3099-3110 . |
MLA | Dai, He et al. "Adaptive Disparity Candidates Prediction Network for Efficient Real-Time Stereo Matching" . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 32 . 5 (2022) : 3099-3110 . |
APA | Dai, He , Zhang, Xuchong , Zhao, Yongli , Sun, Hongbin , Zheng, Nanning . Adaptive Disparity Candidates Prediction Network for Efficient Real-Time Stereo Matching . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY , 2022 , 32 (5) , 3099-3110 . |
Abstract :
DRAM latency has remained almost constant over decades and has become a performance bottleneck of computing systems. In this study, we propose a low-cost DRAM architecture enabling dynamic reconfiguring of row decoder to provide reduced latency with high flexibility and reliability. We apply minimum changes to row decoders and allow dynamic reconfiguration to switch array blocks between two modes: 1) normal mode, where the DRAM array behaves in the same manner as the conventional DRAM does and 2) low latency mode, where two DRAM cells in the neighbor array blocks are coupled to operate as a logical cell and reduce latency reliably according to the differential principle. On the basis of an industrial open bitline (BL) cell array, we only change the wordline decoding scheme but keep the cell array and sense amplifiers (SAs) untouched to avoid modifications to the DRAM process for cost and reliability considerations. Our circuit simulation shows that the low-latency mode can reduce row-to-column delay and row access strobe time by 25.7% and 23.2%, respectively. We evaluate the reduced-latency LPDDR4 DRAM on various workloads. Compared with the JEDEC standard DRAM, our proposal provides a maximum system performance improvement of 8.5%. We believe that our proposal is a reliable and cost-friendly solution to DRAM latency reduction.
Keyword :
DRAM memory design memory structures
Cite:
GB/T 7714 | Bai, Fujun , Wang, Song , Jia, Xuerong et al. A Low-Cost Reduced-Latency DRAM Architecture With Dynamic Reconfiguration of Row Decoder [J]. | IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS , 2022 , 31 (1) : 128-141 . |
MLA | Bai, Fujun et al. "A Low-Cost Reduced-Latency DRAM Architecture With Dynamic Reconfiguration of Row Decoder" . | IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS 31 . 1 (2022) : 128-141 . |
APA | Bai, Fujun , Wang, Song , Jia, Xuerong , Guo, Yixin , Yu, Bing , Wang, Hang et al. A Low-Cost Reduced-Latency DRAM Architecture With Dynamic Reconfiguration of Row Decoder . | IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS , 2022 , 31 (1) , 128-141 . |
Abstract :
Dental age estimation is widely used in forensic identification, but traditional methods are not accurate enough, especially for age estimation of adults. We introduce a deep learning-based methodology to estimate age from collected X-ray images of the teeth. We present a new dental dataset, which contains labeled orthopantomograms (OPGs) of 27,957 people, including 16,383 OPGs for females and 11,574 OPGs for males. Ages range from 0 to 93 years with a median of 27. The accuracy of the age labels is guaranteed by ID card information. Targeting the characteristics of the dental data itself, we explore various neural network elements that are effective for age estimation, including proper network depth, convolution kernel size, multi-branch structure, and the feature reuse of early layers. Based on this exploration, we further search for models for dental age estimation using the popular Neural Architecture Search (NAS) method. Experimental results show that our model achieves a mean absolute error (MAE) of 1.64 years, surpassing all existing CNN models. Compared with Inception-v4, which has an MAE of 1.70 and 20.46B FLOPs (input size 384×384), the FLOPs of our model are reduced by 2.7 times (7.49B FLOPs). To the best of our knowledge, this is the first study on age estimation that explores and searches DNN models. Our results surpass legal medical expert-level performance (an MAE of more than 2) for age estimation. The methodology and results in this paper are meaningful to forensic medicine for age estimation with panoramic radiograph images. © 2021 IEEE.
Keyword :
Deep learning Digital forensics Multilayer neural networks Network architecture Radiography
Cite:
GB/T 7714 | Hou, Wenxuan , Liu, Longjun , Gao, Jinxia et al. Exploring Effective DNN Models for Forensic Age Estimation based on Panoramic Radiograph Images [C] . 2021 . |
MLA | Hou, Wenxuan et al. "Exploring Effective DNN Models for Forensic Age Estimation based on Panoramic Radiograph Images" . (2021) . |
APA | Hou, Wenxuan , Liu, Longjun , Gao, Jinxia , Zhu, Anguo , Pan, Keyang , Sun, Hongbin et al. Exploring Effective DNN Models for Forensic Age Estimation based on Panoramic Radiograph Images . (2021) . |
Abstract :
Depthwise separable convolution (DSC) has become one of the essential structures for lightweight convolutional neural networks. Nevertheless, its hardware architecture has not received much attention. Several previous hardware designs incur either high off-chip memory traffic or large on-chip memory usage, and hence have deficiency in terms of hardware efficiency as well as performance. This paper proposes two efficient dynamic design techniques, i.e. adaptive row-based dataflow scheduling and adaptive computation mapping, to achieve a much better trade-off between hardware efficiency and performance for DSC-based lightweight CNN accelerator. The effectiveness and efficiency of the proposed dynamic design techniques have been extensively evaluated using six DSC-based lightweight CNNs. Compared with the reference architectures, the simulation results show the proposed architectural techniques can at least reduce on-chip buffer size by 50.4% and improve the performance of convolution calculation by 1.18x while maintaining the minimum off-chip memory traffic. MobileNetV2 is implemented on Zynq UltraScale+ ZCU102 SoC FPGA, and the results show the proposed accelerator can achieve 381.7 frames per second (fps), which is 1.43x of the reference design, and it can save about 36.3% on-chip buffer size compared with the reference design, while maintaining the same off-chip memory traffic.
Keyword :
adaptive computation mapping adaptive row-based dataflow scheduling Convolutional neural network depthwise separable convolution
Cite:
GB/T 7714 | Li, Baoting , Wang, Hang , Zhang, Xuchong et al. Dynamic Dataflow Scheduling and Computation Mapping Techniques for Efficient Depthwise Separable Convolution Acceleration [J]. | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS , 2021 , 68 (8) : 3279-3292 . |
MLA | Li, Baoting et al. "Dynamic Dataflow Scheduling and Computation Mapping Techniques for Efficient Depthwise Separable Convolution Acceleration" . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS 68 . 8 (2021) : 3279-3292 . |
APA | Li, Baoting , Wang, Hang , Zhang, Xuchong , Ren, Jie , Liu, Longjun , Sun, Hongbin et al. Dynamic Dataflow Scheduling and Computation Mapping Techniques for Efficient Depthwise Separable Convolution Acceleration . | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS , 2021 , 68 (8) , 3279-3292 . |
Abstract :
In-memory error correction code (ECC) is a promising technique to improve the yield and reliability of high density memory design. However, the use of in-memory ECC poses a new problem to memory repair analysis algorithm, which has not been explored before. This article first makes a quantitative evaluation and demonstrates that the straightforward algorithms for memory with redundancy and in-memory ECC have serious deficiency on either repair rate or repair analysis speed. Accordingly, an optimal repair analysis algorithm that leverages preprocessing/filter algorithms, hybrid search tree, and depth-first search strategy is proposed to achieve low computational complexity and optimal repair rate in the meantime. In addition, a heuristic repair analysis algorithm that uses a greedy strategy is proposed to efficiently find repair solutions. Experimental results demonstrate that the proposed optimal repair analysis algorithm can achieve optimal repair rate and increase the repair analysis speed by up to 10^5x compared with the straightforward exhaustive search algorithm. The proposed heuristic repair analysis algorithm is approximately 28 percent faster than the proposed optimal algorithm, at the expense of 5.8 percent repair rate loss.
Keyword :
in-memory ECC Memory repair reliability repair analysis algorithm yield
Cite:
GB/T 7714 | Lv, Minjie , Sun, Hongbin , Xin, Jingmin et al. Efficient Repair Analysis Algorithm Exploration for Memory With Redundancy and In-Memory ECC [J]. | IEEE TRANSACTIONS ON COMPUTERS , 2021 , 70 (5) : 775-788 . |
MLA | Lv, Minjie et al. "Efficient Repair Analysis Algorithm Exploration for Memory With Redundancy and In-Memory ECC" . | IEEE TRANSACTIONS ON COMPUTERS 70 . 5 (2021) : 775-788 . |
APA | Lv, Minjie , Sun, Hongbin , Xin, Jingmin , Zheng, Nanning . Efficient Repair Analysis Algorithm Exploration for Memory With Redundancy and In-Memory ECC . | IEEE TRANSACTIONS ON COMPUTERS , 2021 , 70 (5) , 775-788 . |
Abstract :
The past few years have witnessed a dramatic increase in the number of layers in convolutional neural networks (CNNs). Most studies focus on the CNN's vertical structure design (e.g. residual structures, which create short paths from early layers to later layers through vertical connections), but few pay attention to the process of feature generation and extraction within a single convolutional layer. In this paper, we identify the non-feature suppression phenomenon in the process of extracting features. On this basis, we propose an orthogonal approach named HSC (Horizontal Shortcut Connections) to improve feature representation fusion and computational efficiency for CNNs. In particular, our HSC approach can effectively reduce the interference overhead of non-feature areas and enhance information fusion for depthwise convolution and group convolution, which are the key blocks in lightweight neural networks. At an HSC layer, the feature maps of all preceding layers are connected in the horizontal direction according to our strategy to constitute features, producing a new representation that is used as the input feature maps passed on to subsequent layers. Our HSC block can be plugged into convolutional neural networks that include group convolution or depthwise convolution, and can effectively improve the accuracy of convolutional networks at a slight additional computational cost. We evaluate our design on popular lightweight neural networks and standard CNN structures. Compared with existing methods, we achieve a 1.63% accuracy improvement for MobileNet v2 on the CIFAR-10 dataset and up to a 3.70% accuracy improvement on the CIFAR-100 dataset by adding the HSC block after depthwise convolution, and a 2.80% accuracy improvement on the ImageNet dataset. For MobileNet v3-small, we achieve a 0.8% accuracy improvement on the ImageNet dataset. To demonstrate the benefit for group convolution, we manually change the standard convolution to group convolution and add the HSC block after it, achieving a 4X to 6X FLOPs improvement while maintaining the accuracy of the neural networks. Notably, on ILSVRC-2012, our method reduces FLOPs by more than 43% on ResNet-50 without an accuracy decline, and by 60.1% on ResNet-50 with a 0.44% accuracy decline. We also present preliminary hardware experiment results for the HSC framework running on a special hardware platform. (c) 2021 Elsevier B.V. All rights reserved.
Keyword :
Computational efficiency Depthwise convolution FLOPs decrease Group convolution Lightweight Mobile model Neural network architecture
Cite:
GB/T 7714 | Zhu, Anguo , Liu, Longjun , Hou, Wenxuan et al. HSC: Leveraging horizontal shortcut connections for improving accuracy and computational efficiency of lightweight CNN [J]. | NEUROCOMPUTING , 2021 , 457 : 141-154 . |
MLA | Zhu, Anguo et al. "HSC: Leveraging horizontal shortcut connections for improving accuracy and computational efficiency of lightweight CNN" . | NEUROCOMPUTING 457 (2021) : 141-154 . |
APA | Zhu, Anguo , Liu, Longjun , Hou, Wenxuan , Sun, Hongbin , Zheng, Nanning . HSC: Leveraging horizontal shortcut connections for improving accuracy and computational efficiency of lightweight CNN . | NEUROCOMPUTING , 2021 , 457 , 141-154 . |
Abstract :
Pruning can remove redundant parameters and structures of Deep Neural Networks (DNNs) to reduce inference time and memory overhead. As an important component of neural networks, the feature map (FM) has started to be adopted for network pruning. However, the majority of FM-based pruning methods do not fully investigate the effective knowledge in the FM for pruning. In addition, it is challenging to design a robust pruning criterion with a small number of images and to achieve parallel pruning, due to the variability of FMs. In this paper, we propose Adaptive Knowledge Extraction for Channel Pruning (AKECP), which can compress the network quickly and efficiently. In AKECP, we first investigate the characteristics of FMs and extract effective knowledge with an adaptive scheme. Secondly, we formulate the effective knowledge of FMs to measure the importance of the corresponding network channels. Thirdly, thanks to the effective knowledge extraction, AKECP can efficiently and simultaneously prune all the layers with extremely few images or even a single image. Experimental results show that our method can compress various networks on different datasets without introducing additional constraints, and that it advances the state of the art. Notably, for ResNet-110 on CIFAR-10, AKECP achieves a 59.9% reduction in parameters and a 59.8% reduction in FLOPs with negligible accuracy loss. For ResNet-50 on ImageNet, AKECP saves 40.5% of the memory footprint and reduces FLOPs by 44.1% with only a 0.32% drop in Top-1 accuracy. © 2021 ACM.
Keyword :
Arts computing Data mining Deep neural networks Extraction Frequency modulation
Cite:
GB/T 7714 | Zhang, Haonan , Liu, Longjun , Zhou, Hengyi et al. AKECP: Adaptive Knowledge Extraction from Feature Maps for Fast and Efficient Channel Pruning [C] . 2021 : 648-657 . |
MLA | Zhang, Haonan et al. "AKECP: Adaptive Knowledge Extraction from Feature Maps for Fast and Efficient Channel Pruning" . (2021) : 648-657 . |
APA | Zhang, Haonan , Liu, Longjun , Zhou, Hengyi , Hou, Wenxuan , Sun, Hongbin , Zheng, Nanning . AKECP: Adaptive Knowledge Extraction from Feature Maps for Fast and Efficient Channel Pruning . (2021) : 648-657 . |
Abstract :
It is critical to continuously improve the hardware efficiency of deep neural network accelerators for their application on resource-constrained platforms. This brief proposes a lane shared bit-pragmatic architecture to address the synchronization-induced performance bottleneck and hence further improve the performance and efficiency of bit-serial computing architecture. The effectiveness and efficiency of the proposed architecture are demonstrated by extensive evaluation results. © 2004-2012 IEEE.
Keyword :
Computer architecture Deep neural networks Efficiency Network architecture Neural networks Timing circuits
Cite:
GB/T 7714 | Yang, Shaofei , Liu, Longjun , Li, Yingxiang et al. Lane Shared Bit-Pragmatic Deep Neural Network Computing Architecture and Circuit [J]. | IEEE Transactions on Circuits and Systems II: Express Briefs , 2021 , 68 (1) : 486-490 . |
MLA | Yang, Shaofei et al. "Lane Shared Bit-Pragmatic Deep Neural Network Computing Architecture and Circuit" . | IEEE Transactions on Circuits and Systems II: Express Briefs 68 . 1 (2021) : 486-490 . |
APA | Yang, Shaofei , Liu, Longjun , Li, Yingxiang , Li, Xinxin , Sun, Hongbin , Zheng, Nanning . Lane Shared Bit-Pragmatic Deep Neural Network Computing Architecture and Circuit . | IEEE Transactions on Circuits and Systems II: Express Briefs , 2021 , 68 (1) , 486-490 . |