
Query: Scholar name: Meng Deyu (孟德宇)

Total pages: 20
Uncertainty-guided hierarchical frequency domain Transformer for image restoration EI SCIE Scopus
Journal article | 2023, 263 | KNOWLEDGE-BASED SYSTEMS
SCOPUS Cited Count: 10

Abstract :

Existing convolutional neural network (CNN)-based and vision Transformer (ViT)-based image restoration methods are usually explored in the spatial domain. However, we employ Fourier analysis to show that these spatial domain models cannot perceive the entire frequency spectrum of images, i.e., they mainly focus on either high-frequency (CNN-based models) or low-frequency components (ViT-based models). This intrinsic limitation results in the partial loss of semantic information and the appearance of artifacts. To address this limitation, we propose a novel uncertainty-guided hierarchical frequency domain Transformer named HFDT to effectively learn both high- and low-frequency information while perceiving local and global features. Specifically, to aggregate semantic information from various frequency levels, we propose a dual-domain feature interaction mechanism, in which global frequency information and local spatial features are extracted by corresponding branches. The frequency domain branch adopts the Fast Fourier Transform (FFT) to convert features from the spatial domain to the frequency domain, where the global low- and high-frequency components are learned with log-linear complexity. Complementarily, an efficient convolution group is employed in the spatial domain branch to capture local high-frequency details. Moreover, we introduce an uncertainty degradation-guided strategy to efficiently represent degraded prior information, rather than simply distinguishing degraded/non-degraded regions in binary form. Our approach achieves competitive results in several degraded scenarios, including rain streaks, raindrops, motion blur, and defocus blur.
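The dual-domain idea in this abstract — transform features to the frequency domain with an FFT and treat low- and high-frequency content separately — can be sketched in a few lines of NumPy. This is an illustrative decomposition only, not the HFDT architecture; the function name and the disk-shaped low-pass mask are our own assumptions.

```python
import numpy as np

def split_frequency_bands(feat, radius_frac=0.25):
    """Split a 2-D feature map into low- and high-frequency parts via FFT.

    Transform to the frequency domain, mask a centered low-frequency
    disk, and invert each band back to the spatial domain.
    """
    h, w = feat.shape
    F = np.fft.fftshift(np.fft.fft2(feat))        # center the DC component
    yy, xx = np.mgrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    low_mask = dist <= radius_frac * min(h, w)    # low-frequency disk
    low = np.fft.ifft2(np.fft.ifftshift(F * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(F * ~low_mask)).real
    return low, high

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
low, high = split_frequency_bands(x)
# The two bands are complementary: they sum back to the input.
assert np.allclose(low + high, x)
```

Because the masks partition the spectrum, the two bands always sum exactly back to the input, which is what lets the two branches process complementary information without losing anything.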

Keyword :

Frequency-domain Transformer Image restoration Log-linear complexity Uncertainty-guided

Cite:


GB/T 7714 Shao, Mingwen , Qiao, Yuanjian , Meng, Deyu et al. Uncertainty-guided hierarchical frequency domain Transformer for image restoration [J]. | KNOWLEDGE-BASED SYSTEMS , 2023 , 263 .
MLA Shao, Mingwen et al. "Uncertainty-guided hierarchical frequency domain Transformer for image restoration" . | KNOWLEDGE-BASED SYSTEMS 263 (2023) .
APA Shao, Mingwen , Qiao, Yuanjian , Meng, Deyu , Zuo, Wangmeng . Uncertainty-guided hierarchical frequency domain Transformer for image restoration . | KNOWLEDGE-BASED SYSTEMS , 2023 , 263 .
InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images EI SCIE Scopus
Journal article | 2023, 85 | MEDICAL IMAGE ANALYSIS
SCOPUS Cited Count: 15

Abstract :

During the computed tomography (CT) imaging process, metallic implants within patients often cause harmful artifacts, which degrade the visual quality of reconstructed CT images and negatively affect subsequent clinical diagnosis. For the metal artifact reduction (MAR) task, current deep learning based methods have achieved promising performance. However, most of them share two common limitations: (1) the CT physical imaging geometry constraint is not comprehensively incorporated into the deep network structure; (2) the entire framework has weak interpretability for the specific MAR task, so the role of each network module is difficult to evaluate. To alleviate these issues, in this paper we construct a novel deep unfolding dual domain network, termed InDuDoNet+, into which the CT imaging process is finely embedded. Concretely, we derive a joint spatial and Radon domain reconstruction model and propose an optimization algorithm with only simple operators for solving it. By unfolding the iterative steps involved in the proposed algorithm into corresponding network modules, we easily build InDuDoNet+ with clear interpretability. Furthermore, we analyze the CT values among different tissues and merge these prior observations into a prior network for InDuDoNet+, which significantly improves its generalization performance. Comprehensive experiments on synthesized data and clinical data substantiate the superiority of the proposed method as well as its superior generalization performance beyond current state-of-the-art (SOTA) MAR methods. Code is available at https://github.com/hongwang01/InDuDoNet_plus.

Keyword :

CT imaging geometry Generalization ability Metal artifact reduction Physical interpretability

Cite:


GB/T 7714 Wang, Hong , Li, Yuexiang , Zhang, Haimiao et al. InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images [J]. | MEDICAL IMAGE ANALYSIS , 2023 , 85 .
MLA Wang, Hong et al. "InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images" . | MEDICAL IMAGE ANALYSIS 85 (2023) .
APA Wang, Hong , Li, Yuexiang , Zhang, Haimiao , Meng, Deyu , Zheng, Yefeng . InDuDoNet+: A deep unfolding dual domain network for metal artifact reduction in CT images . | MEDICAL IMAGE ANALYSIS , 2023 , 85 .
RCDNet: An Interpretable Rain Convolutional Dictionary Network for Single Image Deraining SCIE Scopus
Journal article | 2023 | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
SCOPUS Cited Count: 7

Abstract :

As a common weather phenomenon, rain streaks adversely degrade image quality and tend to negatively affect the performance of outdoor computer vision systems. Hence, removing rain from an image has become an important issue in the field. To handle such an ill-posed single image deraining task, in this article we specifically build a novel deep architecture, called the rain convolutional dictionary network (RCDNet), which embeds the intrinsic priors of rain streaks and has clear interpretability. Specifically, we first establish a rain convolutional dictionary (RCD) model for representing rain streaks and utilize the proximal gradient descent technique to design an iterative algorithm containing only simple operators for solving the model. By unfolding it, we then build the RCDNet, in which every network module has clear physical meaning and corresponds to an operation of the algorithm. This good interpretability greatly facilitates visualization and analysis of what happens inside the network and why it works well at inference. Moreover, taking into account the domain gap issue in real scenarios, we further design a novel dynamic RCDNet, where the rain kernels can be dynamically inferred from input rainy images and then help shrink the space for rain layer estimation with only a few rain maps, ensuring fine generalization performance when rain types are inconsistent between training and testing data. By end-to-end training of such an interpretable network, all involved rain kernels and proximal operators can be automatically extracted, faithfully characterizing the features of both the rain and clean background layers and thus naturally leading to better deraining performance.
Comprehensive experiments on a series of representative synthetic and real datasets substantiate the superiority of our method, especially its strong generality to diverse testing scenarios and the good interpretability of all its modules, compared with state-of-the-art single image derainers both visually and quantitatively.
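The RCD model above represents the rain layer as a sum of learned rain kernels convolved with sparse feature maps. A minimal NumPy sketch of that synthesis step follows, using circular FFT convolution as a cheap stand-in for conv layers; the kernel shapes and names are illustrative, not the paper's.

```python
import numpy as np

def circ_conv2d(kernel, fmap):
    """Circular 2-D convolution via FFT -- a stand-in for a conv layer."""
    K = np.fft.fft2(kernel, s=fmap.shape)
    return np.real(np.fft.ifft2(K * np.fft.fft2(fmap)))

def synthesize_rain(kernels, maps):
    """RCD synthesis: rain layer R = sum_k kernel_k (*) map_k."""
    return sum(circ_conv2d(k, m) for k, m in zip(kernels, maps))

rng = np.random.default_rng(0)
# Sparse rain maps: a few active locations per map.
maps = [np.where(rng.random((32, 32)) > 0.95, 1.0, 0.0) for _ in range(2)]
delta = np.zeros((3, 3)); delta[0, 0] = 1.0         # identity kernel
streak = np.zeros((3, 3)); streak[:, 1] = 1.0 / 3   # vertical streak kernel
rain = synthesize_rain([delta, streak], maps)
# With the identity kernel alone, the map passes through unchanged:
assert np.allclose(synthesize_rain([delta], [maps[0]]), maps[0])
```

The deraining problem then becomes estimating the kernels and sparse maps from the rainy image, which is what the unfolded proximal-gradient iterations do stage by stage.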

Keyword :

Dictionary learning generalization performance interpretable deep learning (DL) single image rain removal

Cite:


GB/T 7714 Wang, Hong , Xie, Qi , Zhao, Qian et al. RCDNet: An Interpretable Rain Convolutional Dictionary Network for Single Image Deraining [J]. | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS , 2023 .
MLA Wang, Hong et al. "RCDNet: An Interpretable Rain Convolutional Dictionary Network for Single Image Deraining" . | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023) .
APA Wang, Hong , Xie, Qi , Zhao, Qian , Li, Yuexiang , Liang, Yong , Zheng, Yefeng et al. RCDNet: An Interpretable Rain Convolutional Dictionary Network for Single Image Deraining . | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS , 2023 .
Sparsity-Enhanced Convolutional Decomposition: A Novel Tensor-Based Paradigm for Blind Hyperspectral Unmixing EI SCIE Scopus
Journal article | 2022, 60 | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
SCOPUS Cited Count: 42

Abstract :

Blind hyperspectral unmixing (HU) has long been recognized as a crucial component in analyzing the hyperspectral imagery (HSI) collected by airborne and spaceborne sensors. Due to the highly ill-posed nature of such a blind source separation scheme and the effects of spectral variability in hyperspectral imaging, the ability to accurately and effectively unmix complex HSI still remains limited. To this end, this article presents a novel blind HU model, called sparsity-enhanced convolutional decomposition (SeCoDe), which jointly captures the spatial-spectral information of HSI in a tensor-based fashion. SeCoDe benefits from two perspectives. On the one hand, the convolutional operation is employed in SeCoDe to locally model the spatial relation between the targeted pixel and its neighbors, which can be well explained by spectral bundles that are capable of addressing spectral variability effectively. On the other hand, it maintains physically continuous spectral components by decomposing the HSI along the spectral domain. With sparsity-enhanced regularization, an alternating optimization strategy based on the alternating direction method of multipliers (ADMM) is devised for efficient model inference. Extensive experiments conducted on three different data sets demonstrate the superiority of the proposed SeCoDe compared to previous state-of-the-art methods. We will also release the code at https://github.com/danfenghong/IEEE_TGRS_SeCoDe to encourage reproduction of the given results.
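The ADMM-based inference mentioned above can be illustrated on the simplest sparsity-regularized problem. This sketch solves a plain lasso via the standard variable split, not SeCoDe's tensor model; all names and parameter values are illustrative assumptions.

```python
import numpy as np

def admm_lasso(A, y, lam=0.1, rho=1.0, n_iter=100):
    """ADMM for min_x 0.5||Ax - y||^2 + lam*||x||_1 via the split x = z."""
    n = A.shape[1]
    P = np.linalg.inv(A.T @ A + rho * np.eye(n))   # cached x-update solve
    q = A.T @ y
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
    for _ in range(n_iter):
        x = P @ (q + rho * (z - u))                # quadratic (data) step
        z = np.sign(x + u) * np.maximum(np.abs(x + u) - lam / rho, 0.0)  # prox
        u = u + x - z                              # dual ascent
    return z

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60); x_true[[5, 20, 41]] = [2.0, -1.0, 1.5]
y = A @ x_true
z = admm_lasso(A, y)
assert np.linalg.norm(A @ z - y) < 0.5 * np.linalg.norm(y)  # good data fit
```

The split lets the quadratic data term and the non-smooth sparsity term each be handled by a cheap closed-form update, which is exactly why ADMM suits sparsity-enhanced decomposition models.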

Keyword :

Blind hyperspectral unmixing (HU) Context modeling Convolutional codes convolutional sparse coding (CSC) Encoding Hyperspectral imaging Optimization spectral bundles spectral variability (SV) Task analysis tensor decomposition Tensors

Cite:


GB/T 7714 Yao, Jing , Hong, Danfeng , Xu, Lin et al. Sparsity-Enhanced Convolutional Decomposition: A Novel Tensor-Based Paradigm for Blind Hyperspectral Unmixing [J]. | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING , 2022 , 60 .
MLA Yao, Jing et al. "Sparsity-Enhanced Convolutional Decomposition: A Novel Tensor-Based Paradigm for Blind Hyperspectral Unmixing" . | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING 60 (2022) .
APA Yao, Jing , Hong, Danfeng , Xu, Lin , Meng, Deyu , Chanussot, Jocelyn , Xu, Zongben . Sparsity-Enhanced Convolutional Decomposition: A Novel Tensor-Based Paradigm for Blind Hyperspectral Unmixing . | IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING , 2022 , 60 .
STEIN VARIATIONAL GRADIENT DESCENT ON INFINITE-DIMENSIONAL SPACE AND APPLICATIONS TO STATISTICAL INVERSE PROBLEMS EI SCIE Scopus
Journal article | 2022, 60 (4), 2225-2252 | SIAM JOURNAL ON NUMERICAL ANALYSIS
SCOPUS Cited Count: 1

Abstract :

In this paper, we propose an infinite-dimensional version of the Stein variational gradient descent (iSVGD) method for solving Bayesian inverse problems. The method can generate approximate samples from posteriors efficiently. Based on the concepts of operator-valued kernels and vector-valued reproducing kernel Hilbert spaces, a rigorous definition is given for the infinite-dimensional objects, e.g., the Stein operator, which are proved to be the limit of finite-dimensional ones. Moreover, a more efficient iSVGD with preconditioning operators is constructed by generalizing the change of variables formula and introducing a regularity parameter. The proposed algorithms are applied to an inverse problem of the steady state Darcy flow equation. Numerical results confirm our theoretical findings and demonstrate the potential applications of the proposed approach in the posterior sampling of large-scale nonlinear statistical inverse problems.
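For readers unfamiliar with the finite-dimensional method being generalized, the standard SVGD particle update with an RBF kernel looks as follows. This is the ordinary finite-dimensional update, not the infinite-dimensional iSVGD of the paper; the bandwidth, step size, and toy target are illustrative assumptions.

```python
import numpy as np

def svgd_step(particles, grad_logp, bandwidth=1.0, step=0.1):
    """One SVGD update:
    phi(x_i) = (1/n) sum_j [k(x_j, x_i) grad_logp(x_j) + grad_{x_j} k(x_j, x_i)]
    with an RBF kernel k."""
    n = particles.shape[0]
    diff = particles[:, None, :] - particles[None, :, :]   # (n, n, d): x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    K = np.exp(-sq / (2.0 * bandwidth ** 2))               # RBF kernel matrix
    grads = np.stack([grad_logp(x) for x in particles])    # (n, d)
    drive = K @ grads                                      # pull toward high density
    repulse = np.einsum('ij,ijd->id', K, diff) / bandwidth ** 2  # keep particles apart
    return particles + step * (drive + repulse) / n

rng = np.random.default_rng(0)
pts = rng.normal(5.0, 1.0, size=(30, 1))      # particles far from the target
for _ in range(200):                          # target N(0,1): grad log p(x) = -x
    pts = svgd_step(pts, lambda x: -x)
assert abs(pts.mean()) < 2.0                  # particles drift toward the mode at 0
```

The kernel term drives particles toward high posterior density while the repulsion term keeps them spread out; the paper's contribution is making the operator-valued-kernel analogue of this update rigorous on infinite-dimensional spaces.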

Keyword :

Bayes' method machine learning statistical inverse problems Stein variational gradient descent variational inference method

Cite:


GB/T 7714 Jia, Junxiong , LI, Peijun , Meng, Deyu . STEIN VARIATIONAL GRADIENT DESCENT ON INFINITE-DIMENSIONAL SPACE AND APPLICATIONS TO STATISTICAL INVERSE PROBLEMS [J]. | SIAM JOURNAL ON NUMERICAL ANALYSIS , 2022 , 60 (4) : 2225-2252 .
MLA Jia, Junxiong et al. "STEIN VARIATIONAL GRADIENT DESCENT ON INFINITE-DIMENSIONAL SPACE AND APPLICATIONS TO STATISTICAL INVERSE PROBLEMS" . | SIAM JOURNAL ON NUMERICAL ANALYSIS 60 . 4 (2022) : 2225-2252 .
APA Jia, Junxiong , LI, Peijun , Meng, Deyu . STEIN VARIATIONAL GRADIENT DESCENT ON INFINITE-DIMENSIONAL SPACE AND APPLICATIONS TO STATISTICAL INVERSE PROBLEMS . | SIAM JOURNAL ON NUMERICAL ANALYSIS , 2022 , 60 (4) , 2225-2252 .
Two-Stream Graph Convolutional Network for Intra-Oral Scanner Image Segmentation EI SCIE Scopus
Journal article | 2022, 41 (4), 826-835 | IEEE TRANSACTIONS ON MEDICAL IMAGING
SCOPUS Cited Count: 14

Abstract :

Precise segmentation of teeth from intra-oral scanner images is an essential task in computer-aided orthodontic surgical planning. The state-of-the-art deep learning-based methods often simply concatenate the raw geometric attributes (i.e., coordinates and normal vectors) of mesh cells to train a single-stream network for automatic intra-oral scanner image segmentation. However, since different raw attributes reveal completely different geometric information, the naive concatenation of different raw attributes at the (low-level) input stage may bring unnecessary confusion in describing and differentiating between mesh cells, thus hampering the learning of high-level geometric representations for the segmentation task. To address this issue, we design a two-stream graph convolutional network (i.e., TSGCN), which can effectively handle inter-view confusion between different raw attributes to more effectively fuse their complementary information and learn discriminative multi-view geometric representations. Specifically, our TSGCN adopts two input-specific graph-learning streams to extract complementary high-level geometric representations from coordinates and normal vectors, respectively. Then, these single-view representations are further fused by a self-attention module to adaptively balance the contributions of different views in learning more discriminative multi-view representations for accurate and fully automatic tooth segmentation. We have evaluated our TSGCN on a real-patient dataset of dental (mesh) models acquired by 3D intraoral scanners. Experimental results show that our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
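The self-attention fusion of two single-view streams described above can be sketched as a per-cell softmax over view scores. The scoring function here (feature L2 norm) is a hypothetical stand-in for TSGCN's learned attention layer, and the array names are our own.

```python
import numpy as np

def attention_fuse(stream_a, stream_b):
    """Fuse two (n_cells, dim) feature streams with per-cell softmax weights."""
    scores = np.stack([np.linalg.norm(stream_a, axis=1),
                       np.linalg.norm(stream_b, axis=1)])  # (2, n_cells) view scores
    w = np.exp(scores - scores.max(axis=0))                # stable softmax over views
    w /= w.sum(axis=0)
    return w[0, :, None] * stream_a + w[1, :, None] * stream_b

rng = np.random.default_rng(0)
coords_feat = rng.standard_normal((100, 64))    # coordinate-stream features
normals_feat = rng.standard_normal((100, 64))   # normal-vector-stream features
fused = attention_fuse(coords_feat, normals_feat)
assert fused.shape == (100, 64)
# Identical streams get equal weights, so the fusion is the identity:
assert np.allclose(attention_fuse(coords_feat, coords_feat), coords_feat)
```

Fusing at this high-level stage, after each stream has learned its own representation, is what avoids the inter-view confusion caused by concatenating raw attributes at the input.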

Keyword :

Dentistry Feature extraction graph convolutional network Image segmentation Intra-oral scanner image segmentation Shape Task analysis Teeth Three-dimensional displays

Cite:


GB/T 7714 Zhao, Yue , Zhang, Lingming , Liu, Yang et al. Two-Stream Graph Convolutional Network for Intra-Oral Scanner Image Segmentation [J]. | IEEE TRANSACTIONS ON MEDICAL IMAGING , 2022 , 41 (4) : 826-835 .
MLA Zhao, Yue et al. "Two-Stream Graph Convolutional Network for Intra-Oral Scanner Image Segmentation" . | IEEE TRANSACTIONS ON MEDICAL IMAGING 41 . 4 (2022) : 826-835 .
APA Zhao, Yue , Zhang, Lingming , Liu, Yang , Meng, Deyu , Cui, Zhiming , Gao, Chenqiang et al. Two-Stream Graph Convolutional Network for Intra-Oral Scanner Image Segmentation . | IEEE TRANSACTIONS ON MEDICAL IMAGING , 2022 , 41 (4) , 826-835 .
Context-Based Multiscale Unified Network for Missing Data Reconstruction in Remote Sensing Images EI SCIE Scopus
Journal article | 2022, 19 | IEEE GEOSCIENCE AND REMOTE SENSING LETTERS
SCOPUS Cited Count: 9

Abstract :

Missing data reconstruction is a classical yet challenging problem in remote sensing images. Most current methods, based on traditional convolutional neural networks, require supplementary data and can only handle one specific task. To address these limitations, we propose a novel generative adversarial network-based missing data reconstruction method in this letter, which is capable of various reconstruction tasks given only single-source data as input. Two auxiliary patch-based discriminators are deployed to impose additional constraints on the local and global regions, respectively. To better fit the nature of remote sensing images, we introduce special convolutions and an attention mechanism in a two-stage generator, thereby benefiting the tradeoff between accuracy and efficiency. Combined with perceptual and multiscale adversarial losses, the proposed model can produce coherent structures with better details. Qualitative and quantitative experiments demonstrate the uncompromising performance of the proposed model against multisource methods in generating visually plausible reconstruction results. Moreover, further exploration shows a promising way for the proposed model to utilize spatio-spectral-temporal information. The codes and models are available at https://github.com/Oliiveralien/Inpainting-on-RSI.

Keyword :

Context aware Gallium nitride generative adversarial network (GAN) Generators image reconstruction Image reconstruction Logic gates multiscale Remote sensing remote sensing images Task analysis Training

Cite:


GB/T 7714 Shao, Mingwen , Wang, Chao , Wu, Tianjun et al. Context-Based Multiscale Unified Network for Missing Data Reconstruction in Remote Sensing Images [J]. | IEEE GEOSCIENCE AND REMOTE SENSING LETTERS , 2022 , 19 .
MLA Shao, Mingwen et al. "Context-Based Multiscale Unified Network for Missing Data Reconstruction in Remote Sensing Images" . | IEEE GEOSCIENCE AND REMOTE SENSING LETTERS 19 (2022) .
APA Shao, Mingwen , Wang, Chao , Wu, Tianjun , Meng, Deyu , Luo, Jiancheng . Context-Based Multiscale Unified Network for Missing Data Reconstruction in Remote Sensing Images . | IEEE GEOSCIENCE AND REMOTE SENSING LETTERS , 2022 , 19 .
DICDNet: Deep Interpretable Convolutional Dictionary Network for Metal Artifact Reduction in CT Images EI SCIE Scopus
Journal article | 2022, 41 (4), 869-880 | IEEE TRANSACTIONS ON MEDICAL IMAGING
SCOPUS Cited Count: 23

Abstract :

Computed tomography (CT) images are often impaired by unfavorable artifacts caused by metallic implants within patients, which adversely affect subsequent clinical diagnosis and treatment. Although existing deep-learning-based approaches have achieved promising success on metal artifact reduction (MAR) for CT images, most of them treat the task as a general image restoration problem and utilize off-the-shelf network modules for image quality enhancement. Hence, such frameworks always suffer from a lack of sufficient model interpretability for the specific task. Besides, existing MAR techniques largely neglect the intrinsic prior knowledge underlying metal-corrupted CT images, which is beneficial for MAR performance improvement. In this paper, we specifically propose a deep interpretable convolutional dictionary network (DICDNet) for the MAR task. Particularly, we first observe that metal artifacts always present non-local streaking and star-shape patterns in CT images. Based on these observations, a convolutional dictionary model is deployed to encode the metal artifacts. To solve the model, we propose a novel optimization algorithm based on the proximal gradient technique. With only simple operators, the iterative steps of the proposed algorithm can be easily unfolded into corresponding network modules with specific physical meanings. Comprehensive experiments on synthesized and clinical datasets substantiate the effectiveness of the proposed DICDNet as well as its superior interpretability, compared to current state-of-the-art MAR methods. Code is available at https://github.com/hongwang01/DICDNet.

Keyword :

Computed tomography CT metal artifact reduction Dictionaries generalization performance Image reconstruction interpretable dictionary learning Mars Metals Optimization Task analysis

Cite:


GB/T 7714 Wang, Hong , Li, Yuexiang , He, Nanjun et al. DICDNet: Deep Interpretable Convolutional Dictionary Network for Metal Artifact Reduction in CT Images [J]. | IEEE TRANSACTIONS ON MEDICAL IMAGING , 2022 , 41 (4) : 869-880 .
MLA Wang, Hong et al. "DICDNet: Deep Interpretable Convolutional Dictionary Network for Metal Artifact Reduction in CT Images" . | IEEE TRANSACTIONS ON MEDICAL IMAGING 41 . 4 (2022) : 869-880 .
APA Wang, Hong , Li, Yuexiang , He, Nanjun , Ma, Kai , Meng, Deyu , Zheng, Yefeng . DICDNet: Deep Interpretable Convolutional Dictionary Network for Metal Artifact Reduction in CT Images . | IEEE TRANSACTIONS ON MEDICAL IMAGING , 2022 , 41 (4) , 869-880 .
KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution CPCI-S Scopus
Conference paper | 2022, 13679, 235-253 | COMPUTER VISION, ECCV 2022, PT XIX
SCOPUS Cited Count: 8

Abstract :

Although current deep learning-based methods have gained promising performance in the blind single image super-resolution (SISR) task, most of them mainly focus on heuristically constructing diverse network architectures and put less emphasis on the explicit embedding of the physical generation mechanism between blur kernels and high-resolution (HR) images. To alleviate this issue, we propose a model-driven deep neural network, called KXNet, for blind SISR. Specifically, to solve the classical SISR model, we propose a simple-yet-effective iterative algorithm. Then, by unfolding the involved iterative steps into corresponding network modules, we naturally construct the KXNet. The main specificity of the proposed KXNet is that the entire learning process is fully and explicitly integrated with the inherent physical mechanism underlying this SISR task. Thus, the learned blur kernel has clear physical patterns, and the mutually iterative process between the blur kernel and HR image can soundly guide the KXNet to evolve in the right direction. Extensive experiments on synthetic and real data clearly demonstrate the superior accuracy and generality of our method beyond current representative state-of-the-art blind SISR methods. Code is available at: https://github.com/jiahong-fu/KXNet.
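The classical SISR model referred to above is the degradation y = (k ⊛ x)↓s: blur the HR image with kernel k, then downsample by scale s. A minimal sketch of that forward model (circular FFT convolution; function and variable names are our own):

```python
import numpy as np

def degrade(hr, kernel, scale=2):
    """Blind-SR forward model: blur with `kernel`, then downsample by `scale`."""
    K = np.fft.fft2(kernel, s=hr.shape)                   # kernel spectrum
    blurred = np.real(np.fft.ifft2(K * np.fft.fft2(hr)))  # circular convolution
    return blurred[::scale, ::scale]                      # strided downsampling

rng = np.random.default_rng(0)
hr = rng.standard_normal((16, 16))
identity = np.zeros((3, 3)); identity[0, 0] = 1.0
lr = degrade(hr, identity)
# An identity kernel reduces the model to plain downsampling:
assert np.allclose(lr, hr[::2, ::2])
```

KXNet's stages alternate between estimating `kernel` and recovering `hr` so that this forward model reproduces the observed low-resolution input.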

Keyword :

Blind single image super-resolution Kernel estimation Model-driven Mutual learning Physical generation mechanism

Cite:


GB/T 7714 Fu, Jiahong , Wang, Hong , Xie, Qi et al. KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution [J]. | COMPUTER VISION, ECCV 2022, PT XIX , 2022 , 13679 : 235-253 .
MLA Fu, Jiahong et al. "KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution" . | COMPUTER VISION, ECCV 2022, PT XIX 13679 (2022) : 235-253 .
APA Fu, Jiahong , Wang, Hong , Xie, Qi , Zhao, Qian , Meng, Deyu , Xu, Zongben . KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution . | COMPUTER VISION, ECCV 2022, PT XIX , 2022 , 13679 , 235-253 .
Infrared Action Detection in the Dark via Cross-Stream Attention Mechanism EI SCIE Scopus
Journal article | 2022, 24, 288-300 | IEEE TRANSACTIONS ON MULTIMEDIA
SCOPUS Cited Count: 16

Abstract :

Action detection plays an important role in video understanding and has attracted considerable attention over the last decade. However, current action detection methods are mainly based on visible videos, and few of them consider low-light scenes, where actions are difficult to detect with existing methods, or even with human eyes. Compared with visible videos, infrared videos are more suitable for dark environments and resistant to background clutter. In this paper, we investigate the temporal action detection problem in the dark using infrared videos, which is, to the best of our knowledge, the first attempt in the action detection community. Our model takes the whole video as input: a Flow Estimation Network (FEN) is employed to generate optical flow for the infrared data, and it is optimized with the whole network to obtain action-related motion representations. After feature extraction, the infrared stream and flow stream are fed into a Selective Cross-stream Attention (SCA) module to narrow the performance gap between infrared and visible videos. The SCA emphasizes informative snippets and automatically focuses on the more discriminative stream. We then adopt a snippet-level classifier to obtain action scores for all snippets and link continuous snippets into final detections. All these modules are trained in an end-to-end manner. We collect an Infrared action Detection (InfDet) dataset obtained in the dark and conduct extensive experiments to verify the effectiveness of the proposed method. Experimental results show that our proposed method surpasses state-of-the-art temporal action detection methods designed for visible videos, and it also achieves the best performance compared with other infrared action recognition methods on both the InfAR and Infrared-Visible datasets.
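Linking continuous snippets into detections, the last stage described above, can be done with a simple threshold-and-group pass. A sketch follows; the threshold value and function name are illustrative, and the paper's actual linking rules may differ.

```python
def link_snippets(scores, threshold=0.5):
    """Group consecutive above-threshold snippet scores into (start, end) detections."""
    detections, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i                              # open a new detection
        elif s < threshold and start is not None:
            detections.append((start, i - 1))      # close the current one
            start = None
    if start is not None:                          # detection runs to the end
        detections.append((start, len(scores) - 1))
    return detections

dets = link_snippets([0.1, 0.8, 0.9, 0.2, 0.7, 0.6])
assert dets == [(1, 2), (4, 5)]
```

Each `(start, end)` pair is a candidate temporal action interval; a real pipeline would additionally merge short gaps and score each interval.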

Keyword :

Feature extraction Image recognition Infrared video Optical imaging Proposals selective cross-stream attention Streaming media Task analysis temporal action detection Three-dimensional displays

Cite:


GB/T 7714 Chen, Xu , Gao, Chenqiang , Li, Chaoyu et al. Infrared Action Detection in the Dark via Cross-Stream Attention Mechanism [J]. | IEEE TRANSACTIONS ON MULTIMEDIA , 2022 , 24 : 288-300 .
MLA Chen, Xu et al. "Infrared Action Detection in the Dark via Cross-Stream Attention Mechanism" . | IEEE TRANSACTIONS ON MULTIMEDIA 24 (2022) : 288-300 .
APA Chen, Xu , Gao, Chenqiang , Li, Chaoyu , Yang, Yi , Meng, Deyu . Infrared Action Detection in the Dark via Cross-Stream Attention Mechanism . | IEEE TRANSACTIONS ON MULTIMEDIA , 2022 , 24 , 288-300 .
Address: Xi'an Jiaotong University Library (No. 28, Xianning West Road, Xi'an, Shaanxi; Post Code: 710049). Contact: 029-82667865.
Copyright: Xi'an Jiaotong University Library. Technical support: Beijing Aegean Software Co., Ltd.