
Query:

Scholar Name: 王进军 (Wang Jinjun)

Multi-Reception and Multi-Gradient Discriminator for Image Inpainting EI SCIE Scopus
Journal Article | 2022, 10, 131579-131591 | IEEE ACCESS
SCOPUS Cited Count: 1

Abstract :

Many deep learning methods for image inpainting rely on the encoder-decoder architecture to estimate missing contents. When guidance information from uncorrupted regions cannot be adequately represented or utilized, the encoder may have difficulty handling the rich surrounding or background pixels, and the decoder cannot recover visually sophisticated or realistic content. This paper proposes an effective multi-scale optimization network to alleviate these issues and generate coherent results with fine details. It adaptively encodes multi-receptive-field feature maps and feeds multi-scale outputs into a discriminator to guide training. Specifically, we propose a Multi-Receptive feature maps & masks Selective Fusion (MRSF) operator that can adaptively extract features at different receptive fields to handle images with sophisticated damage. Then a multi-gradient discriminator (MGD) module uses the intermediate features of the discriminator to guide the generator to produce results with natural textures and semantically real contents. Experiments on several benchmark datasets demonstrate that the proposed method can synthesize more realistic and coherent image content.
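
To make the MRSF idea above concrete, the snippet below is a minimal PyTorch sketch of an adaptive multi-receptive-field fusion block, assuming a selective-kernel-style soft gating over parallel dilated convolutions. The module name, dilation rates and gating head are illustrative assumptions for this listing, not the authors' released MRSF code.

```python
# Hypothetical sketch of adaptive multi-receptive-field fusion (not the authors' MRSF code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiReceptiveFusion(nn.Module):
    """Extract features at several receptive fields and fuse them with learned soft weights."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        # One 3x3 branch per receptive field, realised with different dilation rates.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        # Small gating head that predicts one weight per branch from globally pooled features.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, len(dilations), kernel_size=1),
        )

    def forward(self, x):
        feats = torch.stack([b(x) for b in self.branches], dim=1)   # (B, K, C, H, W)
        weights = F.softmax(self.gate(x), dim=1).unsqueeze(2)       # (B, K, 1, 1, 1)
        return (feats * weights).sum(dim=1)                         # adaptively fused features

block = MultiReceptiveFusion(channels=64)
print(block(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```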

Keyword :

Image inpainting; Image reconstruction; multi-gradient discriminator; multi-receptive feature maps and masks selective fusion

Cite:


GB/T 7714 Huang, Wenli, Deng, Ye, Hui, Siqi, et al. Multi-Reception and Multi-Gradient Discriminator for Image Inpainting [J]. IEEE ACCESS, 2022, 10: 131579-131591.
MLA Huang, Wenli, et al. "Multi-Reception and Multi-Gradient Discriminator for Image Inpainting". IEEE ACCESS 10 (2022): 131579-131591.
APA Huang, Wenli, Deng, Ye, Hui, Siqi, Wang, Jinjun. Multi-Reception and Multi-Gradient Discriminator for Image Inpainting. IEEE ACCESS, 2022, 10, 131579-131591.
Hourglass Attention Network for Image Inpainting CPCI-S Scopus
Journal Article | 2022, 13678, 483-501 | COMPUTER VISION - ECCV 2022, PT XVIII
SCOPUS Cited Count: 5

Abstract :

Benefiting from the powerful ability of convolutional neural networks (CNNs) to learn semantic information and texture patterns of images, learning-based image inpainting methods have made noticeable breakthroughs over the years. However, certain inherent defects of CNNs (e.g., local priors and spatially shared parameters) limit their performance when encountering broken images mixed with invalid information. Compared to convolution, attention has a lower inductive bias, and its output is highly correlated with the input, making it more suitable for processing images with various kinds of breakage. Inspired by this, in this paper we propose a novel attention-based network (transformer), called the hourglass attention network (HAN), for image inpainting, which builds an hourglass-shaped attention structure to generate appropriate features for complemented images. In addition, we design a novel attention called Laplace attention, which introduces a Laplace distance prior into the vanilla multi-head attention, allowing the feature matching process to consider not only the similarity of the features themselves but also the distance between features. With the synergy of the hourglass attention structure and Laplace attention, our HAN is able to make full use of hierarchical features to mine effective information from broken images. Experiments on several benchmark datasets demonstrate the superior performance of our proposed approach. The code can be found at github.com/dengyecode/hourglassattention.
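
The Laplace attention described above lends itself to a short sketch: vanilla scaled dot-product attention with an additive distance prior on the logits. This is an assumed 1-D reading of the abstract for illustration; the actual definition lives in the released code at github.com/dengyecode/hourglassattention, and the `scale_b` parameter below is an arbitrary choice.

```python
# Minimal sketch of attention with a Laplace distance prior (assumed reading, not HAN code).
import torch
import torch.nn.functional as F

def laplace_attention(q, k, v, scale_b=8.0):
    """q, k, v: (B, heads, N, d); token positions are 1-D indices 0..N-1."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5         # similarity of the features themselves
    n = q.size(-2)
    pos = torch.arange(n, device=q.device, dtype=q.dtype)
    dist = (pos[:, None] - pos[None, :]).abs()          # pairwise token distance |i - j|
    scores = scores - dist / scale_b                     # Laplace prior: nearer tokens favoured
    attn = F.softmax(scores, dim=-1)
    return attn @ v

q = k = v = torch.randn(2, 4, 16, 32)
print(laplace_attention(q, k, v).shape)  # torch.Size([2, 4, 16, 32])
```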

Keyword :

Attention; Image inpainting; Transformer

Cite:


GB/T 7714 Deng, Ye, Hui, Siqi, Meng, Rongye, et al. Hourglass Attention Network for Image Inpainting [J]. COMPUTER VISION - ECCV 2022, PT XVIII, 2022, 13678: 483-501.
MLA Deng, Ye, et al. "Hourglass Attention Network for Image Inpainting". COMPUTER VISION - ECCV 2022, PT XVIII 13678 (2022): 483-501.
APA Deng, Ye, Hui, Siqi, Meng, Rongye, Zhou, Sanping, Wang, Jinjun. Hourglass Attention Network for Image Inpainting. COMPUTER VISION - ECCV 2022, PT XVIII, 2022, 13678, 483-501.
Hierarchical and Interactive Refinement Network for Edge-Preserving Salient Object Detection SCIE
Journal Article | 2021, 30, 1-14 | IEEE TRANSACTIONS ON IMAGE PROCESSING
WoS CC Cited Count: 4

Abstract :

Salient object detection, which is usually taken as an important preprocessing procedure in various computer vision tasks, has undergone very rapid development with the blooming of Deep Neural Networks (DNNs). However, down-sampling operations such as pooling and striding always blur the final predictions at edges, which seriously degrades the performance of salient object detection. In this paper, we propose a simple yet effective approach, i.e., the Hierarchical and Interactive Refinement Network (HIRN), to preserve edge structures when detecting salient objects. In particular, a novel multi-stage and dual-path network structure is designed to estimate the salient edges and regions from the low-level and high-level feature maps, respectively. As a result, the predicted regions become more accurate by enhancing the weak responses at edges, while the predicted edges become more semantic by suppressing the false positives in the background. Once the salient maps of edges and regions are obtained at the output layers, a novel edge-guided inference algorithm is introduced to further filter the resulting regions along the predicted edges. Extensive experiments on several benchmark datasets show that our method significantly outperforms a variety of state-of-the-art approaches.
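
As a rough illustration of the dual-path interaction described above, the sketch below conditions an edge branch on low-level features plus the current region estimate, and a region branch on high-level features plus the refined edges. The stage layout, heads and feature dimensions are assumptions made for this listing, not the published HIRN architecture.

```python
# Illustrative single dual-path interaction stage (assumed structure, not the HIRN code).
import torch
import torch.nn as nn

class InteractiveStage(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # +1 input channel for the map handed over from the other path.
        self.edge_head = nn.Conv2d(channels + 1, 1, kernel_size=3, padding=1)
        self.region_head = nn.Conv2d(channels + 1, 1, kernel_size=3, padding=1)

    def forward(self, low_feat, high_feat, edge_map, region_map):
        # Edge path: low-level features + current region estimate -> refined edges.
        edge_map = torch.sigmoid(self.edge_head(torch.cat([low_feat, region_map], dim=1)))
        # Region path: high-level features + refined edges -> refined regions.
        region_map = torch.sigmoid(self.region_head(torch.cat([high_feat, edge_map], dim=1)))
        return edge_map, region_map

stage = InteractiveStage(channels=32)
low = high = torch.randn(1, 32, 64, 64)
edge = region = torch.zeros(1, 1, 64, 64)
edge, region = stage(low, high, edge, region)
print(edge.shape, region.shape)  # torch.Size([1, 1, 64, 64]) torch.Size([1, 1, 64, 64])
```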

Keyword :

edge-guided inference; Feature extraction; Hierarchical and Interactive Refinement Network; Image edge detection; Inference algorithms; Object detection; Prediction algorithms; Salient object detection; Semantics; Training

Cite:


GB/T 7714 Zhou, Sanping, Wang, Jinjun, Wang, Le, et al. Hierarchical and Interactive Refinement Network for Edge-Preserving Salient Object Detection [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30: 1-14.
MLA Zhou, Sanping, et al. "Hierarchical and Interactive Refinement Network for Edge-Preserving Salient Object Detection". IEEE TRANSACTIONS ON IMAGE PROCESSING 30 (2021): 1-14.
APA Zhou, Sanping, Wang, Jinjun, Wang, Le, Zhang, Jimuyang, Wang, Fei, Huang, Dong, et al. Hierarchical and Interactive Refinement Network for Edge-Preserving Salient Object Detection. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30, 1-14.
Collaborative Attention Network for Person Re-identification EI
Conference Paper | 2021, 1848 (1) | 2021 4th International Conference on Advanced Algorithms and Control Engineering, ICAACE 2021

Abstract :

The quality of visual feature representation has always been a key factor in many computer vision tasks. In the person re-identification (Re-ID) problem, combining global and local features to improve model performance is becoming a popular approach, because earlier works that used global features alone are very limited at extracting discriminative local patterns from the obtained representation. Some existing works try to collect local patterns by explicitly slicing the global feature into several local pieces in a handcrafted way. By adopting such slicing and duplication operations, models can achieve relatively higher accuracy, but we argue that this still does not take full advantage of partial patterns because of the handcrafted rules and strategies by which the local slices are defined. In this paper, we show that by first over-segmenting the global region with the proposed multi-branch structure, and then learning to combine local features from neighbourhood regions using the proposed Collaborative Attention Network (CAN), the final feature representation for Re-ID can be further improved. Experimental results on several widely used public datasets show that our method outperforms many existing state-of-the-art methods. © Published under licence by IOP Publishing Ltd.
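
One hedged way to picture the over-segment-then-recombine idea is sketched below: the global feature map is pooled into horizontal stripes, and each stripe is re-expressed as an attention-weighted mix of its neighbourhood. The stripe count, window size and gating are illustrative assumptions rather than the CAN definition from the paper.

```python
# Hypothetical sketch: over-segment into stripes, recombine each with its neighbours.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeighbourhoodCombiner(nn.Module):
    def __init__(self, channels, num_stripes=8, window=3):
        super().__init__()
        self.num_stripes = num_stripes
        self.window = window
        self.attn = nn.Linear(channels, window)   # one weight per stripe in the local window

    def forward(self, feat):                       # feat: (B, C, H, W)
        stripes = F.adaptive_avg_pool2d(feat, (self.num_stripes, 1))   # (B, C, S, 1)
        stripes = stripes.squeeze(-1).transpose(1, 2)                   # (B, S, C)
        pad = self.window // 2
        # Pad along the stripe axis so every stripe has a full neighbourhood.
        padded = F.pad(stripes.transpose(1, 2), (pad, pad), mode="replicate").transpose(1, 2)
        outputs = []
        for i in range(self.num_stripes):
            neigh = padded[:, i:i + self.window, :]                     # (B, w, C)
            w = F.softmax(self.attn(stripes[:, i, :]), dim=-1)          # (B, w)
            outputs.append((w.unsqueeze(-1) * neigh).sum(dim=1))        # (B, C)
        return torch.stack(outputs, dim=1)                              # (B, S, C) local features

combiner = NeighbourhoodCombiner(channels=256)
print(combiner(torch.randn(4, 256, 24, 8)).shape)  # torch.Size([4, 8, 256])
```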

Keyword :

Physics

Cite:


GB/T 7714 Li, Wenpeng, Sun, Yongli, Wang, Jinjun, et al. Collaborative Attention Network for Person Re-identification [C]. 2021.
MLA Li, Wenpeng, et al. "Collaborative Attention Network for Person Re-identification". (2021).
APA Li, Wenpeng, Sun, Yongli, Wang, Jinjun, Cao, Junliang, Xu, Han, Yang, Xiangru, et al. Collaborative Attention Network for Person Re-identification. (2021).
Single-Image super-resolution-When model adaptation matters EI SCIE
Journal Article | 2021, 116 | PATTERN RECOGNITION
WoS CC Cited Count: 2

Abstract :

In recent years, impressive advances have been made in single-image super-resolution. Deep learning is behind much of this success. Deep(er) architecture design and external prior modeling are the key ingredients. The internal contents of the low-resolution input image are neglected in deep modeling, despite earlier works that show the power of using such internal priors. In this paper, we propose a variation of deep residual convolutional neural networks, which has been carefully designed for robustness and efficiency in both learning and testing. Moreover, we propose multiple strategies for adapting the model to the internal contents of the low-resolution input image and analyze their strong points and weaknesses. By trading runtime and using internal priors, we achieve improvements of 0.1 to 0.3 dB PSNR over the reported results on standard datasets. Our adaptation especially favors images with repetitive structures or high resolutions. This suggests a more practical usage in which our adaptation approach is applied to sequences or videos whose adjacent frames are strongly correlated in content. Moreover, the approach can be combined with other simple techniques, such as back-projection and enhanced prediction, to realize further improvements. (c) 2021 Published by Elsevier Ltd.
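
The model-adaptation idea reads naturally as a small fine-tuning loop on the input's own internal statistics. The sketch below downscales the low-resolution input to form an internal training pair, takes a few gradient steps, and then super-resolves the original input; the optimiser, loss and step count are illustrative assumptions, not the schedule reported in the paper.

```python
# Sketch of input-specific (internal-prior) adaptation for a pretrained SR model.
import torch
import torch.nn.functional as F

def adapt_and_superresolve(model, lr_image, scale=2, steps=20, lr_rate=1e-5):
    """model: a pretrained x`scale` SR network; lr_image: (1, C, H, W) low-resolution input."""
    model.train()
    opt = torch.optim.Adam(model.parameters(), lr=lr_rate)
    # Internal pair: the LR input acts as the HR target for its own downscaled copy.
    child = F.interpolate(lr_image, scale_factor=1.0 / scale, mode="bicubic", align_corners=False)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.l1_loss(model(child), lr_image)   # fine-tune on the image's internal prior
        loss.backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        return model(lr_image)                      # adapted super-resolution of the real input
```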

Keyword :

Deep convolutional neural network; Internal prior; Model adaptation; Projection skip connection

Cite:


GB/T 7714 Liang, Yudong, Timofte, Radu, Wang, Jinjun, et al. Single-Image super-resolution-When model adaptation matters [J]. PATTERN RECOGNITION, 2021, 116.
MLA Liang, Yudong, et al. "Single-Image super-resolution-When model adaptation matters". PATTERN RECOGNITION 116 (2021).
APA Liang, Yudong, Timofte, Radu, Wang, Jinjun, Zhou, Sanping, Gong, Yihong, Zheng, Nanning. Single-Image super-resolution-When model adaptation matters. PATTERN RECOGNITION, 2021, 116.
Multinetwork Collaborative Feature Learning for Semisupervised Person Reidentification EI SCIE Scopus
Journal Article | 2021 | IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
SCOPUS Cited Count: 7

Abstract :

Person reidentification (Re-ID) aims at matching images of the same identity captured from the disjoint camera views, which remains a very challenging problem due to the large cross-view appearance variations. In practice, the mainstream methods usually learn a discriminative feature representation using a deep neural network, which needs a large number of labeled samples in the training process. In this article, we design a simple yet effective multinetwork collaborative feature learning (MCFL) framework to alleviate the data annotation requirement for person Re-ID, which can confidently estimate the pseudolabels of unlabeled sample pairs and consistently learn the discriminative features of input images. To keep the precision of pseudolabels, we further build a novel self-paced collaborative regularizer to extensively exchange the weight information of unlabeled sample pairs between different networks. Once the pseudolabels are correctly estimated, we take the corresponding sample pairs into the training process, which is beneficial to learn more discriminative features for person Re-ID. Extensive experimental results on the Market1501, DukeMTMC, and CUHK03 data sets have shown that our method outperforms most of the state-of-the-art approaches.
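
A heavily simplified, co-training-style reading of the pseudo-label step is sketched below: two networks score unlabelled image pairs, and only pairs on which both are confident are returned as pseudo-labelled data. The fixed threshold stands in for the paper's self-paced collaborative regularizer and is purely an assumption for illustration.

```python
# Simplified sketch of confident pseudo-label selection across two networks (assumed scheme).
import torch
import torch.nn.functional as F

def confident_pseudo_pairs(net_a, net_b, imgs1, imgs2, thresh=0.9):
    """imgs1, imgs2: batches forming unlabelled image pairs; returns (kept indices, labels)."""
    with torch.no_grad():
        sim_a = F.cosine_similarity(net_a(imgs1), net_a(imgs2), dim=1)   # scores from network A
        sim_b = F.cosine_similarity(net_b(imgs1), net_b(imgs2), dim=1)   # scores from network B
    pos = (sim_a > thresh) & (sim_b > thresh)          # both confident: same identity
    neg = (sim_a < 1 - thresh) & (sim_b < 1 - thresh)  # both confident: different identity
    keep = pos | neg
    labels = pos[keep].long()                          # 1 = same identity, 0 = different
    return keep.nonzero(as_tuple=True)[0], labels
```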

Keyword :

Collaboration; Collaborative work; Deep neural network (DNN); Estimation; Feature extraction; multinetwork collaborative feature learning (MCFL); Neural networks; person reidentification (Re-ID); Semisupervised learning; Training

Cite:


GB/T 7714 Zhou, Sanping, Wang, Jinjun, Shu, Jun, et al. Multinetwork Collaborative Feature Learning for Semisupervised Person Reidentification [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021.
MLA Zhou, Sanping, et al. "Multinetwork Collaborative Feature Learning for Semisupervised Person Reidentification". IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2021).
APA Zhou, Sanping, Wang, Jinjun, Shu, Jun, Meng, Deyu, Wang, Le, Zheng, Nanning. Multinetwork Collaborative Feature Learning for Semisupervised Person Reidentification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021.
Tracking Beyond Detection: Learning a Global Response Map for End-to-End Multi-Object Tracking EI SCIE
Journal Article | 2021, 30, 8222-8235 | IEEE TRANSACTIONS ON IMAGE PROCESSING

Abstract :

Most existing Multi-Object Tracking (MOT) approaches follow the Tracking-by-Detection and Data Association paradigm, in which objects are first detected and then associated in the tracking process. In recent years, deep neural networks have been utilized to obtain more discriminative appearance features for cross-frame association, and noticeable performance improvements have been reported. On the other hand, the Tracking-by-Detection framework is not yet completely end-to-end, which leads to heavy computation and limited performance, especially in the inference (tracking) process. To address this problem, we present an effective end-to-end deep learning framework that can directly take an image sequence or video as input and output the located and tracked objects of the learned types. Specifically, a novel global response network is learned to project multiple objects in the image sequence/video into a continuous response map, from which the trajectory of each tracked object can then be easily picked out. The overall process is similar to how a detector inputs an image and outputs the bounding boxes of each detected object. Experimental results on the MOT16 and MOT17 benchmarks show that our proposed online tracker achieves state-of-the-art performance on several tracking metrics.
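
To illustrate how a trajectory can be read off a continuous response map, the sketch below applies a soft-argmax to per-frame maps of a single object; the network that produces the maps is abstracted away, so this is an assumed post-processing step rather than the paper's global response network.

```python
# Sketch: recover a per-frame centre from response maps via soft-argmax (illustrative only).
import torch

def soft_argmax_2d(resp):
    """resp: (T, H, W) response maps for one object over T frames -> (T, 2) (x, y) centres."""
    t, h, w = resp.shape
    prob = torch.softmax(resp.view(t, -1), dim=-1).view(t, h, w)
    ys = torch.linspace(0, h - 1, h, device=resp.device)
    xs = torch.linspace(0, w - 1, w, device=resp.device)
    y = (prob.sum(dim=2) * ys).sum(dim=1)   # expected row index per frame
    x = (prob.sum(dim=1) * xs).sum(dim=1)   # expected column index per frame
    return torch.stack([x, y], dim=1)       # continuous trajectory across the clip

traj = soft_argmax_2d(torch.randn(8, 60, 100))
print(traj.shape)  # torch.Size([8, 2])
```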

Keyword :

Data models; deep neural network; Feature extraction; global response map; Measurement; Multi-object tracking; Object detection; Target tracking; Task analysis; Trajectory

Cite:


GB/T 7714 Wan, Xingyu, Cao, Jiakai, Zhou, Sanping, et al. Tracking Beyond Detection: Learning a Global Response Map for End-to-End Multi-Object Tracking [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30: 8222-8235.
MLA Wan, Xingyu, et al. "Tracking Beyond Detection: Learning a Global Response Map for End-to-End Multi-Object Tracking". IEEE TRANSACTIONS ON IMAGE PROCESSING 30 (2021): 8222-8235.
APA Wan, Xingyu, Cao, Jiakai, Zhou, Sanping, Wang, Jinjun, Zheng, Nanning. Tracking Beyond Detection: Learning a Global Response Map for End-to-End Multi-Object Tracking. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30, 8222-8235.
Temporal aggregation with clip-level attention for video-based person re-identification EI Scopus
Conference Paper | 2020, 3365-3373 | 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020
SCOPUS Cited Count: 5

Abstract :

Video-based person re-identification (Re-ID) methods can extract richer features from short video clips than image-based ones. Existing methods usually apply simple strategies, such as average/max pooling, to obtain the tracklet-level features, which have proved unable to fully aggregate the information from all video frames. In this paper, we propose a simple yet effective Temporal Aggregation with Clip-level Attention Network (TACAN) to solve the temporal aggregation problem in a hierarchical way. Specifically, a tracklet is first broken into different numbers of clips, and a two-stage temporal aggregation network then produces the tracklet-level feature representation. A novel min-max loss is introduced to learn both a clip-level attention extractor and a clip-level feature representer in the training process. Afterwards, the resulting clip-level weights are used to average the clip-level features, which generates a robust tracklet-level feature representation at the testing stage. Experimental results on four benchmark datasets, including MARS, iLIDS-VID, PRID-2011 and DukeMTMC-VideoReID, show that our TACAN achieves significant improvements compared with state-of-the-art approaches. © 2020 IEEE.
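
The two-stage aggregation described above can be sketched in a few lines: frame features are averaged inside each clip, a small scorer assigns one attention weight per clip, and the tracklet feature is the weighted mean of the clip features. The clip length and linear scorer are assumptions for illustration; the paper additionally trains this with a min-max loss, which is omitted here.

```python
# Minimal sketch of clip-level attention aggregation for one tracklet (illustrative only).
import torch
import torch.nn as nn

class ClipLevelAggregator(nn.Module):
    def __init__(self, feat_dim, clip_len=4):
        super().__init__()
        self.clip_len = clip_len
        self.scorer = nn.Linear(feat_dim, 1)   # clip-level attention extractor

    def forward(self, frame_feats):             # (T, D) frame features of one tracklet
        t, d = frame_feats.shape
        n_clips = t // self.clip_len
        clips = frame_feats[: n_clips * self.clip_len].view(n_clips, self.clip_len, d)
        clip_feats = clips.mean(dim=1)                            # stage 1: frames -> clips
        weights = torch.softmax(self.scorer(clip_feats), dim=0)   # one weight per clip
        return (weights * clip_feats).sum(dim=0)                  # stage 2: clips -> tracklet

agg = ClipLevelAggregator(feat_dim=128)
print(agg(torch.randn(16, 128)).shape)  # torch.Size([128])
```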

Keyword :

Air traffic control; Computer vision

Cite:


GB/T 7714 Li, Mengliu, Xu, Han, Wang, Jinjun, et al. Temporal aggregation with clip-level attention for video-based person re-identification [C]. 2020: 3365-3373.
MLA Li, Mengliu, et al. "Temporal aggregation with clip-level attention for video-based person re-identification". (2020): 3365-3373.
APA Li, Mengliu, Xu, Han, Wang, Jinjun, Li, Wenpeng, Sun, Yongli. Temporal aggregation with clip-level attention for video-based person re-identification. (2020): 3365-3373.
Spatio-temporal Collaborative Convolution for Video Action Recognition EI Scopus
Conference Paper | 2020, 554-558 | 2020 IEEE International Conference on Artificial Intelligence and Computer Applications, ICAICA 2020

Abstract :

Although video action recognition has achieved great progress in recent years, it is still a challenging task due to the huge computational complexity. Designing a lightweight network is a feasible solution, but it may reduce the spatio-temporal information modeling capability. In this paper, we propose a novel spatio-temporal collaborative convolution (denoted 'STC-Conv'), which can efficiently encode spatio-temporal information. STC-Conv collaboratively learns spatial and temporal features in one convolution filter kernel. In short, temporal convolution and spatial convolution are integrated into one STC convolution kernel, which can effectively reduce the model complexity and improve the computational efficiency. STC-Conv is a universal convolution that can be applied to existing 2D CNNs, such as ResNet and DenseNet. The experimental results on the temporal-related dataset Something-Something V1 prove the superiority of our method. Notably, STC-Conv achieves better performance than 3D CNNs at an even lower computational cost than standard 2D CNNs. © 2020 IEEE.
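
Since the abstract does not spell out the kernel construction, the sketch below shows one plausible reading of a "collaborative" spatio-temporal kernel: a temporal (k,1,1) path and a spatial (1,k,k) path sharing the same input and summed at the output, cheaper than a full 3D kernel. This is an assumption for illustration, not the published STC-Conv definition.

```python
# One plausible spatial/temporal collaborative kernel sketch (assumed design, not STC-Conv).
import torch
import torch.nn as nn

class STCConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.temporal = nn.Conv3d(in_ch, out_ch, kernel_size=(k, 1, 1), padding=(k // 2, 0, 0))
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, k, k), padding=(0, k // 2, k // 2))

    def forward(self, x):                            # x: (B, C, T, H, W)
        return self.temporal(x) + self.spatial(x)    # joint spatio-temporal response

conv = STCConv(in_ch=16, out_ch=32)
print(conv(torch.randn(2, 16, 8, 56, 56)).shape)  # torch.Size([2, 32, 8, 56, 56])
```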

Keyword :

Artificial intelligence; Complex networks; Computational efficiency; Convolution

Cite:


GB/T 7714 Li, Xu, Wen, Liqiang, Wang, Jinjun, et al. Spatio-temporal Collaborative Convolution for Video Action Recognition [C]. 2020: 554-558.
MLA Li, Xu, et al. "Spatio-temporal Collaborative Convolution for Video Action Recognition". (2020): 554-558.
APA Li, Xu, Wen, Liqiang, Wang, Jinjun, Zeng, Ming. Spatio-temporal Collaborative Convolution for Video Action Recognition. (2020): 554-558.