Author:

Zhang, Shun (Zhang, Shun.) | Gong, Yi-Hong (Gong, Yi-Hong.) | Wang, Jin-Jun (Wang, Jin-Jun.) (Scholars:王进军)

Indexed by:

EI Scopus CSCD PKU

Abstract:

As an important research achievement, deep convolutional neural networks have been widely applied in fields such as computer vision, natural language processing, information retrieval, speech recognition, and semantic understanding. They have attracted a wave of neural network research from both academia and industry and have contributed to the development of artificial intelligence. Convolutional neural networks take the raw data directly as input and automatically learn feature representations from large amounts of training data. Their local connections, weight sharing, and pooling operations effectively decrease network complexity and reduce the number of trainable parameters, and give the model a degree of invariance to translation, distortion, and scale. Many approaches to deep neural networks have been proposed, including increasing the size and complexity of the networks, using larger training sets, and improving network architectures and training methods, in order to simulate the complex hierarchical cognitive attributes of the human brain and narrow the gap to the human brain and visual system, so that machines gain the ability to capture 'abstract concepts'. Deep convolutional neural networks have achieved great success in many computer vision tasks, such as image classification, object detection, face recognition, and person re-identification. In this paper, we first review the history of convolutional neural networks and briefly introduce the M-P neuron model, the Hubel-Wiesel model, the Neocognitron, LeNet for handwriting recognition, and the deep convolutional neural network for image classification in the ImageNet competition.
We then analyze in detail the fundamental principles of deep convolutional neural networks and introduce the mathematical representation and respective functions of the convolution layer, the pooling layer, and the fully connected layer. The paper also focuses on representative work on convolutional neural networks in the following three aspects and uses examples to demonstrate various techniques for improving image classification accuracy. On increasing the number of network layers, we discuss and analyze the architectures of classical convolutional neural networks such as AlexNet, ZF-Net, VGG, GoogLeNet, and ResNet. On increasing the amount of data, we describe the difficulty of enlarging the set of annotated samples manually and the performance gains that data augmentation brings to convolutional neural networks. On improving training methods, we introduce general regularization techniques such as L2 regularization, Dropout, DropConnect, and Maxout; several frequently used neuron activation functions, such as the sigmoid, tanh, ReLU, LReLU, and PReLU functions; several loss functions, such as the softmax loss, the hinge loss, the contrastive loss, and the triplet loss; and the basic idea of batch normalization. Within computer vision, the paper focuses on more recent research progress of convolutional neural networks in image classification, object detection, face recognition, pedestrian recognition, semantic image segmentation, image captioning, image super-resolution, human action recognition, and image retrieval.
From the perspective of the human visual cognitive mechanism, we analyze relevant theoretical results on hierarchical processing in the visual system and the 'global first' visual and cognitive process, along with some theoretical implications for current computational models. Finally, we summarize some remaining problems and challenges in brain-like intelligence research based on deep convolutional neural networks. © 2019, Science Press. All rights reserved.

Keyword:

Brain; Character recognition; Complex networks; Computer vision; Convolution; Deep learning; Deep neural networks; Face recognition; Functions; Image classification; Image enhancement; Image retrieval; Image segmentation; Multilayer neural networks; Natural language processing systems; Network architecture; Network layers; Neural networks; Object detection; Object recognition; Scales (weighing instruments); Semantics; Speech recognition

Author Community:

  • [ 1 ] [Zhang, Shun] School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China
  • [ 2 ] [Gong, Yi-Hong] Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
  • [ 3 ] [Wang, Jin-Jun] Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China

Source:

Jisuanji Xuebao/Chinese Journal of Computers

ISSN: 0254-4164

Year: 2019

Issue: 3

Volume: 42

Page: 453-482

Cited Count:

WoS CC Cited Count: 0

SCOPUS Cited Count: 135

ESI Highly Cited Papers on the List: 0
