Query:
Scholar Name: Lan Xuguang (兰旭光)
Abstract :
Depth maps acquired by either physical sensors or learning methods are often seriously distorted by boundary distortion problems, including missing, fake, and misaligned boundaries (compared with the RGB image). This paper proposes an RGB-guided depth map recovery method to recover true boundaries in seriously distorted depth maps. A unified model is first developed to describe all of these kinds of distorted boundaries in depth maps. Observing distorted boundaries is equivalent to identifying erroneous regions in distorted depth maps, because depth boundaries are essentially formed by contiguous regions with different intensities. Erroneous regions are then identified by separately extracting the local structures of the RGB image and the depth map with Gaussian kernels and comparing their similarity on the basis of the SSIM index. A depth map recovery method is then built on the unified model: it recovers true depth boundaries by iteratively identifying and correcting erroneous regions in the recovered depth map using the unified model and a weighted median filter. Because RGB images generally include additional textural content compared with depth maps, the proposed method further addresses the texture-copy artifact problem by restricting the model to work only around depth boundaries in each iteration. Extensive experiments are conducted on five RGB-depth datasets covering depth map recovery, depth super-resolution, depth estimation enhancement, and depth completion enhancement. The results demonstrate that the proposed method considerably improves both the quantitative and visual quality of recovered depth maps in comparison with fifteen competitive methods. Most object boundaries in the recovered depth maps are corrected accurately and kept sharp and well aligned with those in the RGB images.
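Code sketch :
A minimal, hypothetical sketch (not the authors' implementation) of the identification step the abstract describes: local structures of the RGB image and the depth map are extracted with Gaussian kernels and compared with an SSIM-style similarity, and low-similarity regions are flagged as erroneous. The structure definition, kernel width sigma, and threshold tau below are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def local_structure(img, sigma=1.5):
    """Local structure as Gaussian-smoothed gradient magnitude (assumed)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return gaussian_filter(np.hypot(gx, gy), sigma)

def erroneous_regions(rgb_gray, depth, sigma=1.5, tau=0.5, c=1e-4):
    """Flag pixels whose RGB and depth local structures disagree,
    using the structure term of the SSIM index."""
    s_rgb = local_structure(rgb_gray, sigma)
    s_dep = local_structure(depth, sigma)
    mu_r, mu_d = gaussian_filter(s_rgb, sigma), gaussian_filter(s_dep, sigma)
    var_r = gaussian_filter(s_rgb ** 2, sigma) - mu_r ** 2
    var_d = gaussian_filter(s_dep ** 2, sigma) - mu_d ** 2
    cov = gaussian_filter(s_rgb * s_dep, sigma) - mu_r * mu_d
    ssim = (2.0 * cov + c) / (var_r + var_d + c)
    return ssim < tau  # True where depth boundaries look erroneous

In the full method, the flagged regions would then be corrected iteratively with a weighted median filter restricted to a band around depth boundaries, which is also what suppresses texture-copy artifacts.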
Keyword :
boundary distortion; depth map recovery; depth super-resolution; texture copy artifacts
Cite:
GB/T 7714: Wang, Haotian, Yang, Meng, Lan, Xuguang, et al. Depth Map Recovery Based on a Unified Depth Boundary Distortion Model [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31: 7020-7035.
MLA: Wang, Haotian, et al. "Depth Map Recovery Based on a Unified Depth Boundary Distortion Model." IEEE TRANSACTIONS ON IMAGE PROCESSING 31 (2022): 7020-7035.
APA: Wang, Haotian, Yang, Meng, Lan, Xuguang, Zhu, Ce, & Zheng, Nanning. Depth Map Recovery Based on a Unified Depth Boundary Distortion Model. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31, 7020-7035.
Abstract :
Despite the impressive progress achieved in robotic grasping, robots are not yet skilled at sophisticated tasks (e.g., searching for and grasping a specified target in clutter). Such tasks involve not only grasping but also comprehensive perception of the world (e.g., the relationships among objects). Recently, encouraging results have demonstrated that it is possible to learn such high-level concepts. However, these algorithms are usually data-intensive, and the lack of data severely limits their performance. In this letter, we present a new dataset named REGRAD for learning relationships among objects and grasps. We collect annotations of object poses, segmentations, grasps, and relationships for target-driven relational grasping tasks. Our dataset is collected in the form of both 2D images and 3D point clouds. Moreover, since all the data are generated automatically, new objects can be freely imported for data generation. We also release a real-world validation dataset to evaluate the sim-to-real performance of models trained on REGRAD. Finally, we conduct a series of experiments showing that models trained on REGRAD generalize well to realistic scenarios in terms of both relationship and grasp detection. Our dataset and code can be found online.(1)
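Code sketch :
The abstract lists the annotation types (poses, segmentations, grasps, and relationships, in both 2D and 3D) but not the released file schema; the following is a purely hypothetical record layout for one REGRAD scene, with all field names assumed for illustration.

from dataclasses import dataclass, field
from typing import List, Tuple
import numpy as np

@dataclass
class RegradObject:
    """One annotated object in a scene (all field names assumed)."""
    obj_id: int
    model_name: str
    pose: np.ndarray      # 4x4 homogeneous object pose
    mask: np.ndarray      # HxW segmentation mask
    grasps: np.ndarray    # Kx? grasp candidates attached to this object

@dataclass
class RegradScene:
    """A scene with a 2D image, a 3D point cloud, and relation edges."""
    rgb: np.ndarray                      # HxWx3 image
    points: np.ndarray                   # Nx3 point cloud
    objects: List[RegradObject] = field(default_factory=list)
    # (parent, child): the parent blocks the child and must move first
    relations: List[Tuple[int, int]] = field(default_factory=list)

    def graspable_now(self) -> List[int]:
        """Object ids with no blocking parent in the relation graph."""
        blocked = {child for _, child in self.relations}
        return [o.obj_id for o in self.objects if o.obj_id not in blocked]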
Keyword :
deep learning; manipulation relationship; perception for manipulation; robotic grasping; robot vision
Cite:
GB/T 7714: Zhang, Hanbo, Yang, Deyu, Wang, Han, et al. REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7(2): 2929-2936.
MLA: Zhang, Hanbo, et al. "REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter." IEEE ROBOTICS AND AUTOMATION LETTERS 7.2 (2022): 2929-2936.
APA: Zhang, Hanbo, Yang, Deyu, Wang, Han, Zhao, Binglei, Lan, Xuguang, Ding, Jishiyu, et al. REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter. IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7(2), 2929-2936.
Abstract :
This paper presents INVIGORATE, a robot system that interacts with humans through natural language and grasps a specified object in clutter. The objects may occlude, obstruct, or even stack on top of one another. INVIGORATE embodies several challenges: (i) inferring the target object among other occluding objects from input language expressions and RGB images, (ii) inferring object blocking relationships (OBRs) from the images, and (iii) synthesizing a multi-step plan to ask questions that disambiguate the target object and to grasp it successfully. We train separate neural networks for object detection, visual grounding, question generation, and OBR detection and grasping. They allow for unrestricted object categories and language expressions, subject to the training datasets. However, errors in visual perception and ambiguity in human language are inevitable and negatively impact the robot's performance. To overcome these uncertainties, we build a partially observable Markov decision process (POMDP) that integrates the learned neural network modules. Through approximate POMDP planning, the robot tracks the history of observations and asks disambiguation questions in order to achieve a near-optimal sequence of actions that identifies and grasps the target object. INVIGORATE combines the benefits of model-based POMDP planning and data-driven deep learning. Preliminary experiments with INVIGORATE on a Fetch robot show significant benefits of this integrated approach to object grasping in clutter with natural language interaction. A demonstration video is available online.(1)
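Code sketch :
INVIGORATE plans over observation histories with an approximate POMDP solver; the following is a much-simplified, hypothetical one-step caricature of the idea: maintain a belief over candidate target objects, update it from visual-grounding scores and question answers via Bayes' rule, and grasp only once the belief is concentrated. The threshold and likelihood values are toy assumptions.

import numpy as np

def update_belief(belief, likelihood):
    """Bayes update of the belief over candidate targets."""
    post = belief * likelihood
    return post / post.sum()

def choose_action(belief, grasp_threshold=0.85):
    """Greedy policy: grasp when confident enough, otherwise ask."""
    best = int(np.argmax(belief))
    if belief[best] >= grasp_threshold:
        return ("grasp", best)
    return ("ask", best)  # e.g. "Do you mean this one?"

# toy run: three candidates, grounding scores, then a confirming answer
belief = update_belief(np.ones(3) / 3.0, np.array([0.5, 0.3, 0.2]))
belief = update_belief(belief, np.array([0.9, 0.1, 0.1]))  # user said "yes"
print(choose_action(belief))  # -> ('grasp', 0)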
Cite:
GB/T 7714: Zhang, Hanbo, Lu, Yunfan, Yu, Cunjun, et al. INVIGORATE: Interactive Visual Grounding and Grasping in Clutter [C]. 2021.
MLA: Zhang, Hanbo, et al. "INVIGORATE: Interactive Visual Grounding and Grasping in Clutter." (2021).
APA: Zhang, Hanbo, Lu, Yunfan, Yu, Cunjun, Hsu, David, Lan, Xuguang, & Zheng, Nanning. INVIGORATE: Interactive Visual Grounding and Grasping in Clutter. (2021).
Abstract :
Grasping is an essential skill for robots to interact with humans and the environment. In this paper, we build a vision-based, robust, and real-time robotic grasping approach with a fully convolutional neural network. The main component of our approach is a grasp detection network with oriented anchor boxes as detection priors. Because the orientation of a detected grasp is significant, determining the rotation angle configuration of the gripper, we propose the oriented anchor box mechanism to regress the grasp angle from predefined priors rather than by classification or prior-free regression. With oriented anchor boxes, grasps can be predicted more accurately and efficiently. In addition, to accelerate network training and further improve angle regression, we propose angle matching during training in place of Jaccard-index matching. Fivefold cross-validation results demonstrate that the proposed algorithm achieves accuracies of 98.8% and 97.8% in image-wise and object-wise splits, respectively, and our detection algorithm runs at 67 frames per second (FPS) on a GTX 1080Ti, outperforming all current state-of-the-art grasp detection algorithms on the Cornell Dataset in both speed and accuracy. Robotic experiments demonstrate robustness and generalization to unseen objects in a real-world environment, with average success rates of 90.0% and 84.2% on familiar and unseen objects, respectively, on a Baxter robot platform.
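Code sketch :
A minimal sketch of the two angle-related steps named in the abstract: matching a ground-truth grasp to the anchor closest in orientation (angle matching, in place of Jaccard-index matching) and regressing the grasp angle as an offset from that anchor. The six anchor orientations are an assumed configuration, not the paper's exact setting.

import numpy as np

# six predefined anchor orientations over [-90, 90) degrees (assumed)
ANCHOR_ANGLES = np.arange(-90.0, 90.0, 30.0) + 15.0

def match_anchor(gt_angle):
    """Angle matching: assign the ground truth to the anchor nearest in
    orientation (with wrap-around at 180 degrees) and return the offset."""
    diff = (gt_angle - ANCHOR_ANGLES + 90.0) % 180.0 - 90.0
    k = int(np.argmin(np.abs(diff)))
    return k, diff[k]  # anchor index, regression target

def decode_angle(anchor_idx, offset):
    """Recover the predicted grasp angle from anchor index and offset."""
    return ANCHOR_ANGLES[anchor_idx] + offset

k, t = match_anchor(52.0)
assert np.isclose(decode_angle(k, t), 52.0)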
Keyword :
angle matching; feature extraction; fully convolutional neural network; grasping; oriented anchor box; prediction algorithms; real-time robotic grasping; real-time systems; robot kinematics; training
Cite:
GB/T 7714: Zhang, Hanbo, Zhou, Xinwen, Lan, Xuguang, et al. A Real-Time Robotic Grasping Approach With Oriented Anchor Box [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2021, 51(5): 3014-3025.
MLA: Zhang, Hanbo, et al. "A Real-Time Robotic Grasping Approach With Oriented Anchor Box." IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS 51.5 (2021): 3014-3025.
APA: Zhang, Hanbo, Zhou, Xinwen, Lan, Xuguang, Li, Jin, Tian, Zhiqiang, & Zheng, Nanning. A Real-Time Robotic Grasping Approach With Oriented Anchor Box. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2021, 51(5), 3014-3025.
Abstract :
The development of machine learning algorithms is limited by problems such as weak generalization, poor robustness, and a lack of interpretability. This paper illustrates the important role of reasoning in enabling machines to learn human knowledge and logic and to understand and interpret the world. First, the reasoning mechanisms of the human brain are studied, from cognitive maps, neurons, and reward circuits to brain-inspired intuitive reasoning, neural networks, and reinforcement learning. Then, the current situation, progress, and challenges of machine reasoning methods and their interrelationships are summarized, including intuitive reasoning, commonsense reasoning, causal reasoning, and relational reasoning. Finally, the application prospects and future research directions of machine reasoning are analyzed. © 2021, Science Press. All rights reserved.
Keyword :
brain; learning algorithms; learning systems; reinforcement learning
Cite:
GB/T 7714: Ding, Mengyuan, Lan, Xuguang, Peng, Ru, et al. Progress and Prospect of Machine Reasoning [J]. Pattern Recognition and Artificial Intelligence, 2021, 34(1): 1-13.
MLA: Ding, Mengyuan, et al. "Progress and Prospect of Machine Reasoning." Pattern Recognition and Artificial Intelligence 34.1 (2021): 1-13.
APA: Ding, Mengyuan, Lan, Xuguang, Peng, Ru, & Zheng, Nanning. Progress and Prospect of Machine Reasoning. Pattern Recognition and Artificial Intelligence, 2021, 34(1), 1-13.
Abstract :
Many data sources, such as human poses, lie on low-dimensional manifolds that are smooth and bounded. Learning low-dimensional representations for such data is an important problem. One typical solution is to use encoder-decoder networks. However, due to the lack of effective regularization in the latent space, the learned representations usually do not preserve essential data relations. For example, adjacent video frames in a sequence may be encoded into very different zones across the latent space, with holes in between. This is problematic for many tasks, such as denoising, because slightly perturbed data risk being encoded into very different latent variables, leaving the output unpredictable. To resolve this problem, we first propose a neighborhood geometric structure-preserving variational autoencoder (SP-VAE), which not only maximizes the evidence lower bound but also encourages latent variables to preserve their structure as in the ambient space. Then, we learn a set of small surfaces to approximately bound the learned manifold to deal with holes in the latent space. We extensively validate the properties of our approach through reconstruction, denoising, and random image generation experiments on a number of data sources, including a synthetic Swiss roll, human pose sequences, and facial expression images. The experimental results show that our approach learns smoother manifolds than the baselines. We also apply our approach to human pose refinement and facial expression image interpolation, where it achieves better results than the baselines.
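Code sketch :
One plausible instantiation (an assumption, not the paper's exact formulation) of the structure-preserving term: penalize the mismatch between ambient and latent pairwise distances over each sample's k nearest ambient neighbors, and add it to the usual ELBO terms.

import torch

def sp_regularizer(x, z, k=5):
    """Penalize mismatch between ambient and latent pairwise distances
    over each sample's k nearest ambient neighbors."""
    dx = torch.cdist(x, x)  # ambient pairwise distances (N x N)
    dz = torch.cdist(z, z)  # latent pairwise distances (N x N)
    knn = dx.topk(k + 1, largest=False).indices[:, 1:]  # drop self
    row = torch.arange(x.size(0)).unsqueeze(1).expand_as(knn)
    return ((dz[row, knn] - dx[row, knn]) ** 2).mean()

def sp_vae_loss(recon_loss, kl, x, z, lam=1.0):
    """ELBO terms plus the structure-preserving penalty (lam assumed)."""
    return recon_loss + kl + lam * sp_regularizer(x.flatten(1), z)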
Keyword :
bounded representation; decoding; image reconstruction; interpolation; manifold learning; manifolds; noise reduction; principal component analysis; task analysis; variational autoencoder (VAE)
Cite:
GB/T 7714: Chen, Xingyu, Wang, Chunyu, Lan, Xuguang, et al. Neighborhood Geometric Structure-Preserving Variational Autoencoder for Smooth and Bounded Data Sources [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021.
MLA: Chen, Xingyu, et al. "Neighborhood Geometric Structure-Preserving Variational Autoencoder for Smooth and Bounded Data Sources." IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2021).
APA: Chen, Xingyu, Wang, Chunyu, Lan, Xuguang, Zheng, Nanning, & Zeng, Wenjun. Neighborhood Geometric Structure-Preserving Variational Autoencoder for Smooth and Bounded Data Sources. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021.
Abstract :
Deep reinforcement learning (DRL) algorithms have made remarkable progress on robot manipulation tasks in recent years. However, success in completing a task relies heavily on a specially designed reward function, which requires engineering experience or domain-specific knowledge. To avoid complex reward shaping and make robot learning more general, it is essential to study sparse-reward environments. In this paper, we present two types of challenging goal-conditioned sparse-reward tasks with a 7-DoF robot arm: a target-reaching task with obstacles, and a dynamic-object task in which the target object moves at a certain speed. Based on the Hindsight Trust Region Policy Optimization (HTRPO) algorithm proposed by our research group, we study control performance on these two types of tasks with continuous high-dimensional state spaces. The results show that HTRPO achieves more stable policy performance, a higher success rate, and better sample efficiency than the baseline algorithms TRPO and HPG. However, challenges remain in solving tasks with high moving speeds. © 2020 IEEE.
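Code sketch :
The two task families described above reduce to a goal-conditioned sparse reward plus, for the dynamic-object task, a goal that moves at each step. A minimal sketch with assumed tolerance and time-step values:

import numpy as np

def sparse_reward(achieved_goal, desired_goal, tol=0.05):
    """Goal-conditioned sparse reward: 0 on success, -1 otherwise."""
    hit = np.linalg.norm(achieved_goal - desired_goal) < tol
    return 0.0 if hit else -1.0

def step_dynamic_goal(goal, velocity, dt=0.04):
    """Dynamic-object task: the target moves at a fixed speed each step."""
    return goal + velocity * dt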
Keyword :
deep learning; educational robots; reinforcement learning; robots
Cite:
GB/T 7714: Yang, Deyu, Zhang, Hanbo, Lan, Xuguang. Research on Complex Robot Manipulation Tasks Based on Hindsight Trust Region Policy Optimization [C]. 2020: 4541-4546.
MLA: Yang, Deyu, et al. "Research on Complex Robot Manipulation Tasks Based on Hindsight Trust Region Policy Optimization." (2020): 4541-4546.
APA: Yang, Deyu, Zhang, Hanbo, & Lan, Xuguang. Research on Complex Robot Manipulation Tasks Based on Hindsight Trust Region Policy Optimization. (2020): 4541-4546.
Abstract :
It is challenging for reinforcement learning (RL) to solve dynamic-goal robot tasks in sparse-reward settings. Dynamic Hindsight Experience Replay (DHER) is a method for solving such problems. However, the policy learned by DHER degrades easily, and its success rate is low, especially in complex environments. To help agents learn purposefully in dynamic-goal tasks, avoid blind exploration, and improve the stability and robustness of the policy, we propose a guided evaluation method named GEDHER, which assists the agent in learning under the guidance of evaluated expert demonstrations, building on DHER. In addition, we add Gaussian noise to action sampling to balance exploration and exploitation, preventing the policy from falling into a local optimum. Experimental results show that our method outperforms the original DHER method in terms of both stability and success rate. © 2020, Springer Nature Switzerland AG.
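Code sketch :
The exploration mechanism stated in the abstract, as a minimal sketch: zero-mean Gaussian noise added to the policy's action before execution, clipped to the valid action range (sigma and the bounds are assumptions).

import numpy as np

def noisy_action(policy_action, sigma=0.1, low=-1.0, high=1.0):
    """Add zero-mean Gaussian noise to the policy's action and clip it
    to the valid range; balances exploration and exploitation."""
    noise = np.random.normal(0.0, sigma, size=np.shape(policy_action))
    return np.clip(policy_action + noise, low, high)

print(noisy_action(np.array([0.2, -0.7, 0.4])))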
Keyword :
Gaussian noise (electronic); intelligent robots; reinforcement learning; robotics
Cite:
GB/T 7714: Feng, Chuzhen, Lan, Xuguang, Wan, Lipeng, et al. A Guided Evaluation Method for Robot Dynamic Manipulation [C]. 2020: 161-170.
MLA: Feng, Chuzhen, et al. "A Guided Evaluation Method for Robot Dynamic Manipulation." (2020): 161-170.
APA: Feng, Chuzhen, Lan, Xuguang, Wan, Lipeng, Liang, Zhuo, & Wang, Haoyu. A Guided Evaluation Method for Robot Dynamic Manipulation. (2020): 161-170.
Abstract :
The fundamental problem of Zero-Shot Learning (ZSL) is that the one-hot label space is discrete, which leads to a complete loss of the relationships between seen and unseen classes. Conventional approaches rely on semantic auxiliary information, e.g., attributes, to re-encode each class so as to preserve inter-class associations. However, existing learning algorithms focus only on unifying the visual and semantic spaces without jointly considering the label space. More importantly, because the final classification is conducted in the label space through a compatibility function, the gap between the attribute and label spaces leads to significant performance degradation. Therefore, this paper proposes a novel pathway that uses the label space to directly and jointly reconcile the visual and semantic spaces, named Attributing Label Space (ALS). In the training phase, the one-hot labels of seen classes are directly used as prototypes in a common space into which both images and attributes are mapped. Since the mappings can be optimized independently, the computational complexity is extremely low. In addition, the correlation between semantic attributes has less influence on visual embedding training because features are mapped onto labels instead of attributes. In the testing phase, the discreteness of the label space is relaxed, and the prior one-hot labels are used to denote seen classes and to compose the labels of unseen classes. The label space is therefore very discriminative for Generalized ZSL (GZSL), which is more reasonable and challenging for real-world applications. Extensive experiments on five benchmarks demonstrate improved performance over all compared state-of-the-art methods.
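Code sketch :
A toy, hypothetical instantiation of the ALS idea (not the paper's exact objective): one-hot labels of seen classes act as prototypes, visual features and class attributes are mapped into the label space by two independently optimized ridge regressions (hence the low computational complexity), and an unseen class's label vector is composed from its attributes at test time. All data below are synthetic.

import numpy as np

def ridge_map(X, Y, lam=1.0):
    """Closed-form ridge regression W with X @ W ≈ Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# synthetic demo: n samples, dv visual dims, da attribute dims, C classes
rng = np.random.default_rng(0)
n, dv, da, C = 200, 64, 16, 10
X_v = rng.normal(size=(n, dv))            # visual features
Y = np.eye(C)[rng.integers(0, C, n)]      # one-hot seen-class labels
A = rng.normal(size=(C, da))              # per-class semantic attributes

W_v = ridge_map(X_v, Y)        # visual space    -> label space
W_s = ridge_map(A, np.eye(C))  # attribute space -> label space

# test time: compose an unseen class's label vector from its attributes,
# then score a sample by compatibility in the label space
unseen_attr = rng.normal(size=(1, da))
unseen_label = unseen_attr @ W_s
score = float((X_v[:1] @ W_v) @ unseen_label.T)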
Keyword :
cats; correlation; generalized zero-shot learning; label space; projection learning; prototypes; semantics; testing; training; visualization
Cite:
GB/T 7714: Li, Jin, Lan, Xuguang, Long, Yang, et al. A Joint Label Space for Generalized Zero-Shot Classification [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29: 5817-5831.
MLA: Li, Jin, et al. "A Joint Label Space for Generalized Zero-Shot Classification." IEEE TRANSACTIONS ON IMAGE PROCESSING 29 (2020): 5817-5831.
APA: Li, Jin, Lan, Xuguang, Long, Yang, Liu, Yang, Chen, Xingyu, Shao, Ling, et al. A Joint Label Space for Generalized Zero-Shot Classification. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29, 5817-5831.
Abstract :
Generalized Zero-Shot Learning (GZSL) is a challenging topic with promising prospects in many realistic scenarios. A gating mechanism that discriminates unseen samples from seen samples can decompose the GZSL problem into a conventional Zero-Shot Learning (ZSL) problem and a supervised classification problem. However, training the gate is usually challenging due to the lack of data in the unseen domain. To resolve this problem, in this paper we propose a boundary-based Out-of-Distribution (OOD) classifier that separates the unseen and seen domains using only seen samples for training. First, we learn a shared latent space on a unit hypersphere where the latent distributions of visual features and semantic attributes are aligned class-wise. Then, we find the boundary and the center of the manifold for each class. By leveraging the class centers and boundaries, unseen samples can be separated from seen samples. After that, we use two experts to classify the seen and unseen samples separately. We extensively validate our approach on five popular benchmark datasets: AWA1, AWA2, CUB, FLO, and SUN. The experimental results show that our approach surpasses state-of-the-art approaches by a significant margin. © 2020, Springer Nature Switzerland AG.
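Code sketch :
A simplified, assumed version of the boundary-based OOD decision: on a hyperspherical latent space, compute each seen class's center and set its boundary at the lowest training cosine similarity to that center; a test sample is routed to the seen-domain expert only if it falls inside some class boundary. The boundary rule is an illustrative choice, not necessarily the paper's.

import torch
import torch.nn.functional as F

def fit_class_boundaries(z, labels, num_classes):
    """Per-class center on the unit hypersphere, with the boundary set at
    the lowest training cosine similarity to that center (assumed rule)."""
    z = F.normalize(z, dim=1)
    centers, bounds = [], []
    for c in range(num_classes):
        zc = z[labels == c]
        mu = F.normalize(zc.mean(dim=0, keepdim=True), dim=1)
        centers.append(mu.squeeze(0))
        bounds.append((zc @ mu.t()).min())
    return torch.stack(centers), torch.stack(bounds)

def route_seen(z_test, centers, bounds):
    """True where a test latent falls inside some class boundary; those
    samples go to the seen-domain expert, the rest to the ZSL expert."""
    sim = F.normalize(z_test, dim=1) @ centers.t()
    best, idx = sim.max(dim=1)
    return best >= bounds[idx]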
Keyword :
computer vision; semantics
Cite:
GB/T 7714: Chen, Xingyu, Lan, Xuguang, Sun, Fuchun, et al. A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning [C]. 2020: 572-588.
MLA: Chen, Xingyu, et al. "A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning." (2020): 572-588.
APA: Chen, Xingyu, Lan, Xuguang, Sun, Fuchun, & Zheng, Nanning. A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning. (2020): 572-588.