Target grasping using vision-language model

Published on: 28 September 2023

Paper Authors: Xinyu Chen, Jian Yang, Zonghan He, Haobin Yang, Qi Zhao, Yuhui Shi


Key Details

Proposes QwenGrasp, a model that combines a large vision-language model with a 6-DOF grasp network

Vision-language model understands instructions and locates targets

Grasp network generates stable 6-DOF grasps on targets

Allows flexible natural language control of grasping

Can assess feasibility of instructions and handle errors

AI generated summary

This paper proposes QwenGrasp, a model that combines a large vision-language model with a 6-DOF grasp network for target-oriented robotic grasping. The vision-language model interprets natural language instructions and locates the target object in the scene. The grasp network then generates stable 6-DOF grasps on that target. Together, the two components allow grasping tasks to be controlled flexibly through natural language.
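The summary does not say how the two components are connected, so the following is only an illustrative sketch of one plausible way to combine them, not the authors' implementation. It assumes the vision-language model returns a 2D bounding box for the target (or nothing when the instruction cannot be satisfied) and that the grasp network returns scored candidates whose centers have been projected into the image. The names `Grasp`, `BBox`, and `select_target_grasp` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in image pixels.
BBox = Tuple[float, float, float, float]


@dataclass
class Grasp:
    """Hypothetical grasp candidate (the full 6-DOF pose is omitted for brevity)."""
    pixel: Tuple[float, float]  # grasp center projected into the image
    score: float                # stability score assigned by the grasp network


def select_target_grasp(bbox: Optional[BBox], grasps: List[Grasp]) -> Optional[Grasp]:
    """Return the best-scoring grasp whose center lies inside the target box.

    Returns None when no target was located (assumed here to mean the
    instruction is infeasible) or when no candidate falls on the target.
    """
    if bbox is None:
        return None
    x_min, y_min, x_max, y_max = bbox
    on_target = [
        g for g in grasps
        if x_min <= g.pixel[0] <= x_max and y_min <= g.pixel[1] <= y_max
    ]
    return max(on_target, key=lambda g: g.score, default=None)


if __name__ == "__main__":
    candidates = [
        Grasp(pixel=(10.0, 10.0), score=0.9),  # stable, but on another object
        Grasp(pixel=(50.0, 50.0), score=0.6),
        Grasp(pixel=(55.0, 52.0), score=0.8),
    ]
    best = select_target_grasp((40.0, 40.0, 60.0, 60.0), candidates)
    print(best)
```

Filtering by the target box before ranking by score keeps the highest-scoring grasp in the scene from being chosen when it lies on a different object than the one the instruction refers to.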
