Robotic Weight-based Object Relocation in Clutter via Tree-based Q-learning Approach using Breadth and Depth Search Techniques
Giacomo Golluccio; Daniele Di Vito; Alessandro Marino; Gianluca Antonelli
2021-01-01
Abstract
In this paper, the problem of retrieving a target object from a cluttered environment using a mobile manipulator is considered. The task is solved by combining Task and Motion Planning: at the higher level, the task planner is in charge of planning the sequence of objects to relocate, while, at the lower level, the motion planner is in charge of planning the robot movements taking into account robot and environment constraints. In particular, the latter provides feedback to the former about the feasibility of object sequences; this information is exploited to train a Reinforcement Learning agent that, according to an object-weight-based metric, builds a dynamic decision tree where each node represents a sequence of relocated objects and edge values are weights updated via a Q-learning-inspired algorithm. Three learning strategies, differing in how the tree is explored, are analysed. Moreover, each exploration approach is performed using two different tree-search methods: the Breadth-First and Depth-First techniques. Finally, the proposed learning strategies are numerically validated and compared in three scenarios of growing complexity.
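The abstract describes a decision tree whose nodes are sequences of relocated objects and whose edge weights are updated with a Q-learning-inspired rule, explored either breadth-first or depth-first. A minimal sketch of these two ingredients is given below; all names (`Node`, `q_update`, `traverse`) and the learning parameters are illustrative assumptions, not the authors' implementation.

```python
from collections import deque

class Node:
    """Tree node: a (partial) sequence of relocated objects."""
    def __init__(self, sequence):
        self.sequence = sequence   # tuple of relocated object ids
        self.children = {}         # child Node -> edge weight (Q-value)

def q_update(q, reward, best_next_q, alpha=0.1, gamma=0.9):
    """Q-learning-inspired edge-weight update (alpha, gamma assumed)."""
    return q + alpha * (reward + gamma * best_next_q - q)

def traverse(root, depth_first=False):
    """Visit nodes breadth-first (queue) or depth-first (stack)."""
    frontier = deque([root])
    order = []
    while frontier:
        node = frontier.pop() if depth_first else frontier.popleft()
        order.append(node.sequence)
        frontier.extend(node.children)
    return order

# Tiny example: relocate object 'a', then 'b'.
root = Node(())
a, b = Node(('a',)), Node(('a', 'b'))
root.children[a] = 0.0
a.children[b] = 0.0
# Motion-planner feedback (feasible move) rewarded, edge weight updated:
root.children[a] = q_update(root.children[a], reward=1.0, best_next_q=0.0)
print(traverse(root))   # BFS order: [(), ('a',), ('a', 'b')]
```

In this sketch the motion planner's feasibility feedback would supply the `reward` term, so edges leading to infeasible relocation sequences accumulate low weights and are explored less often.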
File: ICAR2021.pdf (Adobe PDF, 190.88 kB; authorized users only)
Type: Pre-print document
License: Publisher's copyright

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.