
Robotic Weight-based Object Relocation in Clutter via Tree-based Q-learning Approach using Breadth and Depth Search Techniques

Giacomo Golluccio; Daniele Di Vito; Alessandro Marino; Gianluca Antonelli
2021-01-01

Abstract

In this paper, the problem of retrieving a target object from a cluttered environment with a mobile manipulator is considered. The task is solved by combining Task and Motion Planning: at the higher level, the task planner plans the sequence of objects to relocate while, at the lower level, the motion planner plans the robot movements taking into account robot and environment constraints. In particular, the latter provides feedback to the former about the feasibility of object sequences; this information is exploited to train a Reinforcement Learning agent that, according to an object-weight-based metric, builds a dynamic decision tree where each node represents a sequence of relocated objects and edge values are weights updated via a Q-learning-inspired algorithm. Three learning strategies, differing in how the tree is explored, are analysed. Moreover, each exploration approach is performed using two different tree-search methods: the Breadth-First and Depth-First techniques. Finally, the proposed learning strategies are numerically validated and compared in three scenarios of growing complexity.
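The core idea described above — a tree whose nodes are sequences of relocated objects, with edge weights refined by a Q-learning-style update under Breadth-First exploration — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the object weights, the reward model (negative object weight), the learning-rate and discount values, and the `feasible` stand-in for motion-planner feedback are all assumptions made for the example.

```python
from collections import deque

ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)

# object -> weight; relocating heavier objects is penalized (assumed metric)
WEIGHTS = {"a": 1.0, "b": 3.0, "c": 0.5}

def feasible(seq):
    """Stand-in for the motion planner's feasibility feedback.

    In the paper this comes from the lower-level motion planner; here every
    relocation sequence is simply declared feasible.
    """
    return True

def children(seq):
    """Children of a node: extend the sequence by one not-yet-relocated object."""
    return [seq + (obj,) for obj in WEIGHTS if obj not in seq]

def bfs_sweep(q):
    """One Breadth-First sweep over the sequence tree with Q-style edge updates."""
    frontier = deque([()])  # root: empty relocation sequence
    while frontier:
        node = frontier.popleft()
        for child in children(node):
            if not feasible(child):
                continue
            reward = -WEIGHTS[child[-1]]  # weight-based cost of this relocation
            # best outgoing edge weight from the child (0 if still unexplored)
            best_next = max((q.get((child, c), 0.0) for c in children(child)),
                            default=0.0)
            old = q.get((node, child), 0.0)
            q[(node, child)] = old + ALPHA * (reward + GAMMA * best_next - old)
            frontier.append(child)
    return q

q = bfs_sweep({})
# greedy choice of the first object to relocate from the root
best = max(children(()), key=lambda c: q[((), c)])
```

With these assumed weights, the lightest object (`"c"`) gets the least negative edge value, so the greedy first move is to relocate it; a Depth-First variant would differ only in replacing the FIFO frontier with a stack.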
Year: 2021
ISBN: 978-1-6654-3684-7
Files in this record:
File: ICAR2021.pdf (authorized users only)
Type: Pre-print document
License: Publisher's copyright
Size: 190.88 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11580/92258
Citations
  • Scopus: 3