bibtype J - Journal Article
ARLID 0602819
utime 20241217090339.0
mtime 20241216235959.9
SCOPUS 85210284955
DOI 10.1109/ACCESS.2024.3497589
title (primary) (eng) Knowledge Transfer in Deep Reinforcement Learning via an RL-Specific GAN-Based Correspondence Function
specification
page_count 15 pp.
media_type E
serial
ARLID cav_un_epca*0461036
ISSN 2169-3536
title IEEE Access
volume_id 12
volume 1 (2024)
page_num 177204-177218
publisher
name Institute of Electrical and Electronics Engineers
keyword Deep learning
keyword Markov decision process
keyword reinforcement learning
keyword transfer learning
keyword knowledge transfer
author (primary)
ARLID cav_un_auth*0333672
name1 Ruman
name2 Marko
institution UTIA-B
department AS
full_dept (cz) Adaptivní systémy
full_dept (eng) Department of Adaptive Systems
department (cz) AS
department (eng) AS
country SK
garant K
fullinstit Ústav teorie informace a automatizace AV ČR, v. v. i.
author
ARLID cav_un_auth*0101092
name1 Guy
name2 Tatiana Valentine
institution UTIA-B
full_dept (cz) Adaptivní systémy
full_dept (eng) Department of Adaptive Systems
department (cz) AS
department (eng) AS
fullinstit Ústav teorie informace a automatizace AV ČR, v. v. i.
source
url https://library.utia.cas.cz/separaty/2024/AS/guy-0602819.pdf
cas_special
project
project_id CA21169
agency EU-COST
country XE
ARLID cav_un_auth*0452289
abstract (eng) Deep reinforcement learning has demonstrated superhuman performance in complex decision-making tasks, but it struggles with generalization and knowledge reuse, key aspects of true intelligence. This article introduces a novel approach that modifies Cycle Generative Adversarial Networks specifically for reinforcement learning, enabling effective one-to-one knowledge transfer between two tasks. Our method augments the loss function with two new components: a model loss, which captures the dynamic relationship between the source and target tasks, and a Q-loss, which identifies the states that significantly influence the target decision policy. Tested on the 2-D Atari game Pong, our method achieved 100% knowledge transfer between identical tasks and, depending on the network architecture, either 100% knowledge transfer or a 30% reduction in training time for a rotated task. In contrast, using standard Generative Adversarial Networks or Cycle Generative Adversarial Networks led to worse performance than training from scratch in most cases. The results demonstrate that the proposed method improves knowledge generalization in deep reinforcement learning.
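The augmented CycleGAN objective described in the abstract can be sketched in PyTorch as below. This is a minimal illustration, not the authors' implementation: the exact forms of the model loss and Q-loss, the learned dynamics models model_src/model_tgt, the target Q-network Q_tgt, and the Q-value-spread weighting are all assumptions filled in from the one-sentence descriptions above.

import torch
import torch.nn.functional as F

def rl_cyclegan_loss(G_st, G_ts, D_s, D_t, s_src, s_tgt,
                     model_src, model_tgt, Q_tgt,
                     lam_cyc=10.0, lam_model=1.0, lam_q=1.0):
    """Combined generator objective: standard CycleGAN terms plus the two
    RL-specific terms named in the abstract (their forms are assumed)."""
    fake_t = G_st(s_src)   # source states translated into the target task
    fake_s = G_ts(s_tgt)   # target states translated into the source task

    # Standard least-squares adversarial loss for both generators.
    adv = F.mse_loss(D_t(fake_t), torch.ones_like(D_t(fake_t))) \
        + F.mse_loss(D_s(fake_s), torch.ones_like(D_s(fake_s)))

    # Standard cycle-consistency loss.
    cyc = F.l1_loss(G_ts(fake_t), s_src) + F.l1_loss(G_st(fake_s), s_tgt)

    # Assumed model loss: translating a state and stepping the target
    # dynamics should agree with stepping the source dynamics and then
    # translating, tying the correspondence to the tasks' dynamics.
    model = F.l1_loss(G_st(model_src(s_src)), model_tgt(fake_t))

    # Assumed Q-loss: weight each state's cycle error by the spread of
    # its target Q-values, so states that matter most to the target
    # policy dominate the learned correspondence.
    q = Q_tgt(fake_t)                                   # (batch, n_actions)
    w = (q.max(dim=1).values - q.min(dim=1).values).detach()
    per_state = F.l1_loss(G_ts(fake_t), s_src, reduction='none')
    qloss = (w * per_state.flatten(1).mean(dim=1)).mean()

    return adv + lam_cyc * cyc + lam_model * model + lam_q * qloss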
result_subspec WOS
RIV IN
FORD0 10000
FORD1 10200
FORD2 10201
reportyear 2025
num_of_auth 2
inst_support RVO:67985556
permalink https://hdl.handle.net/11104/0360153
cooperation
ARLID cav_un_auth*0478849
name Provozně ekonomická fakulta, Česká zemědělská univerzita v Praze
institution PEF CZU
country CZ
confidential S
mrcbC91 A
mrcbT16-e COMPUTERSCIENCEINFORMATIONSYSTEMS|ENGINEERINGELECTRICALELECTRONIC|TELECOMMUNICATIONS
mrcbT16-j 0.698
mrcbT16-D Q3
arlyear 2024
mrcbU14 85210284955 SCOPUS
mrcbU24 PUBMED
mrcbU34 WOS
mrcbU63 cav_un_epca*0461036 IEEE Access 12 1 2024 177204 177218 2169-3536 2169-3536 Institute of Electrical and Electronics Engineers