<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="style/detail_T.xsl"?>
<bibitem type="J">   <ARLID>0649859</ARLID> <utime>20260526110123.6</utime><mtime>20260525235959.9</mtime>   <SCOPUS>105034452028</SCOPUS> <WOS>001731023500044</WOS>  <DOI>10.1109/ACCESS.2026.3676711</DOI>           <title language="eng" primary="1">N-to-1 Knowledge Transfer in Reinforcement Learning via Adaptive Q-Function Selection</title>  <specification> <page_count>13 s.</page_count> <media_type>E</media_type> </specification>   <serial><ARLID>cav_un_epca*0461036</ARLID><ISSN>2169-3536</ISSN><title>IEEE Access</title><part_num/><part_title/><volume_id>14</volume_id><volume>1 (2026)</volume><page_num>45964-45976</page_num><publisher><place/><name>Institute of Electrical and Electronics Engineers</name><year/></publisher></serial>    <keyword>Reinforcement learning</keyword>   <keyword>transfer learning</keyword>   <keyword>multi-source knowledge transfer</keyword>   <keyword>adaptive policy selection</keyword>   <keyword>Q-function selection</keyword>    <author primary="1"> <ARLID>cav_un_auth*0333672</ARLID> <name1>Ruman</name1> <name2>Marko</name2> <institution>UTIA-B</institution> <full_dept language="cz">Adaptivní systémy</full_dept> <full_dept language="eng">Department of Adaptive Systems</full_dept> <department language="cz">AS</department> <department language="eng">AS</department> <country>SK</country>  <share>70%</share> <garant>K</garant> <fullinstit>Ústav teorie informace a automatizace AV ČR, v. v. i.</fullinstit> </author> <author primary="0"> <ARLID>cav_un_auth*0469825</ARLID> <name1>Guy</name1> <name2>Tatiana V.</name2> <institution>UTIA-B</institution> <full_dept language="cz">Adaptivní systémy</full_dept> <full_dept>Department of Adaptive Systems</full_dept> <department language="cz">AS</department> <department>AS</department> <country>CZ</country>  <share>30%</share> <garant>S</garant> <fullinstit>Ústav teorie informace a automatizace AV ČR, v. v. i.</fullinstit> </author>   <source> <url>https://library.utia.cas.cz/separaty/2026/AS/guy-0649859.pdf</url> </source> <source> <url>https://ieeexplore.ieee.org/document/11450364/</url>  </source>        <cas_special> <project> <project_id>101168272</project_id> <agency>EC</agency> <country>XE</country>   <ARLID>cav_un_auth*0492513</ARLID> </project> <project> <project_id>CA24136</project_id> <agency>EC</agency> <country>XE</country>  <ARLID>cav_un_auth*0504278</ARLID> </project>  <abstract language="eng" primary="1">This paper tackles multi-source (N-to-1) knowledge transfer in reinforcement learning (RL), where an agent must adaptively solve a new task by leveraging a library of pre-learned skills. We consider a setting where these skills are represented as multiple Q-functions and corresponding environment models, without any explicit labels to guide the selection. To address this scenario, we introduce a theoretically grounded method that dynamically selects the most suitable Q-function at each learning stage. Instead of relying on noisy, short-term signals, our approach makes a farsighted choice by simulating the long-term performance of each skill while simultaneously evaluating the trustworthiness of its underlying world model. This allows the agent to intelligently transition from relying on transferred knowledge to using its own newly acquired policy. The proposed method is evaluated in three scenarios: 1) selection among multiple Q-functions to solve a fixed RL task, 2) adaptation in a dynamically changing environment, and 3) switching from a partially learned Q-function to a newly learned one. In all cases, our method accelerates learning and demonstrates robust adaptation, confirming its effectiveness for scalable multi-source transfer in RL.</abstract>     <result_subspec>WOS</result_subspec> <RIV>BB</RIV> <FORD0>10000</FORD0> <FORD1>10200</FORD1> <FORD2>10201</FORD2>    <reportyear>2027</reportyear>      <num_of_auth>2</num_of_auth>  <inst_support> RVO:67985556 </inst_support>  <permalink>https://hdl.handle.net/11104/0378906</permalink>   <confidential>S</confidential>   <access>A</access>          <unknown tag="mrcbT16-e">TELECOMMUNICATIONS|ENGINEERING.ELECTRICAL&amp;ELECTRONIC|COMPUTERSCIENCE.INFORMATIONSYSTEMS</unknown> <unknown tag="mrcbT16-f">3.9</unknown> <unknown tag="mrcbT16-g">0.8</unknown> <unknown tag="mrcbT16-h">4.1</unknown> <unknown tag="mrcbT16-i">0.3457</unknown> <unknown tag="mrcbT16-j">0.67</unknown> <unknown tag="mrcbT16-k">294150</unknown> <unknown tag="mrcbT16-q">290</unknown> <unknown tag="mrcbT16-s">0.849</unknown> <unknown tag="mrcbT16-y">50.78</unknown> <unknown tag="mrcbT16-x">5.31</unknown> <unknown tag="mrcbT16-3">185847</unknown> <unknown tag="mrcbT16-4">Q1</unknown> <unknown tag="mrcbT16-5">3.200</unknown> <unknown tag="mrcbT16-6">13193</unknown> <unknown tag="mrcbT16-7">Q2</unknown> <unknown tag="mrcbT16-C">62.3</unknown> <unknown tag="mrcbT16-M">0.83</unknown> <unknown tag="mrcbT16-N">Q2</unknown> <unknown tag="mrcbT16-P">64.8</unknown> <arlyear>2026</arlyear>       <unknown tag="mrcbU14"> 105034452028 SCOPUS </unknown> <unknown tag="mrcbU24"> PUBMED </unknown> <unknown tag="mrcbU34"> 001731023500044 WOS </unknown> <unknown tag="mrcbU63"> cav_un_epca*0461036 IEEE Access Roč. 14 č. 1 2026 45964 45976 2169-3536 2169-3536 Institute of Electrical and Electronics Engineers </unknown> </cas_special> </bibitem>