bibtype J - Journal Article
ARLID 0543581
utime 20240103225949.4
mtime 20210701235959.9
WOS 000663694300001
SCOPUS 85108291105
DOI 10.1007/s13042-021-01358-w
title (primary) (eng) Towards on-line tuning of adaptive-agent’s multivariate meta-parameter
specification
page_count 15 s.
media_type P
serial
ARLID cav_un_epca*0461048
ISSN 1868-8071
title International Journal of Machine Learning and Cybernetics
volume_id 12
volume 9 (2021)
page_num 2717-2731
publisher
name Springer
keyword Bayesian learning
keyword Adaptive agent
keyword Meta-parameter tuning
keyword Fully probabilistic design
keyword Kullback–Leibler divergence
keyword Dynamic decision making
author (primary)
ARLID cav_un_auth*0101124
name1 Kárný
name2 Miroslav
institution UTIA-B
full_dept (cz) Adaptivní systémy
full_dept (eng) Department of Adaptive Systems
department (cz) AS
department (eng) AS
full_dept Department of Adaptive Systems
fullinstit Ústav teorie informace a automatizace AV ČR, v. v. i.
source
url http://library.utia.cas.cz/separaty/2021/AS/karny-0543581.pdf
source
url https://link.springer.com/article/10.1007/s13042-021-01358-w
cas_special
project
project_id LTC18075
agency GA MŠk
country CZ
ARLID cav_un_auth*0372050
project
project_id CA16228
agency The European Cooperation in Science and Technology (COST)
country XE
ARLID cav_un_auth*0372051
abstract (eng) A decision-making (DM) agent models its environment and quantifies its DM preferences. An adaptive agent models them locally nearby the realisation of the behaviour of the closed DM loop. Due to this, a simple tool set often suffices for solving complex dynamic DM tasks. The inspected Bayesian agent relies on a unified learning and optimisation framework, which works well when tailored by making a range of case-specific options. Many of them can be made off-line. These options concern the sets of involved variables, the knowledge and preference elicitation, structure estimation, etc. Still, some metaparameters need an on-line choice. This concerns, for instance, a weight balancing exploration with exploitation, a weight reflecting agent’s willingness to cooperate, a discounting factor, etc. Such options influence, often vitally, DM quality and their adaptive tuning is needed. Specific ways exist, for instance, a data-dependent choice of a forgetting factor serving to tracking of parameter changes. A general methodology is, however, missing. The paper opens a pathway to it. The solution uses a hierarchical feedback exploiting a generic, DM-related, observable, mismodelling indicator. The paper presents and justifies the theoretical concept, outlines and illustrates its use.
result_subspec WOS
RIV BC
FORD0 20000
FORD1 20200
FORD2 20205
reportyear 2022
mrcbC52 4 A sml 4as 20231122145815.1
inst_support RVO:67985556
permalink http://hdl.handle.net/11104/0320766
confidential S
contract
name Licence to Publish
date 20210528
mrcbC86 3+4 Article Computer Science Artificial Intelligence
mrcbC91 C
mrcbT16-e COMPUTERSCIENCEARTIFICIALINTELLIGENCE
mrcbT16-j 0.597
mrcbT16-s 1.003
mrcbT16-D Q3
mrcbT16-E Q2
arlyear 2021
mrcbTft Files in repository: karny-0543581-licence-IJMLC.pdf
mrcbU14 85108291105 SCOPUS
mrcbU24 PUBMED
mrcbU34 000663694300001 WOS
mrcbU63 cav_un_epca*0461048 International Journal of Machine Learning and Cybernetics 1868-8071 1868-808X Roč. 12 č. 9 2021 2717 2731 Springer