bibtype J - Journal Article
ARLID 0399560
utime 20240903170627.7
mtime 20131203235959.9
SCOPUS 84889006605
WOS 000328665200004
title (primary) (eng) Approximate Dynamic Programming Based on High Dimensional Model Representation
specification
page_count 18 s.
media_type P
serial
ARLID cav_un_epca*0297163
ISSN 0023-5954
title Kybernetika
volume_id 49
volume 5 (2013)
page_num 720-737
publisher
name Ústav teorie informace a automatizace AV ČR, v. v. i.
keyword approximate dynamic programming
keyword Bellman equation
keyword approximate HDMR minimization
keyword trust region problem
author (primary)
ARLID cav_un_auth*0234872
full_dept (cz) Matematická teorie rozhodování
full_dept (eng) Department of Decision Making Theory
department (cz) MTR
department (eng) MTR
full_dept Department of Decision Making Theory
name1 Pištěk
name2 Miroslav
institution UTIA-B
fullinstit Ústav teorie informace a automatizace AV ČR, v. v. i.
source
url http://library.utia.cas.cz/separaty/2013/AS/pistek-0399560.pdf
cas_special
project
ARLID cav_un_auth*0273082
project_id GAP102/11/0437
agency GA ČR
country CZ
abstract (eng) This article introduces an algorithm for implicit High Dimensional Model Representation (HDMR) of the Bellman equation. This approximation technique reduces memory demands of the algorithm considerably. Moreover, we show that HDMR enables fast approximate minimization which is essential for evaluation of the Bellman function. In each time step, the problem of parametrized HDMR minimization is relaxed into trust region problems, all sharing the same matrix. Finding its eigenvalue decomposition, we effectively achieve estimates of all minima. Their full-domain representation is avoided by HDMR and then the same approach is used recursively in the next time step. An illustrative example of N-armed bandit problem is included. We assume that the newly established connection between approximate HDMR minimization and the trust region problem can be beneficial also to many other applications.
RIV BC
reportyear 2014
mrcbC52 4 O 4o 20231122135935.5
inst_support RVO:67985556
permalink http://hdl.handle.net/11104/0226953
mrcbT16-e COMPUTERSCIENCECYBERNETICS
mrcbT16-f 0.577
mrcbT16-g 0.098
mrcbT16-h 9.IV
mrcbT16-i 0.00191
mrcbT16-j 0.341
mrcbT16-k 655
mrcbT16-l 61
mrcbT16-s 0.348
mrcbT16-z ScienceCitationIndexExpanded
mrcbT16-4 Q2
mrcbT16-B 35.159
mrcbT16-C 31.250
mrcbT16-D Q3
mrcbT16-E Q3
arlyear 2013
mrcbTft Files in repository: pistek-0399560.pdf
mrcbU14 84889006605 SCOPUS
mrcbU34 000328665200004 WOS
mrcbU63 cav_un_epca*0297163 Kybernetika 0023-5954 Vol. 49 No. 5 2013 720 737 Ústav teorie informace a automatizace AV ČR, v. v. i.