bibtype J - Journal Article
ARLID 0640730
utime 20260123172010.1
mtime 20251103235959.9
SCOPUS 105025201874
DOI 10.1137/24M1672341
title (primary) (eng) Discrete Time Dynamic Programming Using Tensor Trains
specification
page_count 24 s.
media_type P
serial
ARLID cav_un_epca*0257600
ISSN 1064-8275
title SIAM Journal on Scientific Computing
volume_id 47
volume 6 (2025)
publisher
name SIAM Society for Industrial and Applied Mathematics
keyword control design
keyword Bellman equation
keyword tensor train
author (primary)
ARLID cav_un_auth*0101212
name1 Tichavský
name2 Petr
institution UTIA-B
full_dept (cz) Stochastická informatika
full_dept (eng) Department of Stochastic Informatics
department (cz) SI
department (eng) SI
full_dept Department of Stochastic Informatics
share 40
garant K
fullinstit Ústav teorie informace a automatizace AV ČR, v. v. i.
author
ARLID cav_un_auth*0434606
name1 Straka
name2 O.
country CZ
share 30
author
ARLID cav_un_auth*0472250
name1 Punčochář
name2 I.
country CZ
share 30
source
source_size 759 kB
url https://library.utia.cas.cz/separaty/2025/SI/tichavsky-0640730.pdf
cas_special
project
project_id GA25-18070S
agency GA ČR
country CZ
ARLID cav_un_auth*0484465
project
project_id GA22-11101S
agency GA ČR
country CZ
ARLID cav_un_auth*0435406
abstract (eng) Discrete-time dynamic programming has many applications in decision-making and econometrics. In it, one is looking for a so-called value function that obeys a functional equation called the Bellman equation. The difficulty is that the number of variables of the value function can be very high, and a brute-force iteration of the Bellman equation is not feasible. Some authors solve this problem with deep neural networks, which have disadvantages. In this paper, we propose to handle the (sampled) value function in terms of a tensor train in a rectangular grid. Two novel techniques for the function interpolation were proposed. The decomposition has to be repeated in each Bellman iteration. Since the number of the tensor samples is still astronomically large, we propose to decompose the tensor using the TT-cross technique which only uses a fraction of the tensor elements. In this way, it is possible to find approximate solutions to the problem in dimensions where the traditional methods fail. Next, we propose a smoothing operation that may improve the convergence and a novel way of computing the approximation error and estimating the time when the iteration should be halted.\nThe method's performance is demonstrated in the example of the linear quadratic controller, where the ideal solution is known as the ground truth. Next, the proposed technique is applied to the problem of active fault detection, and its performance is compared to that of the neural network technique.
result_subspec WOS
RIV BB
FORD0 10000
FORD1 10100
FORD2 10103
reportyear 2026
num_of_auth 3
inst_support RVO:67985556
permalink https://hdl.handle.net/11104/0371250
confidential S
mrcbC91 A
mrcbT16-e MATHEMATICS.APPLIED
mrcbT16-f 3.7
mrcbT16-g 0.6
mrcbT16-h 12.3
mrcbT16-i 0.01528
mrcbT16-j 1.669
mrcbT16-k 17896
mrcbT16-q 169
mrcbT16-s 1.634
mrcbT16-y 44.76
mrcbT16-x 2.59
mrcbT16-3 2359
mrcbT16-4 Q1
mrcbT16-5 2.300
mrcbT16-6 254
mrcbT16-7 Q1
mrcbT16-C 89.9
mrcbT16-M 1.67
mrcbT16-N Q1
mrcbT16-P 89.9
arlyear 2025
mrcbU14 105025201874 SCOPUS
mrcbU24 PUBMED
mrcbU34 WOS
mrcbU63 cav_un_epca*0257600 SIAM Journal on Scientific Computing Roč. 47 č. 6 2025 A3161 A3184 1064-8275 1095-7197 SIAM Society for Industrial and Applied Mathematics