UTIA - Library

bibtype

J - Journal Article

ARLID

0346161

utime

20240903170621.6

mtime

20100914235959.9

WOS

000280425000019

title (primary) (eng)

Identification of Optimal Policies in Markov Decision Processes

specification

page_count	13 s.

serial

ARLID

cav_un_epca*0297163

ISSN

0023-5954

title

Kybernetika

volume

3 (2010)

page_num

558-570

publisher

name	Ústav teorie informace a automatizace AV ČR, v. v. i.

keyword

finite state Markov decision processes

keyword

discounted and average costs

keyword

elimination of suboptimal policies

author (primary)

ARLID	cav_un_auth*0101196
name1	Sladký
name2	Karel
full_dept (cz)	Ekonometrie
full_dept (eng)	Department of Econometrics
department (cz)	E
department (eng)	E
institution	UTIA-B
full_dept	Department of Econometrics
fullinstit	Ústav teorie informace a automatizace AV ČR, v. v. i.

source

url	http://library.utia.cas.cz/separaty/2010/E/sladky-identification of optimal policies in markov decision processes.pdf

cas_special

project

project_id	GA402/08/0107
agency	GA ČR
country	CZ
ARLID	cav_un_auth*0240545

project

project_id	GA402/07/1113
agency	GA ČR
ARLID	cav_un_auth*0228801

research

CEZ:AV0Z10750506

abstract (eng)

In this note we focus on identifying optimal policies and on eliminating suboptimal policies when minimizing optimality criteria in discrete-time Markov decision processes with finite state space and compact action set. We present a unified approach to value iteration algorithms that makes it possible to generate lower and upper bounds on optimal values, as well as on the current policy. Using the modified value iterations, it is possible to eliminate suboptimal actions and to identify an optimal policy or nearly optimal policies in a finite number of steps without knowing the precise values of the performance function.
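
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the discounted-cost case only, a finite action set rather than a compact one, and the classical MacQueen-type bounds and elimination test. The function name, the array layout (`P[a, s, t]`, `c[s, a]`), the discount factor `beta`, and the numbers in the toy example are all hypothetical.

```python
import numpy as np


def value_iteration_with_elimination(P, c, beta, tol=1e-8, max_iter=10_000):
    """Value iteration for a discounted-cost MDP (minimization).

    Assumed layout (illustrative only):
      P[a, s, t] : probability of moving from state s to t under action a
      c[s, a]    : one-stage cost of action a in state s
      beta       : discount factor, 0 < beta < 1
    Returns lower/upper bounds on the optimal values and a boolean mask
    of actions that have not been proven suboptimal.
    """
    n_actions, n_states, _ = P.shape
    v = np.zeros(n_states)
    active = np.ones((n_states, n_actions), dtype=bool)
    scale = beta / (1.0 - beta)
    lower = upper = v
    for _ in range(max_iter):
        # q[s, a] = c[s, a] + beta * sum_t P[a, s, t] * v[t]
        q = c + beta * np.einsum("ast,t->sa", P, v)
        # Minimize over the actions that are still active.
        v_new = np.where(active, q, np.inf).min(axis=1)
        # MacQueen-type bounds on the optimal values from one-step differences.
        d = v_new - v
        lower = v_new + scale * d.min()
        upper = v_new + scale * d.max()
        # An action is provably suboptimal in s if even its optimistic
        # one-step value exceeds the upper bound on the optimal value of s.
        q_lower = c + beta * np.einsum("ast,t->sa", P, lower)
        active &= q_lower <= upper[:, None] + 1e-12  # slack against rounding
        v = v_new
        if (upper - lower).max() < tol:
            break
    return lower, upper, active


# Toy example with two states and two actions (made-up numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
c = np.array([[1.0, 2.0],
              [3.0, 0.5]])
lower, upper, active = value_iteration_with_elimination(P, c, beta=0.9)
```

In this sketch an action is dropped as soon as the bounds prove it suboptimal, so the set of candidate policies can shrink before the values themselves have converged.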

action

ARLID	cav_un_auth*0263053
name	International Conference on Mathematical Methods in Economy and Industry
place	České Budějovice
dates	15.06.2009-18.06.2009
country	CZ

reportyear

2011

RIV

mrcbC52

4 O 4o 20231122134115.3

permalink

http://hdl.handle.net/11104/0187257

mrcbT16-e

COMPUTERSCIENCECYBERNETICS

mrcbT16-f

0.562

mrcbT16-g

0.219

mrcbT16-h

8.1

mrcbT16-i

0.00125

mrcbT16-j

0.22

mrcbT16-k

463

mrcbT16-l

mrcbT16-q

mrcbT16-s

0.323

mrcbT16-y

20.57

mrcbT16-x

0.48

mrcbT16-4

mrcbT16-B

27.15

mrcbT16-C

23.684

mrcbT16-D

mrcbT16-E

arlyear

2010

mrcbTft

Files in repository: 0346161.pdf

mrcbU34

000280425000019 WOS

mrcbU63

cav_un_epca*0297163 Kybernetika 0023-5954 46 2010 No. 3 2010 558 570 Ústav teorie informace a automatizace AV ČR, v. v. i.