bibtype C - Conference Paper (international conference)
ARLID 0480036
utime 20240103214747.0
mtime 20171019235959.9
WOS 000427151400117
title (primary) (eng) Risk-Sensitive Optimality in Markov Games
specification
page_count 6 pp.
media_type E
serial
ARLID cav_un_epca*0477966
ISBN 978-80-7435-678-0
title Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017)
page_num 684-689
publisher
place Hradec Králové
name University of Hradec Králové
year 2017
keyword two-person Markov games
keyword communicating Markov chains
keyword risk-sensitive optimality
keyword dynamic programming
author (primary)
ARLID cav_un_auth*0101196
full_dept (cz) Ekonometrie
full_dept (eng) Department of Econometrics
department (cz) E
department (eng) E
share 50%
name1 Sladký
name2 Karel
institution UTIA-B
garant K
fullinstit Ústav teorie informace a automatizace AV ČR, v. v. i.
author
ARLID cav_un_auth*0353160
share 50%
name1 Martínez Cortés
name2 V. M.
country MX
source
url http://library.utia.cas.cz/separaty/2017/E/sladky-0480036.pdf
cas_special
project
ARLID cav_un_auth*0292652
project_id GA13-14445S
agency GA ČR
abstract (eng) The article is devoted to risk-sensitive optimality in Markov games. Attention is focused on Markov games evolving on communicating Markov chains with two players with opposite aims. Considering risk-sensitive optimality criteria means that the total reward generated by the game is evaluated by an exponential utility function with a given risk-sensitivity coefficient. In particular, the first player (resp. the second player) tries to maximize (resp. minimize) the long-run risk-sensitive average reward. Observe that if the second player is a dummy, the problem reduces to finding an optimal policy of a Markov decision chain under risk-sensitive optimality. Recall that for a risk-sensitivity coefficient equal to zero we arrive at the traditional optimality criteria. In this article, connections between risk-sensitive and risk-neutral Markov decision chains and Markov game models are studied using discrepancy functions. Explicit formulae for bounds on the long-run risk-sensitive average reward are reported. A policy iteration algorithm for finding suboptimal policies of both players is suggested. The obtained results are illustrated by a numerical example.
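For orientation, the optimality criterion described in the abstract can be sketched as follows (the notation below is illustrative and assumed, not taken from the paper): with risk-sensitivity coefficient \gamma \neq 0, players' actions A_k, B_k, accumulated reward \xi_n = \sum_{k=0}^{n-1} r(X_k, A_k, B_k), and stationary policies \pi (player 1) and \sigma (player 2), the long-run risk-sensitive average reward is
\[
  J^{\gamma}(\pi,\sigma) \;=\; \lim_{n\to\infty} \frac{1}{n\gamma}\,
  \ln \mathbb{E}^{\pi,\sigma}\!\left[ e^{\gamma \xi_n} \right],
\]
where player 1 maximizes and player 2 minimizes J^{\gamma}. Letting \gamma \to 0 recovers the risk-neutral (traditional) average reward \lim_{n\to\infty} \frac{1}{n}\, \mathbb{E}^{\pi,\sigma}[\xi_n] mentioned in the abstract.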
action
ARLID cav_un_auth*0346896
name MME 2017. International Conference Mathematical Methods in Economics /35./
dates 20170913
mrcbC20-s 20170915
place Hradec Králové
country CZ
RIV AH
FORD0 50000
FORD1 50200
FORD2 50202
reportyear 2018
num_of_auth 2
presentation_type PR
inst_support RVO:67985556
permalink http://hdl.handle.net/11104/0276771
cooperation
ARLID cav_un_auth*0351946
name Department of Mathematics, Autonomous Metropolitan University, Iztapalapa Campus, Mexico
institution UAM
country MX
confidential S
mrcbC86 n.a. Proceedings Paper Economics|Operations Research Management Science|Mathematics Interdisciplinary Applications|Social Sciences Mathematical Methods
arlyear 2017
mrcbU12 ISBN 978-80-7435-678-0
mrcbU14 SCOPUS
mrcbU24 PUBMED
mrcbU34 000427151400117 WOS
mrcbU63 cav_un_epca*0477966 Proceedings of the 35th International Conference Mathematical Methods in Economics (MME 2017) 978-80-7435-678-0 684 689 Hradec Králové University of Hradec Králové 2017