| bibtype |
A -
Abstract
|
| ARLID |
0638655 |
| utime |
20260203092255.9 |
| mtime |
20250905235959.9 |
| title
(primary) (eng) |
Adapting Value Iteration with Adversarial Risk Analysis for Secure Reinforcement Learning |
| specification |
| page_count |
1 s. |
| media_type |
E |
|
| serial |
| ARLID |
cav_un_epca*0639791 |
| title
|
DYNALIFE : Theoretical Biology Meets Big Data: Dynamical Systems, Machine Learning, and Applications in Life Sciences |
| publisher |
| place |
Istanbul |
| name |
Yildiz Technical University |
| year |
2025 |
|
|
| keyword |
Adversarial Machine Learning |
| keyword |
Adversarial Risk Analysis |
| keyword |
Reinforcement Learning |
| keyword |
Dynamic Programming |
| author
(primary) |
| ARLID |
cav_un_auth*0491463 |
| name1 |
Ružejnikov |
| name2 |
Jurij |
| institution |
UTIA-B |
| full_dept (cz) |
Adaptivní systémy |
| full_dept (eng) |
Department of Adaptive Systems |
| department (cz) |
AS |
| department (eng) |
AS |
| country |
CZ |
| share |
100 |
| garant |
K |
| fullinstit |
Ústav teorie informace a automatizace AV ČR, v. v. i. |
|
| source |
|
| cas_special |
| project |
| project_id |
CA21169 |
| agency |
EU-COST |
| country |
XE |
| ARLID |
cav_un_auth*0452289 |
|
| abstract
(eng) |
The increasing automation of processes via machine learning (ML) algorithms necessitates robust systems, particularly in reinforcement learning (RL), where agents learn through interaction. However, RL systems are highly vulnerable to adversarial attacks that manipulate rewards or observations, compromising their integrity. Traditional adversarial machine learning (AML) often focuses on supervised learning or relies on the common knowledge assumption, which is unsuitable for scenarios where adversaries actively conceal information. Moreover, standard single-agent RL methods often falter in dynamic settings with adapting opponents due to the emergent non-stationarity. This work employs a framework for enhancing the security of RL agents by leveraging Adversarial Risk Analysis (ARA): it modifies the standard Markov Decision Process (MDP) framework to explicitly account for the presence of an adversary affecting state transition and reward dynamics. The modification incorporates the decision maker's (DM's) subjective beliefs about potential adversarial actions. This enables the DM to use Bayesian principles for predicting adversarial actions in dynamic interactions. The framework adapts the value iteration process by forecasting the adversary's actions and averaging over them according to the DM's beliefs, thereby aiming to enhance policy robustness. To model the DM's uncertainty regarding the adversary's strategy, this approach focuses on methods for belief elicitation and updating. Specifically, the DM maintains and updates a probability distribution over the adversary's likely actions based on past observations. For instance, Bayesian methods with Dirichlet priors can be used to estimate action frequencies in discrete settings, allowing the DM to continuously refine its understanding and adapt its strategy even when the adversary's true policy is unknown and potentially evolving.
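The belief-averaged value iteration described above can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the array layouts `P[s, a, b, s']` and `R[s, a, b]`, the state-dependent belief table, and the symmetric Dirichlet prior are all assumptions made for illustration.

```python
import numpy as np


def dirichlet_predictive(counts, prior=1.0):
    """Posterior predictive over adversary actions.

    counts[..., b] holds how often adversary action b was observed;
    `prior` is the concentration of an assumed symmetric Dirichlet prior.
    """
    alpha = np.asarray(counts, dtype=float) + prior
    return alpha / alpha.sum(axis=-1, keepdims=True)


def ara_value_iteration(P, R, belief, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Value iteration on an MDP with the adversary averaged out.

    P[s, a, b, t]: probability of moving from state s to state t when the
                   DM plays a and the adversary plays b (assumed layout).
    R[s, a, b]:    DM reward (assumed layout).
    belief[s, b]:  DM's predictive probability of adversary action b in s.
    """
    # Marginalise the adversary's action under the DM's beliefs.
    P_bar = np.einsum("sb,sabt->sat", belief, P)
    R_bar = np.einsum("sb,sab->sa", belief, R)

    V = np.zeros(P.shape[0])
    for _ in range(max_iter):
        V_new = (R_bar + gamma * (P_bar @ V)).max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy policy with respect to the final value estimate.
    Q = R_bar + gamma * (P_bar @ V)
    return V, Q.argmax(axis=1)


# Hypothetical one-state example: the DM is rewarded for matching the
# adversary's action, and the adversary was seen playing action 1 more often.
P = np.ones((1, 2, 2, 1))                # every move returns to state 0
R = np.array([[[1.0, 0.0], [0.0, 1.0]]])
belief = dirichlet_predictive(np.array([[0, 2]]))  # counts per state
V, policy = ara_value_iteration(P, R, belief, gamma=0.5)
```

In this sketch the beliefs are fixed during planning; in an online setting one would update the counts after each observed adversary action and re-run (or warm-start) the planning step.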
The primary contribution of this work is a methodology demonstrating how ARA facilitates secure RL by enabling the explicit modelling and prediction of, and adaptation to, interfering adversarial actions within the augmented MDP formulation. The efficacy and robustness of this approach are tested through a Coin Game experiment. This strategic environment serves as a testbed for adversarial interaction. By operating under the modified MDP, which explicitly models and adapts to the opponent's behaviour, the agent aims to manage adversarial interactions more effectively than standard dynamic programming baselines, which typically do not incorporate such explicit opponent modelling. This highlights the crucial benefits of incorporating accurate adversarial prediction models and underscores the framework's potential for robustness even in the presence of partially misspecified opponent models. This work offers a principled and practical pathway towards developing more secure and resilient RL agents in adversarial settings. |
| action |
| ARLID |
cav_un_auth*0491465 |
| name |
DYNALIFE 2025 : Conference on QUANTUM INFORMATION AND DECISION MAKING IN LIFE SCIENCES |
| dates |
20250428 |
| mrcbC20-s |
20250429 |
| place |
Prague |
| country |
CZ |
|
| RIV |
BB |
| FORD0 |
10000 |
| FORD1 |
10100 |
| FORD2 |
10101 |
| reportyear |
2026 |
| num_of_auth |
1 |
| presentation_type |
PO |
| inst_support |
RVO:67985556 |
| permalink |
https://hdl.handle.net/11104/0370225 |
| mrcbC61 |
1 |
| cooperation |
| ARLID |
cav_un_auth*0322033 |
| name |
Česká zemědělská univerzita v Praze, Provozně ekonomická fakulta |
| institution |
PEF ČZU |
| country |
CZ |
|
| confidential |
S |
| arlyear |
2025 |
| mrcbU02 |
A2 |
| mrcbU14 |
SCOPUS |
| mrcbU24 |
PUBMED |
| mrcbU34 |
WOS |
| mrcbU56 |
online book of abstracts |
| mrcbU63 |
cav_un_epca*0639791 DYNALIFE : Theoretical Biology Meets Big Data: Dynamical Systems, Machine Learning, and Applications in Life Sciences Istanbul Yildiz Technical University 2025 |
|