BOUNDING THE DIFFERENCE BETWEEN THE VALUES OF ROBUST AND NON-ROBUST MARKOV DECISION PROBLEMS

Ariel Neufeld; Julian Sester

doi:10.1017/jpr.2024.88

Ariel Neufeld*, Julian Sester*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem, where the ambiguity set of probability kernels of the distributionally robust Markov decision process is described by a Wasserstein ball around some reference kernel whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. Our derived upper bound for the difference between the value functions is dimension-free and depends linearly on the radius of the Wasserstein ball.

Original language	English
Journal	Journal of Applied Probability
DOIs	https://doi.org/10.1017/jpr.2024.88
Publication status	Accepted/In press - 2024
Externally published	Yes

Bibliographical note

Publisher Copyright:
© The Author(s), 2024.

ASJC Scopus Subject Areas

Statistics and Probability
General Mathematics
Statistics, Probability and Uncertainty

Keywords

distributionally robust optimization
Markov decision process
reinforcement learning
Wasserstein uncertainty

Cite this

@article{d890b885ec7f4d55929bcf5d8513d085,
  title = "BOUNDING THE DIFFERENCE BETWEEN THE VALUES OF ROBUST AND NON-ROBUST MARKOV DECISION PROBLEMS",
  abstract = "In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem, where the ambiguity set of probability kernels of the distributionally robust Markov decision process is described by a Wasserstein ball around some reference kernel whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. Our derived upper bound for the difference between the value functions is dimension-free and depends linearly on the radius of the Wasserstein ball.",
  keywords = "distributionally robust optimization, Markov decision process, reinforcement learning, Wasserstein uncertainty",
  author = "Ariel Neufeld and Julian Sester",
  note = "Publisher Copyright: {\textcopyright} The Author(s), 2024.",
  year = "2024",
  doi = "10.1017/jpr.2024.88",
  language = "English",
  journal = "Journal of Applied Probability",
  issn = "0021-9002",
  publisher = "Cambridge University Press",
}

TY  - JOUR
T1  - BOUNDING THE DIFFERENCE BETWEEN THE VALUES OF ROBUST AND NON-ROBUST MARKOV DECISION PROBLEMS
AU  - Neufeld, Ariel
AU  - Sester, Julian
N1  - Publisher Copyright: © The Author(s), 2024.
PY  - 2024
Y1  - 2024
N2  - In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem, where the ambiguity set of probability kernels of the distributionally robust Markov decision process is described by a Wasserstein ball around some reference kernel whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. Our derived upper bound for the difference between the value functions is dimension-free and depends linearly on the radius of the Wasserstein ball.
AB  - In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem, where the ambiguity set of probability kernels of the distributionally robust Markov decision process is described by a Wasserstein ball around some reference kernel whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. Our derived upper bound for the difference between the value functions is dimension-free and depends linearly on the radius of the Wasserstein ball.
KW  - distributionally robust optimization
KW  - Markov decision process
KW  - reinforcement learning
KW  - Wasserstein uncertainty
UR  - http://www.scopus.com/inward/record.url?scp=85210098729&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85210098729&partnerID=8YFLogxK
U2  - 10.1017/jpr.2024.88
DO  - 10.1017/jpr.2024.88
M3  - Article
AN  - SCOPUS:85210098729
SN  - 0021-9002
JO  - Journal of Applied Probability
JF  - Journal of Applied Probability
ER  -