BOUNDING THE DIFFERENCE BETWEEN THE VALUES OF ROBUST AND NON-ROBUST MARKOV DECISION PROBLEMS

Ariel Neufeld*, Julian Sester*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem. The ambiguity set of probability kernels of the distributionally robust Markov decision process is a Wasserstein ball around some reference kernel, whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. The derived upper bound is dimension-free and depends linearly on the radius of the Wasserstein ball.
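The shape of the result described in the abstract can be sketched schematically as follows. All notation here is an assumption introduced for illustration and is not taken from the article: $\hat{P}$ denotes the reference kernel, $\varepsilon$ the Wasserstein radius, $P^{\mathrm{true}}$ the fixed kernel of the non-robust problem, $r$ a reward function, $\alpha \in (0,1)$ a discount factor, and $C$ an unspecified constant. The sketch also assumes a reward-maximizing agent facing a worst-case kernel; the exact constant and the conditions under which the bound holds are given in the article itself.

```latex
% Schematic only: symbols and the sup-inf formulation are assumptions,
% not the article's exact statement.
\begin{align*}
  % Robust value: worst case over the Wasserstein ball around \hat{P}
  V(x) &= \sup_{a}\ \inf_{P \in \mathcal{B}_{\varepsilon}(\hat{P})}
          \mathbb{E}_{P}\Big[\sum_{t=0}^{\infty} \alpha^{t}\, r(X_t, a_t, X_{t+1}) \,\Big|\, X_0 = x\Big], \\
  % Non-robust value: a single fixed kernel inside the ball
  V^{\mathrm{true}}(x) &= \sup_{a}\
          \mathbb{E}_{P^{\mathrm{true}}}\Big[\sum_{t=0}^{\infty} \alpha^{t}\, r(X_t, a_t, X_{t+1}) \,\Big|\, X_0 = x\Big],
          \qquad P^{\mathrm{true}} \in \mathcal{B}_{\varepsilon}(\hat{P}), \\
  % Claimed form of the bound: linear in \varepsilon, with C independent of the dimension
  0 &\le V^{\mathrm{true}}(x) - V(x) \le C\,\varepsilon .
\end{align*}
```

Under these assumptions the lower bound of zero is immediate, since the robust problem takes an infimum over a set that contains $P^{\mathrm{true}}$. The substance of the result is the upper bound, which is linear in $\varepsilon$ with a constant that does not depend on the dimension of the state space.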

Original language: English
Journal: Journal of Applied Probability
Publication status: Accepted/In press - 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s), 2024.

ASJC Scopus Subject Areas

  • Statistics and Probability
  • General Mathematics
  • Statistics, Probability and Uncertainty

Keywords

  • distributionally robust optimization
  • Markov decision process
  • reinforcement learning
  • Wasserstein uncertainty
