PAS: Probably Approximate Safety Verification of Reinforcement Learning Policy Using Scenario Optimization

Arambam James Singh; Arvind Easwaran

Nanyang Technological University

Research output: Contribution to journal › Conference article › peer-review

2 Citations (Scopus)

Abstract

With the advancement of machine-learning-based automation, safety verification of such systems is becoming crucial, especially in safety-critical domains such as self-driving cars and robotics. Reinforcement learning (RL) is an emerging machine learning technique with many applications, including in safety-critical domains. The classical approach to safety verification, which makes a binary decision on whether a system is safe or unsafe, is particularly challenging for an RL system. Such an approach generally requires prior knowledge about the system, e.g., its transition model or the set of unsafe states in the environment, which is typically unavailable in a standard RL setting. Instead, this paper addresses the safety verification problem from a quantitative safety perspective, i.e., we quantify the safe behavior of the policy in terms of probability. We formulate the safety verification problem as a chance-constrained optimization problem using barrier certificates. We then use a sampling-based approach called scenario optimization to solve the chance-constrained problem, which gives the desired probabilistic guarantee on the safe behavior of the policy. Our extensive empirical evaluation shows the validity and robustness of our approach in three RL domains.
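
The abstract does not state the paper's exact formulation or sample bound, so the following is only a generic illustration of how scenario optimization turns samples into a probabilistic guarantee. It assumes the standard result for convex scenario programs: with d decision variables and N independent samples, the solution violates more than an epsilon-fraction of unseen constraints with probability at most the binomial tail computed below. The function names scenario_confidence_gap and min_scenarios are hypothetical and not taken from the paper.

```python
from math import comb


def scenario_confidence_gap(n_samples: int, n_vars: int, epsilon: float) -> float:
    """Binomial-tail bound for a convex scenario program (assumed setting).

    Returns an upper bound on the probability that the scenario solution,
    computed from n_samples i.i.d. constraints with n_vars decision
    variables, has violation probability larger than epsilon.
    """
    return sum(
        comb(n_samples, i) * epsilon**i * (1.0 - epsilon) ** (n_samples - i)
        for i in range(n_vars)
    )


def min_scenarios(n_vars: int, epsilon: float, beta: float) -> int:
    """Smallest sample count whose bound does not exceed beta (linear search)."""
    n = n_vars
    while scenario_confidence_gap(n, n_vars, epsilon) > beta:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical setting: 5 certificate parameters, 5% violation level,
    # 99.9% confidence. The numbers are illustrative only.
    print(min_scenarios(n_vars=5, epsilon=0.05, beta=1e-3))
```

Under these assumptions, a smaller violation level epsilon or a larger number of decision variables both increase the number of sampled scenarios needed for the same confidence.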

Original language	English
Pages (from-to)	1745-1753
Number of pages	9
Journal	Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume	2024-May
Publication status	Published - 2024
Externally published	Yes
Event	23rd International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2024 - Auckland, New Zealand Duration: May 6 2024 → May 10 2024

Bibliographical note

Publisher Copyright:
© 2024 International Foundation for Autonomous Agents and Multiagent Systems.

ASJC Scopus Subject Areas

Artificial Intelligence
Software
Control and Systems Engineering

Keywords

Reinforcement learning
Safety verification
Scenario optimization

Cite this

@article{319f3d56fe974a20a30bb4622be5115b,

title = "PAS: Probably Approximate Safety Verification of Reinforcement Learning Policy Using Scenario Optimization",

abstract = "With the advancement of machine learning based automation in the current digital world, the problem of safety verification of such systems is becoming crucial, especially in safety-critical domains like self-driving cars, robotics, etc. Reinforcement learning (RL) is an emerging machine learning technique with many applications, including in safety-critical domains. The classical safety verification approach of making a binary decision on determining whether a system is safe or unsafe is particularly challenging for an RL system. Such an approach generally requires prior knowledge about the system, e.g., the transition model of the system, the set of unsafe states in the environment, etc., which are typically unavailable in a standard RL setting. Instead, this paper addresses the safety verification problem from a quantitative safety perspective, i.e., we quantify the safe behavior of the policy in terms of probability. We formulate the safety verification problem as a chance-constrained optimization using the technique of barrier certificate. We then use a sampling based approach called scenario optimization to solve the chance-constrained problem, which gives the desired probabilistic guarantee on the safe behavior of the policy. Our extensive empirical evaluation shows the validity and robustness of our approach in three RL domains.",

keywords = "Reinforcement learning, Safety verification, Scenario optimization",

author = "Singh, Arambam James and Easwaran, Arvind",

note = "Publisher Copyright: {\textcopyright} 2024 International Foundation for Autonomous Agents and Multiagent Systems.; 23rd International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2024 ; Conference date: 06-05-2024 Through 10-05-2024",

year = "2024",

language = "English",

volume = "2024-May",

pages = "1745--1753",

journal = "Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS",

issn = "1548-8403",

}

TY - JOUR

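T1 - PAS: Probably Approximate Safety Verification of Reinforcement Learning Policy Using Scenario Optimization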

T2 - 23rd International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2024

AU - Singh, Arambam James

AU - Easwaran, Arvind

PY - 2024

Y1 - 2024

N2 - With the advancement of machine learning based automation in the current digital world, the problem of safety verification of such systems is becoming crucial, especially in safety-critical domains like self-driving cars, robotics, etc. Reinforcement learning (RL) is an emerging machine learning technique with many applications, including in safety-critical domains. The classical safety verification approach of making a binary decision on determining whether a system is safe or unsafe is particularly challenging for an RL system. Such an approach generally requires prior knowledge about the system, e.g., the transition model of the system, the set of unsafe states in the environment, etc., which are typically unavailable in a standard RL setting. Instead, this paper addresses the safety verification problem from a quantitative safety perspective, i.e., we quantify the safe behavior of the policy in terms of probability. We formulate the safety verification problem as a chance-constrained optimization using the technique of barrier certificate. We then use a sampling based approach called scenario optimization to solve the chance-constrained problem, which gives the desired probabilistic guarantee on the safe behavior of the policy. Our extensive empirical evaluation shows the validity and robustness of our approach in three RL domains.

KW - Reinforcement learning

KW - Safety verification

KW - Scenario optimization

UR - http://www.scopus.com/inward/record.url?scp=85196398827&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85196398827&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85196398827

SN - 1548-8403

VL - 2024-May

SP - 1745

EP - 1753

JO - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS

JF - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS

Y2 - 6 May 2024 through 10 May 2024

ER -
