Markov Decision Processes with Formal Verification: Mathematical Guarantees for Safe Reinforcement Learning
  • Author(s): Syed Khundmir Azmi
  • Paper ID: 1711043
  • Pages: 418-428
  • Published Date: 30-09-2021
  • Published In: Iconic Research And Engineering Journals
  • Publisher: IRE Journals
  • e-ISSN: 2456-8880
  • Volume/Issue: Volume 5 Issue 3 September-2021
Abstract

This study investigates the use of Markov Decision Processes (MDPs) together with formal verification to improve the safety of reinforcement learning (RL) systems. Its focus is on approaches that provide mathematical assurances for safe exploration of the RL environment, a significant challenge in autonomous decision-making systems. The paper examines how formal verification can ensure that RL agents adhere to specified safety constraints while learning optimal policies. The primary goals are to develop mathematical models that quantify safety risks and to apply these models in practical settings. The results indicate that verifiable safety guarantees can be provided during the exploration process, reducing the likelihood of catastrophic failures. This work contributes to the growing literature on safe RL by combining formal methods with MDPs, offering a new route to reliable, safe, and efficient learning in non-trivial settings.

Keywords

Markov Decision Processes, Formal Verification, Reinforcement Learning, Safety Constraints, Mathematical Guarantees, Safe Exploration, Autonomous Decision-Making, Provable Safety, Formal Methods, Optimal Policies.

Citations

IRE Journals:
Syed Khundmir Azmi, "Markov Decision Processes with Formal Verification: Mathematical Guarantees for Safe Reinforcement Learning," Iconic Research And Engineering Journals, Volume 5, Issue 3, 2021, Pages 418-428

IEEE:
Syed Khundmir Azmi, "Markov Decision Processes with Formal Verification: Mathematical Guarantees for Safe Reinforcement Learning," Iconic Research And Engineering Journals, 5(3)