Predictive Technology Lab > Papers > 2003 > Partially Observed Stochastic Shortest Path Problems with Approximate Solution by Neuro-Dynamic Programming

Partially Observed Stochastic Shortest Path Problems with Approximate Solution by Neuro-Dynamic Programming

Table of contents
No headers

We analyze a class of Markov decision processes with imperfect state information that evolve on an infinite time horizon and have a total cost criterion. Particularly, we are interested in problems with stochastic shortest path sturcture, assuming (1) the existence of a policy that guarantees termination with probability one and (2) the property that any policy that fails to guarantee termination has infinite expected cost from some initial state. We also  assume that termination is perfectly recognized. In this paper, we expand upon arguments (given in [11]) for establishing the existence, uniqueness, and characterization of stationary optimal policies and the convergence of value and policy iteration. We also present an illustrative example, involving the search for partially observed target which moves randomly on a grid, and we develop a simulation-based algorithm (based on neuro-dynamic programming techniques) for computing policies that approximately minimize the expected number of stages to complete the search.


Tag page

Files 1

FileSizeDateAttached by 
 PartiallyObserved.pdf
No description
114.09 kB07:30, 11 Jun 2008AdminActions
You must login to post a comment.