mhahsler / pomdp

R package for Partially Observable Markov Decision Processes

R 84.61% TeX 2.71% C++ 12.68%

control-theory markov-decision-processes optimization

pomdp's Issues

Issue after updating to 1.0.0-1

Hello Michael,

I just updated from pomdp 0.99.3 to the new 1.0.0-1, and an R Shiny dashboard tool that I made now throws this error:

Output created: C:/Users/Emile/AppData/Local/Temp/RtmpKs7Zol/file12614629879f5/xxxx.html
Warning: Error in sprintf: invalid format '%.7f'; use format %s for character objects
3:
1: rmarkdown::run

Not sure if there is anything that can be suggested. Is this an issue that can be pursued?

Thanks so much.
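
The error message itself can be reproduced in base R, which may help narrow down where it comes from. The sketch below assumes (this is a guess, not something the report confirms) that a value reaching a `%.7f` format is stored as a character string, for example because a dashboard input was read as text. The variable names are hypothetical.

```r
# Hypothetical probability that arrived as text, e.g. from a Shiny input.
p_text <- "0.85"

# Formatting a character value with %.7f fails with the message from the report.
msg <- tryCatch(
  sprintf("%.7f", p_text),
  error = function(e) conditionMessage(e)
)
print(msg)

# Converting to numeric first makes the same call succeed.
p_num <- as.numeric(p_text)
formatted <- sprintf("%.7f", p_num)
print(formatted)
```

If this is the cause, checking the model fields with `is.numeric()` before solving should point to the offending value.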

Max number of episodes

Hello,

May I ask how many episodes the package supports?

I can run up to 4 episodes, but I get an error when I add more. This is the error:

Error in .is_timedependent(model, "transition_prob") :
Inconsistent POMDP specification. Field transition_prob does not contain data for the appropriate number of episodes.

Thanks,
Hanie
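
The wording of the error suggests that every time-dependent field needs one entry per episode in the horizon. The base R sketch below illustrates that kind of consistency check. It is not the package's internal code, and the five-episode horizon and the field contents are hypothetical.

```r
# Hypothetical five-episode horizon with only four per-episode transition entries.
horizon <- c(first = 1, second = 1, third = 1, fourth = 1, fifth = 1)
transition_prob <- list(
  first = "uniform", second = "uniform", third = "uniform", fourth = "uniform"
)

# The field is consistent only if it has one entry per episode.
consistent <- length(horizon) == length(transition_prob)

# Episodes named in the horizon that have no matching entry in the field.
missing_episodes <- setdiff(names(horizon), names(transition_prob))

print(consistent)
print(missing_episodes)
```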

I pass my POMDP object to transition_matrix (transition_matrix(objPOMDP)$myAction), and the resulting matrix has its rows and columns permuted. I have four states and three actions. The states appear to be reordered to [2, 3, 4, 1], which incidentally matches an alphabetical ordering of the four states, but not the state factor ordering.

This is a real problem because the row and column names of transition_matrix(objPOMDP)$myAction are not reordered [2, 3, 4, 1], so the result is wrong.

I am using heatmap to plot it, and for the plot to be correct I need the following code:
heatmap(transition_matrix(objPOMDP)$myAction[c(2, 3, 4, 1), c(2, 3, 4, 1)], labRow = objPOMDP$model$states, labCol = objPOMDP$model$states, revC = TRUE, Rowv = NA, Colv = NA)

Everything is correct when I use the following: objPOMDP$model$transition_prob[objPOMDP$model$transition_prob$action == "myAction",]

This issue does not arise when using observation_matrix; here, the result maintains the state factor ordering.

Any ideas how this could be happening? Is this a bug?

Thanks,
Emile
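
A workaround that avoids hard-coding c(2, 3, 4, 1) can be sketched in base R. It assumes the situation is exactly as described above: the values are in alphabetical state order while the row and column labels are in the declared order. The state names and the toy matrix are hypothetical.

```r
# Hypothetical states in their declared (non-alphabetical) order.
states <- c("well", "ill", "poor", "dead")
alpha <- sort(states)

# The matrix we expect, labelled and ordered by the declared states.
truth <- matrix(seq_len(16), nrow = 4, dimnames = list(states, states))

# Simulate the reported symptom: alphabetical values under declared-order labels.
garbled <- truth[alpha, alpha]
dimnames(garbled) <- list(states, states)

# Repair: attach the labels the values really belong to, then index by name.
fixed <- garbled
dimnames(fixed) <- list(alpha, alpha)
fixed <- fixed[states, states]
print(fixed)
```

Indexing by name keeps values and labels together, so the same lines work for any number of states.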

Convergence problem

Hi,

I have a finite horizon problem that is not converging. I read in the manual that I should set the initial value function to zero in order to guarantee convergence. Where can I add this to the code? Should I add it here?
sol <- solve_POMDP(model = Total, discount = 0.99, method = "grid")
If so, how?

Thanks

Not converging problem

Hi,

I have a finite horizon problem that is not converging. What does it mean when it does not converge? What can I do to make it converge?

Best,
Hanie

missing beliefs stops plot_policy_graph

Hi Michael,

This is likely not a bug; it may have been an oversight or a design choice. After updating to pomdp 0.99.3, the function plot_policy_graph stopped plotting whenever one of the nodes had no belief attached to it. It used to give a warning such as "missing belief points for node(s): 1, 3, 4,..., Increase parameter fg_points", but it would still plot the resulting graph using a uniform distribution for the belief associated with each of those nodes. The updated version stops instead of giving a warning.

I know I can still produce the graphs by including beliefs = FALSE argument into the function, but that removes all of the beliefs for every node, including those for which there was a well-defined belief.

Is there a specific reason for this design choice? Any way of having a mixed beliefs argument option so plot_policy_graph would still plot the graphs using the old method of assigning a uniform distribution to nodes that did not have a well-defined belief?

Thanks for the consideration!

Importing transition probabilities from excel

Hello,

I am trying to use an imported Excel file for my transition and observation probabilities. I imported 16*16 transition probabilities and 4*16 observation probabilities. Here is the code:
Total <- POMDP(
name = "Total",
discount = 0.99,
horizon = c(first = 1, second = 1, third =1, fourth = 1),
states = c("s1", "s2", "s3", "s4", "s5","s6","s7", "s8", "s9","s10","s11","s12","s13","s14","s15","s16"),
actions = c("Physio", "Gait", "cardiovascular", "nothing"),
observations = c("o1", "o2", "o3", "o4"),
start = "uniform",
transition_prob = list(
first = list(
"Physio" = as.matrix(PF_THA_Trans_0),
"Gait" = as.matrix(PF_THA_Trans_0),
"cardiovascular" = as.matrix(PF_THA_Trans_0),
"nothing" = "uniform"),
second = list(
"Physio" = as.matrix(PF_THA_Trans_1),
"Gait" = as.matrix(PF_THA_Trans_1),
"cardiovascular" = as.matrix(PF_THA_Trans_1),
"nothing" = "uniform"),
third = list(
"Physio" = as.matrix(PF_THA_Trans_2),
"Gait" = as.matrix(PF_THA_Trans_2),
"cardiovascular" = as.matrix(PF_THA_Trans_2),
"nothing" = "uniform"),
fourth = list(
"Physio" = as.matrix(PF_THA_Trans_3),
"Gait" = as.matrix(PF_THA_Trans_3),
"cardiovascular" = as.matrix(PF_THA_Trans_3),
"nothing" = "uniform")),

observation_prob = list(
first = list(
"Physio" = as.matrix(PF_THA_obs_1),
"Gait" = as.matrix(PF_THA_obs_1),
"cardiovascular" = as.matrix(PF_THA_obs_1),
"nothing" = "uniform"),
second = list(
"Physio" = as.matrix(PF_THA_obs_2),
"Gait" = as.matrix(PF_THA_obs_2),
"cardiovascular" = as.matrix(PF_THA_obs_2),
"nothing" = "uniform"),
third = list(
"Physio" = as.matrix(PF_THA_obs_3),
"Gait" = as.matrix(PF_THA_obs_3),
"cardiovascular" = as.matrix(PF_THA_obs_3),
"nothing" = "uniform"),
fourth = list(
"Physio" = as.matrix(PF_THA_obs_4),
"Gait" = as.matrix(PF_THA_obs_4),
"cardiovascular" = as.matrix(PF_THA_obs_4),
"nothing" = "uniform")),

# the reward helper expects: action, start.state, end.state, observation, value

reward = rbind(
R_("Physio", "s1", v = 22.5),
R_("Physio", "s2", v = 67.5),
R_("Physio", "s3", v = 112.5),
R_("Physio", "s4", v = 157.5),
R_("Physio", "s5", v = 202.5),
R_("Physio", "s6", v = 247.5),
R_("Physio", "s7", v = 337.5),
R_("Physio", "s8", v = 382.5),
R_("Physio", "s9", v = 427.5),
R_("Physio", "s10", v = 472.5),
R_("Physio", "s11", v = 517.5),
R_("Physio", "s12", v = 562.5),
R_("Physio", "s13", v = 607.5),
R_("Physio", "s14", v = 652.5),
R_("Physio", "s15", v = 697.5),
R_("Physio", "s16", v = 742.5),
R_("Gait", "s1", v = 22.5),
R_("Gait", "s2", v = 67.5),
R_("Gait", "s3", v = 112.5),
R_("Gait", "s4", v = 157.5),
R_("Gait", "s5", v = 202.5),
R_("Gait", "s6", v = 247.5),
R_("Gait", "s7", v = 337.5),
R_("Gait", "s8", v = 382.5),
R_("Gait", "s9", v = 427.5),
R_("Gait", "s10", v = 472.5),
R_("Gait", "s11", v = 517.5),
R_("Gait", "s12", v = 562.5),
R_("Gait", "s13", v = 607.5),
R_("Gait", "s14", v = 652.5),
R_("Gait", "s15", v = 697.5),
R_("Gait", "s16", v = 742.5),
R_("cardiovascular", "s1", v = 22.5),
R_("cardiovascular", "s2", v = 67.5),
R_("cardiovascular", "s3", v = 112.5),
R_("cardiovascular", "s4", v = 157.5),
R_("cardiovascular", "s5", v = 202.5),
R_("cardiovascular", "s6", v = 247.5),
R_("cardiovascular", "s7", v = 337.5),
R_("cardiovascular", "s8", v = 382.5),
R_("cardiovascular", "s9", v = 427.5),
R_("cardiovascular", "s10", v = 472.5),
R_("cardiovascular", "s11", v = 517.5),
R_("cardiovascular", "s12", v = 562.5),
R_("cardiovascular", "s13", v = 607.5),
R_("cardiovascular", "s14", v = 652.5),
R_("cardiovascular", "s15", v = 697.5),
R_("cardiovascular", "s16", v = 742.5),
R_("nothing", "s1", v = 22.5),
R_("nothing", "s2", v = 67.5),
R_("nothing", "s3", v = 112.5),
R_("nothing", "s4", v = 157.5),
R_("nothing", "s5", v = 202.5),
R_("nothing", "s6", v = 247.5),
R_("nothing", "s7", v = 337.5),
R_("nothing", "s8", v = 382.5),
R_("nothing", "s9", v = 427.5),
R_("nothing", "s10", v = 472.5),
R_("nothing", "s11", v = 517.5),
R_("nothing", "s12", v = 562.5),
R_("nothing", "s13", v = 607.5),
R_("nothing", "s14", v = 652.5),
R_("nothing", "s15", v = 697.5),
R_("nothing", "s16", v = 742.5)
),
max = TRUE,
)
Total
sol <- solve_POMDP(model = Total, discount = 0.99, method = "enum")
sol
policy(sol)

I got this error, and I do not know what it means:
Error in NextMethod("[") : object 'i' not found

I know that all the data are the same for different actions for now but I don't think that is the problem.
Could you please help me with this?
Thanks in advance

solve_SARSOP executable failed

Hi Michael,

I've been using this package for a while now, and recently became interested in trying solve_SARSOP instead of solve_POMDP, but the commands
data("Tiger")
sol <- solve_SARSOP(model = Tiger)
produce an error:
Error in processx::run(path, strsplit(args, " ")[[1]], spinner = spinner, :
System command 'pomdpsol.exe' failed, exit status: -1073741515, stderr empty
Type .Last.error.trace to see where the error occured

I tried that and got this:
Stack trace:

pomdp:::solve_SARSOP(model = Tiger)
base:::do.call(sarsop::pomdpsol, c(list(model = model_file, output = policy_file, ...
(function (model, output = tempfile(), precision = 0.001, timeout = NULL, ...
sarsop:::exec_program("pomdpsol", args, stdout = stdout, stderr = stderr, ...
processx::run(path, strsplit(args, " ")[[1]], spinner = spinner, ...
throw(new_process_error(res, call = sys.call(), echo = echo, ...

x System command 'pomdpsol.exe' failed, exit status: -1073741515, stderr empty

Is this something you've come across in your development? Any help would be appreciated.

Regards.
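
On Windows, a large negative exit status is usually an NTSTATUS code that has been read as a signed 32-bit integer. The base R sketch below converts the reported value to the usual hexadecimal form. The result, 0xC0000135, is STATUS_DLL_NOT_FOUND, which would suggest that pomdpsol.exe could not load a DLL it depends on. The report does not say which DLL, so that part remains an open question.

```r
# Exit status reported by processx, a signed 32-bit value.
status <- -1073741515

# Reinterpret as unsigned. The result exceeds R's integer range,
# so format it as two 16-bit halves.
unsigned <- status + 2^32
hi <- as.integer(unsigned %/% 65536)
lo <- as.integer(unsigned %% 65536)
ntstatus <- sprintf("0x%04X%04X", hi, lo)
print(ntstatus)
```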

Columns and rows of the outputs need to be reordered in function transition_matrix

First of all, thank you for this amazing package!

When I tried to extract a transition probability matrix from the specified model using transition_matrix function in the package, even though the order of column and row names match the pre-specified state names, the values don't match the order of state names. I think you need to add [states,states] after spreading the transition probability dataframe to reorder the elements of the transition probability matrix.

Thanks again!

POMDP_solver

I get the errors
could not find function "R_"
and
object 'AV' not found
The Tiger problem runs successfully, but my own code produces these errors.

Error in transition probability

Hello,

Here is my code for the transition probability part. I got an error, and I do not know what the problem is. I would really appreciate your help. I am new to R; using this package is actually the only reason I am using it. Thank you in advance. This is my transition probability for 4 different episodes:
transition_prob = list(
a = list(
"Physio" = rbind(c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2)),
"Gait" = rbind(c(0,0.4,0.2,0.2,0.2),
c(0,0,0.6,0.2,0.2),
c(0,0.2,0,0.6,0.2),
c(0,0.2,0.2,0,0.6),
c(0,0.2,0.2,0.6,0)),
"cardiovascular" = rbind(c(0.8,0,0.2,0,0),
c(0,0.8,0.2,0,0),
c(0.2,0,0.8,0,0),
c(0,0.2,0,0.8,0),
c(0,0.2,0,0,0.8)),
"nothing = identity"),
b = list(
"Physio" = rbind(c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2)),
"Gait" = rbind(c(0.4,0,0.2,0.2,0.2),
c(0,0.4,0.2,0.2,0.2),
c(0,0.2,0.4,0.2,0.2),
c(0,0.2,0.2,0.4,0.2),
c(0,0.2,0.2,0.2,0.4)),
"cardiovascular" = rbind(c(0.6,0.2,0.2,0,0),
c(0.2,0.6,0.2,0,0),
c(0.2,0.2,0.6,0,0),
c(0,0.2,0.2,0.6,0),
c(0,0.2,0.2,0,0.6)),
"nothing = identity"),
c = list(
"Physio" = rbind(c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2)),
"Gait" = rbind(c(0,0,0.8,0.2,0),
c(0,0,0,0.8,0.2),
c(0,0,0.8,0.,0.2),
c(0,0,0,0.8,0.2),
c(0,0,0,0,1)),
"cardiovascular" = rbind(c(0,0,0.2,0.8,0),
c(0,0,0.2,0,0.8),
c(0,0,0.8,0,0.2),
c(0,0,0.2,0.6,0.2),
c(0,0.2,0.2,0,0.6)),
"nothing = identity"),
d = list(
"Physio" = rbind(c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2),
c(0.2,0.2,0.2,0.2,0.2)),
"Gait" = rbind(c(0,00.6,0,0.2,0.2),
c(0,0,0.3,0.5,0.2),
c(0,0.3,0.5,0.,0.2),
c(0,0,0,0.8,0.2),
c(0,1,0,0,0)),
"cardiovascular" = rbind(c(0,0,0.3,0.5,0.2),
c(0,0,0.3,0,0.7),
c(0,0,3,0,0.7),
c(0,0,0.2,0.6,0.2),
c(0,0.2,0.2,0,0.6)),
"nothing = identity")),
And here is the error:
Error in format_fixed(transition_prob[[a]], digits) :
formating not implemented for NULL

Thanks again.
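
Two things in the listing above can be checked in base R. The sketch assumes that the package looks up each action by name in the list, which would be consistent with the NULL in the error message but is not confirmed here. Written as "nothing = identity", the whole entry is a single unnamed string, so a lookup by the name "nothing" finds nothing. The helper row_sums_ok is hypothetical and only checks that each row of a transition matrix sums to 1.

```r
# Writing the name inside the quotes creates an unnamed element.
wrong <- list("Physio" = "uniform", "nothing = identity")
right <- list("Physio" = "uniform", "nothing" = "identity")

print(is.null(wrong[["nothing"]]))
print(right[["nothing"]])

# Hypothetical helper: TRUE for each row of a transition matrix that sums to 1.
row_sums_ok <- function(m, tol = 1e-8) abs(rowSums(m) - 1) < tol

# Two rows copied from the "cardiovascular" matrix of episode d.
rows_d <- rbind(c(0, 0, 0.3, 0, 0.7),
                c(0, 0, 3, 0, 0.7))
ok <- row_sums_ok(rows_d)
print(ok)
```

The second row contains 3 where 0.3 was probably intended, so it fails the check.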

One test fails: `Error in `file(open = "w+b")`: cannot open the connection` (perhaps a bug in `processx` though)

Just for information, though perhaps it is an external bug:

R version 4.4.0 (2024-04-24) -- "Puppy Cup"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: powerpc-apple-darwin10.0.0d2 (32-bit)

> library("testthat")
> library("pomdp")
> 
> test_check("pomdp")
[ FAIL 1 | WARN 1 | SKIP 0 | PASS 116 ]

══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test-cpp.R:92:1'): (code run outside of `test_that()`) ──────────────
Error in `file(open = "w+b")`: cannot open the connection
Backtrace:
    ▆
 1. └─pomdp::solve_POMDP(Tiger, horizon = 5, discount = 1, method = "enum") at test-cpp.R:92:1
 2.   └─processx::run(...)
 3.     └─processx:::make_buffer()
 4.       └─base::file(open = "w+b")

[ FAIL 1 | WARN 1 | SKIP 0 | PASS 116 ]
Error: Test failures
Execution halted

Getting an error when finite horizon used

Hi Michael,

I'm writing with another issue. I am now using the most up-to-date version of R available for Windows, together with RStudio.

I'll upload a file with a minimum working example concurrently to this description.

When I run the source code to create my POMDP object "POMDP_Min" for this minimum working example, I am then able to run code from the console, for example
sol_POMDP_Inf <- solve_POMDP(POMDP_Min)
which seems to work fine. Now I can view the resulting policy graph (which is so great by the way). But I'm also interested in getting the Finite horizon view as a tree, but I'm getting an error when I try:
sol_POMDP_Fin <- solve_POMDP(POMDP_Min, discount = 1, horizon = 7)
I've also tried writing into the POMDP object a discount of 1 and horizon 7. It seems to be the finite horizon that's caused the error as I've tried using a discount less than 1 but still getting the error.

Based on my understanding of a POMDP, if the infinite horizon problem is solved with no issues, the finite horizon problem should also be solvable, although I could be mistaken about this.

Thanks for your help!
Emile
minwexample_fin - Copy.R.txt
