The macq from ai-planning

Trace generation backward from a given goal state

A baseline backward search using depth-first search
From a given goal state, g
Let g be a single state drawn from G, the set of all states that satisfy the goal description.

Notes

One of the earliest examples
Focuses on operator refinement

References

Learning by Experimentation: The Operator Refinement Method

Bonet & Geffner (ECAI'20, arXiv'21)

Notes

Just takes state IDs and transitions as input
Produces lifted representation, and does so using SAT
Second one seems to improve on the first, and does so using ASP
We may want to stick with the first initially

References

Mirror TraceList in an ObservationList object

Created when you tokenize an entire TraceList
- Probably best to subclass TraceList
What's used in the extraction methods
Store the (class) type of observation tokens found in the traces
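
A minimal sketch of how the mirroring could look, assuming both lists are plain `list` subclasses. The `TraceList` here is only a stand-in, and `IdentityObservation`, the `tokenize` signature, and the `type` attribute are hypothetical names chosen for illustration, not the library's API.

```python
from typing import Any, Callable, FrozenSet, List, Tuple

Step = Tuple[str, FrozenSet[str]]  # hypothetical: (action name, state fluents)


class TraceList(list):
    """Stand-in for the real TraceList: a list of traces (lists of steps)."""

    def tokenize(self, token_cls: Callable[[Step], Any]) -> "ObservationList":
        # Apply the token class to every step of every trace.
        observations = [[token_cls(step) for step in trace] for trace in self]
        return ObservationList(observations, token_cls)


class ObservationList(TraceList):
    """Mirrors TraceList, but remembers which token class produced it."""

    def __init__(self, observations: List[list], token_type: Any):
        super().__init__(observations)
        self.type = token_type  # the (class) type of the observation tokens


class IdentityObservation:
    """Hypothetical token that exposes the step unchanged."""

    def __init__(self, step: Step):
        self.action, self.state = step


traces = TraceList([[("pickup", frozenset({"clear_a"}))]])
obs = traces.tokenize(IdentityObservation)
```

An extraction method could then check `obs.type` before running, to reject observation lists built from a token class it does not support.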

Notes

Assumes access to a simulator

References

Autonomous Learning of Action Models for Planning

More Detailed API/Structure

Functions (no class):

make_generator (from initial Usage: Generator)
generate (from initial Usage: Generate)
extract (from initial Usage: Extract)

Other Types:

state: list of fluents

Class: Generator

Attributes:
- determinism
  - model.determinism.FULLY
  - model.determinism.NON
  - model.determinism.PROB
- fluent_observability:
  - model.fluentobs.FULLY
  - model.fluentobs.PARTIAL
  - model.fluentobs.RANDOM
- parameterized: {atomic_fluents, atomic_actions}
- action_observability: {names, params, pre, eff}
- state_noise: some int between 0-100%
- action_noise: some int between 0-100%
- rationality
  - model.rational.YES
  - model.rational.NO
  - model.rational.BOUNDED
Functions:
- configure(length, use_goal, diversity, method, action_probs=None, missing_fluents=None)
  - action_probs: {action: {effect: prob, ...}, ...} - used only in non-deterministic models to (optionally) specify the probabilities of the effect of each action
  - missing_fluents - list of (optional) fluents the user wants to make invisible if using partial observability.
  - no parameter needed, but if fluent observability is set to RANDOM, then randomly generate which fluents are missing
  - no parameter needed, but remove the visibility of action names, parameters, preconditions and effects appropriately according to the action_observability attribute
  - no parameter needed, but take rationality into account

Class: Predicate

Attributes:
- name
- object(s)

Class: Effect

Attributes:
- name
- object(s)
- probability
- func

Class: Action

Attributes:
- name
- parameters/objects(that the action is acting on)
- preconditions
- effects

Class: Step

Attributes:
- action
- fluents (that the action is acting on)
- state (all fluents prior to action)

Class: ObservationToken

Attributes:
- method
Functions:
- init(method) - takes an enum to determine the method to use (or a function ref / lambda func?)
- tokenize(action, state) - use self.method to generate an obs token for the action-state pair

Class: Trace (Indexable Class) - Each index is a Step

Attributes:
- steps
- num_fluents
- fluents
- actions
Functions:
- get_prev_states(action)
- get_post_states(action)
- Or get_effects(action)
- get_total_cost()
- get_cost_range(start=0, end)
- get_usage(action)
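
As a rough illustration of the Trace outline above, here is a sketch in which actions are reduced to their names and every step has an assumed unit cost. The `Step` fields and the reading of "usage" as a fraction of steps are assumptions, not settled design.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Step:
    action: str            # hypothetical: actions reduced to their names
    state: FrozenSet[str]  # fluents true before the action
    cost: int = 1          # assumed unit cost


class Trace:
    def __init__(self, steps: List[Step]):
        self.steps = list(steps)

    def __getitem__(self, index):
        return self.steps[index]  # each index is a Step

    def __len__(self):
        return len(self.steps)

    @property
    def actions(self):
        return {s.action for s in self.steps}

    @property
    def fluents(self):
        return set().union(*(s.state for s in self.steps))

    @property
    def num_fluents(self):
        return len(self.fluents)

    def get_prev_states(self, action):
        return [s.state for s in self.steps if s.action == action]

    def get_post_states(self, action):
        # The state after step i is the state recorded at step i + 1.
        return [self.steps[i + 1].state
                for i, s in enumerate(self.steps[:-1]) if s.action == action]

    def get_total_cost(self):
        return sum(s.cost for s in self.steps)

    def get_usage(self, action):
        # Fraction of steps in which the action was taken.
        if not self.steps:
            return 0.0
        return sum(s.action == action for s in self.steps) / len(self.steps)


trace = Trace([
    Step("pickup", frozenset({"clear_a", "handempty"})),
    Step("stack", frozenset({"holding_a"})),
    Step("pickup", frozenset({"clear_b", "handempty"})),
])
```

Note that in this sketch the final step has no recorded successor, so `get_post_states` cannot report a post-state for it.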

Class: TraceList (Indexable Class)

Attributes:
- traces - list of Trace objects
- generator - the generator function used to generate the TraceList, can be None
Functions:
- generate_more(num) - uses the generator function to generate more Traces while preserving the original
- get_usage(action)
- iter
- getitem
- len
- ... list methods

Class: Model

Attributes:
- fluents
- actions
- initial_state
- goal
- determinism
  - model.determinism.FULLY
  - model.determinism.NON
  - model.determinism.PROB
- fluent_observability:
  - model.fluentobs.FULLY
  - model.fluentobs.PARTIAL
  - model.fluentobs.RANDOM
- parameterized: {atomic_fluents, atomic_actions}
- action_observability: {names, params, pre, eff}
- state_noise: true/false
- action_noise: true/false
- rationality
  - model.rational.YES
  - model.rational.NO
  - model.rational.BOUNDED
Functions:
- serialize()
- unserialize()
- to_pddl()

HANNA: Pasula et al. (ICAPS'04)

Notes

Probabilistic planning induction (STRIPS-like with prob-effects)

References

Learning Probabilistic Relational Planning Rules

Notes

Takes only initial/goal state in the extreme
Planning-based compilation

References

STRIPS Action Discovery

Non-det/Probabilistic Trace Generation

Parse non-det and PPDDL fragments
Generate traces that take the action uncertainty into account
Resulting traces should use the proper trace datastructures
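
The core of taking action uncertainty into account is sampling one outcome per step. The sketch below assumes a hypothetical encoding in which each action maps to a list of `(probability, add, delete)` outcomes; it ignores preconditions and PDDL parsing entirely.

```python
import random
from typing import Dict, FrozenSet, List, Tuple

State = FrozenSet[str]
# Hypothetical encoding: (probability, fluents added, fluents deleted).
Outcome = Tuple[float, FrozenSet[str], FrozenSet[str]]


def apply_probabilistic(state: State, outcomes: List[Outcome], rng: random.Random) -> State:
    """Sample one outcome according to its probability and apply it."""
    weights = [p for p, _, _ in outcomes]
    _, add, delete = rng.choices(outcomes, weights=weights, k=1)[0]
    return (state - delete) | add


def generate_trace(init: State, actions: Dict[str, List[Outcome]],
                   length: int, rng: random.Random) -> List[Tuple[State, str]]:
    """Record (state, action) pairs; the action is chosen uniformly at random."""
    state, trace = init, []
    for _ in range(length):
        name = rng.choice(sorted(actions))
        trace.append((state, name))
        state = apply_probabilistic(state, actions[name], rng)
    return trace


ACTIONS = {
    "flip": [
        (0.5, frozenset({"heads"}), frozenset({"tails"})),
        (0.5, frozenset({"tails"}), frozenset({"heads"})),
    ],
}
```

A purely non-deterministic model (no probabilities) could reuse the same function with equal weights for every outcome.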

DIEGO: Aineto et al. (ICAPS'18)

Notes

Uses planning to induce the STRIPS model.

References

Learning STRIPS Action Models with Classical Planning

Notes

Conservative approach to model acquisition

References

Efficient, Safe, and Probably Approximately Complete Learning of Action Models (IJCAI'17)

Notes

Not really an action model acquisition technique
Technique for aligning action theories (uses SMT and planner)
Might be useful to try and match pre-existing typical action models

References

D-VAL: An automatic functional equivalence validation tool for planning domain models

Notes

MaxSAT based
Worth trying to implement this one
Implement the more recent one.

References

Learning Action Models from Disordered and Noisy Plan Traces

Add trace generation papers from learning heuristics to README

May need a new section.

Papers are:

Predicting Optimal Solution Cost with Bidirectional Stratified Sampling
Learning Search-Space Specific Heuristics Using Neural Network (Issue #74)
Learning Heuristic Functions in Classical Planning
Neural Network Heuristics for Classical Planning: A Study of Hyperparameter Space
Learning heuristic functions for large state spaces

Create the high-level API for generation and acquisition

Arguments for Generation

number of traces (required)
max length; default inf
coverage -- generate randomly i.e. asymptotic coverage (default) or make sure everything is there at least once if possible within bound of number of traces
rationality bound -- optimal (default) to some %
- i.e., generated with something like WA*
max number of consecutive actions missing; default 0
- alternatively, or in addition to, % likelihood an action is dropped/masked out
% state variables missing; default 0; max 100 (i.e., for partial observability)
generate state ID
- same states should generate the same IDs -- this reveals some information
% noise on state -- flips value to random mistake with this much probability; default 0 / max 50
- for generated state IDs, this flips the fluents and uses the new state for the ID
% noise on actions -- flips value to random mistake with this much probability; default 0
- likely needs to be done on the observation model, rather than the core actions
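
Two of the arguments above, state noise and state IDs, interact: the ID is computed from the state after noise has been applied. A small sketch of that interaction, assuming states are dictionaries from fluent name to truth value and that an ID is a truncated hash (both are illustrative choices, not the planned representation):

```python
import hashlib
import random
from typing import Dict


def add_state_noise(state: Dict[str, bool], percent: float, rng: random.Random) -> Dict[str, bool]:
    """Flip each fluent's value independently with probability percent / 100."""
    if not 0 <= percent <= 50:
        raise ValueError("noise must be between 0 and 50 percent")
    return {f: (not v) if rng.random() < percent / 100 else v
            for f, v in state.items()}


def state_id(state: Dict[str, bool]) -> str:
    """Same states yield the same ID: hash the sorted names of the true fluents."""
    true_fluents = ",".join(sorted(f for f, v in state.items() if v))
    return hashlib.sha256(true_fluents.encode()).hexdigest()[:12]


def noisy_state_id(state: Dict[str, bool], percent: float, rng: random.Random) -> str:
    # Flip the fluents first, then derive the ID from the noisy state.
    return state_id(add_state_noise(state, percent, rng))
```

With `percent=0` the noisy ID equals the plain ID, which matches the default of no noise.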

Notes

The original approach

References

Learning Planning Operators by Observation and Practice

Notes

Logic-based approach -- likely worth tackling

References

Learning Partially Observable Deterministic Action Models

Pretty state trace displays

Perhaps use the tabulate package to display traces in a nice way.

LUKE: Zettlemoyer et al. (AAAI'05)

Notes

stochastic version of the set-based approach for FOD

References

Learning Planning Rules in Noisy Stochastic Worlds

LOCM (ICAPS'09) LOCM2 (ICAPS'11), NLOCM (ICAPS'16)

Notes

no need for state traces, just action traces

References

Update Usage/README to Reflect Implementation

blocks_gen = generate.pddl.Generator(problem_id = 123)
Generator does not yet take a problem id, but instead takes the (string) file names of the domain and problem respectively.

# further configuration
blocks_gen.configure({
    'length': 20,  # 20 steps long
    'use_goal': True,
    'diversity': True,
    'method': generate.pddl.modes.MC
})
Generator does not have this configuration method, because these attributes are specified within the child classes themselves. For example, the length of the traces can be specified in VanillaSampling. (Generator is just a base class that handles the parsing/grounding of a problem given the domain and problem PDDL files; all other attributes are specific to the individual generation methods.)

# generate 100 traces
traces = generate.Generate(generator=blocks_gen, traces=100)
You would have to call generate.pddl.VanillaSampling here (the only generator fleshed out so far). VanillaSampling doesn't take the generator as a parameter because it is itself a child class of Generator. As it stands, you just specify the paths to the domain and problem files, then the plan length and number of traces.

The rest should be ok!

TRAMP (AIJ'14)

Notes

Another logic-based one, but kind of niche

References

Action-model acquisition for planning via transfer learning

EXPO (ICML'94)

Notes

Early work on incremental refinement of operators

References

Learning by Experimentation: Incremental Refinement of Incomplete Planning Domains

Rationality-driven Precondition Reasoning

Idea

If an agent could execute fewer actions than were observed, then they would have done so. This can be used to infer some of the preconditions of the actions in a trace -- i.e., there must be a reason for every action in the plan, and (effects)x(precond) analysis can bring this insight to bear.

TODO

Dig deep on what Shlomo has done in the space.

Partially observable observation token

Subclass the State class with PartialState
Create a new observation token class that will use partial states instead of the full ones
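
One possible shape for this, sketched with a stand-in `State` that maps fluent names to truth values. The `hidden` parameter and the `PartialObservation` name are hypothetical; the real State class may store fluents differently.

```python
from typing import Dict, Iterable


class State:
    """Stand-in for the library's State: a mapping from fluent name to truth value."""

    def __init__(self, fluents: Dict[str, bool]):
        self.fluents = dict(fluents)


class PartialState(State):
    """A State in which hidden fluents are simply absent."""

    def __init__(self, fluents: Dict[str, bool], hidden: Iterable[str] = ()):
        hidden = set(hidden)
        super().__init__({f: v for f, v in fluents.items() if f not in hidden})


class PartialObservation:
    """Hypothetical token: pairs a step's action with a PartialState."""

    def __init__(self, action: str, state: State, hidden: Iterable[str] = ()):
        self.action = action
        self.state = PartialState(state.fluents, hidden)


full = State({"on_a_b": True, "clear_a": True, "handempty": False})
obs = PartialObservation("unstack", full, hidden={"clear_a"})
```

Because `PartialState` is still a `State`, code that only reads `fluents` would keep working; it just sees fewer of them.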

CCN: Sreedharan et al. (ICML-WS'20)

Notes

Iterative process to find the symbols used in the domain unfolding
Assumes access to the simulator

References

Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Black Box Simulators

Vanilla State Sampling

Takes in the number of traces and trace length, and just returns random traces by uniformly sampling the applicable actions.
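
The sampling loop described above can be sketched over a toy STRIPS-style encoding. The `Action` tuple and the function name `vanilla_sample` are illustrative assumptions and are not the library's VanillaSampling class.

```python
import random
from typing import FrozenSet, List, NamedTuple, Tuple

State = FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: State      # fluents that must hold
    add: State      # fluents made true
    delete: State   # fluents made false


def vanilla_sample(init: State, actions: List[Action], num_traces: int,
                   length: int, rng: random.Random) -> List[List[Tuple[State, str]]]:
    """Return random traces by uniformly sampling among the applicable actions."""
    traces = []
    for _trace_index in range(num_traces):
        state, trace = init, []
        for _step_index in range(length):
            applicable = [a for a in actions if a.pre <= state]
            if not applicable:
                break  # nothing applicable: stop this trace early
            action = rng.choice(applicable)
            trace.append((state, action.name))
            state = (state - action.delete) | action.add
        traces.append(trace)
    return traces


ON, OFF = frozenset({"on"}), frozenset({"off"})
ACTIONS = [Action("turn_on", OFF, ON, OFF), Action("turn_off", ON, OFF, ON)]
```

In this toy domain exactly one action is applicable in each state, so every sampled trace alternates `turn_on` and `turn_off`.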

HANNA LUKE: Pasula et al. (JAIR'07)

Notes

Probabilistic setting
Logic-based (possibly worth focusing on)

References

Learning Symbolic Models of Stochastic Domains

Randomly generate goal state for trace generation

See paper "Learning Search-Space Specific Heuristics Using Neural Network" for one idea of how to generate a goal state

ARMS (AIJ'07, ICAPS'05)

Notes

max-sat based, states+actions

References

DUP (AAMAS'16, ACM TIST'20)

Notes

Plan completion rather than model acquisition.

References

CPISA (AIJ'17)

Notes

Not directly for model acquisition, but can be used in that setting.
Model-lite setup

References

Robust planning with incomplete domain models

Set up CI/CD for PRs to target the right Python versions

LIVE (IJCAI'89)

Notes

Interactive exploration to learn models.

References

Rule Creation and Rule Learning through Environmental Exploration

Coverage badge not generated if linting or tests fail

Until everything passes, the PR will have a "missing resource" badge instead of the coverage badge.

Need to run coverage before other checks, but getting the percentage is tricky as the output changes depending on what tests fail. The current parsing can assume consistent output as the tests all must pass before the "get coverage" job runs, so the format is always that of passing tests.

Time for Valid Traces

The user should be able to specify the maximum amount of time allowed for finding a trace (e.g., give a warning if it takes more than 30 seconds). This is necessary because some problems are complex enough that trace generation can fall into an infinite loop in edge cases.
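
A cooperative time check is one simple way to do this. The sketch below assumes generation proceeds step by step through a caller-supplied `generate_step` function (a hypothetical name); it checks the clock between steps, so it cannot interrupt a single step that never returns.

```python
import time
import warnings
from typing import Any, Callable, List


def generate_with_time_limit(generate_step: Callable[[], Any], max_steps: int,
                             max_seconds: float = 30.0) -> List[Any]:
    """Call generate_step() up to max_steps times, warning and stopping
    once the time budget is spent."""
    start = time.monotonic()
    trace = []
    for _ in range(max_steps):
        if time.monotonic() - start > max_seconds:
            warnings.warn(
                f"trace generation exceeded {max_seconds} seconds",
                RuntimeWarning,
            )
            break
        trace.append(generate_step())
    return trace
```

Interrupting a step that is itself stuck would need a different mechanism, such as running generation in a separate process.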

Opmaker2 (ICAART'09)

Notes

Setting of partial observability

References

Automated acquisition of action knowledge

LOP (IJCAI'16)

Notes

LOCM for optimized plans

References

Domain Model Acquisition in the Presence of Static Relations in the LOP System

Refactor test suite

Currently, all the tests are in a single file that is getting excessively long. Going to split tests into separate files resembling the structure of the library.

I.e., tests for macq/trace/action.py will be in tests/trace/action.py

FD-like State Sampler

Takes into account the goal and heuristic in order to find the target sample length. Other notes:

Binomial distribution to compute the plan length
Resets if a deadend is hit
Actions sampled uniformly from those applicable
Returns if no actions are applicable

Code [here]
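
The notes above can be sketched as a random walk. This is not the linked code: the `Action` tuple, the `is_dead_end` callback, and the choice of a binomial with n = 4 * h and p = 0.5 (expected length 2 * h) are assumptions made for illustration, as is the decision to discard the partial trace on a reset.

```python
import random
from typing import Callable, FrozenSet, List, NamedTuple, Tuple

State = FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: State
    add: State
    delete: State


def binomial(n: int, p: float, rng: random.Random) -> int:
    """Number of successes in n independent trials with success probability p."""
    return sum(rng.random() < p for _ in range(n))


def fd_like_sample(init: State, actions: List[Action], h_init: int,
                   is_dead_end: Callable[[State], bool],
                   rng: random.Random) -> List[Tuple[State, str]]:
    """Random walk whose target length depends on a heuristic estimate h_init."""
    length = binomial(4 * h_init, 0.5, rng)  # assumed parameters
    state, trace = init, []
    for _ in range(length):
        applicable = [a for a in actions if a.pre <= state]
        if not applicable:
            return trace  # returns if no actions are applicable
        action = rng.choice(applicable)  # uniform over applicable actions
        next_state = (state - action.delete) | action.add
        if is_dead_end(next_state):
            state, trace = init, []  # reset if a dead end is hit
            continue
        trace.append((state, action.name))
        state = next_state
    return trace


ON, OFF = frozenset({"on"}), frozenset({"off"})
ACTIONS = [
    Action("turn_on", OFF, ON, OFF),
    Action("turn_off", ON, OFF, ON),
    Action("smash", frozenset(), frozenset({"broken"}), frozenset()),
]
```

Here `smash` always leads to a dead end, so it never appears in a returned trace.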

Trace generation with BFS

A baseline breadth-first search
Without a goal test

Variations may use a goal test
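
A sketch of the baseline, assuming a toy STRIPS-style `Action` tuple and a depth bound (`max_depth`, a hypothetical parameter added so that the enumeration terminates without a goal test):

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Tuple

State = FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: State
    add: State
    delete: State


def bfs_traces(init: State, actions: List[Action],
               max_depth: int) -> List[List[Tuple[State, str]]]:
    """Enumerate traces breadth-first, without a goal test."""
    traces = []
    queue = deque([(init, [])])
    while queue:
        state, path = queue.popleft()
        if path:
            traces.append(path)  # every non-empty prefix is a trace
        if len(path) == max_depth:
            continue
        for a in actions:
            if a.pre <= state:
                successor = (state - a.delete) | a.add
                queue.append((successor, path + [(state, a.name)]))
    return traces


ON, OFF = frozenset({"on"}), frozenset({"off"})
ACTIONS = [Action("turn_on", OFF, ON, OFF), Action("turn_off", ON, OFF, ON)]
```

Because the queue is first-in, first-out, shorter traces are always produced before longer ones. A goal-test variation would keep only the paths whose final state satisfies the goal.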

API Usage Ideas

General Ideas

Serialize and unserialize problem configuration
Optional parameter for the session ID, so the user can use their own files instead of the default PDDL problems
Maybe integrating with other aspects of the planning domains library? (For example, retrieving a domain and then listing the possible problems)
In general, using variables that autocomplete so the user doesn’t have to search the docs

Generate

make_generator is less ambiguous than Generator for the other generators
Let Generate take a model or add a function for generating from models
How does Generate know what type of generator it got?

Trace

Individual trace should be indexable by step (not fluent)
traces[0][step] == trace[step]
For structure, each Trace object consists of a list of dictionaries, where each index in the list is a step, and the dictionary for each step holds the action taken and the fluents true at that step
trace.get_action(action) - returns a list of dictionaries: each time the action was taken in the trace and its effects and/or state prior to action (could split in 2)
trace.get_cost(step=None) - cost up to step (default: whole trace)
trace.steps - number of steps
trace.num_fluents - number of fluents
trace.fluents - list of fluents (just the list of all fluents used in this particular trace; no true/false)
trace.actions - similarly, list of actions taken in this trace
trace.unused_actions() - returns a list of actions not used in this trace
trace.get_usage(action) - percentage of total actions the action appears in
trace.test_preconditions(action, [(fluent, true/false), ... ]) - return num/percentage action was taken with those conditions in the trace

Traces

name traces class TraceList to avoid ambiguity with Trace
sort traces by cost (in a min heap if traces > optimal threshold), then before returning, poll the heap so they are stored in a sorted list
traces.generate_more(num) - preserves current traces, converts list back to min heap if the list is large enough
traces.unused_actions() - unused actions
traces.get_usage(action) - percentage of traces the action appears in
traces.test_preconditions(action, [(fluent, true/false), ... ]) - return num/percentage action was taken with those conditions across all traces
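
The heap idea above can be sketched with the standard `heapq` module. The function names and the `cost` callback are hypothetical, and the "optimal threshold" for switching between list and heap is not modelled here.

```python
import heapq
from typing import Any, Callable, List


def sorted_by_cost(traces: List[Any], cost: Callable[[Any], float]) -> List[Any]:
    """Push traces into a min heap keyed by cost, then poll it into a sorted list."""
    # The index breaks ties, so two traces are never compared directly.
    heap = [(cost(t), i, t) for i, t in enumerate(traces)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


def generate_more(existing: List[Any], new: List[Any],
                  cost: Callable[[Any], float]) -> List[Any]:
    """Preserve the current traces and return everything re-sorted by cost."""
    return sorted_by_cost(existing + new, cost)
```

For modest numbers of traces, `sorted(traces, key=cost)` gives the same result with less code; the heap mainly pays off if traces are consumed lazily in cost order.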

Model

model.fluents
model.initial_state
model.goal
model.serialize()
model.unserialize()
model.solve_from(state, steps=None, traces=1) - generate trace(s) from a state

Features (Possible Ideas):

Determinism
- Non-determinism
  - Allow the user to specify the preferred effects of nondeterministic actions (i.e. would prefer that action A results in effect B) before generating traces
  - Or set probabilities
- Probabilistic
  - After the traces are generated, a function in Trace/Traces could give the user information on how these probabilities actually performed (i.e. the predicted probability of effect C of action A being executed was 30%, but when run it was closer to 50%).
Fluent Observability
- Some fluents missing
  - Allow the user to dynamically specify which fluents they would like to make invisible, so they can easily specify some fluents, generate some traces, change the fluents, generate more traces, etc.
- Random fluents missing
  - Similarly, allow the user to re-generate random missing fluents with ease
Action Observability
- Allow the user to specify:
  - Which actions are non-observable
  - Which parameters for a given action are unknown
  - Which actions have unknown preconditions or effects or both
State Noise
- Allow the user to specify the associated probability of a fluent being correct (defaults to 100% - no noise - if none specified)
Action Noise
- Same as above, but with actions

Trace generation via DFS

A baseline depth-first search
Without a goal test

A variation may include a goal test and cycle checking
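
The variation with a goal test and cycle checking could look like the sketch below. It assumes a toy STRIPS-style `Action` tuple and treats the goal as a set of fluents that must all hold; the baseline without a goal test would instead need a depth bound.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Set, Tuple

State = FrozenSet[str]


class Action(NamedTuple):
    name: str
    pre: State
    add: State
    delete: State


def dfs_trace(state: State, actions: List[Action], goal: State,
              path: Optional[List[Tuple[State, str]]] = None,
              visited: Optional[Set[State]] = None) -> Optional[List[Tuple[State, str]]]:
    """Depth-first search with a goal test and cycle checking."""
    path = [] if path is None else path
    visited = {state} if visited is None else visited
    if goal <= state:
        return path  # goal test
    for a in actions:
        if a.pre <= state:
            successor = (state - a.delete) | a.add
            if successor in visited:
                continue  # cycle check: never revisit a state
            visited.add(successor)
            result = dfs_trace(successor, actions, goal,
                               path + [(state, a.name)], visited)
            if result is not None:
                return result
    return None  # goal not reachable from this state


S0, S1, S2 = frozenset({"s0"}), frozenset({"s1"}), frozenset({"s2"})
ACTIONS = [
    Action("go", S0, S1, S0),
    Action("back", S1, S0, S1),
    Action("finish", S1, S2, S1),
]
```

Without the `visited` set, the `go`/`back` pair in this example would send the search into unbounded recursion.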

LOUGA (PRKAW'18)

Notes

Genetic algorithm to do ARMS-like action model acquisition

References

LOUGA: Learning Planning Operators Using Genetic Algorithms

AIA (AAAI'21)

Notes

It's an interactive approach, not a static trace ingestion
Has a human-in-the-loop component

References

Agent Interrogation Algorithm

KIRA: Mourao et al. (UAI'12)

Notes

states+actions, but could be noisy/inaccurate

References

Learning STRIPS Operators from Noisy and Incomplete Observations (UAI'12)

ICARUS (AAAI-SS'18)

Notes

Built on a cognitive architecture

References

Learning Planning Operators from Episodic Traces

FAMA (AIJ'19)

Notes

References

Learning action models with minimal observability

Small updates to the tokenization

We need kwargs to come in on the tokenize method, to be passed to the observation constructor
Instead of just iterating over the steps, enumerate them and pass in the index in addition to the step
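
Both changes fit in a few lines. In this sketch `tokenize` is written as a free function and `NoisyObservation` is a hypothetical observation class; the real method lives on the trace classes and its exact signature may differ.

```python
from typing import Any, Callable, Iterable, List


def tokenize(trace: Iterable[Any], token_cls: Callable[..., Any], **kwargs) -> List[Any]:
    """Build one observation per step, passing the step index and any extra kwargs
    through to the observation constructor."""
    return [token_cls(step, index=i, **kwargs) for i, step in enumerate(trace)]


class NoisyObservation:
    """Hypothetical observation that takes an extra `noise` option."""

    def __init__(self, step: Any, index: int, noise: float = 0.0):
        self.step = step
        self.index = index
        self.noise = noise


observations = tokenize(["step0", "step1"], NoisyObservation, noise=0.1)
```

Passing the index lets an observation know its position in the trace without having to search for itself.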
