
costaware's People

Contributors

davidnkraemer, wessle

costaware's Issues

Make cost-aware versions of gym envs

We need some good cost-aware versions of Gym environments. Ideally we would build one or two on top of Gym environments with positive rewards and give them intuitive cost functions. The cost functions for our current MountainCar and Acrobot CostAwareEnvs look pretty arbitrary.

Plotting with noise

In this issue, I want to start a conversation about how we can make descriptive plots when the data series we are making are quite noisy.

Problem

I am currently using synthetic data, but this can be revisited once we finish the remaining experiment scripts. The synthetic data has the form

y = signal(x) + noise(x)

where

signal(x) := L / (1 + exp(-k(x-x0)))

is the generalized logistic function (which is—roughly—what our models give us) and

noise(x) ~ N(mu, sigma)

Suppose I have two sets of signal/noise parameters, with a fixed number of realizations of each. Below, I plot the mean realizations for both sets, as well as the 95% (Student's t) confidence intervals for each.

plot

Obviously there are stylistic questions to be addressed, but the plot actually looks pretty good. But look at the number of steps in the iterations. Let me show what happens when we go from 50 to 500 steps.

plot

It's starting to get hard to read this figure. Of course, we are actually working on the scale of 50,000 steps or even 500,000. Let's see those too.

plot
plot

These figures are basically junk.

Some solutions

One option on these large series is to do direct downsampling. Pick a stride ds, and then plot only every ds-th point of the series, i.e. series[::ds]. This gives us something like (50,000 steps, ds=500):

plot

Alternatively, we could do moving averages. As an example, here is a simple moving average (window of 500):

plot

But it's unclear to me which of these is preferable, or if we should take an entirely different approach.
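
To make the comparison concrete, here is a minimal sketch of both options using only the standard library. The function names `downsample` and `moving_average` are placeholders of mine, not anything in the repo, and the toy series stands in for a real 50,000-step run.

```python
from itertools import accumulate


def downsample(series, ds):
    """Keep every ds-th point of the series (plain slicing, no averaging)."""
    return series[::ds]


def moving_average(series, window):
    """Simple moving average over full windows only.

    Uses cumulative sums so the cost is linear in len(series),
    independent of the window size.
    """
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    csum = [0.0] + list(accumulate(series))
    return [(csum[i + window] - csum[i]) / window
            for i in range(len(series) - window + 1)]


# Toy usage on a short noiseless ramp; real series would be far longer.
ys = [float(i) for i in range(10)]
sampled = downsample(ys, 5)        # points at indices 0 and 5
smoothed = moving_average(ys, 5)   # one value per full window
```

One difference worth weighing: downsampling keeps individual noisy points and just shows fewer of them, while the moving average reduces the noise itself at the price of smearing sharp transitions over the window.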

Seeding RandomMDPEnv

Issue

We need to be able to seed RandomMDPEnv so that, whenever identical seeds are provided, identical MDPEnvs are produced.

Question

Is this already possible with the current class definition?
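
Whatever the answer is for the current class, the usual pattern is to draw every random quantity from one generator that is created from the seed. Below is a rough sketch of that idea; `random_mdp` is a hypothetical helper, not the actual RandomMDPEnv, and it uses the standard library's `random.Random` rather than whatever generator the repo relies on.

```python
import random


def random_mdp(num_states, num_actions, seed=None):
    """Build random transition probabilities and rewards from a private RNG.

    Hypothetical helper, not the repo's RandomMDPEnv: it only shows that
    drawing every random quantity from one seeded generator makes the
    result reproducible.
    """
    rng = random.Random(seed)  # private generator; global state untouched
    transitions = []
    for _ in range(num_states):
        per_state = []
        for _ in range(num_actions):
            weights = [rng.random() for _ in range(num_states)]
            total = sum(weights)
            per_state.append([w / total for w in weights])  # rows sum to 1
        transitions.append(per_state)
    rewards = [[rng.random() for _ in range(num_actions)]
               for _ in range(num_states)]
    return transitions, rewards


mdp_a = random_mdp(3, 2, seed=0)
mdp_b = random_mdp(3, 2, seed=0)  # same seed as mdp_a
mdp_c = random_mdp(3, 2, seed=1)  # different seed
```

Keeping the generator private to the constructor means other code that touches the global random state cannot change which MDP a given seed produces.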

Gaussian policy

We should create a Gaussian policy for use with the DeepACAgent.

TrialRunner for episodic envs

Issue

TrialRunner needs to be able to handle episodic environments to make it compatible with Gym-style environments. Continuing settings can then be handled by specifying a single, long episode.
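
A possible shape for the loop, as a sketch only. `CountdownEnv` and `run_trial` are made-up names, and the env is assumed to follow the classic Gym convention where `step` returns `(state, reward, done, info)`.

```python
class CountdownEnv:
    """Hypothetical stand-in for a Gym-style env: episode ends after `horizon` steps."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # initial state

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}  # state, reward, done, info


def run_trial(env, policy, num_episodes, max_steps):
    """Run episodes until each terminates or hits max_steps; return step counts."""
    lengths = []
    for _ in range(num_episodes):
        state = env.reset()
        steps = 0
        done = False
        while not done and steps < max_steps:
            state, reward, done, _ = env.step(policy(state))
            steps += 1
        lengths.append(steps)
    return lengths


episodic = run_trial(CountdownEnv(horizon=5), lambda s: 0,
                     num_episodes=3, max_steps=100)
# A continuing task expressed as one long episode that never signals done.
continuing = run_trial(CountdownEnv(horizon=10**9), lambda s: 0,
                       num_episodes=1, max_steps=1000)
```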

Running scripts, saving to data directory

Issue

Need to decide on a standard way to run scripts in the scripts directory, then implement it. One of the main issues is how to refer to the data directory from within the script.

Suggestion

Use

import data

data_dir = data.__path__[0]

to get the local absolute path to data.

Question

Is this reasonable? Is there a better way, like doing import costaware and using costaware.data.__path__, instead?

run_experiment must be run from costaware directory

Issue

The references to the configs directory in scripts/run_experiment.py appear to assume the script is being run from the top-level costaware directory. It would be nice if these references were absolute paths to configs on the local machine.
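
One way to do this is to resolve the path relative to the script file instead of the working directory. The sketch below assumes configs sits at the top level of the repo, one level above scripts; `sibling_dir` is a hypothetical helper name.

```python
from pathlib import Path


def sibling_dir(script_file, name, levels_up=1):
    """Absolute path to directory `name`, `levels_up` directories above the script.

    With levels_up=1, <repo>/scripts/run_experiment.py maps to <repo>/<name>.
    """
    script_path = Path(script_file).resolve()
    return script_path.parents[levels_up] / name


# In a real script one would pass __file__; a literal path is used here
# so the example does not depend on where it is executed.
configs_dir = sibling_dir("/tmp/costaware/scripts/run_experiment.py", "configs")
```

Inside scripts/run_experiment.py this would read `sibling_dir(__file__, "configs")`, which gives the same result no matter which directory the script is launched from.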

Rearrangements

Need to do the following:

  • move main.utils.experiment to main.core and make corresponding changes elsewhere in the code
  • combine main.experimental.util into main.experimental.experimental_envs and move the environments defined therein into main.core.envs, then make corresponding changes elsewhere

Access to agent class names

Current setup

Each instantiation agent = AgentClassName() has an attribute agent.title = 'AgentClassName'. This is used in scripts/run_experiment.py to record the agent's class type when logging.

Suggestion

Since type(agent).__name__ == agent.title, we can remove agent.title and use type(agent).__name__ instead, which avoids the duplication.
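
For illustration, a tiny sketch of what the logging code could call. `agent_label` is a hypothetical helper, and the class here is an empty placeholder that only borrows the LinearACAgent name.

```python
class LinearACAgent:
    """Empty placeholder class; only its name matters for this example."""


def agent_label(agent):
    """Return the class name of an agent instance for use in logs."""
    return type(agent).__name__


label = agent_label(LinearACAgent())
```

Note that a subclass instance reports the subclass name, so the label always follows the concrete class.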

Questions

@DavidNKraemer Is there another reason to keep agent.title? Is agent.title used anywhere other than scripts/run_experiment.py?

Renaming

Need to do the following:

  • rename main to costaware and make corresponding changes throughout the code

ExperimentRunner class diagram

Background

We're creating an ExperimentRunner object that reads in an experiment_config file and launches a corresponding experiment. I just took a first crack at the class diagram, which can be found in the notes directory on the experiments branch.

Issue

@DavidNKraemer Suggestions? Comments? My UML usage may need correcting.

Plotter requires working LaTeX installation

@DavidNKraemer Plotter appears to require a working LaTeX installation by default. I agree we need to keep the ability to plot LaTeX, but is there a way to do this without requiring a working installation? I just use Overleaf for my LaTeX needs and want to avoid installing it locally, if possible. One possible workaround is to finally get around to #32.

Agents should be initialized with envs

Issue

We currently initialize agents in a hodgepodge of different ways depending on the dimension of the state and action spaces, the actions themselves, and the specific environment expected. The result is that the arguments we pass in to various agents are too diverse to make a more uniform interface for agent initialization.

Solution

Passing the created environment into the agent itself can help sidestep this, since the agent can inspect the environment and collect the required information internally.

TODO

  • All agents on the experiments branch need to be redefined to accept envs on initialization and collect the appropriate information.
  • All scripts on experiments depending on agents need to be altered to reflect the new agent definitions.
  • Plans must be made to fix any conflicts and rebase other branches on experiments once the experiments branch becomes master.
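
A rough sketch of the proposed interface. `ToyEnv` and `TabularAgent` are hypothetical, and the attribute names `num_states` and `num_actions` are assumptions; a Gym env would expose this information through its observation and action spaces instead.

```python
from dataclasses import dataclass


@dataclass
class ToyEnv:
    """Hypothetical env exposing only the sizes an agent needs."""
    num_states: int
    num_actions: int


class TabularAgent:
    """Hypothetical agent that sizes its tables by inspecting the env."""

    def __init__(self, env, learning_rate=0.1):
        self.num_states = env.num_states
        self.num_actions = env.num_actions
        self.learning_rate = learning_rate
        # One value estimate per (state, action) pair.
        self.q = [[0.0] * self.num_actions for _ in range(self.num_states)]


agent = TabularAgent(ToyEnv(num_states=4, num_actions=2))
```

With this shape every agent takes the env as its first argument, and only genuine hyperparameters remain as extra keyword arguments.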

Plotting branch?

Issue

There are two lines of ongoing development that will be based on the experiments branch:

  • plotting utilities
  • the Experiment object

Question

@DavidNKraemer Should we make a sub-branch of experiments for development of plotting utilities?

Debug ExperimentRunner

Need to debug ExperimentRunner once we have working versions of the Env-, Agent-, and IOManagerConstructors.

Ratio computation

Ratios are currently being computed by taking averages over two fixed-length buffers at each timestep. For small buffers this is probably okay, but for larger ones this is an awful lot of additional computation. Is there a better way we could be doing this?
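
One candidate is to keep running sums and update them in constant time per step. The sketch below assumes the ratio is mean reward over mean cost within a sliding window, which may not match the current definition exactly; `RunningRatio` is a hypothetical name.

```python
from collections import deque


class RunningRatio:
    """Ratio of windowed means, maintained with running sums.

    Each update is O(1): add the new value and subtract the one evicted
    from the window, instead of re-averaging the whole buffer.
    """

    def __init__(self, window):
        self.window = window
        self.rewards = deque()
        self.costs = deque()
        self.reward_sum = 0.0
        self.cost_sum = 0.0

    def update(self, reward, cost):
        self.rewards.append(reward)
        self.costs.append(cost)
        self.reward_sum += reward
        self.cost_sum += cost
        if len(self.rewards) > self.window:
            self.reward_sum -= self.rewards.popleft()
            self.cost_sum -= self.costs.popleft()
        # Both buffers have equal length, so the ratio of means equals
        # the ratio of sums. Raises ZeroDivisionError if costs sum to zero.
        return self.reward_sum / self.cost_sum


tracker = RunningRatio(window=2)
ratios = [tracker.update(r, c) for r, c in [(1.0, 1.0), (3.0, 1.0), (5.0, 3.0)]]
```

Over very long runs the running sums can accumulate floating-point error, so recomputing them from the buffers every so often may be worthwhile.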

Implement ConfigManager

It would be nice to have a working version of this to test with ExperimentRunner once the latter is almost done being debugged.

Documentation!

We sorely need to add documentation throughout the repo.

Config file formats for trials and experiments

Issue

We need to come up with standard config file formats for specifying both trials and experiments (which are just collections of trials). These are important because we need to have an easy way of saving the (hyper-)parameters we used to generate data alongside the data itself. This will make it easy to identify and replicate experiments, if necessary.

Plotter handles some experiments inappropriately

Plotter appears to group data together by the name of the agent used. This makes it difficult to run an experiment that tests multiple hyperparameter configurations for a single type of agent, for example. This seems to be because

sns.lineplot(data=data, x='step', y='ratio', hue='agent',
             ci=self.confidence)

in Plotter.plot() forms groups using hue='agent'.

One flexible solution might be to group according to the name of subdirectories (e.g. AC_trials, Q_trials) instead of agents.
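
As a sketch of the subdirectory idea, the group label can be derived from each trial file's parent directory. The file layout and the `group_by_subdirectory` helper below are hypothetical; only the AC_trials and Q_trials names come from this issue.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_by_subdirectory(trial_paths):
    """Map each immediate parent directory name to the trial files under it."""
    groups = defaultdict(list)
    for path in trial_paths:
        groups[PurePosixPath(path).parent.name].append(path)
    return dict(groups)


# Hypothetical file layout for illustration.
paths = [
    "data/AC_trials/trial_0.csv",
    "data/AC_trials/trial_1.csv",
    "data/Q_trials/trial_0.csv",
]
groups = group_by_subdirectory(paths)
```

The directory name could then be stored as its own column in the plotted data and passed as the hue variable in place of 'agent'.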

Trials appear to slow down

Issue

When running experiment_runner.py for longer periods, it seems like trials take an increasing amount of time to complete the same number of steps.

A proper linear softmax policy

Issue

The SoftmaxPolicy currently in use in the LinearACAgent uses atypical feature vectors.

Solution

We need to refactor so that LinearACAgent uses a classic, standard feature vector mapping in SoftmaxPolicy. One standard approach to try is a polynomial mapping: each state-action pair (s, a) gets mapped to the vector [s, a, s * a, 1] (well, this vector will actually be appropriately normalized, but you get the idea).
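
To pin down the idea, here is a minimal sketch for scalar states and actions. The names `features` and `softmax_probs` are placeholders, the normalization mentioned above is left out, and the parameter vector `theta` is chosen by hand purely for illustration.

```python
import math


def features(state, action):
    """Polynomial features [s, a, s*a, 1] for scalar state and action (unnormalized)."""
    return [state, action, state * action, 1.0]


def softmax_probs(theta, state, actions):
    """Action probabilities proportional to exp(theta . phi(s, a))."""
    scores = [sum(t * f for t, f in zip(theta, features(state, a)))
              for a in actions]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# theta only weights the action feature, so action 1 scores 1 and action 0 scores 0.
probs = softmax_probs(theta=[0.0, 1.0, 0.0, 0.0], state=2.0, actions=[0, 1])
```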

Fun with `experiment_runner_example.py`

I tried running examples/experiment_runner_example.py and encountered the following runtime error:

Traceback (most recent call last):
  File "examples/experiment_runner_example.py", line 18, in <module>
    default=f'{data.__path__[0]}/experiment_runner_example',
TypeError: '_NamespacePath' object is not subscriptable

I looked through this error and came to this interpretation: The data module has a dunder attribute __path__ which is a _NamespacePath object. Since the _NamespacePath smells like a list, we subscript it to get a hard path (a good idea in theory, given local machine compatibility issues). The problem is that _NamespacePath isn't actually a list and doesn't support subscripting.

My guess is that this is an error on my end, but I'm unsure where it might be coming from. It seems like just a Python problem. I worry that we may have a mismatch between the Python 3.x versions we are each running, in which case we need to pin down exactly which version we are using.
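
The traceback is consistent with data being imported as a namespace package (a directory without an `__init__.py`) on an interpreter whose `_NamespacePath` cannot be subscripted. Since that object is still iterable, one workaround that should not depend on the interpreter version is to take the first entry by iteration. The sketch below uses a stand-in class and a made-up path rather than the real `data` package; `first_path` is a hypothetical helper.

```python
def first_path(package_path):
    """Return the first entry of a package __path__ without subscripting it.

    Works for plain lists and for any object that is merely iterable.
    """
    return next(iter(package_path))


class IterableOnly:
    """Stand-in for a path object that iterates but has no __getitem__."""

    def __init__(self, entries):
        self._entries = list(entries)

    def __iter__(self):
        return iter(self._entries)


# Made-up path for illustration only.
data_dir = first_path(IterableOnly(["/home/user/costaware/data"]))
```

In the example script this would mean replacing `data.__path__[0]` with `first_path(data.__path__)`. Adding an `__init__.py` to data would presumably also help, since a regular package's `__path__` is a plain list.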
