wessle / costaware
Repository for cost-aware project code.
License: MIT License
Need some good cost-aware versions of Gym environments. Making one or two with intuitive cost functions based on Gym environments with positive rewards would be ideal. Cost functions for our current MountainCar and Acrobot CostAwareEnvs look pretty arbitrary.
In this issue, I want to start a conversation about how we can make descriptive plots when the data series we are making are quite noisy.
I am currently using synthetic data, but this can be revisited once we finish the remaining experiment scripts. The synthetic data has the form
y = signal(x) + noise(x)
where
signal(x) := L / (1 + exp(-k(x-x0)))
is the generalized logistic function (which is—roughly—what our models give us) and
noise(x) ~ N(mu, sigma)
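For concreteness, the synthetic series can be generated with a short sketch like this (the parameters L, k, x0, mu, sigma are the ones defined above; the particular values and the number of realizations are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def signal(x, L=1.0, k=1.0, x0=0.0):
    """Generalized logistic signal: L / (1 + exp(-k * (x - x0)))."""
    return L / (1.0 + np.exp(-k * (x - x0)))

def realization(x, L=1.0, k=1.0, x0=0.0, mu=0.0, sigma=0.1):
    """One noisy realization: y = signal(x) + N(mu, sigma) noise."""
    return signal(x, L, k, x0) + rng.normal(mu, sigma, size=x.shape)

x = np.linspace(-5.0, 5.0, 50)
ys = np.stack([realization(x) for _ in range(20)])  # 20 realizations
mean = ys.mean(axis=0)                              # mean realization
```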
Suppose I have two sets of signal/noise parameters, with a fixed number of realizations of each. Below, I plot the mean realizations for both sets, as well as the 95% (student) confidence intervals for each.
Obviously there are stylistic questions to be addressed, but the plot actually looks pretty good. Note, though, the number of steps in each series. Let me show what happens when we go from 50 to 500 steps.
It's starting to get hard to read this figure. And we are actually working at the scale of 50,000 steps, or even 500,000. Let's see those too.
These figures are basically junk.
One option on these large series is direct downsampling. Set a ds threshold, and then only plot the series like series[::ds]. This gives us something like (50,000 steps, ds=500):
Alternatively, we could do moving averages. As an example, here is a simple moving average (window of 500):
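Both options are one-liners; here is a sketch for a 50,000-step series:

```python
import numpy as np

def downsample(series, ds):
    """Direct downsampling: keep only every ds-th point."""
    return series[::ds]

def moving_average(series, window):
    """Simple moving average over a fixed window."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode='valid')

series = np.arange(50_000, dtype=float)
sparse = downsample(series, 500)       # 100 points to plot
smooth = moving_average(series, 500)   # 49,501 denoised points
```

Note the trade-off: downsampling keeps raw values but can alias spiky series, while the moving average denoises but lags the signal and shortens the series by window - 1 points.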
But it's unclear to me which of these is preferable, or if we should take an entirely different approach.
We need to be able to seed RandomMDPEnv so that, whenever identical seeds are provided, identical MDPEnvs are produced.
Is this already possible with the current class definition?
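For reference, one common pattern (a sketch only, not the actual class definition) is to route all randomness through a generator seeded in __init__:

```python
import numpy as np

class RandomMDPEnv:
    """Sketch: draw the transition kernel and rewards from a generator
    seeded in __init__, so identical seeds yield identical MDPs."""

    def __init__(self, n_states, n_actions, seed=None):
        rng = np.random.default_rng(seed)
        # P[s, a] is a distribution over next states; R[s, a] a reward.
        self.P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
        self.R = rng.uniform(size=(n_states, n_actions))
```

With this pattern, two instances constructed with equal seeds have identical P and R arrays by construction.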
We should create a Gaussian policy for use with the DeepACAgent.
TrialRunner needs to be able to handle episodic environments to make it compatible with Gym-style environments. Continuing settings can then be handled by specifying a single, long episode.
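The episode/step structure could look like the following sketch (the agent methods sample_action and update are assumed names, and the step signature is the classic Gym 4-tuple):

```python
def run_trial(env, agent, n_episodes, max_steps):
    """Sketch: episodic loop compatible with Gym-style environments."""
    for _ in range(n_episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = agent.sample_action(state)
            state, reward, done, info = env.step(action)
            agent.update(reward, state)
            if done:
                break
```

A continuing environment is then the special case n_episodes=1 with a large max_steps and an env that never returns done=True.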
Need to decide on a standard way to run scripts in the scripts directory, then implement it. One of the main issues is how to refer to the data directory from within the script. Use

import data
data_dir = data.__path__[0]

to get the local absolute path to data. Is this reasonable? Is there a better way, like doing import costaware and using costaware.data.__path__ instead?
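One alternative that avoids touching __path__ altogether is importlib.resources (Python 3.9+); a sketch, with the stdlib json package standing in for the repo's data package:

```python
from importlib import resources

# Resolve a package's on-disk location without touching __path__.
# (`json` stands in for the repo's `data` package here.)
data_dir = resources.files('json')
config_path = data_dir / 'tool.py'  # Traversable supports '/' joining
```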
The references to the configs directory in scripts/run_experiment.py appear to assume the script is being run from the top-level costaware directory. It would be nice if these references were absolute paths to configs on the local machine.
Need to do the following:
- Move main.utils.experiment to main.core, and make corresponding changes elsewhere in the code.
- Turn main.experimental.util into main.experimental.experimental_envs and move the environments defined therein into main.core.envs, then make corresponding changes elsewhere.

Need to make plotting utilities for experiment data.
Need a Ray Actor version of this class.
Each instantiation agent = AgentClassName() has an attribute agent.title = 'AgentClassName'. This is used in scripts/run_experiment.py to record the agent's class type when logging. But type(agent).__name__ == agent.title, so removing agent.title and instead using type(agent).__name__ avoids duplication.
@DavidNKraemer Is there another reason to keep agent.title? Is agent.title used anywhere other than scripts/run_experiment.py?
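A tiny illustration of the redundancy (the class name is just an example):

```python
class DeepACAgent:  # any agent class; name illustrative
    def __init__(self):
        self.title = 'DeepACAgent'  # duplicates the class name

agent = DeepACAgent()
# The same string is always recoverable from the type itself:
assert type(agent).__name__ == agent.title
```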
We need to make a proper Dockerfile or build procedure to make sure our Python versions and other dependencies are all the same.
Need to do the following:
- Rename main to costaware and make corresponding changes throughout the code.

We're creating an ExperimentRunner object that reads in an experiment_config file and launches a corresponding experiment. I just took a first crack at the class diagram, which can be found in the notes directory on the experiments branch.
@DavidNKraemer Suggestions? Comments? My UML usage may need correcting.
Now that we know the Q-learning algorithm works reasonably well on cost-aware gym environments, we should try to get it working on our portfolio management environment.
Get it working on some reasonable examples.
@DavidNKraemer Plotter appears to require a working LaTeX installation by default. I agree we need to keep the ability to plot LaTeX, but is there a way to do this without requiring a working installation? I just use Overleaf for my LaTeX needs and want to avoid installing it locally, if possible. One possible workaround is to finally get around to #32.
We need to merge experiments into master and get everything into publishable shape.
We should try to get the DeepACAgent working on our cost-aware gym environments. First step would be to create Gaussian policy to use with the agent.
We currently initialize agents in a hodgepodge of different ways depending on the dimension of the state and action spaces, the actions themselves, and the specific environment expected. The result is that the arguments we pass in to various agents are too diverse to make a more uniform interface for agent initialization.
Passing the created environment into the agent itself can help sidestep this, since the agent can inspect the environment and collect the required information internally.
- Agents on the experiments branch need to be redefined to accept envs on initialization and collect the appropriate information.
- Scripts on experiments depending on agents need to be altered to reflect the new agent definitions.
- … experiments once the experiments branch becomes master.

There are two lines of ongoing development that will be based on the experiments branch:
- the Experiment object
- plotting utilities

@DavidNKraemer Should we make a sub-branch of experiments for development of plotting utilities?
Need to debug ExperimentRunner once we have working versions of the Env-, Agent-, and IOManagerConstructors.
Ratios are currently being computed by taking averages over two fixed-length buffers at each timestep. For small buffers this is probably okay, but for larger ones this is an awful lot of additional computation. Is there a better way we could be doing this?
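One standard alternative, sketched under the assumption that the two buffers are fixed-length reward and cost windows: keep running sums alongside the buffers and update them in O(1) per step, instead of re-averaging the full buffers every timestep.

```python
from collections import deque

class RunningRatio:
    """Ratio of two fixed-length running averages, updated in O(1)
    per step by tracking sums instead of re-averaging the buffers."""

    def __init__(self, maxlen):
        self.rewards = deque(maxlen=maxlen)
        self.costs = deque(maxlen=maxlen)
        self.reward_sum = 0.0
        self.cost_sum = 0.0

    def update(self, reward, cost):
        # Subtract the values about to fall out of the full buffers.
        if len(self.rewards) == self.rewards.maxlen:
            self.reward_sum -= self.rewards[0]
            self.cost_sum -= self.costs[0]
        self.rewards.append(reward)
        self.costs.append(cost)
        self.reward_sum += reward
        self.cost_sum += cost

    @property
    def ratio(self):
        # Equal buffer lengths, so the ratio of averages is the ratio of sums.
        return self.reward_sum / self.cost_sum
```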
It would be nice to have a working version of this to test with ExperimentRunner once the latter is almost done being debugged.
We sorely need to add documentation throughout the repo.
We need to come up with standard config file formats for specifying both trials and experiments (which are just collections of trials). These are important because we need to have an easy way of saving the (hyper-)parameters we used to generate data alongside the data itself. This will make it easy to identify and replicate experiments, if necessary.
We should come up with code style guidelines for the repository and write them down somewhere. I suggest following PEP8 and emulating @DavidNKraemer's style otherwise.
Plotter appears to group data together by the name of the agent used. This makes it difficult to run an experiment that tests multiple hyperparameter configurations for a single type of agent, for example. This seems to be because

sns.lineplot(data=data, x='step', y='ratio', hue='agent',
             ci=self.confidence)

in Plotter.plot() forms groups using hue='agent'.
One flexible solution might be to group according to the names of subdirectories (e.g. AC_trials, Q_trials) instead of agents.
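One way to realize that, as a sketch: tag each row with its trial's subdirectory name at load time, then pass that column to hue (the column names and CSV layout here are assumptions):

```python
import pandas as pd
from pathlib import Path

def load_trials(root):
    """Load all trial CSVs under `root`, tagging each row with the
    name of its top-level subdirectory (e.g. 'AC_trials', 'Q_trials')."""
    root = Path(root)
    frames = []
    for csv in sorted(root.rglob('*.csv')):
        df = pd.read_csv(csv)
        df['group'] = csv.relative_to(root).parts[0]
        frames.append(df)
    return pd.concat(frames, ignore_index=True)

# then: sns.lineplot(data=data, x='step', y='ratio', hue='group', ...)
```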
Get DeepRVIQLearningAgent working on the cost-aware gym environments.
When running experiment_runner.py for longer periods, it seems like trials take an increasing amount of time to complete the same number of steps.
The SoftmaxPolicy currently in use in the LinearACAgent uses atypical feature vectors. We need to refactor so that LinearACAgent uses a classic, standard feature vector mapping in SoftmaxPolicy. One standard approach to try is a polynomial mapping: each state-action pair (s, a) gets mapped to the vector [s, a, s * a, 1] (well, this vector will actually be appropriately normalized, but you get the idea).
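The proposed mapping is small enough to sketch directly (unit-norm normalization is one plausible choice for the "appropriately normalized" part):

```python
import numpy as np

def poly_features(s, a):
    """Map a scalar state-action pair (s, a) to [s, a, s*a, 1],
    normalized to unit length."""
    phi = np.array([s, a, s * a, 1.0])
    return phi / np.linalg.norm(phi)
```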
I tried running examples/experiment_runner_example.py and encountered the following runtime error:

Traceback (most recent call last):
  File "examples/experiment_runner_example.py", line 18, in <module>
    default=f'{data.__path__[0]}/experiment_runner_example',
TypeError: '_NamespacePath' object is not subscriptable

I looked through this error and came to this interpretation: the data module has a dunder attribute __path__ which is a _NamespacePath object. Since the _NamespacePath smells like a list, we subscript it to get a hard path (a good idea in theory, given local machine compatibility issues). The problem is that _NamespacePath isn't actually a list and doesn't support subscripting.
My guess is that this is an error on my end, but I'm unsure where it might be coming from. It seems like just a Python problem. I worry that we may be having a conflict between Python 3.* versioning, in which case we need to pin down exactly what we are using.
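If that interpretation holds, one workaround is to materialize the path before indexing. (The _NamespacePath also hints that data/ may be missing an __init__.py, which would make it a namespace package rather than a regular one.) A sketch, with the stdlib json package standing in for data:

```python
import json  # stands in for the repo's `data` package

# `__path__` may be a plain list (regular package) or a
# `_NamespacePath` (namespace package, i.e. no __init__.py).
# Materializing it with list() supports indexing in both cases:
data_dir = list(json.__path__)[0]
```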