kostis-s-z / exploring_meta

Experiments on Model-Agnostic Meta-Learning on Few-Shot Image Classification and Meta-RL (Meta-World)

License: MIT License

Python 100.00%

exploring_meta's Introduction

Hey there 🙃

Machine Learning Engineer with a Computer Science background and experience in Deep Learning research and building ML platforms. Concerned with privacy, sustainability, and ethics 🦜

📚 Publications

🎯 Projects, Presentations, Blog Posts

🎓 M.Sc Thesis on Meta-Learning

exploring_meta's Issues

MAML.adapt() problematic???

MAML-VPG and MAML-PPO seem not to be working well.

MAML-TRPO seems fine.

One difference between these implementations is that the first two use:

    learner.adapt(loss)

whereas TRPO uses:

    gradients = torch.autograd.grad(loss, learner.parameters(),
                                    retain_graph=second_order,
                                    create_graph=second_order,
                                    allow_unused=anil)


    learner = l2l.algorithms.maml.maml_update(learner, inner_lr, gradients)

Upgrade to latest L2L

Add download parameter to mini_imagenet

Fix model building

Currently it's Python 3.7 dependent, since the model is built from a dictionary and the code assumes the elements keep the order in which they were appended.
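One way to remove the reliance on implicit dictionary ordering is to make the order explicit. The sketch below is not the repository's model-building code: the layer names and the `build_layer_spec` helper are hypothetical placeholders. It uses `collections.OrderedDict`, which guarantees insertion order regardless of the interpreter version; a mapping like this could then be handed to `torch.nn.Sequential`, which accepts an `OrderedDict` of modules.

```python
from collections import OrderedDict


def build_layer_spec(num_blocks, hidden_size):
    """Return an ordered mapping of layer name -> layer description.

    Hypothetical helper: the names and tuples are placeholders for
    whatever modules the real model uses.
    """
    layers = OrderedDict()
    for i in range(num_blocks):
        layers["conv{}".format(i)] = ("conv", hidden_size)
        layers["relu{}".format(i)] = ("relu",)
    layers["head"] = ("linear", hidden_size)
    return layers


spec = build_layer_spec(num_blocks=2, hidden_size=32)
print(list(spec))  # layer names, in the order they were added
```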

ANIL doesn't seem to work (at least on Mini ImageNet)

Maybe try Omniglot?
Is there a way to know it's not just a hyperparameter tuning issue?
Make sure the "features" (the body of the network) actually change (use CCA within training)
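A cheaper sanity check than CCA is to snapshot the body's weights before and after a few training steps and measure how far they moved. The sketch below is a swapped-in, simpler diagnostic (it is not CCA, and not code from the repository); `max_abs_change` is a hypothetical helper, and plain lists of made-up floats stand in for flattened parameters.

```python
def max_abs_change(before, after):
    """Largest absolute element-wise difference between two snapshots."""
    if len(before) != len(after):
        raise ValueError("snapshots must have the same length")
    return max((abs(a - b) for a, b in zip(after, before)), default=0.0)


# Made-up numbers, purely for illustration.
features_before = [0.10, -0.20, 0.30]
features_after = [0.10, -0.20, 0.30]   # body did not move at all
head_before = [0.50, 0.50]
head_after = [0.25, 0.75]              # head was updated

print(max_abs_change(features_before, features_after))
print(max_abs_change(head_before, head_after))
```

Under ANIL the body is expected to stay fixed during inner-loop adaptation, so the comparison is only informative across outer-loop updates.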

MAML/ANIL-PPO is unstable

Fix entry point of scripts

Currently the scripts do not work unless you run the experiments through PyCharm and mark the root folder as Sources Root.
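One possible workaround, independent of the IDE, is for each script to put the repository root on `sys.path` itself before importing project modules. This is a minimal sketch under an assumed layout (`<repo>/experiments/run.py`); `add_repo_root_to_path` is a hypothetical helper, not something the repository provides.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(script_file, levels_up=1):
    """Prepend an ancestor directory of `script_file` to sys.path.

    levels_up=0 is the script's own folder, 1 is its parent, and so on.
    """
    root = Path(script_file).resolve().parents[levels_up]
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return root


# Assumed layout: <repo>/experiments/run.py, so the root is one level up.
# At the top of such a script one would call:
# add_repo_root_to_path(__file__, levels_up=1)
```

Exporting `PYTHONPATH` before launching the script, or installing the project as a package, would achieve the same effect without touching the scripts.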

New Meta-World API breaks code

Quick fix is to install a previous build:

    pip install git+https://github.com/rlworkgroup/metaworld.git@58546ff25211883ca14d036b3516fe63382c6071#egg=metaworld

[Procgen] Sampler of PPO1 & TRPO always chooses random action

Fix cherry-rl dependency

My fork of cherry-rl which enables saving success metrics is not compatible with Particles2D

Update README

I can't render Meta-World.

After I modify the path and run the program, the following error occurs. Can someone help me solve it?

Fix not installing mujoco

MAML/ANIL - VPG is unstable. Move to its own branch

Merge DiagNormalPolicy and DiagNormalPolicyANIL

Running script needs to first export PYTHONPATH

Disable done signal and find another way to terminate episode in Runner

Possible bugs with the #21 PR

    weights[1:].add_(-1.0, dones[:-1])
    ->
    weights[1:] = dones[:-1] - 1.0

    p.data.add_(-stepsize, u.data)
    ->
    p.data = u.data + (-stepsize)
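For context on why these rewrites may be bugs: the older two-argument form `t.add_(alpha, other)` in PyTorch performs a scaled in-place add, `t += alpha * other` (written `t.add_(other, alpha=alpha)` in newer releases). The pure-Python sketch below uses plain lists and made-up numbers instead of tensors, with a hypothetical `scaled_add` helper, to show that the rewritten lines compute something different.

```python
def scaled_add(target, alpha, other):
    """Emulates t.add_(alpha, other): element-wise target + alpha * other."""
    return [t + alpha * o for t, o in zip(target, other)]


# weights[1:].add_(-1.0, dones[:-1])  versus  weights[1:] = dones[:-1] - 1.0
weights = [1.0, 1.0, 1.0, 1.0]
dones = [0.0, 1.0, 0.0, 0.0]
original = weights[:1] + scaled_add(weights[1:], -1.0, dones[:-1])
rewritten = weights[:1] + [d - 1.0 for d in dones[:-1]]

# p.data.add_(-stepsize, u.data)  versus  p.data = u.data + (-stepsize)
p, u, stepsize = [1.0, 2.0], [0.5, -0.5], 0.5
p_original = scaled_add(p, -stepsize, u)
p_rewritten = [x + (-stepsize) for x in u]

print(original, rewritten)
print(p_original, p_rewritten)
```

If the intent was only to silence the deprecation warning, `weights[1:].add_(dones[:-1], alpha=-1.0)` and `p.data.add_(u.data, alpha=-stepsize)` would keep the original behaviour.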

        for train_episodes in train_replays:
            new_policy = fast_adapt_trpo_a2c(new_policy, train_episodes, baseline,
                                             fast_lr, gamma, tau, first_order=False, device=device)

        for train_episodes in train_replays:
            # Calculate loss & fit the value function
            loss = trpo_a2c_loss(train_episodes, new_policy, baseline, params['gamma'], params['tau'], device)

            # First or Second order derivatives
            gradients = torch.autograd.grad(loss, new_policy.parameters(),
                                            retain_graph=True,  # First order = False
                                            create_graph=True)

            # Perform a MAML update of all the parameters in the model variable using the gradients above
            new_policy = l2l.algorithms.maml.maml_update(new_policy, params['inner_lr'], gradients)

Remove MAML module wrapper in maml_trpo and anil_trpo

MAML-PPO
MAML-TRPO
ANIL-TRPO
ANIL-PPO

Make the code available in the repo for testing.

Add vision policies in repo

instead of modifying l2l

The output of the net is not correct.

When I train my agent, the action is always -1 or 1 (after clipping). How can I solve this problem? Is the learning rate too large?