
MusicTransformer written for Maestro V2, using the Pytorch framework, for music generation

License: MIT License


musictransformer-pytorch's Introduction

Music Transformer

Open in Colab

Currently supports Pytorch >= 1.2.0 with Python >= 3.6

There is now a much friendlier Google Colab version of this project courtesy of Alex!

About

This is a reproduction of the MusicTransformer (Huang et al., 2018) for Pytorch. This implementation utilizes the generic Transformer implementation introduced in Pytorch 1.2.0 (https://pytorch.org/docs/stable/nn.html#torch.nn.Transformer).
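As a reference, a minimal sketch of instantiating that generic module (hyperparameters here mirror the results section below; shapes follow the (seq_len, batch, d_model) convention of that Pytorch version):

import torch
import torch.nn as nn

# The generic encoder-decoder Transformer this repo builds on; nothing
# here is music-specific yet.
model = nn.Transformer(d_model=512, nhead=8, num_encoder_layers=6,
                       num_decoder_layers=6, dim_feedforward=1024,
                       dropout=0.1)

src = torch.rand(256, 2, 512)  # (seq_len, batch, d_model)
tgt = torch.rand(256, 2, 512)
out = model(src, tgt)          # -> (256, 2, 512)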

Generated Music:

Various music results (MIDI and MP3) are in the following Google Drive folder:
https://drive.google.com/drive/folders/1qS4z_7WV4LLgXZeVZU9IIjatK7dllKrc?usp=sharing

See the results section for the model hyperparameters used for generation.

The MP3 results were played through a Kawai MP11SE. To play the .mid files, we used MidiEditor, which is free to use and open source.

TODO

  • Write own midi pre-processor (sustain pedal errors with jason9693's)
    • Support any midi file beyond Maestro
  • Fixed length song generation
  • Midi augmentations from paper
  • Multi-GPU support

How to run

  1. Download the Maestro dataset (we used v2, but v1 should work as well). You only need the MIDI version if you're tight on space.

  2. Run git submodule update --init --recursive to get the MIDI pre-processor provided by jason9693 (https://github.com/jason9693/midi-neural-processor), which is used to convert MIDI files into the discrete, ordered message types used for training and evaluation.

  3. Run preprocess_midi.py -output_dir <path_to_save_output> <path_to_maestro_data>, or run with --help for details. This will write the pre-processed data into a folder split into train, val, and test sets, per Maestro's recommendation.

  4. To train a model, run train.py. Use --help to see the tweakable parameters. See the results section for details on model performance.

  5. After training models, you can evaluate them with evaluate.py and generate a MIDI piece with generate.py. To graph and compare results visually, use graph_results.py.

Most arguments can be left at their default values. If you are using a different dataset location or similar, you will need to specify that in the arguments; beyond that, the average user does not need to worry about most of them.

Training

As an example, to train a model using the parameters specified in results:

python train.py -output_dir rpr --rpr 

You can additionally specify a weight modulus and a print modulus, which determine on which epochs weights are saved and which batches are printed. The weights that achieved the best loss and the best accuracy (tracked separately) are always stored in results, regardless of the weight modulus.
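For example (the modulus flag names below are assumptions for illustration; check python train.py --help for the exact spelling):

python train.py -output_dir rpr --rpr -weight_modulus 10 -print_modulus 50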

Evaluation

You can evaluate a model using:

python evaluate.py -model_weights rpr/results/best_acc_weights.pickle --rpr

Your model's results may vary because a random sequence start position is chosen for each evaluation piece. This may be changed in the future.

Generation

You can generate a piece with a trained model by using:

python generate.py -output_dir output -model_weights rpr/results/best_acc_weights.pickle --rpr

The default generation method samples from a probability distribution, with the softmaxed output as the weights. Beam search is also available, but it simply does not work well and is not recommended.
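A minimal sketch of that default sampling step (function and variable names are illustrative, not the repo's):

import torch

def sample_next_token(logits):
    # Treat the softmaxed logits for the next token as a categorical
    # distribution and draw a single sample from it.
    probs = torch.softmax(logits, dim=-1)  # (vocab_size,)
    return torch.multinomial(probs, num_samples=1).item()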

Pytorch Transformer

We used the Transformer class provided since Pytorch 1.2.0 (https://pytorch.org/docs/stable/nn.html#torch.nn.Transformer). The provided Transformer assumes an encoder-decoder architecture. To make it decoder-only, like the Music Transformer, we use stacked encoders with a custom dummy decoder. This decoder-only model can be found in model/music_transformer.py.
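A sketch of that trick, approximating what model/music_transformer.py does (names and exact arguments may differ):

import torch.nn as nn

class DummyDecoder(nn.Module):
    # Passes the encoder memory straight through, so nn.Transformer
    # degenerates into a stack of encoder layers.
    def forward(self, tgt, memory, tgt_mask=None, memory_mask=None,
                tgt_key_padding_mask=None, memory_key_padding_mask=None,
                **kwargs):
        return memory

transformer = nn.Transformer(d_model=512, nhead=8, num_encoder_layers=6,
                             num_decoder_layers=0, dim_feedforward=1024,
                             dropout=0.1, custom_decoder=DummyDecoder())

A causal (square subsequent) mask on the encoder's self-attention is what keeps the stack autoregressive, as a decoder would be.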

At the time this reproduction was produced, the Pytorch Transformer code had no support for Relative Position Representations (RPR) (Shaw et al., 2018). To account for this, we modified the Pytorch 1.2.0 Transformer code to support it, based on the more memory-efficient skewing method proposed by Huang et al. You can find the modified code in model/rpr.py. This modified Pytorch code will not be kept up to date and will be removed once Pytorch provides RPR support.
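For reference, a sketch of the skewing trick itself (a paraphrase of the idea behind model/rpr.py, not the actual code):

import torch
import torch.nn.functional as F

def skew(qe):
    # qe: (batch*heads, L, L) logits of queries against relative position
    # embeddings. Pad one column on the left, reshape, and drop the first
    # row so each relative logit lands at its absolute (i, j) position.
    L = qe.size(-1)
    padded = F.pad(qe, (1, 0))               # (batch*heads, L, L+1)
    reshaped = padded.reshape(-1, L + 1, L)  # (batch*heads, L+1, L)
    return reshaped[:, 1:, :]                # (batch*heads, L, L)

This avoids materializing the O(L^2 * d) relative-embedding tensor of the original Shaw et al. formulation, which is where the memory savings come from.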

Results

We trained a base and an RPR model with the following parameters (taken from the paper) for 300 epochs:

  • learn_rate: None
  • ce_smoothing: None
  • batch_size: 2
  • max_sequence: 2048
  • n_layers: 6
  • num_heads: 8
  • d_model: 512
  • dim_feedforward: 1024
  • dropout: 0.1
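Loosely, in code (the argument names follow model/music_transformer.py as a guess and may not match exactly):

from model.music_transformer import MusicTransformer

model = MusicTransformer(n_layers=6, num_heads=8, d_model=512,
                         dim_feedforward=1024, dropout=0.1,
                         max_sequence=2048, rpr=True)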

The following graphs were generated with the command:

python graph_results.py -input_dirs base_model/results?rpr_model/results -model_names base?rpr

Note that multiple input directories and model names are separated with a '?'.

Loss Results Graph

Accuracy Results Graph

Learn Rate Results Graph

Best loss for base model: 1.99 on epoch 250
Best loss for rpr model: 1.92 on epoch 216

Discussion

The results were overall close to those in the paper. Huang et al. reported a loss of around 1.8 for both the base and rpr models on Maestro V1. We use Maestro V2 and perform none of the midi augmentations discussed in their paper. Furthermore, there are issues with how sustain is handled, which can be observed by listening to some pre-processed midi files. Adding those augmentations and fixes may bring the loss results in line with the paper.

musictransformer-pytorch's People

Contributors

cnelias, gwinndr, kwikwag, marshallr10, myrickbr


musictransformer-pytorch's Issues

Bug when sequences are smaller than max_seq

Hi!
Small bug to correct at line 92 of e_piano.py:

tgt[raw_len] = TOKEN_END throws an error because raw_len is out of bounds. This should be raw_len - 1. It is not a problem with the Maestro dataset, which probably contains more than max_seq=2048 events for every performance, but it can show up with custom datasets.

Thank you very much :)
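A hedged sketch of the proposed fix (the token constants and surrounding padding logic are assumptions for illustration, not the repo's actual code):

import torch

TOKEN_PAD, TOKEN_END, MAX_SEQ = 0, 1, 2048  # illustrative values only

def pad_target(raw_events):
    # Assumes 1 <= len(raw_events) <= MAX_SEQ, the case this issue describes.
    raw_len = len(raw_events)
    tgt = torch.full((MAX_SEQ,), TOKEN_PAD, dtype=torch.long)
    tgt[:raw_len] = torch.tensor(raw_events, dtype=torch.long)
    tgt[raw_len - 1] = TOKEN_END  # was tgt[raw_len], which can index out of bounds
    return tgt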

Hey Damon,

I just wanted to say hi and thank you for your fantastic work on the Music Transformer PyTorch repo/code.

It works great and I was able to reproduce the results.

I have created a nice Google Colab with your repo/code, so I wanted to invite you to check it out. If you like it, feel free to add it to your repo so that people can try it easily and quickly.

https://github.com/asigalov61/SuperPiano/blob/master/Super_Piano_3.ipynb

Sincerely,

Alex

Use validation set when selecting the best epoch

The best epoch needs to be selected based on the validation set, and that epoch used to report the final accuracy and loss.

Using the test set to select the best epoch biases the model towards the test set. The test set is supposed to represent "in the wild" data and evaluate how well the final model performs on yet-to-be-seen data.

Hello, I would like to know which of MusicTransformer-Pytorch and Optimus-VIRTUOSO gives better results for music generation, and where I can find your sample dataset. I want to test or train your project to see the results. I am a beginner and don't know much yet, so I would appreciate guidance from you both. Thanks.

Changing the target length raises an error

When I try to generate longer MIDI by changing the target_seq_length parameter (in argument_funcs.py) from its default of 1024 to 4096, the issue can be reproduced:
RuntimeError: The size of tensor a (2049) must match the size of tensor b (2048) at non-singleton dimension 0

Is this a limit of the Transformer itself, or do the MusicTransformer settings cause the problem?

pre-trained version?

Hi! Is there a way to access the weights to get a pre-trained version of the network? I have searched the repo but couldn't find anything, and I cannot train such a network on my small laptop.

Continue training error and trained model

In train.py, line 137:
for epoch in range(start_epoch, args.epochs):
When I try to continue training from a saved model weight, there is an error. Maybe it should be start_epoch + args.epochs?
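A sketch of that suggestion, assuming args.epochs is meant as a count of additional epochs to run after resuming (that reading is an assumption):

for epoch in range(start_epoch, start_epoch + args.epochs):
    pass  # unchanged training-loop body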

I would also like to ask if you can provide trained weights, which would be very convenient.

Thank you

Working Google Colab version

Hey Damon,

I think I finally did it: I was able to make a fully working Google Colab that actually plays well. I used my TMIDI processors, and I also streamlined the colab/implementation.

https://github.com/asigalov61/SuperPiano/blob/master/%5BTMIDI%5D_Super_Piano_3.ipynb

The only thing it does not have is the control_changes/program_changes/sustains (I still need to implement them in my processors), but it still plays pretty well IMHO on my dataset. Not sure about MAESTRO, but you are welcome to try it.

Let me know if it is useful.

Thanks.

Alex.
