tomoyukiorita / cog-musicgen-remixer

This project forked from sakemin/cog-musicgen-remixer

Music remixer based on MusicGen-Chord

Home Page: https://replicate.com/sakemin/musicgen-remixer

License: Apache License 2.0

Python 98.07% Jupyter Notebook 1.93%

cog-musicgen-remixer's Introduction

Cog Implementation of MusicGen-Remixer

MusicGen Remixer is an app based on MusicGen Chord, a modified version of Meta's MusicGen Melody model that generates music conditioned on chords supplied either as audio or as text.

You can demo this model, or learn how to use it with Replicate's API, on its Replicate page: https://replicate.com/sakemin/musicgen-remixer

Run with Cog

Cog is an open-source tool that packages machine learning models in a standard, production-ready container. You can deploy your packaged model to your own infrastructure, or to Replicate, where users can interact with it via web interface or API.

Prerequisites

Cog. Follow these instructions to install Cog, or just run:

sudo curl -o /usr/local/bin/cog -L "https://github.com/replicate/cog/releases/latest/download/cog_$(uname -s)_$(uname -m)"
sudo chmod +x /usr/local/bin/cog

Note: to use Cog, you also need Docker installed.

Step 1. Clone this repository

git clone https://github.com/sakemin/cog-musicgen-remixer

Step 2. Run the model

To run the model, you need a local copy of the model's Docker image. One way to get it is to specify the image ID in your call to cog predict:

cog predict r8.im/sakemin/musicgen-remixer@sha256:a31601435459035e7c3b0b8e56db374ebf5ae483f2e9c7c25e41c6ac3761fe52 -i prompt="bossa nova" -i music_input=@/your/path/to/input/music.wav

For more information, see the Cog section here

Alternatively, you can build the image yourself, either by running cog build or by letting cog predict trigger the build process implicitly. For example, the following will trigger the build process and then execute prediction:

cog predict -i prompt="bossa nova" -i music_input=@/your/path/to/input/music.wav

Note, the first time you run cog predict, model weights and other requisite assets will be downloaded if they're not available locally. This download only needs to be executed once.
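
The two invocations above differ only in whether an image reference follows cog predict. As a minimal sketch, the helper below assembles such a command line from Python. The function name build_cog_predict_command and its arguments are hypothetical and not part of Cog or this repository; only the -i name=value and -i name=@path forms are taken from the commands shown above.

```python
import shlex


def build_cog_predict_command(inputs, file_inputs=None, image=None):
    """Return the argv list for a `cog predict` call (hypothetical helper).

    inputs:      plain values, passed as -i name=value
    file_inputs: local file paths, passed as -i name=@path
    image:       optional image reference, e.g. r8.im/owner/model@sha256:...
    """
    argv = ["cog", "predict"]
    if image:
        # With an image reference, Cog uses that image instead of a local build.
        argv.append(image)
    for name, value in inputs.items():
        argv += ["-i", f"{name}={value}"]
    for name, path in (file_inputs or {}).items():
        # The @ prefix marks the value as a file on disk.
        argv += ["-i", f"{name}=@{path}"]
    return argv


argv = build_cog_predict_command(
    {"prompt": "bossa nova"},
    file_inputs={"music_input": "/your/path/to/input/music.wav"},
)
# shlex.join quotes arguments that contain spaces, so the result can be pasted into a shell.
print(shlex.join(argv))
```

The list form can be handed to subprocess.run directly, which avoids shell quoting problems when the prompt contains spaces or quotes.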

Run on Replicate

Step 1. Ensure that all assets are available locally

If you haven't already, you should ensure that your model runs locally with cog predict. This will guarantee that all assets are accessible. E.g., run:

cog predict -i prompt="bossa nova" -i music_input=@/your/path/to/input/music.wav

Step 2. Create a model on Replicate.

Go to replicate.com/create to create a Replicate model. If you want to keep the model private, make sure to specify "private".

Step 3. Configure the model's hardware

Replicate supports running models on a variety of CPU and GPU configurations. For the best performance, run this model on an A100 instance.

Click on the "Settings" tab on your model page, scroll down to "GPU hardware", and select "A100". Then click "Save".

Step 4. Push the model to Replicate

Log in to Replicate:

cog login

Push the contents of your current directory to Replicate, using the model name you specified in Step 2:

cog push r8.im/username/modelname

Learn more about pushing models to Replicate.


Prediction

Prediction Parameters

  • model_version: Model type. Computations take longer when using large or stereo models.
  • prompt: A description of the music you want to generate.
  • music_input: An audio file input for the remix.
  • multi_band_diffusion: If True, the EnCodec tokens will be decoded with MultiBand Diffusion. Not compatible with stereo models.
  • normalization_strategy: Strategy for normalizing audio.
  • beat_sync_threshold: When beat syncing, if the gap between a generated downbeat and the corresponding input-audio downbeat is larger than beat_sync_threshold, the two beats are treated as not corresponding.
  • chroma_coefficient: Coefficient by which the multi-hot chord chroma is multiplied.
  • top_k: Reduces sampling to the k most likely tokens.
  • top_p: Reduces sampling to tokens with cumulative probability of p. When set to 0 (default), top_k sampling is used.
  • temperature: Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.
  • classifier_free_guidance: Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.
  • output_format: Output format for generated audio, either "wav" or "mp3".
  • seed: Seed for random number generator. If None or -1, a random seed will be used.
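
To make the sampling parameters above more concrete, here is a minimal pure-Python sketch of how temperature, top_k, top_p, and classifier-free guidance are commonly applied to a vector of token logits. It is an illustration under assumptions, not the code this model runs: the function names are hypothetical, the guidance formula is the usual textbook one, and treating top_k <= 0 as "keep everything" is a choice made here for safety.

```python
import math
import random


def guided_logits(cond, uncond, guidance):
    """Common classifier-free guidance mix: uncond + guidance * (cond - uncond)."""
    return [u + guidance * (c - u) for c, u in zip(cond, uncond)]


def filter_probs(logits, top_k, top_p=0.0, temperature=1.0):
    """Return {token_index: probability} after temperature and top-k/top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices ordered from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_p > 0.0:
        # Nucleus sampling: smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
    else:
        # top_p == 0 falls back to top-k, as the parameter list describes.
        kept = order[:top_k] if top_k > 0 else order

    # Renormalize so the surviving tokens sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, rng, top_k, top_p=0.0, temperature=1.0):
    """Draw one token index from the filtered distribution."""
    dist = filter_probs(logits, top_k, top_p=top_p, temperature=temperature)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


logits = guided_logits(cond=[2.0, 1.0, 0.5, -1.0], uncond=[1.0, 1.0, 1.0, 1.0], guidance=3.0)
print(filter_probs(logits, top_k=2))
print(sample_token(logits, random.Random(0), top_k=2))
```

Raising temperature flattens the distribution before filtering, while a larger guidance value pushes the logits further from the unconditional prediction, which is why it reduces variance in the output.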

Multi-Band Diffusion

  • Multi-Band Diffusion (MBD) is an alternative decoder for the EnCodec tokens.
  • When the tokens are decoded with MBD, the output audio quality is better.
  • Using MBD takes more computation time, since it runs its own prediction sequence.
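
Several of the parameters above carry constraints: MBD cannot be combined with stereo models, output_format accepts only "wav" or "mp3", and a seed of None or -1 means "pick one at random". The sketch below checks these before a prediction is submitted. It is an assumption-laden illustration rather than this repository's code: the function names are hypothetical, stereo models are assumed to have "stereo" in their model_version string, and the 32-bit seed range is an arbitrary choice.

```python
import random

ALLOWED_FORMATS = {"wav", "mp3"}


def resolve_seed(seed, rng=random):
    """Return the seed to use; None or -1 requests a random one."""
    if seed is None or seed == -1:
        # Assumption: any non-negative 32-bit integer is an acceptable seed.
        return rng.randrange(2**32)
    return seed


def check_inputs(model_version, multi_band_diffusion, output_format):
    """Raise ValueError for combinations the parameter list rules out."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"output_format must be one of {sorted(ALLOWED_FORMATS)}")
    # Assumption: stereo variants are identified by "stereo" in the version name.
    if multi_band_diffusion and "stereo" in model_version:
        raise ValueError("multi_band_diffusion is not compatible with stereo models")


# "chord" and "stereo-chord" are placeholder version names for this example.
check_inputs("chord", multi_band_diffusion=True, output_format="wav")
print(resolve_seed(-1, random.Random(0)))
```

Failing early on these checks avoids waiting for a container to start only to have the prediction rejected.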
