text-to-video-synthesis-colab

🐣 Please follow me for new updates: https://twitter.com/camenduru
🔥 Please join our Discord server: https://discord.gg/k5BwmmvJJU

🚦 WIP 🚦

Colab            Type
Open In Colab    Original
Open In Colab    Diffusers (Recommended)
Open In Colab    Watermark Remover

Tutorial Video

https://www.youtube.com/watch?v=b8D4am73e6I

It may be possible to change the video duration by editing max_frames in /content/models/configuration.json:

{   "framework": "pytorch",
    "task": "text-to-video-synthesis",
    "model": {
        "type": "latent-text-to-video-synthesis",
        "model_args": {
            "ckpt_clip": "open_clip_pytorch_model.bin",
            "ckpt_unet": "text2video_pytorch_model.pth",
            "ckpt_autoencoder": "VQGAN_autoencoder.pth",
            "max_frames": 16,
            "tiny_gpu": 1
        },
        "model_cfg": {
            "unet_in_dim": 4,
            "unet_dim": 320,
            "unet_y_dim": 768,
            "unet_context_dim": 1024,
            "unet_out_dim": 4,
            "unet_dim_mult": [1, 2, 4, 4],
            "unet_num_heads": 8,
            "unet_head_dim": 64,
            "unet_res_blocks": 2,
            "unet_attn_scales": [1, 0.5, 0.25],
            "unet_dropout": 0.1,
            "temporal_attention": "True",
            "num_timesteps": 1000,
            "mean_type": "eps",
            "var_type": "fixed_small",
            "loss_type": "mse"
        }
    },
    "pipeline": {
        "type": "latent-text-to-video-synthesis"
    }
}
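
The edit suggested above could be scripted rather than done by hand. The following is a minimal sketch, not part of the original notebook: the helper name set_max_frames is hypothetical, and it assumes the pipeline reads configuration.json when it is loaded, so the change would need to happen before that point. Whether the model gives usable results with more than 16 frames is not confirmed here.

```python
import json
from pathlib import Path


def set_max_frames(config_path, max_frames):
    """Set model.model_args.max_frames in a configuration.json file.

    Hypothetical helper: returns the previous value so it can be restored.
    """
    max_frames = int(max_frames)
    if max_frames < 1:
        raise ValueError("max_frames must be a positive integer")

    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    # Only this one key is changed; every other setting is written back as read.
    model_args = config["model"]["model_args"]
    previous = model_args["max_frames"]
    model_args["max_frames"] = max_frames

    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return previous
```

In the Colab this might be called as set_max_frames("/content/models/configuration.json", 24) before the pipeline is created. More frames will likely need more GPU memory.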

Main Repo

https://www.modelscope.cn/models/damo/text-to-video-synthesis/summary
https://github.com/modelscope/modelscope
https://github.com/huggingface/diffusers

Models License

Apache License 2.0

Examples

A giraffe underneath a microwave.
A goldendoodle playing in a park by a lake.
A panda bear driving a car.
A teddy bear running in New York City.
Drone flythrough of a fast food restaurant on a dystopian alien planet.
A dog wearing a Superhero outfit with red cape flying through the sky.
Monkey learning to play the piano.
A litter of puppies running through the yard.
Robot dancing in times square.

Related Colab

https://github.com/camenduru/text2video-zero-colab
