hvision-nku / storydiffusion


Create Magic Story!

License: Apache License 2.0

Jupyter Notebook 98.62% Python 1.38%


storydiffusion's Issues

UnicodeDecodeError: 'gbk' codec can't decode byte 0xb2 in position 1972: illegal multibyte sequence

It seems to be a Unicode decoding error.

Using quantized version with the pipeline

Hello

I am trying to run the comic generation notebook with a quantized version, using SSD-1B, so that it fits in my 8 GB of VRAM. However, I get the error below:

The expanded size of the tensor (676) must match the existing size (2500) at non-singleton dimension 3. Target sizes: [2, 20, 676, 676]. Tensor sizes: [2500, 2500]

while running the line:

id_images = pipe(id_prompts, num_inference_steps = num_steps, guidance_scale=guidance_scale, height = height, width = width,negative_prompt = negative_prompt,generator = generator).images

Can you help solve this?

Thanks

UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ..\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:455.)

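With the default examples, the characters always wear the same clothes. Why does the character's clothing keep changing when I write my own prompts, and how should I write prompts to keep the clothing consistent?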

How much VRAM do I need to run this on Gradio?

Can an RTX 3090/4090 handle this?

Also, do you plan to release the weights on GitHub?

Thanks in advance!

Cannot find StoryDiffusion in SD

I downloaded StoryDiffusion from the URL and installed it under C:\stable-diffusion-webui\extensions\StoryDiffusion, but it does not show up in the webui interface. Why can't I find it?

ValueError: cannot find context for 'fork'

Windows 11. Tried using conda exactly as per instructions. No errors during pip.

(storydiffusion) PS E:\hal\pinokio\api\StoryDiffusion\app> python .\app.py
Traceback (most recent call last):
  File "E:\hal\pinokio\api\StoryDiffusion\app\app.py", line 4, in <module>
    import spaces
  File "C:\Users\hal\miniconda3\envs\storydiffusion\lib\site-packages\spaces\__init__.py", line 10, in <module>
    from .zero.decorator import GPU
  File "C:\Users\hal\miniconda3\envs\storydiffusion\lib\site-packages\spaces\zero\decorator.py", line 18, in <module>
    from .wrappers import regular_function_wrapper
  File "C:\Users\hal\miniconda3\envs\storydiffusion\lib\site-packages\spaces\zero\wrappers.py", line 42, in <module>
    Process = multiprocessing.get_context('fork').Process
  File "C:\Users\hal\miniconda3\envs\storydiffusion\lib\multiprocessing\context.py", line 243, in get_context
    return super().get_context(method)
  File "C:\Users\hal\miniconda3\envs\storydiffusion\lib\multiprocessing\context.py", line 193, in get_context
    raise ValueError('cannot find context for %r' % method) from None
ValueError: cannot find context for 'fork'
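
The traceback shows the `spaces` package calling `multiprocessing.get_context('fork')` at import time. The `fork` start method is not available on Windows, which is why the lookup fails there. A minimal sketch of checking what the platform offers before asking for a context; `pick_start_method` is a hypothetical helper for illustration and is not part of `spaces` or StoryDiffusion:

```python
import multiprocessing


def pick_start_method(preferred="fork"):
    """Return `preferred` if this platform supports it, else fall back to 'spawn'."""
    available = multiprocessing.get_all_start_methods()
    return preferred if preferred in available else "spawn"


if __name__ == "__main__":
    method = pick_start_method()
    ctx = multiprocessing.get_context(method)
    print("available:", multiprocessing.get_all_start_methods())
    print("using:", method, ctx.Process)
```

This does not patch `spaces` itself. The other reports on this page launch `gradio_app_sdxl_specific_id.py`, and their tracebacks do not pass through `spaces`, so that script may be the more practical entry point on Windows.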

utils

Miniconda is installed.
When running the first cell of the comic notebook in Jupyter:

ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 14
12 from tqdm.auto import tqdm
13 from datetime import datetime
---> 14 from utils.gradio_utils import is_torch2_available
15 if is_torch2_available():
16 from utils.gradio_utils import
17 AttnProcessor2_0 as AttnProcessor

ModuleNotFoundError: No module named 'utils.gradio_utils'
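
This error usually means Python resolved `utils` to something other than the repository's own `utils` folder, for example because the notebook's working directory is not the repository root, or because another installed package named `utils` is found first. The sketch below assumes that cause; `ensure_repo_on_path` and the path passed to it are hypothetical and not part of the repository:

```python
import sys
from pathlib import Path


def ensure_repo_on_path(repo_root):
    """Put repo_root first on sys.path if it contains utils/gradio_utils.py.

    Returns True when the expected module file was found.
    """
    root = Path(repo_root).resolve()
    module_file = root / "utils" / "gradio_utils.py"
    if not module_file.is_file():
        return False
    root_str = str(root)
    if root_str in sys.path:
        sys.path.remove(root_str)
    sys.path.insert(0, root_str)  # searched before site-packages
    return True


if __name__ == "__main__":
    # Hypothetical location: replace "." with wherever StoryDiffusion was cloned.
    print(ensure_repo_on_path("."))
```

If a different `utils` was already imported in the same kernel, restarting the kernel before retrying is likely needed, since Python caches imported modules.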

Plans for 1920 x 1080?

Hi -
I'm curious if you can reveal any rough timelines of when 1920 x 1080 video may be available?
Thanks!

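The vae folder of SG161222--RealVisXL_V4.0 is missing diffusion_pytorch_model.bin. How can this be resolved?

The error is as follows.

But that project never contained a diffusion_pytorch_model.bin file in the first place.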

'gbk' codec can't decode byte 0xb2 in position 1972: illegal multibyte sequence

Opening http://0.0.0.0:7860 shows nothing, just an error page

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\transformers\utils\generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\transformers\utils\hub.py:123: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
warnings.warn(
C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\transformers\utils\generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree.register_pytree_node(
A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
File "C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\xformers_init.py", line 55, in _is_triton_available
from xformers.triton.softmax import softmax as triton_softmax # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\xformers\triton\softmax.py", line 11, in
import triton
ModuleNotFoundError: No module named 'triton'
C:\Tools\MiniConda\envs\storydiff\Lib\site-packages\diffusers\utils\outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 7/7 [00:03<00:00, 2.21it/s]
successsfully load paired self-attention
number of the processor : 36
Running on local URL: http://0.0.0.0:7860

To create a public link, set share=True in launch().
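
The log ends with `Running on local URL: http://0.0.0.0:7860`, so the server itself started. `0.0.0.0` is a bind address meaning "all interfaces"; browsers, on Windows in particular, often cannot open it as a destination. A small sketch that rewrites such a URL to the loopback address, with `browsable_url` being a hypothetical helper:

```python
from urllib.parse import urlsplit


def browsable_url(url):
    """Swap the 0.0.0.0 bind address for the loopback address, keeping the port."""
    parts = urlsplit(url)
    if parts.hostname != "0.0.0.0":
        return url
    netloc = "127.0.0.1" if parts.port is None else f"127.0.0.1:{parts.port}"
    return parts._replace(netloc=netloc).geturl()


if __name__ == "__main__":
    print(browsable_url("http://0.0.0.0:7860"))
```

In practice this just means trying http://127.0.0.1:7860 or http://localhost:7860 in the browser on the same machine.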

Is it possible to swap in your own image model, for example an SD 1.5 or XL model you downloaded yourself?

NameError: name 'pipe' is not defined

Running with gradio, getting this error:

Traceback (most recent call last):
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/queueing.py", line 501, in call_prediction
output = await route_utils.call_process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/route_utils.py", line 258, in call_process_api
output = await app.get_blocks().process_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/blocks.py", line 1710, in process_api
result = await self.call_function(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/blocks.py", line 1262, in call_function
prediction = await utils.async_iteration(iterator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/utils.py", line 517, in async_iteration
return await iterator.anext()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/utils.py", line 510, in anext
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/utils.py", line 493, in run_sync_iterator_async
return next(iterator)
^^^^^^^^^^^^^^
File "/data/misc/storydiffusion_env/lib/python3.11/site-packages/gradio/utils.py", line 676, in gen_wrapper
response = next(iterator)
^^^^^^^^^^^^^^
File "/data/misc/StoryDiffusion/gradio_app_sdxl_specific_id.py", line 505, in process_generation
del pipe
^^^^
NameError: name 'pipe' is not defined
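
The traceback ends at `del pipe` inside `process_generation`. A `del` on a name that is not currently bound raises `NameError`, which can happen if the pipeline was never created or was already deleted by an earlier run. The following is only a sketch of a guarded release pattern, not the repository's code; `_state`, `set_pipe`, and `release_pipe` are hypothetical names:

```python
import gc

_state = {"pipe": None}  # holds the currently loaded pipeline, if any


def set_pipe(obj):
    """Remember the loaded pipeline object."""
    _state["pipe"] = obj


def release_pipe():
    """Drop the pipeline reference if one exists; return True if something was released."""
    if _state["pipe"] is None:
        return False
    _state["pipe"] = None  # rebinding instead of `del` keeps the slot defined
    gc.collect()           # encourage prompt cleanup of the released object
    return True
```

Because the slot always exists, calling `release_pipe()` twice in a row is harmless, whereas a second `del pipe` would raise.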

multiple subjects?

Hello,

Thanks for your nice work!

Does the technique support multiple subjects/characters?

cheers

Does not support multiple trigger words in a single prompt.

ValueError: PhotoMaker currently does not support multiple trigger words in a single prompt.
Trigger word: img, Prompt: anime artwork illustrating The car on the road, near the forest . created by japanese anime studio. highly emotional. best quality, high resolution.
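
Note that the prompt quoted in the error contains no `img` at all, which suggests (as an inference, not something confirmed here) that the check may also fire when the trigger word is missing from a prompt, not only when it appears more than once. A rough pre-check sketch follows; it works on whitespace-separated words, whereas PhotoMaker works on tokenizer output, so results may differ. `count_trigger_words` and `check_prompt` are hypothetical helpers:

```python
def count_trigger_words(prompt, trigger_word="img"):
    """Count whitespace-separated words equal to trigger_word (punctuation stripped)."""
    words = (w.strip(".,;:!?\"'()[]") for w in prompt.split())
    return sum(1 for w in words if w == trigger_word)


def check_prompt(prompt, trigger_word="img"):
    """Return 'ok', 'missing' or 'multiple' for a single prompt."""
    n = count_trigger_words(prompt, trigger_word)
    if n == 1:
        return "ok"
    return "missing" if n == 0 else "multiple"


if __name__ == "__main__":
    for p in ("a man img riding a bike", "The car on the road, near the forest"):
        print(repr(p), "->", check_prompt(p))
```

Running a check like this over every panel description before generation would at least show which prompt lacks the trigger word.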

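I see the code has been updated. Does it now support uploading multiple images and controlling multiple characters in a single image? How should the trigger words be written?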

How to support long-range video?

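Thanks a lot for your excellent work. I would like to know how StoryDiffusion uses AnimateDiff to generate long videos. AnimateDiff generates 16 frames at a time by default; if a long video needs 16*6 frames, how is a smooth transition between two 16-frame clips ensured? The original AnimateDiff does not seem to provide this. Is the last frame of the previous generation used as the first frame of the next one? Are there any tricks (for example, copying the frame directly at inference time and then adding noise)?

When uploading multiple reference images at once, how should the trigger word img for different characters be used in the same image so that each reference image is matched to the corresponding character?

Interaction between two or more characters does not work

The model cannot tell the characters apart; an expression is applied to all characters in the same frame at once.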

python .\gradio_app_sdxl_specific_id.py ERROR

python .\gradio_app_sdxl_specific_id.py
Traceback (most recent call last):
  File "D:\StoryDiffusion\gradio_app_sdxl_specific_id.py", line 2, in <module>
    import gradio as gr
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\__init__.py", line 3, in <module>
    import gradio._simple_templates
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\_simple_templates\__init__.py", line 1, in <module>
    from .simpledropdown import SimpleDropdown
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\_simple_templates\simpledropdown.py", line 6, in <module>
    from gradio.components.base import FormComponent
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\components\__init__.py", line 40, in <module>
    from gradio.components.multimodal_textbox import MultimodalTextbox
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\components\multimodal_textbox.py", line 28, in <module>
    class MultimodalTextbox(FormComponent):
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\component_meta.py", line 198, in __new__
    create_or_modify_pyi(component_class, name, events)
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\site-packages\gradio\component_meta.py", line 92, in create_or_modify_pyi
    source_code = source_file.read_text()
  File "C:\Users\aipc2\anaconda3\envs\storydiffusion\lib\pathlib.py", line 1135, in read_text
    return f.read()
UnicodeDecodeError: 'cp950' codec can't decode byte 0xe2 in position 1970: illegal multibyte sequence
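
The traceback shows `read_text()` being called without an `encoding` argument, so Python falls back to the locale's preferred encoding (cp950 here, gbk in the other reports on this page) and fails on bytes that are valid UTF-8. The sketch below illustrates the mechanism with an explicit encoding; `read_text_safely` and `can_decode` are hypothetical helpers, and the sample text is made up:

```python
import tempfile
from pathlib import Path


def read_text_safely(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the locale default."""
    return Path(path).read_text(encoding=encoding)


def can_decode(data, encoding):
    """Return True if the bytes decode under the given codec without error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # A right single quotation mark is the bytes e2 80 99 in UTF-8.
    sample = "it\u2019s".encode("utf-8")
    for codec in ("utf-8", "cp950", "gbk", "ascii"):
        print(codec, can_decode(sample, codec))
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "sample.pyi"
        target.write_bytes(sample)
        print(ascii(read_text_safely(target)))
```

Since the failing call is inside gradio rather than in user code, one possible workaround (untested here) is Python's UTF-8 mode, enabled by setting the environment variable `PYTHONUTF8=1` before launching the script, which makes UTF-8 the default for text file reads.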

Unable to find model named diffusion_pytorch_model.bin full errors in ticket

I am having trouble downloading this model.

Does anyone have an idea how to fix this?

(storydiffusion) PS Z:\GIT\StoryDiffusion> & C:/Users/MindExpander/.conda/envs/storydiffusion/python.exe z:/GIT/StoryDiffusion/gradio_app_sdxl_specific_id.py
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.0.1+cpu)
Python 3.10.11 (you have 3.10.14)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
Loading pipeline components...: 14%|█████████████▏ | 1/7 [00:00<00:00, 10.25it/s]
Traceback (most recent call last):
File "z:\GIT\StoryDiffusion\gradio_app_sdxl_specific_id.py", line 430, in
pipe = StableDiffusionXLPipeline.from_pretrained(sd_model_path, torch_dtype=torch.float16, use_safetensors=True if use_va else False)
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\huggingface_hub\utils_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1271, in from_pretrained
loaded_sub_model = load_sub_model(
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 525, in load_sub_model
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\huggingface_hub\utils_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\diffusers\models\modeling_utils.py", line 777, in from_pretrained
model_file = _get_model_file(
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\huggingface_hub\utils_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "C:\Users\MindExpander.conda\envs\storydiffusion\lib\site-packages\diffusers\utils\hub_utils.py", line 272, in _get_model_file
raise EnvironmentError(
OSError: Error no file named diffusion_pytorch_model.bin found in directory C:\Users\MindExpander.cache\huggingface\hub\models--SG161222--RealVisXL_V4.0\snapshots\49740684ab2d8f4f5dcf6c644df2b33388a8ba85\unet.
(storydiffusion) PS Z:\GIT\StoryDiffusion>

Final Comic Book is only Showing text for a few images

Hi there,

Why does the comic only have captions for a few of the images and not all of them?
For instance, the example below has captions for only about 2 images. How can I make sure there are captions for all of them?

Error in MPS version , M3, 16G

Getting the following error with [Unstable][a little puppy with big eyes][Japanese Anime] Comic Descriptions: [puppy standing next to a pond]
File "..../miniconda3/envs/storydiffusion/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "..../miniconda3/envs/storydiffusion/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 522, in forward
return self.processor(
File "..../ml_ps/StoryDiffusion/gradio_app_sdxl_specific_id_mps.py", line 137, in call
hidden_states = self.call1(attn, hidden_states,encoder_hidden_states,attention_mask,temb)
File "...../ml_ps/StoryDiffusion/gradio_app_sdxl_specific_id_mps.py", line 197, in call1
hidden_states = F.scaled_dot_product_attention(
RuntimeError: The size of tensor a (576) must match the size of tensor b (1152) at non-singleton dimension 3

License

Hi,
Thank you so much for releasing StoryDiffusion! The results are incredible!
I wanted to run this locally and was wondering what license applies. Would it be possible to add an open source license?
Thank you!

AttributeError: 'ImageDraw' object has no attribute 'textsize'

How can this issue be solved?
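
`ImageDraw.textsize` was removed in Pillow 10; `textlength` (which returns a single float width) and `textbbox` are the replacements. A small compatibility sketch that works with either generation of `ImageDraw` object; `text_width` is a hypothetical helper and not the repository's code:

```python
def text_width(draw, text, font=None):
    """Measure rendered text width on both old and new Pillow ImageDraw objects."""
    if hasattr(draw, "textlength"):
        # Newer Pillow: textlength returns one float, not a (width, height) tuple
        return draw.textlength(text, font=font)
    # Older Pillow: textsize returns (width, height)
    width, _height = draw.textsize(text, font=font)
    return width
```

Pinning an older Pillow release would also avoid the error, at the cost of staying on a version that still ships the removed method.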

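Hello, does this need high-end GPU hardware to run?

Question 1: as in the title.
Question 2: are there any copyright issues with the content it generates?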

How many GPUs are needed?

Is it possible to create a Stable Diffusion plugin in the form of a1111?

I believe this is an exceptional project because it leverages SD's models while addressing the problem of inconsistent image generation in Stable Diffusion. Integrating it into a1111's Stable Diffusion would be a groundbreaking achievement and would leave many people in awe. If a plugin for video generation is not feasible, could a plugin be developed for manga generation instead?

Running gradio under Windows

python gradio_app_sdxl_specific_id.py
gives
ValueError: The provided pretrained_model_name_or_path "/mnt/bn/yupengdata2/projects/PhotoMaker/RealVisXL_V4.0" is neither a valid local path nor a valid repo id. Please check the parameter.
which makes sense because that directory does not exist.
Changing the code to

models_dict = {
   "Juggernaut":"RunDiffusion/Juggernaut-XL-v8",
   "RealVision":"SG161222/RealVisXL_V4.0" ,
   "SDXL":"stabilityai/stable-diffusion-xl-base-1.0" ,
   "Unstable":"stablediffusionapi/sdxl-unstable-diffusers-y"
}

does get the models downloading when the script first starts, but then gives the error
OSError: Error no file named diffusion_pytorch_model.bin found in directory D:\.cache\hub\models--SG161222--RealVisXL_V4.0\snapshots\49740684ab2d8f4f5dcf6c644df2b33388a8ba85\vae.
How can we get the required models to download correctly? Thanks.
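
The reports on this page all fail while looking for `.bin` weights in a cached component folder, and the failing call passes `use_safetensors=True if use_va else False`. One plausible reading, stated here as an assumption, is that the downloaded snapshot only contains `.safetensors` weights while the loader was told not to use them. The sketch below inspects a local component folder to see which formats are actually present; `weight_formats` and `suggest_use_safetensors` are hypothetical helpers:

```python
from pathlib import Path


def weight_formats(component_dir):
    """Return the set of weight file suffixes ('.safetensors', '.bin') in a folder."""
    folder = Path(component_dir)
    if not folder.is_dir():
        return set()
    return {p.suffix for p in folder.iterdir() if p.suffix in {".safetensors", ".bin"}}


def suggest_use_safetensors(component_dir):
    """True if only .safetensors weights exist, False if only .bin, None otherwise."""
    found = weight_formats(component_dir)
    if found == {".safetensors"}:
        return True
    if found == {".bin"}:
        return False
    return None  # both formats or neither: no clear suggestion
```

Pointing it at the `vae` or `unet` folder named in the error message shows whether passing `use_safetensors=True` to `from_pretrained` is worth trying.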

About RandSample

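Very nice work. Two questions:
1) Which lines of code mainly implement the RandSample logic in Consistent Self-Attention?
2) Where do the tokens from different images within the batch, which are needed for sampling tokens, come from?

I also seem to have found two fairly obvious typos:
1) "Xk, Xq, and Xv stand for the query, key, and value used in attention calculation, respectively." (the order of the symbols does not match the order of the names)
2) In Algorithm 1, images_features and images_tokens are used inconsistently.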

Line 430 Error

(storydiffusion) Ubuntu@0021-kci-prxmx10011:/StoryDiffusion$ ls
app.py fonts README.md
Comic_Generation.ipynb gradio_app_sdxl_specific_id.py requirements.txt
examples images utils
(storydiffusion) Ubuntu@0021-kci-prxmx10011:/StoryDiffusion$ python gradio_app_sdxl_specific_id.py
Loading pipeline components...: 14%|█▊ | 1/7 [00:00<00:00, 61.42it/s]
Traceback (most recent call last):
File "/home/Ubuntu/StoryDiffusion/gradio_app_sdxl_specific_id.py", line 430, in
pipe = StableDiffusionXLPipeline.from_pretrained(sd_model_path, torch_dtype=torch.float16, use_safetensors=True if use_va else False)
File "/home/Ubuntu/miniconda3/envs/storydiffusion/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/home/Ubuntu/miniconda3/envs/storydiffusion/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 1271, in from_pretrained
loaded_sub_model = load_sub_model(
File "/home/Ubuntu/miniconda3/envs/storydiffusion/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 525, in load_sub_model
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "/home/Ubuntu/miniconda3/envs/storydiffusion/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3206, in from_pretrained
raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /home/Ubuntu/.cache/huggingface/hub/models--SG161222--RealVisXL_V4.0/snapshots/49740684ab2d8f4f5dcf6c644df2b33388a8ba85/text_encoder_2.

I got this error, any clue?

The advantage compared with ConsiStory.

Hi, StoryDiffusion is nice work on customized generation, and thanks for open-sourcing the code. Can you describe the differences and advantages of the proposed CSA compared with ConsiStory [1]?

[1] Training-Free Consistent Text-to-Image Generation https://arxiv.org/abs/2402.03286

fails to install all necessary files

python3 -m venv venv
call .\venv\Scripts\activate
pip install -r requirements.txt
python app.py
python pygradio_app_sdxl_specific_id.py

conda create --name storydiffusion python=3.10
conda activate storydiffusion
pip install -U pip

pip install -r requirements.txt

result:

(venv) D:\ai\Video\StoryDiffusion>python gradio_app_sdxl_specific_id.py
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.0.1+cpu)
Python 3.10.11 (you have 3.10.11)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]
Traceback (most recent call last):
File "D:\ai\Video\StoryDiffusion\gradio_app_sdxl_specific_id.py", line 430, in
pipe = StableDiffusionXLPipeline.from_pretrained(sd_model_path, torch_dtype=torch.float16, use_safetensors=True if use_va else False)
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\huggingface_hub\utils_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 1271, in from_pretrained
loaded_sub_model = load_sub_model(
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\diffusers\pipelines\pipeline_utils.py", line 525, in load_sub_model
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\transformers\modeling_utils.py", line 3206, in from_pretrained
raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory C:\Users\Windows.cache\huggingface\hub\models--SG161222--RealVisXL_V4.0\snapshots\49740684ab2d8f4f5dcf6c644df2b33388a8ba85\text_encoder.

(venv) D:\ai\Video\StoryDiffusion>python app.py
Traceback (most recent call last):
File "D:\ai\Video\StoryDiffusion\app.py", line 4, in
import spaces
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\spaces_init_.py", line 10, in
from .zero.decorator import GPU
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\spaces\zero\decorator.py", line 18, in
from .wrappers import regular_function_wrapper
File "D:\ai\Video\StoryDiffusion\venv\lib\site-packages\spaces\zero\wrappers.py", line 42, in
Process = multiprocessing.get_context('fork').Process
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\multiprocessing\context.py", line 243, in get_context
return super().get_context(method)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.3056.0_x64__qbz5n2kfra8p0\lib\multiprocessing\context.py", line 193, in get_context
raise ValueError('cannot find context for %r' % method) from None
ValueError: cannot find context for 'fork'

(venv) D:\ai\Video\StoryDiffusion>

Anyone successful with Radeon / AMD? (e.g.7900)

Has anyone successfully run this on Radeon/AMD?

Video Generation model

What fantastic work!

When will the Video Generation model be released?

Can this work on a Mac M3? How much RAM is required for that?

Using reference Image

How can I use a custom reference image in comic generation, as in the LeCun example?

Switching to SDXL means downloading another 10+ GB of files, and my C: drive is out of space. Is there any way to avoid downloading to C:?
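
One option for keeping model downloads off the system drive is the `HF_HOME` environment variable, which a FutureWarning in an earlier log on this page names as the replacement for `TRANSFORMERS_CACHE`. It needs to be set before the Hugging Face libraries are imported. A minimal sketch, where `redirect_hf_cache` is a hypothetical helper and the target folder is only an example:

```python
import os


def redirect_hf_cache(cache_dir):
    """Point the Hugging Face cache at cache_dir; call before importing HF libraries."""
    os.makedirs(cache_dir, exist_ok=True)
    os.environ["HF_HOME"] = cache_dir
    return os.environ["HF_HOME"]


if __name__ == "__main__":
    # Example target; on Windows this could be a folder on another drive.
    print(redirect_hf_cache(os.path.join(os.getcwd(), "hf_cache")))
```

Setting the variable in the shell or in the system environment settings has the same effect and avoids editing the script.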

Bug fixes in utils/utils.py

Utils has problems with text generation because it uses deprecated Pillow calls.

Here are two bug fixes that I put in:

38c38
< width, _ = draw.textsize(test_line, font=font)
---
> width = draw.textlength(test_line, font=font)

113c113,115
< pad_image = pad_image.resize(images[0].size, Image.ANTIALIAS)
---
> pad_image = pad_image.resize(images[0].size, Image.Resampling.LANCZOS)

CUDA out of memory

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.00 MiB (GPU 0; 6.00 GiB total capacity; 5.07 GiB already allocated; 0 bytes free; 5.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

What can I do?
I tried the code below to set the GPU and the allocator split size and to clear the cache, but I still get this error:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # must be set before CUDA is initialized
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
import torch
if hasattr(torch.cuda, "empty_cache"):
    torch.cuda.empty_cache()

[NC] does not seem to work

Prompts all prepended with [NC] and the captions are still applied.
[NC] A kitten outside
Classic Comic Style selected.
The final image still has text captions?

I changed the Typesetting Style to No Typesettings and ran it again.
This time the prompt images were created, but no final comic panel image was produced when it showed "Generation Finished".

The main reason I want no captions is that when you use longer prompts the text caption takes up over half the panel hiding the image behind.

These are using the default settings except setting a description for an image and the prompts.

How to define two persons with separate ref images and how to save and open a project ?

First problem: I can upload one reference image, or even several of the same person, but how can I introduce a second person and upload an image for this new person?
Would this be something like
adam img
eva img

Second problem: can I save the customized content of the Gradio interface for my story under a project name, and how can I open it again later?

Startup fails

When starting up with python gradio_app_sdxl_specific_id_low_vram.py I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: './examples/Robert'

How much compute for training the motion predictor?

When uploading multiple reference pictures at a time, how can the trigger words img for different roles be used in the same picture to match the corresponding picture to the corresponding person?

Two-stage Long Videos Generation not implement yet?

Hey, first of all congrats on this amazing project :) I got gradio_app_sdxl_specific_id.py running and it produces some nice comic strips. But where is the video stored? Or is that feature not in the Gradio demo yet and coming later?

Thanks, janosch

Unable to use reference image

Hi there,

When I try to use a reference image I get the error:
PhotoMaker currently does not support multiple trigger words in a single prompt. Trigger word: img, Prompt: anime artwork illustrating driving on highway, festival billboards. created by japanese anime studio. highly emotional. best quality, high resolution.

I usually upload the image and then add 'man img' to the description.

How can this be fixed?
