
comfyui-mana-nodes's Introduction



Welcome to the ComfyUI-Mana-Nodes project!

This collection of custom nodes is designed to supercharge text-based content creation within the ComfyUI environment.

Whether you're working on dynamic captions, transcribing audio, or crafting engaging visual content, Mana Nodes has got you covered.

If you like Mana Nodes, give our repo a ⭐ Star and 👀 Watch our repository to stay updated.

Installation

You can install Mana Nodes via the ComfyUI-Manager.

Or simply clone the repo into the custom_nodes directory with this command:

git clone https://github.com/ForeignGods/ComfyUI-Mana-Nodes.git

and install the requirements using:

.\python_embeded\python.exe -s -m pip install -r requirements.txt --user

If you are using a venv, make sure you have it activated before installation and use:

pip install -r requirements.txt

Nodes

βœ’οΈ Text to Image Generator

Required Inputs

font

To set the font and its styling, connect the 🆗 Font Properties node here.

canvas

To configure the canvas, connect the 🖼️ Canvas Properties node here.

text

Specifies the text to be rendered on the images. Supports multiline text input for rendering on separate lines.

  • For simple text: Input the text directly as a string.
  • For frame-specific text: Use a JSON-like format where each line specifies a frame number and the corresponding text (a parsing sketch follows the example). Example:
    "1": "Hello",
    "10": "World",
    "20": "End"
    

frame_count

Sets the number of frames this node will output.

Optional Inputs

transcription

Input the transcription output from the 🎤 Speech Recognition node here. Based on this transcription data and the 🖼️ Canvas Properties and 🆗 Font Properties settings, the text is formatted so that lines of words build up until there is no space left on the canvas (transcription_mode: fill, line).

highlight_font

Input a secondary 🆗 Font Properties node here; it is used to highlight the active caption (transcription_mode: fill, line). When setting the text manually, the following syntax can be used to define which word/character is highlighted:

Hello <tag>World</tag>

Outputs

images

The generated images with the specified text and configurations, in common ComfyUI format (compatible with other nodes).

transcription_framestamps

Framestamps formatted based on the canvas, font and transcription settings. Useful for manually correcting errors made by the 🎤 Speech Recognition node. Example: save this output with 📝 Save/Preview Text -> manually correct mistakes -> remove the transcription input from the ✒️ Text to Image Generator node -> paste the corrected framestamps into its text input field.

🆗 Font Properties

Required Inputs

font_file

Font files located in custom_nodes\ComfyUI-Mana-Nodes\font_files (e.g. example_font.ttf) or in the system font directories (supports .ttf, .otf, .woff, .woff2).

font_size

Either set a single font_size value or input an animation definition via the ⏰ Scheduled Values node. (Convert font_size to input)

font_color

Either set a single color value (CSS3/Color/Extended color keywords) or input an animation definition via the 🌈 Preset Color Animations node. (Convert font_color to input)

x_offset, y_offset

Either set single horizontal and vertical offset values or input an animation definition via the ⏰ Scheduled Values node. (Convert x_offset/y_offset to input)

rotation

Either set a single rotation value or input an animation definition via the ⏰ Scheduled Values node. (Convert rotation to input)

rotation_anchor_x, rotation_anchor_y

Horizontal and vertical offsets of the rotation anchor point, relative to the text's initial position.

kerning

Spacing between the characters of the font.

border_width

Width of the text border.

border_color

Either set a single color value (CSS3/Color/Extended color keywords) or input an animation definition via the 🌈 Preset Color Animations node. (Convert border_color to input)

shadow_color

Either set a single color value (CSS3/Color/Extended color keywords) or input an animation definition via the 🌈 Preset Color Animations node. (Convert shadow_color to input)

shadow_offset_x, shadow_offset_y

Horizontal and vertical offset of the text shadow.

Outputs

font

Used as input on the ✒️ Text to Image Generator node for the font and highlight_font inputs.

πŸ–ΌοΈ Canvas Properties

Required Inputs

height, width

Dimensions of the canvas.

background_color

Background color of the canvas. (CSS3/Color/Extended color keywords)

padding

Padding between image border and font.

line_spacing

Spacing between lines of text on the canvas.

Optional Inputs

images

Can be used to input images instead of using background_color.

Outputs

canvas

Used as input on the ✒️ Text to Image Generator node to define the canvas settings.

⏰ Scheduled Values


Required Inputs

frame_count

Sets the range of the x-axis of the chart. (Always starts at 1.)

value_range

Sets the range of the y-axis of the chart. (Example: 25 gives a range from -25 to 25.) The range can be changed by zooming via the mouse wheel and will reset to the specified value if changed.

easing_type

Used to generate values in between the manually added keyframes when the Generate Values button is clicked; a sketch of this interpolation follows the list of easing functions below.

The available easing functions are:

  • linear
  • easeInQuad
  • easeOutQuad
  • easeInOutQuad
  • easeInCubic
  • easeOutCubic
  • easeInOutCubic
  • easeInQuart
  • easeOutQuart
  • easeInOutQuart
  • easeInQuint
  • easeOutQuint
  • easeInOutQuint
  • exponential
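
As an illustration of what Generate Values computes, here is a minimal sketch that fills in the frames between two keyframes with easeInOutQuad. The curve formula is the standard one; the node's exact implementation may differ:

def ease_in_out_quad(t):
    # Accelerate during the first half, decelerate during the second.
    return 2 * t * t if t < 0.5 else 1 - (-2 * t + 2) ** 2 / 2

def interpolate(frame_a, value_a, frame_b, value_b, easing=ease_in_out_quad):
    # Yield a (frame, value) pair for every frame between two keyframes.
    for frame in range(frame_a, frame_b + 1):
        t = (frame - frame_a) / (frame_b - frame_a)
        yield frame, value_a + (value_b - value_a) * easing(t)

print(list(interpolate(1, 0, 5, 100)))
# [(1, 0.0), (2, 12.5), (3, 50.0), (4, 87.5), (5, 100.0)]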

step_mode

The option single forces the chart to display every tick/step on the chart. The option auto automatically removes ticks/steps to prevent overlapping.

animation_reset

Used to specify the reset behaviour of the animation (a sketch of the never/looped/pingpong modes follows this list).

  • word: the animation resets when a new word is displayed; it stays on the last value if it finishes before the word changes.
  • line: the animation resets when a new line is displayed; it stays on the last value if it finishes before the line changes.
  • never: the animation runs once and stops on the last value. (Not affected by word or line changes.)
  • looped: the animation loops endlessly. (Not affected by word or line changes.)
  • pingpong: the animation plays forward, then backward, and so on. (Not affected by word or line changes.)
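
To make the difference between the self-contained modes concrete, here is a sketch of how never, looped and pingpong could map a global frame index onto an n-frame animation. This is an assumption for illustration; the word and line modes additionally depend on the transcription timing:

def animation_frame(frame, n, mode):
    # Map a 0-based global frame index into an animation that is n frames long.
    if mode == "never":
        return min(frame, n - 1)           # play once, then hold the last value
    if mode == "looped":
        return frame % n                   # wrap around endlessly
    if mode == "pingpong":
        cycle = frame % (2 * (n - 1))      # forward, then backward
        return cycle if cycle < n else 2 * (n - 1) - cycle
    raise ValueError(mode)

print([animation_frame(f, 4, "pingpong") for f in range(8)])
# [0, 1, 2, 3, 2, 1, 0, 1]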

scheduled_values

  • Adding values: Click on the chart to add keyframes at specific points.
  • Editing values: Double-click a keyframe to edit its frame and value.
  • Deleting values: Click the delete button associated with a keyframe to remove it.
  • Generating values: Click the "Generate Values" button to interpolate values between the existing keyframes.
  • Deleting generated values: Click the "Delete Generated" button to remove all interpolated values.

Outputs

scheduled_values

Outputs a list of frame/value pairs together with the animation_reset option. At the moment this output can be used to animate the following widgets (Convert property to input) of the 🆗 Font Properties node:

  • font_size (font, highlight_font)
  • x_offset (font)
  • y_offset (font)
  • rotation (font)

🌈 Preset Color Animations

Required Inputs

color_preset

Currently the following color animation presets are available:

  • rainbow
  • sunset
  • grey
  • ocean
  • forest
  • fire
  • sky
  • earth

animation_duration

Sets the length of the animation, measured in frames.

animation_reset

Used to specify the reset behaviour of the animation.

  • word: the animation resets when a new word is displayed; it stays on the last value if it finishes before the word changes.
  • line: the animation resets when a new line is displayed; it stays on the last value if it finishes before the line changes.
  • never: the animation runs once and stops on the last value. (Not affected by word or line changes.)
  • looped: the animation loops endlessly. (Not affected by word or line changes.)
  • pingpong: the animation plays forward, then backward, and so on. (Not affected by word or line changes.)

Outputs

scheduled_colors

Outputs a list of frame/color definitions together with the animation_reset option. At the moment this output can be used to animate the following widgets (Convert property to input) of the 🆗 Font Properties node:

  • font_color (font, highlight_font)
  • border_color (font, highlight_font)
  • shadow_color (font, highlight_font)

🎤 Speech Recognition

Converts spoken words in an audio file to text using a deep learning model.

Required Inputs

audio

Audio file path or URL.

wav2vec2_model

The Wav2Vec2 model used for speech recognition. (https://huggingface.co/models?search=wav2vec2)
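
The node downloads and runs the model itself; for reference, word-level timestamps from such a checkpoint can be reproduced outside ComfyUI with the Hugging Face transformers pipeline. A minimal sketch, assuming a local speech.wav file:

from transformers import pipeline

# Any CTC-based wav2vec2 checkpoint from the search link above should work here.
asr = pipeline(
    "automatic-speech-recognition",
    model="jonatasgrosman/wav2vec2-large-xlsr-53-english",
)
result = asr("speech.wav", return_timestamps="word")
print(result["text"])
for chunk in result["chunks"]:
    print(chunk["text"], chunk["timestamp"])  # word, (start_time, end_time)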

spell_check_language

Language for the spell checker.

framestamps_max_chars

Maximum number of characters allowed until a new framestamp line is created.

Optional Inputs

fps

Frames per second, used for synchronizing with video. (Defaults to 30.)

Outputs

transcription

Text transcription of the audio. (Should only be used as the transcription input of the ✒️ Text to Image Generator node.)

raw_string

Raw string of the transcription without timestamps.

framestamps_string

Frame-stamped transcription.

timestamps_string

Transcription with timestamps.

Example Outputs

raw_string

Returns the transcribed text as one line.

THE GREATEST TRICK THE DEVIL EVER PULLED WAS CONVINCING THE WORLD HE DIDN'T EXIST

framestamps_string

Depending on the framestamps_max_chars parameter, the sentence is cleared and starts to build up again until max_chars is reached again.

  • In this example framestamps_max_chars is set to 25.
"27": "THE",
"31": "THE GREATEST",
"43": "THE GREATEST TRICK",
"73": "THE GREATEST TRICK THE",
"77": "DEVIL",
"88": "DEVIL EVER",
"94": "DEVIL EVER PULLED",
"127": "DEVIL EVER PULLED WAS",
"133": "CONVINCING",
"150": "CONVINCING THE",
"154": "CONVINCING THE WORLD",
"167": "CONVINCING THE WORLD HE",
"171": "DIDN'T",
"178": "DIDN'T EXIST",

timestamps_string

Returns all transcribed words with their start_time and end_time, in JSON format as a string.

[
  {
    "word": "THE",
    "start_time": 0.9,
    "end_time": 0.98
  },
  {
    "word": "GREATEST",
    "start_time": 1.04,
    "end_time": 1.36
  },
  {
    "word": "TRICK",
    "start_time": 1.44,
    "end_time": 1.68
  },
...
]
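
To show how the two formats relate, here is a hedged sketch that rebuilds a framestamps_string from a timestamps_string, using the build-up-and-clear behaviour described above and mapping each start_time to a frame with round(start_time * fps). This approximates the node's logic rather than reproducing it exactly:

import json

def to_framestamps(timestamps_json, fps=30, max_chars=25):
    lines, sentence = [], ""
    for entry in json.loads(timestamps_json):
        candidate = (sentence + " " + entry["word"]).strip()
        if len(candidate) > max_chars:
            candidate = entry["word"]  # line is full: clear and build up again
        sentence = candidate
        frame = round(entry["start_time"] * fps)
        lines.append('"%d": "%s",' % (frame, sentence))
    return "\n".join(lines)

With the example above and fps=30, the first entries come out as "27": "THE" and "31": "THE GREATEST", matching the framestamps_string sample.
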
🎞️ Split Video

Required Inputs

video

Path to the video file.

frame_limit

Maximum number of frames to extract from the video.

frame_start

Starting frame number for extraction.

filename_prefix

Prefix for naming the extracted audio file. (relative to .\ComfyUI\output)

Outputs

frames

Extracted frames as image tensors.

frame_count

Total number of frames extracted.

audio_file

Path of the extracted audio file.

fps

Frames per second of the video.

height, width

Dimensions of the extracted frames.
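
The audio extraction relies on moviepy (see the import in nodes/video2audio_node.py quoted in the issues below). For reference, a minimal moviepy 1.x sketch of comparable frame-and-audio extraction, with hypothetical paths and a frame_limit of 100:

from moviepy.editor import VideoFileClip

clip = VideoFileClip("input.mp4")
frames = []
for i, frame in enumerate(clip.iter_frames(dtype="uint8")):  # H x W x 3 numpy arrays
    if i >= 100:  # frame_limit
        break
    frames.append(frame)
if clip.audio is not None:  # some files carry no audio track
    clip.audio.write_audiofile("audio.wav")
print(len(frames), clip.fps, clip.size)  # frame_count, fps, (width, height)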

🎥 Combine Video

Required Inputs

frames

Sequence of images to be used as video frames.

filename_prefix

Prefix for naming the video file. (relative to .\ComfyUI\output)

fps

Frames per second for the video.

Optional Inputs

audio_file

Audio file path or URL.

Outputs

video_file

Path to the created video file.
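
Going the other way, a minimal moviepy 1.x sketch of combining frames and an optional audio track into a video file (illustrative only; the frames here are dummy arrays):

import numpy as np
from moviepy.editor import AudioFileClip, ImageSequenceClip

# 60 dummy frames (H x W x 3 uint8 arrays); in practice these come from upstream nodes.
frames = [np.full((256, 256, 3), i * 4, dtype=np.uint8) for i in range(60)]
video = ImageSequenceClip(frames, fps=30)
video = video.set_audio(AudioFileClip("audio.wav"))  # optional audio_file input
video.write_videofile("video.mp4")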

📣 Generate Audio (experimental)

Converts text to speech and saves the output as an audio file.

Required Inputs

text

The text to be converted into speech.

filename_prefix

Prefix for naming the audio file. (relative to .\ComfyUI\output)

This node uses a text-to-speech pipeline to convert input text into spoken words, saving the result as a WAV file. The generated audio file is named using the provided filename prefix and is stored relative to the .\ComfyUI-Mana-Nodes directory.

Model: https://huggingface.co/spaces/suno/bark
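
Outside ComfyUI, the same model family can be driven through the transformers text-to-speech pipeline. A minimal sketch, assuming a recent transformers version and the smaller suno/bark-small checkpoint for speed:

import scipy.io.wavfile
from transformers import pipeline

tts = pipeline("text-to-speech", model="suno/bark-small")
out = tts("Hello, my name is Suno. [laughs] And I like pizza.")
# The pipeline returns a dict with the waveform and its sampling rate.
scipy.io.wavfile.write("bark_out.wav", rate=out["sampling_rate"], data=out["audio"].squeeze())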

Foreign Language

Bark supports various languages out-of-the-box and automatically determines language from input text. When prompted with code-switched text, Bark will even attempt to employ the native accent for the respective languages in the same voice.

Example:

Buenos dΓ­as Miguel. Tu colega piensa que tu alemΓ‘n es extremadamente malo. But I suppose your english isn't terrible.

Non-Speech Sounds

Below is a list of some known non-speech sounds, but we are finding more every day.

[laughter]
[laughs]
[sighs]
[music]
[gasps]
[clears throat]
— or … for hesitations
β™ͺ for song lyrics
capitalization for emphasis of a word
MAN/WOMAN: for bias towards speaker

Example:

" [clears throat] Hello, my name is Suno. And, uh β€” and I like pizza. [laughs] But I also have other interests such as... β™ͺ singing β™ͺ."

Music

Bark can generate all types of audio, and, in principle, doesn’t see a difference between speech and music. Sometimes Bark chooses to generate text as music, but you can help it out by adding music notes around your lyrics.

Example:

β™ͺ In the jungle, the mighty jungle, the lion barks tonight β™ͺ

Speaker Prompts

You can provide certain speaker prompts such as NARRATOR, MAN, WOMAN, etc. Please note that these are not always respected, especially if a conflicting audio history prompt is given.

Example:

WOMAN: I would like an oatmilk latte please.
MAN: Wow, that's expensive!

📝 Save/Preview Text

Required Inputs

string

The string to be written to the file.

filename_prefix

Prefix for naming the text file. (relative to .\ComfyUI\output)

Example Workflows

LCM AnimateDiff Text Animation

Demo

(demo animations)

Workflow

example_workflow_1.json

The values for the ⏰ Scheduled Values node cannot be imported yet (you have to add them yourself).


Speech Recognition Caption Generator

Demo

Turn on audio.

video_552.mp4

Workflow

example_workflow_2.json


To-Do

  • Improve Speech Recognition
  • Improve Text to Speech
  • Node to download fonts from DaFont.com
  • SVG Loader/Animator
  • Text to Image Generator Alpha Channel
  • Add Font Support for non-Latin Characters
  • 3D Effects, Bevel/Emboss, Inner Shading, Fade in/out
  • Find a better way to define color animations
  • Make more Font Properties animatable

Contributing

Your contributions to improve Mana Nodes are welcome!

If you have suggestions or enhancements, feel free to fork this repository, apply your changes, and create a pull request. For significant modifications or feature requests, please open an issue first to discuss what you'd like to change.


comfyui-mana-nodes's Issues

example_workflow_2.json

I've updated the code and am trying to test speech2text, however nothing happens ;-)
The font2img node turns to a red border and that's it.


In the log it says this:

SpellChecker module is NOT accessible.
Prompt executed in 83.97 seconds

What am I missing?

Error occurred when executing KSampler:

Can anybody help? Thanks.

Error occurred when executing KSampler:

module 'comfy.sample' has no attribute 'prepare_mask'

File "F:\ComfyUI\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-0246\utils.py", line 381, in new_func
res_value = old_func(*final_args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\nodes.py", line 1344, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\nodes.py", line 1314, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\custom_nodes\Comfyui-StableSR\nodes.py", line 69, in hook_sample
return original_sample(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
return original_sample(*args, **kwargs) # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "F:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 234, in motion_sample
function_injections.inject_functions(model, params)
File "F:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 181, in inject_functions
self.orig_prepare_mask = comfy.sample.prepare_mask
^^^^^^^^^^^^^^^^^^^^^^^^^

Cannot connect the speech recognition node to any audio loaders (and cannot figure out how to load audio-only mp3/mp4)

Hi, over on Reddit I asked if anyone knows a node that would allow me to load an mp3 with a spoken story/bio and have it create a cool video with the text or an avatar talking. https://www.reddit.com/r/comfyui/comments/1dy83u6/comment/lc7u3zc/?context=3

I was linked to your GitHub and the demos looked promising. I got the nodes installed but I cannot find a guide on how to use it.
I tried the workflows; the second seems to be close to my goal...
mp3 --> extract the words --> video with the words matched up to the audio.

However, none of the ComfyUI nodes that load audio could be linked to the audio file input of the speech recognition node. So I converted the mp3 to an mp4 using an online converter. Sadly the split video node failed (likely as there is no video data... but the error message didn't say anything useful)

'''
Error occurred when executing Split Video:

Error in file C:\AI\ComfyUI_windows_portable\ComfyUI\input\video\dave.mp4, Accessing time t=296.04-296.09 seconds, with clip duration=296 seconds,

File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\video2audio_node.py", line 40, in run
audio, fps = self.extract_audio_with_moviepy(video_path, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\video2audio_node.py", line 71, in extract_audio_with_moviepy
audio.write_audiofile(full_path)
File "", line 2, in write_audiofile
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\AudioClip.py", line 206, in write_audiofile
return ffmpeg_audiowrite(self, filename, fps, nbytes, buffersize,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 2, in ffmpeg_audiowrite
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\io\ffmpeg_audiowriter.py", line 166, in ffmpeg_audiowrite
for chunk in clip.iter_chunks(chunksize=buffersize,
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\AudioClip.py", line 85, in iter_chunks
yield self.to_soundarray(tt, nbytes=nbytes, quantize=quantize,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 2, in to_soundarray
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\AudioClip.py", line 127, in to_soundarray
snd_array = self.get_frame(tt)
^^^^^^^^^^^^^^^^^^
File "", line 2, in get_frame
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 93, in get_frame
return self.make_frame(t)
^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 136, in
newclip = self.set_make_frame(lambda t: fun(self.get_frame, t))
^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 187, in
return self.fl(lambda gf, t: gf(t_func(t)), apply_to,
^^^^^^^^^^^^^
File "", line 2, in get_frame
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\Clip.py", line 93, in get_frame
return self.make_frame(t)
^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\io\AudioFileClip.py", line 77, in
self.make_frame = lambda t: self.reader.get_frame(t)
^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\moviepy\audio\io\readers.py", line 170, in get_frame
raise IOError("Error in file %s, "%(self.filename)+
'''

So how do we take a speech-synthesized audio file and turn it into a video with words like in your demos?

'ImageDraw' object has no attribute 'textsize'

Error occurred when executing font2img:

'ImageDraw' object has no attribute 'textsize'

File "/home/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/home/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/home/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/home/ComfyUI/custom_nodes/ComfyUI-Mana-Nodes/font2img_node.py", line 77, in run
images= self.generate_images(start_font_size, font_size_increment, frame_text_dict, rotation_increment, x_offset_increment, y_offset_increment, start_x_offset, end_x_offset, start_y_offset, end_y_offset, font_file, font_color, background_color, image_width, image_height, text_alignment, line_spacing, frame_count, input_images, anchor_x, anchor_y, rotate_around_center, kerning)
File "/home/ComfyUI/custom_nodes/ComfyUI-Mana-Nodes/font2img_node.py", line 309, in generate_images
text_width, text_height = self.calculate_text_block_size(draw, text, font, line_spacing, kerning)
File "/home/ComfyUI/custom_nodes/ComfyUI-Mana-Nodes/font2img_node.py", line 136, in calculate_text_block_size
char_width, char_height = draw.textsize(char, font=font)

How do I fix this error?

Multi Language Font Support

Looks very cool repo. thank you for making it.

Its not an issue, rather a question/request.

Any plans to add non-english fonts RTL support?

Example workflow

Could you upload an example workflow please, the image is unreachable,
Thanks

Character duplication

Hi, when I select the French language, I get duplicated letters in almost every word.

Model ID error!

Traceback (most recent call last):
File "D:\Program\ComfyUI\server.py", line 443, in get_object_info
out[x] = node_info(x)
File "D:\Program\ComfyUI\server.py", line 420, in node_info
info['input'] = obj_class.INPUT_TYPES()
File "D:\Program\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\speech2text_node.py", line 21, in INPUT_TYPES
"wav2vec2_model": (cls.get_wav2vec2_models(), {"display": "dropdown", "default": "jonatasgrosman/wav2vec2-large-xlsr-53-english"}),
File "D:\Program\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\speech2text_node.py", line 233, in get_wav2vec2_models
model_names = [model['modelId'] for model in models]
File "D:\Program\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\speech2text_node.py", line 233, in
model_names = [model['modelId'] for model in models]
KeyError: 'modelId'

It had been working fine; Hugging Face and GitHub are both accessible.
But now CMD outputs this error.

Line break in text2image prompt?

Hello! I am trying to display, say, 4 words one by one, but have them stay on screen one below the other:

LOREM
IPSUM
DOLOR
SIT

I can do this with a single frame, but not when I try to schedule the frames and make an animation:

"1": "LOREM"
"8": "LOREM IPSUM"
"16": "LOREM IPSUM DOLOR"
"24": "LOREM IPSUM DOLOR SIT"

this will just render a single line of text, while I want the words to stack vertically as in the example above.

Any way to control line breaks?

Or any other workaround that doesn't involve having 4 different text2image scheduled values for each row/word?

Cheers!

IMPORT FAILED - moviepy issues

Super excited to check out these nodes! I am on Windows 11 running a relatively fresh version of ComfyUI embedded. I have installed directly and through the manager. I am seeing the following error when Comfy boots up:

File "D:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\__init__.py", line 3, in <module>
    from .nodes.video2audio_node import video2audio
  File "D:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\nodes\video2audio_node.py", line 9, in <module>
    from moviepy.editor import VideoFileClip
ModuleNotFoundError: No module named 'moviepy'

When doing things manually I tried running the requirements with the following (with and without trusted-host):

..\..\..\python_embeded\python.exe -m pip install -trusted-host -r .\requirements.txt

Per the fixes here lengstrom/fast-style-transfer#129 I also tried installing moviepy directly with:
..\..\..\python_embeded\python.exe -m pip install --trusted-host pypi.python.org moviepy

FFMpeg is installed and added to my system PATH so I think I am good there.

Any help would be much appreciated

'NoneType' object has no attribute 'shape'

workflow-mana.json
Error occurred when executing KSampler:

'NoneType' object has no attribute 'shape'

File "/home/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "/home/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "/home/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "/home/ComfyUI/nodes.py", line 1375, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
File "/home/ComfyUI/nodes.py", line 1345, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
File "/home/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/animatediff/sampling.py", line 346, in motion_sample
latents = wrap_function_to_inject_xformers_bug_info(orig_comfy_sample)(model, noise, *args, **kwargs)
File "/home/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/animatediff/utils_model.py", line 360, in wrapped_function
return function_to_wrap(*args, **kwargs)
File "/home/ComfyUI/comfy/sample.py", line 100, in sample
samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/home/ComfyUI/comfy/samplers.py", line 713, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
File "/home/ComfyUI/comfy/samplers.py", line 618, in sample
samples = sampler.sample(model_wrap, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "/home/ComfyUI/comfy/samplers.py", line 557, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ComfyUI/comfy/k_diffusion/sampling.py", line 580, in sample_dpmpp_2m
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ComfyUI/comfy/samplers.py", line 281, in forward
out = self.inner_model(x, sigma, cond=cond, uncond=uncond, cond_scale=cond_scale, model_options=model_options, seed=seed)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ComfyUI/comfy/samplers.py", line 271, in forward
return self.apply_model(*args, **kwargs)
File "/home/ComfyUI/comfy/samplers.py", line 268, in apply_model
out = sampling_function(self.inner_model, x, timestep, uncond, cond, cond_scale, model_options=model_options, seed=seed)
File "/home/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/animatediff/sampling.py", line 385, in evolved_sampling_function
cond_pred, uncond_pred = sliding_calc_cond_uncond_batch(model, cond, uncond_, x, timestep, model_options)
File "/home/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/animatediff/sampling.py", line 494, in sliding_calc_cond_uncond_batch
sub_cond_out, sub_uncond_out = comfy.samplers.calc_cond_uncond_batch(model, sub_cond, sub_uncond, sub_x, sub_timestep, model_options)
File "/home/ComfyUI/comfy/samplers.py", line 197, in calc_cond_uncond_batch
c['control'] = control.get_control(input_x, timestep_, c, len(cond_or_uncond))
File "/home/ComfyUI/custom_nodes/ComfyUI-Advanced-ControlNet/adv_control/utils.py", line 468, in get_control_inject
return self.get_control_advanced(x_noisy, t, cond, batched_number)
File "/home/ComfyUI/custom_nodes/ComfyUI-Advanced-ControlNet/adv_control/control.py", line 32, in get_control_advanced
return self.sliding_get_control(x_noisy, t, cond, batched_number)
File "/home/ComfyUI/custom_nodes/ComfyUI-Advanced-ControlNet/adv_control/control.py", line 78, in sliding_get_control
control = self.control_model(x=x_noisy.to(dtype), hint=self.cond_hint, timesteps=timestep.float(), context=context.to(dtype), y=y)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/py3.10/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ComfyUI/comfy/cldm/cldm.py", line 295, in forward
assert y.shape[0] == x.shape[0]

raise NotImplementedError, 'emit must be implemented ' SyntaxError: invalid syntax

error message:

Collecting logging (from -r requirements.txt (line 10))
Using cached logging-0.4.9.6.tar.gz (96 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [21 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 14, in
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/setuptools/init.py", line 7, in
import _distutils_hack.override # noqa: F401
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/_distutils_hack/override.py", line 1, in
import('_distutils_hack').do_override()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/_distutils_hack/init.py", line 77, in do_override
ensure_local_distutils()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/_distutils_hack/init.py", line 63, in ensure_local_distutils
core = importlib.import_module('distutils.core')
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/importlib/init.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 22, in
from .dist import Distribution
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 12, in
import logging
File "/private/var/folders/xv/rmnk14792yb52bmmqqqt8vmc0000gn/T/pip-install-q5hkc71g/logging_b87396e952c24f2f8ffab55c4ca0a903/logging/init.py", line 618
raise NotImplementedError, 'emit must be implemented '
^
SyntaxError: invalid syntax
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

I can't install logging; how can I solve it?

'FreeTypeFont' object has no attribute 'getsize'

Help please :)

Error occurred when executing font2img:

'FreeTypeFont' object has no attribute 'getsize'

File "D:\ComfyUI00\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI00\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI00\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI00\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\font2img_node.py", line 81, in run
formatted_transcription = self.format_transcription(transcription, image_width, font_file, start_font_size,padding)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI00\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\font2img_node.py", line 118, in format_transcription
width = self.get_text_width(new_sentence, font_file, font_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\ComfyUI00\ComfyUI\custom_nodes\ComfyUI-Mana-Nodes\font2img_node.py", line 139, in get_text_width
text_width, _ = font.getsize(text)
^^^^^^^^^^^^

Potentially small error in time-to-frame alignment in font2img_node

I am not sure if this is just due to the way I use this node, but I was struggling with the issue that the very first word would never get rendered if it starts at start time 0.0. The reason seems to be in the format_transcription() method, where it calculates

frame_number = round(start_time * transcription_fps)

It looks to me like it assumes that frame numbering starts at 0, but further down in the generate_images() code the first frame is 1:

for i in range(1, frame_count + 1):

So my fix is to add 1 to frame_number, which seems to work:

frame_number = 1 + round(start_time * transcription_fps)
