Comments (6)

panicsteve commented on September 2, 2024

There appear to be 2 issues affecting Macs (MPS) in kohya 23.0.15:

  1. Different required torch/torchvision packages (see: https://www.reddit.com/r/StableDiffusion/comments/15izfrl/sdxl_lora_training_with_kohya_ss_on_apple_silicon/kvbas5s/)

Steps to resolve appear to be:

  • source venv/bin/activate
  • pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
  • Modify the first line of requirements_macos_arm64.txt to say "torch==2.4.0 torchvision==0.15.0" instead of the current values

  2. sd-scripts/library/train_util.py assumes CUDA exists on the system.

Steps to resolve appear to be:

  • Comment out these two lines at line 4970 of kohya_ss/sd-scripts/library/train_util.py:
    with torch.cuda.device(torch.cuda.current_device()):
        torch.cuda.empty_cache()

That enables me to get far enough to start training without an error or assert being thrown. I haven't been able to evaluate the results of the training yet.
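
Rather than commenting the two lines out, the cache-clearing call could be guarded by an availability check so that the same code runs on CUDA, MPS, and CPU-only machines. The following is a minimal sketch of that idea, not the actual sd-scripts code: the helper names `pick_backend` and `empty_device_cache` are hypothetical, and it assumes a PyTorch build that provides `torch.backends.mps.is_available()` and `torch.mps.empty_cache()` (PyTorch 2.0 or later).

```python
def pick_backend(cuda_available: bool, mps_available: bool) -> str:
    """Return the backend whose cache should be cleared (hypothetical helper)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def empty_device_cache() -> str:
    """Clear the allocator cache of whichever accelerator is present.

    Returns the name of the selected backend. torch is imported lazily so
    that pick_backend() stays usable without PyTorch installed.
    """
    import torch

    mps = getattr(torch.backends, "mps", None)
    backend = pick_backend(
        torch.cuda.is_available(),
        mps is not None and mps.is_available(),
    )
    if backend == "cuda":
        # Same call as the original two lines, now only reached when CUDA exists.
        with torch.cuda.device(torch.cuda.current_device()):
            torch.cuda.empty_cache()
    elif backend == "mps" and hasattr(torch, "mps"):
        # Assumption: torch.mps.empty_cache() is available (PyTorch >= 2.0).
        torch.mps.empty_cache()
    # On CPU there is no device cache to clear.
    return backend
```

On an Apple Silicon Mac with a working MPS build, `empty_device_cache()` would be expected to take the `mps` branch instead of raising because CUDA is missing.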

from kohya_ss.

dajanaelez commented on September 2, 2024

crash user config_ kohya_sskohya_sskohya_ssvenvlibpython3.10… 2.pdf
I really hope you could have a quick look at the new issue I'm getting.

dajanaelez commented on September 2, 2024

Also, the main error message is:

I keep receiving this error when trying to train the model locally: Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'NSWindow should only be instantiated on the main thread!'
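
The message indicates that macOS (AppKit) refused to create a window from a background thread. Which kohya_ss code path does that is not identified in this thread, so the following is only a generic sketch of a guard that checks the calling thread before any native dialog is opened. The helper names `safe_to_open_window` and `check_from_worker` are hypothetical.

```python
import sys
import threading


def safe_to_open_window(platform: str = sys.platform) -> bool:
    """Return True if creating a native GUI window here should be allowed.

    On macOS ("darwin"), AppKit only permits window creation on the main
    thread; other platforms are treated as unrestricted in this sketch.
    """
    if platform != "darwin":
        return True
    return threading.current_thread() is threading.main_thread()


def check_from_worker(platform: str) -> bool:
    """Run the check on a background thread and return its result."""
    result = []
    worker = threading.Thread(
        target=lambda: result.append(safe_to_open_window(platform))
    )
    worker.start()
    worker.join()
    return result[0]
```

A caller could skip or defer the dialog when `safe_to_open_window()` returns False, instead of letting the process be terminated.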

spyroskotsakis commented on September 2, 2024

Hello,
Did anyone by any chance find a solution? I followed the tip from @panicsteve, but unfortunately the training process stopped with the following error: "RuntimeError: User specified an unsupported autocast device_type 'mps'".
I'm using an M1 Ultra with 128 GB.
Thanks a lot!


22:19:47-865199 INFO     Start training Dreambooth...
22:19:47-866936 INFO     Validating lr scheduler arguments...
22:19:47-868933 INFO     Validating optimizer arguments...
22:19:47-870879 INFO     Validating /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/log existence and writability... SUCCESS
22:19:47-872704 INFO     Validating /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model existence and writability... SUCCESS
22:19:47-874695 INFO     Validating /Users/spk/Downloads/v1-5-pruned.safetensors existence... SUCCESS
22:19:47-875971 INFO     Validating /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/images existence... SUCCESS
22:19:47-877613 INFO     Folder 20_downloads: 20 repeats found
22:19:47-879296 INFO     Folder 20_downloads: 4 images found
22:19:47-881052 INFO     Folder 20_downloads: 4 * 20 = 80 steps
22:19:47-882344 INFO     Regulatization factor: 1
22:19:47-883423 INFO     Total steps: 80
22:19:47-884603 INFO     Train batch size: 1
22:19:47-885183 INFO     Gradient accumulation steps: 1
22:19:47-885607 INFO     Epoch: 10
22:19:47-885961 INFO     Max train steps: 1600
22:19:47-886327 INFO     lr_warmup_steps = 160
22:19:47-888063 INFO     Saving training config to /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/siemens-model-v1_20240512-221947.json...
22:19:47-889541 INFO     Executing command: /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/bin/accelerate launch --dynamo_backend no --dynamo_mode
                         default --mixed_precision fp16 --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2
                         /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py --config_file
                         /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947.toml
22:19:47-902822 INFO     Command executed.
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
2024-05-12 22:19:55 WARNING  WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:                                              _cpp_lib.py:144
                                 PyTorch 2.0.0 with CUDA None (you have 2.3.0)
                                 Python  3.10.14 (you have 3.10.14)
                               Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
                               Memory-efficient attention, SwiGLU, sparse and more won't be available.
                               Set XFORMERS_MORE_DETAILS=1 for more details
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
2024-05-12 22:20:03 INFO     Loading settings from                                                                                                         train_util.py:4308
                             /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947.toml...
                    INFO     /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947               train_util.py:4327
2024-05-12 22:20:03 INFO     prepare tokenizer                                                                                                             train_util.py:4861
                    INFO     update token length: 75                                                                                                       train_util.py:4884
                    INFO     prepare images.                                                                                                               train_util.py:1848
                    INFO     found directory /Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/images/20_downloads contains 4 image  train_util.py:1773
                             files
                    INFO     80 train images with repeating.                                                                                               train_util.py:1891
                    INFO     0 reg images.                                                                                                                 train_util.py:1894
                    WARNING  no regularization images / 正則化画像が見つかりませんでした                                                                   train_util.py:1901
                    INFO     [Dataset 0]                                                                                                                   config_util.py:565
                               batch_size: 1
                               resolution: (512, 512)
                               enable_bucket: True
                               network_multiplier: 1.0
                               min_bucket_reso: 256
                               max_bucket_reso: 2048
                               bucket_reso_steps: 64
                               bucket_no_upscale: True

                               [Subset 0 of Dataset 0]
                                 image_dir: "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/images/20_downloads"
                                 image_count: 4
                                 num_repeats: 20
                                 shuffle_caption: False
                                 keep_tokens: 0
                                 keep_tokens_separator:
                                 secondary_separator: None
                                 enable_wildcard: False
                                 caption_dropout_rate: 0.0
                                 caption_dropout_every_n_epoches: 0
                                 caption_tag_dropout_rate: 0.0
                                 caption_prefix: None
                                 caption_suffix: None
                                 color_aug: False
                                 flip_aug: False
                                 face_crop_aug_range: None
                                 random_crop: False
                                 token_warmup_min: 1,
                                 token_warmup_step: 0,
                                 is_reg: False
                                 class_tokens: downloads
                                 caption_extension: .txt


                    INFO     [Dataset 0]                                                                                                                   config_util.py:571
                    INFO     loading image sizes.                                                                                                           train_util.py:974
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 2165.92it/s]
                    INFO     make buckets                                                                                                                   train_util.py:980
                    WARNING  min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size      train_util.py:999
                             automatically /
                             bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視
                             されます
                    INFO     number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)                                               train_util.py:1036
                    INFO     bucket 0: resolution (576, 320), count: 20                                                                                    train_util.py:1048
                    INFO     bucket 1: resolution (640, 384), count: 40                                                                                    train_util.py:1048
                    INFO     bucket 2: resolution (768, 320), count: 20                                                                                    train_util.py:1048
                    INFO     mean ar error (without repeats): 0.1001269544878931                                                                           train_util.py:1053
                    INFO     prepare accelerator                                                                                                              train_db.py:106
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/torch/amp/grad_scaler.py:131: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available.  Disabling.
  warnings.warn(
accelerator device: mps
                    INFO     loading model for process 0/1                                                                                                 train_util.py:5053
                    INFO     load StableDiffusion checkpoint: /Users/spk/Downloads/v1-5-pruned.safetensors                                     train_util.py:5000
                    INFO     UNet2DConditionModel: 64, 8, 768, False, False                                                                             original_unet.py:1387
2024-05-12 22:20:10 INFO     loading u-net: <All keys matched successfully>                                                                                model_util.py:1009
                    INFO     loading vae: <All keys matched successfully>                                                                                  model_util.py:1017
2024-05-12 22:20:13 INFO     loading text encoder: <All keys matched successfully>                                                                         model_util.py:1074
                    INFO     Enable xformers for U-Net                                                                                                     train_util.py:3083
                    INFO     [Dataset 0]                                                                                                                   train_util.py:2418
                    INFO     caching latents.                                                                                                              train_util.py:1120
                    INFO     checking cache validity...                                                                                                    train_util.py:1130
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 199728.76it/s]
                    INFO     caching latents...                                                                                                            train_util.py:1171
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  2.57it/s]
prepare optimizer, data loader etc.
/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/bitsandbytes/cextension.py:34: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "
'NoneType' object has no attribute 'cadam32bit_grad_fp32'
2024-05-12 22:20:17 INFO     use 8-bit AdamW optimizer | {}                                                                                                train_util.py:4463
Traceback (most recent call last):
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py", line 529, in <module>
    train(args)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py", line 239, in train
    unet, text_encoder, optimizer, train_dataloader, lr_scheduler = accelerator.prepare(
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1213, in prepare
    result = tuple(
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1214, in <genexpr>
    self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1094, in _prepare_one
    return self.prepare_model(obj, device_placement=device_placement)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/accelerator.py", line 1280, in prepare_model
    autocast_context = get_mixed_precision_context_manager(self.native_amp, self.autocast_handler)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1534, in get_mixed_precision_context_manager
    return torch.autocast(device_type=state.device.type, dtype=torch.float16, **autocast_kwargs)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 241, in __init__
    raise RuntimeError(
RuntimeError: User specified an unsupported autocast device_type 'mps'
Traceback (most recent call last):
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/venv/bin/python', '/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss/sd-scripts/train_db.py', '--config_file', '/Users/spk/Desktop/StableDiffusion/kohya/kohya_ss__training/model/config_dreambooth-20240512-221947.toml']' returned non-zero exit status 1.
22:20:18-195206 INFO     Training has ended.
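
The traceback above shows `accelerate` calling `torch.autocast(device_type=state.device.type, dtype=torch.float16)` with the device set to `mps`, because the launch command passed `--mixed_precision fp16`. The installed PyTorch (2.3.0, according to the xFormers warning) rejects `mps` as an autocast device. One possible workaround, offered here as an untested sketch rather than a confirmed fix, is to fall back to `--mixed_precision no` when autocast cannot be constructed for the device. The helper names `autocast_supported` and `choose_mixed_precision` are hypothetical.

```python
def autocast_supported(device_type: str) -> bool:
    """Probe whether this PyTorch build accepts autocast for device_type.

    Constructing torch.autocast raises RuntimeError for an unsupported
    device type, which is the failure shown in the traceback.
    """
    import torch  # imported lazily so choose_mixed_precision() needs no torch

    try:
        torch.autocast(device_type=device_type, dtype=torch.float16)
    except RuntimeError:
        return False
    return True


def choose_mixed_precision(requested: str, supported: bool) -> str:
    """Return the value to pass to accelerate's --mixed_precision flag."""
    if requested != "no" and not supported:
        # Fall back to full precision when autocast is unavailable.
        return "no"
    return requested
```

For example, `choose_mixed_precision("fp16", autocast_supported("mps"))` would yield `"no"` on a build that raises the error above. The same log also warns that bitsandbytes was compiled without GPU support and that xFormers cannot load its C++/CUDA extensions, so the 8-bit AdamW optimizer and the xformers option may need changing as well. That is an inference from the warnings, not something confirmed in this thread.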
