Comments (3)
Yes, I found them yesterday; a little 150GB to clear. I'm trying to understand the differences between schedulers, VAEs, etc., but testing them takes a lot of time.
If you want to learn the interactions without writing what amounts to a pile of obnoxious boilerplate that's been repeated hundreds of times, I'd strongly suggest taking a look at https://github.com/comfyanonymous/ComfyUI. It's node based, so you can see how things link up, and it animates which step of the graph is currently running. It's not fast on AMD compared to SHARK: 512x512 images generate at 4 it/s (vs. ~23 on SHARK with SD 1.5 models and 26-27 for SD 2.1-base, currently). Not fast, but not unusable, and since you can either use the built-in preview or download TAESD and live-preview the UNet as it iterates (--preview-method=taesd / =auto), you can watch how the images start off and form based on your sampler selection.
ComfyUI is far more likely to exceed your video card's memory, but instead of throwing an out-of-memory error it just falls back onto shared system memory and gets incredibly slow, so keep an eye on your memory use.
I tried OliveML / ONNX, but it's maybe 15% slower than Shark and has even less support at the moment.
Just a warning: two of the samplers (the DPM ones, I think) throw non-critical errors because they attempt to pass a non-DML tensor to the DirectML backend (shouldn't be too bad a fix), and the DDIM sampler just freezes the command prompt it was run in (who knows why); it has to be force-quit, so avoid it. A few other samplers report that they're falling back on the CPU (really, scalar lerp isn't implemented?), but most of the code still runs on the GPU and there's no noticeable difference. The lerp can probably be replaced with a supported op, or the scalar value pulled from the result. The _gpu samplers don't work either; they seem to be NVIDIA-specific.
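As an aside, the scalar lerp in question is trivial to decompose into ops any backend supports. A minimal sketch of the identity (plain Python floats here, but the same rewrite applies to tensor code):

```python
def lerp(a, b, w):
    """Linear interpolation of a toward b by weight w.

    Decomposed as a + w * (b - a): a subtract, a multiply, and an add,
    all of which any backend supports, instead of a fused lerp op.
    """
    return a + w * (b - a)

# w = 0 gives a, w = 1 gives b, values in between interpolate linearly.
print(lerp(0.0, 10.0, 0.25))  # 2.5
print(lerp(2.0, 4.0, 1.0))    # 4.0
```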
Once you've worked out a combination of models and LoRAs you like, you can load them up in SHARK, skip the recompile delays, and run them at 5x the speed again.
from shark.
The LoRA needs to be compiled into the model whenever a new one is selected. Likewise, the models themselves need recompiling when a new model is selected, the batch count (simultaneous images) is changed, the image size is changed, or a different VAE is selected. SHARK does keep the old compiled part+batch size+image size+model+LoRA+VAE .vmfb files, so any combination you've used at some point is probably still lying around. If you check the directory SHARK is in, you can find the combinations you've already used pretty easily and reuse them to avoid a recompile. A few things like the scheduler files only get compiled once for a given size, not for every model.
The VMFBs only get deleted when you use the --clear_all command-line option or remove them yourself. I have 20GB of them right now, and I've only run SHARK a little since the last clear, so they're worth keeping an eye on. If you try out a lot of random model/LoRA combinations, you can easily eat up a couple hundred gigs of space before you realize it.
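If you want to see what that cache is costing you before reaching for --clear_all, here's a quick sketch that walks a directory and totals the .vmfb files. The path is whatever directory you installed SHARK into, and the filename layout is a SHARK implementation detail that may change between versions:

```python
from pathlib import Path

def vmfb_usage(shark_dir="."):
    """Return (total size in GB, file list) for cached .vmfb files under
    shark_dir, so you can see which compiled combinations are cached."""
    files = sorted(Path(shark_dir).rglob("*.vmfb"))
    total_gb = sum(f.stat().st_size for f in files) / (1024 ** 3)
    return total_gb, files

total_gb, files = vmfb_usage(".")
print(f"{len(files)} cached .vmfb files, {total_gb:.1f} GB total")
```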
from shark.