Best complex tutorial I have seen; I managed to repeat everything even though some nodes have already changed. Subscribed, appreciate your work, hope you will make more like this for noobs like me (there are a lot of us). Thank you. PS: When I use the Juggernaut model I get an out-of-memory message on the Unsampler.
@koalanation
7 months ago
Thanks! I try to get to the point and make it as simple as possible, considering these are not beginner tutorials. We are all noobs...especially because things move very fast and we all need to learn new things all the time... The RAVE unsampler uses a lot of memory, unfortunately. You can reduce the number of frames to be processed or reduce the image resolution. You can also try evolved sampling (from the AnimateDiff Evolved nodes) and see if it works, but I have not tried it myself yet.
@epelfeld
7 months ago
@@koalanation thanks a lot, lower resolution works
@skycladsquirrel
8 months ago
I'm dropping my new AI music video today, and then I see this, lol. Awesome video. The future's looking bright!
@koalanation
8 months ago
Very cool videos you have on IG! Love how artists like you embrace these tools to make great stuff!
@skycladsquirrel
8 months ago
🥰🙏@@koalanation
@xr3kTx
3 months ago
This did wonders
@koalanation
3 months ago
@@xr3kTx it is fun!
@xr3kTx
3 months ago
@@koalanation I took great inspiration from your workflow because I need to understand the tools at play; I actually did this with SDXL. I am using a frame cap of 100, but the face seems to glitch. Can you suggest anything for the face glitching? I did use IPAdapter with style and composition transfer, but every few frames it seems to redo the context.
@koalanation
3 months ago
@@xr3kTx I did not dare to use SDXL because of the GPU and VRAM requirements...besides, SDXL AnimateDiff is also difficult...with Hotshot it is OK, but then you are limited to a context window of 8 frames...not sure if testing with SD 1.5 is an option for you. You can always upscale and refine the output.
@xr3kTx
3 months ago
@@koalanation I have had better results with SDXL personally (I am using a LoRA, and SDXL respects it more for my character, plus IPAdapter for style). I am using an RTX A6000 on RunPod, so resources are less of a concern; it's the workflow that I need to improve.
@koalanation
3 months ago
@@xr3kTx good to know. I may then give it another try...have you tried FreeInit in AD? Not sure how it will work with this setup, though. But it is a lot of trial and error, you know...
@zerox9646
8 months ago
great work
@koalanation
8 months ago
Thanks!
@maxfxgr
8 months ago
Amazing video! Keep them coming, mate. Greetings from Greece!
@koalanation
8 months ago
Ela! Thanks for the support!
@charnel3786
8 months ago
Thanks for the tutorial
@ryanontheinside
8 months ago
Amazing job
@koalanation
8 months ago
Thank you! Cheers!
@drviolet396
8 months ago
If I were to add an IPAdapter, where would you recommend connecting the model? At the unsampler level, or after AnimateDiff, connecting to the last KSampler?
@koalanation
8 months ago
With IPAdapter, as you want to have control of the output, I would use it for the last KSampler, but no other reason than that...
@hamedsadeghizadeh6660
7 months ago
thanks
@drviolet396
8 months ago
Can you elaborate on what the unsampling/noise-generation part actually does? There is an empty text prompt connected to a ControlNet, and CFG is 1.
@koalanation
8 months ago
Hi! As I understand it, the Unsampler is doing the reverse of the KSampler's process. So I assume there is no need to guide it with a prompt; that is why it is blank. The GitHub repo indicates best results with a CFG of 1, but there is no reason not to play with other values: github.com/BlenderNeko/ComfyUI_Noise?tab=readme-ov-file
@D3coify
7 months ago
Thanks
@D3coify
7 months ago
Oh, so Depth makes the video more realistic?
@ZoSza-m7y
8 months ago
is it possible to use a reference image instead of a prompt?
@koalanation
8 months ago
In principle, yes...but the idea of RAVE is to use the prompt to create something different. To use a reference image, IPAdapter may be a simpler solution. Check out other videos I have, like this one: kzitem.info/news/bejne/rJdqq4Kab2WHdaQsi=7usgz4pZnfVngOrn
@ZoSza-m7y
8 months ago
Thank you @@koalanation
@aaagaming2023
8 months ago
Is there any way to maintain consistency with the input video? It creates a lot of extra fingers and stuff with humans. Would adding a second ControlNet such as OpenPose help?
@koalanation
8 months ago
Fingers are tricky...if you want better control you may want to use OpenPose or the MeshGraphormer for hands...applying masks to the hands and using HED or lineart may also help. But this is more advanced and elaborate.
@aaagaming2023
8 months ago
@@koalanation Have you seen jboogx's workflow for AnimateDiff? I'm thinking about something like that, but with RAVE instead.
@koalanation
8 months ago
Yes, I have seen it. That setup is very complete and can do many things. The idea with RAVE is to give more power to the prompt, but I guess they can combine nicely. Good luck!
@aaagaming2023
8 months ago
@@koalanation I think the ideal for the use case of a consistent transform of a realistic human would be the DWPose, depth, and tile ControlNets with RAVE, and an AD pass after.
@koalanation
8 months ago
Good idea!
@Stopsign002
8 months ago
Is there any reason to run the RAVE part of this process at higher than 12 steps? Also, it seems like I run out of VRAM if I run too many frames through the process (meaning I have to skip n frames). I imagine this is expected?
@koalanation
8 months ago
The RAVE example uses 25. Decreasing it to 12 was working for me; in the end you want to find the sweet spot between speed and quality. The RAVE KSampler uses quite a lot of VRAM, and depending on your machine you may need to reduce the frame count. That seems to be one of the limitations of the implementation. Hopefully the developers find some trick to be able to process more frames...
@spacepxl
8 months ago
If you're running a second pass through AnimateDiff, it's probably not necessary to go higher than 15 with DPM samplers. As for VRAM, the default is a grid_size of 3, which means you're diffusing a 3x3 grid. So, for example, if you're working at 512x512, it will actually be using a 1536x1536 image internally, which is just slower and more memory-intensive than a batch of nine 512x512 images; no way around it. You can drop grid_size to 2 for more speed and less memory usage, but less consistency.
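The grid arithmetic in that explanation can be sketched in a few lines (illustrative only; the function names are made up here, and grid_size=3 is the RAVE default mentioned above):

```python
# Back-of-the-envelope sketch of RAVE's grid memory cost (names hypothetical).
def rave_internal_resolution(width, height, grid_size=3):
    """RAVE tiles grid_size x grid_size frames into one image before diffusing."""
    return width * grid_size, height * grid_size

def pixels_per_step(width, height, grid_size=3):
    """Total pixels diffused per step for one grid."""
    w, h = rave_internal_resolution(width, height, grid_size)
    return w * h

# A 512x512 clip with the default 3x3 grid diffuses a 1536x1536 image internally:
print(rave_internal_resolution(512, 512, 3))   # (1536, 1536)

# Dropping grid_size to 2 shrinks the internal image to 1024x1024,
# trading temporal consistency for speed and memory:
print(rave_internal_resolution(512, 512, 2))   # (1024, 1024)
```

This is why halving the frame count or resolution helps so much: VRAM scales with the full tiled image, not with a single frame.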
@tonon_AI
6 months ago
Does RAVE work for text-to-video too?
@koalanation
6 months ago
I understand that RAVE is made for video-to-video...I do not think it will work if you connect an empty latent. For text-to-video I think it is better to use AnimateDiff directly. There are great examples out there.
@tonon_AI
6 months ago
@@koalanation thanks! Yeah, I use AnimateDiff, but the movements are not the same.
@JefHarrisnation
7 months ago
I noticed the model versions of Realistic Vision and Juggernaut are SD 1.5. For this to work, do I have to use the 1.5 versions, or can I use the new SDXL versions of the models?
@koalanation
7 months ago
According to the ComfyUI implementation's GitHub, it should work (with limitations). Thus, I guess you can. If you do, use the right ControlNet model versions (for SDXL): github.com/spacepxl/ComfyUI-RAVE
@JefHarrisnation
7 months ago
@@koalanation Thanks, will try.
@hitmanehsan
8 months ago
I upgraded to a 3090 Ti with 24 GB. How much CPU RAM do I need to do video-to-video SD? I have 32 GB.
@koalanation
6 months ago
I think that should do...
@Nibot2023
7 months ago
Edit: So you get an error when using XL models. Not sure which ControlNets/LoRAs to use to make this workflow work with an XL model. I went with a non-XL model and it worked, but it craps out on the upscale: it says it needs to reconnect and stalls out. I am hoping to find a way to make XL models work.
This tutorial is cool, but I am so new I do not understand the nodes. I also crash out on the Unsampler portion...it says reconnecting, with a close button. Is there a way to reboot it without closing the window to get it back online? It also crashes when using 1280x720 footage. I set my resize to the size of the footage, but I leave the factor at 1. I am not really understanding the upscale math, in which you resize smaller in order to blow it up. Is there a way to have it use the aspect ratio you want and then upscale to 1920x1080? I get this warning when trying to queue the prompt: Error occurred when executing BNK_Unsampler: mat1 and mat2 shapes cannot be multiplied (4235x2048 and 768x320)
@koalanation
7 months ago
I see you found the issue with the mat1 and mat2 messages. The ControlNets and the checkpoints need to be the same version. Check out huggingface.co/ckpt/controlnet-sdxl-1.0/tree/main or search Hugging Face for specific ControlNets. I have not tried the workflow with SDXL myself, so I am not sure I am able to help you...RAVE is a nice tool but uses a huge amount of VRAM. For that reason I did not try SDXL. If I find time, I will try to update the workflow to adjust for SDXL.
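For what it's worth, the 768-vs-2048 numbers in that error match the text-embedding widths of SD 1.5 (768) and SDXL (2048, CLIP-L plus OpenCLIP-G concatenated), which is consistent with SDXL-shaped conditioning being fed into SD 1.5 weights. A toy NumPy sketch of the mismatch (shapes taken from the error above; the variable names are hypothetical, not actual ComfyUI internals):

```python
import numpy as np

# SD 1.5 cross-attention expects 768-dim text embeddings; SDXL produces
# 2048-dim ones, so an SD 1.5 projection layer cannot consume them.
sd15_proj = np.zeros((768, 320))    # SD 1.5-style projection (768 -> 320)
sdxl_cond = np.zeros((4235, 2048))  # SDXL-shaped conditioning tokens

try:
    sdxl_cond @ sd15_proj           # (4235x2048) @ (768x320): incompatible
except ValueError as e:
    print("shape mismatch:", e)

sd15_cond = np.zeros((4235, 768))   # matching SD 1.5 conditioning works
print((sd15_cond @ sd15_proj).shape)  # (4235, 320)
```

So the fix is exactly as above: keep the checkpoint, ControlNets, and any LoRAs all in the same model family.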
@Nibot2023
7 months ago
@@koalanation Rad! Thank you for taking the time to answer and for giving a location for the ControlNets! I will let you know if I am successful in that area. Last question: I am curious what to do when ComfyUI says "reconnecting" and the pop-up says close. The system crashed on the upscale part. Is there a way to keep your work but reboot it to continue from that portion? Or do I just have to re-open ComfyUI like I have been doing and start over?
@rayenmajoul
8 months ago
does this work with SDXL models?
@koalanation
7 months ago
I do not see why not...but I have not tested it, to be honest.
@andrejlopuchov7972
8 months ago
For some reason my RTX 3090 got a CUDA error, like it ran out of power.
@koalanation
8 months ago
RAVE uses quite a bit of VRAM. I only manage to get 96 frames with a 4090 24 GB. Sometimes less is better...Hopefully they add support to reduce the requirements...
@SageMolotov
8 months ago
Can we change the VRAM settings to low VRAM? Would that solve this issue? My workflow failed at the RAVE KSampler (it also ran out of VRAM, and I have a 4090 with 16 GB VRAM / 64 GB RAM). @@koalanation
@espedairsystems
8 months ago
torch.cuda.OutOfMemoryError: Allocation on device 0 would exceed allowed memory (out of memory). Currently allocated: 16.42 GiB. Requested: 2.96 GiB. Device limit: 23.66 GiB. Free (according to CUDA): 30.12 MiB. PyTorch limit (set by user-supplied memory fraction). Looks like my RTX 3090 can't take the pace with 24 GB VRAM...time to save for my 5090 with 48 GB.
@koalanation
8 months ago
I get this error if I try to process too many frames. Try to reduce them and see if it works.
@koalanation
8 months ago
It is indeed worth trying...otherwise, I am afraid fewer frames will be the only way to make it work.
@RhapsHayden
4 months ago
Where would I add a custom-trained LoRA? After the Load Checkpoint?
Comments: 61