Add support for Chroma Radiance #9682
Can you show some generation examples?
edit: No changes were needed since Radiance delegates to the Chroma
dit_config["nerf_max_freqs"] = 8
dit_config["nerf_tile_size"] = 32
dit_config["nerf_final_head_type"] = "conv" if f"{key_prefix}nerf_final_layer_conv.norm.scale" in state_dict_keys else "linear"
dit_config["nerf_embedder_dtype"] = torch.float32
does this need to be in fp32?
Thanks for the review! Lodestone thought it would be beneficial to run the embedder in 32-bit precision, so that's why I did it that way. I let him know about your question; hopefully he will show up here to give you a better answer.
Yep, that's just the positional embedding part; the rest of the NeRF head is in lower precision.
Instead of adding a new node it might be best to just add the chroma radiance "vae" in the list of the "Load VAE" node. Then you can remove ChromaRadianceLatentToImage, ChromaRadianceImageToLatent and ChromaRadianceStubVAE.
Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE
Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1
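To make the "BHWC 0..1 <-> BCHW -1..1" conversion concrete, here is a minimal pure-Python sketch on nested lists. The function names `encode` and `decode` are hypothetical, the clamp in `decode` is an assumption, and none of this is taken from the actual comfy.sd.PixelspaceConversionVAE code, which works on tensors.

```python
# Hypothetical helpers for illustration only; not the comfy.sd implementation.


def encode(images):
    """BHWC nested lists with values in 0..1 -> BCHW nested lists in -1..1."""
    out = []
    for img in images:  # img is indexed [H][W][C]
        channels = len(img[0][0])
        out.append([
            # Pick channel c from every pixel and rescale 0..1 to -1..1.
            [[px[c] * 2.0 - 1.0 for px in row] for row in img]
            for c in range(channels)
        ])
    return out


def decode(latents):
    """BCHW nested lists with values in -1..1 -> BHWC nested lists in 0..1."""
    out = []
    for lat in latents:  # lat is indexed [C][H][W]
        channels, height, width = len(lat), len(lat[0]), len(lat[0][0])
        out.append([
            [
                # Clamping is an assumption so out-of-range samples stay in 0..1.
                [min(1.0, max(0.0, (lat[c][y][x] + 1.0) / 2.0))
                 for c in range(channels)]
                for x in range(width)
            ]
            for y in range(height)
        ])
    return out


# One 1x2 RGB image: a black pixel next to a white pixel.
image = [[[[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]]]
latent = encode(image)
restored = decode(latent)
```

The "latent" here is just the image with the channel axis moved and the values rescaled, which is why no trained VAE weights are involved.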
@comfyanonymous Hopefully the changes in the last commit are in the right direction; if not, please let me know!
Did some changes: #9851

The VAE should behave like an actual VAE, meaning you can save it in the checkpoint and it should auto load. I also made any resolution work with the model, and I changed the name of the "vae" to "pixel_space".
Nice, I didn't understand the VAE code there enough to mess with it and I wasn't sure if affecting existing code by trying to make things generic would get accepted. That's better than the approach I used.
Your change made it so it doesn't crash when the resolution isn't a multiple of 16, but it definitely does not seem to like resolutions that aren't divisible by 16.

I don't know if @lodestone-rock has any insights on how to actually make it work properly with resolutions that aren't multiples of 16, but based on what he told me in the past that is a requirement.
Hmm, yeah, the model works on 16x16 patches, so rounding to the nearest multiple of 16 is the intended behaviour. The reason it sort of works is that I'm using a "conv" to do ViT-like patch computation instead of flattening each patch and doing a matmul on it.
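As a small illustration of the multiple-of-16 requirement, the sketch below shows two ways to snap a requested size to the patch grid: rounding to the nearest multiple (for choosing a generation resolution) and rounding down (for cropping an input image). The helper names are hypothetical and this is not the code from the PR.

```python
PATCH_SIZE = 16  # the model works on 16x16 patches, per the comment above


def round_to_patch_multiple(size, patch=PATCH_SIZE):
    """Round to the nearest multiple of `patch` (ties go up), at least one patch."""
    return max(patch, (size + patch // 2) // patch * patch)


def crop_to_patch_multiple(size, patch=PATCH_SIZE):
    """Round down to a multiple of `patch`, as when cropping an input image."""
    return size // patch * patch


for width, height in [(1024, 1024), (1000, 600), (999, 520)]:
    rounded = (round_to_patch_multiple(width), round_to_patch_multiple(height))
    cropped = (crop_to_patch_multiple(width), crop_to_patch_multiple(height))
    print((width, height), "rounded:", rounded, "cropped:", cropped)
```

Integer arithmetic is used instead of Python's built-in `round`, which rounds ties to the nearest even value and would make exact half-patch sizes behave inconsistently.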
* Initial Chroma Radiance support
* Minor Chroma Radiance cleanups
* Update Radiance nodes to ensure latents/images are on the intermediate device
* Fix Chroma Radiance memory estimation.
* Increase Chroma Radiance memory usage factor
* Increase Chroma Radiance memory usage factor once again
* Ensure images are multiples of 16 for Chroma Radiance
  Add batch dimension and fix channels when necessary in ChromaRadianceImageToLatent node
* Tile Chroma Radiance NeRF to reduce memory consumption, update memory usage factor
* Update Radiance to support conv nerf final head type.
* Allow setting NeRF embedder dtype for Radiance
  Bump Radiance nerf tile size to 32
  Support EasyCache/LazyCache on Radiance (maybe)
* Add ChromaRadianceStubVAE node
* Crop Radiance image inputs to multiples of 16 instead of erroring to be in line with existing VAE behavior
* Convert Chroma Radiance nodes to V3 schema.
* Add ChromaRadianceOptions node and backend support.
  Cleanups/refactoring to reduce code duplication with Chroma.
* Fix overriding the NeRF embedder dtype for Chroma Radiance
* Minor Chroma Radiance cleanups
* Move Chroma Radiance to its own directory in ldm
  Minor code cleanups and tooltip improvements
* Fix Chroma Radiance embedder dtype overriding
* Remove Radiance dynamic nerf_embedder dtype override feature
* Unbork Radiance NeRF embedder init
* Remove Chroma Radiance image conversion and stub VAE nodes
  Add a chroma_radiance option to the VAELoader builtin node which uses comfy.sd.PixelspaceConversionVAE
  Add a PixelspaceConversionVAE to comfy.sd for converting BHWC 0..1 <-> BCHW -1..1
This patch adds support for Chroma Radiance, a pixel-space model adapted from Chroma, which is showing very promising results (although it is relatively early in training). The performance is quite close to Chroma.
I've tried to reuse existing code as much as possible so this is fairly self-contained. A pretty good chunk of it is copied from the implementation at https://github.com/lodestone-rock/flow - I can't take any credit for those parts.
The model can be downloaded at: https://huggingface.co/lodestones/chroma-debug-development-only/tree/main/radiance
No difference from a normal Chroma workflow except you need to use the `ChromaRadianceStubVAE` node instead of loading a VAE. It's not a real VAE, it just moves the channel dimension and scales the input image values to -1..1 for encode and the reverse for decode. There are also `ChromaRadianceLatentToImage` and `ChromaRadianceImageToLatent` nodes that do the same thing which could be removed. Lodestone Rock (Chroma and Radiance creator) said he preferred leaving them, I don't have a preference.

There is also a `ChromaRadianceOptions` node that allows overriding some parameters. The main one of interest is likely `nerf_tile_size`: setting it as high as possible without OOMing is beneficial for performance. I've been told that Blackwell benefits from having it set to 64, the default is currently 32.

I have a few more things I want to do before I'd consider it ready to merge (mostly minor stuff like better tooltips for the nodes) so this is starting out as a draft pull. Any comments/feedback are very welcome!
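On the `nerf_tile_size` trade-off mentioned above, here is a generic sketch of tiled processing: the work is split into chunks so that only one chunk is in flight at a time, which bounds peak memory at the cost of more passes. The names `process_in_tiles` and `fake_head` are hypothetical, and treating the tile size as "items per chunk" is an assumption; the source does not say exactly what unit `nerf_tile_size` counts.

```python
def process_in_tiles(patches, tile_size, fn):
    """Apply fn to at most tile_size patches at a time and concatenate the results."""
    if tile_size < 1:
        raise ValueError("tile_size must be at least 1")
    results = []
    for start in range(0, len(patches), tile_size):
        results.extend(fn(patches[start:start + tile_size]))
    return results


# Stand-in for an expensive per-patch head; it records the largest batch it saw.
peak = {"batch": 0}


def fake_head(batch):
    peak["batch"] = max(peak["batch"], len(batch))
    return [p * 2 for p in batch]


patches = list(range(100))
tiled = process_in_tiles(patches, 32, fake_head)
peak_at_32 = peak["batch"]
untiled = fake_head(patches)
```

The output is the same for any tile size; a larger tile means fewer, bigger batches, which is presumably why raising it helps performance until memory runs out.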
edit: Going to move this out of draft status now. Possible things to resolve before merging:

* `Chroma` module instead of just copying the code. I try to reuse existing code when possible, but this does mean future modifications to Chroma need to take Radiance using it into account. Also, I don't think there is an existing precedent for doing it this way in the existing code. I can change this if necessary.
* `ChromaRadianceLatentToImage` and `ChromaRadianceImageToLatent` nodes to clean up the node list. They aren't doing anything unique. On the other hand, they are just wrappers around the existing conversion code, so maintaining them doesn't add much of a burden and removing them would possibly break existing workflows.
* `ChromaRadianceStubVAE` - I'm not good at naming stuff so this may not be the best name. It would be kind of a pain to change now since it would break the existing workflows (though changing the display name would be free). If necessary this can be changed.

Let me know if anything needs to be changed to make this mergeable. General comments/feedback are also welcome, of course.