Image preview #522
Merged
54 commits
e8ac336 fast latent image preview (stduhpf)
de9c492 fix posix compile (stduhpf)
ee4aef8 move latent preview code to a separate file (stduhpf)
75a9abd Latent preview support for img2img and img2vid (stduhpf)
8dcb814 add latent-preview to .gitignore (stduhpf)
ef62078 Refactor latent preview + support tae/vae preview (stduhpf)
2cedeb5 update usage (stduhpf)
be0a442 Fix build + add warning (stduhpf)
31b0fdd Disable preview by default in sdcpp too (stduhpf)
95fd31c Do not preload preview tensor when preview is disabled. (stduhpf)
cbd8c99 Fix VAE preview darkening (stduhpf)
c3d72c0 Increase context memory when loading multiple auto encoders (stduhpf)
8059ac3 Increase context memory when previewing with auto encoder instead (stduhpf)
8e6024f fix compile warnings (stduhpf)
19ac567 fix print-params (stduhpf)
430f7d8 fix preview with unet inpaint models (stduhpf)
2272068 do not spam pretty progress when using tiled vae/tae as preview (stduhpf)
eeca697 change log level of "processing %i tiles" (stduhpf)
beb0e91 Refactor preview to match the other callbacks (stduhpf)
d465a70 preview: new API (stduhpf)
55ef7be latent proj bias (stduhpf)
86e5c49 Merge branch 'master' into image-preview (stduhpf)
a5278ce fix merge issues (stduhpf)
030aa3d add wan latent projs (stduhpf)
4c536b5 animated previews (stduhpf)
7a0ab28 latent proj bias (stduhpf)
3e0ef27 fix dup (stduhpf)
2ba5a43 Merge branch 'master' into image-preview (stduhpf)
a57c3f4 Merge branch 'master' into image-preview (stduhpf)
70a1611 Support latent2rgb preview for qwen image (via wan21) (stduhpf)
e2ce17d Fix ctx memory pool size overwritten during merge (stduhpf)
05bf92a Merge branch 'master' into image-preview (stduhpf)
1bae409 Merge remote-tracking branch 'origin' into image-preview (stduhpf)
f7b53e5 fix build and update help messages (stduhpf)
0a59f36 update help message in readme (stduhpf)
059f025 remove tensor shape spam (stduhpf)
6563d46 Fix progress display (stduhpf)
fff9930 Merge branch 'master' into image-preview (stduhpf)
b1fc7cd preview: support pixel space diffusion (stduhpf)
31d36b2 include preview (and apply_mask) in speed stats properly (stduhpf)
4e3500c support noisy preview via API (stduhpf)
27af5a4 missing includes (stduhpf)
07c61f1 supports noisy preview in main (stduhpf)
f80f61a fix tae-preview-only (bad merge issue) (stduhpf)
6c68e39 format code (stduhpf)
fc2a71e update help in readme (stduhpf)
8a3346f use bespoke latent to rgb projection to prevent licensing issues (stduhpf)
b5e73f9 fix sd3 null bias breaking build (stduhpf)
a50e2ce Merge branch 'master' into image-preview (stduhpf)
c1226d6 use new ggml_ext function names (stduhpf)
3db7fb1 Fix radiance proj support (stduhpf)
d18b2a1 Merge branch 'master' into image-preview (stduhpf)
7a2de72 Merge branch 'master' into image-preview (stduhpf)
044f0ed format code (leejet)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
@@ -12,3 +12,4 @@ test/ | |
| output*.png | ||
| models* | ||
| *.log | ||
| preview.png | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,173 @@ | ||
| #include <cstddef> | ||
| #include <cstdint> | ||
| #include "ggml.h" | ||
| | ||
| const float wan_21_latent_rgb_proj[16][3] = { | ||
| {0.015123f, -0.148418f, 0.479828f}, | ||
| {0.003652f, -0.010680f, -0.037142f}, | ||
| {0.212264f, 0.063033f, 0.016779f}, | ||
| {0.232999f, 0.406476f, 0.220125f}, | ||
| {-0.051864f, -0.082384f, -0.069396f}, | ||
| {0.085005f, -0.161492f, 0.010689f}, | ||
| {-0.245369f, -0.506846f, -0.117010f}, | ||
| {-0.151145f, 0.017721f, 0.007207f}, | ||
| {-0.293239f, -0.207936f, -0.421135f}, | ||
| {-0.187721f, 0.050783f, 0.177649f}, | ||
| {-0.013067f, 0.265964f, 0.166578f}, | ||
| {0.028327f, 0.109329f, 0.108642f}, | ||
| {-0.205343f, 0.043991f, 0.148914f}, | ||
| {0.014307f, -0.048647f, -0.007219f}, | ||
| {0.217150f, 0.053074f, 0.319923f}, | ||
| {0.155357f, 0.083156f, 0.064780f}}; | ||
| float wan_21_latent_rgb_bias[3] = {-0.270270f, -0.234976f, -0.456853f}; | ||
| | ||
| const float wan_22_latent_rgb_proj[48][3] = { | ||
| {0.017126f, -0.027230f, -0.019257f}, | ||
| {-0.113739f, -0.028715f, -0.022885f}, | ||
| {-0.000106f, 0.021494f, 0.004629f}, | ||
| {-0.013273f, -0.107137f, -0.033638f}, | ||
| {-0.000381f, 0.000279f, 0.025877f}, | ||
| {-0.014216f, -0.003975f, 0.040528f}, | ||
| {0.001638f, -0.000748f, 0.011022f}, | ||
| {0.029238f, -0.006697f, 0.035933f}, | ||
| {0.021641f, -0.015874f, 0.040531f}, | ||
| {-0.101984f, -0.070160f, -0.028855f}, | ||
| {0.033207f, -0.021068f, 0.002663f}, | ||
| {-0.104711f, 0.121673f, 0.102981f}, | ||
| {0.082647f, -0.004991f, 0.057237f}, | ||
| {-0.027375f, 0.031581f, 0.006868f}, | ||
| {-0.045434f, 0.029444f, 0.019287f}, | ||
| {-0.046572f, -0.012537f, 0.006675f}, | ||
| {0.074709f, 0.033690f, 0.025289f}, | ||
| {-0.008251f, -0.002745f, -0.006999f}, | ||
| {0.012685f, -0.061856f, -0.048658f}, | ||
| {0.042304f, -0.007039f, 0.000295f}, | ||
| {-0.007644f, -0.060843f, -0.033142f}, | ||
| {0.159909f, 0.045628f, 0.367541f}, | ||
| {0.095171f, 0.086438f, 0.010271f}, | ||
| {0.006812f, 0.019643f, 0.029637f}, | ||
| {0.003467f, -0.010705f, 0.014252f}, | ||
| {-0.099681f, -0.066272f, -0.006243f}, | ||
| {0.047357f, 0.037040f, 0.000185f}, | ||
| {-0.041797f, -0.089225f, -0.032257f}, | ||
| {0.008928f, 0.017028f, 0.018684f}, | ||
| {-0.042255f, 0.016045f, 0.006849f}, | ||
| {0.011268f, 0.036462f, 0.037387f}, | ||
| {0.011553f, -0.016375f, -0.048589f}, | ||
| {0.046266f, -0.027189f, 0.056979f}, | ||
| {0.009640f, -0.017576f, 0.030324f}, | ||
| {-0.045794f, -0.036083f, -0.010616f}, | ||
| {0.022418f, 0.039783f, -0.032939f}, | ||
| {-0.052714f, -0.015525f, 0.007438f}, | ||
| {0.193004f, 0.223541f, 0.264175f}, | ||
| {-0.059406f, -0.008188f, 0.022867f}, | ||
| {-0.156742f, -0.263791f, -0.007385f}, | ||
| {-0.015717f, 0.016570f, 0.033969f}, | ||
| {0.037969f, 0.109835f, 0.200449f}, | ||
| {-0.000782f, -0.009566f, -0.008058f}, | ||
| {0.010709f, 0.052960f, -0.044195f}, | ||
| {0.017271f, 0.045839f, 0.034569f}, | ||
| {0.009424f, 0.013088f, -0.001714f}, | ||
| {-0.024805f, -0.059378f, -0.033756f}, | ||
| {-0.078293f, 0.029070f, 0.026129f}}; | ||
| float wan_22_latent_rgb_bias[3] = {0.013160f, -0.096492f, -0.071323f}; | ||
| | ||
| const float flux_latent_rgb_proj[16][3] = { | ||
| {-0.041168f, 0.019917f, 0.097253f}, | ||
| {0.028096f, 0.026730f, 0.129576f}, | ||
| {0.065618f, -0.067950f, -0.014651f}, | ||
| {-0.012998f, -0.014762f, 0.081251f}, | ||
| {0.078567f, 0.059296f, -0.024687f}, | ||
| {-0.015987f, -0.003697f, 0.005012f}, | ||
| {0.033605f, 0.138999f, 0.068517f}, | ||
| {-0.024450f, -0.063567f, -0.030101f}, | ||
| {-0.040194f, -0.016710f, 0.127185f}, | ||
| {0.112681f, 0.088764f, -0.041940f}, | ||
| {-0.023498f, 0.093664f, 0.025543f}, | ||
| {0.082899f, 0.048320f, 0.007491f}, | ||
| {0.075712f, 0.074139f, 0.081965f}, | ||
| {-0.143501f, 0.018263f, -0.136138f}, | ||
| {-0.025767f, -0.082035f, -0.040023f}, | ||
| {-0.111849f, -0.055589f, -0.032361f}}; | ||
| float flux_latent_rgb_bias[3] = {0.024600f, -0.006937f, -0.008089f}; | ||
| | ||
| // This one was taken straight from | ||
| // https://github.com/Stability-AI/sd3.5/blob/8565799a3b41eb0c7ba976d18375f0f753f56402/sd3_impls.py#L288-L303 | ||
| // (MIT License) | ||
| const float sd3_latent_rgb_proj[16][3] = { | ||
| {-0.0645f, 0.0177f, 0.1052f}, | ||
| {0.0028f, 0.0312f, 0.0650f}, | ||
| {0.1848f, 0.0762f, 0.0360f}, | ||
| {0.0944f, 0.0360f, 0.0889f}, | ||
| {0.0897f, 0.0506f, -0.0364f}, | ||
| {-0.0020f, 0.1203f, 0.0284f}, | ||
| {0.0855f, 0.0118f, 0.0283f}, | ||
| {-0.0539f, 0.0658f, 0.1047f}, | ||
| {-0.0057f, 0.0116f, 0.0700f}, | ||
| {-0.0412f, 0.0281f, -0.0039f}, | ||
| {0.1106f, 0.1171f, 0.1220f}, | ||
| {-0.0248f, 0.0682f, -0.0481f}, | ||
| {0.0815f, 0.0846f, 0.1207f}, | ||
| {-0.0120f, -0.0055f, -0.0867f}, | ||
| {-0.0749f, -0.0634f, -0.0456f}, | ||
| {-0.1418f, -0.1457f, -0.1259f}, | ||
| }; | ||
| float sd3_latent_rgb_bias[3] = {0, 0, 0}; | ||
| | ||
| const float sdxl_latent_rgb_proj[4][3] = { | ||
| {0.258303f, 0.277640f, 0.329699f}, | ||
| {-0.299701f, 0.105446f, 0.014194f}, | ||
| {0.050522f, 0.186163f, -0.143257f}, | ||
| {-0.211938f, -0.149892f, -0.080036f}}; | ||
| float sdxl_latent_rgb_bias[3] = {0.144381f, -0.033313f, 0.007061f}; | ||
| | ||
| const float sd_latent_rgb_proj[4][3] = { | ||
| {0.337366f, 0.216344f, 0.257386f}, | ||
| {0.165636f, 0.386828f, 0.046994f}, | ||
| {-0.267803f, 0.237036f, 0.223517f}, | ||
| {-0.178022f, -0.200862f, -0.678514f}}; | ||
| float sd_latent_rgb_bias[3] = {-0.017478f, -0.055834f, -0.105825f}; | ||
| | ||
| void preview_latent_video(uint8_t* buffer, struct ggml_tensor* latents, const float (*latent_rgb_proj)[3], const float latent_rgb_bias[3], int width, int height, int frames, int dim) { | ||
| size_t buffer_head = 0; | ||
| for (int k = 0; k < frames; k++) { | ||
| for (int j = 0; j < height; j++) { | ||
| for (int i = 0; i < width; i++) { | ||
| size_t latent_id = (i * latents->nb[0] + j * latents->nb[1] + k * latents->nb[2]); | ||
| float r = 0, g = 0, b = 0; | ||
| if (latent_rgb_proj != nullptr) { | ||
| for (int d = 0; d < dim; d++) { | ||
| float value = *(float*)((char*)latents->data + latent_id + d * latents->nb[ggml_n_dims(latents) - 1]); | ||
| r += value * latent_rgb_proj[d][0]; | ||
| g += value * latent_rgb_proj[d][1]; | ||
| b += value * latent_rgb_proj[d][2]; | ||
| } | ||
| } else { | ||
| // interpret first 3 channels as RGB | ||
| r = *(float*)((char*)latents->data + latent_id + 0 * latents->nb[ggml_n_dims(latents) - 1]); | ||
| g = *(float*)((char*)latents->data + latent_id + 1 * latents->nb[ggml_n_dims(latents) - 1]); | ||
| b = *(float*)((char*)latents->data + latent_id + 2 * latents->nb[ggml_n_dims(latents) - 1]); | ||
| } | ||
| if (latent_rgb_bias != nullptr) { | ||
| // bias | ||
| r += latent_rgb_bias[0]; | ||
| g += latent_rgb_bias[1]; | ||
| b += latent_rgb_bias[2]; | ||
| } | ||
| // change range | ||
| r = r * .5f + .5f; | ||
| g = g * .5f + .5f; | ||
| b = b * .5f + .5f; | ||
| | ||
| // clamp rgb values to [0,1] range | ||
| r = r >= 0 ? r <= 1 ? r : 1 : 0; | ||
| g = g >= 0 ? g <= 1 ? g : 1 : 0; | ||
| b = b >= 0 ? b <= 1 ? b : 1 : 0; | ||
| | ||
| buffer[buffer_head++] = (uint8_t)(r * 255); | ||
| buffer[buffer_head++] = (uint8_t)(g * 255); | ||
| buffer[buffer_head++] = (uint8_t)(b * 255); | ||
| } | ||
| } | ||
| } | ||
| } | ||
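
For each latent pixel, `preview_latent_video` above applies an affine projection to RGB (a sum over channels of the latent value times a projection row, plus an optional bias), rescales from [-1, 1] to [0, 1], clamps, and quantizes to 8 bits. The following is a minimal standalone sketch of that per-pixel pipeline with no ggml dependency. The names `latent_pixel_to_rgb` and `preview_planar_latent` are hypothetical and not part of this PR, and the sketch assumes a contiguous planar float buffer laid out as [channel][row][column] instead of the byte strides (`nb[]`) of a `ggml_tensor`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not in the PR): convert one latent pixel to 8-bit RGB.
// `channels` points at the pixel's first channel value; `stride` is the
// distance, in floats, between consecutive channels of that pixel.
std::array<std::uint8_t, 3> latent_pixel_to_rgb(const float* channels,
                                                std::size_t stride,
                                                int dim,
                                                const float (*proj)[3],
                                                const float* bias) {
    assert(channels != nullptr);
    // Without a projection, the first 3 channels are read directly as RGB.
    assert(proj != nullptr ? dim >= 1 : dim >= 3);

    float rgb[3] = {0.0f, 0.0f, 0.0f};
    if (proj != nullptr) {
        // Affine projection: accumulate value * projection row per channel.
        for (int d = 0; d < dim; d++) {
            const float value = channels[d * stride];
            for (int c = 0; c < 3; c++) {
                rgb[c] += value * proj[d][c];
            }
        }
    } else {
        for (int c = 0; c < 3; c++) {
            rgb[c] = channels[c * stride];
        }
    }

    std::array<std::uint8_t, 3> out{};
    for (int c = 0; c < 3; c++) {
        float v = rgb[c] + (bias != nullptr ? bias[c] : 0.0f);
        v = v * 0.5f + 0.5f;                   // map [-1, 1] to [0, 1]
        v = std::min(1.0f, std::max(0.0f, v)); // clamp to [0, 1]
        out[c] = static_cast<std::uint8_t>(v * 255); // truncates, as in the PR
    }
    return out;
}

// Hypothetical wrapper: convert a planar [dim][height][width] latent into an
// interleaved RGB byte buffer of size width * height * 3.
std::vector<std::uint8_t> preview_planar_latent(const std::vector<float>& latent,
                                                int width, int height, int dim,
                                                const float (*proj)[3],
                                                const float* bias) {
    assert(width > 0 && height > 0 && dim > 0);
    const std::size_t plane = static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    assert(latent.size() == plane * static_cast<std::size_t>(dim));

    std::vector<std::uint8_t> rgb;
    rgb.reserve(plane * 3);
    for (std::size_t p = 0; p < plane; p++) {
        // In a planar layout, channel d of pixel p lives at p + d * plane.
        const auto px = latent_pixel_to_rgb(latent.data() + p, plane, dim, proj, bias);
        rgb.insert(rgb.end(), px.begin(), px.end());
    }
    return rgb;
}
```

With this sketch, a 1x1 latent whose three channels are {1, -1, 0}, passed with no projection and no bias, comes out as RGB (255, 0, 127): the middle value 0.5 * 255 is truncated, not rounded, which matches the cast used in the PR code.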