Description
Name and Version
B5727
Operating systems
Mac
GGML backends
Metal
Hardware
M4 Mac Studio
Models
Qwen 2.5 1.5B Instruct (the GGUF metadata reports general.name = qwen2.5-1.5b-instruct)
Problem description & steps to reproduce
The app crashes while loading the model, at this line in the Metal backend:
GGML_METAL_ADD_KERNEL(GGML_METAL_KERNEL_TYPE_FLASH_ATTN_EXT_Q8_0_H96, flash_attn_ext_q8_0_h96, has_simdgroup_mm);
Thread 7: EXC_BAD_ACCESS (code=1, address=0x4e29444af118)
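A note for triage: in the log output below, every `ggml_metal_init` line appears twice, and each kernel is reported as loaded twice with a different pipeline address. Together with the crash being on Thread 7, this would be consistent with two Metal contexts being initialized at the same time, but that is an assumption and not a confirmed cause. The following is a minimal Python sketch for checking a saved log for kernels that were loaded more than once. The function names are hypothetical helpers, not part of llama.cpp, and the sample lines are copied from the log in this report.

```python
import re

# Matches lines such as:
# ggml_metal_init: loaded kernel_pad_f32 0x600000b3d140 | th_max = 1024 | th_width = 32
LOADED = re.compile(r"ggml_metal_init: loaded (\S+)\s+(0x[0-9a-fA-F]+)")


def kernel_load_addresses(log_text):
    """Map each kernel name to the set of pipeline addresses it was loaded at."""
    seen = {}
    for line in log_text.splitlines():
        match = LOADED.search(line)
        if match:
            seen.setdefault(match.group(1), set()).add(match.group(2))
    return seen


def kernels_loaded_more_than_once(log_text):
    """Return the sorted names of kernels reported with more than one address."""
    addresses = kernel_load_addresses(log_text)
    return sorted(name for name, addrs in addresses.items() if len(addrs) > 1)


# Three lines taken from the log in this report.
sample = """\
ggml_metal_init: loaded kernel_pad_f32 0x600000b3d140 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_pad_f32 0x600000b30e40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_upscale_f32 0x600000b306c0 | th_max = 1024 | th_width = 32
"""

print(kernels_loaded_more_than_once(sample))
```

If the full log lists nearly every kernel as loaded more than once, a useful next check would be to load the two models one after the other instead of in parallel and see whether the crash still occurs.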
First Bad Commit
No response
Relevant log output
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
llama_model_loader: loaded meta data with 26 key-value pairs and 339 tensors from /Users/animax/Library/Developer/Xcode/DerivedData/LocalLLM-fpkqjzkzghleumgxashmyivglcsj/Build/Products/Debug/model.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = qwen2.5-1.5b-instruct
llama_model_loader: - kv 3: general.version str = v0.1
llama_model_loader: - kv 4: general.finetune str = qwen2.5-1.5b-instruct
llama_model_loader: - kv 5: general.size_label str = 1.8B
llama_model_loader: - kv 6: qwen2.block_count u32 = 28
llama_model_loader: - kv 7: qwen2.context_length u32 = 32768
llama_model_loader: - kv 8: qwen2.embedding_length u32 = 1536
llama_model_loader: - kv 9: qwen2.feed_forward_length u32 = 8960
llama_model_loader: - kv 10: qwen2.attention.head_count u32 = 12
llama_model_loader: - kv 11: qwen2.attention.head_count_kv u32 = 2
llama_model_loader: - kv 12: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 13: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 14: general.file_type u32 = 7
llama_model_loader: - kv 15: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 16: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 17: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 18: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 19: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 22: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 23: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 24: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 25: general.quantization_version u32 = 2
...
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M4 Max
ggml_metal_init: picking default device: Apple M4 Max
ggml_metal_load_library: using embedded metal library
Embedding model loaded successfully
Warning: Compilation succeeded with:
program_source:485:28: warning: unused variable 'ksigns64' [-Wunused-const-variable]
GGML_TABLE_BEGIN(uint64_t, ksigns64, 128)
^
program_source:1080:26: warning: unused variable 'kvalues_iq4nl' [-Wunused-const-variable]
GGML_TABLE_BEGIN(int8_t, kvalues_iq4nl, 16)
^
Warning: Compilation succeeded with:
program_source:485:28: warning: unused variable 'ksigns64' [-Wunused-const-variable]
GGML_TABLE_BEGIN(uint64_t, ksigns64, 128)
^
program_source:1080:26: warning: unused variable 'kvalues_iq4nl' [-Wunused-const-variable]
GGML_TABLE_BEGIN(int8_t, kvalues_iq4nl, 16)
^
ggml_metal_init: GPU name: Apple M4 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple9 (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets = true
ggml_metal_init: has bfloat = true
ggml_metal_init: use bfloat = true
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 51539.61 MB
ggml_metal_init: GPU name: Apple M4 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple9 (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3 (5001)
ggml_metal_init: simdgroup reduction = true
ggml_metal_init: simdgroup matrix mul. = true
ggml_metal_init: has residency sets = true
ggml_metal_init: has bfloat = true
ggml_metal_init: use bfloat = true
ggml_metal_init: hasUnifiedMemory = true
ggml_metal_init: recommendedMaxWorkingSetSize = 51539.61 MB
...
ggml_metal_init: loaded kernel_mul_mv_ext_q5_1_f32_r1_3 0x600000a84900 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_1_f32_r1_4 0x600000a84a20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_1_f32_r1_4 0x600000ab3d20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_1_f32_r1_5 0x600000ab1980 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_1_f32_r1_5 0x600000ab63a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_2 0x600000a8e3a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_2 0x600000a84b40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_3 0x600000ab7ba0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_3 0x600000a8f360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_4 0x600000a984e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_4 0x600000a802a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_5 0x600000a8af40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q8_0_f32_r1_5 0x600000a98600 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_2 0x600000a81260 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_2 0x600000ab3f00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_3 0x600000ab3ea0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_3 0x600000a8b7e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_4 0x600000ab3d80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_4 0x600000a99800 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_5 0x600000a45c80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q4_K_f32_r1_5 0x600000a81bc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_2 0x600000ac24c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_2 0x600000a84960 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_3 0x600000a85aa0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_3 0x600000a910e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_4 0x600000a0dc20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_4 0x600000af3840 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_5 0x600000acc180 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q5_K_f32_r1_5 0x600000afd140 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_2 0x600000af07e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_2 0x600000ac9440 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_3 0x600000afe3a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_3 0x600000acc240 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_4 0x600000a581e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_4 0x600000acca20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_5 0x600000a58240 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_q6_K_f32_r1_5 0x600000af3c00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_2 0x600000af1380 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_2 0x600000a8fa80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_3 0x600000af2b80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_3 0x600000acb0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_4 0x600000acb180 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_4 0x600000af1bc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_5 0x600000afcd20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_ext_iq4_nl_f32_r1_5 0x600000a8f660 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q2_K_f32 0x600000b69140 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q2_K_f32 0x600000b6c360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q3_K_f32 0x600000b6c3c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q3_K_f32 0x600000b60300 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q4_K_f32 0x600000b6c420 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q4_K_f32 0x600000acd020 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q5_K_f32 0x600000afd500 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q5_K_f32 0x600000b6ca80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q6_K_f32 0x600000b6ccc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_q6_K_f32 0x600000afd800 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq2_xxs_f32 0x600000b698c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq2_xxs_f32 0x600000af3d80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq2_xs_f32 0x600000b60420 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq2_xs_f32 0x600000b6da40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq3_xxs_f32 0x600000acd1a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq3_xxs_f32 0x600000af0180 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq3_s_f32 0x600000b640c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq3_s_f32 0x600000b781e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq2_s_f32 0x600000b64780 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq2_s_f32 0x600000ace880 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq1_s_f32 0x600000b78300 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq1_s_f32 0x600000ace820 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq1_m_f32 0x600000b789c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq1_m_f32 0x600000b6db00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq4_nl_f32 0x600000b62700 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq4_nl_f32 0x600000b78ae0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq4_xs_f32 0x600000b62f40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_iq4_xs_f32 0x600000b648a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_f32_f32 0x600000b628e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_f32_f32 0x600000ace9a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_f16_f32 0x600000b791a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_f16_f32 0x600000acf180 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_bf16_f32 0x600000b6db60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_bf16_f32 0x600000b65020 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q4_0_f32 0x600000b63000 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q4_0_f32 0x600000b79a40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q4_1_f32 0x600000acaf40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q4_1_f32 0x600000b6b6c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q5_0_f32 0x600000b6e2e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q5_0_f32 0x600000b7c0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q5_1_f32 0x600000b6e3a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q5_1_f32 0x600000b63840 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q8_0_f32 0x600000b7c7e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q8_0_f32 0x600000b65e60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q2_K_f32 0x600000aca280 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q2_K_f32 0x600000b6bde0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q3_K_f32 0x600000b7a9a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q3_K_f32 0x600000b66700 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q4_K_f32 0x600000af2ac0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q4_K_f32 0x600000b6be40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q5_K_f32 0x600000b701e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q5_K_f32 0x600000b66d60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q6_K_f32 0x600000b6bf60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_q6_K_f32 0x600000b70300 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq2_xxs_f32 0x600000b7cf60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq2_xxs_f32 0x600000b74000 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq2_xs_f32 0x600000b480c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq2_xs_f32 0x600000b6bea0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq3_xxs_f32 0x600000b74f00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq3_xxs_f32 0x600000b70ae0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq3_s_f32 0x600000b7d860 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq3_s_f32 0x600000b66fa0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq2_s_f32 0x600000b69f80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq2_s_f32 0x600000ace7c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq1_s_f32 0x600000ace6a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq1_s_f32 0x600000b729a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq1_m_f32 0x600000b67ea0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq1_m_f32 0x600000b74fc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq4_nl_f32 0x600000b40600 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq4_nl_f32 0x600000b75020 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq4_xs_f32 0x600000b488a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mv_id_iq4_xs_f32 0x600000b4c0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_f32_f32 0x600000b40660 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_f32_f32 0x600000b755c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_f16_f32 0x600000b49080 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_f16_f32 0x600000b4ce40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_bf16_f32 0x600000b7fb40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_bf16_f32 0x600000b75680 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q4_0_f32 0x600000b73660 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q4_0_f32 0x600000b406c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q4_1_f32 0x600000b73ba0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q4_1_f32 0x600000b49680 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q5_0_f32 0x600000b4a100 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q5_0_f32 0x600000b4c900 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q5_1_f32 0x600000b41200 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q5_1_f32 0x600000b4c8a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q8_0_f32 0x600000b412c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q8_0_f32 0x600000b4d4a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q2_K_f32 0x600000b41920 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q2_K_f32 0x600000b4a1c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q3_K_f32 0x600000b75d40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q3_K_f32 0x600000b583c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q4_K_f32 0x600000b76280 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q4_K_f32 0x600000b4a220 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q5_K_f32 0x600000b4e0a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q5_K_f32 0x600000b41e60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q6_K_f32 0x600000b41f20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_q6_K_f32 0x600000b58e40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq2_xxs_f32 0x600000b42580 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq2_xxs_f32 0x600000b45da0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq2_xs_f32 0x600000b76a60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq2_xs_f32 0x600000b58f00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq3_xxs_f32 0x600000b42640 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq3_xxs_f32 0x600000b77060 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq3_s_f32 0x600000b4a2e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq3_s_f32 0x600000b4e6a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq2_s_f32 0x600000b4a880 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq2_s_f32 0x600000b46940 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq1_s_f32 0x600000b4ae20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq1_s_f32 0x600000b46f40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq1_m_f32 0x600000b430c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq1_m_f32 0x600000b72760 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq4_nl_f32 0x600000b436c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq4_nl_f32 0x600000b595c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq4_xs_f32 0x600000b4b420 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_iq4_xs_f32 0x600000b47480 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_map0_f16 0x600000b59bc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_map0_f16 0x600000b478a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_map1_f32 0x600000b5a280 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_map1_f32 0x600000b43840 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_f32_f16 0x600000b72100 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_f32_f16 0x600000b4ebe0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_f16_f16 0x600000a3d860 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_f16_f16 0x600000b5c0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_bf16_f16 0x600000b5cc00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_bf16_f16 0x600000b5a7c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q4_0_f16 0x600000b50360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q4_0_f16 0x600000b47600 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q4_1_f16 0x600000b50960 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q4_1_f16 0x600000b4ba20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q5_0_f16 0x600000b5ccc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q5_0_f16 0x600000b5a880 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q5_1_f16 0x600000b4f3c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q5_1_f16 0x600000b540c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q8_0_f16 0x600000b4f9c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q8_0_f16 0x600000b51ec0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q2_K_f16 0x600000b283c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q2_K_f16 0x600000b5d200 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q3_K_f16 0x600000b5ad60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q3_K_f16 0x600000b51e60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q4_K_f16 0x600000b5aee0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q4_K_f16 0x600000b54780 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q5_K_f16 0x600000b5af40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q5_K_f16 0x600000b28a80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q6_K_f16 0x600000b5b060 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_q6_K_f16 0x600000b54d80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq2_xxs_f16 0x600000b52460 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq2_xxs_f16 0x600000b5d8c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq2_xs_f16 0x600000b529a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq2_xs_f16 0x600000b55380 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq3_xxs_f16 0x600000b28fc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq3_xxs_f16 0x600000b5e340 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq3_s_f16 0x600000b29020 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq3_s_f16 0x600000b53600 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq2_s_f16 0x600000b29080 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq2_s_f16 0x600000b2c0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq1_s_f16 0x600000b5eee0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq1_s_f16 0x600000b55980 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq1_m_f16 0x600000b5f060 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq1_m_f16 0x600000b290e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq4_nl_f16 0x600000b51e00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq4_nl_f16 0x600000b2d1a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq4_xs_f16 0x600000b29680 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_mul_mm_id_iq4_xs_f16 0x600000b20120 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_norm_f32 0x600000b29560 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_norm_f32 0x600000b47a80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_norm_f16 0x600000b29e00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_norm_f16 0x600000b240c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_multi_f32 0x600000b2b480 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_multi_f32 0x600000b51ce0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_multi_f16 0x600000b380c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_multi_f16 0x600000b55ec0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_vision_f32 0x600000b38c60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_vision_f32 0x600000b2d380 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_vision_f16 0x600000b2d980 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_vision_f16 0x600000b39800 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_neox_f32 0x600000b3a3a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_neox_f32 0x600000b2ef40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_neox_f16 0x600000b2f0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_rope_neox_f16 0x600000b3af40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_f16 0x600000b2ff00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_f16 0x600000b223a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_f32 0x600000b53f00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_f32 0x600000b2b540 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_ext_f16 0x600000b2bae0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_ext_f16 0x600000b3c480 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_ext_f32 0x600000b3c5a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_im2col_ext_f32 0x600000b55f20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_conv_transpose_1d_f32_f32 0x600000b562e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_conv_transpose_1d_f32_f32 0x600000b3cc00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_conv_transpose_1d_f16_f32 0x600000b3cf00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_conv_transpose_1d_f16_f32 0x600000b22ee0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_upscale_f32 0x600000b306c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_upscale_f32 0x600000b22fa0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_pad_f32 0x600000b3d140 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_pad_f32 0x600000b30e40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_pad_reflect_1d_f32 0x600000b57420 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_pad_reflect_1d_f32 0x600000b23720 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_timestep_embedding_f32 0x600000b2fde0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_timestep_embedding_f32 0x600000b2bb40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_arange_f32 0x600000b34000 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_arange_f32 0x600000b2bf60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_argsort_f32_i32_asc 0x600000b3d860 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_argsort_f32_i32_asc 0x600000b57d20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_argsort_f32_i32_desc 0x600000b3d6e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_argsort_f32_i32_desc 0x600000b34060 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_leaky_relu_f32 0x600000b34180 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_leaky_relu_f32 0x600000b57ea0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h64 0x600000b08000 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h64 0x600000b342a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h80 0x600000b24ba0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h80 0x600000b31500 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h96 0x600000b0c000 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h96 0x600000b3d560 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h112 0x600000b25ec0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h112 0x600000b08120 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h128 0x600000b08ae0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h128 0x600000b25da0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h192 0x600000b323a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h192 0x600000b3de60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_hk192_hv128 0x600000b0dc20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_hk192_hv128 0x600000b093e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h256 0x600000b32ca0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_h256 0x600000b3df80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_hk576_hv512 0x600000ab2f40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_f16_hk576_hv512 0x600000b0dce0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h64 0x600000b0edc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h64 0x600000ab2700 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h80 0x600000b32d60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h80 0x600000b361c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h96 0x600000b35da0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h96 0x600000b09e60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h112 0x600000b35d40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h112 0x600000ac1c20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h128 0x600000a85b00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h128 0x600000a9c900 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h192 0x600000a92220 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h192 0x600000a823a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_hk192_hv128 0x600000a82520 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_hk192_hv128 0x600000a85c20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h256 0x600000a9d2c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_h256 0x600000a83e40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_hk576_hv512 0x600000a9dc80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_bf16_hk576_hv512 0x600000a94540 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h64 0x600000a9e580 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h64 0x600000a98f00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h80 0x600000b0e580 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h80 0x600000a86ee0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h96 0x600000a86640 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h96 0x600000a9b9c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h112 0x600000a9e6a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h112 0x600000a96160 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h128 0x600000a96280 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h128 0x600000b05260 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h192 0x600000b05ce0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h192 0x600000a9f060 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_hk192_hv128 0x600000b1c300 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_hk192_hv128 0x600000a83d80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h256 0x600000a83d20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_h256 0x600000b06700 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_hk576_hv512 0x600000a83cc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_0_hk576_hv512 0x600000b070c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h64 0x600000b12640 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h64 0x600000a83360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h80 0x600000a87ba0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h80 0x600000a83540 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h96 0x600000b1c360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h96 0x600000b181e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h112 0x600000b1cc00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h112 0x600000b18b40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h128 0x600000b1cd80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h128 0x600000a87b40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h192 0x600000b15260 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h192 0x600000b18a20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_hk192_hv128 0x600000b1d800 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_hk192_hv128 0x600000b19c80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h256 0x600000b166a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_h256 0x600000b124c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_hk576_hv512 0x600000b1e1c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q4_1_hk576_hv512 0x600000a87a80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h64 0x600000b11c20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h64 0x600000be8240 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h80 0x600000b1f480 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h80 0x600000b11bc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h96 0x600000b16820 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h96 0x600000bec0c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h112 0x600000bec9c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h112 0x600000be9320 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h128 0x600000beca20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h128 0x600000b1f540 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h192 0x600000bed440 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h192 0x600000b1fe40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_hk192_hv128 0x600000becba0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_hk192_hv128 0x600000be44e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h256 0x600000bed560 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_h256 0x600000b17900 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_hk576_hv512 0x600000be91a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_0_hk576_hv512 0x600000b1a880 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h64 0x600000be8fc0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h64 0x600000bee760 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h80 0x600000be55c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h80 0x600000be00c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h96 0x600000bef1e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h96 0x600000b1aa00 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h112 0x600000bee940 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h112 0x600000be09c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h128 0x600000beb360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h128 0x600000be0a20 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h192 0x600000bfc9c0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h192 0x600000bf8360 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_hk192_hv128 0x600000bfcae0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_hk192_hv128 0x600000bf07e0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h256 0x600000bfcb40 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_h256 0x600000bf8de0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_hk576_hv512 0x600000bf1980 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q5_1_hk576_hv512 0x600000be54a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q8_0_h64 0x600000be0a80 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q8_0_h64 0x600000bfcc60 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q8_0_h80 0x600000bfa0a0 | th_max = 1024 | th_width = 32
ggml_metal_init: loaded kernel_flash_attn_ext_q8_0_h80 0x600000be0ae0 | th_max = 1024 | th_width = 32