
Eval bug: IQ2_M broken for mradermacher / Llama-4-Maverick-17B-128E-Instruct-GGUF #12913

Open
@whatever1983


Name and Version

Ask mradermacher or unsloth

Operating systems

Linux

GGML backends

CUDA

Hardware

Ask mradermacher or unsloth

Models

Llama-4-Maverick-17B-128E-Instruct-IQ2_M

Problem description & steps to reproduce

IQ2_M is broken for Llama-4-Maverick-17B-128E-Instruct-GGUF
https://hf.tst.eu/status.html
nico1 nice size (static/imatrix) -- jobs 9/8-40 maxm 130 free 2815 budget 1523 uploads 95 hfd 585 32c
-7776 804 I Llama-4-Maverick-17B-128E-Instruct error/47 8/24,IQ2_M [20/531]
-3999 804 I Llama-4-Maverick-17B-128E error/47 8/24,IQ2_M [20/531]

Unsloth noticed this same issue in their report:
https://docs.unsloth.ai/basics/tutorial-how-to-run-and-fine-tune-llama-4
Interesting Insights and Issues
During quantization of Llama 4 Maverick (the large model), we found the 1st, 3rd and 45th MoE layers could not be calibrated correctly. Maverick uses interleaving MoE layers for every odd layer, so Dense->MoE->Dense and so on.
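
To make the layer pattern in that note concrete, here is a minimal Python sketch. It assumes 0-based layer indices, reads "1st, 3rd and 45th" as layer indices 1, 3 and 45, and uses a hypothetical total of 48 layers (the count is not stated in the note). `is_moe_layer` is an illustrative helper, not part of llama.cpp or Unsloth's tooling.

```python
# Assumption: layers are indexed from 0 and MoE sits on every odd index,
# following the "Dense->MoE->Dense" description quoted above.
NUM_LAYERS = 48  # hypothetical placeholder; not taken from the issue


def is_moe_layer(index: int) -> bool:
    """Return True if the layer at `index` is MoE under the assumed pattern."""
    return index % 2 == 1


# Build the interleaved layout: Dense, MoE, Dense, MoE, ...
pattern = ["MoE" if is_moe_layer(i) else "Dense" for i in range(NUM_LAYERS)]

# Layers reported as failing calibration (read here as layer indices).
problem_layers = [1, 3, 45]
all_moe = all(is_moe_layer(i) for i in problem_layers)

print(pattern[:4], all_moe)
```

Under these assumptions all three reported layers land on odd indices, which is consistent with the problem being specific to the MoE layers.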

Apparently the Scout 16E model can be quantized to IQ2_M successfully; only the Maverick 128E model fails. Can we fix that? IQ2_M should come out at roughly 135GB, but Unsloth had to use 3-bit and 4-bit quantization for the layers that could not be calibrated, which blew the model up to 4x ~40GB. mradermacher's Q2_K is almost 157GB.

It would be very nice to shave off another ~20GB by using IQ2_M.
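
For reference, a rough back-of-the-envelope check of that saving, using only the approximate sizes quoted in this report (the variable names are illustrative):

```python
# Approximate sizes in GB, as quoted in this issue.
q2k_gb = 157     # mradermacher Q2_K figure ("almost 157GB")
iq2_m_gb = 135   # expected IQ2_M size ("~135GB")

saving_gb = q2k_gb - iq2_m_gb
saving_pct = 100 * saving_gb / q2k_gb

print(f"IQ2_M would be about {saving_gb} GB smaller ({saving_pct:.0f}% of the Q2_K size)")
```

Both inputs are estimates, so the result is only indicative of the order of magnitude.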

First Bad Commit

No response

Relevant log output

nico1    nice size (static/imatrix) -- [jobs](https://huggingface.co/mradermacher/jobs-GGUF) 9/8-40 [maxm](https://huggingface.co/mradermacher/maxm-GGUF) 130 free 2815 budget 1523 uploads 95 hfd 585 32c
        -7776  804  I Llama-4-Maverick-17B-128E-Instruct           error/47 8/24,IQ2_M [20/531]
        -3999  804  I Llama-4-Maverick-17B-128E                    error/47 8/24,IQ2_M [20/531]
