Commit 0b4ac75

RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list (ggml-org#9387)
Signed-off-by: Molly Sophia <[email protected]>
1 parent fb3f249 commit 0b4ac75

2 files changed, +4 -0 lines changed

convert_hf_to_gguf.py

Lines changed: 2 additions & 0 deletions
@@ -302,6 +302,8 @@ def prepare_tensors(self):
 gguf.MODEL_TENSOR.TIME_MIX_FIRST,
 gguf.MODEL_TENSOR.TIME_MIX_W1,
 gguf.MODEL_TENSOR.TIME_MIX_W2,
+gguf.MODEL_TENSOR.TIME_MIX_DECAY_W1,
+gguf.MODEL_TENSOR.TIME_MIX_DECAY_W2,
 )
 )
 or not new_name.endswith(".weight")

src/llama.cpp

Lines changed: 2 additions & 0 deletions
@@ -17530,6 +17530,8 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
 quantize &= name.find("time_mix_first.weight") == std::string::npos;
 quantize &= name.find("time_mix_w1.weight") == std::string::npos;
 quantize &= name.find("time_mix_w2.weight") == std::string::npos;
+quantize &= name.find("time_mix_decay_w1.weight") == std::string::npos;
+quantize &= name.find("time_mix_decay_w2.weight") == std::string::npos;

 // do not quantize relative position bias (T5)
 quantize &= name.find("attn_rel_b.weight") == std::string::npos;
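
For readers unfamiliar with the pattern in the hunk above, here is a minimal Python sketch of the same substring-based exclusion check. It is an illustration only, not code from llama.cpp: the names `EXCLUDED_SUBSTRINGS` and `should_quantize`, and the `blk.0.` tensor-name prefixes in the usage note, are hypothetical. The list holds only the substrings visible in this diff; the real function may apply other checks that are not shown here.

```python
# Hypothetical stand-alone sketch of the name-based exclusion shown in the diff.
# None of these names are llama.cpp or gguf APIs.
EXCLUDED_SUBSTRINGS = (
    "time_mix_first.weight",
    "time_mix_w1.weight",
    "time_mix_w2.weight",
    "time_mix_decay_w1.weight",  # added by this commit
    "time_mix_decay_w2.weight",  # added by this commit
    "attn_rel_b.weight",
)


def should_quantize(name: str) -> bool:
    """Return False when the tensor name contains any excluded substring.

    Equivalent in spirit to chaining `quantize &= name.find(s) == npos`
    for every excluded substring `s`.
    """
    return not any(s in name for s in EXCLUDED_SUBSTRINGS)
```

With a hypothetical name such as `blk.0.time_mix_decay_w1.weight`, `should_quantize` returns `False`. Note that the older entry `time_mix_w1.weight` is not a substring of that name, so the existing checks alone would not have excluded it; this is consistent with the commit adding explicit `decay` entries.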
