
I ran into this issue while trying to convert SmolLM2 and Qwen2.5 #13603

Closed
@IzzulGod

Description


INFO:hf-to-gguf:Loading model: safetensors
INFO:hf-to-gguf:Model architecture: LlamaForCausalLM
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
INFO:hf-to-gguf:token_embd.weight, torch.float32 --> Q8_0, shape = {960, 49152}
INFO:hf-to-gguf:blk.0.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.0.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.0.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.0.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.0.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.0.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.0.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.0.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.0.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.1.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.1.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.1.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.1.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.1.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.1.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.1.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.1.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.1.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.10.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.10.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.10.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.10.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.10.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.10.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.10.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.10.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.10.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.11.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.11.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.11.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.11.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.11.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.11.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.11.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.11.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.11.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.12.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.12.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.12.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.12.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.12.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.12.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.12.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.12.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.12.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.13.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.13.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.13.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.13.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.13.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.13.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.13.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.13.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.13.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.14.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.14.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.14.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.14.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.14.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.14.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.14.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.14.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.14.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.15.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.15.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.15.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.15.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.15.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.15.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.15.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.15.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.15.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.16.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.16.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.16.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.16.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.16.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.16.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.16.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.16.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.16.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.17.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.17.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.17.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.17.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.17.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.17.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.17.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.17.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.17.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.18.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.18.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.18.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.18.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.18.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.18.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.18.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.18.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.18.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.19.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.19.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.19.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.19.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.19.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.19.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.19.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.19.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.19.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.2.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.2.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.2.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.2.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.2.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.2.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.2.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.2.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.2.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.20.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.20.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.20.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.20.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.20.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.20.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.20.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.20.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.20.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.21.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.21.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.21.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.21.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.21.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.21.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.21.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.21.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.21.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.22.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.22.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.22.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.22.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.22.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.22.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.22.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.22.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.22.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.23.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.23.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.23.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.23.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.23.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.23.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.23.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.23.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.23.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.24.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.24.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.24.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.24.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.24.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.24.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.24.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.24.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.24.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.25.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.25.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.25.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.25.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.25.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.25.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.25.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.25.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.25.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.26.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.26.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.26.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.26.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.26.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.26.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.26.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.26.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.26.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.27.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.27.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.27.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.27.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.27.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.27.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.27.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.27.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.27.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.28.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.28.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.28.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.28.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.28.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.28.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.28.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.28.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.28.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.29.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.29.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.29.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.29.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.29.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.29.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.29.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.29.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.29.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.3.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.3.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.3.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.3.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.3.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.3.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.3.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.3.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.3.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.30.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.30.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.30.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.30.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.30.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.30.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.30.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.30.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.30.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.31.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.31.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.31.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.31.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.31.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.31.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.31.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.31.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.31.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.4.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.4.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.4.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.4.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.4.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.4.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.4.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.4.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.4.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.5.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.5.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.5.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.5.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.5.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.5.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.5.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.5.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.5.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.6.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.6.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.6.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.6.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.6.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.6.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.6.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.6.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.6.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.7.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.7.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.7.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.7.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.7.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.7.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.7.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.7.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.7.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.8.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.8.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.8.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.8.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.8.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.8.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.8.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.8.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.8.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.9.attn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.9.ffn_down.weight, torch.float32 --> Q8_0, shape = {2560, 960}
INFO:hf-to-gguf:blk.9.ffn_gate.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.9.ffn_up.weight, torch.float32 --> Q8_0, shape = {960, 2560}
INFO:hf-to-gguf:blk.9.ffn_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:blk.9.attn_k.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:blk.9.attn_output.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.9.attn_q.weight, torch.float32 --> Q8_0, shape = {960, 960}
INFO:hf-to-gguf:blk.9.attn_v.weight, torch.float32 --> Q8_0, shape = {960, 320}
INFO:hf-to-gguf:output_norm.weight, torch.float32 --> F32, shape = {960}
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 8192
INFO:hf-to-gguf:gguf: embedding length = 960
INFO:hf-to-gguf:gguf: feed forward length = 2560
INFO:hf-to-gguf:gguf: head count = 15
INFO:hf-to-gguf:gguf: key-value head count = 5
INFO:hf-to-gguf:gguf: rope theta = 100000
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 7
INFO:hf-to-gguf:Set model quantization version
INFO:hf-to-gguf:Set model tokenizer
Traceback (most recent call last):
File "/content/llama.cpp/convert_hf_to_gguf.py", line 1796, in set_vocab
self._set_vocab_sentencepiece()
File "/content/llama.cpp/convert_hf_to_gguf.py", line 894, in _set_vocab_sentencepiece
tokens, scores, toktypes = self._create_vocab_sentencepiece()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/llama.cpp/convert_hf_to_gguf.py", line 911, in _create_vocab_sentencepiece
raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /content/drive/MyDrive/Sorachio-Small/safetensors/tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/content/llama.cpp/convert_hf_to_gguf.py", line 1799, in set_vocab
self._set_vocab_llama_hf()
File "/content/llama.cpp/convert_hf_to_gguf.py", line 989, in _set_vocab_llama_hf
vocab = gguf.LlamaHfVocab(self.dir_model)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/content/llama.cpp/gguf-py/gguf/vocab.py", line 395, in __init__
raise FileNotFoundError('Cannot find Llama BPE tokenizer')
FileNotFoundError: Cannot find Llama BPE tokenizer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1778, in _get_module
return importlib.import_module("." + module_name, self.__name__)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/usr/local/lib/python3.11/dist-packages/transformers/generation/utils.py", line 115, in <module>
from accelerate.hooks import AlignDevicesHook, add_hook_to_module
File "/usr/local/lib/python3.11/dist-packages/accelerate/__init__.py", line 16, in <module>
from .accelerator import Accelerator
File "/usr/local/lib/python3.11/dist-packages/accelerate/accelerator.py", line 36, in <module>
from accelerate.utils.imports import is_torchao_available
File "/usr/local/lib/python3.11/dist-packages/accelerate/utils/__init__.py", line 14, in <module>
from .ao import convert_model_to_fp8_ao, filter_first_and_last_linear_layers, has_ao_layers
File "/usr/local/lib/python3.11/dist-packages/accelerate/utils/ao.py", line 28, in <module>
from torchao.float8.float8_linear import Float8LinearConfig
File "/usr/local/lib/python3.11/dist-packages/torchao/__init__.py", line 41, in <module>
from torchao.quantization import (
File "/usr/local/lib/python3.11/dist-packages/torchao/quantization/__init__.py", line 6, in <module>
from .autoquant import (
File "/usr/local/lib/python3.11/dist-packages/torchao/quantization/autoquant.py", line 11, in <module>
from torchao.dtypes import (
File "/usr/local/lib/python3.11/dist-packages/torchao/dtypes/__init__.py", line 1, in <module>
from . import affine_quantized_tensor_ops
File "/usr/local/lib/python3.11/dist-packages/torchao/dtypes/affine_quantized_tensor_ops.py", line 14, in <module>
from torchao.dtypes.floatx.cutlass_semi_sparse_layout import (
File "/usr/local/lib/python3.11/dist-packages/torchao/dtypes/floatx/__init__.py", line 1, in <module>
from .cutlass_semi_sparse_layout import (
File "/usr/local/lib/python3.11/dist-packages/torchao/dtypes/floatx/cutlass_semi_sparse_layout.py", line 19, in <module>
from torchao.ops import (
File "/usr/local/lib/python3.11/dist-packages/torchao/ops.py", line 46, in <module>
tags=[torch._C.Tag.needs_fixed_stride_order],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'torch._C.Tag' has no attribute 'needs_fixed_stride_order'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1778, in _get_module
return importlib.import_module("." + module_name, self.__name__)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 38, in <module>
from .auto_factory import _LazyAutoMapping
File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/auto_factory.py", line 40, in <module>
from ...generation import GenerationMixin
File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
module = self._get_module(self._class_to_module[name])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1780, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
type object 'torch._C.Tag' has no attribute 'needs_fixed_stride_order'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/content/llama.cpp/convert_hf_to_gguf.py", line 6216, in <module>
main()
File "/content/llama.cpp/convert_hf_to_gguf.py", line 6210, in main
model_instance.write()
File "/content/llama.cpp/convert_hf_to_gguf.py", line 402, in write
self.prepare_metadata(vocab_only=False)
File "/content/llama.cpp/convert_hf_to_gguf.py", line 512, in prepare_metadata
self.set_vocab()
File "/content/llama.cpp/convert_hf_to_gguf.py", line 1802, in set_vocab
self._set_vocab_gpt2()
File "/content/llama.cpp/convert_hf_to_gguf.py", line 830, in _set_vocab_gpt2
tokens, toktypes, tokpre = self.get_vocab_base()
^^^^^^^^^^^^^^^^^^^^^
File "/content/llama.cpp/convert_hf_to_gguf.py", line 597, in get_vocab_base
from transformers import AutoTokenizer
File "<frozen importlib._bootstrap>", line 1229, in _handle_fromlist
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1767, in __getattr__
value = getattr(module, name)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1766, in __getattr__
module = self._get_module(self._class_to_module[name])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1780, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.auto.tokenization_auto because of the following error (look up to see its traceback):
Failed to import transformers.generation.utils because of the following error (look up to see its traceback):
type object 'torch._C.Tag' has no attribute 'needs_fixed_stride_order'
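
The two FileNotFoundError blocks at the top of the traceback are the converter's normal vocab fallback rather than the actual failure: SmolLM2 and Qwen2.5 ship a BPE tokenizer (tokenizer.json plus tokenizer_config.json), not a SentencePiece tokenizer.model, so _set_vocab_sentencepiece and _set_vocab_llama_hf are expected to fail before the GPT-2/BPE path takes over. A minimal sketch to confirm which tokenizer files the model directory actually contains (the path is taken from the log above and is only an example):

# Hedged check: list tokenizer-related files in the model directory.
# For SmolLM2 / Qwen2.5 exports, tokenizer.json should be present and
# tokenizer.model should be missing; the missing file alone is not the bug.
from pathlib import Path

model_dir = Path("/content/drive/MyDrive/Sorachio-Small/safetensors")  # path from the log
for name in ("tokenizer.model", "tokenizer.json", "tokenizer_config.json", "vocab.json", "merges.txt"):
    print(f"{name}: {'found' if (model_dir / name).exists() else 'missing'}")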

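The error that actually aborts the conversion is the last one in the chain: _set_vocab_gpt2 imports transformers.AutoTokenizer, transformers pulls in accelerate, accelerate pulls in torchao, and the installed torchao build references torch._C.Tag.needs_fixed_stride_order, which the installed torch does not define. That points to a torch/torchao version mismatch in the environment rather than a problem with the model weights. A minimal diagnostic sketch, assuming only the packages named in the traceback:

# Hedged environment check: the AttributeError is raised while torchao is being
# imported, so verify whether the installed torch exposes the operator tag torchao expects.
import torch

tag_enum = getattr(torch._C, "Tag", None)
print("torch version:", torch.__version__)
print("needs_fixed_stride_order available:",
      tag_enum is not None and hasattr(tag_enum, "needs_fixed_stride_order"))

If this prints False, aligning the torch and torchao versions (or keeping torchao out of the conversion environment) is one plausible way to unblock the conversion; that suggestion is an inference from the traceback, not something shown in the log.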