Name and Version
llama-server --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3060, compute capability 8.6, VMM: yes
version: 5199 (ced44be)
built with MSVC 19.41.34120.0 for x64
Operating systems
Windows 11
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server -m Qwen3-14B-Q5_K_M.gguf
Problem description & steps to reproduce
The param enable_thinking: false has no effect at all on llama-server when you send it in a request (despite being used in the Alibaba examples).
SGLang and vLLM support this via "chat_template_kwargs":
https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes
https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes
curl http://localhost:30000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "Qwen/Qwen3-8B",
  "messages": [
    {"role": "user", "content": "Give me a short introduction to large language models."}
  ],
  "temperature": 0.7,
  "top_p": 0.8,
  "top_k": 20,
  "max_tokens": 8192,
  "presence_penalty": 1.5,
  "chat_template_kwargs": {"enable_thinking": false}
}'
First Bad Commit
No response