
Misc. bug: llama-server webui with --jinja flag does not show thinking when using reasoning models #14007

Open
@littlett77

Description


Name and Version

./build/bin/llama-cli --version
version: 5586 (3ac6753) built with clang version 20.1.4 for aarch64-unknown-linux-android24

Tested in the Chrome browser on Android.

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Command line

./build/bin/llama-server --model ~/_memorycard/_y/Qwen3-1.7B-Q8_K_XL.gguf --temp 0.6 --top_p 0.95 --top_k 20 --min_p 0.01 --seed 666 -fa --host 0.0.0.0 --no-warmup --jinja

Problem description & steps to reproduce

As per the title: when using the `--jinja` flag with models like Qwen3 or Deepseek-R1-0528-Qwen3, during the thinking phase the three animated dots keep spinning and after a while the token counter updates, but no "thinking" / "thought process" icon or tags appear and none of the reasoning text is shown, as if it never happened. When the reasoning ends, the actual answer starts streaming as usual. Without the `--jinja` flag, the reasoning is displayed normally. When using `llama-cli` instead, the reasoning appears inside the `<think>...</think>` tags as expected.
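For context, my understanding (an assumption, not confirmed from the source) is that with `--jinja` the server parses the `<think>...</think>` span out of the message content on the server side, so a client that only renders the plain content field sees nothing while the model is reasoning. A minimal sketch of the tag splitting a client would otherwise do itself, using a hypothetical helper name:

```python
import re

# Hypothetical helper (not part of llama.cpp): split a raw completion into
# the reasoning inside <think>...</think> and the final answer, matching
# how the raw llama-cli output looks when the tags are left in the text.
def split_reasoning(text: str) -> tuple[str, str]:
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        # No think tags present: everything is the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

raw = "<think>The user asks 2+2. That is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(raw)
print(reasoning)  # The user asks 2+2. That is 4.
print(answer)     # The answer is 4.
```

If the server has already stripped the tags, there is nothing left in the content for the webui to split, which would match the observed behavior.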

First Bad Commit

No response

Relevant log output
