Open
Description
I am using llama.cpp b6765 to run GLM-4.6 Q4_0 (from https://huggingface.co/bartowski/zai-org_GLM-4.6-GGUF) on the CANN platform. Inference is very slow (npu-smi info shows AICore utilization at 0), and tool calls fail.
The compilation command is as follows:
cmake -B build -DGGML_CANN=on -DCMAKE_BUILD_TYPE=release -DUSE_ACL_GRAPH=ON
cmake --build build --config release -j 32
The startup command is as follows:
source /usr/local/Ascend/ascend-toolkit/set_env.sh
export GGML_CANN_ACL_GRAPH=0
build/bin/llama-server \
--model /mnt/1/model/GLM-4.6-GGUF/Q4_0/zai-org_GLM-4.6-Q4_0/zai-org_GLM-4.6-Q4_0-00001-of-00006.gguf \
--host 0.0.0.0 \
--port 1025 \
--ctx-size 204800 \
--parallel 1 \
--no-context-shift \
--gpu-layers -1 \
--alias glm-4.6 \
--jinja \
--no-webui \
--metrics
If 'export GGML_CANN_ACL_GRAPH=1' is set instead, the server aborts with the following error:
CANN error: EE9999: Inner Error!
EE9999: [PID: [881460] 2025-10-15-14:43:16.351.702 Not allow to synchronize captured stream stream_id=2.[FUNC:StreamSynchronize][FILE:api_error.cc][LINE:884]
TraceBack (most recent call last):
rtStreamSynchronize execute failed, reason=[stream is captured][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:53]
synchronize stream failed, runtime result = 107027[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:161]
current device: 0, in function ggml_cann_mul_mat_id_quant at /mnt/0/zzc/llama.cpp-b6765/ggml/src/ggml-cann/aclnn_ops.cpp:3016
aclrtSynchronizeStream(ctx.stream())
Tool calls fail with the following error response:
{"error":{"code":500,"message":"Unknown argument ensure_ascii for function tojson at row 11, column 37:\n{% for tool in tools %}\n{{ tool | tojson(ensure_ascii=False) }}\n ^\n{% endfor %}\n at row 11, column 1:\n{% for tool in tools %}\n{{ tool | tojson(ensure_ascii=False) }}\n^\n{% endfor %}\n at row 10, column 24:\n<tools>\n{% for tool in tools %}\n ^\n{{ tool | tojson(ensure_ascii=False) }}\n at row 10, column 1:\n<tools>\n{% for tool in tools %}\n^\n{{ tool | tojson(ensure_ascii=False) }}\n at row 2, column 17:\n[gMASK]<sop>\n{%- if tools -%}\n ^\n<|system|>\n at row 2, column 1:\n[gMASK]<sop>\n{%- if tools -%}\n^\n<|system|>\n at row 1, column 1:\n[gMASK]<sop>\n^\n{%- if tools -%}\n","type":"server_error"}}
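The response above shows the template engine rejecting the ensure_ascii keyword passed to tojson in the model's chat template. One possible workaround, sketched below and not a confirmed fix, is to save the chat template to a file, strip that keyword argument, and point llama-server at the patched copy. The function names and file names here are hypothetical, and the sketch assumes that plain tojson produces acceptable output for the tool definitions (escaping of non-ASCII characters may differ and should be checked).

```python
import re
from pathlib import Path

# Matches a tojson call whose only argument is ensure_ascii=True/False,
# e.g. "tojson(ensure_ascii=False)".
_TOJSON_ENSURE_ASCII = re.compile(
    r"tojson\(\s*ensure_ascii\s*=\s*(?:True|False)\s*\)"
)


def strip_ensure_ascii(template: str) -> str:
    """Replace every tojson(ensure_ascii=...) with a bare tojson filter."""
    return _TOJSON_ENSURE_ASCII.sub("tojson", template)


def patch_template_file(src: str, dst: str) -> int:
    """Write a patched copy of the template at src to dst.

    Returns the number of tojson calls that were rewritten.
    Both paths are supplied by the caller (hypothetical file names).
    """
    text = Path(src).read_text(encoding="utf-8")
    patched, count = _TOJSON_ENSURE_ASCII.subn("tojson", text)
    Path(dst).write_text(patched, encoding="utf-8")
    return count
```

For example, patch_template_file("glm-4.6.jinja", "glm-4.6-patched.jinja") would write the patched copy, which could then be passed to the server with --jinja --chat-template-file glm-4.6-patched.jinja in place of the template embedded in the GGUF.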