Description
Name and Version
Latest binary build from the Releases page:
llama-b5456-bin-ubuntu-vulkan-x64.zip
$ llama-tts --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = NVIDIA GeForce GTX 1660 SUPER (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 49152 | int dot: 1 | matrix cores: none
version: 5456 (cc74d5be)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
Vulkan
Hardware
System (updated today):
OS: Arch Linux x86_64
Kernel: Linux 6.12.24-1-lts
Shell: bash 5.2.37
WM: dwm (X11)
Terminal: tmux 3.5a
CPU: Intel(R) Core(TM) i7-4790 (8) @ 3.60 GHz
GPU: NVIDIA GeForce GTX 1660 SUPER [Discrete]
Memory: 2.83 GiB / 15.56 GiB (18%)
Locale: en_US.UTF-8
Models
--tts-oute-default
(OuteTTS-0.2-500M)
Problem description & steps to reproduce
This works:
$ llama-tts --tts-oute-default -p "The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance" && aplay output.wav

This doesn't (and aborts):
$ llama-tts --tts-oute-default -p "The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide range of hardware" && aplay output.wav

The only difference is the extra text at the end of the prompt:
"on a wide range of hardware"
First Bad Commit
No response
Relevant log output
|1750|><|875|><|933|><|1595|><|1406|><|861|><|437|><|747|><|1542|><|639|><|607|><|1308|><|1427|><|1141|><|1450|><|1304|><|1492|><|1656|>'
main: codes audio size: 544
/home/runner/work/llama.cpp/llama.cpp/src/llama-context.cpp:897: GGML_ASSERT((cparams.causal_attn || cparams.n_ubatch >= n_tokens_all) && "non-causal attention requires n_ubatch >= n_tokens") failed
[New LWP 260885]
[New LWP 260884]
[New LWP 260883]
[New LWP 260882]
[New LWP 260879]
[New LWP 260877]
[New LWP 260876]
[New LWP 260875]
[New LWP 260867]
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
Function(s) ^std::(move|forward|as_const|(__)?addressof) will be skipped when stepping.
Function(s) ^std::(shared|unique)_ptr<.*>::(get|operator) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|(forward_)?list|(unordered_|flat_)?(multi)?(map|set)|span)<.*>::(c?r?(begin|end)|front|back|data|size|empty) will be skipped when stepping.
Function(s) ^std::(basic_string|vector|array|deque|span)<.*>::operator.] will be skipped when stepping.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x0000789957280e22 in ?? () from /usr/lib/libc.so.6
#0 0x0000789957280e22 in ?? () from /usr/lib/libc.so.6
#1 0x0000789957274fda in ?? () from /usr/lib/libc.so.6
#2 0x0000789957275024 in ?? () from /usr/lib/libc.so.6
#3 0x00007899572e592f in wait4 () from /usr/lib/libc.so.6
#4 0x00007899578b7f6d in ggml_abort () from /home/kuro/Exec5/libggml-base.so
#5 0x0000789957a03c17 in llama_context::decode(llama_batch&) () from /home/kuro/Exec5/libllama.so
#6 0x0000789957a03d78 in llama_decode () from /home/kuro/Exec5/libllama.so
#7 0x0000568382fde354 in main ()
[Inferior 1 (process 260866) detached]
Aborted (core dumped)