Open
Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
llama-bench is missing many useful flags that are present in batched-bench, along with the metrics that batched-bench reports. Can we add these to llama-bench as well? For reference, the batched-bench output columns are:
PP - prompt tokens per batch
TG - generated tokens per batch
B - number of batches
T_PP - prompt processing time (i.e. time to first token)
S_PP - prompt processing speed ((B*PP)/T_PP or PP/T_PP)
T_TG - time to generate all batches
S_TG - text generation speed ((B*TG)/T_TG)
T - total time
S - total speed (i.e. all tokens / total time)
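
To make the requested columns concrete, here is a rough Python sketch of how the derived values could be computed from the raw measurements. It is not taken from llama-bench or batched-bench: the `BatchedRun` class and its `shared_prompt` field are hypothetical names used only for illustration, and it assumes that T is simply T_PP + T_TG and that the PP/T_PP variant of S_PP applies when the prompt is processed once and shared across batches.

```python
from dataclasses import dataclass


@dataclass
class BatchedRun:
    """Raw measurements for one benchmark row (hypothetical container)."""
    pp: int        # PP: prompt tokens per batch
    tg: int        # TG: generated tokens per batch
    b: int         # B: number of batches
    t_pp: float    # T_PP: prompt processing time, in seconds
    t_tg: float    # T_TG: time to generate all batches, in seconds
    shared_prompt: bool = False  # assumption: prompt processed once for all batches

    @property
    def prompt_tokens(self) -> int:
        # PP tokens if the prompt is shared, otherwise B*PP tokens.
        return self.pp if self.shared_prompt else self.b * self.pp

    @property
    def s_pp(self) -> float:
        # S_PP: (B*PP)/T_PP, or PP/T_PP for a shared prompt.
        return self.prompt_tokens / self.t_pp

    @property
    def s_tg(self) -> float:
        # S_TG: (B*TG)/T_TG.
        return (self.b * self.tg) / self.t_tg

    @property
    def t(self) -> float:
        # T: total time, assumed here to be T_PP + T_TG.
        return self.t_pp + self.t_tg

    @property
    def s(self) -> float:
        # S: all tokens (prompt plus generated) divided by total time.
        return (self.prompt_tokens + self.b * self.tg) / self.t


if __name__ == "__main__":
    run = BatchedRun(pp=128, tg=128, b=4, t_pp=0.5, t_tg=4.0)
    print(f"S_PP={run.s_pp:.2f} t/s  S_TG={run.s_tg:.2f} t/s  "
          f"T={run.t:.2f} s  S={run.s:.2f} t/s")
```

The timings in the example are made-up inputs, not benchmark results.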
Motivation
These additions would let llama-bench report more detailed metrics, such as time to first token and throughput across multiple batches.
Possible Implementation
No response