Misc. bug: --cache-reuse no longer seems to be caching prompt prefixes #14113

Description

Name and Version

Affected:
Version at commit: b7a1746

Not affected:
Version at commit: c6a2c9e

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-server

Problem description & steps to reproduce

I had originally opened this bug in oobabooga/text-generation-webui: oobabooga/text-generation-webui#7060

The issue is that prompt prefixes were no longer being reused for subsequent requests.

I confirmed that --cache-reuse 1 was being passed through to llama-server, so that wasn't the issue.
After reverting to the previous version of the WebUI (which ships an older version of llama.cpp), prompt prefixes were cached again.

So this seems to point to a bug in llama.cpp.
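
To check whether the prefix is actually being reused, the prompt-processing token counts reported by llama-server can be compared across two requests that share a long prefix. Below is a rough sketch, assuming a local llama-server at http://127.0.0.1:8080 started with --cache-reuse 1, and that the timings.prompt_n field of the /completion response reports how many prompt tokens were actually evaluated (the URL and prompt text are placeholders):

```python
# Rough repro sketch: send two requests that share a long prefix and
# compare how many prompt tokens the server reports processing.
import requests

URL = "http://127.0.0.1:8080/completion"  # assumed local llama-server
PREFIX = "You are a helpful assistant.\n\n" + "Background: lorem ipsum. " * 200

def prompt_tokens_processed(question: str) -> int:
    resp = requests.post(URL, json={
        "prompt": PREFIX + question,
        "n_predict": 16,
        "cache_prompt": True,  # keep the evaluated prompt in the KV cache
    })
    resp.raise_for_status()
    # timings.prompt_n: number of prompt tokens evaluated for this request
    return resp.json()["timings"]["prompt_n"]

first = prompt_tokens_processed("Question 1?")
second = prompt_tokens_processed("Question 2?")
print(f"request 1 processed {first} prompt tokens")
print(f"request 2 processed {second} prompt tokens")
```

If prefix caching is working, the second request should report far fewer processed prompt tokens than the first, since the shared PREFIX is served from the KV cache; with the regression described above, both requests report roughly the full prompt length.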

First Bad Commit

It looks like a commit between c6a2c9e and b7a1746 either broke --cache-reuse or changed its behavior.


EDIT - still broken as of today's commit: 2bb0467


EDIT 2 - someone seems to have reported the same thing here: #13576 (comment) (although that PR is not yet merged)

There was another recent change here that might be related: 06a92a1
