Name                                          Modified    Size
llamacpp_for_kobold.zip                       2023-03-22  740.0 kB
llamacpp-for-kobold-1.0.3 source code.tar.gz  2023-03-22  2.4 MB
llamacpp-for-kobold-1.0.3 source code.zip     2023-03-22  2.4 MB
README.md                                     2023-03-22  791 Bytes

Totals: 4 items, 5.5 MB, 0 downloads/week

llamacpp-for-kobold-1.0.3

  • Applied the massive refactor from the parent repo. It was a huge pain, but I managed to keep the old tokenizer untouched and retained full support for the original model formats.
  • Greatly reduced the default batch sizes, as large batches were causing bad output and high memory usage.
  • Added support for dynamic context lengths sent from the client.
  • TavernAI is working, although I wouldn't recommend it: it spams the server with multiple requests carrying huge contexts, so you're going to have a very painful time getting responses.

Weights are not included. To use, download, extract, and run (the default port is 5001): llama_for_kobold.py [ggml_quant_model.bin] [port]

and then you can connect like this (or use the full KoboldAI client): http://localhost:5001
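The steps above can be sketched as a small launch script. This is a minimal, hedged example: the model filename is the placeholder from the usage line (you must supply your own GGML-quantized weights, which are not included), and the actual server launch is left commented out since it requires the extracted release.

```shell
# Sketch of launching llamacpp-for-kobold, assuming the zip has been extracted.
# ggml_quant_model.bin is a placeholder -- substitute your own model file.
MODEL=${1:-ggml_quant_model.bin}   # path to a GGML-quantized model
PORT=${2:-5001}                    # 5001 is the default port

# Uncomment once the release is extracted and a model is in place:
# python llama_for_kobold.py "$MODEL" "$PORT"

# Then connect a browser (or the full KoboldAI client) to this address:
echo "http://localhost:${PORT}"
```

Running the script with no arguments prints http://localhost:5001, matching the default port noted above.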

Source: README.md, updated 2023-03-22