Name                                        Modified     Size
koboldcpp.exe                               2023-08-24   285.3 MB
koboldcpp_nocuda.exe                        2023-08-24   22.8 MB
koboldcpp-1.41 (beta) source code.tar.gz    2023-08-24   10.6 MB
koboldcpp-1.41 (beta) source code.zip       2023-08-24   10.7 MB
README.md                                   2023-08-24   1.5 kB
Totals: 5 items                                          329.3 MB

koboldcpp-1.41 (beta)

It's been a while since the last release, and quite a lot has changed upstream under the hood, so consider this release a beta.

  • Added support for LLAMA GGUF models, handled automatically. All older models will continue to work normally. Note that GGUF support for other, non-llama architectures has not been added yet.
  • Added a --config flag to load a .kcpps settings file when launching from the command line (credits: @poppeman); these files can also be imported and exported from the GUI. A hedged launch sketch follows this list.
  • Added a new endpoint, /api/extra/tokencount, which can be used to tokenize any string and accurately measure how many tokens it contains (see the request sketch after this list).
  • Fix for bell characters occasionally causing the terminal to beep in debug mode.
  • Fix for the GUI displaying an incorrect or incomplete list of backends.
  • Set MMQ as the default for CUDA when running from the GUI.
  • Updated Lite and merged all the improvements and fixes from upstream.
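
As a rough illustration of the new --config flag, here is a minimal Python launch sketch. The .kcpps file format and the settings keys shown are assumptions made purely for illustration (they are not documented in this README); export a file from the GUI to see the real schema.

    # Hypothetical sketch: write a settings file, then launch koboldcpp with it.
    # The JSON format and the "model"/"port" keys are assumptions, not taken
    # from this README; export a .kcpps file from the GUI for the real schema.
    import json
    import subprocess

    settings = {"model": "models/example.gguf", "port": 5001}  # illustrative keys
    with open("my_settings.kcpps", "w") as f:
        json.dump(settings, f)

    # Launch with the saved settings instead of individual command-line flags.
    subprocess.run(["koboldcpp.exe", "--config", "my_settings.kcpps"])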
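
Below is a minimal request sketch for the new token-counting endpoint, using only the Python standard library. It assumes the endpoint accepts a POST with a JSON body containing a "prompt" field and returns a small JSON object with the count; treat the exact field names as assumptions rather than documented API.

    # Count tokens in a string via a running KoboldCpp server.
    # The "prompt" request field and the shape of the response are assumptions.
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:5001/api/extra/tokencount",
        data=json.dumps({"prompt": "Hello, world!"}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))  # e.g. a JSON object holding the token count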

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
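
Once a model is loaded, a quick way to verify the connection from code is a plain HTTP request to the KoboldAI-compatible API. The sketch below assumes the usual /api/v1/generate route with "prompt" and "max_length" parameters; run with --help or check the KoboldAI API documentation for the authoritative interface.

    # Minimal generation request against the local server; the endpoint and
    # parameter names follow the KoboldAI API but should be treated as
    # assumptions here, not guarantees from this README.
    import json
    import urllib.request

    payload = {"prompt": "Once upon a time", "max_length": 50}
    req = urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp))  # generated text in the JSON response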

For more information, be sure to run the program from command line with the --help flag.

Source: README.md, updated 2023-08-24