| Name | Modified | Size |
| ---- | -------- | ---- |
| koboldcpp_nocuda.exe | 2023-10-08 | 23.7 MB |
| koboldcpp.exe | 2023-10-08 | 286.5 MB |
| koboldcpp-1.46.1 source code.tar.gz | 2023-10-08 | 13.3 MB |
| koboldcpp-1.46.1 source code.zip | 2023-10-08 | 13.5 MB |
| README.md | 2023-10-08 | 2.8 kB |

Totals: 5 items, 337.0 MB

koboldcpp-1.46.1

Important: Deprecation Notice for KoboldCpp 1.46

  • The following command line arguments are deprecated and have been removed from this version onward:

    - --psutil_set_threads - removed as it's now generally unhelpful; the defaults are usually sufficient.
    - --stream - a Kobold Lite only parameter, which is now a toggle saved inside Lite's settings and thus no longer necessary.
    - --unbantokens - EOS unbans should only be set via the generate API, in the use_default_badwordsids JSON field.
    - --usemirostat - Mirostat values should only be set via the generate API, in the mirostat, mirostat_tau and mirostat_eta JSON fields.

  • Removed the original deprecated tkinter GUI; only the new customtkinter GUI remains.
  • Improved embedded horde worker: added even more session stats, and job pulls and job submits are now done in parallel, so it should run about 20% faster for horde requests.
  • Changed the default model name from concedo/koboldcpp to koboldcpp/[model_filename]. This does prevent old "Kobold AI-Client" users from connecting via the API, so if you're still using that, either switch to a newer client or connect via the Basic/OpenAI API instead of the Kobold API.
  • Added proper API documentation, which can be found by navigating to /api, or on the web at https://lite.koboldai.net/koboldcpp_api
  • Allowed .kcpps files to be drag & dropped, and to work via Open With on Windows.
  • Added a new OpenAI Chat Completions compatible endpoint at /v1/chat/completions (credit: @teddybear082)
  • --onready processes are now started with subprocess.run instead of Popen (https://github.com/LostRuins/koboldcpp/pull/462)
  • Both /check and /abort can now function together with multiuser mode, provided the correct genkey is used by the client (automatically handled in Lite).
  • Allow 64k --contextsize (for GGUF only, still 16k otherwise).
  • Minor UI fixes and enhancements.
  • Updated Lite, pulled fixes and improvements from upstream.
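For clients migrating off the removed --unbantokens and --usemirostat flags, the equivalent settings are sent per request in the generate API. A minimal sketch: the field names (use_default_badwordsids, mirostat, mirostat_tau, mirostat_eta) come from the notice above, but the /api/v1/generate path and the rest of the request shape are assumptions based on the usual KoboldAI generate API.

```python
import json
from urllib import request

# Per-request sampler settings replacing the removed CLI flags.
# Only the four replacement field names are from the release notes;
# prompt/max_length and the endpoint path are assumed conventions.
payload = {
    "prompt": "Once upon a time",
    "max_length": 80,
    "use_default_badwordsids": False,  # replaces --unbantokens (EOS unbanned)
    "mirostat": 2,                     # replaces --usemirostat
    "mirostat_tau": 5.0,
    "mirostat_eta": 0.1,
}

def generate(host="http://localhost:5001"):
    """POST the payload to a running KoboldCpp instance (path assumed)."""
    req = request.Request(
        host + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

Call generate() only against a running instance; the payload itself is what replaces the old flags.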

v1.46.1 hotfix: fixed an issue where blasthreads was used for values between 1 and 32 tokens.
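The new /v1/chat/completions endpoint listed above can be exercised like this. The path is from the release notes; the request body follows the standard OpenAI Chat Completions schema, which this sketch assumes the compatibility endpoint accepts, and the model name is hypothetical.

```python
import json
from urllib import request

# Standard OpenAI-style chat request body (assumed to be what the
# compatibility endpoint accepts); the model name is hypothetical.
chat_body = {
    "model": "koboldcpp/my-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello."},
    ],
    "max_tokens": 32,
}

def chat(host="http://localhost:5001"):
    """Send a chat completion request to a running KoboldCpp instance."""
    req = request.Request(
        host + "/v1/chat/completions",
        data=json.dumps(chat_body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the body is the standard OpenAI shape, existing OpenAI client libraries pointed at http://localhost:5001 should also work against this endpoint.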

To use, download and run koboldcpp.exe, which is a one-file pyinstaller. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you're using AMD, you can try the koboldcpp_rocm build from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
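If you connect with a custom client in multiuser mode, the changelog notes that /check and /abort only work when the client supplies the same genkey as the generate request. A sketch of that pairing, assuming the usual KoboldCpp paths (/api/v1/generate and /api/extra/generate/check) and a made-up key:

```python
import json
from urllib import request

HOST = "http://localhost:5001"
GENKEY = "KCPP1234"  # hypothetical; any unique string chosen by the client

def post(path, body):
    """POST a JSON body to a running KoboldCpp instance (paths assumed)."""
    req = request.Request(
        HOST + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# The generate request and the later /check poll carry the same genkey,
# so in multiuser mode the server can match the poll to this generation.
gen_body = {"prompt": "Hello", "max_length": 40, "genkey": GENKEY}
check_body = {"genkey": GENKEY}

# Against a running server:
#   post("/api/v1/generate", gen_body)             # start generation
#   post("/api/extra/generate/check", check_body)  # poll its progress
```

Kobold Lite handles the genkey automatically; only custom clients need to manage it themselves.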

For more information, be sure to run the program from command line with the --help flag.

Source: README.md, updated 2023-10-08