Name                                  Modified     Size
koboldcpp_nocuda.exe                  2023-08-30   22.9 MB
koboldcpp.exe                         2023-08-30   285.4 MB
koboldcpp-1.42.1 source code.tar.gz   2023-08-30   10.6 MB
koboldcpp-1.42.1 source code.zip      2023-08-30   10.7 MB
README.md                             2023-08-30   1.9 kB
Totals: 5 items   329.7 MB   0 downloads / week

koboldcpp-1.42.1

  • Added support for LLaMA GGUFv2 models, handled automatically. All older models will continue to work normally.
  • Fixed a problem with certain logit values that were causing segfaults when using the Typical sampler. Please let me know if it happens again.
  • Merged ROCm support from @YellowRoseCx, so you should now be able to create AMD-compatible GPU builds with hipBLAS, which should be faster than using CLBlast.
  • Merged upstream support for GGUF Falcon models. Note that GPU layer offload for Falcon is unavailable with --useclblast but works with CUDA. Older pre-GGUF Falcon models are not supported.
  • Added support for unbanning EOS tokens directly from the API, and by extension it can now be triggered from the Lite UI settings (a request sketch follows this list). Note: the command line --unbantokens flag will force-override this setting.
  • Added support for automatic RoPE scale calculation based on a model's training context (n_ctx_train); this triggers if you do not explicitly specify a --ropeconfig. For example, this means llama2 models will (by default) use a smaller rope scale than llama1 models for the same specified --contextsize. Setting --ropeconfig overrides this (a worked example follows this list). Reverted in 1.42.1 for now, as it was not set up correctly.
  • Updated Kobold Lite, now with tavern-style portraits in Aesthetic Instruct mode.
  • Pulled other fixes and improvements from upstream.
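
The per-request EOS unban can be exercised over the HTTP API. Below is a minimal sketch in Python; the /api/v1/generate endpoint and the use_default_badwordsids field come from the KoboldAI API that KoboldCpp emulates, so treat both names as assumptions and verify them against your version's API documentation:

    # Sketch: ask a local KoboldCpp server (default port 5001) to generate
    # text with EOS tokens unbanned, so generation can stop naturally.
    import json
    import urllib.request

    payload = {
        "prompt": "Write a one-sentence story.",
        "max_length": 64,
        # False = do not ban EOS tokens for this request. If the server was
        # launched with --unbantokens, that flag force-overrides this field.
        "use_default_badwordsids": False,
    }
    req = urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["results"][0]["text"])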
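
To make the automatic RoPE scaling concrete, here is a worked sketch. The exact heuristic is internal to KoboldCpp (and was reverted in 1.42.1), so the formula below is only an assumption illustrating the idea: the further the requested context exceeds n_ctx_train, the more scaling is applied, which is why a llama2 model (trained at a longer context) ends up with a smaller scale than a llama1 model for the same --contextsize:

    # Sketch only: NOT KoboldCpp's actual formula (which was reverted in
    # 1.42.1). Illustrates scaling derived from the training context.
    def auto_rope_scale(n_ctx_train: int, context_size: int) -> float:
        """Return how far the requested context stretches past the
        trained context; 1.0 means no RoPE scaling is needed."""
        return max(1.0, context_size / n_ctx_train)

    print(auto_rope_scale(2048, 8192))  # llama1-style model: 4.0x scaling
    print(auto_rope_scale(4096, 8192))  # llama2-style model: 2.0x scaling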

To use, download and run koboldcpp.exe, which is a single-file PyInstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI; a sample invocation follows below. Once the model is loaded, you can connect at http://localhost:5001, or point the full KoboldAI client at that address.
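
A sample invocation, as a sketch only: model.gguf is a placeholder filename, --contextsize and --useclblast are flags mentioned in the notes above, and --help lists the exact parameters your build accepts.

    koboldcpp.exe model.gguf --contextsize 4096 --useclblast 0 0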

For more information, run the program from the command line with the --help flag.

Source: README.md, updated 2023-08-30