| Name | Modified | Size |
|---|---|---|
| koboldcpp_nocuda.exe | 2023-08-30 | 22.9 MB |
| koboldcpp.exe | 2023-08-30 | 285.4 MB |
| koboldcpp-1.42.1 source code.tar.gz | 2023-08-30 | 10.6 MB |
| koboldcpp-1.42.1 source code.zip | 2023-08-30 | 10.7 MB |
| README.md | 2023-08-30 | 1.9 kB |
koboldcpp-1.42.1
- Added support for LLAMA GGUFv2 models, handled automatically. All older models will continue to work normally.
- Fixed a problem with certain logit values that were causing segfaults when using the Typical sampler. Please let me know if it happens again.
- Merged ROCm support from @YellowRoseCx, so you should now be able to make AMD-compatible GPU builds with hipBLAS, which should be faster than using CLBlast.
- Merged upstream support for GGUF Falcon models. Note that GPU layer offload for Falcon is unavailable with `--useclblast`, but works with CUDA. Older pre-GGUF Falcon models are not supported.
- Added support for unbanning EOS tokens directly from the API, and by extension it can now be triggered from the Lite UI settings. Note: your command line `--unbantokens` flag will force-override this. See the API sketch at the end of these notes.
- Added support for automatic rope scale calculations based on a model's training context (n_ctx_train), triggered if you do not explicitly specify a `--ropeconfig` (reverted in 1.42.1 for now, it was not set up correctly). For example, this means llama2 models will by default use a smaller rope scale than llama1 models for the same specified `--contextsize`. Setting `--ropeconfig` will override this. See the sketch after this list.
- Updated Kobold Lite, now with Tavern-style portraits in Aesthetic Instruct mode.
- Pulled other fixes and improvements from upstream.
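For illustration, here is a minimal sketch of the automatic rope scale idea described above, assuming a simple linear rule of n_ctx_train / n_ctx. The function name and the exact formula are assumptions for this example, not koboldcpp's actual (and since reverted) implementation:

```python
# Hypothetical sketch of automatic rope scaling; not koboldcpp's actual code.
# Assumption: when the requested context exceeds the model's training context
# (n_ctx_train), a linear rope scale of n_ctx_train / n_ctx is applied.

def auto_rope_scale(n_ctx_train: int, n_ctx: int) -> float:
    """Return a linear rope scale factor for the requested context size."""
    if n_ctx <= n_ctx_train:
        return 1.0  # fits within the training context, no scaling needed
    return n_ctx_train / n_ctx

# A model trained at 2048 (llama1) stretched to 4096 gets scaled by 0.5,
# while a model trained at 4096 (llama2) needs no stretching at 4096.
print(auto_rope_scale(2048, 4096))  # 0.5
print(auto_rope_scale(4096, 4096))  # 1.0
```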
To use, download and run koboldcpp.exe, which is a one-file PyInstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Then, once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
For more information, be sure to run the program from the command line with the --help flag.
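As a quick connectivity check, here is a minimal sketch that sends a prompt to the server's KoboldAI-compatible generate endpoint. It assumes the server was started with a model loaded (e.g. `koboldcpp.exe --model yourmodel.gguf`) and is listening on the default port 5001; the `use_default_badwordsids` field is assumed here to be the per-request EOS unban toggle mentioned in the notes above.

```python
# Minimal sketch: query a running koboldcpp instance over its
# KoboldAI-compatible HTTP API, using only the Python standard library.
import json
import urllib.request

payload = {
    "prompt": "Once upon a time,",
    "max_length": 64,            # number of tokens to generate
    "temperature": 0.7,
    # Assumption for this example: per-request EOS unban toggle
    # (force-overridden by the --unbantokens command line flag if set).
    "use_default_badwordsids": False,
}

req = urllib.request.Request(
    "http://localhost:5001/api/v1/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# The KoboldAI API returns the generated text under results[0].text.
print(result["results"][0]["text"])
```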