| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| koboldcpp_nocuda.exe | 2024-08-31 | 63.0 MB | |
| koboldcpp_cu12.exe | 2024-08-31 | 593.6 MB | |
| koboldcpp.exe | 2024-08-31 | 475.2 MB | |
| koboldcpp-mac-arm64 | 2024-08-31 | 26.8 MB | |
| koboldcpp-linux-x64-nocuda | 2024-08-31 | 71.6 MB | |
| koboldcpp-linux-x64-cuda1210 | 2024-08-31 | 676.8 MB | |
| koboldcpp-linux-x64-cuda1150 | 2024-08-31 | 589.2 MB | |
| koboldcpp_oldcpu.exe | 2024-08-31 | 467.3 MB | |
| koboldcpp-1.74 source code.tar.gz | 2024-08-30 | 28.5 MB | |
| koboldcpp-1.74 source code.zip | 2024-08-30 | 29.0 MB | |
| README.md | 2024-08-30 | 2.3 kB | |
| Totals: 11 Items | | 3.0 GB | 0 |
koboldcpp-1.74
Kobo's all grown up now
- NEW: Added the XTC (Exclude Top Choices) sampler, a brand new creative writing sampler designed by the author of DRY (@p-e-w). To use it, increase `xtc_probability` above 0 (recommended values to try: `xtc_threshold=0.15`, `xtc_probability=0.5`)
- Added automatic image resizing and letterboxing for llava/minicpm images, which should improve handling of oddly-sized images.
- Added a new flag `--nomodel`, which allows launching the Lite WebUI without loading any model at all. You can then select an external API provider such as Horde, Gemini or OpenAI.
- MacOS now defaults to full offload when `-1` gpulayers is selected.
- Minor tweaks to context shifting thresholds.
- The Horde Worker now has a 5 minute timeout for each request, which should reduce the likelihood of getting stuck (e.g. from internet issues). The Horde Worker also now supports connecting to SSL-secured Kcpp instances (remember to enable `--nocertify` if using self-signed certs).
- Updated Kobold Lite with multiple fixes and improvements.
- Merged fixes and improvements from upstream (plus Llama-3.1-Minitron-4B-Width support)
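To illustrate the XTC sampler mentioned above: with probability `xtc_probability`, it removes every "top choice" token whose probability is at or above `xtc_threshold`, except the least likely of them, so sampling falls through to less obvious continuations. The sketch below is a rough Python rendering of that idea for intuition only, not koboldcpp's actual implementation:

```python
import random

def xtc_filter(probs, xtc_threshold=0.15, xtc_probability=0.5, rng=random):
    """Toy sketch of XTC (Exclude Top Choices) over a probability list.

    With probability xtc_probability, zero out all tokens whose
    probability is >= xtc_threshold except the least likely of them,
    then renormalize. Otherwise leave the distribution unchanged.
    """
    if rng.random() >= xtc_probability:
        return probs  # sampler did not trigger on this step
    # indices of the "top choices" at or above the threshold
    top = [i for i, p in enumerate(probs) if p >= xtc_threshold]
    if len(top) < 2:
        return probs  # need at least two candidates before excluding any
    # keep only the least probable of the top choices
    keep = min(top, key=lambda i: probs[i])
    filtered = [0.0 if (i in top and i != keep) else p
                for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]
```

With the recommended settings, a distribution like `[0.5, 0.3, 0.1, 0.1]` has two top choices (0.5 and 0.3); when XTC triggers, the 0.5 token is excluded and the rest is renormalized, which is why it is pitched as a creativity sampler.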
To use, download and run koboldcpp.exe, which is a one-file pyinstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you have an Nvidia GPU but an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe. If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster). If you're using Linux, select the appropriate Linux binary instead (not an exe). If you're on a modern MacOS (M1, M2, M3), you can try the koboldcpp-mac-arm64 MacOS binary. If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork here
Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once a model is loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001
For more information, be sure to run the program from command line with the --help flag.
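Besides the browser UI, a running instance can be queried programmatically. The sketch below assumes the KoboldAI-compatible `/api/v1/generate` endpoint and its `prompt`/`max_length` fields; field names and endpoints can vary between versions, so verify against your build's docs before relying on them:

```python
import json
import urllib.request

def build_generate_request(prompt, max_length=80):
    """Build the JSON body for a text generation request.

    Field names here (prompt, max_length) are assumed from the
    KoboldAI-style API; check your koboldcpp version's documentation.
    """
    return {"prompt": prompt, "max_length": max_length}

def generate(prompt, max_length=80, base_url="http://localhost:5001"):
    """Send a generation request to a locally running koboldcpp instance."""
    payload = json.dumps(build_generate_request(prompt, max_length)).encode()
    req = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]
```

Calling `generate("Once upon a time")` against the default port 5001 would return the model's continuation as a string, assuming the response shape above.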