| Name | Modified | Size |
|---|---|---|
| koboldcpp_ggml_tools_25may.zip | 2024-05-25 | 11.6 MB |
| koboldcpp_nocuda.exe | 2024-05-25 | 61.0 MB |
| koboldcpp_cu12.exe | 2024-05-25 | 479.2 MB |
| koboldcpp.exe | 2024-05-25 | 346.1 MB |
| koboldcpp-linux-x64-nocuda | 2024-05-25 | 68.7 MB |
| koboldcpp-linux-x64-cuda1210 | 2024-05-25 | 494.2 MB |
| koboldcpp-linux-x64-cuda1150 | 2024-05-25 | 426.1 MB |
| koboldcpp-1.66.1 source code.tar.gz | 2024-05-24 | 32.8 MB |
| koboldcpp-1.66.1 source code.zip | 2024-05-24 | 33.1 MB |
| README.md | 2024-05-24 | 3.2 kB |
| Totals: 10 items | | 2.0 GB |
koboldcpp-1.66.1
Phi guess that's the way the cookie crumbles edition
- NEW: Added custom SD LoRA support! Specify it with `--sdlora` and set the LoRA multiplier with `--sdloramult` (see the first example after this list). Note that SD LoRAs can only be used when loading in 16-bit (e.g. with the `.safetensors` model) and will not work on quantized models (so they are incompatible with `--sdquant`).
- NEW: Added custom SD VAE support, which can be specified in the Image Gen tab of the GUI launcher, or using `--sdvae [vae_file.safetensors]`.
- NEW: Added built-in support for TAE SD for SD1.5 and SDXL. This is a very small VAE replacement that can be used if a model has a broken VAE; it also runs faster than a regular VAE. To use it, select the "Fix Bad VAE" checkbox or use the flag `--sdvaeauto`.
- Note: Do not use the above new flags with `--sdconfig`, which is deprecated and should no longer be used.
- NEW: Added experimental support for Rep Pen Slope. This is not a true slope, but the end result is that it applies a slightly reduced rep pen to older tokens within the rep pen range, scaled by the slope value. Setting rep pen slope to 1 negates this effect. For compatibility reasons, rep pen slope defaults to 1 if unspecified (same behavior as before).
- NEW: You can now specify an http/https URL to a GGUF file when passing the `--model` parameter, or in the model selector UI (see the second example after this list). KoboldCpp will attempt to download the model file into your current working directory and automatically load it once the download is done.
- Disabled UI launcher scaling on macOS due to display issues. Please report any further scaling issues.
- Improved EOT token handling, fixed a bug in token speed calculations.
- Default thread count will no longer exceed 8 unless overridden; this helps mitigate e-core issues.
- Merged improvements and fixes from upstream, including new Phi support and Vulkan fixes from @0cc4m.
- Updated Kobold Lite:
  - Now attempts to function correctly if hosted on a subdirectory URL path (e.g. behind a reverse proxy); if that fails, it falls back to the root URL.
  - Changed the default chat mode player name from "You" to "User", which solves some wonky phrasing issues.
  - Added viewport width controls in settings, including horizontal fullscreen.
  - Minor bugfixes for markdown.
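As an illustration, the new image-generation flags can be combined in a single launch command. This is only a sketch: the filenames are placeholders, the multiplier value is arbitrary, and it assumes the image model is loaded via the `--sdmodel` flag. Remember that the LoRA path requires a 16-bit `.safetensors` model without `--sdquant`; the second line instead uses the new built-in TAE SD replacement for a broken VAE.

```
koboldcpp.exe --sdmodel sd15_base.safetensors --sdlora my_style_lora.safetensors --sdloramult 0.7 --sdvae my_vae.safetensors
koboldcpp.exe --sdmodel sd15_base.safetensors --sdvaeauto
```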
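Likewise, a sketch of the new URL-based model loading; the URL below is a placeholder. The file is downloaded into the current working directory and loaded automatically when the download completes.

```
koboldcpp.exe --model https://example.com/models/mymodel.gguf
```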
Fix for 1.66.1 - Fixed the quant tools makefile, fixed SD seed parsing, and updated Lite.
To use, download and run koboldcpp.exe, which is a one-file pyinstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you have a newer Nvidia GPU, you can use the CUDA 12 version, koboldcpp_cu12.exe (much larger, slightly faster). If you're using Linux, select the appropriate Linux binary instead (not an exe). If you're using AMD, you can try the koboldcpp_rocm fork by YellowRoseCx.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once the model is loaded, you can connect with your browser (or with the full KoboldAI client) at:
http://localhost:5001
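For example, a minimal launch with a local model might look like this (the model filename is a placeholder; `--port` is shown explicitly for clarity, though 5001 is the default):

```
koboldcpp.exe --model mymodel.gguf --port 5001
```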
For more information, be sure to run the program from the command line with the `--help` flag.