
Name                                   Modified     Size
koboldcpp_ggml_tools_25may.zip         2024-05-25   11.6 MB
koboldcpp_nocuda.exe                   2024-05-25   61.0 MB
koboldcpp_cu12.exe                     2024-05-25   479.2 MB
koboldcpp.exe                          2024-05-25   346.1 MB
koboldcpp-linux-x64-nocuda             2024-05-25   68.7 MB
koboldcpp-linux-x64-cuda1210           2024-05-25   494.2 MB
koboldcpp-linux-x64-cuda1150           2024-05-25   426.1 MB
koboldcpp-1.66.1 source code.tar.gz    2024-05-24   32.8 MB
koboldcpp-1.66.1 source code.zip       2024-05-24   33.1 MB
README.md                              2024-05-24   3.2 kB
Totals: 10 items, 2.0 GB

koboldcpp-1.66.1

Phi guess that's the way the cookie crumbles edition

  • NEW: Added custom SD LoRA support! Specify it with --sdlora and set the LoRA multiplier with --sdloramult. Note that SD LoRAs can only be used when the model is loaded in 16-bit (e.g. from a .safetensors model) and will not work on quantized models, so they are incompatible with --sdquant.
  • NEW: Added custom SD VAE support, which can be specified in the Image Gen tab of the GUI launcher, or using --sdvae [vae_file.safetensors]
  • NEW: Added built-in support for TAE SD for SD1.5 and SDXL. This is a very small VAE replacement that can be used if a model has a broken VAE. It also runs faster than a regular VAE. To use it, select the "Fix Bad VAE" checkbox or use the flag --sdvaeauto
  • Note: Do not use the new flags above together with --sdconfig, which is deprecated and should no longer be used.
  • NEW: Added experimental support for Rep Pen Slope. This is not a true slope, but the end result is it applies a slightly reduced rep pen for older tokens within the rep pen range, scaled by the slope value. Setting rep pen slope to 1 negates this effect. For compatibility reasons, rep pen slope defaults to 1 if unspecified (same behavior as before).
  • NEW: You can now specify an http/https URL to a GGUF file when passing the --model parameter, or in the model selector UI. KoboldCpp will attempt to download the model file into your current working directory and automatically load it when the download is done.
  • Disabled UI launcher scaling on macOS due to display issues. Please report any further scaling issues.
  • Improved EOT token handling, fixed a bug in token speed calculations.
  • Default thread count will not exceed 8 unless overridden. This helps mitigate e-core issues.
  • Merged improvements and fixes from upstream, including new Phi support and Vulkan fixes from @0cc4m
  • Updated Kobold Lite:
    • Now attempts to function correctly if hosted on a subdirectory URL path (e.g. behind a reverse proxy). If that fails, it falls back to the root URL.
    • Changed default chatmode player name from "You" to "User", which solves some wonky phrasing issues.
    • Added viewport width controls in settings, including horizontal fullscreen.
    • Minor bugfixes for markdown
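
The compatibility notes on the new image generation flags can be sketched as a small pre-launch check. This is an illustration only, not part of KoboldCpp: the helper name check_sd_flags and its messages are hypothetical, and only the flag names and the two rules come from the notes above.

```python
def check_sd_flags(argv):
    """Return warnings for flag combinations the release notes rule out.

    Hypothetical helper for illustration; KoboldCpp does its own
    argument parsing and may report these cases differently.
    """
    flags = {arg for arg in argv if arg.startswith("--")}
    new_flags = {"--sdlora", "--sdloramult", "--sdvae", "--sdvaeauto"}
    problems = []

    # SD LoRAs need a 16-bit model, so they cannot be combined with --sdquant.
    if "--sdlora" in flags and "--sdquant" in flags:
        problems.append("--sdlora does not work with --sdquant")

    # --sdconfig is deprecated and must not be mixed with the new flags.
    used_new = sorted(flags & new_flags)
    if "--sdconfig" in flags and used_new:
        problems.append(
            "--sdconfig is deprecated; do not combine it with " + ", ".join(used_new)
        )

    return problems


if __name__ == "__main__":
    # The file name is a placeholder, not a real model.
    for problem in check_sd_flags(["--sdlora", "style.safetensors", "--sdquant"]):
        print("warning:", problem)
```

An empty list means none of the documented conflicts were found; it does not guarantee that the launch itself will succeed.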

Fix for 1.66.1 - Fixed quant tools makefile, fixed sd seed parsing, updated lite

To use, download and run koboldcpp.exe, which is a single-file PyInstaller executable. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you have a newer Nvidia GPU, you can use the CUDA 12 version, koboldcpp_cu12.exe (much larger, slightly faster). If you're using Linux, select the appropriate Linux binary instead (not an .exe). If you're using an AMD GPU, you can try koboldcpp_rocm from YellowRoseCx's fork.
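
The selection advice above can be written down as a small lookup. This is a rough sketch, not an official installer: the function name pick_binary is hypothetical, the file names are those in this release, and the mapping of Linux "newer GPU" to the cuda1210 build is an assumption made by analogy with the Windows CUDA 12 build.

```python
import platform


def pick_binary(system, want_cuda, newer_gpu=False):
    """Suggest a release file name, or None if no prebuilt binary is listed.

    system: value in the style of platform.system(), e.g. "Windows" or "Linux".
    """
    if system == "Windows":
        if not want_cuda:
            return "koboldcpp_nocuda.exe"
        return "koboldcpp_cu12.exe" if newer_gpu else "koboldcpp.exe"
    if system == "Linux":
        if not want_cuda:
            return "koboldcpp-linux-x64-nocuda"
        # Assumption: the cuda1210 build plays the role of the CUDA 12 build.
        return "koboldcpp-linux-x64-cuda1210" if newer_gpu else "koboldcpp-linux-x64-cuda1150"
    # Other platforms (and AMD/ROCm setups) are not covered by this release's files.
    return None


if __name__ == "__main__":
    print(pick_binary(platform.system(), want_cuda=False))
```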

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI. Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
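
Besides opening that address in a browser, a script can talk to the running server. The sketch below assumes the server exposes a KoboldAI-style /api/v1/generate endpoint that accepts JSON with "prompt" and "max_length" and replies with {"results": [{"text": ...}]}. Neither the endpoint nor the response shape is stated in these notes, so check the server's own API documentation before relying on them. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # default address given in the notes above


def build_generate_request(prompt, max_length=80, base_url=BASE_URL):
    """Build (but do not send) a POST request for text generation.

    Assumption: a KoboldAI-style /api/v1/generate endpoint is available.
    """
    payload = json.dumps({"prompt": prompt, "max_length": max_length}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]
```

With a model loaded, calling generate("Once upon a time") would return the continuation as a string. Building the request separately from sending it makes the payload easy to inspect without a server running.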

For more information, be sure to run the program from command line with the --help flag.

Source: README.md, updated 2024-05-24