| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| koboldcpp_nocuda.exe | 2024-08-31 | 63.0 MB | |
| koboldcpp_cu12.exe | 2024-08-31 | 593.6 MB | |
| koboldcpp.exe | 2024-08-31 | 475.2 MB | |
| koboldcpp-mac-arm64 | 2024-08-31 | 26.8 MB | |
| koboldcpp-linux-x64-nocuda | 2024-08-31 | 71.6 MB | |
| koboldcpp-linux-x64-cuda1210 | 2024-08-31 | 676.8 MB | |
| koboldcpp-linux-x64-cuda1150 | 2024-08-31 | 589.2 MB | |
| koboldcpp_oldcpu.exe | 2024-08-31 | 467.3 MB | |
| koboldcpp-1.74 source code.tar.gz | 2024-08-30 | 28.5 MB | |
| koboldcpp-1.74 source code.zip | 2024-08-30 | 29.0 MB | |
| README.md | 2024-08-30 | 2.3 kB | |
| Totals: 11 Items | | 3.0 GB | 0 |
koboldcpp-1.74
Kobo's all grown up now
- NEW: Added the XTC (Exclude Top Choices) sampler, a brand new creative writing sampler designed by the author of DRY (@p-e-w). To use it, increase `xtc_probability` above 0 (recommended values to try: `xtc_threshold=0.15`, `xtc_probability=0.5`)
- Added automatic image resizing and letterboxing for llava/minicpm images, which should improve handling of oddly-sized images.
- Added a new flag `--nomodel`, which allows launching the Lite WebUI without loading any model at all. You can then select an external API provider such as Horde, Gemini or OpenAI.
- MacOS now defaults to full offload when `-1` gpulayers is selected.
- Minor tweaks to context shifting thresholds.
- The Horde Worker now has a 5 minute timeout for each request, which should reduce the likelihood of getting stuck (e.g. from internet issues). The Horde Worker also now supports connecting to SSL-secured Kcpp instances (remember to enable `--nocertify` if using self-signed certs).
- Updated Kobold Lite with multiple fixes and improvements.
- Merged fixes and improvements from upstream (plus Llama-3.1-Minitron-4B-Width support)
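To illustrate the XTC sampler mentioned above: with probability `xtc_probability`, it removes every "top choice" token whose probability is at or above `xtc_threshold`, except the least likely of them, so sampling falls through to less obvious continuations. The sketch below is a rough Python rendering of that idea for intuition only, not koboldcpp's actual implementation:

```python
import random

def xtc_filter(probs, xtc_threshold=0.15, xtc_probability=0.5, rng=random):
    """Toy sketch of XTC (Exclude Top Choices) over a probability list.

    With probability xtc_probability, zero out all tokens whose
    probability is >= xtc_threshold except the least likely of them,
    then renormalize. Otherwise leave the distribution unchanged.
    """
    if rng.random() >= xtc_probability:
        return probs  # sampler did not trigger on this step
    # indices of the "top choices" at or above the threshold
    top = [i for i, p in enumerate(probs) if p >= xtc_threshold]
    if len(top) < 2:
        return probs  # need at least two candidates before excluding any
    # keep only the least probable of the top choices
    keep = min(top, key=lambda i: probs[i])
    filtered = [0.0 if (i in top and i != keep) else p
                for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]
```

With the recommended settings, a distribution like `[0.5, 0.3, 0.1, 0.1]` has two top choices (0.5 and 0.3); when XTC triggers, the 0.5 token is excluded and the rest is renormalized, which is why it is pitched as a creativity sampler.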
To use, download and run koboldcpp.exe, which is a one-file pyinstaller build. If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller. If you have an Nvidia GPU but an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe. If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster). If you're using Linux, select the appropriate Linux binary instead (not an exe). If you're on a modern MacOS (M1, M2, M3), you can try the koboldcpp-mac-arm64 MacOS binary. If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork here
Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once a model is loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001
For more information, be sure to run the program from command line with the --help flag.
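Besides the browser UI, a running instance can be queried programmatically. The sketch below assumes the KoboldAI-compatible `/api/v1/generate` endpoint and its `prompt`/`max_length` fields; field names and endpoints can vary between versions, so verify against your build's docs before relying on them:

```python
import json
import urllib.request

def build_generate_request(prompt, max_length=80):
    """Build the JSON body for a text generation request.

    Field names here (prompt, max_length) are assumed from the
    KoboldAI-style API; check your koboldcpp version's documentation.
    """
    return {"prompt": prompt, "max_length": max_length}

def generate(prompt, max_length=80, base_url="http://localhost:5001"):
    """Send a generation request to a locally running koboldcpp instance."""
    payload = json.dumps(build_generate_request(prompt, max_length)).encode()
    req = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]
```

Calling `generate("Once upon a time")` against the default port 5001 would return the model's continuation as a string, assuming the response shape above.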