| Name | Modified | Size |
|------|----------|------|
| koboldcpp.exe | 2023-04-05 | 16.4 MB |
| koboldcpp-1.0.9beta source code.tar.gz | 2023-04-05 | 8.5 MB |
| koboldcpp-1.0.9beta source code.zip | 2023-04-05 | 8.6 MB |
| README.md | 2023-04-05 | 1.4 kB |

Totals: 4 items, 33.5 MB

koboldcpp-1.0.9beta

  • Integrated support for GPT-2! This should theoretically also work with Cerebras models, though I have not tried those yet. This is a great way to get started, since you can now try models so tiny that even a potato CPU can run them. Here's a good one to start with: https://huggingface.co/ggerganov/ggml/resolve/main/ggml-model-gpt-2-117M.bin — with it I can generate 100 tokens in a second.
  • Upgraded embedded Kobold Lite to support a Stanford Alpaca compatible Instruct Mode, which can be enabled in settings.
  • Removed all -march=native and -mtune=native flags when building the binary. Compatibility should be more consistent with different devices now.
  • Fixed an incorrect flag name used to trigger the ACCELERATE library for Mac OSX. This should greatly increase performance for OSX users on GPT-J and GPT-2 models, assuming you have ACCELERATE support.
  • Added Rep Pen (repetition penalty) for GPT-J and GPT-2 models, and by extension pyg.cpp; repetition penalty now works similarly to the way it does in llama.cpp.
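
The llama.cpp-style repetition penalty mentioned above can be sketched roughly as follows: logits of recently generated tokens are penalized before sampling, making repeats less likely. The function name and values here are illustrative assumptions, not KoboldCpp's actual code.

```python
def apply_rep_pen(logits, recent_tokens, penalty=1.1):
    """Penalize logits of already-seen tokens, llama.cpp style (sketch).

    Positive logits are divided by the penalty, negative logits are
    multiplied by it, so repeated tokens are always pushed down.
    """
    out = list(logits)
    for tok in set(recent_tokens):
        if out[tok] > 0:
            out[tok] /= penalty  # lower a likely repeat
        else:
            out[tok] *= penalty  # push an unlikely repeat even lower
    return out
```

A penalty of 1.0 leaves logits unchanged; values around 1.1–1.3 are common starting points.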

To use, download and run koboldcpp.exe. Alternatively, drag and drop a compatible ggml model onto the .exe, or run it and manually select the model in the popup dialog.

Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
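
Besides the browser UI, the server speaks the KoboldAI-compatible HTTP API. A minimal sketch of querying it, assuming the standard KoboldAI `/api/v1/generate` endpoint and payload fields (not verified against this specific release):

```python
import json
import urllib.request

def build_payload(prompt, max_length=50):
    # JSON body in the shape the KoboldAI generate endpoint expects
    # (field names are assumptions from the KoboldAI API).
    return json.dumps({"prompt": prompt, "max_length": max_length}).encode()

def generate(prompt, max_length=50, host="http://localhost:5001"):
    # POST the prompt and return the generated continuation text.
    req = urllib.request.Request(
        host + "/api/v1/generate",
        data=build_payload(prompt, max_length),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]
```

Adjust the host and port if you launched the server with different settings.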

Source: README.md, updated 2023-04-05