Name                                               Modified    Size       Downloads / Week
llamacpp-for-kobold.exe                            2023-03-29  16.7 MB
llamacpp-for-kobold-1.0.6-beta source code.tar.gz  2023-03-29  8.8 MB
llamacpp-for-kobold-1.0.6-beta source code.zip     2023-03-29  8.8 MB
README.md                                          2023-03-29  945 Bytes
Totals: 4 Items                                                34.3 MB    0

llamacpp-for-kobold-1.0.6-beta

  • This is an experimental release containing a new OpenBLAS integration, which should more than double initial prompt processing speed on compatible systems!
  • Updated the embedded Kobold Lite to the latest version, which supports pseudo token streaming. This should make the UI feel much more responsive during generation.
  • Switched to argparse; you can view all command-line flags with llamacpp-for-kobold.exe --help
  • To disable OpenBLAS, run with --noblas. Please report any issues with it, and include your specific OS and platform.
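The flag handling described above can be sketched with Python's argparse. This is an illustrative stand-in, not KoboldCpp's actual parser: only the --noblas flag and the optional model file come from these notes; everything else is a hypothetical assumption.

```python
import argparse

# Sketch of an argparse-based CLI like the one described in these notes.
# --noblas and the optional model path are from the release notes; the
# description text and defaults here are hypothetical, not the real parser.
parser = argparse.ArgumentParser(description="llamacpp-for-kobold (sketch)")
parser.add_argument("model", nargs="?", help="path to a quantized llama.cpp model")
parser.add_argument("--noblas", action="store_true", help="disable OpenBLAS acceleration")

args = parser.parse_args(["model.bin", "--noblas"])
print(args.model, args.noblas)  # prints: model.bin True
```

Running with no arguments would instead leave `args.model` as `None`, matching the popup-dialog fallback described below.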

To use it, download and run llamacpp-for-kobold.exe. Alternatively, drag and drop a compatible llama.cpp quantized model onto the .exe, or run it and select the model manually in the popup dialog.

Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
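Besides the browser UI, the server can be driven programmatically. The sketch below builds a generation request with Python's standard library; note that the /api/v1/generate path and the payload fields are assumptions based on the KoboldAI-compatible API, not details stated in these notes.

```python
import json
import urllib.request

# Assumed KoboldAI-compatible endpoint; only the base URL
# http://localhost:5001 appears in the notes above.
url = "http://localhost:5001/api/v1/generate"

# Hypothetical payload fields for a text-generation request.
payload = {"prompt": "Once upon a time", "max_length": 50}
req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

print(req.full_url)  # prints: http://localhost:5001/api/v1/generate
```

Actually sending the request requires a running server: `urllib.request.urlopen(req)` would then return the generated text as JSON.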

Source: README.md, updated 2023-03-29