Manage completion models

Required servers

  • No servers required

Overview

Completion model configurations are stored and can be reused. For simplicity, the term "completion models" is used as a synonym for completion model configurations. Completion models can describe local models (run by llama-vscode) or externally run servers. Each configuration has the following properties: name, local start command (the llama-server command used to start a server with this model locally), ai model (the model name as required by the provider), endpoint, and is key required (whether the provider needs an API key).
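
For illustration, here is what such a configuration might look like in the .json form used for export and import. This is a hypothetical sketch; the field names below are assumptions, not necessarily the exact schema llama-vscode uses:

```json
{
  "name": "Qwen2.5-Coder-1.5B (local)",
  "localStartCommand": "llama-server -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF --port 8012",
  "aiModel": "",
  "endpoint": "http://127.0.0.1:8012",
  "isKeyRequired": false
}
```

For an externally run server, local start command would be left empty and endpoint would point to the external server instead.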

Completion model configurations can be added, deleted, viewed, selected, deselected, added from Hugging Face, exported, and imported.

How to use it

Select "Completion models..." from llama-vscode menu

  • Add local model
    Enter the requested properties.
    Name, local start command, and endpoint are required.
    Use models that support FIM (Fill-In-the-Middle), for example Qwen2.5-Coder-1.5B-Q8_0-GGUF (see the example command after this list).

  • Add external model
    Enter the requested properties.
    Name and endpoint are required.
    Use models that support FIM (Fill-In-the-Middle).

  • Delete models
    Select a model from the list to delete it.

  • View
    Select a model from the list to view all of its details.

  • Select
    Select a model from the list to make it the active one. If the model is local (i.e. it has a command in local start command), a llama.cpp server with this model will be started. Only one completion model can be selected at a time.

  • Deselect
    Deselect the currently selected model. If the model is local, the llama.cpp server will be stopped.

  • Add model from huggingface
    Enter search words to find a model on Hugging Face. When a model is selected, it will be downloaded automatically (if not already downloaded) and a llama.cpp server will be started with it.

  • Add completion model from OpenAI compatible provider
    Add a completion model from an OpenAI-compatible provider: OpenRouter or a custom one (for example, a local or external llama.cpp server).

  • Export
    Select a model to export it as a .json file. This file can be shared with other users, modified if needed, and imported again.

  • Import
    A model can be imported from a .json file. Select a file to import it.
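
For reference, a local start command is an ordinary llama-server invocation. A minimal sketch, assuming llama.cpp is installed and using its -hf option to download a FIM-capable model from Hugging Face (the model and port are example choices):

```sh
# Download the GGUF from Hugging Face (if not already cached) and serve it locally.
# --port determines the endpoint, here http://127.0.0.1:8012
llama-server -hf ggml-org/Qwen2.5-Coder-1.5B-Q8_0-GGUF --port 8012
```

Once the server is running, the endpoint can be verified with llama-server's health route:

```sh
curl http://127.0.0.1:8012/health
```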
