Conversation

@FlynnCruse FlynnCruse commented Dec 11, 2025

Adds an optional local provider so LLM Council can run entirely on Ollama-hosted local models, with basic resource safeguards.

What this PR does
• Introduces PROVIDER=local mode with Ollama-backed models (LOCAL_MODELS, CHAIRMAN_LOCAL_MODEL) and a RAM-aware adaptive resource guard (COUNCIL_MEM_RESERVE_GB, COUNCIL_MAX_PARALLEL_LOCAL, COUNCIL_LOCAL_TIMEOUT_SEC); see the guard sketch after this list.
• Adds backend/local.py and wires council stages + title generation to choose between OpenRouter and local providers.
• Extends openrouter.py with CUSTOM_MODELS support for arbitrary OpenAI-compatible endpoints (e.g. Ollama’s /v1/chat/completions), reusing the same code path; see the endpoint sketch after this list.
• Documents local setup in README.md and a new backend/Local-AI.md with step-by-step instructions for running NVIDIA’s Nemotron models via Ollama on macOS/Linux/Windows.
• Adds scripts/council.sh for quick CLI smoke tests against the council API.
• Adds psutil dependency to estimate system RAM for the local resource guard.
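
For anyone curious what the RAM-aware guard looks like in practice, here is a minimal sketch. The environment variable names come from the list above, but the function, its defaults, and the per-model sizing heuristic are illustrative assumptions rather than the exact code in backend/local.py.

```python
# Illustrative sketch only -- not the actual backend/local.py implementation.
# Env var names match the PR description; the sizing heuristic is assumed.
import os

import psutil  # new dependency in this PR, used to read system RAM


def max_parallel_local_calls(per_model_gb: float = 6.0) -> int:
    """Estimate how many local models can safely run in parallel.

    Reserves COUNCIL_MEM_RESERVE_GB for the OS and other apps, then divides
    the remaining available RAM by a rough per-model footprint, capped by
    COUNCIL_MAX_PARALLEL_LOCAL.
    """
    reserve_gb = float(os.getenv("COUNCIL_MEM_RESERVE_GB", "4"))
    hard_cap = int(os.getenv("COUNCIL_MAX_PARALLEL_LOCAL", "2"))

    available_gb = psutil.virtual_memory().available / 1024**3
    usable_gb = max(available_gb - reserve_gb, 0.0)

    # Always allow at least one call so the council can still respond.
    return max(1, min(hard_cap, int(usable_gb // per_model_gb)))
```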
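
And here is the shape of the request a CUSTOM_MODELS entry ends up sending to Ollama's OpenAI-compatible endpoint. This is a hand-written sketch using httpx against Ollama's default port; the real PR routes the call through the shared openrouter.py code path, and the model name here is just an example.

```python
# Illustrative sketch of a chat completion against Ollama's OpenAI-compatible
# /v1/chat/completions endpoint. The base URL, model name, and timeout default
# are assumptions, not the PR's configuration.
import os

import httpx

OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434/v1")
TIMEOUT_SEC = float(os.getenv("COUNCIL_LOCAL_TIMEOUT_SEC", "120"))


def ask_local_model(prompt: str, model: str = "nemotron-mini") -> str:
    """Send one chat completion request to an Ollama-hosted model."""
    response = httpx.post(
        f"{OLLAMA_BASE_URL}/chat/completions",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=TIMEOUT_SEC,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```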

---

This has not been peer reviewed yet; I welcome feedback! Happy hacking 👍

The provider setup is easily extensible, and it's fun to mix frontier and local models!
