Local with resource guard #93
Open
Adds an optional local provider so LLM Council can run fully on Ollama-hosted local models with basic resource safeguards.
What this PR does
• Introduces a PROVIDER=local mode with Ollama-backed models (LOCAL_MODELS, CHAIRMAN_LOCAL_MODEL) and a RAM-aware adaptive resource guard (COUNCIL_MEM_RESERVE_GB, COUNCIL_MAX_PARALLEL_LOCAL, COUNCIL_LOCAL_TIMEOUT_SEC); a sketch of the guard follows this list.
• Adds backend/local.py and wires the council stages and title generation to choose between the OpenRouter and local providers (see the provider-selection sketch after this list).
• Extends openrouter.py with CUSTOM_MODELS support for arbitrary OpenAI-compatible endpoints (e.g. Ollama’s /v1/chat/completions), reusing the same code path.
• Documents local setup in README.md and a new backend/Local-AI.md with step-by-step instructions for running NVIDIA’s Nemotron models via Ollama on macOS/Linux/Windows.
• Adds scripts/council.sh for quick CLI smoke tests against the council API (a Python equivalent is sketched after this list).
• Adds a psutil dependency to estimate available system RAM for the local resource guard.
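
For context, here is a minimal sketch of how a RAM-aware guard like this could work. It uses the environment variable names from this PR, but the default values and the per-model memory estimate are placeholder assumptions, not the actual logic in backend/local.py:

```python
# Sketch of a RAM-aware resource guard; defaults and the per-request
# memory estimate are illustrative assumptions.
import os

import psutil

# Per-request timeout for local calls (applied to each request elsewhere).
LOCAL_TIMEOUT_SEC = float(os.getenv("COUNCIL_LOCAL_TIMEOUT_SEC", "120"))


def max_parallel_local_calls(est_model_ram_gb: float = 8.0) -> int:
    """Estimate how many local model calls can safely run in parallel."""
    reserve_gb = float(os.getenv("COUNCIL_MEM_RESERVE_GB", "4"))
    hard_cap = int(os.getenv("COUNCIL_MAX_PARALLEL_LOCAL", "2"))

    # Available RAM minus the configured reserve, in GB.
    available_gb = psutil.virtual_memory().available / (1024 ** 3)
    usable_gb = max(available_gb - reserve_gb, 0.0)

    # Respect the configured cap, but always allow at least one call.
    by_ram = int(usable_gb // est_model_ram_gb)
    return max(1, min(hard_cap, by_ram))
```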
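And a rough illustration of the provider switch, calling Ollama through its OpenAI-compatible /v1 endpoint; the model tag, default base URLs, and env-var defaults here are assumptions for the example, not taken verbatim from this PR:

```python
# Sketch of provider selection between OpenRouter and a local Ollama server.
import os

from openai import OpenAI


def get_client() -> OpenAI:
    if os.getenv("PROVIDER", "openrouter").lower() == "local":
        # Ollama exposes an OpenAI-compatible API under /v1; the api_key is ignored.
        return OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    return OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key=os.environ["OPENROUTER_API_KEY"],
    )


client = get_client()
# "nemotron-mini" is an illustrative Ollama tag; use whatever you have pulled.
local_models = os.getenv("LOCAL_MODELS", "nemotron-mini").split(",")
reply = client.chat.completions.create(
    model=local_models[0],
    messages=[{"role": "user", "content": "One-sentence sanity check, please."}],
)
print(reply.choices[0].message.content)
```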
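Finally, a Python equivalent of the kind of smoke test scripts/council.sh runs; the port and the /api/council route are hypothetical placeholders, so adjust them to the real backend:

```python
# Hypothetical smoke test against a locally running council backend.
import requests

resp = requests.post(
    "http://localhost:8000/api/council",  # hypothetical route; substitute the real one
    json={"prompt": "Name one advantage of running models locally."},
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```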
This hasn't been peer reviewed yet, so feedback is welcome! Happy hacking 👍
The setup is easily extensible, and it's fun to mix frontier and local models!