Warn when configured LLM is not frontier-recommended#586
Warn when configured LLM is not frontier-recommended#586devin-ai-integration[bot] wants to merge 3 commits into
Conversation
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
Greptile SummaryThis PR adds a startup
Confidence Score: 4/5Safe to merge; the warning is non-blocking and the normalization logic is verified by the new test suite. The frontier-detection logic is correct for all documented inputs and the tests cover the important cases. The two observations are about implicit coupling between the lowercased candidate strings and the case of RECOMMENDED_MODEL_NAMES constants, and about the bare-name extraction allowing frontier-prefix matches from any provider. Neither breaks current behavior, but both could cause silent failures or missed warnings as the constant lists grow. strix/config/models.py — the _is_recommended_or_frontier_candidate function has an implicit invariant that RECOMMENDED_MODEL_NAMES entries are always lowercase, and the bare-name prefix matching has no provider guard. Important Files Changed
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
strix/config/models.py:191-194
**Implicit case-sensitivity invariant on `RECOMMENDED_MODEL_NAMES`**
`_is_recommended_or_frontier_candidate` receives already-lowercased strings from `_normalized_model_candidates`, but the `in RECOMMENDED_MODEL_NAMES` lookup relies on `RECOMMENDED_MODEL_NAMES` values also being lowercase — this is an implicit coupling. All three current entries happen to be lowercase, so it works today, but adding a mixed-case entry (e.g. `"Vertex_AI/Gemini-3-Pro-Preview"`) would silently break the exact-match path while the `startswith` frontier-prefix path would still pass. A `.lower()` call on each constant in the set (or converting `RECOMMENDED_MODEL_NAMES` to a `frozenset` of lowercased strings) would make this invariant explicit and safe.
### Issue 2 of 2
strix/config/models.py:68-73
**Frontier prefix matching can produce false negatives via bare-name extraction**
`_normalized_model_candidates` generates a second candidate by splitting on the last `/` and taking the right-hand side. Any model whose bare name starts with a frontier prefix will be treated as approved — regardless of provider. For instance, a self-hosted or proxy model named `"custom-ollama/gpt-5-mini-local"` yields bare candidate `"gpt-5-mini-local"`, which starts with `"gpt-5"` and silently suppresses the quality warning. This may be an acceptable trade-off, but worth documenting explicitly (and possibly adding a provider check alongside the prefix check).
Reviews (1): Last reviewed commit: "Warn for non-frontier LLM selections" | Re-trigger Greptile |
| def _is_recommended_or_frontier_candidate(model_name: str) -> bool: | ||
| if model_name in RECOMMENDED_MODEL_NAMES: | ||
| return True | ||
| return any(model_name.startswith(prefix) for prefix in FRONTIER_MODEL_PREFIXES) |
There was a problem hiding this comment.
Implicit case-sensitivity invariant on
RECOMMENDED_MODEL_NAMES
_is_recommended_or_frontier_candidate receives already-lowercased strings from _normalized_model_candidates, but the in RECOMMENDED_MODEL_NAMES lookup relies on RECOMMENDED_MODEL_NAMES values also being lowercase — this is an implicit coupling. All three current entries happen to be lowercase, so it works today, but adding a mixed-case entry (e.g. "Vertex_AI/Gemini-3-Pro-Preview") would silently break the exact-match path while the startswith frontier-prefix path would still pass. A .lower() call on each constant in the set (or converting RECOMMENDED_MODEL_NAMES to a frozenset of lowercased strings) would make this invariant explicit and safe.
Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/config/models.py
Line: 191-194
Comment:
**Implicit case-sensitivity invariant on `RECOMMENDED_MODEL_NAMES`**
`_is_recommended_or_frontier_candidate` receives already-lowercased strings from `_normalized_model_candidates`, but the `in RECOMMENDED_MODEL_NAMES` lookup relies on `RECOMMENDED_MODEL_NAMES` values also being lowercase — this is an implicit coupling. All three current entries happen to be lowercase, so it works today, but adding a mixed-case entry (e.g. `"Vertex_AI/Gemini-3-Pro-Preview"`) would silently break the exact-match path while the `startswith` frontier-prefix path would still pass. A `.lower()` call on each constant in the set (or converting `RECOMMENDED_MODEL_NAMES` to a `frozenset` of lowercased strings) would make this invariant explicit and safe.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Addressed in 8ca83e4 by normalizing recommended-model constants into _RECOMMENDED_MODEL_NAME_SET for case-insensitive exact matches.
| FRONTIER_MODEL_PREFIXES = ( | ||
| "gpt-5", | ||
| "claude-opus-4", | ||
| "claude-sonnet-4", | ||
| "gemini-3", | ||
| ) |
There was a problem hiding this comment.
Frontier prefix matching can produce false negatives via bare-name extraction
_normalized_model_candidates generates a second candidate by splitting on the last / and taking the right-hand side. Any model whose bare name starts with a frontier prefix will be treated as approved — regardless of provider. For instance, a self-hosted or proxy model named "custom-ollama/gpt-5-mini-local" yields bare candidate "gpt-5-mini-local", which starts with "gpt-5" and silently suppresses the quality warning. This may be an acceptable trade-off, but worth documenting explicitly (and possibly adding a provider check alongside the prefix check).
Prompt To Fix With AI
This is a comment left during a code review.
Path: strix/config/models.py
Line: 68-73
Comment:
**Frontier prefix matching can produce false negatives via bare-name extraction**
`_normalized_model_candidates` generates a second candidate by splitting on the last `/` and taking the right-hand side. Any model whose bare name starts with a frontier prefix will be treated as approved — regardless of provider. For instance, a self-hosted or proxy model named `"custom-ollama/gpt-5-mini-local"` yields bare candidate `"gpt-5-mini-local"`, which starts with `"gpt-5"` and silently suppresses the quality warning. This may be an acceptable trade-off, but worth documenting explicitly (and possibly adding a provider check alongside the prefix check).
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Addressed in 8ca83e4 by only applying frontier bare-name prefix matching when the provider is known frontier (openai, anthropic, vertex_ai, gemini, google). Added a regression test for custom-ollama/gpt-5-mini-local.
Summary
Adds a non-blocking startup
MODEL QUALITY WARNINGinwarm_up_llm()whenSTRIX_LLMis set to a model outside Strix's recommended/frontier set:The warning runs after settings are loaded and before the LLM warmup call; it prints guidance and recommended model names but does not block the scan.
Expands the frontier coverage from a small OpenAI/Anthropic/Gemini set to current SOTA families across OpenAI GPT-5.x, Anthropic Claude Opus/Sonnet 4.x, Gemini 3.x, Grok 4.x, DeepSeek V4/R1, Qwen3.x, Moonshot Kimi K2.x, and Mistral/Magistral. The matcher now groups provider markers with model prefixes so routed names like
litellm/openai/gpt-5.4-pro,bedrock_mantle/openai.gpt-5.5,vertex_ai/claude-sonnet-4-6@default, andopenrouter/google/gemini-3.1-pro-previeware accepted without allowing arbitrary custom providers to bypass the warning.Also updates the pre-commit mypy hook to a current mypy version and fixes existing hook-only lint/typecheck issues so local hooks pass.
Link to Devin session: https://app.devin.ai/sessions/0c9d3d7e49ce4f6d84113bcce25c4456
Requested by: @0xallam