feat: Auto-detect all languages in non-interactive / autonomous agent mode #1126
droideronline wants to merge 2 commits into oraios:main
When autogenerating project.yml without user interaction (e.g. when invoked via MCP from an autonomous agent), only the dominant language was previously enabled. This breaks polyglot repos (e.g. Python backend + TypeScript/React frontend) where no project.yml exists. This PR adds an auto_detect_all_languages parameter (True by default) to ProjectConfig.autogenerate(). In non-interactive mode with this flag enabled, all detected languages are automatically included rather than just the top one. Interactive mode behaviour is unchanged.
@opcode81 - would love to get your eyes on this when you get a chance. This is blocking a use case where Serena is invoked autonomously via MCP (no human in the loop to configure …
Pull request overview
Adds a default-enabled behavior (disable it with auto_detect_all_languages=False) to ProjectConfig.autogenerate() so that, when running non-interactively (e.g., MCP/agent/CI) and no .serena/project.yml exists, Serena enables all detected languages instead of only the dominant one. This improves LSP tool correctness in polyglot repos.
Changes:
- Add auto_detect_all_languages: bool = True parameter to ProjectConfig.autogenerate().
- In non-interactive mode, enable all detected languages (sorted) when the flag is true.
- Add logging to surface which languages are enabled automatically.
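The selection rule in the list above can be sketched roughly as follows. This is a minimal illustration, not Serena's actual code: select_languages_non_interactive and its composition argument (language name mapped to its percentage of files) are hypothetical stand-ins, and ordering by descending share is an assumption. Only the auto_detect_all_languages name and its default come from the PR.

```python
import logging

log = logging.getLogger(__name__)


def select_languages_non_interactive(
    composition: dict[str, float],
    auto_detect_all_languages: bool = True,
) -> list[str]:
    """Hypothetical stand-in for the non-interactive branch of autogenerate()."""
    if not composition:
        raise ValueError("no source files of a supported language were detected")
    # Assumption: highest share first; ties are broken by name so the
    # resulting order is deterministic.
    ranked = sorted(composition, key=lambda lang: (-composition[lang], lang))
    if auto_detect_all_languages:
        log.info("Auto-enabling all detected languages: %s", ", ".join(ranked))
        return ranked
    # Flag disabled: keep only the dominant language (the old behaviour).
    return ranked[:1]
```

With a composition of {"typescript": 40.0, "python": 60.0}, this sketch returns ["python", "typescript"] by default and ["python"] when auto_detect_all_languages=False.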
- Return unrounded percentages from determine_programming_language_composition() so sorting in autogenerate() is accurate and deterministic
- Update test_autogenerate_with_js_files and test_autogenerate_custom_project_name to use 'in' assertions (Vue matcher overlaps with TS/JS extensions)
- Update test_autogenerate_with_multiple_languages to assert all detected languages are enabled by default (new behavior)
- Add test_autogenerate_with_multiple_languages_dominant_only to cover auto_detect_all_languages=False (old behavior)
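To see why unrounded percentages matter for the ordering, consider a small sketch. The file counts are made up, the rounding precision is an assumption, and composition() is a hypothetical helper rather than the real determine_programming_language_composition(). It only shows that rounding can turn two distinct shares into a tie.

```python
from typing import Optional


def composition(file_counts: dict[str, int], ndigits: Optional[int] = None) -> dict[str, float]:
    """Hypothetical helper: percentage of files per language, optionally rounded."""
    total = sum(file_counts.values())
    shares = {lang: 100.0 * n / total for lang, n in file_counts.items()}
    if ndigits is not None:
        shares = {lang: round(pct, ndigits) for lang, pct in shares.items()}
    return shares


# Made-up counts: TypeScript has exactly one more file than Python.
counts = {"python": 3000, "typescript": 3001, "go": 3999}

rounded = composition(counts, ndigits=1)  # assumed precision, for illustration
exact = composition(counts)

# After rounding, the two shares can no longer be told apart ...
assert rounded["python"] == rounded["typescript"]
# ... while the unrounded values still rank TypeScript ahead of Python.
assert exact["typescript"] > exact["python"]
```

When shares tie after rounding, the order of the affected languages depends on incidental factors such as dictionary insertion order, which is what the commit avoids by sorting on the raw values.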
Addressed both Copilot review comments in commit
All 9 autogenerate tests pass locally.
EDIT: See comment in issue
Closes #1125
Problem
When Serena is invoked via MCP from an autonomous agent (Claude Code SDK, CI pipelines, etc.) and no .serena/project.yml exists, ProjectConfig.autogenerate() only enables the single dominant language by file count. Secondary languages are silently dropped.
For polyglot repos (e.g. Python backend + TypeScript/React frontend) this means all LSP-based tools (find_symbol, get_symbols_overview, replace_symbol_body, etc.) return empty or wrong results for the non-dominant language, with no error or warning to indicate why.
Solution
Add auto_detect_all_languages: bool = True to ProjectConfig.autogenerate().
The change is backward-compatible: existing project.yml files are not affected; the parameter only applies during auto-generation when no config exists yet.
What changes
src/serena/config/serena_config.py
- autogenerate() gains auto_detect_all_languages: bool = True
- languages_to_use is populated from all detected languages, not just the top one
No new dependencies
The LanguageServerManager already spawns multiple language servers in parallel threads and get_language_server() already routes tool calls by file extension. This PR purely fixes the auto-detection step upstream.
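As a rough illustration of why the upstream step matters, the sketch below models extension-based routing. Everything in it is hypothetical: EXTENSION_TO_LANGUAGE and pick_language are invented for this example and do not reflect how LanguageServerManager or get_language_server() are actually implemented.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table, for illustration only.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".ts": "typescript",
    ".tsx": "typescript",
}


def pick_language(path: str, enabled_languages: list[str]) -> Optional[str]:
    """Return the enabled language responsible for `path`, or None if there is none."""
    language = EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
    # A file whose language was never enabled has no server to route to.
    return language if language in enabled_languages else None
```

Under these assumptions, pick_language("web/App.tsx", ["python"]) yields None, which mirrors the empty results described under Problem, while pick_language("web/App.tsx", ["python", "typescript"]) yields "typescript".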