feat: Auto-detect all languages in non-interactive / autonomous agent mode #1126

Closed
droideronline wants to merge 2 commits into oraios:main from droideronline:feat/auto-detect-all-languages-non-interactive
Conversation

@droideronline

Closes #1125

Problem

When Serena is invoked via MCP from an autonomous agent (Claude Code SDK, CI pipelines, etc.) and no .serena/project.yml exists, ProjectConfig.autogenerate() only enables the single dominant language by file count. Secondary languages are silently dropped.

For polyglot repos (e.g. Python backend + TypeScript/React frontend) this means all LSP-based tools (find_symbol, get_symbols_overview, replace_symbol_body, etc.) return empty or wrong results for the non-dominant language, with no error or warning to indicate why.

Solution

Add auto_detect_all_languages: bool = True to ProjectConfig.autogenerate().

  • Non-interactive mode (default for MCP/agent): all detected languages are automatically enabled, sorted by file count descending.
  • Interactive mode: behaviour is unchanged — the user is still asked about each additional language.

The change is backward-compatible: existing project.yml files are not affected; the parameter only applies during auto-generation when no config exists yet.
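
The selection rule described above can be sketched as follows. This is a minimal illustration and not the code in this PR: `select_languages` and the plain `{language: share}` dictionary are hypothetical stand-ins for the logic inside `ProjectConfig.autogenerate()`, and the interactive prompt is left out.

```python
from typing import Dict, List


def select_languages(
    composition: Dict[str, float],
    interactive: bool = False,
    auto_detect_all_languages: bool = True,
) -> List[str]:
    """Hypothetical helper: choose languages from a {language: share} mapping."""
    # Rank by share of files, largest first. Ties are broken by name so the
    # result does not depend on dict insertion order.
    ranked = sorted(composition, key=lambda lang: (-composition[lang], lang))
    if not ranked:
        return []
    if not interactive and auto_detect_all_languages:
        # New default for MCP/agent use: enable everything that was detected.
        return ranked
    # Old behaviour: only the dominant language. In interactive mode the user
    # would additionally be asked about the remaining ones (not modelled here).
    return ranked[:1]


print(select_languages({"typescript": 40.0, "python": 60.0}))
print(select_languages({"typescript": 40.0, "python": 60.0}, auto_detect_all_languages=False))
```

With the flag left at its default, a Python/TypeScript repo yields both languages with Python first; passing `auto_detect_all_languages=False` yields only Python.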

What changes

src/serena/config/serena_config.py

  • autogenerate() gains auto_detect_all_languages: bool = True
  • In non-interactive mode with this flag, languages_to_use is populated from all detected languages, not just the top one
  • Adds a log line listing all enabled languages for visibility

No new dependencies

The LanguageServerManager already spawns multiple language servers in parallel threads and get_language_server() already routes tool calls by file extension — this PR purely fixes the auto-detection step upstream.
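
To illustrate why the detection step matters for routing, here is a rough sketch of extension-based dispatch. The `EXTENSIONS` table and the `route` function are hypothetical and much simpler than Serena's real matchers and `get_language_server()`; they only show that a file has nowhere to go when its language was never enabled.

```python
from pathlib import PurePath
from typing import List, Optional

# Hypothetical extension table, for illustration only.
EXTENSIONS = {
    "python": {".py"},
    "typescript": {".ts", ".tsx", ".js", ".jsx"},
}


def route(path: str, enabled: List[str]) -> Optional[str]:
    """Return the first enabled language that claims the file's extension."""
    suffix = PurePath(path).suffix.lower()
    for lang in enabled:
        if suffix in EXTENSIONS.get(lang, set()):
            return lang
    # No enabled language handles this file, so symbol tools have no server
    # to ask and come back empty.
    return None


print(route("web/App.tsx", ["python"]))
print(route("web/App.tsx", ["python", "typescript"]))
```

With only the dominant language enabled, the `.tsx` file is not routed anywhere; once both languages are enabled it resolves to `typescript`.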

When autogenerating project.yml without user interaction (e.g. when
invoked via MCP from an autonomous agent), only the dominant language
was previously enabled. This breaks polyglot repos (e.g. Python backend
+ TypeScript/React frontend) where no project.yml exists.

Adds auto_detect_all_languages parameter (True by default) to
ProjectConfig.autogenerate(). In non-interactive mode with this flag
enabled, all detected languages are automatically included rather than
just the top one. Interactive mode behaviour is unchanged.
Copilot AI review requested due to automatic review settings March 3, 2026 16:52
@droideronline
Author

droideronline commented Mar 3, 2026

@opcode81 - would love to get your eyes on this when you get a chance. This is blocking a use case where Serena is invoked autonomously via MCP (no human in the loop to configure project.yml), and polyglot repos silently lose symbol support for secondary languages. Happy to adjust the approach or add tests based on your feedback.

Copilot AI left a comment

Pull request overview

Adds a new behavior to ProjectConfig.autogenerate(), enabled by default with a flag to opt out: when running non-interactively (e.g., MCP/agent/CI) and no .serena/project.yml exists, Serena enables all detected languages instead of only the dominant one. This improves LSP tool correctness in polyglot repos.

Changes:

  • Add auto_detect_all_languages: bool = True parameter to ProjectConfig.autogenerate().
  • In non-interactive mode, enable all detected languages (sorted) when the flag is true.
  • Add logging to surface which languages are enabled automatically.

- Return unrounded percentages from determine_programming_language_composition()
  so sorting in autogenerate() is accurate and deterministic
- Update test_autogenerate_with_js_files and test_autogenerate_custom_project_name
  to use 'in' assertions (Vue matcher overlaps with TS/JS extensions)
- Update test_autogenerate_with_multiple_languages to assert all detected
  languages are enabled by default (new behavior)
- Add test_autogenerate_with_multiple_languages_dominant_only to cover
  auto_detect_all_languages=False (old behavior)
@droideronline
Author

Addressed both Copilot review comments in commit 78ebdac4:

  1. Sorting precision: determine_programming_language_composition() now returns raw (unrounded) percentages, so the sort order in autogenerate() is deterministic and no longer affected by ties introduced by rounding.

  2. Tests: the test suite was updated as follows.

    • test_autogenerate_with_js_files and test_autogenerate_custom_project_name now use in assertions (Vue's FilenameMatcher overlaps with .js/.ts extensions, so both languages are correctly detected when auto_detect_all_languages=True)
    • test_autogenerate_with_multiple_languages now asserts both Python and TypeScript are enabled (the new default behaviour)
    • Added test_autogenerate_with_multiple_languages_dominant_only to cover auto_detect_all_languages=False (the old single-language behaviour)
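
The reason for switching to `in` assertions can be shown with a small sketch. The `PATTERNS` table and `detect` function below are hypothetical (they use the standard-library `fnmatch` rather than Serena's `FilenameMatcher`), and the exact pattern lists are assumptions; the point is only that overlapping patterns make one file count toward two languages.

```python
from fnmatch import fnmatch
from typing import Iterable, List

# Hypothetical patterns: the Vue entry deliberately overlaps with TypeScript/JS.
PATTERNS = {
    "typescript": ["*.ts", "*.tsx", "*.js", "*.jsx"],
    "vue": ["*.vue", "*.ts", "*.js"],
}


def detect(filenames: Iterable[str]) -> List[str]:
    """Return every language with at least one pattern matching some file."""
    names = list(filenames)
    return [
        lang
        for lang, patterns in PATTERNS.items()
        if any(fnmatch(name, pattern) for name in names for pattern in patterns)
    ]


detected = detect(["index.js", "util.js"])
# An equality check against a single language would fail here,
# while a membership check still holds.
print("typescript" in detected, detected)
```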

All 9 autogenerate tests pass locally.
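
For reference, the rounding issue from item 1 can be reproduced in isolation. The shares below are made up, and rounding to two decimals is an assumption about the previous behaviour; the sketch only shows how rounding can create a tie that leaves the order to dict insertion order.

```python
from typing import Dict, List

# Made-up shares; insertion order stands in for an arbitrary detection order.
raw = {"typescript": 49.996, "python": 50.004}
# Assumed old behaviour: percentages rounded before being returned.
rounded = {lang: round(share, 2) for lang, share in raw.items()}


def by_share(shares: Dict[str, float]) -> List[str]:
    """Sort language names by share, largest first (stable for ties)."""
    return sorted(shares, key=shares.get, reverse=True)


order_rounded = by_share(rounded)  # tie: falls back to insertion order
order_raw = by_share(raw)          # no tie: the larger share wins
print(order_rounded, order_raw)
```

Because Python's sort is stable, the rounded values keep TypeScript first even though Python has the larger raw share; sorting the raw values puts Python first.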

@opcode81
Contributor

opcode81 commented Mar 3, 2026

EDIT: See comment in issue
