fix(extraction): index C functions decorated with a leading export macro (#515) #597
Open
maxmilian wants to merge 1 commit into
…cro (colbymchenry#515)

A leading export/visibility macro followed by a typedef'd return type, as in `AX_VIN_GLB_API AX_S32 AX_VIN_Init(AX_VOID) { ... }`, makes tree-sitter-c misparse the function. The pattern is common in vendor C SDKs that gate symbols behind `__attribute__((visibility))` / `__declspec(dllexport)` macros. Both the macro and the return type are unknown identifiers in type position, so the grammar peels `MACRO RET` off as a separate broken declaration and parses the real function with its name absorbed as the `type` field and the parameter list wrapped in a bare `parenthesized_declarator` instead of a `function_declarator`. The function was then indexed under the garbage name `(params)` and could not be found.

Recover the true name from the `type` field when a function_definition's declarator is a parenthesized_declarator. Valid C never produces that shape (redundant parens and function-pointer returns keep a function_declarator, covered by a no-misfire test), so the recovery never fires on well-formed code.

Deliberately narrow: macro + primitive or pointer return types (`AX_API int F(...)`, `AX_API char* F(...)`) and all C++ forms already parse to a function_declarator and extract correctly via the existing path, so they don't reach this recovery. Bare prototypes in headers, which tree-sitter reparses as a top-level call expression, are left as a follow-up to avoid synthesizing symbols from genuine call sites.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Force-pushed from f6c276e to b02776d.
Summary

Fixes #515. A C function written as `EXPORT_MACRO ReturnType Name(...)`, that is, with a leading export/visibility macro in front of a project-defined (typedef'd) return type, was indexed under the garbage name `(params)` instead of its real name, so it couldn't be searched or navigated to. This pattern is ubiquitous in vendor SDKs, e.g. the Axera SoC SDK from the report: `AX_VIN_GLB_API AX_S32 AX_VIN_Init(AX_VOID)`.

Root cause

Both the macro and the typedef'd return type are unknown identifiers in type position, so tree-sitter-c peels `MACRO RET` off as a separate (broken) `declaration` and parses the real function with its name absorbed as the `type` field and the parameter list wrapped in a bare `parenthesized_declarator` instead of a `function_declarator`. `extractName` then falls through to `getNodeText(parenthesized_declarator)` → `"(AX_VOID)"`.

Fix

`recoverMacroDecoratedFunctionName` (wired only into the C extractor's `resolveName`): when a `function_definition`'s declarator field is directly a `parenthesized_declarator`, take the name from the `type` field instead. That shape is one valid C never produces.

Tests

`__tests__/extraction.test.ts` → new `C/C++ Extraction` block:

- Extracts `AX_VIN_Init` from the macro-decorated definition (and is not named `(AX_VOID)`).
- `int compute(int a)` is unaffected.
- `int (foo)(void)` (redundant parens, which keep a `function_declarator`) does not misfire and is not renamed to `int`.

Full suite green locally: `npm run build` clean, `1097 passed | 2 skipped` (+3 new).

Scope

Deliberately narrow, and verified against the real grammars with parse dumps before/after:

- Macro + primitive return (`AX_API int F(...)`), macro + pointer return (`AX_API char* F(...)`), and all C++ forms already parse to a `function_declarator` and extract correctly on `main`. They don't reach this recovery and aren't touched (no `resolveName` change for the C++ extractor).
- Bare prototypes in headers (`AX_API AX_S32 AX_VIN_Init(void);`) are reparsed by tree-sitter as a top-level call expression. Recovering a symbol from that would risk synthesizing symbols from genuine call sites, so it's left as a follow-up. The definition site (this fix) is where the symbol lives, which makes it searchable.

No node-count change: `resolveName` only renames; it cannot add or drop nodes.
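
For reviewers who want the shape check spelled out, here is a minimal sketch of the recovery, not the code in this PR. `MiniNode` and `field` are hypothetical stand-ins for a tree-sitter syntax node and its field lookup, the trees are built by hand rather than parsed, and the child node types (`type_identifier`, `primitive_type`) are assumptions made for illustration. Only the function name `recoverMacroDecoratedFunctionName` and the `function_definition` / `parenthesized_declarator` / `function_declarator` shapes come from the description above.

```typescript
// Hypothetical stand-in for a tree-sitter syntax node (illustration only).
interface MiniNode {
  type: string;
  text: string;
  fields?: Record<string, MiniNode>;
}

// Hypothetical helper: look up a named field on a node, or null if absent.
function field(node: MiniNode, name: string): MiniNode | null {
  return node.fields?.[name] ?? null;
}

// Sketch of the recovery: returns the recovered name, or null when the node
// does not have the misparsed shape and the normal extraction path should run.
function recoverMacroDecoratedFunctionName(node: MiniNode): string | null {
  if (node.type !== "function_definition") return null;
  const declarator = field(node, "declarator");
  // Only fire when the declarator is *directly* a parenthesized_declarator.
  if (declarator === null || declarator.type !== "parenthesized_declarator") {
    return null;
  }
  const typeNode = field(node, "type");
  return typeNode === null ? null : typeNode.text;
}

// Misparsed shape from the report: the name sits in `type`, the parameter
// list in a bare parenthesized_declarator.
const misparsed: MiniNode = {
  type: "function_definition",
  text: "AX_VIN_Init(AX_VOID) { }",
  fields: {
    type: { type: "type_identifier", text: "AX_VIN_Init" },
    declarator: { type: "parenthesized_declarator", text: "(AX_VOID)" },
  },
};

// Well-formed definition: the declarator is a function_declarator.
const wellFormed: MiniNode = {
  type: "function_definition",
  text: "int compute(int a) { }",
  fields: {
    type: { type: "primitive_type", text: "int" },
    declarator: { type: "function_declarator", text: "compute(int a)" },
  },
};

// Redundant parens: the parenthesized_declarator is nested one level down,
// inside a function_declarator, so the recovery must not fire.
const redundantParens: MiniNode = {
  type: "function_definition",
  text: "int (foo)(void) { }",
  fields: {
    type: { type: "primitive_type", text: "int" },
    declarator: {
      type: "function_declarator",
      text: "(foo)(void)",
      fields: {
        declarator: { type: "parenthesized_declarator", text: "(foo)" },
      },
    },
  },
};

console.log(recoverMacroDecoratedFunctionName(misparsed)); // AX_VIN_Init
console.log(recoverMacroDecoratedFunctionName(wellFormed)); // null
console.log(recoverMacroDecoratedFunctionName(redundantParens)); // null
```

Returning `null` for every other shape leaves the existing extraction path in charge of well-formed code, which is the no-misfire property the third test case checks.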