Conversation
|
29f70c0 to
fe7bd75
Compare
📝 WalkthroughWalkthroughThis PR enriches Meilisearch results with highlight metadata ( Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant fetchMeilisearch as fetchMeilisearchResults
participant buildHits
participant enrichTree as enrichHighlightTree
participant calcMeta as calculateHighlightMetadata
Client->>fetchMeilisearch: fetchMeilisearchResults(queries)
fetchMeilisearch->>fetchMeilisearch: buildSearchRequest(queries)
fetchMeilisearch->>fetchMeilisearch: execute Meilisearch query
fetchMeilisearch->>buildHits: buildHits(result, query)
loop For each hit
buildHits->>enrichTree: enrichHighlightTree(highlightResult)
loop Recursive traversal (depth <= 20)
enrichTree->>enrichTree: traverse structure
alt Is highlight value
enrichTree->>calcMeta: calculateHighlightMetadata(...)
calcMeta-->>enrichTree: HighlightMetadata{value, fullyHighlighted, matchLevel, matchedWords}
enrichTree->>enrichTree: attach metadata
else Is object/array
enrichTree->>enrichTree: recurse into nested structure
end
end
enrichTree-->>buildHits: enriched highlight tree
end
buildHits-->>fetchMeilisearch: enriched hits with metadata
fetchMeilisearch-->>Client: results with HighlightMetadata
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/autocomplete-client/src/utils.ts (1)
1-7: Drop the JSDoc here.
mapOneOrManyand its type signature already explain the helper, so this block mostly adds noise.As per coding guidelines, "Prefer self-descriptive code to comments. Only use comments for complex logic."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/autocomplete-client/src/utils.ts` around lines 1 - 7, Remove the redundant JSDoc block above the mapOneOrMany helper: delete the multi-line comment that documents the function and its parameters/returns so the code remains self-descriptive; locate the comment immediately preceding the mapOneOrMany function in utils.ts and remove it, leaving only the function declaration and its TypeScript signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/autocomplete-client/src/search/highlight.ts`:
- Around line 37-40: The current fullyHighlighted check uses highlightedText =
matches.join(''), which drops the original separators (spaces/punctuation) and
makes comparisons with cleanValue incorrect for multi-tag full matches; instead
reconstruct the highlightedText using the matched ranges from the original
string (use the match indices/positions associated with matches) and include the
intervening substrings from cleanValue between matches so the rebuilt
highlightedText preserves separators, then set fullyHighlighted = cleanValue ===
rebuiltHighlightedText (update references to matches and fullyHighlighted
accordingly).
- Around line 23-35: The current code constructs RegExp with raw user-provided
highlight tags (preTag/postTag) in the highlighting logic (variables preTag,
postTag, highlightValue), which breaks for regex metacharacters; fix by escaping
meta-characters before building regexes: implement or use an escapeRegExp helper
and apply it to preTag and postTag when creating the two RegExp instances used
for extracting matches (highlightRegex) and for removing tags (the replace
regex), so both the exec loop in the highlight extraction and the cleanValue
replace operate on the literal tag strings.
---
Nitpick comments:
In `@packages/autocomplete-client/src/utils.ts`:
- Around line 1-7: Remove the redundant JSDoc block above the mapOneOrMany
helper: delete the multi-line comment that documents the function and its
parameters/returns so the code remains self-descriptive; locate the comment
immediately preceding the mapOneOrMany function in utils.ts and remove it,
leaving only the function declaration and its TypeScript signature.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b2c811d6-d743-41fb-b93e-d900f4ca4a38
📒 Files selected for processing (4)
packages/autocomplete-client/src/search/__tests__/fetchMeilisearchResults.test.tspackages/autocomplete-client/src/search/fetchMeilisearchResults.tspackages/autocomplete-client/src/search/highlight.tspackages/autocomplete-client/src/utils.ts
| // Extract all highlighted segments | ||
| const highlightRegex = new RegExp(`${preTag}(.*?)${postTag}`, 'g') | ||
| const matches: string[] = [] | ||
| let match | ||
| while ((match = highlightRegex.exec(highlightValue)) !== null) { | ||
| matches.push(match[1]) | ||
| } | ||
|
|
||
| // Remove highlight tags to get the highlighted text without the tags | ||
| const cleanValue = highlightValue.replace( | ||
| new RegExp(`${preTag}|${postTag}`, 'g'), | ||
| '' | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file structure and find the highlight.ts file
find . -name "highlight.ts" -type f 2>/dev/null | head -20Repository: meilisearch/meilisearch-js-plugins
Length of output: 132
🏁 Script executed:
# Get the full content of the highlight.ts file to understand the context
cat -n packages/autocomplete-client/src/search/highlight.tsRepository: meilisearch/meilisearch-js-plugins
Length of output: 2461
🏁 Script executed:
# Search for other RegExp constructions with template literals in the autocomplete-client package
rg 'new RegExp.*\$\{' packages/autocomplete-client/src --type ts --type tsx -A 2Repository: meilisearch/meilisearch-js-plugins
Length of output: 105
🏁 Script executed:
# Check if there are any existing tests for this function that might reveal expected behavior
rg 'calculateHighlightMetadata|preTag|postTag' packages/autocomplete-client --type ts --type tsx | head -30Repository: meilisearch/meilisearch-js-plugins
Length of output: 105
🏁 Script executed:
# Search for other RegExp constructions with template literals in the autocomplete-client package
rg 'new RegExp.*\$\{' packages/autocomplete-client/src --type ts -A 2Repository: meilisearch/meilisearch-js-plugins
Length of output: 566
🏁 Script executed:
# Check if there are any existing tests for this function
find . -path "*/test*" -o -path "*/spec*" | grep -i highlightRepository: meilisearch/meilisearch-js-plugins
Length of output: 60
🏁 Script executed:
# Look for test files that might test calculateHighlightMetadata
find . -name "*.test.ts" -o -name "*.spec.ts" | xargs grep -l "calculateHighlightMetadata" 2>/dev/nullRepository: meilisearch/meilisearch-js-plugins
Length of output: 62
🏁 Script executed:
# Check how calculateHighlightMetadata is called in the codebase
rg 'calculateHighlightMetadata' packages/autocomplete-client/src -B 3 -A 3Repository: meilisearch/meilisearch-js-plugins
Length of output: 2188
🏁 Script executed:
# Look for where preTag and postTag come from (likely from Meilisearch response)
rg 'preTag|postTag' packages/autocomplete-client/src -B 2 -A 2 | head -50Repository: meilisearch/meilisearch-js-plugins
Length of output: 4034
🏁 Script executed:
# Check the package.json and documentation for context on highlight tags
cat packages/autocomplete-client/package.jsonRepository: meilisearch/meilisearch-js-plugins
Length of output: 1241
🏁 Script executed:
# Check the constants to see what the default highlight tags are
rg 'HIGHLIGHT_PRE_TAG|HIGHLIGHT_POST_TAG' packages/autocomplete-client/src -B 2 -A 2Repository: meilisearch/meilisearch-js-plugins
Length of output: 2114
🏁 Script executed:
# Look at the complete context of how preTag and postTag flow through the code
cat -n packages/autocomplete-client/src/search/fetchMeilisearchResults.ts | head -50Repository: meilisearch/meilisearch-js-plugins
Length of output: 1902
Escape custom highlight tags before using them in regex patterns.
highlightPreTag and highlightPostTag come from user-provided parameters and are directly interpolated into RegExp constructors at lines 24 and 33. Any regex metacharacter (e.g., ., *, [, ], ^, $, |, +, ?, (, ), {, }, \) will either cause a runtime error or alter the regex semantics, breaking metadata extraction for otherwise valid custom tags.
Suggested fix
+function escapeRegExp(value: string): string {
+ return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
+}
+
export function calculateHighlightMetadata(
query: string,
preTag: string,
postTag: string,
highlightValue: string
): HighlightMetadata {
- const highlightRegex = new RegExp(`${preTag}(.*?)${postTag}`, 'g')
+ const escapedPreTag = escapeRegExp(preTag)
+ const escapedPostTag = escapeRegExp(postTag)
+ const highlightRegex = new RegExp(
+ `${escapedPreTag}(.*?)${escapedPostTag}`,
+ 'g'
+ )
const matches: string[] = []
let match
while ((match = highlightRegex.exec(highlightValue)) !== null) {
matches.push(match[1])
}
const cleanValue = highlightValue.replace(
- new RegExp(`${preTag}|${postTag}`, 'g'),
+ new RegExp(`${escapedPreTag}|${escapedPostTag}`, 'g'),
''
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Extract all highlighted segments | |
| const highlightRegex = new RegExp(`${preTag}(.*?)${postTag}`, 'g') | |
| const matches: string[] = [] | |
| let match | |
| while ((match = highlightRegex.exec(highlightValue)) !== null) { | |
| matches.push(match[1]) | |
| } | |
| // Remove highlight tags to get the highlighted text without the tags | |
| const cleanValue = highlightValue.replace( | |
| new RegExp(`${preTag}|${postTag}`, 'g'), | |
| '' | |
| ) | |
| function escapeRegExp(value: string): string { | |
| return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') | |
| } | |
| export function calculateHighlightMetadata( | |
| query: string, | |
| preTag: string, | |
| postTag: string, | |
| highlightValue: string | |
| ): HighlightMetadata { | |
| // Extract all highlighted segments | |
| const escapedPreTag = escapeRegExp(preTag) | |
| const escapedPostTag = escapeRegExp(postTag) | |
| const highlightRegex = new RegExp( | |
| `${escapedPreTag}(.*?)${escapedPostTag}`, | |
| 'g' | |
| ) | |
| const matches: string[] = [] | |
| let match | |
| while ((match = highlightRegex.exec(highlightValue)) !== null) { | |
| matches.push(match[1]) | |
| } | |
| // Remove highlight tags to get the highlighted text without the tags | |
| const cleanValue = highlightValue.replace( | |
| new RegExp(`${escapedPreTag}|${escapedPostTag}`, 'g'), | |
| '' | |
| ) |
🧰 Tools
🪛 ast-grep (0.42.0)
[warning] 23-23: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(${preTag}(.*?)${postTag}, 'g')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
[warning] 32-32: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(${preTag}|${postTag}, 'g')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html
(regexp-from-variable)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/autocomplete-client/src/search/highlight.ts` around lines 23 - 35,
The current code constructs RegExp with raw user-provided highlight tags
(preTag/postTag) in the highlighting logic (variables preTag, postTag,
highlightValue), which breaks for regex metacharacters; fix by escaping
meta-characters before building regexes: implement or use an escapeRegExp helper
and apply it to preTag and postTag when creating the two RegExp instances used
for extracting matches (highlightRegex) and for removing tags (the replace
regex), so both the exec loop in the highlight extraction and the cleanValue
replace operate on the literal tag strings.
| // Determine if the entire attribute is highlighted | ||
| // fullyHighlighted = true if cleanValue and the concatenation of all matched segments are identical | ||
| const highlightedText = matches.join('') | ||
| const fullyHighlighted = cleanValue === highlightedText |
There was a problem hiding this comment.
fullyHighlighted is wrong for multi-tag full matches.
matches.join('') drops separators, so <em>Star</em> <em>Wars</em> becomes StarWars and compares unequal to Star Wars. A value whose entire content is highlighted across multiple tags is therefore reported as fullyHighlighted: false, which breaks the exact selection/ranking behavior this PR is adding.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/autocomplete-client/src/search/highlight.ts` around lines 37 - 40,
The current fullyHighlighted check uses highlightedText = matches.join(''),
which drops the original separators (spaces/punctuation) and makes comparisons
with cleanValue incorrect for multi-tag full matches; instead reconstruct the
highlightedText using the matched ranges from the original string (use the match
indices/positions associated with matches) and include the intervening
substrings from cleanValue between matches so the rebuilt highlightedText
preserves separators, then set fullyHighlighted = cleanValue ===
rebuiltHighlightedText (update references to matches and fullyHighlighted
accordingly).
Pull Request
Related issue
Fixes #1337
What does this PR do?
PR checklist
Please check if your PR fulfills the following requirements:
Thank you so much for contributing to Meilisearch!
Summary by CodeRabbit
Tests
Refactor