fix(utils): use inner_text() in rate limit detection #278
Open
stickerdaniel wants to merge 1 commit into joeyism:master from
Conversation
text_content() captures invisible React RSC serialized JSON that LinkedIn now embeds on every page, containing "try again later" as a preloaded error template. This causes false positive rate limit detection on every scrape. inner_text() returns only visible text, matching the pattern used throughout the rest of the codebase. Resolves: joeyism#277 See also: joeyism#275
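The failure mode described above can be reproduced without a browser. The sketch below, using only the stdlib `html.parser`, contrasts an all-text extraction (what `text_content()` does) with a visible-text extraction (what `inner_text()` approximates). The markup is illustrative only, not LinkedIn's actual DOM; the `<script>` payload stands in for the preloaded RSC JSON:

```python
from html.parser import HTMLParser

# Illustrative markup only -- the <script> payload stands in for the
# React RSC serialized JSON that LinkedIn preloads on every page.
HTML = """
<body>
  <h1>John Doe - Software Engineer</h1>
  <script>self.__next_f.push(["Something went wrong. Please try again later."])</script>
</body>
"""

class TextExtractor(HTMLParser):
    """Collects document text, optionally skipping <script> contents."""

    def __init__(self, visible_only):
        super().__init__()
        self.visible_only = visible_only
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.visible_only and self.in_script:
            return  # inner_text()-style: hidden payloads are ignored
        self.chunks.append(data)

def extract(html, visible_only):
    parser = TextExtractor(visible_only)
    parser.feed(html)
    return " ".join(parser.chunks)

# text_content()-style extraction sees the serialized payload: false positive.
assert "try again later" in extract(HTML, visible_only=False).lower()
# inner_text()-style extraction sees only visible text: no false positive.
assert "try again later" not in extract(HTML, visible_only=True).lower()
```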
Pull request overview
This PR fixes false-positive rate limit detection triggered by hidden React RSC serialized content embedded in LinkedIn pages by switching the DOM text extraction method to only consider visible text.
Changes:
- Update `detect_rate_limit()` to use `Locator.inner_text()` instead of `text_content()` when scanning the page for rate-limit phrases.
Comment on lines 89 to 91
```diff
 # Check for rate limit messages
 try:
-    body_text = await page.locator('body').text_content(timeout=1000)
+    body_text = await page.locator('body').inner_text(timeout=1000)
```
There’s currently no automated test coverage for detect_rate_limit() (no tests reference it), and this change tweaks detection semantics in a way that’s easy to regress. Consider adding a small unit test using page.set_content() with hidden DOM text containing "try again later" to ensure it does not raise, and another with visible text to ensure it does raise.
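A test of those semantics doesn't strictly need `page.set_content()` and a live browser; the two cases can also be pinned down with stdlib mocks. In the sketch below, `detect_rate_limit()` and `RateLimitError` are hedged re-implementations of what the repo's function presumably does after this PR, not the actual code, and `make_page()` fakes just enough of the Playwright page API:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

class RateLimitError(Exception):
    """Stand-in for the project's rate-limit exception (name assumed)."""

async def detect_rate_limit(page):
    # Hedged re-implementation of the fixed check: scan only visible body text.
    body_text = await page.locator("body").inner_text(timeout=1000)
    if "try again later" in body_text.lower():
        raise RateLimitError("rate limit page detected")

def make_page(visible_text):
    """Mock Playwright page whose body locator yields the given visible text."""
    locator = MagicMock()
    locator.inner_text = AsyncMock(return_value=visible_text)
    page = MagicMock()
    page.locator.return_value = locator
    return page

# Normal profile page: hidden RSC JSON never reaches inner_text(), so the
# visible text alone must not trigger detection.
asyncio.run(detect_rate_limit(make_page("John Doe - Software Engineer")))

# Genuine rate-limit interstitial: visible error text must still raise.
try:
    asyncio.run(detect_rate_limit(
        make_page("Something went wrong. Please try again later.")))
    raised = False
except RateLimitError:
    raised = True
assert raised
```

The mock-based version trades fidelity for speed; the `page.set_content()` approach suggested above would additionally exercise real visibility semantics in a browser.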
stickerdaniel added a commit to stickerdaniel/linkedin-mcp-server that referenced this pull request (Feb 12, 2026):
Point dependency at stickerdaniel/linkedin_scraper fork (fix/rate-limit-false-positive) to fix detect_rate_limit() false-firing on React RSC payloads. Also update docs with detailed release workflow notes and bump opencode agent models to gpt-5.3-codex. See also: joeyism/linkedin_scraper#278
`detect_rate_limit()` false-fires on every page because `text_content()` picks up invisible React RSC serialized JSON that LinkedIn now embeds in the DOM. The phrase "something went wrong. please try again later." appears inside preloaded RSC data. This matches the "try again later" check and raises `RateLimitError` even on perfectly normal pages.

Fix: Switch from `text_content()` to `inner_text()` to return only visible text, which is already the pattern used in all the scrapers (`person.py`, `company.py`, `job.py`).

Resolves #277, likely fixes #275