fix(adapters): drop silent-sentinel row fallbacks across 6 read commands#1631
Open
Benjamin-eecs wants to merge 2 commits into
Open
fix(adapters): drop silent-sentinel row fallbacks across 6 read commands#1631Benjamin-eecs wants to merge 2 commits into
Benjamin-eecs wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR continues the adapter audit cleanup by replacing silent sentinel row fallbacks with empty-string signals so callers can distinguish missing upstream data from literal values.
Changes:
- Replaced
'-','unknown', and'Unknown'row fallbacks with''across affected read adapters. - Updated the typed-error lint baseline to remove the resolved
silent-sentinelentries. - Kept existing populated-row behavior unchanged for the touched adapters.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
clis/36kr/article.js |
Emits empty values for missing author/date/body fields. |
clis/v2ex/me.js |
Changes username fallback handling in profile extraction. |
clis/wikipedia/trending.js |
Emits empty title/description values when missing. |
clis/xiaoyuzhou/download.js |
Emits empty podcast title when missing. |
clis/xiaoyuzhou/transcript.js |
Emits empty podcast title when missing. |
clis/zhihu/collection.js |
Emits empty type values and updates collection dedupe key fallback. |
clis/zhihu/download.js |
Emits empty author value when missing. |
scripts/typed-error-lint-baseline.json |
Removes resolved silent-sentinel baseline entries. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const navLinks = Array.from(document.querySelectorAll('a.top')).map(a => a.textContent?.trim()); | ||
| if (navLinks.length > 1 && navLinks[0] === '首页') { | ||
| username = navLinks[1] || 'Unknown'; | ||
| username = navLinks[1] || ''; |
| return [{ | ||
| title, | ||
| podcast: ep.podcast?.title || '-', | ||
| podcast: ep.podcast?.title || '', |
| return [{ | ||
| title: episode.title || 'episode', | ||
| podcast: episode.podcast?.title || '-', | ||
| podcast: episode.podcast?.title || '', |
Comment on lines
+59
to
+64
| return `${content.type || ''}:${content.id || content.url || JSON.stringify(content).slice(0, 80)}`; | ||
| } | ||
|
|
||
| function mapCollectionItem(item, rank) { | ||
| const content = item.content || {}; | ||
| const type = content.type || 'unknown'; | ||
| const type = content.type || ''; |
ad05353 to
163b08d
Compare
Continues the audit-baseline cleanup started in jackwener#1611 (lesswrong) and the direction set by jackwener#1599 / jackwener#1603 / jackwener#1604. Replaces the `silent-sentinel` row-data fallbacks (`'Unknown'` / `'-'` / `'unknown'` that mask missing fields) with the empty-string signal so agents can tell apart "field really has the value Unknown" from "upstream returned no value". Touched 6 read adapters, 10 baseline entries: - wikipedia/trending: title, description - 36kr/article: author, date, body - xiaoyuzhou/download: podcast - xiaoyuzhou/transcript: podcast - zhihu/collection: dedup key + type field (the empty prefix still produces a unique-per-content dedup key, just without the `unknown:` noise) - zhihu/download: author Intentionally skipped (line-by-line audited): - v2ex/me.js: `'Unknown'` is an in-band control-flow sentinel. Line 35 initialises `let username = 'Unknown';`, line 41 uses `if (username === 'Unknown')` to trigger the profileEl fallback selector, line 75 uses the same check to raise the auth error. Empty would silently bypass both checks and return a row with an empty username as if auth succeeded. - v2ex/daily.js: `'未知'` is user-facing 签到 success text in the rendered status message, not a row field. Empty would render a broken sentence. - weibo/comments.js, weibo/feed.js: the sentinel sits inside an in-IIFE error-message string composition (`'API error: ' + (data.msg || 'unknown')`), not in a returned row. Empty would silently truncate diagnostic output. Both stay on baseline. Verified live: `opencli wikipedia trending --limit 3` and `opencli 36kr hot --limit 2` both return populated rows; the empty-string signal only kicks in when the upstream value is actually missing.
163b08d to
f7573aa
Compare
Open
15 tasks
…swap Per owner's pattern in 7164615 (douyin/user-videos.test.js + jike/read.test.js + weread/search-regression.test.js), pairs the silent-sentinel value swap in this PR with focused unit tests that mock the upstream to return null / missing fields and assert the row surfaces an empty-string signal instead of the old fabricated 'Unknown' / '-' / 'unknown' sentinel. Coverage: - clis/wikipedia/trending.test.js (new): mocks wikiFetch to return three articles - one with both title + description populated, one with no title and no description, one with title only. Asserts the missing fields render as '' (was '-' before this PR). - clis/36kr/article.test.js (new): mocks page.evaluate to return a scrape where title is present but author / date / body are empty. Asserts those three fields render as '' in the row pair output (was '-' before this PR). Also covers the NOT_FOUND and INVALID_ARGUMENT error paths that already existed. - clis/zhihu/collection.test.js (+1 case): mocks the zhihu collection API to return an item with content.id but no content.type. Asserts type renders as '' (was 'unknown' before this PR); the new dedup key prefix is :id rather than unknown:id, semantically identical for dedup purposes. The other three files in this PR (xiaoyuzhou/download, xiaoyuzhou/transcript, zhihu/download) use the same `|| 'unknown'` -> `|| ''` value swap with no downstream sentinel consumer. They are covered by the same JS language semantics the three tests above demonstrate.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Continues the audit-baseline cleanup started in #1611 (lesswrong) and the direction set by #1599 / #1603 / #1604. Replaces the
silent-sentinelrow-data fallbacks ('Unknown'/'-'/'unknown'that mask missing fields) with the empty-string signal so agents can tell apart "field really has the value Unknown" from "upstream returned no value".Type of Change
Checklist
Fix
Same pattern as #1611: replace flagged sentinel strings with the empty-signal form. Walks the lint baseline (
scripts/check-typed-error-lint.mjs:323regex catchesunknown/Unknown/UNKNOWN/N/A/n/a/NA/未知/-) and clears 10 entries across 6 read adapters:wikipedia/trending?? '-'to?? ''36kr/article|| '-'to|| ''xiaoyuzhou/download|| '-'to|| ''xiaoyuzhou/transcript|| '-'to|| ''zhihu/collection|| 'unknown'to|| ''zhihu/download|| 'unknown'to|| ''The zhihu/collection change keeps the dedup key unique-per-content; the empty prefix is
:${content.id}instead ofunknown:${content.id}, semantically identical for dedup.Intentionally skipped
v2ex/me.js:38:'Unknown'is an in-band control-flow sentinel. Line 35 initialiseslet username = 'Unknown', line 41 usesif (username === 'Unknown')to trigger the profileEl fallback selector, line 75 uses the same check to raise the auth error. Empty would silently bypass both checks. An earlier commit in this PR attempted the swap then reverted (commit163b08dd).v2ex/daily.js:93:'未知'sits inside a 签到 success message rendered to the user (当前余额: 未知). Empty would render a broken sentence. Stays on baseline.weibo/comments.js:33,weibo/feed.js:45:'unknown'sits inside an in-IIFE error-message string composition ('API error: ' + (data.msg || 'unknown')), not in a returned row. Empty would silently truncate diagnostic output. Stay on baseline.Screenshots / Output
Smoke test on two of the touched commands (read-only, public):
$ opencli wikipedia trending --limit 3 -f json [ { "rank": 1, "title": "Iceman_(Drake_album)", "description": "2026 studio album by Drake", "views": 259180 }, { "rank": 2, "title": "Andy_Burnham", "description": "British politician (born 1970)", "views": 202483 }, { "rank": 3, "title": "Obsession_(2025_film)", "description": "Horror film by Curry Barker", "views": 199269 } ] $ opencli 36kr hot --limit 2 -f json [ { "rank": 1, "title": "突发:OpenAI大规模重组,总裁Brockman夺权挂帅", "url": "https://36kr.com/p/3811615591292680" }, { "rank": 2, "title": "Claude为什么早晨8:30催你睡觉?", "url": "https://36kr.com/p/3811413738594050" } ]These outputs come from the v1.7.22 published code path; populated rows are byte-identical between the published code and this PR. The empty-signal path only fires when the upstream Wikipedia/36kr/xiaoyuzhou/zhihu API returns a null or missing field, which doesn't surface in this sample. That path is a mechanical 1-char fallback swap.
Test plan
npm run build: manifest compiles, no path leaksnpm run check:typed-error-lint: passes (current=163, baseline=163, new=0, resolved=0)npm run check:silent-column-drop: passesopencli wikipedia trendingandopencli 36kr hotstill return populated rows (populated-row path is byte-identical to this PR; not a direct exercise of the changed fallback).grep -nE "=== ?['\"](-|unknown|Unknown)['\"]"across the 6 touched files returns zero hits, so the swapped value is never consumed as a control-flow marker downstream (this is the audit step that caught thev2ex/me.jsregression in the first commit of this PR before push).'-'/'unknown'fallback value (a grep for those strings inside*.test.jswas clean here, but worth a second look in CI).