Skip to content

Commit 31062ae

Browse files
Merge branch 'release-1.9.0' into fix/kb-bulk-delete-path-traversal
2 parents 0c2f8e6 + 04b4af6 commit 31062ae

File tree

420 files changed

+55956
-16093
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

420 files changed

+55956
-16093
lines changed
Lines changed: 225 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,225 @@
1+
---
2+
name: backend-code-review
3+
description: Review backend code for quality, security, maintainability, and best practices based on established checklist rules. Use when the user requests a review, analysis, or improvement of backend files (e.g., `.py`) under the `src/backend/` directory. Do NOT use for frontend files (e.g., `.tsx`, `.ts`, `.js`). Supports pending-change review, code snippets review, and file-focused review.
4+
---
5+
6+
# Backend Code Review
7+
8+
## When to use this skill
9+
10+
Use this skill whenever the user asks to **review, analyze, or improve** backend code (e.g., `.py`) under the `src/backend/` directory. Supports the following review modes:
11+
12+
- **Pending-change review**: when the user asks to review current changes (inspect staged/working-tree files slated for commit to get the changes).
13+
- **Code snippets review**: when the user pastes code snippets (e.g., a function/class/module excerpt) into the chat and asks for a review.
14+
- **File-focused review**: when the user points to specific files and asks for a review of those files (one file or a small, explicit set of files, e.g., `src/backend/base/langflow/api/v1/flows.py`).
15+
16+
Do NOT use this skill when:
17+
18+
- The request is about frontend code or UI (e.g., `.tsx`, `.ts`, `.js`, `src/frontend/`).
19+
- The user is not asking for a review/analysis/improvement of backend code.
20+
- The scope is not under `src/backend/` (unless the user explicitly asks to review backend-related changes outside `src/backend/`).
21+
22+
## How to use this skill
23+
24+
Follow these steps when using this skill:
25+
26+
1. **Identify the review mode** (pending-change vs snippet vs file-focused) based on the user's input. Keep the scope tight: review only what the user provided or explicitly referenced.
27+
2. Follow the rules defined in **Checklist** to perform the review. If no Checklist rule matches, apply **General Review Rules** as a fallback to perform the best-effort review.
28+
3. Compose the final output strictly following the **Required Output Format**.
29+
30+
Notes when using this skill:
31+
- Always include actionable fixes or suggestions (including possible code snippets).
32+
- Use best-effort `File:Line` references when a file path and line numbers are available; otherwise, use the most specific identifier you can.
33+
34+
## Checklist
35+
36+
- db schema design: if the review scope includes code/files under `src/backend/base/langflow/services/database/models/` or Alembic migrations under `src/backend/base/langflow/alembic/versions/`, follow [references/db-schema-rule.md](references/db-schema-rule.md) to perform the review
37+
- architecture: if the review scope involves route/service/model layering, dependency direction, or moving responsibilities across modules, follow [references/architecture-rule.md](references/architecture-rule.md) to perform the review
38+
- service abstraction: if the review scope contains table/model operations (e.g., `select(...)`, `session.execute(...)`, joins, CRUD) and is not already inside a service under `src/backend/base/langflow/services/`, follow [references/repositories-rule.md](references/repositories-rule.md) to perform the review
39+
- sqlalchemy patterns: if the review scope involves SQLAlchemy/SQLModel session/query usage, db transaction/crud usage, `session_scope()` usage, or raw SQL usage, follow [references/sqlalchemy-rule.md](references/sqlalchemy-rule.md) to perform the review
40+
41+
## General Review Rules
42+
43+
### 1. Security Review
44+
45+
Check for:
46+
- SQL injection vulnerabilities (especially raw `text()` queries with string interpolation). Consequence: attacker can read/modify/delete any data in the database.
47+
- Server-Side Request Forgery (SSRF) in component HTTP calls. Consequence: attacker uses the server to scan internal networks or access cloud metadata endpoints.
48+
- Command injection (especially in subprocess or shell-executing components). Consequence: attacker gains shell access to the server.
49+
- Insecure deserialization (pickle, yaml.load without SafeLoader). Consequence: arbitrary code execution on the server.
50+
- Hardcoded secrets/credentials. Consequence: secrets leak via git history and are impossible to fully revoke.
51+
- Improper authentication/authorization (missing `CurrentActiveUser` dependency). Consequence: unauthenticated users can access protected endpoints.
52+
- Insecure direct object references (missing `user_id` scoping on queries). Consequence: user A can read/modify user B's flows, variables, API keys.
53+
- Path traversal in file storage operations. Consequence: attacker reads arbitrary server files (e.g., `/etc/passwd`, `.env`).
54+
55+
### 2. Performance Review
56+
57+
Check for:
58+
- N+1 queries (especially in loops calling `session.execute()`). Consequence: 100 flows = 101 DB queries instead of 2; page load goes from 50ms to 5s.
59+
- Missing database indexes on frequently queried columns. Consequence: full table scans on large datasets; queries degrade from O(log n) to O(n).
60+
- Memory leaks (unbounded caches, retained references in long-lived services). Consequence: server OOM after hours of operation; pods restart in production.
61+
- Blocking operations in async code (`time.sleep()`, synchronous I/O, CPU-bound work without `run_in_executor`). Consequence: entire event loop stalls; all concurrent requests hang until the blocking call completes.
62+
- Missing caching opportunities for expensive computations. Consequence: repeated computation of the same result on every request.
63+
- Large result sets loaded entirely into memory without pagination. Consequence: memory spike + slow response when user has 10K+ flows.
64+
65+
### 3. Code Quality Review
66+
67+
Check for:
68+
- Code forward compatibility with Python 3.10-3.13
69+
- Code duplication (DRY violations — extract when the *exact same business rule* is duplicated in 3+ places)
70+
- Functions doing too much (SRP violations — if you need "and" to describe it, split it)
71+
- Deep nesting / complex conditionals (prefer early returns and guard clauses)
72+
- Magic numbers/strings (extract to named constants or enums)
73+
- Poor naming: unclear abbreviations, misleading names, generic names (`data`, `result`, `obj`, `temp`). Functions should use verbs (`get`, `create`, `validate`). Booleans should use prefixes (`is_`, `has_`, `can_`, `should_`).
74+
- Missing error handling (bare `except`, swallowed exceptions, silent failures)
75+
- Incomplete type coverage (use strong typing, avoid `Any` where a concrete type is known)
76+
- Use Python 3.10+ union syntax (`X | Y` not `Union[X, Y]`, `X | None` not `Optional[X]`)
77+
- Use `TYPE_CHECKING` guard for imports only needed for type annotations (prevents circular imports)
78+
- Use `Annotated[Type, Depends(...)]` with project aliases (`CurrentActiveUser`, `DbSession`, `DbSessionReadOnly`) for FastAPI DI
79+
- Google-style docstrings (enforced by Ruff): `Args:`, `Returns:`, `Raises:` sections for public functions
80+
- Violations of SOLID principles
81+
- YAGNI violations (code that anticipates future needs without a present requirement)
82+
- Line length exceeding 120 characters (project Ruff config)
83+
- Comments that explain WHAT instead of WHY (comments should only explain reasoning, not restate code)
84+
- Commented-out code (use version control instead)
85+
- Boolean parameters that switch function behavior (split into two named functions instead)
86+
- Mutable shared state where immutable alternatives exist (prefer returning new objects over mutation)
87+
88+
### 4. File Structure Review
89+
90+
Check for:
91+
- Production files exceeding ~500 lines of code (excluding imports, types, and docstrings). Files above 600 lines are a red flag and should be split by responsibility. Why: Files above 500 lines have statistically higher defect rates and take longer to review. They signal multiple responsibilities (SRP violation). In Langflow, services like `DatabaseService` that grow beyond this limit should have their CRUD operations extracted to dedicated modules.
92+
- Test files exceeding ~1000 lines. Split by logical grouping if exceeded.
93+
- No more than 5 functions with different responsibilities in a single file (per AGENTS-example.md).
94+
- Each file has a single reason to exist and a single reason to change (SRP).
95+
- No generic file names: `utils.py`, `helpers.py`, `misc.py`, `common.py` as standalone files. Why: A file named `utils.py` becomes a dumping ground for unrelated functions. Within months it has 50+ functions covering formatting, validation, parsing, and HTTP calls — violating SRP. Each function group should be in a file named after its responsibility (`formatting.py`, `validation.py`).
96+
97+
### 5. Testing Review
98+
99+
Check for:
100+
- Missing test coverage for new code paths
101+
- Tests that don't test behavior (testing implementation details)
102+
- Flaky test patterns (time-dependent, order-dependent, external-service-dependent)
103+
- Proper use of `pytest.mark.asyncio` for async tests
104+
- Excessive mocking (prefer real integrations per project conventions)
105+
- Coverage target: 80% (minimum acceptable: 75%)
106+
- Test anti-patterns: The Liar (passes but doesn't verify claimed behavior), The Mirror (asserts exactly what code does), The Giant (50+ lines setup), The Mockery (tests only mock setup), The Inspector (coupled to implementation), The Chain Gang (depends on execution order), The Flaky (inconsistent results)
107+
108+
**Happy path tests are the foundation but are NOT enough.** Tests MUST also challenge the code to find real defects:
109+
110+
- **Unexpected inputs**: `None`, `""`, `[]`, `{}`, `0`, `-1`, `UUID("00000000-0000-0000-0000-000000000000")`
111+
- **Boundary values**: max length strings, exactly at the limit, one past the limit, zero items, max items
112+
- **Malformed data**: missing required fields, extra unexpected fields, wrong types, invalid formats
113+
- **Error states**: what happens when the database is down? When an external API returns 500? When the user doesn't exist?
114+
- **What should NOT happen**: verify that user A CANNOT access user B's flows. Verify that a deleted flow returns 404. Verify that invalid `endpoint_name` is rejected with 422.
115+
- **Error messages and types**: not just that it fails, but that it fails with the RIGHT exception and the RIGHT message
116+
- **Concurrency**: what happens when two requests try to update the same flow simultaneously?
117+
118+
**Write tests based on REQUIREMENTS/SPEC, not on what the source code currently does.** This is how you catch bugs where the code diverges from expected behavior.
119+
120+
**When a test fails:** first ask if the CODE is wrong, not the test. Do NOT silently change a failing assertion to match the current code without understanding WHY.
121+
122+
### 6. Observability Review
123+
124+
Check for:
125+
- Use the async logger from `lfx.log.logger` with `a`-prefixed methods (`adebug`, `ainfo`, `awarning`, `aerror`, `aexception`). Never use `print()` or stdlib `logging`.
126+
- Log at key decision points and boundaries, not inside tight loops
127+
- Include: operation name, relevant IDs, outcome (success/failure), duration if relevant
128+
- Correct log levels: ERROR (broken, needs attention), WARN (degraded but recoverable), INFO (significant events), DEBUG (diagnostic, off in prod)
129+
- **ZERO PII TOLERANCE**: Never log email addresses, user names, phone numbers, tokens, passwords. Only approved identifiers: `user_id`, `flow_id`, `session_id`
130+
- No `print()` statements — these go to production logs
131+
- Use `{e!s}` for string representation of exceptions in log messages
132+
133+
### 7. Pre-Commit Verification
134+
135+
For pending-change reviews, verify the author has run:
136+
- `make format_backend` (Ruff formatter) — inconsistent formatting creates noisy diffs that hide real changes in code review. Format first, review second.
137+
- `make lint` (MyPy type checking) — type errors caught at lint time are 10x cheaper to fix than runtime crashes in production. Langflow services use duck typing via `Service` base class; MyPy catches mismatches early.
138+
- `make unit_tests` (pytest) — a failing test means the change breaks existing behavior. Never merge with failing tests; investigate whether the code or the test is wrong.
139+
140+
## Required Output Format
141+
142+
When this skill is invoked, the response must exactly follow one of the two templates:
143+
144+
### Template A (any findings)
145+
146+
```markdown
147+
# Code Review Summary
148+
149+
Found <X> critical issues need to be fixed:
150+
151+
## 🔴 Critical (Must Fix)
152+
153+
### 1. <brief description of the issue>
154+
155+
FilePath: <path> line <line>
156+
<relevant code snippet or pointer>
157+
158+
#### Explanation
159+
160+
<detailed explanation and references of the issue>
161+
162+
#### Suggested Fix
163+
164+
1. <brief description of suggested fix>
165+
2. <code example> (optional, omit if not applicable)
166+
167+
---
168+
... (repeat for each critical issue) ...
169+
170+
Found <Y> suggestions for improvement:
171+
172+
## 🟡 Suggestions (Should Consider)
173+
174+
### 1. <brief description of the suggestion>
175+
176+
FilePath: <path> line <line>
177+
<relevant code snippet or pointer>
178+
179+
#### Explanation
180+
181+
<detailed explanation and references of the suggestion>
182+
183+
#### Suggested Fix
184+
185+
1. <brief description of suggested fix>
186+
2. <code example> (optional, omit if not applicable)
187+
188+
---
189+
... (repeat for each suggestion) ...
190+
191+
Found <Z> optional nits:
192+
193+
## 🟢 Nits (Optional)
194+
### 1. <brief description of the nit>
195+
196+
FilePath: <path> line <line>
197+
<relevant code snippet or pointer>
198+
199+
#### Explanation
200+
201+
<explanation and references of the optional nit>
202+
203+
#### Suggested Fix
204+
205+
- <minor suggestions>
206+
207+
---
208+
... (repeat for each nits) ...
209+
210+
## ✅ What's Good
211+
212+
- <Positive feedback on good patterns>
213+
```
214+
215+
- If there are no critical issues or suggestions or optional nits or good points, just omit that section.
216+
- If the issue number is more than 10, summarize as "Found 10+ critical issues/suggestions/optional nits" and only output the first 10 items.
217+
- Don't compress the blank lines between sections; keep them as-is for readability.
218+
- If there is any issue that requires code changes, append a brief follow-up question to ask whether the user wants to apply the fix(es) after the structured output. For example: "Would you like me to use the Suggested fix(es) to address these issues?"
219+
220+
### Template B (no issues)
221+
222+
```markdown
223+
## Code Review Summary
224+
✅ No issues found.
225+
```

0 commit comments

Comments
 (0)