Add grammar theory pages (CFG, why-LR, shift-reduce) by halotukozak · Pull Request #262 · halotukozak/alpaca

halotukozak · 2026-02-20T23:40:16Z

Summary

theory/cfg.md — Context-Free Grammars: formal 4-tuple definition G = (V, Σ, R, S), calculator grammar in BNF (6 Expr productions), leftmost derivation of 1 + 2 with ⇒ steps, ASCII parse tree, Alpaca DSL mapping with sc:nocompile CalcParser block, ambiguity discussion → conflict-resolution.md
theory/why-lr.md — Why LR?: LL infinite-loop trace on left-recursive grammar, LR family comparison table (LR(0)/SLR/LALR/LR(1)), Alpaca correctly identified as full LR(1) (verified from ParseTable.scala docstring + Item.scala per-item lookahead), formal LR(1) item definition [A → α • β, a]
theory/shift-reduce.md — Shift-Reduce Parsing: parse stack as (stateIndex, node) pairs, 8-step parse trace for 1 + 2, formal LR parse configuration definition, connection to Alpaca's loop() in Parser.scala

Part of the v1.1 Compiler Theory Tutorial milestone — Phase 9: Grammar Theory (TH-04, TH-05, TH-06).

Test plan

./mill docJar passes (all examples compile)
theory/cfg.md contains formal 4-tuple definition and leftmost derivation
theory/why-lr.md says "full LR(1)" — zero occurrences of "LALR" for Alpaca
theory/shift-reduce.md contains 8-row parse trace table with Stack | Input | Action columns
All pages have > **Compile-time processing:** callout and cross-links

🤖 Generated with Claude Code

…finition
- Top-down vs bottom-up parsing approaches
- Left recursion infinite-loop trace showing LL failure
- LR family comparison table: LR(0), SLR(1), LALR(1), LR(1) with Alpaca marked as LR(1)
- Why LR(1) vs LALR(1) section grounded in Item.scala/ParseTable.scala source
- LR(1) item formal definition using [A → α • β, a] dot notation with examples
- O(n) parsing paragraph
- Compile-time callout in established blockquote format
- Cross-links to cfg.md, shift-reduce.md, ../conflict-resolution.md, ../parser.md
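The left-recursion failure listed above can be sketched in a few lines of Python (used here for illustration only; Alpaca itself is Scala). The grammar `Expr -> Expr '+' NUMBER | NUMBER`, the function `parse_expr`, and the depth guard are assumptions made for this sketch and are not taken from why-lr.md. A naive top-down parser tries the first alternative, whose first symbol is `Expr` again, so it re-enters itself at the same input position without consuming a token.

```python
class LeftRecursionDetected(Exception):
    """Raised by the depth guard instead of recursing forever."""


def parse_expr(tokens, pos=0, depth=0, max_depth=50):
    # Hypothetical grammar, tried top-down in order:
    #   Expr -> Expr '+' NUMBER | NUMBER
    if depth > max_depth:
        raise LeftRecursionDetected(
            f"re-entered Expr at position {pos} more than {max_depth} times "
            "without consuming input"
        )
    # Alternative 1 starts with Expr itself, so the parser recurses
    # at the SAME position before any token has been consumed.
    left, pos = parse_expr(tokens, pos, depth + 1, max_depth)
    # Never reached: the call above can only end via the depth guard.
    if pos < len(tokens) and tokens[pos] == "+":
        return ("add", left, tokens[pos + 1]), pos + 2
    return left, pos


try:
    parse_expr(["1", "+", "2"])
except LeftRecursionDetected as err:
    print("LL parser failed:", err)
```

Without the guard, the same call would end in Python's `RecursionError`; a bottom-up parser avoids the problem because it only commits to `Expr -> Expr '+' NUMBER` after the whole right-hand side is already on the stack.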

…rmal configuration
- Parse stack explanation: (stateIndex, node) pairs from Parser.scala
- Parse tables section: parse table + action table with separation of concerns
- Simplified 3-production grammar block for trace clarity
- 8-row parse trace table for '1 + 2' with Stack | Remaining input | Action columns
- Annotation notes for steps 1, 2, 6, 7, 8
- Disclaimer that state numbers are illustrative for simplified grammar
- 3 LR(1) item examples with dot notation from Item.scala
- LR parse configuration formal definition in blockquote format
- Connection to Alpaca runtime loop() function prose reference
- O(n) loop termination paragraph
- Compile-time callout in established blockquote format
- Cross-links to why-lr.md, cfg.md, ../conflict-resolution.md, ../parser.md, pipeline.md
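As a rough illustration of the shift-reduce loop over a stack of (state, node) pairs, here is a minimal table-driven sketch in Python. Everything in it is an assumption made for this example: the three-production grammar, the hand-built `ACTION` and `GOTO` tables, the state numbers, and the tuple-shaped nodes. It is not the grammar or trace from shift-reduce.md, and it is not Alpaca's `loop()` from Parser.scala.

```python
# Hypothetical grammar (state numbers are illustrative only):
#   0: S' -> E      1: E -> E + NUM      2: E -> NUM
PRODUCTIONS = {
    # production index: (left-hand side, right-hand-side length, node builder)
    1: ("E", 3, lambda kids: ("add", kids[0], kids[2])),
    2: ("E", 1, lambda kids: kids[0]),
}
ACTION = {
    (0, "NUM"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2), (2, "$"): ("reduce", 2),
    (3, "NUM"): ("shift", 4),
    (4, "+"): ("reduce", 1), (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}


def parse(tokens):
    """tokens is a list of (kind, value) pairs; returns (tree, trace)."""
    stack = [(0, None)]                 # (state, node) pairs, start state 0
    tokens = tokens + [("$", None)]     # end-of-input marker
    pos, trace = 0, []
    while True:
        state = stack[-1][0]
        kind, value = tokens[pos]
        action = ACTION.get((state, kind))
        if action is None:
            raise SyntaxError(f"unexpected {kind} in state {state}")
        op, arg = action
        if op == "shift":
            stack.append((arg, value))  # push the token and move to state arg
            pos += 1
            trace.append(f"shift {kind}")
        elif op == "reduce":
            lhs, size, build = PRODUCTIONS[arg]
            kids = [node for _, node in stack[-size:]]
            del stack[-size:]           # pop the right-hand side
            stack.append((GOTO[(stack[-1][0], lhs)], build(kids)))
            trace.append(f"reduce by production {arg}")
        else:                           # accept
            trace.append("accept")
            return stack[-1][1], trace


tree, trace = parse([("NUM", 1), ("+", "+"), ("NUM", 2)])
print(tree)
for step in trace:
    print(step)
```

Each iteration performs one table lookup and then either consumes a token or replaces the top of the stack. The number of steps this sketch records for `1 + 2` need not match the 8-row table in the page, which uses its own grammar and row conventions.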

- Formal CFG 4-tuple definition (V, Σ, R, S) in blockquote format
- 7-production CalcParser BNF grammar (6 Expr productions + root)
- Leftmost derivation for 1 + 2 with ⇒ steps
- ASCII parse tree for 1 + 2
- CalcParser Alpaca DSL block annotated with sc:nocompile
- Compile-time callout in established blockquote format
- Cross-links to tokens.md, why-lr.md, ../parser.md, ../conflict-resolution.md
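A leftmost derivation of the kind described above can be reproduced mechanically. The sketch below is in Python and assumes a two-production fragment (`Expr -> Expr + Expr` and `Expr -> NUMBER`); the production names `plus` and `num` and the helper `leftmost_derivation` are hypothetical, and the full 7-production CalcParser grammar from cfg.md is not reproduced here.

```python
# Hypothetical fragment of a calculator grammar, keyed by an illustrative name.
PRODUCTIONS = {
    "plus": ("Expr", ["Expr", "+", "Expr"]),
    "num": ("Expr", ["NUMBER"]),
}
NONTERMINALS = {"Expr"}


def leftmost_derivation(start, choices):
    """Apply each chosen production to the leftmost non-terminal.

    Returns every sentential form, starting with [start].
    """
    form = [start]
    steps = [list(form)]
    for name in choices:
        lhs, rhs = PRODUCTIONS[name]
        # Index of the leftmost non-terminal in the current sentential form.
        idx = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[idx] != lhs:
            raise ValueError(f"production {name!r} does not rewrite {form[idx]}")
        form[idx:idx + 1] = rhs         # replace the non-terminal in place
        steps.append(list(form))
    return steps


steps = leftmost_derivation("Expr", ["plus", "num", "num"])
print(" ⇒ ".join(" ".join(form) for form in steps))
```

The derivation stops at the token level (`NUMBER + NUMBER`); mapping `NUMBER` to the concrete lexemes `1` and `2` is the lexer's job and is left out of this sketch.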

Copilot

Pull request overview

This PR adds three comprehensive theory documentation pages explaining the fundamentals of parsing with context-free grammars, LR parsing, and the shift-reduce algorithm. These pages form part of the v1.1 Compiler Theory Tutorial (Phase 9: Grammar Theory).

Changes:

Added formal CFG definition with calculator grammar example, derivation trace, and mapping to Alpaca DSL
Added explanation of why LR parsing handles left-recursive grammars better than LL, with LR family comparison table and LR(1) item definition
Added shift-reduce parsing explanation with detailed 8-step parse trace and connection to Alpaca's runtime implementation

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| docs/_docs/theory/cfg.md | Introduces context-free grammars with formal 4-tuple definition, BNF notation, calculator grammar example, leftmost derivation trace, and DSL mapping |
| docs/_docs/theory/why-lr.md | Explains LL vs LR parsing, left-recursion problems, LR family comparison, and why Alpaca uses full LR(1) with source code references |
| docs/_docs/theory/shift-reduce.md | Details the shift-reduce loop with parse stack structure, 8-step trace table, LR(1) item lookahead mechanics, and runtime connection |

Copilot · 2026-02-20T23:45:32Z

docs/_docs/theory/why-lr.md

+|-----------|-------------------|-------------|-------|
+| LR(0) | None (reduce always) | Smallest | Too weak for most real grammars |
+| SLR(1) | FOLLOW sets (global per non-terminal) | Same as LR(0) | Better, still limited |
+| LALR(1) | Per-state lookahead (merged item-set cores) | Same as LR(0)/SLR | Most common in practice (yacc, Bison, ANTLR) |


ANTLR uses LL(*) parsing, not LALR(1). ANTLR is a top-down parser generator with dynamic lookahead, while LALR(1) is a bottom-up parsing technique. This entry should be removed from the "Most common in practice" notes for LALR(1).

Suggested change

-| LALR(1) | Per-state lookahead (merged item-set cores) | Same as LR(0)/SLR | Most common in practice (yacc, Bison, ANTLR) |
+| LALR(1) | Per-state lookahead (merged item-set cores) | Same as LR(0)/SLR | Most common in practice (yacc, Bison) |

halotukozak added 3 commits February 21, 2026 00:27

Copilot AI review requested due to automatic review settings February 20, 2026 23:40

github-actions bot added the documentation and Parser labels Feb 20, 2026

Copilot started reviewing on behalf of halotukozak February 20, 2026 23:40

Copilot AI reviewed Feb 20, 2026

halotukozak wants to merge 3 commits into theory-foundation from grammar-theory
