Add grammar theory pages (CFG, why-LR, shift-reduce) #262

Open
halotukozak wants to merge 3 commits into theory-foundation from grammar-theory
@halotukozak
Owner

Summary

  • theory/cfg.md — Context-Free Grammars: formal 4-tuple definition G = (V, Σ, R, S), calculator grammar in BNF (6 Expr productions), leftmost derivation of 1 + 2 with ⇒ steps, ASCII parse tree, Alpaca DSL mapping with sc:nocompile CalcParser block, ambiguity discussion → conflict-resolution.md
  • theory/why-lr.md — Why LR?: LL infinite-loop trace on left-recursive grammar, LR family comparison table (LR(0)/SLR/LALR/LR(1)), Alpaca correctly identified as full LR(1) (verified from ParseTable.scala docstring + Item.scala per-item lookahead), formal LR(1) item definition [A → α • β, a]
  • theory/shift-reduce.md — Shift-Reduce Parsing: parse stack as (stateIndex, node) pairs, 8-step parse trace for 1 + 2, formal LR parse configuration definition, connection to Alpaca's loop() in Parser.scala
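
As a rough illustration of the shift-reduce loop that the third page describes, the sketch below parses the token sequence for `1 + 2` with an explicit stack. It is only a sketch under stated assumptions: the three-production grammar is hypothetical (the PR's simplified grammar is not reproduced on this page), the stack holds bare symbols rather than Alpaca's (stateIndex, node) pairs, and reductions are chosen by matching the longest right-hand side on top of the stack instead of consulting an LR(1) parse table. The function name `shift_reduce` is made up for this example and is not part of Alpaca.

```python
# Hypothetical 3-production grammar (an assumption, not copied from the PR):
#   Root -> Expr
#   Expr -> Expr + NUM
#   Expr -> NUM
PRODUCTIONS = [
    ("Expr", ("Expr", "+", "NUM")),  # longest right-hand side is tried first
    ("Expr", ("NUM",)),
]


def shift_reduce(tokens):
    """Return (stack, remaining input, action) rows for the given token kinds."""
    stack, pos, trace = [], 0, []
    while True:
        remaining = tokens[pos:]
        for lhs, rhs in PRODUCTIONS:
            # Reduce when the top of the stack spells out a right-hand side.
            if tuple(stack[-len(rhs):]) == rhs:
                trace.append((list(stack), remaining, f"reduce {lhs} -> {' '.join(rhs)}"))
                del stack[-len(rhs):]
                stack.append(lhs)
                break
        else:
            if pos < len(tokens):
                # No reduction applies: shift the next token onto the stack.
                trace.append((list(stack), remaining, f"shift {tokens[pos]}"))
                stack.append(tokens[pos])
                pos += 1
            elif stack == ["Expr"]:
                # Input exhausted and only the start symbol's body is left.
                trace.append((list(stack), remaining, "accept (Root -> Expr)"))
                return trace
            else:
                raise SyntaxError(f"cannot parse, stack={stack}")
```

Calling `shift_reduce(["NUM", "+", "NUM"])` yields one row per step, which can be printed as a Stack | Remaining input | Action table. A table-driven LR(1) parser makes the same shift and reduce moves but decides between them from the current state and the lookahead token, so the number of rows here need not match the trace in the PR.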

Part of the v1.1 Compiler Theory Tutorial milestone — Phase 9: Grammar Theory (TH-04, TH-05, TH-06).

Test plan

  • ./mill docJar passes (all examples compile)
  • theory/cfg.md contains formal 4-tuple definition and leftmost derivation
  • theory/why-lr.md says "full LR(1)" — zero occurrences of "LALR" for Alpaca
  • theory/shift-reduce.md contains 8-row parse trace table with Stack | Input | Action columns
  • All pages have > **Compile-time processing:** callout and cross-links

🤖 Generated with Claude Code

…finition

- Top-down vs bottom-up parsing approaches
- Left recursion infinite-loop trace showing LL failure
- LR family comparison table: LR(0), SLR(1), LALR(1), LR(1) with Alpaca marked as LR(1)
- Why LR(1) vs LALR(1) section grounded in Item.scala/ParseTable.scala source
- LR(1) item formal definition using [A → α • β, a] dot notation with examples
- O(n) parsing paragraph
- Compile-time callout in established blockquote format
- Cross-links to cfg.md, shift-reduce.md, ../conflict-resolution.md, ../parser.md
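
The left-recursion failure mentioned in the list above can be reproduced in a few lines. This is a minimal sketch, assuming a hypothetical grammar `Expr -> Expr + NUM | NUM` and a naive recursive-descent function; `parse_expr` and `recursion_overflows` are illustrative names, not code from the PR or from Alpaca.

```python
# Hypothetical left-recursive grammar (assumed for illustration):
#   Expr -> Expr + NUM | NUM


def parse_expr(tokens, pos=0):
    """Return the index just past an Expr starting at `pos`, or None on failure."""
    # Alternative 1: Expr -> Expr + NUM.
    # The first symbol is Expr itself, so the parser calls itself at the same
    # position before it has consumed a single token.
    end = parse_expr(tokens, pos)
    if end is not None and tokens[end:end + 2] == ["+", "NUM"]:
        return end + 2
    # Alternative 2: Expr -> NUM. Never reached, because the call above
    # never returns normally.
    if tokens[pos:pos + 1] == ["NUM"]:
        return pos + 1
    return None


def recursion_overflows(tokens):
    """Report whether the naive top-down parser exhausts the call stack."""
    try:
        parse_expr(tokens)
    except RecursionError:
        return True
    return False
```

In CPython the unbounded recursion surfaces as a `RecursionError`, so `recursion_overflows(["NUM", "+", "NUM"])` returns `True`. A bottom-up parser does not expand `Expr` before reading input, which is why the same grammar poses no problem for the LR family.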
…rmal configuration

- Parse stack explanation: (stateIndex, node) pairs from Parser.scala
- Parse tables section: parse table + action table with separation of concerns
- Simplified 3-production grammar block for trace clarity
- 8-row parse trace table for '1 + 2' with Stack | Remaining input | Action columns
- Annotation notes for steps 1, 2, 6, 7, 8
- Disclaimer that state numbers are illustrative for simplified grammar
- 3 LR(1) item examples with dot notation from Item.scala
- LR parse configuration formal definition in blockquote format
- Connection to Alpaca runtime loop() function prose reference
- O(n) loop termination paragraph
- Compile-time callout in established blockquote format
- Cross-links to why-lr.md, cfg.md, ../conflict-resolution.md, ../parser.md, pipeline.md
- Formal CFG 4-tuple definition (V, Σ, R, S) in blockquote format
- 7-production CalcParser BNF grammar (6 Expr productions + root)
- Leftmost derivation for 1 + 2 with ⇒ steps
- ASCII parse tree for 1 + 2
- CalcParser Alpaca DSL block annotated with sc:nocompile
- Compile-time callout in established blockquote format
- Cross-links to tokens.md, why-lr.md, ../parser.md, ../conflict-resolution.md
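
To make the 4-tuple and the leftmost derivation concrete, here is a small sketch. It assumes a hypothetical two-rule fragment (`Expr -> Expr + Expr | NUM`) rather than the full CalcParser grammar, which is not reproduced on this page; the `Grammar` class and `leftmost_derivation` function are illustrative names and not part of Alpaca.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # V
    terminals: frozenset     # Σ
    rules: tuple             # R, as (lhs, rhs) pairs
    start: str               # S


# Hypothetical two-rule fragment; the PR's calculator grammar has more productions.
G = Grammar(
    nonterminals=frozenset({"Expr"}),
    terminals=frozenset({"NUM", "+"}),
    rules=(("Expr", ("Expr", "+", "Expr")), ("Expr", ("NUM",))),
    start="Expr",
)


def leftmost_derivation(grammar, rule_indices):
    """Apply the chosen rules to the leftmost non-terminal, one step at a time."""
    form = [grammar.start]
    steps = [" ".join(form)]
    for index in rule_indices:
        lhs, rhs = grammar.rules[index]
        # Position of the leftmost non-terminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in grammar.nonterminals)
        if form[i] != lhs:
            raise ValueError(f"rule {index} does not rewrite {form[i]}")
        form[i:i + 1] = rhs
        steps.append(" ".join(form))
    return steps
```

With this fragment, `" ⇒ ".join(leftmost_derivation(G, [0, 1, 1]))` gives `Expr ⇒ Expr + Expr ⇒ NUM + Expr ⇒ NUM + NUM`, the derivation for an input such as `1 + 2` once both numbers are lexed as `NUM`.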
Copilot AI review requested due to automatic review settings February 20, 2026 23:40
@github-actions github-actions bot added documentation Improvements or additions to documentation Parser labels Feb 20, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds three comprehensive theory documentation pages explaining the fundamentals of parsing with context-free grammars, LR parsing, and the shift-reduce algorithm. These pages form part of the v1.1 Compiler Theory Tutorial (Phase 9: Grammar Theory).

Changes:

  • Added formal CFG definition with calculator grammar example, derivation trace, and mapping to Alpaca DSL
  • Added explanation of why LR parsing handles left-recursive grammars better than LL, with LR family comparison table and LR(1) item definition
  • Added shift-reduce parsing explanation with detailed 8-step parse trace and connection to Alpaca's runtime implementation

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| docs/_docs/theory/cfg.md | Introduces context-free grammars with formal 4-tuple definition, BNF notation, calculator grammar example, leftmost derivation trace, and DSL mapping |
| docs/_docs/theory/why-lr.md | Explains LL vs LR parsing, left-recursion problems, LR family comparison, and why Alpaca uses full LR(1) with source code references |
| docs/_docs/theory/shift-reduce.md | Details the shift-reduce loop with parse stack structure, 8-step trace table, LR(1) item lookahead mechanics, and runtime connection |

|-----------|-------------------|-------------|-------|
| LR(0) | None (reduce always) | Smallest | Too weak for most real grammars |
| SLR(1) | FOLLOW sets (global per non-terminal) | Same as LR(0) | Better, still limited |
| LALR(1) | Per-state lookahead (merged item-set cores) | Same as LR(0)/SLR | Most common in practice (yacc, Bison, ANTLR) |
Copilot AI Feb 20, 2026

ANTLR uses LL(*) parsing (adaptive LL(*), or ALL(*), as of ANTLR 4), not LALR(1). ANTLR is a top-down parser generator with dynamic lookahead, whereas LALR(1) is a bottom-up technique. ANTLR should be removed from the "Most common in practice" note in the LALR(1) row.

Suggested change
| LALR(1) | Per-state lookahead (merged item-set cores) | Same as LR(0)/SLR | Most common in practice (yacc, Bison, ANTLR) |
| LALR(1) | Per-state lookahead (merged item-set cores) | Same as LR(0)/SLR | Most common in practice (yacc, Bison) |
