
docs: advanced guides for conflict resolution and error handling (#191) #302

Open
halotukozak wants to merge 1 commit into master from issue-191-guides

Conversation

@halotukozak (Owner)

Summary

  • Add advanced guides for conflict resolution, contextual parsing and lexer error handling

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 4, 2026 14:49
@github-actions github-actions bot added the documentation (Improvements or additions to documentation) and error-handling labels Mar 4, 2026
@github-actions

github-actions bot commented Mar 4, 2026

🏃 Runtime Benchmark

Benchmark Base (master) Current (issue-191-guides) Diff

@codecov

codecov bot commented Mar 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff            @@
##             master     #302   +/-   ##
=========================================
  Coverage          ?   42.03%
=========================================
  Files             ?       35
  Lines             ?      433
  Branches          ?        0
=========================================
  Hits              ?      182
  Misses            ?      251
  Partials          ?        0
```

Contributor

Copilot AI left a comment


Pull request overview

This PR adds three new “advanced guides” to the Alpaca documentation, covering parser conflict resolution, contextual parsing via lexer/parser context, and strategies for lexer error handling.

Changes:

  • Add a conflict resolution guide explaining shift/reduce + reduce/reduce conflicts and Alpaca’s before/after DSL.
  • Add a contextual parsing guide describing LexerCtx, ParserCtx, and context-driven lexing patterns.
  • Add a lexer error handling guide describing catch-all token strategies and continuing after invalid input.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| docs/_docs/guides/lexer-error-handling.md | New guide for resilient lexing patterns (catch-all token, counting errors, ignoring invalid chars). |
| docs/_docs/guides/contextual-parsing.md | New guide for context-driven lexing/parsing and how state flows through lexer → lexemes → parser. |
| docs/_docs/guides/conflict-resolution.md | New guide describing conflict types and how to resolve them using Alpaca’s conflict resolution DSL. |


Comment on lines +38 to +50
```scala
case class ErrorCtx(
  var text: CharSequence = "",
  var errorCount: Int = 0
) extends LexerCtx

val myLexer = lexer[ErrorCtx]:
  case "[a-z]+" => Token["ID"]
  case "\\s+" => Token.Ignored

  case x @ "." =>
    ctx.errorCount += 1
    println(s"Error: Unexpected character '$x' at position ${ctx.position}")
    Token.Ignored // Skip the character
```

Copilot AI Mar 4, 2026


The ErrorCtx example logs ${ctx.position}, but position is not a member of LexerCtx unless the context mixes in PositionTracking (or uses LexerCtx.Default). Update the example context definition accordingly so it compiles and matches the described behavior.
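A minimal fix along the lines the reviewer suggests — a sketch only, assuming the `LexerCtx` and `PositionTracking` traits described in the guide — would be:

```scala
// Sketch, not verified against the Alpaca sources: mixing in
// PositionTracking gives the context a `position` member, so the
// ${ctx.position} interpolation in the example compiles.
case class ErrorCtx(
  var text: CharSequence = "",
  var errorCount: Int = 0
) extends LexerCtx, PositionTracking
```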

Copilot uses AI. Check for mistakes.
Comment on lines +23 to +26
```scala
case "\(" =>
  ctx.stack.push("paren")
  Token["("]
case "\)" =>
```

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lexer patterns use "\(" and "\)" (single backslash). In Scala string literals \( / \) are invalid escape sequences; if the intent is to match literal parentheses in a regex, the strings should be escaped as "\\(" and "\\)" (or written using triple-quoted strings).

Suggested change
```diff
-case "\(" =>
-  ctx.stack.push("paren")
-  Token["("]
-case "\)" =>
+case "\\(" =>
+  ctx.stack.push("paren")
+  Token["("]
+case "\\)" =>
```
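For readers unfamiliar with Scala escaping, the point can be checked in a REPL with plain Scala (no Alpaca dependency):

```scala
// "\\(" is a two-character string — backslash, ( — i.e. the regex
// source \(, which matches a literal opening parenthesis.
val lparen = "\\("
assert(lparen.length == 2)
assert("(".matches(lparen))
```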

Comment on lines +44 to +45
```scala
case x @ "
+" =>
```

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation lexer example has a broken multi-line string literal for the newline+spaces pattern (case x @ " on one line and +" on the next). As written, this is not valid Scala and will be confusing to readers; represent the pattern as a valid single-line string (e.g., using \n and escaped backslashes) or a properly delimited triple-quoted string.

Suggested change
```diff
-case x @ "
-+" =>
+case x @ "\\n +" =>
```
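The suggested single-line form can be verified with plain Scala (no Alpaca dependency):

```scala
// "\\n +" is the regex source \n + : a newline followed by
// one or more spaces.
val indent = "\\n +"
assert("\n   ".matches(indent))
assert(!"abc".matches(indent))
```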

Comment on lines +68 to +70
```scala
// id is a Lexeme, which has a .fields property
// fields contains all members of your LexerCtx
println(s"Matched ID at line ${id.fields.line}")
```

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section states that a Lexeme has a .fields property and shows id.fields.line, but fields is not publicly accessible on alpaca.internal.lexer.Lexeme (it’s private[alpaca]). Readers should access captured context fields via the lexeme’s dynamic members (e.g., id.line, id.position, id.text) or whatever the intended public API is.

Suggested change
```diff
-// id is a Lexeme, which has a .fields property
-// fields contains all members of your LexerCtx
-println(s"Matched ID at line ${id.fields.line}")
+// id is a Lexeme; captured context fields are exposed as dynamic members
+// e.g. if your LexerCtx has a `line` field, you can access it as `id.line`
+println(s"Matched ID at line ${id.line}")
```

Comment on lines +102 to +107
```scala
case """ =>
  ctx.inString = !ctx.inString
  Token["QUOTE"]

case "[a-z]+" if !ctx.inString => Token["KEYWORD"]
case "[^"]+" if ctx.inString => Token["STRING_CONTENT"]
```

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mode-switching example has invalid Scala string literals: case """ => is an unterminated triple-quoted string, and the regex pattern "[^"]+" contains an unescaped quote. Please rewrite these patterns using valid Scala literals (often easiest with properly delimited triple-quoted strings) so the example can be copied verbatim.

Suggested change
```diff
-case """ =>
-  ctx.inString = !ctx.inString
-  Token["QUOTE"]
-
-case "[a-z]+" if !ctx.inString => Token["KEYWORD"]
-case "[^"]+" if ctx.inString => Token["STRING_CONTENT"]
+case "\"" =>
+  ctx.inString = !ctx.inString
+  Token["QUOTE"]
+
+case "[a-z]+" if !ctx.inString => Token["KEYWORD"]
+case """[^"]+""" if ctx.inString => Token["STRING_CONTENT"]
```
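Both suggested literals can be checked in a REPL with plain Scala (no Alpaca dependency):

```scala
// "\"" is a one-character string containing a double quote;
// """[^"]+""" is the regex source [^"]+ written without escaping.
val quote = "\""
val content = """[^"]+"""
assert(quote.length == 1)
assert("hello".matches(content))
assert(!"say \"hi\"".matches(content)) // contains quotes, so no match
```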

Comment on lines +110 to +125
## 5. The `BetweenStages` Hook

The `BetweenStages` hook is the internal engine that powers context updates. It is a function called by Alpaca after **every** token match (including `Token.Ignored`) but **before** the next match starts.

### Automatic Updates
By default, Alpaca uses `BetweenStages` to automatically update the `text` field in your context. If your context extends `LineTracking` or `PositionTracking`, it also increments `line` and `position` counters.

### Customizing `BetweenStages`
If you need complex logic to run after every match, you can provide a custom `given` instance of `BetweenStages`.

```scala
given MyBetweenStages: BetweenStages[MyCtx] with
  def apply(token: Token[?, MyCtx, ?], matcher: Matcher, ctx: MyCtx): Unit =
    // Custom global logic
    println(s"Just matched ${token.info.name}")
```

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guide suggests customizing BetweenStages via a user-provided given, but BetweenStages is currently declared private[alpaca] (see src/alpaca/internal/lexer/BetweenStages.scala), so downstream users can’t reference or implement it. Either expose BetweenStages as part of the public API (or provide a public hook) or adjust the documentation to reflect the supported customization mechanisms (e.g., mixing in LineTracking/PositionTracking).

```scala
val resilientLexer = lexer:
  case "[0-9]+" => Token["NUM"]
  case "\s+" => Token.Ignored
```

Copilot AI Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the resilient lexer example, the whitespace regex is written as "\s+" (single backslash). In a Scala string literal this is an invalid escape sequence; use "\\s+" (or a triple-quoted string) to represent the \s+ regex correctly.

Suggested change
```diff
-case "\s+" => Token.Ignored
+case "\\s+" => Token.Ignored
```
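The escaping fix can be confirmed with plain Scala (no Alpaca dependency); a triple-quoted string is an equivalent alternative:

```scala
// "\\s+" and the triple-quoted """\s+""" denote the same regex source \s+.
val escaped = "\\s+"
val tripleQuoted = """\s+"""
assert(escaped == tripleQuoted)
assert("   \t".matches(escaped)) // matches a run of whitespace
```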

@github-actions

github-actions bot commented Mar 4, 2026

📊 Test Compilation Benchmark

| Branch | Average Time |
| --- | --- |
| Base (master) | 48.715s |
| Current (issue-191-guides) | 52.942s |

Result: Current branch is 4.227s slower (8.68%) ⚠️


Labels

documentation (Improvements or additions to documentation), error-handling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants