Precedence between pratt and non-pratt combinators #923
The language I'm working on has list and map literals that are both delimited by square brackets:

The elements of list literals are arbitrary expressions. The elements of map literals are two arbitrary expressions separated by a colon.

However, I would like to change this syntax to use equals signs instead of colons, because I plan to use colons exclusively for future type ascription syntax. So instead it should be:

The problem with this is that

Any advice on how to structure this? Do I need to venture into context-sensitive parsing, or is there a way to make it work just by specifying precedence rules somehow?

I can show more real code if it would be helpful. Thank you!
Replies: 1 comment 2 replies
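This cannot be done with precedence because the operator for assigning will always instead be parsed as the operator for `eq`, assuming that they have the same symbol, because they are both `infix`. This means that we will have to use context-sensitive parsing.

There are a couple of different ways in Chumsky to do this. The first would be with `contextual` and the second with `validate`. For both examples, I will be using the following toy language:

It allows words like

Contextual
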
let expr = recursive(|expr| {
    let num =
        just::<_, _, chumsky::extra::Full<Simple<char>, (), ExprPlace>>('n').to(Expr::Num);
    let array = expr
        .clone()
        .with_ctx(ExprPlace::Normal)
        .separated_by(just(','))
        .delimited_by(just('['), just(']'))
        .to(Expr::Array);
    let map = expr
        .clone()
        .with_ctx(ExprPlace::MapKey)
        .then_ignore(just('='))
        .then(expr.clone())
        .separated_by(just(','))
        .delimited_by(just('['), just(']'))
        .to(Expr::Map);
    let parens = expr
        .with_ctx(ExprPlace::Normal)
        .delimited_by(just('('), just(')'))
        .to(Expr::Parens);
    choice((num, map, array, parens))
        .pratt((
            pratt::infix(pratt::left(2), just('+'), |_, _, _, _| Expr::Binary),
            pratt::infix(
                pratt::none(3),
                just('=')
                    .contextual()
                    .configure(|_, ctx| matches!(ctx, ExprPlace::Normal)),
                |_, _, _, _| Expr::Binary,
            ),
        ))
        .padded()
});

Now, this would successfully parse each of the words that I showed, but it has a few problems:

I personally am not a fan of accidentally forgetting

Validate

As you pointed out, your parser currently thinks of a map as an array of binary expressions. So, instead of trying to disallow this, what if we allowed it and then checked that our expectations hold? This condition is pretty cheap to check, as we walk the list at most twice (you'll see why in a sec). The benefit of this is that we avoid re-parsing any expressions. Converting the above parser to use `validate`:

let expr = recursive(|expr| {
    let num = just::<_, _, extra::Err<Rich<char>>>('n').to(Expr::Num);
    let array_or_map = expr
        .clone()
        .separated_by(just(','))
        .collect::<Vec<_>>()
        .delimited_by(just('['), just(']'))
        .validate(|exprs, ex, em| {
            let is_eq_expr = |expr: &Expr| expr.binary_op().is_some_and(|op| op == '=');
            if exprs.iter().all(is_eq_expr) {
                Expr::Map
            } else if exprs.iter().any(is_eq_expr) {
                // One of the items in the array is `expr '=' expr` which is a map-entry
                em.emit(Rich::custom(
                    ex.span(),
                    "Expressions in an array using `=` require parentheses",
                ));
                // The user clearly meant it to be an array, so we preserve that idea
                Expr::Array
            } else {
                Expr::Array
            }
        });
    let parens = expr.delimited_by(just('('), just(')')).to(Expr::Parens);
    choice((num, array_or_map, parens))
        .pratt((
            pratt::infix(pratt::left(2), just('+'), |_, op, _, _| Expr::Binary(op)),
            pratt::infix(pratt::none(3), just('='), |_, op, _, _| Expr::Binary(op)),
        ))
        .padded()
});

In addition to not re-parsing an expression multiple times, we get another important benefit: we can very cleanly handle the case where the user has an array containing expressions that look like map entries. And the only cost is that we walk the array twice (though obviously you could convert the two iterators into one manual for-loop). Additionally, if expressions knew their own spans (or you used

Concluding Thoughts

So, unless you have something against
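As an aside on the `validate` approach: the remark about folding the two iterator walks into one manual loop could be sketched roughly as below. This is only an illustration in plain Rust, with no Chumsky involved. The `Expr` here is a cut-down, hypothetical stand-in for the thread's type, and `Literal` and `classify` are names invented for this sketch. `binary_op()` mirrors the helper the parser above assumes exists.

```rust
/// Hypothetical, cut-down stand-in for the thread's `Expr` type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num,
    Binary(char),
}

impl Expr {
    /// Returns the operator of a binary expression, if this is one.
    fn binary_op(&self) -> Option<char> {
        match self {
            Expr::Binary(op) => Some(*op),
            Expr::Num => None,
        }
    }
}

/// Hypothetical result of classifying the items between `[` and `]`.
#[derive(Debug, PartialEq)]
enum Literal {
    Array,
    Map,
    /// An array in which some (but not all) items are `expr = expr`;
    /// this is the case where the parser above emits an error.
    ArrayWithEqItems,
}

/// Walks the items once, tracking what `all(..)` and `any(..)` would report.
fn classify(exprs: &[Expr]) -> Literal {
    let mut all_eq = true;
    let mut any_eq = false;
    for expr in exprs {
        if expr.binary_op() == Some('=') {
            any_eq = true;
        } else {
            all_eq = false;
        }
    }
    match (all_eq, any_eq) {
        // Like `Iterator::all`, this also holds for an empty list.
        (true, _) => Literal::Map,
        (false, true) => Literal::ArrayWithEqItems,
        (false, false) => Literal::Array,
    }
}
```

Inside the `validate` closure, the three outcomes would map onto the three branches shown earlier. Note that, as with the original `all`/`any` version, an empty `[]` is classified as a map; whether that is the desired meaning is a language design decision this sketch does not settle.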