Precedence between pratt and non-pratt combinators #923

jimmycuadra · 2025-12-02T02:24:35Z

The language I'm working on has list and map literals that are both delimited by square brackets:

["a", "b", "c"] // list literal
["a": "apple", "b": "banana", "c": "carrot"] // map literal

The elements of list literals are arbitrary expressions. The elements of map literals are two arbitrary expressions separated by a colon. However, I would like to change this syntax to use equals signs instead of colons, because I plan to use colons exclusively for future type ascription syntax. So instead it should be:

["a" = "apple", "b" = "banana", "c" = "carrot"]

The problem with this is that expr = expr is parsed as a binary assignment operation (implemented with the pratt combinator), so my parser just sees this as a list of assignment expressions. I can associate a numeric precedence for assignment using pratt, but the parser for map literals is a non-pratt combinator that comes first, and I'm not sure how to specify the precedence here. Ideally, I would like the above form to be seen as a map literal, and if the user really wants a list of assignment expressions, they should have to disambiguate it with parentheses:

[("a" = "apple"), ("b" = "banana"), ("c" = "carrot")]

Any advice on how to structure this? Do I need to venture into context-sensitive parsing or is there a way to make it work just by specifying precedence rules somehow? I can show more real code if it would be helpful.

Thank you!

Answered by Zij-IT

Zij-IT · 2025-12-02T10:48:46Z

This cannot be done with precedence alone. The = that separates a map key from its value uses the same symbol as the binary eq operator, and both are infix, so the pratt parser will always consume it as the binary operator. That means we have to use context-sensitive parsing.

There are a couple of different ways to do this in Chumsky: the first uses contextual, the second uses validate. For both examples, I will use the following toy language:

PROGRAM => EXPR
EXPR    => NUM
EXPR    => MAP
EXPR    => ARRAY
EXPR    => '(' EXPR ')'
EXPR    => EXPR '+' EXPR
EXPR    => EXPR '=' EXPR
ARRAY   => '[' (EXPR (',' EXPR)*)? ']'
MAP     => '[' (ENTRY (',' ENTRY)*)? ']'
ENTRY   => EXPR '=' EXPR
NUM     => 'n'

It accepts words such as n, n + n, n + [n] + [[n] = n], n = n, and [(n = n)].
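To make the grammar concrete, here is a minimal hand-written recognizer for it in plain Rust. It is only a sketch and not chumsky code: the names Recognizer and accepts are made up for illustration, whitespace handling is an assumption, and it ignores precedence and associativity entirely. Note that it needs no separate MAP rule, because ENTRY is already covered by EXPR '=' EXPR, which is exactly the ambiguity discussed in this thread.

```rust
/// Hypothetical recognizer for the toy grammar (illustration only, not chumsky).
struct Recognizer<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Recognizer<'a> {
    /// Skips spaces, which this sketch assumes are allowed between tokens.
    fn skip_ws(&mut self) {
        while self.src.get(self.pos) == Some(&b' ') {
            self.pos += 1;
        }
    }

    /// Consumes `b` if it is the next non-space byte.
    fn eat(&mut self, b: u8) -> bool {
        self.skip_ws();
        if self.src.get(self.pos) == Some(&b) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// EXPR => ATOM (('+' | '=') ATOM)*
    fn expr(&mut self) -> bool {
        if !self.atom() {
            return false;
        }
        while self.eat(b'+') || self.eat(b'=') {
            if !self.atom() {
                return false;
            }
        }
        true
    }

    /// ATOM => 'n' | '(' EXPR ')' | '[' (EXPR (',' EXPR)*)? ']'
    /// A map entry is just an EXPR that happens to use '=', so ARRAY covers MAP.
    fn atom(&mut self) -> bool {
        if self.eat(b'n') {
            return true;
        }
        if self.eat(b'(') {
            return self.expr() && self.eat(b')');
        }
        if self.eat(b'[') {
            if self.eat(b']') {
                return true;
            }
            loop {
                if !self.expr() {
                    return false;
                }
                if !self.eat(b',') {
                    break;
                }
            }
            return self.eat(b']');
        }
        false
    }
}

/// Returns true if the whole input is a word of the toy language.
fn accepts(input: &str) -> bool {
    let mut r = Recognizer { src: input.as_bytes(), pos: 0 };
    r.expr() && {
        r.skip_ws();
        r.pos == r.src.len()
    }
}
```

Calling accepts on each of the words listed above should return true, while unbalanced input such as "[n" is rejected.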

Contextual

contextual lets us enable or disable a parser based on the context it is run in. We could add an enum, ExprPlace, that records where an expression appears and use it to control whether the eq operator is allowed at all. For the toy language above, that looks like this:

let expr = recursive(|expr| {
    let num =
        just::<_, _, chumsky::extra::Full<Simple<char>, (), ExprPlace>>('n').to(Expr::Num);

    let array = expr
        .clone()
        .with_ctx(ExprPlace::Normal)
        .separated_by(just(','))
        .delimited_by(just('['), just(']'))
        .to(Expr::Array);

    let map = expr
        .clone()
        .with_ctx(ExprPlace::MapKey)
        .then_ignore(just('='))
        .then(expr.clone())
        .separated_by(just(','))
        .delimited_by(just('['), just(']'))
        .to(Expr::Map);

    let parens = expr
        .with_ctx(ExprPlace::Normal)
        .delimited_by(just('('), just(')'))
        .to(Expr::Parens);

    choice((num, map, array, parens))
        .pratt((
            pratt::infix(pratt::left(2), just('+'), |_, _, _, _| Expr::Binary),
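            // The `=` operator is only enabled when the surrounding context is
            // `ExprPlace::Normal`; inside a map key it is switched off.
            pratt::infix(
                pratt::none(3),
                just('=')
                    .contextual()
                    .configure(|_, ctx| matches!(ctx, ExprPlace::Normal)),
                |_, _, _, _| Expr::Binary,
            ),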
        ))
        .padded()
});
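The snippet above uses ExprPlace without defining it. A minimal sketch of what it might look like follows; only the two variant names come from the snippet, while the derives, the choice of Normal as the default, and the eq_allowed helper are assumptions made for illustration.

```rust
/// Where an expression is being parsed. Only the variant names appear in the
/// snippet above; everything else here is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ExprPlace {
    /// Anywhere `=` may act as a binary operator.
    /// Assumed to be the default so that top-level parsing starts here.
    #[default]
    Normal,
    /// The key position of a map entry, where `=` must be left for the entry.
    MapKey,
}

/// Hypothetical helper mirroring the `configure` closure in the snippet:
/// the `=` operator is only enabled in the `Normal` context.
fn eq_allowed(ctx: &ExprPlace) -> bool {
    matches!(ctx, ExprPlace::Normal)
}
```

Keeping the check in a named function like this is optional; the closure in the snippet does the same thing inline.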

This parser successfully parses each of the words that I showed, but it has a few problems:

Every use of expr inside the recursive closure has to be tagged with a with_ctx call so that the correct ExprPlace is used. Otherwise it is easy to accidentally forget to enable eq expressions somewhere.
The expression that could be a map key is parsed twice whenever the brackets turn out to hold an array. In [[n + n + [] + [[[[[]]]]]], n], the first element is parsed once while attempting a map-entry key, and then again as an array element once the map attempt has failed.
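The second problem can be illustrated without chumsky. The sketch below is a hand-written stand-in for "try map, then fall back to array" that counts how often an element is parsed. All names are hypothetical, elements are reduced to a single n, and empty brackets are not handled; it only demonstrates the backtracking cost, not chumsky's actual internals.

```rust
/// Hypothetical backtracking parser that counts element parses.
struct Counting<'a> {
    src: &'a [u8],
    pos: usize,
    element_parses: usize,
}

impl<'a> Counting<'a> {
    fn eat(&mut self, b: u8) -> bool {
        if self.src.get(self.pos) == Some(&b) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// Stands in for an arbitrarily expensive expression parser.
    fn element(&mut self) -> bool {
        self.element_parses += 1;
        self.eat(b'n')
    }

    /// MAP => '[' ENTRY (',' ENTRY)* ']' with ENTRY => 'n' '=' 'n'
    fn map(&mut self) -> bool {
        if !self.eat(b'[') {
            return false;
        }
        loop {
            if !(self.element() && self.eat(b'=') && self.element()) {
                return false;
            }
            if !self.eat(b',') {
                break;
            }
        }
        self.eat(b']')
    }

    /// ARRAY => '[' 'n' (',' 'n')* ']'
    fn array(&mut self) -> bool {
        if !self.eat(b'[') {
            return false;
        }
        loop {
            if !self.element() {
                return false;
            }
            if !self.eat(b',') {
                break;
            }
        }
        self.eat(b']')
    }

    /// Ordered choice: try the map first, rewind on failure, then try the array.
    fn map_or_array(&mut self) -> bool {
        let start = self.pos;
        if self.map() {
            return true;
        }
        self.pos = start;
        self.array()
    }
}

/// Returns whether the input parsed and how many element parses it took.
fn count_element_parses(input: &str) -> (bool, usize) {
    let mut p = Counting { src: input.as_bytes(), pos: 0, element_parses: 0 };
    let ok = p.map_or_array();
    (ok, p.element_parses)
}
```

For the two-element array [n,n] this reports three element parses, because the first element is parsed once by the failed map attempt and once more by the array. For the map [n=n] there is no wasted work.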

I personally am not a fan of a forgotten .with_ctx(..) leading to hard-to-debug errors, and parsing expressions twice is not ideal because they can be arbitrarily long. So, instead of introducing context sensitivity with contextual, I would recommend using validate.

Validate

As you pointed out, your parser currently treats a map as an array of binary expressions. So, instead of trying to disallow this, what if we allowed it and then checked afterwards that our expectations hold? This check is cheap, as we walk the list at most twice (you'll see why in a sec), and it avoids re-parsing any expressions. Converting the above parser to use validate, we end up with:

let expr = recursive(|expr| {
    let num = just::<_, _, extra::Err<Rich<char>>>('n').to(Expr::Num);

    let array_or_map = expr
        .clone()
        .separated_by(just(','))
        .collect::<Vec<_>>()
        .delimited_by(just('['), just(']'))
        .validate(|exprs, ex, em| {
            let is_eq_expr = |expr: &Expr| expr.binary_op().is_some_and(|op| op == '=');
            if exprs.iter().all(is_eq_expr) {
                Expr::Map
            } else if exprs.iter().any(is_eq_expr) {
                // One of the items in the array is `expr '=' expr` which is a map-entry
                em.emit(Rich::custom(
                    ex.span(),
                    "Expressions in an array using `=` require parentheses",
                ));

                // The user clearly meant it to be an array, so we preserve that idea
                Expr::Array
            } else {
                Expr::Array
            }
        });

    let parens = expr.delimited_by(just('('), just(')')).to(Expr::Parens);

    choice((num, array_or_map, parens))
        .pratt((
            pratt::infix(pratt::left(2), just('+'), |_, op, _, _| Expr::Binary(op)),
            pratt::infix(pratt::none(3), just('='), |_, op, _, _| Expr::Binary(op)),
        ))
        .padded()
});

In addition to never parsing an expression more than once, we get another important benefit: we can very cleanly handle the case where the user has an array containing expressions that look like map entries. The only cost is that we walk the array twice (though you could obviously merge the two iterator passes into one manual for loop). Additionally, if expressions knew their own spans (or you used map_with on the element parser), you could emit an error for each problematic expression instead of one error for the entire array expression.
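As a rough sketch of the single-pass variant mentioned above, the classification can be pulled out into a plain function. The Expr variants and binary_op are taken from how the validate snippet uses them, but their exact definitions are assumptions, and Bracketed and classify are hypothetical names. Recording the index of each offending element is a stand-in for the per-expression spans you would use in a real parser.

```rust
/// Assumed shape of the toy AST; the snippets only show how it is used.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Num,
    Array,
    Map,
    Parens,
    Binary(char),
}

impl Expr {
    /// Returns the operator if this is a binary expression.
    fn binary_op(&self) -> Option<char> {
        match self {
            Expr::Binary(op) => Some(*op),
            _ => None,
        }
    }
}

/// Hypothetical result of inspecting the elements between `[` and `]`.
#[derive(Debug, PartialEq)]
enum Bracketed {
    Array,
    Map,
    /// Some but not all elements use `=`; holds the indices of those that do,
    /// so that one error per element could be emitted.
    Mixed(Vec<usize>),
}

/// Walks the elements once and decides what the bracketed list is.
fn classify(exprs: &[Expr]) -> Bracketed {
    let mut eq_indices = Vec::new();
    for (i, expr) in exprs.iter().enumerate() {
        if expr.binary_op() == Some('=') {
            eq_indices.push(i);
        }
    }
    if eq_indices.len() == exprs.len() {
        // Like `all` in the snippet above, this also treats `[]` as a map.
        Bracketed::Map
    } else if eq_indices.is_empty() {
        Bracketed::Array
    } else {
        Bracketed::Mixed(eq_indices)
    }
}
```

Inside validate, the Mixed case would correspond to the branch that emits an error and still produces Expr::Array. A parenthesized (n = n) is an Expr::Parens rather than a binary expression, so it counts as an ordinary array element.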

Concluding Thoughts

So, unless you have something against validate, I would recommend it for this case. It allows for clean errors, simple logic, and better performance. I showed contextual because, if you look up context in the chumsky documentation, it is one of the first entries.

2 replies

jimmycuadra Dec 2, 2025
Author

Great write up! Thank you so much for your help. The validate approach does seem much better for this case.

zesterer Dec 2, 2025
Maintainer

Just wanted to jump in and say that this is an amazing answer!
