llama.cpp can force the model to generate output that satisfies a given grammar (which, as I understand it, works by restricting sampling to the tokens the grammar allows and then picking the one with the highest logit among them).
Is something like this possible with the current C or C# API?
I couldn't find a corresponding API in the documentation.