Do not emit prior emitted KnownNat constraints #122

Open
rowanG077 wants to merge 1 commit into master from looping-knownnat-stuck

@rowanG077
Member

Implemented by maintaining a set of prior emitted KnownNat constraints.

Fixes clash-lang/ghc-typelits-knownnat#68

Member

@christiaanb christiaanb left a comment

See my comment at clash-lang/ghc-typelits-knownnat#68 (comment):

The reason I’m worried about keeping a simple set of types is that:

  1. Variables are only unique within their scope, and
  2. Plugin state is module-wide

So you might have two distinct case alternatives (GADT pattern matches) that each bring a type variable into scope, and those variables are themselves not distinct. With a simple set, you could then end up emitting a wanted constraint in branch A but not in branch B, even though it would be required for soundness.
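To make the scenario concrete, here is a minimal sketch of two GADT case alternatives that each bring their own type variable and their own KnownNat evidence into scope. It uses only `base` and no plugin; `Boxed`, `BranchA`, `BranchB` and `readNat` are hypothetical names chosen for illustration.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- Hypothetical type: each constructor existentially binds its own `n`
-- together with its own KnownNat dictionary.
data Boxed where
  BranchA :: KnownNat n => Proxy (n :: Nat) -> Boxed
  BranchB :: KnownNat n => Proxy (n :: Nat) -> Boxed

-- Each alternative is a separate scope: the `n` and the KnownNat evidence
-- available in the first alternative are not available in the second.
readNat :: Boxed -> Integer
readNat boxed = case boxed of
  BranchA p -> natVal p
  BranchB p -> natVal p

main :: IO ()
main = do
  print (readNat (BranchA (Proxy :: Proxy 3)))
  print (readNat (BranchB (Proxy :: Proxy 4)))
```

Any KnownNat wanted arising in the `BranchB` alternative has to be solved from what is in scope there, regardless of what was emitted while checking `BranchA`.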

Perhaps we need to maintain a set of

  1. Location (perhaps https://hackage-content.haskell.org/package/ghc-9.14.1/docs/GHC-Tc-Types-CtLoc.html#v:getCtLocEnvLoc) of the constraint that is requiring us to generate that KnownNat constraint
  2. Whether we already emitted a KnownNat constraint for that location.

That way we can distinguish scopes.
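The difference between the two keying schemes can be sketched with a toy model. This is not the plugin's code: `TypeRepr`, `Location`, `Request` and the `emit*` functions are hypothetical stand-ins, and strings replace GHC's `Type` and the location taken from the `CtLoc`.

```haskell
import Data.List (foldl')

-- Hypothetical stand-ins: a real plugin would use GHC's Type and a
-- location taken from the CtLoc, not strings.
type TypeRepr = String
type Location = String

-- A request to emit a KnownNat wanted, as seen by the plugin.
data Request = Request { reqLoc :: Location, reqType :: TypeRepr }
  deriving (Eq, Show)

-- Generic "emit unless the key was seen before" pass over all requests.
-- Returns the requests that would actually be emitted, in order.
emitWith :: Eq k => (Request -> k) -> [Request] -> [Request]
emitWith key = reverse . snd . foldl' step ([], [])
  where
    step (seen, out) r
      | key r `elem` seen = (seen, out)
      | otherwise = (key r : seen, r : out)

-- Cache keyed on the type only (module-wide).
emitByType :: [Request] -> [Request]
emitByType = emitWith reqType

-- Cache keyed on (location, type), so separate scopes stay separate.
emitByLocation :: [Request] -> [Request]
emitByLocation = emitWith (\r -> (reqLoc r, reqType r))

requests :: [Request]
requests =
  [ Request "branch A" "KnownNat (n + 1)"
  , Request "branch A" "KnownNat (n + 1)" -- repeat within the same scope
  , Request "branch B" "KnownNat (n + 1)" -- same spelling, different scope
  ]

main :: IO ()
main = do
  print (map reqLoc (emitByType requests))
  print (map reqLoc (emitByLocation requests))
```

In this model the type-only key emits for branch A alone, while the (location, type) key still suppresses the repeat inside branch A but emits again for branch B.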

@rowanG077 rowanG077 force-pushed the looping-knownnat-stuck branch 5 times, most recently from 4d1ddc1 to d58efc9 Compare February 11, 2026 18:25
@rowanG077
Member Author

rowanG077 commented Feb 11, 2026

I have been trying, without success, to get the plugin to accept an erroneous program when using the simple CType cache key. It's probably still a good idea to keep the composite cache keys, though.

@rowanG077 rowanG077 force-pushed the looping-knownnat-stuck branch 3 times, most recently from dacfe1f to a03b716 Compare February 12, 2026 21:48
of the same KnownNat multiple times, potentially
leading to loops.
@rowanG077 rowanG077 force-pushed the looping-knownnat-stuck branch from a03b716 to a893232 Compare February 12, 2026 21:55
@rowanG077
Member Author

rowanG077 commented Feb 21, 2026

@sheaf I'm very curious what you think about this approach?

@sheaf
Collaborator

sheaf commented Feb 21, 2026

My opinion is that using an IORef to avoid looping is very risky because it has the potential for causing bugs that are extremely hard to debug and that vanish when a reproducer is minimised in any way. For this reason, I deliberately removed the IORef mechanism that existed in ghc-tcplugin-natnormalise.

As @christiaanb explained, the design of TcPlugin means that there would be one IORef per module, and the plugin will get repeatedly invoked when typechecking each module.
Processing KnownNat (1 + Stuck) when typechecking one particular declaration should not cause the plugin to skip processing it when typechecking a completely separate part of the program.

Unfortunately I don't have convincing reproducers to provide right now, but certainly something which involves only stuck type families and literals (with no variables), with various pieces of evidence learned in GADT pattern matches, seems like a way to cause issues. It would be worth looking at the tc-trace log and seeing what happens with the IORef approach (with added logging to say the plugin is skipping some work due to the IORef mechanism).
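As a rough illustration of the kind of skip logging meant here, the following toy model keeps a module-wide IORef of emitted constraints and records a trace line whenever it suppresses work. Everything in it (`PluginState`, `invoke`, `demo`, the log format) is hypothetical; the real plugin would log through GHC's tracing facilities rather than a second IORef.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the type of a KnownNat constraint.
type TypeRepr = String

-- Module-wide plugin state: the constraints emitted so far, plus a trace log.
data PluginState = PluginState
  { emitted :: IORef [TypeRepr]
  , traceLog :: IORef [String]
  }

newPluginState :: IO PluginState
newPluginState = PluginState <$> newIORef [] <*> newIORef []

-- One plugin invocation for a named declaration. Emits the constraint unless
-- the shared IORef says it was emitted before, and logs every skip.
invoke :: PluginState -> String -> TypeRepr -> IO Bool
invoke st decl ty = do
  seen <- readIORef (emitted st)
  if ty `elem` seen
    then do
      modifyIORef' (traceLog st) (("skip " ++ ty ++ " in " ++ decl) :)
      pure False
    else do
      modifyIORef' (emitted st) (ty :)
      pure True

-- Two unrelated declarations in the same module share the same state.
demo :: IO ([Bool], [String])
demo = do
  st <- newPluginState
  a <- invoke st "declaration f" "KnownNat (1 + Stuck)"
  b <- invoke st "declaration g" "KnownNat (1 + Stuck)"
  logLines <- reverse <$> readIORef (traceLog st)
  pure ([a, b], logLines)

main :: IO ()
main = demo >>= print
```

In this model the second, unrelated declaration is skipped purely because of state left behind by the first, and the log line is the only visible trace of that.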

@sheaf
Collaborator

sheaf commented Feb 21, 2026

I think a more typical approach would be to use bumpCtLocDepth or otherwise inspect constraint depth to avoid falling into a loop. However, I can't say I can see exactly how to fully solve the problem with this idea.
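For illustration only, a depth bound can be modelled as below. This is a toy that does not use the GHC API: `Wanted`, `deriveWanted` and `solve` are hypothetical, and the sketch only shows how a depth check cuts off a non-progressing chain, not how to decide which constraints the plugin should stop re-emitting.

```haskell
-- Hypothetical stand-in for a wanted constraint: its type and the depth
-- recorded in its location.
data Wanted = Wanted { wantedType :: String, wantedDepth :: Int }
  deriving (Eq, Show)

-- Analogue of bumping the depth when deriving a new wanted from an old one.
deriveWanted :: Wanted -> Wanted
deriveWanted w = w { wantedDepth = wantedDepth w + 1 }

-- A toy "solver step" that never makes progress: it keeps re-deriving the
-- same stuck constraint. Only the depth check terminates the chain.
solve :: Int -> Wanted -> [Wanted]
solve maxDepth w
  | wantedDepth w >= maxDepth = [w]
  | otherwise = w : solve maxDepth (deriveWanted w)

main :: IO ()
main = mapM_ print (solve 3 (Wanted "KnownNat (1 + Stuck)" 0))
```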

@rowanG077
Member Author

Thank you for your reply!

I see, and I do agree that it's risky. I'd hoped that would be mitigated by using the full data KnownNatKey = KnownNatKey Type RealSrcSpan TcLevelKey SubGoalDepth as the key into the emission cache.

I will investigate a bit deeper if I can think of a different approach.

@rowanG077
Member Author

rowanG077 commented Feb 24, 2026

The only way I can see this being fixed without the IORef is with a GHC change. Essentially what I would like is an additional constructor in CtOrigin called CtTcPluginOrigin PluginName Dynamic CtOrigin. This would allow us to emit a wanted using an updated CtOrigin: CtTcPluginOrigin "GHC.TypeLits.Natnormalise" <KnownNat type>. This can then later easily be used to break the cycle.
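A toy model of how such an origin could be used to break the cycle is sketched below. Only the shape `CtTcPluginOrigin PluginName Dynamic CtOrigin` comes from the proposal; the `CtOrigin` type here is a stand-in for GHC's, `OtherOrigin`, `alreadyEmitted` and `tagOrigin` are hypothetical, the KnownNat type is encoded as a string, and walking the parent chain is an assumption about how the check might work.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)

type PluginName = String

-- Toy model of GHC's CtOrigin; only the proposed constructor mirrors the
-- shape given in the proposal, everything else is a stand-in.
data CtOrigin
  = OtherOrigin String
  | CtTcPluginOrigin PluginName Dynamic CtOrigin

-- Walk the origin chain and report whether this plugin has already emitted
-- a wanted for the given (string-encoded) KnownNat type.
alreadyEmitted :: PluginName -> String -> CtOrigin -> Bool
alreadyEmitted plugin ty origin = case origin of
  OtherOrigin _ -> False
  CtTcPluginOrigin name payload parent ->
    (name == plugin && fromDynamic payload == Just ty)
      || alreadyEmitted plugin ty parent

-- Tag a new wanted with the plugin name and the type it was emitted for.
tagOrigin :: PluginName -> String -> CtOrigin -> CtOrigin
tagOrigin plugin ty = CtTcPluginOrigin plugin (toDyn ty)

main :: IO ()
main = do
  let plugin = "GHC.TypeLits.Natnormalise"
      user = OtherOrigin "user-written constraint"
      tagged = tagOrigin plugin "KnownNat (1 + Stuck)" user
  print (alreadyEmitted plugin "KnownNat (1 + Stuck)" user)
  print (alreadyEmitted plugin "KnownNat (1 + Stuck)" tagged)
  print (alreadyEmitted plugin "KnownNat (2 + Stuck)" tagged)
```

Because the payload records which KnownNat type was emitted, a different type with the same plugin in its origin chain is not reported as a repeat in this model.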

There is a similar idea inside GHC itself with CycleBreakerOrigin; however, that contains no extra metadata, which in our case would lead to spurious loop detections.

The disadvantage, of course, is that this is a GHC change, so only future GHC versions benefit from the fix. But it's a pretty generic way to do inter-plugin things. An open question is whether the GHC devs think this is even a good or feasible idea.

@rowanG077
Member Author

rowanG077 commented Feb 25, 2026

In c84e911 I have implemented this CtTcPluginOrigin approach. It requires this GHC patch: https://gitlab.haskell.org/rowanG/ghc/-/commit/bed5f7cf17658c2cb727e60cfc08c110f5ee0471

There are GHC test failures, but those fail for me locally on master as well. I will try to see what the GHC devs think about it soonish.

Development

Successfully merging this pull request may close these issues.

Misleading solverWanteds: too many iterations error message

3 participants