[BUG]: Danish Hunspell dictionary causes suggest() to hang / consume excessive CPU for certain words #264

@addvanced

Issue Description

When using the Danish (da) dictionary, certain misspelled words cause spellbook::Dictionary::suggest() to take an extremely long time, effectively hanging the spell checker and causing high CPU usage in the LSP/editor integration.

The issue appears to be related to aggressive suffix/affix expansion and recursive suggestion generation in the Danish Hunspell dictionary.

The problem is reproducible outside the LSP/editor and can be isolated to a direct call to:

dict.suggest("gitleak")

Profiling shows the time is spent deep inside:

spellbook::suggester::map_suggest_impl
spellbook::checker::strip_suffix_only
spellbook::aff::Affix<Sfx>::condition_matches

Other dictionaries are significantly faster:

Dictionary    Time
en_us         ~118ms
de            ~120ms
es            ~108ms
sv            ~281ms
fr            ~520ms
da            20s+ / effectively hangs
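
For anyone who wants to collect comparable timings, here is a minimal sketch of a timing harness. It uses only the standard library so it runs without dictionary files; `time_call` and `fake_suggest` are hypothetical names, and `fake_suggest` is just a stand-in for the real `dict.suggest("gitleak")` call against a loaded dictionary.

```rust
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn time_call<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Hypothetical stand-in for a suggestion call. In a real reproduction this
/// would be replaced by the spellbook call on a loaded dictionary.
fn fake_suggest(word: &str) -> Vec<String> {
    vec![format!("{}s", word)]
}

fn main() {
    let (suggestions, elapsed) = time_call(|| fake_suggest("gitleak"));
    println!("{} suggestion(s) in {:?}", suggestions.len(), elapsed);
}
```

Wrapping the call in a closure keeps the harness independent of the dictionary type, so the same function can time each language in turn.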

Operating System

macOS

Editor

Zed

Codebook Version

0.3.39

Configuration

dictionaries = ["da"]

Steps to Reproduce

  1. Configure Codebook with the Danish dictionary:
     dictionaries = ["da"]
  2. Open a .toml file in Zed with Codebook enabled
  3. Add the following line:
     # .gitleaks.toml
  4. Wait for Codebook to mark gitleaks as a misspelled word
  5. Hover the misspelled word to trigger Code Actions / suggestions
  6. Observe that the Code Action menu never appears for the word
  7. Observe sustained high CPU usage from codebook-lsp via Activity Monitor

Expected Behavior

The Code Action menu should appear normally for misspelled words, even if no suggestions can be generated.

Suggestion generation should either:

  • complete within a reasonable amount of time, or
  • fail gracefully / time out without causing prolonged CPU usage in the background.
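
As a rough illustration of the second option, the sketch below runs the work on a worker thread, waits for a bounded time, and raises a cancellation flag on timeout. This is not how codebook-lsp or spellbook currently behave as far as this report shows; `run_with_budget` and `slow_suggest` are hypothetical names, and the flag only helps if the suggestion code actually polls it. Without such a hook the worker would keep consuming CPU after the timeout.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::Duration;

/// Runs `work` on a worker thread and waits at most `budget` for its result.
/// `work` receives a cancellation flag that it is expected to poll; on
/// timeout the flag is set so the worker can stop early.
fn run_with_budget<T, F>(budget: Duration, work: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce(&AtomicBool) -> T + Send + 'static,
{
    let cancel = Arc::new(AtomicBool::new(false));
    let worker_cancel = Arc::clone(&cancel);
    let (tx, rx) = mpsc::channel();

    thread::spawn(move || {
        let result = work(&*worker_cancel);
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(result);
    });

    match rx.recv_timeout(budget) {
        Ok(result) => Some(result),
        Err(_) => {
            // Ask the worker to stop and report "no suggestions".
            cancel.store(true, Ordering::Relaxed);
            None
        }
    }
}

/// Hypothetical slow candidate search that polls the flag between steps.
fn slow_suggest(cancel: &AtomicBool) -> Vec<String> {
    let mut out = Vec::new();
    for i in 0..1_000 {
        if cancel.load(Ordering::Relaxed) {
            break;
        }
        thread::sleep(Duration::from_millis(5));
        out.push(format!("candidate{}", i));
    }
    out
}

fn main() {
    let fast = run_with_budget(Duration::from_secs(1), |_| vec!["ok".to_string()]);
    let slow = run_with_budget(Duration::from_millis(50), slow_suggest);
    println!("fast: {:?}, slow: {:?}", fast, slow);
}
```

Returning `None` on timeout would let the Code Action menu still appear, just without suggestions for that word.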

Actual Behavior

For affected words such as gitleaks, the Code Action menu never appears.

The editor itself remains usable, and Code Actions still work for other misspelled words. However, codebook-lsp appears to continue processing the suggestion request in the background, causing sustained high CPU usage.

In my case, the codebook-lsp process reached very high CPU usage while trying to generate suggestions for the affected word.

Additional Context

The Danish dictionary used appears to come from:

Stavekontrolden.dk

Codebook currently appears to use a patched fork of the Danish dictionary (blopker/dictionaries), likely due to compatibility issues with spellbook parsing quoted entries containing / while using FLAG num.

The latest upstream Stavekontrolden dictionary (2.9.096) could not be parsed directly by spellbook without similar escaping fixes.

The issue appears specifically related to suffix/compound handling in the .aff rules.
Removing or heavily reducing suffix rules dramatically improves execution time.
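
To get a feel for how many suffix rules a dictionary carries, one option is to sum the counts declared in the `SFX <flag> <Y|N> <count>` header lines of the .aff file. The sketch below is a heuristic under that assumption, not a full Hunspell parser; `declared_suffix_rules` is a hypothetical helper and the sample input is made up for illustration, not taken from the Danish dictionary.

```rust
use std::collections::BTreeMap;

/// Sums the rule counts declared by `SFX <flag> <Y|N> <count>` header lines.
/// Rule lines and all other directives are ignored.
fn declared_suffix_rules(aff: &str) -> BTreeMap<String, usize> {
    let mut counts = BTreeMap::new();
    for line in aff.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        let is_header = fields.len() == 4
            && fields[0] == "SFX"
            && (fields[2] == "Y" || fields[2] == "N");
        if !is_header {
            continue;
        }
        if let Ok(n) = fields[3].parse::<usize>() {
            *counts.entry(fields[1].to_string()).or_insert(0) += n;
        }
    }
    counts
}

fn main() {
    // Made-up sample using numeric flags, as with FLAG num.
    let sample = "FLAG num\n\
                  SFX 7 Y 2\n\
                  SFX 7 0 s .\n\
                  SFX 7 0 en [^e]\n\
                  SFX 9 N 1\n\
                  SFX 9 e er e\n";
    let counts = declared_suffix_rules(sample);
    let total: usize = counts.values().sum();
    println!("{} suffix flags, {} declared rules", counts.len(), total);
}
```

Comparing these totals between the Danish .aff file and the faster dictionaries might show whether the rule count alone explains the difference.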

This issue was originally discovered through the Codebook LSP integration in Zed, where hovering misspelled words such as:

# .gitleaks.toml

would cause codebook-lsp CPU usage to spike heavily.
