Normalize symbols #332

#331 highlights an issue in how we currently handle subscripts in variable and function names. Substituting within strings leads to some strange edge cases, and the code is pretty difficult to follow.

I suggest a new model: any symbol (that could represent a variable name or a function name) is only normalized *once*. It follows something like this schedule: 

1. Unicode substitution (`:a₁ --> :a_1`)
2. The string is split into "sub-symbols" (`:abc_x_y --> (:abc, :x, :y)`)
3. Each sub-symbol is normalized separately
4. If `snakecase`, all sub-symbols are `join`ed with `\_`; otherwise the first sub-symbol is the base and the remaining ones are `join`ed with `\_` inside a subscript (`"abc_{x\_y}"`)
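The schedule above could be sketched roughly as follows. This is only an illustration under stated assumptions: `desubscript`, `subsymbols`, and `normalize_symbol` are hypothetical names that do not exist in Latexify.jl, only Unicode subscript digits are handled, and step 3 is left as a pluggable function that defaults to `identity`.

```julia
# Hypothetical helper names; none of these exist in Latexify.jl.

# Step 1: map Unicode subscript digits (U+2080 to U+2089) to plain digits.
const SUBSCRIPT_DIGITS = Dict(Char(0x2080 + i) => '0' + i for i in 0:9)

function desubscript(s::AbstractString)
    out = IOBuffer()
    in_sub = false
    for c in s
        if haskey(SUBSCRIPT_DIGITS, c)
            in_sub || print(out, '_')   # one underscore per run of subscript digits
            print(out, SUBSCRIPT_DIGITS[c])
            in_sub = true
        else
            print(out, c)
            in_sub = false
        end
    end
    return String(take!(out))
end

# Step 2: split on underscores into sub-symbols.
subsymbols(s::AbstractString) = split(s, '_'; keepempty=false)

# Steps 3 and 4: normalize each sub-symbol, then join.
function normalize_symbol(sym::Symbol; snakecase::Bool=false, normalize_sub=identity)
    parts = map(normalize_sub, subsymbols(desubscript(string(sym))))
    if snakecase || length(parts) <= 1
        return join(parts, "\\_")
    end
    # The first sub-symbol is the base; the rest go into one subscript.
    return string(parts[1], "_{", join(parts[2:end], "\\_"), "}")
end
```

With these assumptions, `normalize_symbol(:abc_x_y)` gives `"abc_{x\\_y}"` and `normalize_symbol(:abc_x_y; snakecase=true)` gives `"abc\\_x\\_y"`. Because the symbol is converted to a string exactly once, no later pass needs to re-parse already-escaped output.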

Sub-symbol normalization:
1. If it matches an entry in the constant list (e.g. `inf`, `atan`), get the normalized form from a dict
2. If it has more than one (alphabetic) character, wrap it in `\mathrm` (configurable?)
3. Otherwise, return the sub-symbol as is
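A minimal sketch of these three rules, assuming a hypothetical `KNOWN_SUBSYMBOLS` table and a hypothetical `wrapper` keyword for the configurable part. The table entries are illustrative only; the actual constant list is not specified here.

```julia
# Hypothetical lookup table; the LaTeX forms on the right are only illustrative.
const KNOWN_SUBSYMBOLS = Dict(
    "inf"  => "\\infty",
    "atan" => "\\arctan",
)

function normalize_subsymbol(s::AbstractString; wrapper::AbstractString="\\mathrm")
    key = String(s)
    # 1. Known constants and functions come straight from the table.
    haskey(KNOWN_SUBSYMBOLS, key) && return KNOWN_SUBSYMBOLS[key]
    # 2. More than one alphabetic character: wrap it (the wrapper is configurable).
    count(isletter, key) > 1 && return string(wrapper, "{", key, "}")
    # 3. Anything else is returned unchanged.
    return key
end
```

Under these assumptions, `normalize_subsymbol("abc")` returns `"\\mathrm{abc}"`, while `normalize_subsymbol("x")` and `normalize_subsymbol("12")` come back unchanged.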

This leaves it uncertain how to handle indexing (`a_1[3]`), and it breaks the current behavior of `latexify(:abc) --> "$abc$"`, since a multi-character symbol would be wrapped in `\mathrm`. It will, however, be more consistent with mathematical notation.

The indexing uncertainty is the biggest blocker for me. We might have to consider using a placeholder struct, more or less saving `:a_b` as a special type of `:a[:b]` and delaying the stringification, but that would require a bit of an overhaul.
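To make the placeholder idea concrete, here is a rough sketch. `Subscripted`, `index`, and `render` are hypothetical names, not part of Latexify.jl, and merging subscripts and explicit indices into one comma-separated list is just one possible convention, not a decision.

```julia
# Hypothetical placeholder type; nothing here is part of Latexify.jl.
struct Subscripted
    base::Symbol
    indices::Vector{Any}   # subscripts first, then any explicit indices
end

# Parse :a_b into a base and its subscripts without producing LaTeX yet.
function Subscripted(sym::Symbol)
    parts = split(string(sym), '_'; keepempty=false)
    # Numeric subscripts become Ints, everything else stays a Symbol.
    subs = Any[something(tryparse(Int, p), Symbol(p)) for p in parts[2:end]]
    return Subscripted(Symbol(parts[1]), subs)
end

# An explicit index, as in a_1[3], is appended to the same list.
index(s::Subscripted, i) = Subscripted(s.base, Any[s.indices..., i])

# Stringification happens once, at the very end.
function render(s::Subscripted)
    isempty(s.indices) && return string(s.base)
    return string(s.base, "_{", join(s.indices, ","), "}")
end
```

With this convention, `render(index(Subscripted(:a_1), 3))` yields `"a_{1,3}"`. Keeping the structure around until `render` is what would let `a_1[3]` be resolved consistently, at the cost of threading the new type through the existing code.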
