Improved Footnote Serialization in MarkdownDocSerializer #3128

@simonschoe

Requested feature

Currently, MarkdownDocSerializer serializes footnotes more or less as-is. For example, the footnotes shown in this image:

[Image]

are serialized as:

5 https://github.com/tesseract-ocr/tesseract

6 https://github.com/VikParuchuri/surya

7 https://github.com/lukas-blecher/LaTeX-OCR

Alternatives

For downstream LLM-based applications, it would be helpful if footnotes were serialized as actual footnotes in Markdown syntax, so that the LLM can identify them as footnotes (and not as a numbered list, for example):

^[5 https://github.com/tesseract-ocr/tesseract]

^[6 https://github.com/VikParuchuri/surya]

^[7 https://github.com/lukas-blecher/LaTeX-OCR]
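To make the request concrete, here is a minimal sketch of the transformation in plain Python. It is not the actual MarkdownDocSerializer code: `to_inline_footnote` and `to_reference_footnote` are hypothetical helper names, and the sketch assumes each footnote arrives as a string of the form `<number> <text>`, as in the output above. The second helper shows the reference-style definition (`[^5]: ...`) that some Markdown flavors use instead of the inline `^[...]` form.

```python
import re

# Hypothetical helpers for illustration only; they are not part of
# MarkdownDocSerializer or any existing library API.

# Assumption: a serialized footnote looks like "<number> <text>".
_NUMBERED_FOOTNOTE = re.compile(r"^(\d+)\s+(.+)$")


def to_inline_footnote(text: str) -> str:
    """Wrap the footnote text in inline footnote syntax: ^[...]."""
    return f"^[{text.strip()}]"


def to_reference_footnote(text: str) -> str:
    """Emit a reference-style footnote definition: [^n]: text.

    Falls back to the inline form when no leading number is found.
    """
    match = _NUMBERED_FOOTNOTE.match(text.strip())
    if match is None:
        return to_inline_footnote(text)
    number, body = match.groups()
    return f"[^{number}]: {body}"


if __name__ == "__main__":
    footnotes = [
        "5 https://github.com/tesseract-ocr/tesseract",
        "6 https://github.com/VikParuchuri/surya",
        "7 https://github.com/lukas-blecher/LaTeX-OCR",
    ]
    for footnote in footnotes:
        print(to_inline_footnote(footnote))
```

Either form marks the text explicitly as a footnote, so a downstream consumer does not have to guess whether `5 https://...` is a list item or a note.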

Labels: enhancement (New feature or request)