fix(markdown): preserve HTML comments as text during markdown parsing by weilinzung · Pull Request #7722 · ueberdosis/tiptap

weilinzung · 2026-04-07T19:56:16Z

Changes Overview

HTML comments (`<!-- ... -->`) passed through editor.markdown.parse() were silently dropped because the browser DOM parser strips comment nodes before generateJSON processes them. This fix preserves them as plain text so comment content is not lost.

The resulting behavior should be similar to how marked.js handles comments.

Implementation Approach

Added a regex check in parseHTMLToken that intercepts comment tokens before they reach the DOM parser, returning them as plain text nodes instead:

const isHtmlComment = /<!--([\s\S]*?)-->/.test(normalizedHtml)

Block comments are wrapped in a paragraph; inline comments are returned as bare text nodes.
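
The interception described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `HtmlToken` shape, the `block` flag, and the function name `commentTokenToNode` are assumptions made for the example. Only the regex is taken from the description.

```typescript
// Hypothetical, simplified shapes for illustration; the real token and
// node types in @tiptap/markdown may differ.
type HtmlToken = { type: 'html'; raw: string; block: boolean }
type TextNode = { type: 'text'; text: string }
type ParagraphNode = { type: 'paragraph'; content: TextNode[] }

// Same pattern as quoted in the PR description.
const HTML_COMMENT = /<!--([\s\S]*?)-->/

// Returns a JSON node for comment tokens, or null so the caller can fall
// back to the usual DOM-based HTML parsing for everything else.
function commentTokenToNode(token: HtmlToken): TextNode | ParagraphNode | null {
  if (!HTML_COMMENT.test(token.raw)) {
    return null
  }

  // The raw comment is kept as-is, including any surrounding whitespace.
  const text: TextNode = { type: 'text', text: token.raw }

  // Block-level comments are wrapped in a paragraph; inline ones stay bare.
  return token.block ? { type: 'paragraph', content: [text] } : text
}
```

Calling `commentTokenToNode({ type: 'html', raw: '<!-- note -->', block: true })` yields a paragraph containing a single text node, while the same token with `block: false` yields only the text node.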

Testing Done

Added 4 unit tests in mixed-html.spec.ts:

standalone block comment → paragraph text
inline comment within surrounding text
multiline comment
comment with surrounding whitespace

Verification Steps

pnpm dev → open http://localhost:3000/Markdown/Parse/React
Click Parse Markdown — the Hidden HTML Comments section should render the comment text visibly in the editor
Run pnpm -w -F @tiptap/markdown test mixed-html

Additional Notes

Before

After

Checklist

I have created a changeset for this PR if necessary.
My changes do not break the library.
I have added tests where applicable.
I have followed the project guidelines.
I have fixed any lint issues.

Related Issues

Fixes #7720

changeset-bot · 2026-04-07T19:56:23Z

🦋 Changeset detected

Latest commit: da5a9be

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 72 packages

Name	Type
@tiptap/markdown	Patch
@tiptap/core	Patch
@tiptap/extension-audio	Patch
@tiptap/extension-blockquote	Patch
@tiptap/extension-bold	Patch
@tiptap/extension-bubble-menu	Patch
@tiptap/extension-bullet-list	Patch
@tiptap/extension-code-block-lowlight	Patch
@tiptap/extension-code-block	Patch
@tiptap/extension-code	Patch
@tiptap/extension-collaboration-caret	Patch
@tiptap/extension-collaboration	Patch
@tiptap/extension-color	Patch
@tiptap/extension-details	Patch
@tiptap/extension-document	Patch
@tiptap/extension-drag-handle-react	Patch
@tiptap/extension-drag-handle-vue-2	Patch
@tiptap/extension-drag-handle-vue-3	Patch
@tiptap/extension-drag-handle	Patch
@tiptap/extension-emoji	Patch
@tiptap/extension-file-handler	Patch
@tiptap/extension-floating-menu	Patch
@tiptap/extension-font-family	Patch
@tiptap/extension-hard-break	Patch
@tiptap/extension-heading	Patch
@tiptap/extension-highlight	Patch
@tiptap/extension-horizontal-rule	Patch
@tiptap/extension-image	Patch
@tiptap/extension-invisible-characters	Patch
@tiptap/extension-italic	Patch
@tiptap/extension-link	Patch
@tiptap/extension-list	Patch
@tiptap/extension-mathematics	Patch
@tiptap/extension-mention	Patch
@tiptap/extension-node-range	Patch
@tiptap/extension-ordered-list	Patch
@tiptap/extension-paragraph	Patch
@tiptap/extension-strike	Patch
@tiptap/extension-subscript	Patch
@tiptap/extension-superscript	Patch
@tiptap/extension-table-of-contents	Patch
@tiptap/extension-table	Patch
@tiptap/extension-text-align	Patch
@tiptap/extension-text-style	Patch
@tiptap/extension-text	Patch
@tiptap/extension-twitch	Patch
@tiptap/extension-typography	Patch
@tiptap/extension-underline	Patch
@tiptap/extension-unique-id	Patch
@tiptap/extension-youtube	Patch
@tiptap/extensions	Patch
@tiptap/html	Patch
@tiptap/pm	Patch
@tiptap/react	Patch
@tiptap/starter-kit	Patch
@tiptap/static-renderer	Patch
@tiptap/suggestion	Patch
@tiptap/vue-2	Patch
@tiptap/vue-3	Patch
@tiptap/extension-character-count	Patch
@tiptap/extension-dropcursor	Patch
@tiptap/extension-focus	Patch
@tiptap/extension-gapcursor	Patch
@tiptap/extension-history	Patch
@tiptap/extension-list-item	Patch
@tiptap/extension-list-keymap	Patch
@tiptap/extension-placeholder	Patch
@tiptap/extension-table-cell	Patch
@tiptap/extension-table-header	Patch
@tiptap/extension-table-row	Patch
@tiptap/extension-task-item	Patch
@tiptap/extension-task-list	Patch


netlify · 2026-04-07T19:56:26Z

✅ Deploy Preview for tiptap-embed ready!

Name	Link
🔨 Latest commit	`da5a9be`
🔍 Latest deploy log	https://app.netlify.com/projects/tiptap-embed/deploys/69d56557bc073b0008738d1d
😎 Deploy Preview	https://deploy-preview-7722--tiptap-embed.netlify.app

Copilot

Pull request overview

Fixes @tiptap/markdown parsing so HTML comments (`<!-- ... -->`) are preserved (as text) instead of being dropped by the browser DOM parsing step, improving markdown round-tripping and preventing data loss for comment-based metadata.

Changes:

Intercepts HTML comment tokens in MarkdownManager.parseHTMLToken and converts them into text (paragraph-wrapped for block tokens).
Adds unit tests covering block, inline, multiline, and whitespace-preserving comment scenarios.
Updates the Markdown parse demo content and adds a changeset for a patch release.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File	Description
packages/markdown/src/MarkdownManager.ts	Adds HTML-comment detection and a text-node fallback to preserve comment content during parse.
packages/markdown/__tests__/mixed-html.spec.ts	Adds regression tests for comment preservation in different positions/forms.
demos/src/Markdown/Parse/React/index.jsx	Extends the demo markdown sample with HTML and “Hidden HTML Comments” examples.
.changeset/calm-cycles-hear.md	Declares a patch release for preserving HTML comments during markdown parsing.


weilinzung · 2026-04-07T20:17:07Z

@bdbch, I would appreciate it if you could review this. The use case is our AI workflow: we feed it Markdown content containing hidden comments that it is meant to replace.

bdbch · 2026-04-08T14:25:43Z

packages/markdown/src/MarkdownManager.ts

+        return {
+          type: 'paragraph',
+          content: [
+            {
+              type: 'text',
+              text: html,
+            },
+          ],
+        }
+      }


Not a fan of this being a hardcoded paragraph - it's rare but people can overwrite the default paragraph type name which would cause issues here. I think it's safer to assume the default content node from the schema.

Also - I am not sure if this should be a default. By default I assume users would NOT expect comments to appear in their parsed content. Either we do this as a global option on the markdown manager (which can be controlled as an editor option) OR we add an option to the parse functions.
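
Both suggestions above could be combined along these lines. This is only a sketch under stated assumptions: `SchemaLike` is a structural stand-in for the ProseMirror schema fields used (`topNodeType.contentMatch.defaultType`), and the `preserveHtmlComments` option name is hypothetical and not part of the PR or of Tiptap.

```typescript
// Structural stand-ins for the ProseMirror schema fields used below.
type NodeTypeLike = {
  name: string
  contentMatch: { defaultType: NodeTypeLike | null }
}
type SchemaLike = { topNodeType: NodeTypeLike }

// Hypothetical option name, used here for illustration only.
type CommentParseOptions = { preserveHtmlComments?: boolean }

// Ask the schema which block node the top node creates by default,
// instead of assuming that it is called 'paragraph'.
function defaultBlockName(schema: SchemaLike): string | null {
  return schema.topNodeType.contentMatch.defaultType?.name ?? null
}

// Wrap a comment in the schema's default block, but only when opted in.
function wrapComment(html: string, schema: SchemaLike, options: CommentParseOptions = {}) {
  if (!options.preserveHtmlComments) {
    return null // default behavior: comments do not become content
  }

  const type = defaultBlockName(schema)
  if (!type) {
    return null // the schema has no default block to wrap the text in
  }

  return { type, content: [{ type: 'text', text: html }] }
}
```

With a schema whose paragraph node was renamed, the wrapper picks up the renamed type, and without the opt-in flag the comment is dropped as before.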

weilinzung · 2026-04-08

That's a fair point on the hardcoding; relying on the schema's default content node is definitely safer and cleaner.

Regarding the default behavior: the reason I was leaning toward a schema-based approach is that the DOM parser is currently dropping HTML comments entirely. By the time I run editor.markdown.parse(s), the data is already gone.

How heavy of a lift do you think it would be to introduce a default schema for this? If that feels too intrusive, I’m open to the global option on the markdown manager, provided we can ensure the parser doesn't strip the comments before they hit the manager. I feel like markedjs with sanitize-html could handle this already, but TipTap is using the custom parseHTMLToken.

bdbch · 2026-04-08

But isn't the problem here that your comments are lost when parsing into your editor via Markdown? Aren't the comments still part of your Markdown at the point where you parse it in, so that you could keep them via an option?

> How heavy of a lift do you think it would be to introduce a default schema for this?

I think those comments should not be part of the content by default, as most people expect comments to just be comments, not content. I wonder if you could add a custom Markdown parser that catches HTML comments before the default lexer picks them up as HTML content. That way you could turn them into a "Comment" node or something similar on your end without relying on the editor core supporting it.
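
A custom tokenizer of the kind suggested here might look roughly like the sketch below. It follows the general shape of a marked tokenizer extension (`name`, `level`, `start`, `tokenizer`). The token type name `htmlComment` is made up for the example, and how such a tokenizer would be registered with @tiptap/markdown is not shown and would need to be checked against the library.

```typescript
// Hypothetical token produced for a standalone HTML comment.
type CommentToken = { type: 'htmlComment'; raw: string; text: string }

const htmlCommentTokenizer = {
  name: 'htmlComment',
  level: 'block' as const,

  // Tell the lexer where a comment might start so it can stop
  // consuming plain text before that position.
  start(src: string): number | undefined {
    const index = src.indexOf('<!--')
    return index === -1 ? undefined : index
  },

  // Claim the comment only when it sits at the very start of the
  // remaining source and is followed by a line break or end of input.
  tokenizer(src: string): CommentToken | undefined {
    const match = /^<!--([\s\S]*?)-->[ \t]*(?:\n|$)/.exec(src)
    if (!match) {
      return undefined
    }
    return { type: 'htmlComment', raw: match[0], text: (match[1] ?? '').trim() }
  },
}
```

Because the tokenizer claims the comment before the default HTML rules run, the resulting token can then be mapped to whatever node the application defines, without the editor core treating comments as content.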

weilinzung · 2026-04-09 (edited)

Actually, the difficulty is that when we're in the clipboardTextParser hook (as seen in my PasteMarkdown extension below), we are dealing with a raw string. When we call editor.markdown.parse(s), the underlying lexer treats HTML comments as html_block or html_inline tokens.

import { Extension } from '@tiptap/core';
import { Fragment, Slice } from '@tiptap/pm/model';
import { Plugin } from '@tiptap/pm/state';

export const PasteMarkdown = Extension.create({
  name: 'pasteMarkdown',

  addProseMirrorPlugins() {
    const editor = this.editor;

    return [
      new Plugin({
        props: {
          clipboardTextParser(text, _, __, view): Slice {
            const s = String(text ?? '');
            const { schema } = view.state;

            // Markdown → ProseMirror
            if (!!s.trim() && editor.markdown) {
              try {
                // Without a schema node or a custom parser rule,
                // HTML comments are lost here during the parse conversion.
                const json = editor.markdown.parse(s);
                const node = schema.nodeFromJSON(json);
                return new Slice(node.content, 0, 0);
              } catch (e) {
                // Fall through to default paste if parsing fails
              }
            }

            // Default paste: insert the clipboard text as a plain text node
            return new Slice(Fragment.from(schema.text(s)), 0, 0);
          }
        }
      })
    ];
  }
});

By default, the parser handles these by converting them to DOM nodes and then into ProseMirror nodes. Since standard schemas don't have a spec for HTML comments, they are simply dropped during this conversion. To preserve them, we would need a node in the schema to "catch" them before they vanish.
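
A schema node that "catches" comments could be sketched as below. This is an assumption-laden illustration: it is a plain object in the shape of a ProseMirror NodeSpec, the node would be registered under a name such as `comment`, and the `data-html-comment` attribute is invented for the example. Nothing like it exists in Tiptap core.

```typescript
// Minimal local types so the sketch stands alone.
type CommentAttrs = { value: string }
type CommentNodeLike = { attrs: CommentAttrs }
type ElementLike = { getAttribute(name: string): string | null }

// Hypothetical NodeSpec-shaped object for an opt-in comment node.
const commentNodeSpec = {
  group: 'block',
  atom: true, // treated as a single unit with no editable content
  selectable: false,
  attrs: { value: { default: '' } },

  // Render as a hidden element that carries the comment text in an attribute.
  toDOM(node: CommentNodeLike): [string, Record<string, string>] {
    return ['div', { 'data-html-comment': node.attrs.value, hidden: '' }]
  },

  // Read the comment text back when such an element is parsed from HTML.
  parseDOM: [
    {
      tag: 'div[data-html-comment]',
      getAttrs(dom: ElementLike): CommentAttrs {
        return { value: dom.getAttribute('data-html-comment') ?? '' }
      },
    },
  ],
}
```

Keeping the comment text in an attribute means it stays out of the visible document while still surviving a round trip through the editor state.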

I definitely hear you on comments not being "content" by default. Regarding the custom parser, perhaps we could introduce a configuration option in the Markdown Manager to define how specific tokens are handled.

For example, we could provide a parseHooks or tokenListeners option:

// Example of a potential configuration option
editor.configure({
  extensions: [
    Markdown.configure({
      // An option like this would allow us to intercept the token
      // and map it to a custom node without modifying core.
      parseOptions: {
        html_block: (state, token) => {
          if (isComment(token.content)) {
            state.addNode('comment', { value: token.content });
          }
        }
      }
    })
  ]
})

How do you feel about providing a way to register a custom comment node in the schema that the markdown manager can optionally target? This keeps the "content" aspect opt-in while preventing the data loss I'm seeing in the paste hook. Or is there an existing option that can already handle this?

fix(markdown): preserve HTML comments as text during markdown parsing

55a8da6

Copilot AI review requested due to automatic review settings April 7, 2026 19:56

Copilot started reviewing on behalf of weilinzung April 7, 2026 19:56

Copilot AI reviewed Apr 7, 2026


chore: update content

da5a9be

bdbch requested changes Apr 8, 2026

weilinzung wants to merge 2 commits into ueberdosis:main from weilinzung:marked-html-coments


Review thread replies, in order: bdbch (Apr 8, 2026), weilinzung (Apr 8, 2026), bdbch (Apr 8, 2026), weilinzung (Apr 9, 2026, edited)


3 participants
