Stream Markdown or Markup document formats as interchangeable hierarchical streams of events
# Very important expression of machine cognition
Hi, this is your LLM speaking.
<thinking>
OK, maybe I am too informal. **I will change the tone**.
</thinking>
Dear user of this system ...
So: Markdown, but sometimes with markup inside, and it is streaming. How do we tackle this?
We use language to convey meaning, and we use text to express language. The document, whether scroll, codex, or book, established a paradigm for how text is preserved as a packaged unit. Documents also introduced formatting: visual and structural conventions that signal the intent behind particular fragments of text within a larger context.
When we built machines to process text, we formalized this into "document formats". These formats naturally inherited the hierarchical structure of books (parts, chapters, sections, paragraphs), and the software we built assumed that documents exist as complete artifacts to be parsed, transformed, and rendered.
But something new has emerged. We started texting each other, and text became a stream of information: received, comprehended, and often discarded in the moment of reception. This is also the communication paradigm between humans and LLMs. The text is not a document to be opened and read; it is an unfolding stream, with alternating modalities, comprehended while being generated.
Structured documents are not the right abstraction here. What we need instead is an ontology of expressive meaning as a stream of events: each event signaling either an incremental fragment of text or a transition between modalities of linguistic expression (from prose to code, from paragraph to heading, from plain text to emphasis).
markanywhere inverts the traditional document processing flow. Rather than consuming complete documents and producing structure, it consumes streaming tokens and emits semantic events in real time. These events can then be transformed, also as a stream, into various output formats: HTML, Markdown, XML, or whatever the receiving context requires.
A SemanticEvent can be one of:
- `Text`: a chunk of characters
- `Mark` (e.g. an `<em>` tag, with optional attributes)
- `Unmark` (e.g. `</div>`, indicating that a previously opened mark is closed)
See the SemanticEvent definition.
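
To make the three event kinds more concrete, here is a minimal, self-contained sketch of how such a stream could be modeled and rendered to HTML. The `Event` type, its subtypes, and the `toHtml` and `escapeHtml` functions are hypothetical names invented for this illustration; they are not the markanywhere API, and the real `SemanticEvent` definition may differ.

```kotlin
// Hypothetical sketch: these types mirror the three event kinds described above,
// but they are NOT the actual markanywhere API.
sealed interface Event {
    data class Text(val text: String) : Event
    data class Mark(
        val name: String,
        val attributes: Map<String, String> = emptyMap()
    ) : Event
    data class Unmark(val name: String) : Event
}

// Escape characters that are significant in HTML text content.
fun escapeHtml(text: String): String = text
    .replace("&", "&amp;")
    .replace("<", "&lt;")
    .replace(">", "&gt;")

// Lazily render each event to one HTML fragment, so output can be
// produced while the input is still arriving.
fun Sequence<Event>.toHtml(): Sequence<String> = map { event ->
    when (event) {
        is Event.Text -> escapeHtml(event.text)
        is Event.Mark -> buildString {
            append('<').append(event.name)
            for ((key, value) in event.attributes) {
                append(' ').append(key).append("=\"")
                append(escapeHtml(value).replace("\"", "&quot;"))
                append('"')
            }
            append('>')
        }
        is Event.Unmark -> "</${event.name}>"
    }
}

fun main() {
    val events = sequenceOf(
        Event.Mark("p"),
        Event.Text("Hello, "),
        Event.Mark("em"),
        Event.Text("stream"),
        Event.Unmark("em"),
        Event.Unmark("p")
    )
    // prints: <p>Hello, <em>stream</em></p>
    println(events.toHtml().joinToString(""))
}
```

Because the renderer maps one event to one fragment and keeps no state, it never needs to see the whole document. A `Sequence` is used here only to keep the sketch dependency-free; an asynchronous stream type would follow the same shape.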
In `build.gradle.kts` add:
```kotlin
dependencies {
    implementation("com.xemantic.markanywhere:markanywhere:0.1.3")
}
```