v0.1.3: EXSLT extensions, transform soundness fix (#6), CI → GitHub Actions #7

Merged
dginev merged 6 commits into master from latexml-oxide-contributions on Apr 22, 2026

dginev commented Apr 22, 2026

Summary

Expose libexslt's EXSLT extension functions (str:*, math:*, set:*, date:*), fix an unsoundness bug in Stylesheet::transform (closes #6), and release as v0.1.3.

Stylesheets using str:tokenize et al. now work without the consumer writing their own unsafe extern "C" { fn exsltRegisterAll(); }. Also migrate CI from the long-dead .travis.yml to GitHub Actions.

Changes

  • Auto-registered EXSLT: parser::parse_file / parse_bytes call register_exslt() on entry, guarded by std::sync::Once. Matches xsltproc's default behaviour.

  • libxslt::register_exslt() — new public, idempotent, thread-safe manual hook for early init.

  • bindings::exsltRegisterAll — new FFI declaration, flagged as a manual addition so bindgen regenerations don't drop it.

  • build.rs — links libexslt via pkg-config with a -lexslt fallback; libxslt gets the same symmetric handling (no more hard panic!).

  • Soundness fix (closes #6, "unsoundness from libxslt issue #14"): Stylesheet::transform now consumes the input Document by value. libxslt mutates the input while applying stylesheet-directed whitespace stripping, which was reachable UB through the previous &Document signature. Call sites change transform(&source, …) to transform(source, …); clone the Document at the call site if you need to run it through multiple stylesheets.

Compatibility

  • Source-breaking: Stylesheet::transform signature changes from (&mut self, doc: &Document, …) to (&mut self, doc: Document, …). Every existing call site needs a one-token edit (&source → source).
  • Link-breaking: libexslt is now a hard link-time dep — already bundled inside the main libxslt devel package on modern Debian/Ubuntu, Fedora, and macOS Homebrew, so most builds pick it up without any new install.
  • Runtime: additive — EXSLT-using stylesheets that previously failed now succeed; non-EXSLT stylesheets unaffected; transform no longer introduces UB when libxslt mutates the input during whitespace stripping.

dginev and others added 6 commits April 21, 2026 08:15
LaTeXML stylesheets rely on EXSLT extension functions (str:tokenize,
math:*, set:*, date:*). Consumers currently have to reach for their
own extern "C" { fn exsltRegisterAll(); } unsafe declaration; that's
unpleasant and duplicates the crate's FFI policy outside the wrapper.

Changes:

* build.rs — also look up libexslt via pkg-config, falling back to
  `cargo:rustc-link-lib=dylib=exslt` so systems with libexslt on the
  default search path still link when pkg-config is unhelpful.

* src/bindings.rs — add `pub fn exsltRegisterAll();` inside a new
  extern "C" block, with a block comment documenting purpose and
  idempotence.

* src/lib.rs — new top-level `register_exslt()` Once-guarded safe
  wrapper. Application code now calls this exactly once on startup
  instead of writing its own unsafe FFI.

No behaviour change; downstream consumers can now drop their
extern "C" blocks for exsltRegisterAll.
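
A minimal sketch of the Once-guarded wrapper pattern described above. It assumes a hypothetical stand-in, `fake_exslt_register_all`, in place of the real `exsltRegisterAll` FFI call so that it runs without linking libexslt; the counter exists only to make the "exactly once" behaviour observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

/// Counts how often the stand-in registration routine actually runs.
static REGISTRATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the real FFI call `exsltRegisterAll()`,
/// so that this sketch runs without linking libexslt.
fn fake_exslt_register_all() {
    REGISTRATIONS.fetch_add(1, Ordering::SeqCst);
}

/// Safe, idempotent wrapper: the closure runs at most once per process,
/// and concurrent callers block until that first run has finished.
pub fn register_exslt() {
    static INIT: Once = Once::new();
    INIT.call_once(|| {
        // In the real crate this would be `unsafe { exsltRegisterAll() }`.
        fake_exslt_register_all();
    });
}

fn main() {
    // Call the wrapper from several threads, then once more on the main thread.
    let handles: Vec<_> = (0..4).map(|_| thread::spawn(register_exslt)).collect();
    for handle in handles {
        handle.join().unwrap();
    }
    register_exslt();
    println!("registrations: {}", REGISTRATIONS.load(Ordering::SeqCst));
}
```

Because `Once::call_once` serialises the first call and turns later calls into no-ops, callers never need to coordinate who initialises first.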
Follow-up on a61d0c4. Shifts EXSLT from opt-in to default-on, since
the crate already links libexslt unconditionally — the previous opt-in
wrapper paid the full cost without the default benefit. `xsltproc`
and other mainstream XSLT tooling enable EXSLT by default for the
same reason.

Changes:

* src/parser.rs — `parse_file` and `parse_bytes` call `register_exslt()`
  on entry; the `Once` guard makes the FFI fire exactly once per
  process. Stylesheets that reference `str:tokenize` et al. now Just
  Work without ceremony at the call site.

* src/lib.rs — doc comment on `register_exslt` reframed: the function
  is now an idempotent manual hook for deterministic early init
  (tests, embedders), not a required call. Thread-safety credited to
  `std::sync::Once`.

* build.rs — symmetric pkg-config handling for libxslt and libexslt.
  Both probe via pkg-config and fall back to `cargo:rustc-link-lib`
  on miss, instead of libxslt panicking while libexslt falls back.

* src/bindings.rs — `MANUAL EDIT` marker on the `exsltRegisterAll`
  extern block so a future bindgen regeneration doesn't silently
  drop it.

* tests/base_tests.rs — new `exslt_str_tokenize_auto_registers`
  regression test, self-contained (inline XSL + `<root/>` source via
  `parse_bytes` / `parse_string`, no fixture files). Deliberately
  avoids a manual `register_exslt()` call so it would fail if
  auto-registration regressed.

* Cargo.toml — version bumped 0.1.2 → 0.1.3.

* CHANGELOG.md — 0.1.3 section dated 2026-04-22 with release notes;
  new empty `[0.1.4] (in development)` header opened above.

Verified: `cargo clippy --all-targets -- -D warnings` is clean;
`cargo test` passes all 16 bindings-layout tests and 3 base tests
(including the new EXSLT test).
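
The symmetric probe-then-fall-back handling described for build.rs could look roughly like the sketch below. It is an illustration under assumptions, not the crate's actual build script: it shells out to the `pkg-config` binary with only the standard library, and the helper names `probe_pkg_config` and `link_directives` are hypothetical.

```rust
use std::process::Command;

/// Ask the `pkg-config` binary for linker flags; `None` if it is missing,
/// fails, or does not know the library.
fn probe_pkg_config(lib: &str) -> Option<String> {
    let output = Command::new("pkg-config").args(["--libs", lib]).output().ok()?;
    if !output.status.success() {
        return None;
    }
    Some(String::from_utf8_lossy(&output.stdout).trim().to_string())
}

/// Turn a probe result into cargo directives. On a miss, fall back to a
/// plain dynamic link against `fallback_name` on the default search path.
fn link_directives(probe: Option<&str>, fallback_name: &str) -> Vec<String> {
    match probe {
        Some(flags) => flags
            .split_whitespace()
            .filter_map(|flag| {
                if let Some(dir) = flag.strip_prefix("-L") {
                    Some(format!("cargo:rustc-link-search=native={dir}"))
                } else {
                    flag.strip_prefix("-l")
                        .map(|name| format!("cargo:rustc-link-lib={name}"))
                }
            })
            .collect(),
        None => vec![format!("cargo:rustc-link-lib=dylib={fallback_name}")],
    }
}

fn main() {
    // Both libraries go through the same code path; neither one panics on a miss.
    for (pc_name, fallback) in [("libxslt", "xslt"), ("libexslt", "exslt")] {
        let probe = probe_pkg_config(pc_name);
        for line in link_directives(probe.as_deref(), fallback) {
            println!("{line}");
        }
    }
}
```

Keeping the directive generation in a pure function separates the "what to emit" decision from the environment probe, which makes the fallback path easy to exercise without uninstalling pkg-config.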

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After 10a6fe0, the crate unconditionally links libexslt. Reflect
that in the two places downstream users actually look:

* .travis.yml — add `libexslt-dev` to the apt package list so the
  existing Linux build matrix keeps passing.

* README.md — new "Installation" section listing the development
  headers (libxml2, libxslt, libexslt) across Debian/Ubuntu, Fedora
  and macOS. Previously the README said nothing about system deps;
  this is also the natural spot to land the EXSLT callout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
travis-ci.org was decommissioned in 2021; the existing .travis.yml
has been non-functional for years. Port the Linux build matrix
(stable / beta / nightly) to GitHub Actions, keep clippy-on-stable,
and retire the dead configuration.

* .github/workflows/ci.yml — new. Runs on push to master and on all
  pull requests. Installs libxml2-dev / libxslt1-dev / libexslt-dev
  via apt, then `cargo build`, `cargo test`, and (stable only)
  `cargo clippy --all-targets -- -D warnings`.
* .travis.yml — removed.
* scripts/doc-upload.sh — removed; it relied on TRAVIS_* env vars
  and a TRAVIS-era GH_TOKEN secret, so it was dead code. Porting
  gh-pages deployment is a separate decision (branch, URL, token
  source) and is deliberately left out of this transition.
* Cargo.toml — drop the now-stale `exclude = ["scripts/*"]`.
* README.md — swap the Travis build badge for the GitHub Actions
  CI badge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Modern Debian/Ubuntu (Bookworm, Noble, and later) bundle the libexslt
headers, shared library, and pkg-config metadata inside libxslt1-dev.
The separate `libexslt-dev` package was dropped — the GHA run for PR
#7 failed with `E: Unable to locate package libexslt-dev` on the
ubuntu-latest (noble) image.

Verified locally:

    $ dpkg -L libxslt1-dev | grep -E 'exslt|libexslt\.'
    /usr/include/libexslt
    /usr/include/libexslt/exslt.h
    ...
    /usr/lib/x86_64-linux-gnu/pkgconfig/libexslt.pc

Drop `libexslt-dev` from the CI apt line and from the README's
per-distro install snippet; note in the README that libexslt now
rides along with the libxslt devel package on all three platforms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
libxslt mutates the input xmlDoc while applying stylesheet-directed
whitespace stripping (see
https://gitlab.gnome.org/GNOME/libxslt/-/issues/14). The previous
signature

    pub fn transform(&mut self, doc: &Document, ...)

handed that C-side mutation a shared Rust reference, which is
undefined behaviour reachable from safe code.

Switch to consuming ownership:

    pub fn transform(&mut self, doc: Document, ...)

`doc` is dropped at the end of the transform; libxml's `Document`
Drop impl frees the underlying xmlDoc. The returned `real_dom` is
a separately allocated xmlDocPtr produced by libxslt, so there is
no aliasing between input and output. Matches the resolution I
proposed on the issue.

This is a breaking API change for every caller — but the crate is
already cutting a breaking 0.1.3 release (link-time libexslt dep),
so the two fit into a single upgrade cycle. Downstreams should
replace `transform(&doc, ...)` with `transform(doc, ...)`; clone
the `Document` up front if you need to run it through multiple
stylesheets.
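
To illustrate the ownership change, here is a self-contained sketch with hypothetical stand-in types: `Document` is modelled as an owned string rather than an `xmlDocPtr`, and whitespace stripping stands in for libxslt's in-place mutation. It is a model of the signature change only, not the crate's implementation.

```rust
/// Hypothetical stand-in for libxml's `Document`; the real type wraps an
/// `xmlDocPtr`, here it is just an owned string.
#[derive(Clone, Debug, PartialEq)]
struct Document {
    text: String,
}

/// Hypothetical stand-in for `Stylesheet`.
struct Stylesheet {
    strip_whitespace: bool,
}

impl Stylesheet {
    /// Takes `doc` by value, so in-place edits to the input (modelled here
    /// by whitespace stripping) cannot be observed through any other
    /// reference. The input is dropped when this function returns.
    fn transform(&mut self, mut doc: Document) -> Document {
        if self.strip_whitespace {
            doc.text.retain(|c| !c.is_whitespace());
        }
        // The result is a separate allocation from the input.
        Document { text: format!("<out>{}</out>", doc.text) }
    }
}

fn main() {
    let source = Document { text: "<a> <b/> </a>".to_string() };
    let mut strip = Stylesheet { strip_whitespace: true };
    let mut keep = Stylesheet { strip_whitespace: false };

    // Clone up front when the same input feeds more than one stylesheet.
    let first = strip.transform(source.clone());
    let second = keep.transform(source); // `source` is moved here
    println!("{}\n{}", first.text, second.text);
}
```

With the by-value signature, the compiler rejects any later use of `source` after the second call, which is the guarantee the shared-reference signature could not provide.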

Verified: `cargo clippy --all-targets -- -D warnings` clean;
all 16 bindings-layout and 3 base tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dginev changed the title from "v0.1.3: auto-register EXSLT extensions + migrate CI to GitHub Actions" to "v0.1.3: EXSLT extensions, transform soundness fix (#6), CI → GitHub Actions" on Apr 22, 2026
dginev merged commit f65bc81 into master on Apr 22, 2026
3 checks passed
dginev added a commit that referenced this pull request Apr 22, 2026
Development

Successfully merging this pull request may close these issues.

unsoundness from libxslt issue #14