Proposal: ship an official CJK-capable LaTeX template for PDF export (follow-up to #409) #2253

@Fengzdadi

Hi nbconvert team,

First of all, thank you for maintaining nbconvert — I use it a lot in my daily work/study, especially for generating PDF reports from notebooks.

I’m a user/developer based in China, and most of my notebooks contain a mix of English and Chinese text. I really appreciate that since nbconvert 5.0 the PDF exporter uses XeTeX instead of pdfTeX, which is great for Unicode support. However, I ran into the same problem as in #409: out-of-the-box PDF export still doesn’t render CJK text unless I manually customize the LaTeX template.

Background

  • Issue #409 (“We don't support CJK fonts right now”) was opened back in 2016 and describes exactly this problem.
  • As far as I can tell, the core behavior is still similar today:
    • PDF export works fine for Latin scripts;
    • but Chinese/Japanese/Korean characters are missing or rendered as tofu unless the user adds fontspec/xeCJK and a CJK font in a custom template.
  • Many CJK users end up copy-pasting small template hacks from blog posts or StackOverflow. This works, but the fix is hard to discover and has no official backing.

What CJK users typically do today

Right now, the common workaround looks like this (in a custom LaTeX template that extends the default one):

\usepackage{fontspec}
\usepackage{xeCJK}
\setmainfont{Latin Modern Roman}
\setCJKmainfont{Noto Sans CJK SC} % or another system CJK font

and then calling:

jupyter nbconvert notebook.ipynb --to pdf --template <custom_cjk_template>

As long as xelatex and the chosen CJK font are installed, this works very well in practice.
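Since the workaround depends on both xelatex and the font being present, a quick pre-flight check can save a confusing LaTeX error. The following is only a sketch: it assumes a system where fontconfig's fc-list is available (typically Linux or macOS), and the helper names find_missing_tools, font_is_listed, and check_environment are made up for illustration, not part of nbconvert.

```python
import shutil
import subprocess


def find_missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def font_is_listed(font_name, fc_list_output):
    """Return True if `font_name` occurs in fc-list style output (case-insensitive)."""
    needle = font_name.lower()
    return any(needle in line.lower() for line in fc_list_output.splitlines())


def check_environment(cjk_font="Noto Sans CJK SC"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = [f"{tool} not found on PATH"
                for tool in find_missing_tools(["xelatex", "fc-list"])]
    if problems:
        return problems
    # `fc-list : family` prints the family names of all fonts known to fontconfig.
    result = subprocess.run(["fc-list", ":", "family"],
                            capture_output=True, text=True, check=False)
    if not font_is_listed(cjk_font, result.stdout):
        problems.append(f"font '{cjk_font}' not reported by fc-list")
    return problems
```

Calling check_environment() before running nbconvert returns an empty list when everything looks fine, or a list of messages describing what is missing.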

Proposal

Instead of changing the default LaTeX template (which I understand could be risky for non-CJK users), would the project be open to:

  1. Shipping an optional CJK-friendly LaTeX template in nbconvert itself, for example:
    • template name: latex_cjk (or any name you prefer);
    • implementation: a small template that extends the existing LaTeX base template and adds fontspec, xeCJK, and a configurable \setCJKmainfont{...}.
  2. Adding a short documentation section like “Using nbconvert with CJK languages” that explains:
    • how to use it: jupyter nbconvert notebook.ipynb --to pdf --template latex_cjk;
    • how to configure the CJK font name (e.g., via config or template variable);
    • which extra TeX packages may be required (e.g. texlive-xetex, texlive-lang-chinese, etc.).

For example, a minimal template could conceptually look like:

((*- extends 'article.tplx' -*))

((* block header *))
    ((( super() )))
    \usepackage{fontspec}
    \usepackage{xeCJK}

    \setmainfont{Latin Modern Roman}
    % CJK font name would ideally be configurable, but could default to something like Noto Sans CJK
    \setCJKmainfont{Noto Sans CJK SC}
((* endblock header *))

Of course, I’m happy to adapt this to the preferred, modern template system (.tex.j2 + conf.json) and follow whatever conventions nbconvert uses internally.
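To make the .tex.j2 + conf.json layout concrete, here is a rough sketch of a script that writes such a template directory. Several things in it are assumptions rather than confirmed nbconvert behavior: the template name latex_cjk is just the name proposed above, the conf.json keys mirror what I understand the built-in latex template to declare, and the parent path 'latex/index.tex.j2' may need adjusting. The function write_cjk_template and the @...@ placeholders are purely illustrative.

```python
import json
from pathlib import Path

# Hypothetical template name from this proposal; nbconvert does not ship it today.
TEMPLATE_NAME = "latex_cjk"

# Assumption: the built-in LaTeX template can be extended as 'latex/index.tex.j2'.
INDEX_TEX_J2 = r"""((*- extends 'latex/index.tex.j2' -*))

((* block header *))
    ((( super() )))
    \usepackage{fontspec}
    \usepackage{xeCJK}

    \setmainfont{@MAIN_FONT@}
    \setCJKmainfont{@CJK_FONT@}
((* endblock header *))
"""


def write_cjk_template(base_dir, cjk_font="Noto Sans CJK SC",
                       main_font="Latin Modern Roman"):
    """Create <base_dir>/latex_cjk with conf.json and index.tex.j2; return its path."""
    template_dir = Path(base_dir) / TEMPLATE_NAME
    template_dir.mkdir(parents=True, exist_ok=True)

    # Assumed to match the structure of the stock latex template's conf.json.
    conf = {
        "base_template": "latex",
        "mimetypes": {"text/latex": True, "text/tex": True, "application/pdf": True},
    }
    (template_dir / "conf.json").write_text(json.dumps(conf, indent=2), encoding="utf-8")

    # Plain string replacement avoids clashing with LaTeX braces or Jinja delimiters.
    index = INDEX_TEX_J2.replace("@MAIN_FONT@", main_font).replace("@CJK_FONT@", cjk_font)
    (template_dir / "index.tex.j2").write_text(index, encoding="utf-8")
    return template_dir
```

If the directory is written under the nbconvert/templates folder of the Jupyter data directory (as printed by jupyter --data-dir), the template should then be selectable with --template latex_cjk. Baking the font name in at generation time is only a stand-in for a proper config or template variable.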

Questions for the maintainers

  • Is there already any ongoing work on CJK PDF support that I might have missed?
  • Would you be open to a PR that:
    • adds such an optional latex_cjk (or similar) template;
    • adds a small CJK test notebook; and
    • documents how to use it in the docs?
  • Do you have any preferences regarding:
    • template naming/location, and
    • which CJK fonts to use as examples (e.g. Noto Sans CJK vs. system defaults)?
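For the test notebook mentioned above, something as small as the following might be enough. This is a sketch under assumptions: it builds the notebook as plain JSON in nbformat 4.4 layout using only the standard library, and the function names and cell contents are placeholders I made up, not an existing nbconvert fixture.

```python
import json


def build_cjk_test_notebook():
    """Return a minimal nbformat 4.4 notebook dict mixing Latin and CJK text."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": [
            "# Mixed-script test\n",
            "\n",
            "English text, 中文, 日本語, 한국어.\n",
        ],
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": ["print('你好, world')"],
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},
        "nbformat": 4,
        # Minor version 4 keeps the example short (no per-cell "id" fields).
        "nbformat_minor": 4,
    }


def dump_notebook(notebook):
    """Serialize the notebook, keeping CJK characters readable instead of \\u escapes."""
    return json.dumps(notebook, ensure_ascii=False, indent=1)
```

Writing dump_notebook(build_cjk_test_notebook()) to a .ipynb file with UTF-8 encoding gives a notebook that a PDF export test could convert and then check for missing glyphs.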

If this direction sounds reasonable, I’d be happy to try putting together a PR and iterate based on your feedback.

Thanks again for all your work on nbconvert!
