mathtext: add support for unicode mathematics fonts #31064
llohse wants to merge 4 commits into matplotlib:text-overhaul
8009a09 to b8c28d2
I opened a new PR, because this one is based on the text-overhaul branch and I messed up rebasing the old one. @QuLogic: Would you kindly take a look at this? Is this something you would consider for merging? For further discussion, in case you don't reject this feature altogether: How should the math font be configured? This also depends a bit on how prominently it should be visible. I could not think of a good way without introducing a new parameter in rcparams.
    0x22d3: 0x22d2,
}

unicode_math_lut: dict[str, dict[CharacterCodeType, CharacterCodeType]] = {
Most of these are 1-to-1 mappings of a block; I wonder if there is a more compact representation that could be used? Something like:
# (start, end, new_start)
# up digits
(0x30, 0x39, 0x30),
...
# bf latin lower case
(0x61, 0x7a, 0x1d41a),
maybe plus a small dictionary with some of the exceptions, depending on how they fit into the blocks.
The lookup table could be generated from those if necessary.
This is exactly the way I generated the lookup table, offline:
- map the entire range
- fix missing/moved codepoints based on a smaller lookup table
At some point I did consider writing special mapping functions but I figured that a lookup table might be preferable for performance.
Do you prefer to generate the lookup table (for example on module load) instead of hardcoding the entire table?
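A minimal sketch of the range-plus-exceptions idea discussed above, for illustration only. The names `ITALIC_RANGES`, `ITALIC_EXCEPTIONS` and `build_lut` are hypothetical and do not come from the PR; the codepoints are those of the Unicode Mathematical Alphanumeric Symbols block.

```python
# Hypothetical names; this is not the code from the PR.

# (start, end, new_start): map the inclusive range [start, end] onto the
# block that begins at new_start.
ITALIC_RANGES = [
    (0x41, 0x5A, 0x1D434),  # it latin upper case
    (0x61, 0x7A, 0x1D44E),  # it latin lower case
]

# Codepoints that do not follow the block layout: U+1D455 is reserved,
# italic h is encoded as U+210E (PLANCK CONSTANT).
ITALIC_EXCEPTIONS = {0x68: 0x210E}


def build_lut(ranges, exceptions):
    """Expand the range triples into a flat dict, then apply exceptions."""
    lut = {}
    for start, end, new_start in ranges:
        for offset in range(end - start + 1):
            lut[start + offset] = new_start + offset
    # Exceptions override whatever the plain range mapping produced.
    lut.update(exceptions)
    return lut


italic_lut = build_lut(ITALIC_RANGES, ITALIC_EXCEPTIONS)
```

Building the table once (for example at module load) keeps lookups as cheap as with a hard-coded dict, while the source only has to list the ranges and the few exceptions.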
}


class UnicodeMathFonts(TruetypeFonts):
IIUC, math fonts should have tables with various layout metrics. We currently have those hard-coded in the various FontsConstantsBase subclasses, and they are likely incorrect for an arbitrary math font.
So this will likely need to parse this data out of the font and implement at least get_axis_height that was added in #31046, get_xheight maybe using #31050, and get_quad from #31110. But it is likely that you will want to refactor some of those remaining uses of the constants so that they fetch the information from the fonts as well.
I fully agree. Doing this may involve some refactoring though, because the FontsConstantsBase subclass could not be determined purely from fontname but would be dynamically populated based on the loaded OpenType font.
That said, I made some experiments locally. Unfortunately, FreeType does not parse the MATH table. We could use fonttools, which is a hard dependency anyway.
There are several open questions about how to map the OpenType layout metrics to the legacy TeX-inspired variables used in mathtext. Does it make sense to postpone that to a separate PR and focus on the basics here?
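For reference, a rough sketch of how the axis height might be read from the MATH table with fonttools. The function names are hypothetical, and how mathtext would consume these values (for example in get_axis_height) is an assumption that is not settled in this thread.

```python
def read_math_constants(font):
    """Return selected metrics in font design units.

    ``font`` is expected to behave like a ``fontTools.ttLib.TTFont``.
    """
    if "MATH" not in font:
        raise ValueError("font has no OpenType MATH table")
    constants = font["MATH"].table.MathConstants
    return {
        "units_per_em": font["head"].unitsPerEm,
        # AxisHeight is a MathValueRecord; the metric is its .Value field.
        "axis_height": constants.AxisHeight.Value,
    }


def design_units_to_pixels(value, units_per_em, fontsize, dpi):
    """Scale a design-unit metric to pixels for a size in points."""
    return value / units_per_em * fontsize * dpi / 72


def load_math_constants(path):
    """Open a font file with fonttools and read its MATH constants."""
    # Imported lazily so the helpers above work without fonttools.
    from fontTools.ttLib import TTFont

    font = TTFont(path)
    try:
        return read_math_constants(font)
    finally:
        font.close()
```

The values come back in design units, so they still have to be scaled by font size and dpi before they can replace the hard-coded constants.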
a8083e3 to 5988fac
Adds basic support for generic unicode OpenType mathematics fonts such as STIX Two Math or Cambria Math to be used within the mathtext engine.
5988fac to 2a66d49
I have just rebased the branch, split the baseline images into a separate commit, and added logic to handle mathnormal from #31121 in the new From my perspective, it is ready for another review.
PR summary
supersedes #31048
Add basic support for generic unicode OpenType mathematics fonts such as STIX Two Math, Cambria Math, DejaVu Math, etc.
Currently, mathematics text rendering through mathtext in matplotlib supports a hard-coded set of fonts (configured via mathtext.fontset). Its design presumably predates the specification of mathematics alphabets in the Unicode standard. While it is possible to configure custom fonts (mathtext.fontset: custom), this requires setting separate fonts for the upright, italic, fraktur, double-struck, etc. variants, which is fundamentally incompatible with the way modern mathematics fonts are designed.
Unicode defines mathematical alphanumeric symbols as unique codepoints (see https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols), in contrast to different fonts all defining different styles for the same ASCII characters/codepoints.
One relatively modern way to render mathematical formulas uses mathematics fonts such as STIX Two Math, Cambria Math, or Asana Math. For LaTeX, this is implemented in the unicode-math package.
Instead of choosing a font based on the style to render the same codepoints (as matplotlib currently does), this approach maps alphanumeric characters to different codepoints based on the style and renders them from a single font.
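To illustrate the idea, here is a small standalone sketch (not the PR's implementation) that restyles ASCII letters and digits as bold by remapping codepoints, so that a single math font can supply the glyphs. The names `BOLD` and `restyle` are made up for this example.

```python
import unicodedata

# Mathematical Bold letters and digits are contiguous in Unicode, so the
# mapping is a plain offset for each of the three ranges.
BOLD = {
    **{ord("A") + i: 0x1D400 + i for i in range(26)},
    **{ord("a") + i: 0x1D41A + i for i in range(26)},
    **{ord("0") + i: 0x1D7CE + i for i in range(10)},
}


def restyle(text, table):
    """Replace each character that has a styled counterpart in ``table``."""
    return text.translate(table)


bold_x = restyle("x", BOLD)
# Show the resulting codepoint and its official Unicode name.
print(f"U+{ord(bold_x):05X}", unicodedata.name(bold_x))
```

Characters without an entry in the table (operators, spaces) pass through unchanged, which matches the intent of rendering everything from one font.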
Shortcomings of the status quo:
Changes
This change implements basic functionality for using any installed Unicode OpenType mathematics font in mathtext in a portable way. Currently, this can be enabled by setting the rcparams
I could think of different ways to configure this, though.
Internally, I have implemented a separate class UnicodeMathFonts(TruetypeFonts) so as not to interfere with the existing fontsets.
Running the tests currently requires STIX Two Math to be installed on the system. For that reason, I have added it to the test data. One may think about vendoring STIX Two Math or DejaVu Math via mpl-data instead.
Examples
PR checklist