Support colorization of text via TeX directives #31234

Draft
MaddieM4 wants to merge 2 commits into matplotlib:text-overhaul from MaddieM4:colorize-tex

MaddieM4 commented Mar 4, 2026

PR summary

This is my work in progress to fix #6724, allowing TeX-formatted text fields to respect color directives. I'm opening this PR as a draft because I'd like some advice on a few things before bringing it to a more finished state. Here's some example code and the resulting images with my patch. Note that the text in the examples is partially colorized, demonstrating that it works by inline TeX directives.

# Traditional plot
from matplotlib import pyplot as plt
plt.rcParams.update({
    'pgf.texsystem': 'pdflatex',
    'pgf.preamble': r'\usepackage{color}\usepackage{dashrule}',
    'text.usetex': True,
    'text.latex.preamble':  r'\usepackage{color}\usepackage{dashrule}',
})

fig, ax = plt.subplots()
ax.set_ylabel(r'Y $\;$ \textcolor[rgb]{1.0, 0.0, 0.0}{abc\hdashrule[0.5ex]{3cm}{1pt}{1pt 0pt}}')
ax.set_xlabel(r'N $\;$ \textcolor[rgb]{0.0, 1.0, 0.0}{abc\rule[0.5ex]{3cm}{1pt}}')
plt.savefig('axes.png')

# Direct rendering to demonstrate that rotated and unrotated text work correctly
from matplotlib.backends.backend_agg import RendererAgg
from matplotlib.font_manager import FontProperties
from matplotlib.image import imsave
renderer = RendererAgg(400, 400, 120)
renderer.clear()
gc = renderer.new_gc()
props = FontProperties(size=12)
angle = -45
tex_string = r'Y $\;$ \textcolor[rgb]{1.0, 0.0, 0.0}{abc\hdashrule[0.5ex]{3cm}{1pt}{1pt 0pt}}'
renderer.draw_tex(gc, 50, 50, tex_string, props, 0)
renderer.draw_tex(gc, 50, 250, tex_string, props, angle)
imsave('rotations.png', renderer.buffer_rgba())
[Images: axes, rotations]

So, what are the remaining issues?

  1. At an API level, there's an incompatibility between thinking of text as a greyscale thing that can be colorized to any RGB value, vs text that brings its own full RGBA data to the table. The new logic can handle the latter, but loses support for the former.
  2. As some temporary scaffolding, I wrote the original version of this code as a parallel coexisting copy that lives next to the original logic. After I got it working, I did take a look at replacing the original code (I think that's necessary for a complete PR), but there's some logic that really depends on feeding greyscale image data into RendererAgg::draw_text_image, so that code would need to be rethought, and would likely run into the problems from Point 1 (expecting external colorization support).
  3. Because there are some subtle changes to the way that text is composited in the new code, I saw the rotation rendering test fail with a very slight difference between the expected and rendered images. We might choose to regenerate some expected images in the final version of this PR.
  4. When the API questions are sorted out, it makes sense to add some tests for this specific new functionality.
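
The incompatibility in point 1 can be illustrated without any Matplotlib code. The following is a minimal sketch in plain Python, not the PR's implementation: `tint_coverage` and `composite_over` are hypothetical helper names, and pixels are modeled as straight (non-premultiplied) RGBA tuples in the 0..1 range.

```python
# Hypothetical helpers for illustration; neither is part of Matplotlib's API.

def tint_coverage(coverage, rgb):
    """Colorize one greyscale coverage value (0..1) with an external RGB color.

    This models the existing path: the text image only says "how much ink",
    and the caller decides which color the ink has.
    """
    r, g, b = rgb
    return (r, g, b, coverage)


def composite_over(src, dst):
    """Alpha-composite one straight RGBA pixel over another ("over" operator)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


white = (1.0, 1.0, 1.0, 1.0)

# Greyscale path: a half-covered pixel, tinted blue by the caller.
tinted = composite_over(tint_coverage(0.5, (0.0, 0.0, 1.0)), white)

# RGBA path: the pixel already carries red from \textcolor, so there is
# no coverage-only value left for an external color to tint.
own_color = composite_over((1.0, 0.0, 0.0, 1.0), white)
```

In the first case the color comes entirely from the caller; in the second it comes entirely from the image, which is why a single code path cannot serve both without an extra rule for who wins.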

AI Disclosure

I don't use it, I don't need it.

PR checklist


github-actions bot commented Mar 4, 2026

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide.
Please let us know if (and how) you use AI, it will help us give you better feedback on your PR.

We strive to be a welcoming and open project. Please follow our Code of Conduct.

Author

MaddieM4 commented Mar 4, 2026

The current failing tests all seem to be the image comparison test I mentioned in the main body of the PR.

______________________________ test_rotation[png] ______________________________
[gw0] linux -- Python 3.12.12 /opt/hostedtoolcache/Python/3.12.12/x64/bin/python

args = ()
kwds = {'extension': 'png', 'request': <FixtureRequest for <Function test_rotation[png]>>}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
                   ^^^^^^^^^^^^^^^^^^^
E           matplotlib.testing.exceptions.ImageComparisonFailure: images not close (RMS 0.299):
E           	result_images/test_usetex/rotation.png
E           	result_images/test_usetex/rotation-expected.png
E           	result_images/test_usetex/rotation-failed-diff.png

/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/contextlib.py:81: ImageComparisonFailure

There's also a stub test failure, which I take to be because I've added a new (temporary) alternative implementation of TexManager.get_grey (get_rgba_mm4, which needs a better name, but get_rgba is taken by the external coloring API, so I'm open to ideas) without adding it to the corresponding .pyi file. I didn't think it would make sense to add the stub for a temporary method in an in-flux API, but it could be sensible for clearing up noise in the CI.

error: matplotlib.texmanager.TexManager.get_rgba_mm4 is not present in stub
Stub: in file /home/runner/work/matplotlib/matplotlib/lib/matplotlib/texmanager.pyi
MISSING
Runtime: in file /home/runner/work/matplotlib/matplotlib/lib/matplotlib/texmanager.py:347
def (tex, fontsize=None, dpi=None)

Found 1 error (checked 266 modules)

The final CI error I see is a bit opaque to me, in the middle of one of the Windows builds. It seems to not be related to the PR itself.

Starting: Upload to codecov.io
==============================================================================
Task         : Bash
Description  : Run a Bash script on macOS, Linux, or Windows
Version      : 3.268.1
Author       : Microsoft Corporation
Help         : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/bash
==============================================================================
Generating script.
"C:\Program Files\Git\bin\bash.exe" -c pwd
/d/a/_temp

"C:\Program Files\Git\bin\bash.exe" -c pwd
/d/a

========================== Starting Command Output ===========================
"C:\Program Files\Git\bin\bash.exe" /d/a/_temp/cda1b7ba-f6a4-478a-8a49-b3c1562e6fae.sh
/dev/fd/63: line 2: syntax error near unexpected token `<'
/dev/fd/63: line 2: `<html><head>'

##[error]Bash exited with code '2'.
Finishing: Upload to codecov.io

Contributor

anntzer commented Mar 4, 2026

Thanks for working on this. Note that in #30039 I introduced a new way of rendering TeX in Agg (by actually parsing the dvi file and rendering the glyphs one at a time, similarly to what is already done for all the vector backends) which I hope to make the default in the future (if only because that will be necessary to support xetex/luatex); this PR will need to be adapted to handle that approach. (I believe that this amounts to parsing and using the xcolor specials that get inserted in the dvi file.)
Doing so should also allow (mostly?) handling the API incompatibility between the color specified at the TeX level and the one specified at the Matplotlib level: I'd suggest that if a glyph has its color explicitly specified by TeX (i.e. an xcolor special is currently active) then the Matplotlib color should be ignored (likely except for the alpha channel, as that cannot be specified from TeX's side), whereas glyphs with no color specified by TeX should use the Matplotlib color. [Edit: From a bit of experimentation, whether the "default" black text corresponds to no color specified in dvi or an explicit "black" being specified seems to depend on the active packages. Likely we need to explore the xcolor driver behaviors a bit...]
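
The precedence rule suggested above could be sketched roughly as follows. This is only an illustration of the proposal, not existing Matplotlib code: `resolve_glyph_color` is a hypothetical name, and it assumes the TeX side yields either an `(r, g, b)` tuple or `None` per glyph.

```python
def resolve_glyph_color(tex_rgb, mpl_rgba):
    """Pick the final RGBA color for one glyph.

    tex_rgb:  (r, g, b) if a TeX color special is active for this glyph,
              otherwise None.
    mpl_rgba: (r, g, b, a) as specified on the Matplotlib side.
    """
    r, g, b, alpha = mpl_rgba
    if tex_rgb is None:
        # No TeX color: the Matplotlib color applies unchanged.
        return (r, g, b, alpha)
    # TeX color wins for RGB; alpha still comes from Matplotlib,
    # since TeX has no way to specify it.
    tr, tg, tb = tex_rgb
    return (tr, tg, tb, alpha)


# Matplotlib asks for half-transparent blue text.
mpl_color = (0.0, 0.0, 1.0, 0.5)
plain_glyph = resolve_glyph_color(None, mpl_color)
red_glyph = resolve_glyph_color((1.0, 0.0, 0.0), mpl_color)
```

The open question from the edit note remains: if a driver emits an explicit "black" for default text, this rule would treat it as a TeX-specified color and ignore the Matplotlib color.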


The codecov CI error can be ignored. I would suggest adding the typestub for now, even if it needs to be changed later, to make reviewing easier.

{
int x, y;

if (auto value = std::get_if<double>(&vx)) {

Go straight for the newer API here, no need to introduce the old one.

Author

MaddieM4 commented Mar 8, 2026

@anntzer Ooh, rendering glyphs "in-house" with AGG like that makes it way easier to preserve existing APIs by treating external colors as defaults. I'll spend some time this week updating my PR accordingly. Should this PR be targeting merge into the text-overhaul branch instead of main? I was looking at #30039's diff for a while and trying to understand how it worked, before realizing it was merged into an intermediate branch.

Contributor

anntzer commented Mar 8, 2026

You should target text-overhaul for now. Note that running the test suite on that branch is a bit tricky because the baseline images are out of sync and on another temporary branch for now (to make a long story short, that branch exists because we need to update a huge number of baseline images, but don't want to repeatedly do so at each commit because that would really bloat the repo size; so we will just make a single update of all baseline images when the branch is complete, hopefully soon). So it's expected that CI will fail. Fortunately, your feature will be self-contained in separate tests (for text colorization), so hopefully that shouldn't be too big of a problem for your PR.

MaddieM4 changed the base branch from main to text-overhaul (March 9, 2026 14:18)

Author

MaddieM4 commented Mar 9, 2026

Alright. My branch is in a messy state for now, and will need some big changes to work with the new pipeline. But after some reverse-engineering, I can tell that matplotlib.dviread.Dvi is not yet smart enough to expose specials (in my demo code, I saw it using 0xef, or xxx1, for color directives in the generated .dvi file). Before anything else, really, I'll need to smarten up the in-house DVI reader so that even if it doesn't understand all specials, it does allow a user to read them, which will give code like draw_tex something to chew on, for a subset of recognized special directives (probably just color for now).

Contributor

anntzer commented Mar 9, 2026

I guess it depends on what you mean by "not smart enough to expose specials". Specials are just strings after an xxx1/2/3/4 opcode, and these are right now just logged by the Dvi class because it doesn't know how to interpret them; what's missing is indeed actually recording and interpreting these strings, and keeping track of the glyph on which color specials apply.
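
To make "strings after an xxx1/2/3/4 opcode" concrete, here is a minimal sketch of decoding one special and recognizing a color directive. It relies on the DVI layout where `xxx1` to `xxx4` are opcodes 239 to 242, each followed by a big-endian length of 1 to 4 bytes and then the payload. The function names are hypothetical (not `matplotlib.dviread` API), and the `color push rgb r g b` / `color pop` payload syntax is an assumption based on the dvips-style color driver; other drivers and color models are not covered.

```python
XXX1, XXX4 = 239, 242  # 0xef .. 0xf2


def read_special(data, pos):
    """Decode the special whose opcode sits at data[pos].

    Returns (payload_bytes, position_of_next_opcode).
    """
    opcode = data[pos]
    if not XXX1 <= opcode <= XXX4:
        raise ValueError(f"opcode {opcode} is not xxx1..xxx4")
    length_size = opcode - XXX1 + 1          # xxx1 -> 1 byte, ... xxx4 -> 4
    start = pos + 1 + length_size
    length = int.from_bytes(data[pos + 1:start], "big")
    return data[start:start + length], start + length


def parse_color_special(payload):
    """Return ("push", (r, g, b)), ("pop", None), or None if not recognized."""
    words = payload.decode("ascii", errors="replace").split()
    if words[:2] == ["color", "pop"]:
        return ("pop", None)
    if words[:3] == ["color", "push", "rgb"] and len(words) == 6:
        try:
            return ("push", tuple(float(w) for w in words[3:]))
        except ValueError:
            return None
    return None  # unknown special: the caller can keep logging it


# Hand-built example bytes: one xxx1 special followed by one xxx2 special.
push = b"color push rgb 1 0 0"
data = bytes([XXX1, len(push)]) + push + bytes([XXX1 + 1, 0, 9]) + b"color pop"

first_payload, next_pos = read_special(data, 0)
second_payload, end_pos = read_special(data, next_pos)
first = parse_color_special(first_payload)
second = parse_color_special(second_payload)
```

Note that `read_special` only works when `pos` is already known to be an opcode boundary; finding those boundaries still requires walking every opcode in the file.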

Author

MaddieM4 commented Mar 9, 2026

That's exactly what I meant, but my phrasing could have been better. One of the catches here is that, without recognizing a specific type of special, you can't know if it covers a range of glyphs, or if it's something more like "insert an image here." So if we had an interface to iterate all the DVI ops, that could easily cover specials "generically" (since it doesn't need to understand them, just report them). Since we only have a higher-level API that applies ops as state mutations, I don't see a good generic way to expose specials that matplotlib.dviread.Dvi doesn't understand, but we could cover some recognized cases and log the rest as we've been doing.

So the question is how to expose the supported "spanny" specials (covering a range of glyphs) as an API to the Dvi object. Things like font effects are properties of the existing font object, so that doesn't quite work. It wasn't my first choice, but I think it might genuinely make the most sense to add a field to matplotlib.dviread.Text, either as .color or a .props catch-all dictionary. That's going to affect all the consumers of the Text namedtuple, but that's probably a cost that's sensible to pay here, and I'll take a crack at it.

@MaddieM4
Author

It's kinda rough, trying to catch breakage on the text-overhaul fork, because so many tests are just known to be broken image comparisons, and I find myself literally overflowing my terminal buffer when running pytest and trying to visually catch any non-image-compare failures.

Hypothetically, if I were to make the DVI reader changes as a separate PR against main and get that merged there (smaller and easier to review PR anyways), would main get merged or rebased into text-overhaul at some point? This would separate out more clearly the "getting color info from DVI files" step from the "use this color information to actually inform rendering" step.

Contributor

anntzer commented Mar 10, 2026

> So if we had an interface to iterate all the DVI ops, that could easily cover specials "generically" (since it doesn't need to understand them, just report them). Since we only have a higher-level API that applies ops as state mutations, ...

It's probably reasonable to split the Dvi class implementation into a reader part that just yields a list of opcodes, and a second "virtual machine" part that applies them, if that helps.
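
A toy version of that reader / "virtual machine" split might look like the sketch below. It is not the `Dvi` class: it handles only `set_char_0` to `set_char_127` (opcodes 0 to 127) and `xxx1` (opcode 239), uses `chr()` as a stand-in for real glyph lookup, and assumes dvips-style `color push rgb` / `color pop` specials. `read_ops` and `run_ops` are hypothetical names.

```python
def read_ops(data):
    """Reader half: yield (name, arg) tuples without interpreting them."""
    pos = 0
    while pos < len(data):
        opcode = data[pos]
        if opcode <= 127:            # set_char_0 .. set_char_127
            yield ("set_char", opcode)
            pos += 1
        elif opcode == 239:          # xxx1: 1-byte length, then payload
            length = data[pos + 1]
            payload = data[pos + 2:pos + 2 + length]
            yield ("special", payload.decode("ascii"))
            pos += 2 + length
        else:
            raise NotImplementedError(f"opcode {opcode} not covered by this sketch")


def run_ops(ops):
    """VM half: apply ops as state changes, tracking a color stack."""
    color_stack = []
    glyphs = []      # (character, color or None)
    unhandled = []   # specials the VM does not understand
    for name, arg in ops:
        if name == "set_char":
            color = color_stack[-1] if color_stack else None
            glyphs.append((chr(arg), color))
        elif name == "special":
            words = arg.split()
            if words[:3] == ["color", "push", "rgb"] and len(words) == 6:
                color_stack.append(tuple(float(w) for w in words[3:]))
            elif words[:2] == ["color", "pop"] and color_stack:
                color_stack.pop()
            else:
                unhandled.append(arg)
    return glyphs, unhandled


def special(text):
    """Build the bytes of an xxx1 special (test helper for this sketch)."""
    return bytes([239, len(text)]) + text.encode("ascii")


data = (b"Y" + special("color push rgb 1 0 0") + b"ab"
        + special("color pop") + b"c" + special("papersize=5in,5in"))

all_specials = [arg for name, arg in read_ops(data) if name == "special"]
glyphs, unhandled = run_ops(read_ops(data))
```

With this split, a generic consumer can list every special straight from the reader, while the VM only needs to understand the "spanny" ones it attaches to glyphs.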

> So the question is how to expose the supported "spanny" specials (covering a range of glyphs) as an API to the Dvi object. Things like font effects are properties of the existing font object, so that doesn't quite work. It wasn't my first choice, but I think it might genuinely make the most sense to add a field to matplotlib.dviread.Text, either as .color or a .props catch-all dictionary. That's going to affect all the consumers of the Text namedtuple, but that's probably a cost that's sensible to pay here, and I'll take a crack at it.

The reason I made them properties of the font object (and intend to deprecate them as properties of the glyph objects) is that this directly maps to their definition in the dvi files, especially in the xetex (_define_native_font) and luatex (DviFont.from_luatex) cases, so it felt artificial to "lift" that info to the Text objects. OTOH, the extra "color" field suggested here naturally applies to Text objects.

I think deprecating the use of Text as a namedtuple and turning it into a dataclass with an extra "color" field is reasonable. (Side point, though it absolutely doesn't have to be done here: probably we can write a decorator to help that transition, by providing deprecation warnings and temporary backcompat if someone tries to call (indirectly, e.g. via iteration) Text.__iter__() or Text.__getitem__(int)).
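
One possible shape for such a transition decorator, as a rough sketch only: the name `deprecate_tuple_access` is hypothetical, the field list `x, y, font, glyph, width` is assumed from the existing `Text` namedtuple, and the standard library's `warnings` module stands in for whatever deprecation machinery Matplotlib would actually use.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


def deprecate_tuple_access(legacy_fields):
    """Class decorator: keep tuple-style access working, but warn about it.

    Only the legacy fields are exposed through iteration and indexing, so
    existing code that unpacks exactly that many values keeps working.
    """
    def decorator(cls):
        def as_legacy_tuple(self):
            return tuple(getattr(self, name) for name in legacy_fields)

        def __iter__(self):
            warnings.warn(
                f"Iterating over {cls.__name__} is deprecated; "
                "use attribute access instead.",
                DeprecationWarning, stacklevel=2)
            return iter(as_legacy_tuple(self))

        def __getitem__(self, index):
            warnings.warn(
                f"Indexing {cls.__name__} is deprecated; "
                "use attribute access instead.",
                DeprecationWarning, stacklevel=2)
            return as_legacy_tuple(self)[index]

        cls.__iter__ = __iter__
        cls.__getitem__ = __getitem__
        return cls
    return decorator


@deprecate_tuple_access(("x", "y", "font", "glyph", "width"))
@dataclass(frozen=True)
class Text:
    x: int
    y: int
    font: object
    glyph: int
    width: int
    color: Optional[tuple] = None   # new field, attribute access only


text = Text(1, 2, "cmr10", 65, 7, color=(1.0, 0.0, 0.0))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    x, y, font, glyph, width = text   # old-style unpacking still works
    first = text[0]                   # so does integer indexing
```

Keeping `color` out of the tuple view is a design choice in this sketch: it avoids breaking callers that unpack five values, at the cost of the tuple view no longer describing the whole object.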

> It's kinda rough, trying to catch breakage on the text-overhaul fork, because so many tests are just known to be broken image comparisons, and I find myself literally overflowing my terminal buffer when running pytest and trying to visually catch any non-image-compare failures.

The updated figures are actually available at #30161. @QuLogic can you remind us of your workflow? (Personally, I only run the relevant tests (as few as possible) and ignore the errors directly related to outdated images when working on that branch.)

Projects

Status: Waiting for author

Development

Successfully merging this pull request may close these issues.

Inconsistent behavior of backends when rendering latex colors

3 participants