X Tutup
Skip to content

Deterministic Rendering#30576

Open
melwyncarlo wants to merge 5 commits intomatplotlib:mainfrom
melwyncarlo:issue_30549
Open

Deterministic Rendering#30576
melwyncarlo wants to merge 5 commits intomatplotlib:mainfrom
melwyncarlo:issue_30549

Conversation

@melwyncarlo
Copy link
Contributor

PR summary

This change potentially solves the non-deterministic rendering of polylines.
This is caused by SNAP_AUTO and two different methods of performing pixel translation for the same polyline vertices (that is based on a line's diagonality, which is further based on the polyline's vertex order).
Subject to further tests, this PR closes #30549.

This video demonstrates the change.

@tacaswell
Copy link
Member

This seems like the wrong approach as currently we leave it to the user to control if they want snapping or not and changes it to never snap.

@melwyncarlo
Copy link
Contributor Author

Okay, so now I've changed the code to set the snap mode to SNAP_FALSE only if gc.snap_mode == SNAP_AUTO (which is None in Python, which occurs when the user either ignores this value or explicitly sets it to None).
Based on the current codebase, SNAP_AUTO leads to two different snapping methods for the same set of vertices, based on the diagonality of the lines formed by those vertices (i.e., vertex order).

@tacaswell
Copy link
Member

This is still a behavior change (hence the large number of failing tests). Changing the meaning of "auto" to mean "do not snap" also does not seem like a correct change to me.

@melwyncarlo
Copy link
Contributor Author

melwyncarlo commented Sep 18, 2025

I'm not saying this is the final solution, but it is a possible solution, as it solves #30549.
Tests are definitely required.

We are not changing any meaning.
First call is to a function in path_converter.h from _backend_agg.h. Over there, SNAP_AUTO leads to the relevant function deciding whether or not to snap based on the vertex order (i.e., the diagonality of the line formed by any two adjacent vertices). So, two sets of vertices consisting of the same points, each in different order, would internally lead either to snap=True or snap=False, which makes no sense (because the points are exactly the same).

Regardless, in the event of snap, that function (in path_converter.h) deals with the snapping; in the event of don't snap, the calling function in _backend_agg.h performs the snapping anyway by invoking the AGG's affine translation function (which produces a different result). Either way snapping takes place.

Just for the second argument in the call to the path_converter.h function, converting SNAP_AUTO to either SNAP_TRUE or SNAP_FALSE avoids the above mentioned problem. I went with SNAP_FALSE because another function-call a few lines below also sends SNAP_FALSE (and not gc.snap_mode) as its second argument.

Regarding the failing tests, I agree about that.
Could that be because of the way the tests are implemented? (i.e., the expected images are as per the original code)

image

@tacaswell
Copy link
Member

You are getting local failures on the main branch? We do have some flaky tests, but none of them are related to this sort of thing (mostly subprocess issues in CI machines).

Are you sure you either used editable mode or re-build/re-installed Matplotlib?


This change is effectively removing "auto" mode in this case which is at best a change in behavior and at worst a regression for cases where this is valuable (either way it is causing tests to fail).

I think you need to look at

static bool should_snap(VertexSource &path, e_snap_mode snap_mode, unsigned total_vertices)
{
// If this contains only straight horizontal or vertical lines, it should be
// snapped to the nearest pixels
double x0 = 0, y0 = 0, x1 = 0, y1 = 0;
unsigned code;
switch (snap_mode) {
case SNAP_AUTO:
if (total_vertices > 1024) {
return false;
}
code = path.vertex(&x0, &y0);
if (code == agg::path_cmd_stop) {
return false;
}
while ((code = path.vertex(&x1, &y1)) != agg::path_cmd_stop) {
switch (code) {
case agg::path_cmd_curve3:
case agg::path_cmd_curve4:
return false;
case agg::path_cmd_line_to:
if (fabs(x0 - x1) >= 1e-4 && fabs(y0 - y1) >= 1e-4) {
return false;
}
}
x0 = x1;
y0 = y1;
}
return true;
case SNAP_FALSE:
return false;
case SNAP_TRUE:
return true;
}
and explain why this is the wrong behavior (or at least wrong in more cases than it is right) and we should change the default behavior from "snap when it will help, otherwise don't" to "always snap".

@tacaswell
Copy link
Member

tacaswell commented Sep 18, 2025

On another bit of consideration, I think the problem is that should_snap above does not correctly consider close code; the fix is to treat it with path_cmd_line_to.

@melwyncarlo
Copy link
Contributor Author

melwyncarlo commented Sep 18, 2025

On another bit of consideration, I think the problem is that should_snap above does not correctly consider close code; the fix is to treat it with path_cmd_line_to.

path_cmd_line_to is where the problem occurs. If the line formed by the vertices is horizontal or vertical, it returns true; but if it's a diagonal, then both dx > 0 and dy > 0, and so it returns false.

I don't know how to fix this part, because I don't know the reasoning behind not snapping non-horizontal/non-vertical lines.

@melwyncarlo
Copy link
Contributor Author

melwyncarlo commented Sep 18, 2025

There was an error on my part (did not do a hard reset); the tests no longer fail for test_axes on the original branch, but they still do fail for a few other tests.

It would be helpful to know how the baseline (expected) images are generated in the first place. Could those images be incompatible with the new code if it were hypothetically to be changed?

@jklymak
Copy link
Member

jklymak commented Sep 18, 2025

Please read https://matplotlib.org/devdocs/devel/testing.html for how the images are made and can be updated.

@tacaswell
Copy link
Member

I think this is the right fix here, but needs to be verified. This at least does not fail the existing tests:

diff --git a/src/path_converters.h b/src/path_converters.h
index 1482aeed95..4aab6ecca5 100644
--- a/src/path_converters.h
+++ b/src/path_converters.h
@@ -560,7 +560,7 @@ class PathSnapper
     {
         // If this contains only straight horizontal or vertical lines, it should be
         // snapped to the nearest pixels
-        double x0 = 0, y0 = 0, x1 = 0, y1 = 0;
+        double x0 = 0, y0 = 0, x1 = 0, y1 = 0, xs = 0, ys = 0;
         unsigned code;

         switch (snap_mode) {
@@ -570,12 +570,20 @@ class PathSnapper
             }

             code = path.vertex(&x0, &y0);
-            if (code == agg::path_cmd_stop) {
+            switch (code) {
+            case agg::path_cmd_stop:
                 return false;
+            case agg::path_cmd_move_to:
+                xs = x0;
+                ys = y0;
             }

             while ((code = path.vertex(&x1, &y1)) != agg::path_cmd_stop) {
                 switch (code) {
+                case agg::path_cmd_move_to:
+                    xs = x1;
+                    ys = x1;
+                    break;
                 case agg::path_cmd_curve3:
                 case agg::path_cmd_curve4:
                     return false;
@@ -583,9 +591,16 @@ class PathSnapper
                     if (fabs(x0 - x1) >= 1e-4 && fabs(y0 - y1) >= 1e-4) {
                         return false;
                     }
+                    break;
+                case agg::path_cmd_end_poly:
+                    if (fabs(xs - x0) >= 1e-4 && fabs(ys - y0) >= 1e-4) {
+                        return false;
+                    }
                 }
                 x0 = x1;
                 y0 = y1;
+
+
             }

             return true;

I think the problem is that we are checking if the path has any lines that are not purely vertical or horizontal, however we did not consider the close_poly code which is effectively "line to back to the last move to". With out this if the points are ordered such that we only get horizontal and vertical line-to we decide we should snap, but if the points are ordered such that the lineto goes across the diagonal then we decide not to clip. With this patch we also consider the line draw by the close_poly and should consistently decide to NOT snap with all orders of the triangle points.

Another way to verify that this is the correct explanation is to rotate the triangle by 15deg or so so that it has no horizontal or vertical lines.

double x0 = 0, y0 = 0, x1 = 0, y1 = 0;
unsigned code;
{
const unsigned path_cmd_end_poly_masked = agg::path_flags_close | agg::path_cmd_end_poly;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paths can have multiple closed polygons, what happens in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand. Do you mean multiple polygons within the same commands set? (like self-intersecting polygons)
If yes, then is that practically possible (multiple polygons in one command set)?

In that case, maybe modify:

case path_cmd_end_poly_masked:
    if (fabs(xs - x0) >= 1e-4 && fabs(ys - y0) >= 1e-4) {
        return false;
    }

to:

case path_cmd_end_poly_masked:
    if (fabs(xs - x0) >= 1e-4 && fabs(ys - y0) >= 1e-4) {
        return false;
    }
    xs = x0 = x1;
    ys = y0 = y1;

and continue on . . .

I'm just guessing. It would be helpful to have a list of possible command sequences to verify against.

@melwyncarlo
Copy link
Contributor Author

Thank you very much, @jklymak . I've checked out that link before. It still does not explain how the baseline images are created in the first place.
Based on this link, the baseline images are part of the original source package. My guess is that the baseline images are generated using the same code, and are considered the new baseline images (benchmarks) based on some collective decision/approval...

@melwyncarlo
Copy link
Contributor Author

melwyncarlo commented Sep 19, 2025

The above fix is better, I agree. test_collections.py succeeds. But it still fails in test_axes.py for test_sticky_tolerance_contourf() and test_preset_clip_paths() for the proposed fix (if you remove the fix, it passes).

def test_sticky_tolerance_contourf():

def test_preset_clip_paths():

I guess the corresponding baseline images need an update?


Also, I don't know if a similar fix is required for the Cairo backend as well, as even the backend_cairo.py file has a draw_markers function.

def draw_markers(self, gc, marker_path, marker_trans, path, transform,

@tacaswell
Copy link
Member

What I had in mind was something like:

import matplotlib.pyplot as plt
from matplotlib.transforms import Affine2D
import matplotlib.patches as mpatches
import matplotlib.path as mpath

p = mpath.Path.unit_regular_polygon(3)
pp = mpath.Path.make_compound_path(
    p.transformed(Affine2D().translate(1, 1)),
    p.transformed(Affine2D().translate(1, 10)),
)

fig, ax = plt.subplots(1, 1)
ax.add_artist(mpatches.PathPatch(pp))
ax.set_xlim(0, 2)
ax.set_ylim(0, 11)
plt.show()

however, I was a bit confused and forgot that the Python side (effective) enums and the c++ side (effective) enums are not perfectly aligned and what we have from the Python side for the close poly is 79 which is exactly the | you added.

It appears you can make a pathological path that does not have the move_to between the polygons (it draws something, not 100% sure it should and if what it draws is right, but it does not fail to draw and in not obviously wrong). Probably should add the extra resetting of xs and ys on close just to be safe.

@tacaswell
Copy link
Member

Some of these failures look like they are images that should be re-generated (they are pretty clearly due to triangles being snapped differently), but I do not understand all of the failures.

I think this is indeed a bug that needs to be fixed (the bug being the "should we snap" logic missed an edge case of N horizonal/vertical lines and then a diagonal line on the close poly), we just need to understand why each of the changed images is affected to be sure we are not putting new bugs in!

@tacaswell tacaswell added this to the v3.12.0 milestone Sep 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Inconsistent line anti-aliasing in PolyCollection

3 participants

X Tutup