Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upFailed certificates are re-tried infinitely without backoff #3222
Comments
|
/priority important-soon |
|
/good-first-issue |
|
@meyskens: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment


Describe the bug:
The rate limiter queue in the controllers does not appear to actually rate limit. Certificate failures are re-tried immediately without backoff.
Expected behaviour:
When a certificate fails to be provisioned with an error, it would be retried with an exponential back-off.
Steps to reproduce the bug:
This should work with any other setup as well as it seems like it's the generic error handling re-try mechanism, I simply only have route53 and the lets encrypt issuer set up.
Anything else we need to know?:
I have tried debugging it a bit but didn't get far. The logs lead me to the rate-limit conclusion, see the timing between re-try attempts. Based on https://github.com/jetstack/cert-manager/blob/v0.16.0/pkg/controller/acmechallenges/controller.go#L87 there should be a minimum delay of 30s between the 1st and 2nd attempt but it does not look like it is the case. I've also added another log statement to see if maybe the queue is not aware of the re-tries but it looks like that works as well (see the lines with
Item has been re-queued ... "req"="0",reqis justb.queue.NumRequeues(obj)). I am not sure what is missing.(note: the retries go on forever and exhaust the AWS R53 rate limit)
Environment details::
/kind bug