I'm new to Kubernetes and to supporting a particular website hosted in Kubernetes. I'm trying to figure out why cert-manager did not renew the certificate in the QA environment a few weeks back.
Looking at the details of various certificate-related resources, the problem seems to be that the challenge failed:
State: invalid, Reason: Error accepting authorization: acme: authorization error for [DOMAIN]: 400 urn:ietf:params:acme:error:connection: Fetching http://[DOMAIN]/.well-known/acme-challenge/[CHALLENGE TOKEN STRING]: Timeout during connect (likely firewall problem)
I assume that error means Let's Encrypt wasn't able to access the challenge file at http://[DOMAIN]/.well-known/acme-challenge/[CHALLENGE TOKEN STRING]
(Domain and challenge token string redacted)
I've tried connecting to the URL via PowerShell:
PS C:\Users\Simon> invoke-webrequest -uri http://[DOMAIN]/.well-known/acme-challenge/[CHALLENGE TOKEN STRING] -SkipCertificateCheck
and it returns a 200 OK.
However, PowerShell follows redirects automatically. Checking with Wireshark shows that the Nginx web server answers the HTTP request with a 308 Permanent Redirect to https://[DOMAIN]/.well-known/acme-challenge/[CHALLENGE TOKEN STRING] (the same URL, just redirected from HTTP to HTTPS).
I understand that Let's Encrypt should be able to handle HTTP to HTTPS redirects.
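To look at the first response itself, rather than whatever page is reached after redirects are followed, a single request that does not follow redirects can be made. Below is a minimal sketch in Python (my choice of language, not something cert-manager provides). The helper names fetch_once and is_https_upgrade are hypothetical, and the URL has to be supplied on the command line since the real one is redacted here.

```python
import http.client
import sys
from urllib.parse import urlsplit


def fetch_once(url, timeout=10.0):
    """Send one GET request and return (status, Location header, body).

    http.client never follows redirects, so a 308 is reported as a 308.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location"), resp.read()
    finally:
        conn.close()


def is_https_upgrade(original_url, location):
    """Return True when location is original_url with only the scheme changed to https."""
    if not location:
        return False
    src, dst = urlsplit(original_url), urlsplit(location)
    return (
        src.scheme == "http"
        and dst.scheme == "https"
        and src.hostname == dst.hostname
        and src.path == dst.path
        and src.query == dst.query
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (script name is hypothetical): python check_challenge.py http://<domain>/.well-known/acme-challenge/<token>
    status, location, body = fetch_once(sys.argv[1])
    print("status:", status)
    print("location:", location)
    print("plain HTTP-to-HTTPS upgrade:", is_https_upgrade(sys.argv[1], location))
    print("first bytes of body:", body[:80])
```

This only shows what a client on my network sees. It cannot reproduce the connection timeout that Let's Encrypt reported from its own validation servers.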
Given that the URL Let's Encrypt was trying to reach is accessible from the internet, I'm at a loss as to what the next step in investigating this issue should be. Could anyone offer any advice?
Here is the full output of the kubectl cert-manager plugin, checking the status of the certificate and associated resources:
PS C:\Users\Simon> kubectl cert-manager status certificate -n qa containers-tls-secret
Name: containers-tls-secret
Namespace: qa
Created at: 2020-10-16T08:40:14+13:00
Conditions:
Ready: False, Reason: Expired, Message: Certificate expired on Sun, 14 Mar 2021 17:41:12 UTC
Issuing: False, Reason: Failed, Message: The certificate request has failed to complete and will be retried: Failed to wait for order resource "containers-tls-secret-q2cwr-3223066309" to become ready: order is in "invalid" state:
DNS Names:
- [DOMAIN]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Issuing 31s (x236 over 9d) cert-manager Renewing certificate as renewal was scheduled at 2021-02-12 17:41:12 +0000 UTC
Normal Reused 31s (x236 over 9d) cert-manager Reusing private key stored in existing Secret resource "containers-tls-secret"
Warning Failed 31s (x236 over 9d) cert-manager The certificate request has failed to complete and will be retried: Failed to wait for order resource "containers-tls-secret-q2cwr-3223066309" to become ready: order is in "invalid" state:
Issuer:
Name: letsencrypt
Kind: ClusterIssuer
Conditions:
Ready: True, Reason: ACMEAccountRegistered, Message: The ACME account was registered with the ACME server
Events: <none>
Secret:
Name: containers-tls-secret
Issuer Country: US
Issuer Organisation: Let's Encrypt
Issuer Common Name: R3
Key Usage: Digital Signature, Key Encipherment
Extended Key Usages: Server Authentication, Client Authentication
Public Key Algorithm: RSA
Signature Algorithm: SHA256-RSA
Subject Key ID: dadf29869b58d05e980c390fdc8783f52369228d
Authority Key ID: 142eb317b75856cbae500940e61faf9d8b14c2c6
Serial Number: 04f7356add94a7909afab94f0847a3457765
Events: <none>
Not Before: 2020-12-15T06:41:12+13:00
Not After: 2021-03-15T06:41:12+13:00
Renewal Time: 2021-02-13T06:41:12+13:00
CertificateRequest:
Name: containers-tls-secret-q2cwr
Namespace: qa
Conditions:
Ready: False, Reason: Failed, Message: Failed to wait for order resource "containers-tls-secret-q2cwr-3223066309" to become ready: order is in "invalid" state:
Events: <none>
Order:
Name: containers-tls-secret-q2cwr-3223066309
State: invalid, Reason:
Authorizations:
URL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/10810339315, Identifier: [DOMAIN], Initial State: pending, Wildcard: false
FailureTime: 2021-02-13T06:41:59+13:00
Challenges:
- Name: containers-tls-secret-q2cwr-3223066309-2302286353, Type: HTTP-01, Token: [CHALLENGE TOKEN STRING], Key: [CHALLENGE TOKEN STRING].8b00cc-ysOWGQ8vtmpOJobWOFa2cEQUe4Sun5NUKCws, State: invalid, Reason: Error accepting authorization: acme: authorization error for [DOMAIN]: 400 urn:ietf:params:acme:error:connection: Fetching http://[DOMAIN]/.well-known/acme-challenge/[CHALLENGE TOKEN STRING]: Timeout during connect (likely firewall problem), Processing: false, Presented: false
By the way, the invoke-webrequest results show an HTML page was returned:
<!doctype html><html lang="en"><head><meta charset="utf-8"><title>Containers</title><base href="./"><meta name="viewport" content="width=device-width,initial-scale=1"><link rel="icon" href="favicon.ico…
Could that be the issue? I don't know what Let's Encrypt expects to find at the URL of the HTTP01 challenge. Is a web page allowed or is it expecting something different?
EDIT: I now suspect the HTML page returned by invoke-webrequest is not normal, since I understand the file should include the Let's Encrypt token and a key. Here is the full HTML page:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Wineworks</title>
<base href="./">
<meta name="viewport" content="width=device-width,initial-scale=1">
<link rel="icon" href="favicon.ico">
<link rel="apple-touch-icon-precomposed" href="favicon-152.png">
<meta name="msapplication-TileColor" content="#FFFFFF">
<meta name="msapplication-TileImage" content="favicon-152.png">
<script src="https://secure.aadcdn.microsoftonline-p.com/lib/1.0.16/js/adal.min.js"/>
<link href="styles.025a840d59ecfcfe427e.bundle.css" rel="stylesheet"/>
</head>
<body>
<app-root/>
<script type="text/javascript" src="inline.ce954cfcbe723b5986e6.bundle.js"/>
<script type="text/javascript" src="polyfills.7edc676f7558876c179d.bundle.js"/>
<script type="text/javascript" src="main.da3590aac44ee76e7b3a.bundle.js"/>
</body>
</html>
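As far as I understand the HTTP-01 challenge (RFC 8555, section 8.3), the response body is expected to be the key authorization: the token, a dot, and the base64url thumbprint of the ACME account key. The Key field shown for the challenge in the status output above appears to have exactly that shape. A minimal sketch in Python for comparing a fetched body with that value follows. The function name matches_key_authorization and the sample values are hypothetical, since the real token is redacted.

```python
def matches_key_authorization(body: bytes, key_authorization: str) -> bool:
    """Compare a challenge response body with the expected key authorization.

    Trailing whitespace is ignored, which RFC 8555 says the validating
    server should do as well.
    """
    text = body.decode("utf-8", errors="replace").rstrip()
    return text == key_authorization


# Hypothetical values for illustration only; the real ones are redacted.
expected = "example-token.example-thumbprint"

# A body containing just the key authorization passes the comparison.
print(matches_key_authorization(b"example-token.example-thumbprint\n", expected))  # True

# An HTML page like the one above does not.
print(matches_key_authorization(b"<!doctype html><html lang=\"en\">", expected))  # False
```

By this check, the HTML page returned by my request would not count as a valid challenge response.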
Any idea what might cause cert-manager to drop the wrong kind of file at the challenge location?