EC version upgrade fails with “x509: a root or intermediate certificate is not authorized to sign for this name” behind an enterprise HTTPS inspection proxy

Symptom

When attempting an Embedded Cluster version upgrade through an enterprise HTTPS inspection proxy (also called a TLS-intercepting or MITM proxy), the upgrade fails with an error similar to:

failed to start upgrade service: wait for upgrade service to become ready:
upgrade service terminated. ping error: : stderr: Error: failed to bootstrap:
failed to pull archive from online: failed to pull: failed to pull:
failed to fetch upstream "replicated://": download upstream failed:
failed to download replicated app: failed to execute get request:
Get "https://replicated.app/release/?...":
tls: failed to verify certificate:
x509: a root or intermediate certificate is not authorized to sign for this name:
DNS name "*.replicated.app" is excluded by constraint "..."

The phrase “excluded by constraint” in the error is the key signal. This is not a certificate expiry or trust store gap — it is a Name Constraints enforcement failure.
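
As a quick triage aid, the sketch below sorts a captured error message into the common certificate failure classes. The function name classify_x509_error and the log file name in the usage line are hypothetical; the matched phrases are the ones Go's crypto/x509 emits, and anything else is reported as unknown rather than guessed at.

```shell
# classify_x509_error: hypothetical helper. Reads an error message on stdin
# and prints a coarse failure class based on well-known crypto/x509 phrases.
classify_x509_error() {
  msg=$(cat)
  case $msg in
    *"excluded by constraint"*)
      echo "name-constraints: name falls in an Excluded subtree" ;;
    *"not permitted by any constraint"*)
      echo "name-constraints: name is outside every Permitted subtree" ;;
    *"signed by unknown authority"*)
      echo "trust-store: issuing CA is not trusted" ;;
    *"has expired or is not yet valid"*)
      echo "validity: certificate is outside its validity period" ;;
    *)
      echo "unknown: not a recognized x509 failure" ;;
  esac
}
```

Usage, assuming the upgrade error was saved to a file: classify_x509_error < upgrade-error.log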

What makes this confusing

This error typically appears alongside behavior that seems contradictory:

  • curl https://replicated.app succeeds from the EC host
  • App-only KOTS upgrades (Helm chart and image upgrades) succeed
  • Only EC version upgrades (where the embedded cluster binary itself is updated) fail

This asymmetry leads to the understandable assumption that the proxy is mostly working and something specific to the EC upgrade path is broken. The actual cause is more subtle.

Root cause

Enterprise HTTPS inspection proxies terminate TLS, inspect the traffic, and then present a newly generated certificate to the client — signed by the proxy’s own internal CA.

If that signing CA has an X.509 Name Constraints extension, either a permitted list limited to specific suffixes (e.g. only internal domains) or an excluded list that covers external names, then any certificate it generates for a name outside those limits is invalid under RFC 5280 §4.2.1.10.

Go’s crypto/x509 enforces Name Constraints strictly. The EC upgrade service is a Go binary. When it validates the proxy-generated certificate for replicated.app, Go rejects it because the signing CA is not permitted to issue certificates for that domain.

The TLS library behind curl on the host does not always enforce Name Constraints the same way. Enforcement depends on which library curl is linked against (often, but not always, the system OpenSSL), its version, and how the Linux distribution builds and configures it. This is why curl from the host can succeed: its TLS library may silently accept the otherwise-invalid certificate chain.
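
One way to check how a particular host's OpenSSL treats Name Constraints is to build a throwaway constrained CA locally and verify a leaf for replicated.app against it. This is a minimal sketch, not part of any Replicated tooling: the function name nc_demo, the corp.example suffix, and the file names are all hypothetical. On builds that enforce the extension, openssl verify reports a permitted subtree violation; a host that accepts this leaf would be consistent with curl succeeding through the proxy.

```shell
# nc_demo: hypothetical demo. Builds a CA permitted to sign only for
# .corp.example, issues a leaf for replicated.app from it, and prints
# whatever verdict the local OpenSSL gives.
nc_demo() {
  work=$(mktemp -d) || return 1

  cat > "$work/ca.cnf" <<'EOF'
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
CN = Demo Constrained CA
[v3_ca]
basicConstraints = critical,CA:TRUE
keyUsage = critical,keyCertSign,cRLSign
nameConstraints = critical,permitted;DNS:.corp.example
EOF

  cat > "$work/leaf.cnf" <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = replicated.app
[v3_leaf]
subjectAltName = DNS:replicated.app,DNS:*.replicated.app
EOF

  # Self-signed CA carrying the Name Constraints extension.
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -config "$work/ca.cnf" -keyout "$work/ca.key" -out "$work/ca.pem" 2>/dev/null

  # Leaf for replicated.app signed by that CA, as an inspection proxy would do.
  openssl req -new -newkey rsa:2048 -nodes \
    -config "$work/leaf.cnf" -keyout "$work/leaf.key" -out "$work/leaf.csr" 2>/dev/null
  openssl x509 -req -in "$work/leaf.csr" -days 1 \
    -CA "$work/ca.pem" -CAkey "$work/ca.key" -CAcreateserial \
    -extfile "$work/leaf.cnf" -extensions v3_leaf -out "$work/leaf.pem" 2>/dev/null

  # Print the verification verdict (stdout and stderr), whatever it is.
  openssl verify -CAfile "$work/ca.pem" "$work/leaf.pem" 2>&1
  rm -rf "$work"
}
```

Run nc_demo on the EC host and, if openssl is available there, inside the KOTS pod. The demo uses a Permitted list; swapping the nameConstraints line for an excluded entry exercises the other half of the extension.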

Why app-only upgrades succeed

App-only KOTS upgrades do not invoke the EC upgrade service. The upgrade service is a separate binary launched only when the embedded cluster version itself is changing. It may also use a different CA bundle path than the KOTS pod, which would explain why a curl run inside the KOTS pod can fail with a different error (unable to get local issuer certificate) while the upgrade service fails with the Name Constraints error.

Component          | TLS library     | Proxy CA in bundle  | Result
EC host curl       | System OpenSSL  | Yes                 | Works: OpenSSL may not enforce constraint
KOTS pod curl      | OpenSSL in curl | No (smaller bundle) | Fails: unable to get local issuer cert
EC upgrade service | Go crypto/x509  | Yes (larger bundle) | Fails: Name Constraints strictly enforced

Diagnosing Name Constraints

To confirm the proxy CA carries a Name Constraints extension, extract and inspect the certificate chain the proxy presents:

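openssl s_client -connect replicated.app:443 -servername replicated.app -showcerts </dev/null 2>/dev/null | \
  awk '/BEGIN CERTIFICATE/,/END CERTIFICATE/' > /tmp/chain.pem

# openssl x509 reads only the first certificate in a file, so print every certificate in the chain
openssl crl2pkcs7 -nocrl -certfile /tmp/chain.pem | \
  openssl pkcs7 -print_certs -text -noout | grep -A 15 "Name Constraints"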

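If the output contains a Permitted or Excluded block, the CA is constrained. Go's "excluded by constraint" message corresponds to an Excluded entry; a name that falls outside every Permitted entry is instead reported as "not permitted by any constraint". If nothing is printed, the constrained certificate may be a root CA that the proxy does not send; in that case, inspect the proxy CA file itself with openssl x509 -noout -text. Example output: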

X509v3 Name Constraints:
  Permitted:
    DNS:<internal-domain-suffix>

Compare the issuer field to what you would expect from replicated.app's real certificate (issued by Cloudflare or a public CA). If the issuer is an internal enterprise CA, the proxy is intercepting and re-signing the connection.
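
To see which certificate in the chain carries the constraint, and who issued each one, the captured chain can be split and summarized per certificate. This is a sketch only; report_chain is a hypothetical helper name, and it assumes a PEM file such as the /tmp/chain.pem captured earlier.

```shell
# report_chain: hypothetical helper. Splits a PEM bundle into individual
# certificates and prints subject, issuer, and any Name Constraints for each.
report_chain() {
  chain=$1
  dir=$(mktemp -d) || return 1

  # Start a new file at every BEGIN line; ignore anything before the first one.
  awk -v dir="$dir" '
    /-----BEGIN CERTIFICATE-----/ { n++ }
    n { print > (dir "/cert-" n ".pem") }
  ' "$chain"

  for f in "$dir"/cert-*.pem; do
    [ -e "$f" ] || { echo "no certificates found in $chain"; break; }
    echo "== $(basename "$f")"
    openssl x509 -noout -subject -issuer -in "$f"
    openssl x509 -noout -text -in "$f" | grep -A 6 "Name Constraints" \
      || echo "    (no Name Constraints extension)"
  done
  rm -rf "$dir"
}
```

Usage: report_chain /tmp/chain.pem. The first entry is the leaf the proxy generated; the entries after it are the CA certificates the proxy chose to send.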

Resolution

Adding the proxy CA to the node trust store does not fix this. Name Constraints are encoded in the CA certificate itself — the constraint is a policy restriction on what the CA is permitted to sign. Trusting the CA more does not expand its permitted subtrees.

The correct fix is to configure the HTTPS inspection proxy to bypass certificate interception for Replicated endpoints. When the proxy is bypassed for these destinations, it passes through the real Replicated certificate (issued by a public CA with no constraints) and Go's validation succeeds.

Minimum endpoints to configure as bypass targets:

- replicated.app
- *.replicated.app
- registry.replicated.com
- proxy.replicated.com

For the complete list of required egress endpoints, see the Embedded Cluster network requirements documentation.

Airgap as a fallback

If modifying the proxy configuration is not immediately feasible, an airgap upgrade workflow avoids all HTTPS traffic to replicated.app during the upgrade process. See Performing an Embedded Cluster Airgap Upgrade for details.

Verifying the fix

After the proxy bypass is configured, retry the upgrade. You can pre-verify from the EC host and KOTS pod:

# From the host — check the issuer is now a public CA, not the proxy CA
curl -vv https://replicated.app 2>&1 | grep "issuer"

# From the KOTS pod
kubectl exec -n kotsadm deploy/kotsadm -- \
  curl -vv https://replicated.app/ 2>&1 | grep -E "issuer|SSL certificate verify"

Both should show a public issuer and SSL certificate verify ok.
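
The same check can be repeated across the bypass targets in one pass. The sketch below is an illustration under stated assumptions: check_bypass, fetch_issuer, and PROXY_CA_PATTERN are hypothetical names, the default pattern is a placeholder for a distinctive part of your proxy CA's subject, and the probe assumes transparent interception (an explicit forward proxy would need additional s_client options). A wildcard entry such as *.replicated.app cannot be probed directly, so pass concrete hostnames.

```shell
# Placeholder: set this to a distinctive substring of the proxy CA's subject.
PROXY_CA_PATTERN=${PROXY_CA_PATTERN:-"Example Corp Inspection CA"}

# fetch_issuer: prints the issuer line of the leaf certificate a host presents.
fetch_issuer() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer 2>/dev/null
}

# check_bypass: for each hostname, report whether the presented certificate
# still appears to be issued by the inspection proxy's CA.
check_bypass() {
  for host in "$@"; do
    issuer=$(fetch_issuer "$host")
    case $issuer in
      "")                    echo "$host: no certificate retrieved" ;;
      *"$PROXY_CA_PATTERN"*) echo "$host: INTERCEPTED ($issuer)" ;;
      *)                     echo "$host: bypassed ($issuer)" ;;
    esac
  done
}
```

Usage: check_bypass replicated.app registry.replicated.com proxy.replicated.com. Any host reported as INTERCEPTED is still being re-signed by the proxy and will continue to fail Go's validation.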