KOTS >1.95.0 doesn't deploy all manifests?

Hi,

We have been running KOTS 1.91.1 for a while, but now we need to upgrade to at least 1.96. In that version Troubleshooting.sh was upgraded to 58, which fixed one of the issues we were having with rook-ceph logs not being collected.*

Anyway, we upgraded to 1.96.3, and that started breaking fresh installs (on test machines), because for some reason KOTS just stopped deploying some of the manifests (1 secret & 1 service account).

Here is what I did as a follow-up:

  • Upgraded to 1.97.0, redeployed - issue persists.
  • Downgraded to 1.91.1, redeployed - secret & service account got applied (as expected)
  • Deleted secret, upgraded to 1.97.0, redeployed - issue persists. No new secret has been added.
  • Downgraded to 1.96.0, redeployed - issue persists.
  • Downgraded to 1.95.0, redeployed - it works!

In the cases where it doesn't work, there are no errors in the UI, and no errors or warnings in the kotsadm pod.

Here is the manifest with the secret that KOTS fails to deploy:

---
apiVersion: v1
data:
  tls.crt: '{{repl ConfigOption "ingress_tls_cert" }}'
  tls.key: '{{repl ConfigOption "ingress_tls_key" }}'
kind: Secret
metadata:
  name: my-tls-name
  namespace: my-app
type: kubernetes.io/tls
---
<----  Another secret here ---- >
---
apiVersion: v1
data:
  tls.crt: '{{repl ConfigOption "ingress_tls_cert" }}'
  tls.key: '{{repl ConfigOption "ingress_tls_key" }}'
kind: Secret
metadata:
  name: my-tls-name
  namespace: my-app-data
type: kubernetes.io/tls
---

*On a side note, it wasn't easy figuring out how troubleshoot relates to KOTS. I couldn't find any documentation around it. Also, if it's possible, it would be great to have troubleshoot decoupled from KOTS, so that if there is an issue in troubleshoot we wouldn't need to upgrade KOTS at the same time.

Thanks for any help!
Regards,
Dom


With regard to the concern about troubleshoot being a dependency of KOTS: this is because the KOTS Admin Console exposes a way to generate support bundles in its UI. For more on this, please take a look at Generating Support Bundles | Replicated Docs.

As for the resources not being deployed as expected, this is a regression introduced in 1.96.0: multiple instances of the same resource (same apiVersion, kind, and name) get deduplicated even if they are in different namespaces. A fix has been prioritised. The current workaround is to give each resource a different name.
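
To make the failure mode concrete, here is a rough Python sketch of that kind of deduplication. It is not the actual KOTS code (KOTS is not written in Python). The function names, the choice to keep the first duplicate, and the renamed secrets `my-tls-name-app` / `my-tls-name-data` are all made up for illustration. It only shows why a key without the namespace collapses the two secrets from the question into one, and why renaming avoids that.

```python
from typing import Dict, List, Tuple

Manifest = Dict[str, object]


def identity(doc: Manifest, include_namespace: bool) -> Tuple[str, ...]:
    """Build the key used to decide whether two manifests are 'the same'."""
    meta = doc["metadata"]
    key = (doc["apiVersion"], doc["kind"], meta["name"])
    if include_namespace:
        # Namespaced resources are only identical if the namespace matches too.
        key += (meta.get("namespace", ""),)
    return key


def dedupe(docs: List[Manifest], include_namespace: bool) -> List[Manifest]:
    """Keep one manifest per key (assumption: the first one seen wins)."""
    seen: Dict[Tuple[str, ...], Manifest] = {}
    for doc in docs:
        seen.setdefault(identity(doc, include_namespace), doc)
    return list(seen.values())


def tls_secret(name: str, namespace: str) -> Manifest:
    """Minimal stand-in for the TLS secrets in the question (data omitted)."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/tls",
        "metadata": {"name": name, "namespace": namespace},
    }


# Same name in two namespaces, as in the original manifests.
secrets = [
    tls_secret("my-tls-name", "my-app"),
    tls_secret("my-tls-name", "my-app-data"),
]

# Key without namespace: one of the two secrets is silently dropped.
buggy = dedupe(secrets, include_namespace=False)

# Key with namespace: both secrets survive.
fixed = dedupe(secrets, include_namespace=True)

# Workaround: distinct (hypothetical) names survive even the buggy key.
renamed = dedupe(
    [
        tls_secret("my-tls-name-app", "my-app"),
        tls_secret("my-tls-name-data", "my-app-data"),
    ],
    include_namespace=False,
)

print(len(buggy), len(fixed), len(renamed))  # prints: 1 2 2
```

If you go with the rename workaround, remember that anything referencing the secret by name (for example an Ingress `secretName`) has to be updated to match.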


Thanks for fixing it in 1.98.1!

That was super quick, I appreciate it.