Hey,
Issue
KOTS support bundle gathering hangs indefinitely.
Description
Recently we have been on a few troubleshooting calls with customers who were having issues with their embedded clusters. During these calls we noticed that once they were asked to create a support bundle for us, the ‘analyser’ would just hang indefinitely, both in the KOTS Admin Console and when running it manually via kubectl kots.
I saw there is a similar issue; however, in our case it just never completed (we waited around 15 minutes, when it usually takes about 1 minute).
Cases
There were several different cases; here are a few:
- rook-ceph was in an unhealthy state (Error), and the analyser was stuck on ‘collecting CEPH data’.
- It hung when trying to collect sysctl information.
What is our expectation?
The analyser should time out after X seconds if it is unable to collect data and then move on, ideally still reporting the pod events and logs even if the pod is in an unhealthy state.
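To make the expectation concrete, here is a rough sketch in Python of the behaviour we have in mind. It is not KOTS or Troubleshoot code: the function names (run_with_timeout, gather) and the two stand-in collectors are hypothetical and only illustrate "give each collector a deadline, record the timeout, carry on with the next one".

```python
import threading
import time


def run_with_timeout(name, collect, timeout_s):
    """Run one collector; give up after timeout_s seconds and report it."""
    result = {}

    def target():
        try:
            result["data"] = collect()
        except Exception as exc:  # a failing collector must not abort the bundle
            result["error"] = f"failed: {exc}"

    # Daemon thread: a collector that never returns cannot block process exit.
    worker = threading.Thread(target=target, name=name, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # Still running after the deadline: record the timeout and move on.
        return {"error": f"timed out after {timeout_s}s"}
    return result


def gather(collectors, timeout_s):
    """Run every collector in order, skipping over the ones that hang."""
    return {name: run_with_timeout(name, fn, timeout_s) for name, fn in collectors}


# Hypothetical collectors standing in for the cases above.
def hung_ceph():
    time.sleep(2)  # simulates a collector stuck on an unhealthy rook-ceph
    return "ceph status"


def sysctl_ok():
    return "sysctl output"


if __name__ == "__main__":
    report = gather([("ceph", hung_ceph), ("sysctl", sysctl_ok)], timeout_s=0.2)
    for name, outcome in report.items():
        print(name, outcome)
```

In this sketch the stuck collector ends up in the report as a timeout entry instead of blocking everything after it, which is what we would hope the real bundle could do.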
Any help appreciated.