Log Collection for Crashed and Terminated Pods

Thomas_Corvazier · January 26, 2023, 9:17am

Hi all,

Currently, our support-bundle only includes logs from running and recently terminated pods.

However, it is important to also have access to logs from crashed or terminated pods in order to troubleshoot and resolve issues effectively.

I am considering using Collectd or Fluentd with local file output.

I would appreciate any other recommendations you may have for log collection solutions.

Thanks,
Thomas

diamon_w · January 26, 2023, 9:30pm

Hello @Thomas_Corvazier

Are you referring to Pods which are no longer visible with kubectl get pods? If so, you’re correct that you’d want to aggregate the logs somewhere so they persist as Pods come and go.

We do have a suggestion on the Troubleshoot GitHub repo for a collector that could retrieve logs from an aggregator like Elasticsearch/Logstash, but today I believe something like Fluentd with local file output and our CopyFromHost collector would be your best path forward.

Thomas_Corvazier · January 27, 2023, 4:57pm

Thanks for your response, Diamon.
Yes, correct, I meant pods deleted by the cluster and not showing in kubectl get pods.
After some investigation, it seems I will need a log collector that works across many Kubernetes distributions. Fluentd seems to make a lot of assumptions about how log files are stored in the cluster (are they stored in /var/log/containers? As JSON or plain text?), so maybe the best would be to use a k8s API based log collector instead?
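To make the "JSON or plain text?" concern concrete, here is a minimal Python sketch of a parser that accepts both of the on-disk layouts commonly found under /var/log/containers: the Docker json-file format (one JSON object per line with `log`, `stream`, and `time` keys) and the CRI format used by containerd and CRI-O (`<timestamp> <stream> <P|F> <message>`). The names `LogRecord` and `parse_container_log_line` are hypothetical, and the sketch assumes a distribution uses one of these two layouts, which may not hold everywhere.

```python
import json
import re
from typing import NamedTuple, Optional


class LogRecord(NamedTuple):
    """Normalized view of one container log line (hypothetical helper type)."""
    timestamp: str
    stream: str
    message: str


# CRI layout: "<RFC 3339 timestamp> <stdout|stderr> <P|F> <message>"
# P marks a partial line, F a full line.
_CRI_LINE = re.compile(r"^(\S+) (stdout|stderr) ([PF]) (.*)$")


def parse_container_log_line(line: str) -> Optional[LogRecord]:
    """Parse one line in Docker json-file or CRI format; return None if neither."""
    line = line.rstrip("\n")

    # Docker json-file layout: {"log": "...\n", "stream": "...", "time": "..."}
    if line.startswith("{"):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            entry = None
        if isinstance(entry, dict) and "log" in entry:
            return LogRecord(
                timestamp=str(entry.get("time", "")),
                stream=str(entry.get("stream", "")),
                message=str(entry["log"]).rstrip("\n"),
            )

    # Fall back to the CRI plain-text layout.
    match = _CRI_LINE.match(line)
    if match:
        timestamp, stream, _tag, message = match.groups()
        return LogRecord(timestamp, stream, message)

    return None
```

A collector built this way would still depend on where each distribution writes the files, so it only addresses the format half of the problem.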

diamon_w · March 2, 2023, 4:34pm

Sorry @Thomas_Corvazier for the extremely late reply here.

Thomas_Corvazier wrote: "so maybe the best would be to use a k8s API based log collector instead?"

I think I’d need some context on what you mean by a k8s API based log collector. Are you referring to what we have available today, the Pod Logs collector (Troubleshoot Docs)? As you have already discovered, this will only collect logs from containers that are still running or visible with kubectl get pods.

To capture logs from pods that no longer exist, you would need some sort of log aggregation. I think your idea about Fluentd would be the easiest way to go about this. I imagine that each node would have the logs available at /some/file/path/to/fluentd/logs, and then you could scoop them up from each node in the cluster with the Copy Files and Directories from Hosts collector (Troubleshoot Docs). Does that make sense?
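As a rough illustration of that last step, the Python sketch below renders a support bundle spec containing a single copyFromHost collector for a given host directory. The field names (`collectorName`, `image`, `hostPath`, `extractArchive`) are written as I understand them from the Troubleshoot docs and should be checked against the current reference. The function name, the bundle and collector names, and the `busybox:1` image are assumptions, and the path in the usage example is only the placeholder from the post above, not a real Fluentd output location.

```python
import json

# Spec layout assumed from the Troubleshoot docs; verify before use.
_SPEC_TEMPLATE = """\
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
  name: {bundle_name}
spec:
  collectors:
    - copyFromHost:
        collectorName: {collector_name}
        image: busybox:1
        hostPath: {host_path}
        extractArchive: true
"""


def render_copy_from_host_spec(
    host_path: str,
    collector_name: str = "fluentd-logs",       # hypothetical name
    bundle_name: str = "terminated-pod-logs",   # hypothetical name
) -> str:
    """Return a YAML support bundle spec that copies host_path from the nodes."""
    # json.dumps produces a double-quoted string, which is also a valid
    # YAML scalar, so unusual characters in the path are escaped safely.
    return _SPEC_TEMPLATE.format(
        bundle_name=json.dumps(bundle_name),
        collector_name=json.dumps(collector_name),
        host_path=json.dumps(host_path),
    )


if __name__ == "__main__":
    # Placeholder path taken verbatim from the post above.
    print(render_copy_from_host_spec("/some/file/path/to/fluentd/logs"))
```

The directory is a parameter because the actual location depends on how Fluentd's file output is configured on each node.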
