问题现象
在 EKS 集群中,部分节点上的 CloudWatch Agent Pods 持续处于 CrashLoopBackOff 状态,查看日志发现以下错误:
E! [EC2] Fetch identity document from EC2 metadata fail: EC2MetadataRequestError: failed to get EC2 instance identity document caused by: EC2MetadataError: failed to make EC2Metadata request status code: 401, request id: I! Detected the instance is OnPremise E! Failed to generate TOML configuration validation content: [Under path : /agent/ruleRegion/ | Error : Region info is missing for mode: onPrem] Configuration validation first phase failed.关键错误信息:
- 无法访问 EC2 Instance Metadata Service (IMDS)
- 被错误检测为 OnPremise 模式
- 缺少 region 信息导致配置验证失败