One possible reason for the "unhealthy" status is an incorrect Service Gateway system time.
The Service Gateway backend should use the UTC+0 time zone.
If the time is incorrect, Service Gateway disconnects from Vision One and shows the "unhealthy" status.
To resolve this, add an NTP server to Service Gateway with the following Service Gateway command, then sync the time and confirm that the Service Gateway system time is correct.
# configure ntp {NTP server address}
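After the time sync, it can help to compare the time reported by the appliance against a trusted UTC reference. The following is a minimal sketch only: the function names and the one-minute tolerance are assumptions made for illustration and are not part of the Service Gateway CLI or any Trend Micro API.

```python
from datetime import datetime, timedelta, timezone


def clock_skew(appliance_time: datetime, reference_time: datetime) -> timedelta:
    """Return appliance_time minus reference_time, with both normalized to UTC."""
    if appliance_time.tzinfo is None or reference_time.tzinfo is None:
        raise ValueError("both timestamps must be timezone-aware")
    return appliance_time.astimezone(timezone.utc) - reference_time.astimezone(timezone.utc)


def is_skew_acceptable(skew: timedelta,
                       tolerance: timedelta = timedelta(minutes=1)) -> bool:
    """Check the skew against a tolerance (the one-minute default is an assumed value)."""
    return abs(skew) <= tolerance


if __name__ == "__main__":
    # Hypothetical case: the appliance clock was set to local UTC+8 wall time
    # but is interpreted as UTC+0, so it runs eight hours ahead.
    reference = datetime(2022, 12, 2, 9, 9, 20, tzinfo=timezone.utc)
    appliance = datetime(2022, 12, 2, 17, 9, 20, tzinfo=timezone.utc)
    skew = clock_skew(appliance, reference)
    print("skew:", skew, "acceptable:", is_skew_acceptable(skew))
```

A skew close to a whole number of hours usually points to a time zone mix-up rather than clock drift.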
When connecting to the Service Gateway to enable the administrative functions, the following error appears:
"Kubernetes service starting. Wait a few minutes and try again. (3)"
The CDT package extracted from the Service Gateway contains the following entry, whose timestamp lines up with when the Service Gateway went offline:
"Unable to connect to the server: x509: certificate has expired or is not yet valid: current time 2022-12-02T09:09:20Z is after 2022-11-28T23:08:50Z"
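To see how long the certificate has been expired, the two timestamps in this message can be extracted and subtracted. The sketch below assumes the log line follows the wording quoted above ("current time ... is after ..." or "is before ..."); the function name `parse_x509_time_error` is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches the two RFC 3339 UTC timestamps in the x509 validity error.
X509_TIME_RE = re.compile(
    r"current time (?P<now>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"is (?P<relation>after|before) "
    r"(?P<bound>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
)
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_x509_time_error(message):
    """Return (status, gap) for an x509 validity error, or None if it does not match.

    status is "expired" when the current time is after the bound,
    and "not yet valid" when it is before the bound.
    """
    match = X509_TIME_RE.search(message)
    if match is None:
        return None
    now = datetime.strptime(match["now"], TIME_FORMAT).replace(tzinfo=timezone.utc)
    bound = datetime.strptime(match["bound"], TIME_FORMAT).replace(tzinfo=timezone.utc)
    if match["relation"] == "after":
        return "expired", now - bound
    return "not yet valid", bound - now


if __name__ == "__main__":
    line = ('"Unable to connect to the server: x509: certificate has expired or is '
            'not yet valid: current time 2022-12-02T09:09:20Z is after '
            '2022-11-28T23:08:50Z"')
    status, gap = parse_x509_time_error(line)
    print(status, gap)  # expired 3 days, 10:00:30
```

For the entry above, the certificate expired a little over three days before the error was logged. A "not yet valid" result would instead suggest that the system clock is behind.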
To resolve this, contact Trend Micro Technical Support to obtain the account login information, then perform the steps for your Service Gateway version.
For Service Gateway 1.0:
- Execute the command "microk8s refresh-certs", and wait for it to finish.
- Reboot the OS.
For Service Gateway 2.0:
- Execute the command "microk8s refresh-certs --cert /var/snap/microk8s/current/certs/server.crt", and wait for it to finish.
- Reboot the OS.
- During upgrades of the Vision One Cloud, the heartbeat tunnel may be unstable for a couple of minutes.
- When Service Gateway is performing an upgrade, the duration of the process depends on the CPU/RAM/Disk/Network performance and the number of services installed. The upgrade may take from 30 minutes to 5 hours. During the upgrade, no heartbeat is sent until the network is reconfigured and the appliance configuration is reapplied, which may lead the Cloud to mark the Service Gateway as unhealthy.
- If the network is unstable and a single heartbeat delivery fails, the heartbeat is not resent; this is by design, to prevent message flooding. If the only heartbeat message in a 5-minute window is lost due to network throttling, the Cloud may also mark the Service Gateway as unhealthy.
- If the Service Gateway is under high CPU/RAM/Disk load, the heartbeat thread may hang for longer than 5 minutes.
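The cases above share one mechanism: no heartbeat reaches the Cloud for more than 5 minutes. The sketch below illustrates that timeout under stated assumptions. The 5-minute limit comes from the text above, while the 4-minute send interval and the function name `unhealthy_windows` are hypothetical, and the Cloud's actual detection logic is not documented here.

```python
from datetime import datetime, timedelta, timezone

HEARTBEAT_TIMEOUT = timedelta(minutes=5)  # limit described in the text above


def unhealthy_windows(received, timeout=HEARTBEAT_TIMEOUT):
    """Return (start, end) pairs during which no heartbeat had arrived within `timeout`.

    `received` is a list of datetimes at which heartbeats actually arrived.
    """
    ordered = sorted(received)
    windows = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > timeout:
            # Status turns unhealthy once the timeout elapses and
            # recovers when the next heartbeat arrives.
            windows.append((previous + timeout, current))
    return windows


if __name__ == "__main__":
    start = datetime(2022, 12, 2, 9, 0, tzinfo=timezone.utc)
    # Assumed 4-minute send interval; the heartbeat at +8 minutes is lost
    # and, with no retry, never arrives.
    received = [start + timedelta(minutes=m) for m in (0, 4, 12)]
    for begin, end in unhealthy_windows(received):
        print("unhealthy from", begin.isoformat(), "to", end.isoformat())
```

Because a lost heartbeat is not retried, a single dropped message is enough to open an unhealthy window whenever the gap between the neighbouring heartbeats exceeds the timeout.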
- Run the "show sg" command to collect the Service Gateway status information and take a screenshot. The highlighted message has the key information:
- Run the "connect" command to verify the connection status and take a screenshot, for example:
If the issue persists, use the "log collect" command to collect debugging data and submit it to Trend Micro Technical Support.