Troubleshooting Unhealthy State

Modified on Mon, 9 Sep at 3:38 PM

After agents are deployed, they may appear in the Gremlin UI with the state 'Unhealthy'.


An agent is reported in this state when it cannot successfully run its validation tests for latency and CPU attacks, or cannot relay the results of those tests to the Gremlin control plane.


If a host appears unhealthy, it can be useful to run similar tests locally on that host and note any error messages they produce.


gremlin attack latency -l 30
gremlin attack cpu -l 30 -p 5
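
To keep the output of both commands for later review, they can be wrapped in a small script. The following is a minimal sketch, not an official Gremlin tool: `run_check` and `LOG_DIR` are hypothetical names introduced here, and the only Gremlin commands used are the two shown above.

```shell
#!/usr/bin/env bash
# Run each local test and save its output to a log file.
# LOG_DIR and run_check are illustrative names, not part of the Gremlin CLI.
LOG_DIR="${LOG_DIR:-/tmp/gremlin-checks}"
mkdir -p "$LOG_DIR"

run_check() {
  # $1 = short label used for the log file name; remaining args = command to run
  local label="$1"; shift
  local log="$LOG_DIR/$label.log"
  local status=0
  "$@" >"$log" 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "PASS: $label"
  else
    echo "FAIL: $label (exit $status, output in $log)"
  fi
  return "$status"
}

if command -v gremlin >/dev/null 2>&1; then
  run_check latency gremlin attack latency -l 30 || true
  run_check cpu gremlin attack cpu -l 30 -p 5 || true
else
  echo "gremlin not found on PATH; run this on the host where the agent is installed"
fi
```

Any FAIL line points to the log file holding the error text for that command.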


If both commands complete successfully, verify that the host can reach the Gremlin control plane, then restart the Gremlin daemon with:

sudo systemctl restart gremlind
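
One way to check connectivity and confirm that the daemon came back up is sketched below. The endpoint `https://api.gremlin.com` is an assumption; substitute whatever address your agents are configured to reach, including any proxy settings. `check_url` is a hypothetical helper, and the `systemctl` and `journalctl` calls are standard systemd commands rather than anything Gremlin-specific.

```shell
#!/usr/bin/env bash
# GREMLIN_API_URL is an assumed default; override it to match your environment.
GREMLIN_API_URL="${GREMLIN_API_URL:-https://api.gremlin.com}"

check_url() {
  # Succeeds if a connection to $1 completes within 10 seconds,
  # whatever HTTP status code the server returns.
  curl --silent --show-error --output /dev/null --max-time 10 "$1"
}

if check_url "$GREMLIN_API_URL"; then
  echo "control plane reachable: $GREMLIN_API_URL"
else
  echo "control plane NOT reachable: $GREMLIN_API_URL"
fi

# After the restart, confirm the service is active and show its recent log lines.
if command -v systemctl >/dev/null 2>&1; then
  systemctl is-active gremlind || echo "gremlind is not active"
  journalctl -u gremlind --since "10 minutes ago" --no-pager | tail -n 20 || true
fi
```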


In some cases an agent is displayed as unhealthy even though it can run attacks successfully. This may occur when Gremlin is not able to perform experiments against some container runtimes.

If issues persist, please contact Gremlin support with any error messages from the above commands and the contents of /var/log/gremlin, and we will assist you in troubleshooting the issue.
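
To send the contents of /var/log/gremlin in one piece, the directory can be packed into a single archive. This is only a sketch: `bundle_logs` and the output file name are hypothetical, and reading the log directory may require root privileges on your host.

```shell
#!/usr/bin/env bash
bundle_logs() {
  # $1 = directory to archive, $2 = path of the .tar.gz file to create
  local src="$1" out="$2"
  [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
  tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
}

if [ -d /var/log/gremlin ]; then
  archive="/tmp/gremlin-logs-$(uname -n)-$(date +%Y%m%d).tar.gz"
  if bundle_logs /var/log/gremlin "$archive"; then
    echo "wrote $archive"
  fi
fi
```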
