Troubleshooting Gremlin on OpenShift

Modified on Wed, 9 Nov, 2022 at 4:40 PM

Gremlin network timeouts

This issue most often shows up as timeout errors in both the Chao and Gremlin logs, such as:

error sending request for url (https://api.gremlin.com/v1/daemon/poll?multiple=1): operation timed out


This usually stems from network rules that block Gremlin's access to the internet. Start by working out how Gremlin is expected to reach the internet on your infrastructure:

  • What other services connect to the internet within your cluster?
  • Do services within your cluster rely on an HTTP proxy when connecting to the internet?
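
To tell a blocked connection apart from a DNS or proxy problem, a quick probe from a pod or node in the affected project can help. The following is only a sketch, not part of Gremlin's tooling: the function names `explain_curl_exit` and `probe_url` are hypothetical, and it assumes `curl` is available wherever you run it. The exit codes are the ones listed in the curl man page.

```shell
# Map a curl exit code to a likely cause (codes per the curl man page).
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    5)  echo "could not resolve proxy" ;;
    6)  echo "could not resolve host" ;;
    7)  echo "failed to connect" ;;
    28) echo "timed out" ;;
    *)  echo "other curl error ($1)" ;;
  esac
}

# Probe a URL with a 10 second limit and print the interpretation.
# Exit code 0 only means the request completed, whatever the HTTP status.
probe_url() {
  curl --silent --output /dev/null --max-time 10 "$1"
  explain_curl_exit "$?"
}

# Example, run from inside the affected project:
#   probe_url https://api.gremlin.com
```

curl honors the `HTTPS_PROXY` environment variable, so running the probe with and without it set can show whether a proxy is required.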

OpenShift Egress network policies

If you've reviewed the proxy requirements and determined that Gremlin does not need an HTTP proxy, but you are still unable to connect Gremlin to the internet, it's likely one or more OpenShift projects are preventing internet access with an EgressNetworkPolicy.

You can list such policies in any project with the following command:

SHELL

oc -n $PROJECT get egressnetworkpolicies
NAME   AGE
test   20m


If you look at the details of such a policy, you can see if network access for api.gremlin.com is denied. Here's an example of a policy which denies api.gremlin.com, because it only allows specific IP address ranges and host names while denying everything else.


SHELL

oc -n $PROJECT get egressnetworkpolicy test -o yaml

YAML

apiVersion: network.openshift.io/v1
kind: EgressNetworkPolicy
metadata:
  name: test
  namespace: test
spec:
  egress:
  - to:
      cidrSelector: 1.2.3.0/24
    type: Allow
  - to:
      dnsName: www.foo.com
    type: Allow
  - to:
      cidrSelector: 0.0.0.0/0
    type: Deny


Adding an Allow rule for api.gremlin.com to such an EgressNetworkPolicy, ahead of the final Deny rule, fixes this problem.


YAML

apiVersion: network.openshift.io/v1
kind: EgressNetworkPolicy
metadata:
  name: test
  namespace: test
spec:
  egress:
  - to:
      cidrSelector: 1.2.3.0/24
    type: Allow
  - to:
      dnsName: www.foo.com
    type: Allow
  - to:
      dnsName: api.gremlin.com
    type: Allow
  - to:
      cidrSelector: 0.0.0.0/0
    type: Deny
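
Because the rules in an EgressNetworkPolicy are evaluated in order and the first match applies, it can be worth confirming which rule a host hits first. The sketch below is a rough illustration under stated assumptions: `first_match_for_host` is a hypothetical helper, it only matches `dnsName` entries by exact name, and it treats `0.0.0.0/0` as matching everything. It does not resolve the host against other CIDR ranges.

```shell
# Reads "TYPE TARGET" lines on stdin, in policy order, and prints the type
# (Allow or Deny) of the first rule that names HOST or is a 0.0.0.0/0 catch-all.
# Prints "NoMatch" if no such rule is found.
first_match_for_host() {
  host="$1"
  while read -r rule_type target; do
    if [ "$target" = "$host" ] || [ "$target" = "0.0.0.0/0" ]; then
      echo "$rule_type"
      return 0
    fi
  done
  echo "NoMatch"
}

# Example: feed it the rules of the policy named "test" (assumes each rule
# has either a dnsName or a cidrSelector; the jsonpath template is a sketch):
#   oc -n "$PROJECT" get egressnetworkpolicy test \
#     -o jsonpath='{range .spec.egress[*]}{.type}{" "}{.to.dnsName}{.to.cidrSelector}{"\n"}{end}' \
#     | first_match_for_host api.gremlin.com
```

For the policy above this should report Allow for api.gremlin.com, whereas the original policy would fall through to the final Deny rule.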


Gremlin attacks cannot find target container (OpenShift 4.9)

This issue will generate a variation of the following error:


container details : time="2022-05-11T13:07:21Z" level=error msg="container \"2584cede1cf01e77d9d9ac8f864f99f1c155268ec1095af2bbde850e73d936a2\" does not exist"


See the Openshift 4.9 | Container does not exist article in the Gremlin knowledge base for a workaround.
