Can you please share best practices for latency experiments? I have seen inconsistent behavior where the latency configured in Gremlin sometimes matches what the app observes, but other times we see 7-10x that value. Are there advanced options we should be using for latency attacks?
Gremlin latency is injected on a per-packet basis (layer 3 in the OSI model). For TCP this means that if your request/response doesn’t fit into a single congestion window, you experience the delay once per round trip. Applications that reuse large connection pools on reliable networks typically have very large congestion windows. If, however, the increase in latency causes requests to pile up (not uncommon) and overwhelm the existing pool, new connections may have to be created with fresh, small windows. HTTP/2 in particular mitigates this by reusing a single connection for multiple requests.
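To make the round-trip effect concrete, here is a back-of-the-envelope sketch (not Gremlin code, and a deliberately simplified model I'm assuming: one round trip per congestion window's worth of data, each round trip paying the injected delay once). The window sizes below are illustrative; 14,600 bytes corresponds to a common fresh-connection initial window of 10 segments of 1,460 bytes.

```python
# Simplified model (assumption, not Gremlin internals): a response that spans
# several congestion windows pays the injected per-packet delay once per
# round trip, which can multiply the latency the application observes.
import math

def observed_latency_ms(injected_ms: float, response_bytes: int,
                        congestion_window_bytes: int) -> float:
    """Approximate app-observed latency when a response spans multiple
    congestion windows. Hypothetical helper for illustration only."""
    round_trips = max(1, math.ceil(response_bytes / congestion_window_bytes))
    return injected_ms * round_trips

# Warm connection, large window: the delay is paid once.
print(observed_latency_ms(100, 8_000, 65_535))    # 100.0
# Fresh connection, small initial window: the same 100ms attack is paid
# several times over, producing the multi-x numbers described above.
print(observed_latency_ms(100, 100_000, 14_600))  # 700.0
```

Under this toy model, a 100ms attack on a large response over a fresh connection lands in exactly the 7-10x range described in the question.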
The other major reason you run into this is that your application makes multiple requests to the same dependency (or to multiple dependencies, if you're applying the attack to both) in serial. If it takes 1ms to talk to DepA and then 1ms to talk to DepB, and you make these requests in serial, a 100ms attack will increase the overall application request by 200ms (100ms for each call).
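The serial-amplification effect can be sketched as follows (the dependency names and delays are illustrative, standing in for real network calls under a 100ms attack):

```python
# Sketch: serial dependency calls pay the injected delay once per call,
# while concurrent calls pay it roughly once in total.
import asyncio
import time

INJECTED_DELAY = 0.1  # a hypothetical 100ms latency attack

async def call_dep(name: str) -> str:
    await asyncio.sleep(INJECTED_DELAY)  # stands in for the delayed network call
    return name

async def serial() -> float:
    start = time.perf_counter()
    await call_dep("DepA")   # +100ms
    await call_dep("DepB")   # +100ms again
    return time.perf_counter() - start

async def concurrent() -> float:
    start = time.perf_counter()
    await asyncio.gather(call_dep("DepA"), call_dep("DepB"))  # overlapped
    return time.perf_counter() - start

print(f"serial:     ~{asyncio.run(serial()):.2f}s")      # ~0.20s: delay paid twice
print(f"concurrent: ~{asyncio.run(concurrent()):.2f}s")  # ~0.10s: delay paid once
```

Issuing independent dependency calls concurrently is one common way to keep a latency attack from compounding per call.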
In both of these cases, applying the latency at the packet layer (layer 3) is representative of common network failure modes (e.g. queuing at network switches, or latency from an increased route distance), and Gremlin is accurately recreating what happens to the network under those conditions.
The way you are using Gremlin is the best practice; what you are likely revealing is a vulnerability/risk within the design of those applications.