gcinterceptor / gci-proxy

Reverse proxy which eliminates the tail latency caused by non-deterministic garbage collection

License: MIT License

Topics: garbage-collection, gci, reverse-proxy, proxy, tail-latency, performance

gci-proxy's Introduction


GCI-Proxy

Disclaimer: this project is still very much in flux.

To help cloud developers deal with the impact of non-deterministic garbage collection interventions, we implemented an easy-to-use mechanism called the Garbage Collection Control Interceptor (GCI). Instead of attempting to minimize the negative impact of garbage collector interventions, GCI controls the garbage collector and sheds incoming load during collections, which avoids CPU competition and stop-the-world pauses while processing requests.

GCI has two main parts: i) the GCI-Proxy -- a multiplatform, runtime-agnostic HTTP intermediary responsible for controlling the garbage collector and shedding load when necessary -- and ii) the Request Processor (RP), a thin layer which runs within the service and is usually implemented as a framework middleware. The latter is responsible for checking the heap allocation and performing garbage collections.

Running GCI Proxy

./gci-proxy --port 3000 --url http://localhost:8080 --ygen=67108864 --tgen=6710886

The flags are:

  • --ygen: size in bytes of the young generation, e.g. 67108864 (64MB)
  • --tgen: size in bytes of the tenured generation, e.g. 67108864 (64MB)
  • --port: port on which gci-proxy will listen, e.g. 3000
  • --url: complete endpoint to which GCI-specific commands are sent, e.g. --url http://localhost:8080/__gci

The URL of the server being proxied is either extracted from the request or indicated by the environment variables HTTP_PROXY and HTTPS_PROXY (or their lowercase versions). HTTPS_PROXY takes precedence over HTTP_PROXY for HTTPS requests.
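For example, assuming the variable semantics described above (the URLs are illustrative):

HTTP_PROXY=http://localhost:8080 ./gci-proxy --port 3000 --ygen=67108864 --tgen=67108864 --url http://localhost:8080/__gci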

GCI Proxy Protocol

The GCI-Proxy communicates with RPs through a simple protocol. The presence of the gci request header indicates that the RP must handle the request; its absence means that the RP should trigger the usual request-processing chain. The header can assume three values (a minimal middleware sketch follows the list):

  • ch (check heap allocation): a blocking HTTP call which expects the heap allocation in bytes as a response. For generational runtimes (e.g., Java and Ruby), the usage of each generation must be separated by | (pipe symbol); for example, 134217728|67108864 means that the young generation is using 128MB and the tenured generation 64MB.
  • gen1 (collect gen1 garbage): a blocking HTTP call which expects the cleanup of generation 1 (e.g., young) garbage. For generational runtimes, it usually represents a minor collection.
  • gen2 (collect gen2 garbage): a blocking HTTP call which expects the cleanup of generation 2 (e.g., tenured) garbage. For generational runtimes, it usually represents a major/full collection.
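As an illustration, here is a minimal RP sketch for the Go runtime (which is non-generational, so gen1 and gen2 both map to runtime.GC()); the middleware name and wiring are ours, not taken from any shipped RP:

package main

import (
	"fmt"
	"net/http"
	"runtime"
)

// gciMiddleware implements the GCI-Proxy protocol: requests carrying
// the gci header are protocol commands; all others reach the service.
func gciMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.Header.Get("gci") {
		case "": // No gci header: usual request-processing chain.
			next.ServeHTTP(w, r)
		case "ch": // Check heap: reply with the allocation in bytes.
			var m runtime.MemStats
			runtime.ReadMemStats(&m)
			fmt.Fprintf(w, "%d", m.HeapAlloc)
		case "gen1", "gen2": // Blocking collection; Go has a single heap.
			runtime.GC()
		default:
			http.Error(w, "unknown gci command", http.StatusBadRequest)
		}
	})
}

func main() {
	svc := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})
	http.ListenAndServe(":8080", gciMiddleware(svc))
}

Pointing --url at this service (e.g. http://localhost:8080) would complete the loop.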

Examples of RPs:

Want to bring GCI to your language or framework? Please get in touch!

gci-proxy's People

Contributors

danielfireman, dependabot[bot]


Forkers

lucashsilva

gci-proxy's Issues

Create endpoint for heap usage metric

Create a metric that returns the heap usage. It should return a number between 0 and 100 which could be used in auto-scaling rules. For instance, the proxy could either publish it directly or be queried by a cron job which publishes custom metrics to AWS.

The returned number should grow as the next collection approaches: near zero just after a GC, increasing monotonically as new objects are allocated on the heap, and reaching 100 when it is time to run the GC.
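A minimal sketch of how such a metric could be derived from data the proxy already tracks (the function name and the shedThreshold parameter are illustrative, not existing code):

package gci

// heapUsagePercent maps the current heap allocation onto a 0-100
// scale: near zero right after a collection, 100 once the allocation
// reaches the shedding threshold and it is time to run the GC.
func heapUsagePercent(heapAlloc, shedThreshold uint64) int {
	if shedThreshold == 0 {
		return 0
	}
	p := heapAlloc * 100 / shedThreshold
	if p > 100 {
		p = 100
	}
	return int(p)
}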

Flag for debugging

This issue serves to remind us that GCI-Proxy could use a flag enabling a debug mode, which would log useful information.

Implement feedback for when a GC happens

There are quite a few runtimes in which developers cannot disable the GC. Even though the proxy is conservative when setting the heap consumption limits, that still leaves room for mistakes; in those cases, the upper limit must decrease.

That requires a mechanism which tells the proxy that one or more spurious GCs have happened. Instead of checking every answer, we could add another field to the memory-check message, returning the number of spurious GCs that happened since the last check.
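One possible shape for that extension, assuming the ch response gains a third |-separated field carrying the spurious-GC count (the format and function name are hypothetical):

package gci

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCheckHeap parses a hypothetical extended ch response of the
// form "young|tenured|spuriousGCs", e.g. "134217728|67108864|2".
// A proxy seeing spurious > 0 would lower its upper limit.
func parseCheckHeap(body string) (young, tenured uint64, spurious int, err error) {
	parts := strings.Split(strings.TrimSpace(body), "|")
	if len(parts) != 3 {
		return 0, 0, 0, fmt.Errorf("want 3 fields, got %d", len(parts))
	}
	if young, err = strconv.ParseUint(parts[0], 10, 64); err != nil {
		return
	}
	if tenured, err = strconv.ParseUint(parts[1], 10, 64); err != nil {
		return
	}
	spurious, err = strconv.Atoi(parts[2])
	return
}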

Server not up yet, but requests arrive

For the experiments conducted by @dfquaresma ... we need the proxy to handle being up before the target server. The ideal behavior is to return bad gateway, and this must not affect the unavailability calculations.
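With Go's standard library, that behavior can be expressed through a ReverseProxy error handler; the sketch below is illustrative, not necessarily how gci-proxy is wired internally:

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	target, err := url.Parse("http://localhost:8080")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(target)
	// If the target server is not up yet, answer 502 Bad Gateway
	// instead of surfacing the dial error as an internal failure,
	// so availability accounting can ignore these responses.
	proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, err error) {
		http.Error(w, "bad gateway", http.StatusBadGateway)
	}
	log.Fatal(http.ListenAndServe(":3000", proxy))
}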

Adjust knobs automatically based on the throughput

gci-proxy has two very important knobs, each involving a trade-off:

  • maxFraction: defines the percentage of genSize used as the upper bound of the shedding threshold. On the one hand, if it is too big, the GC might be triggered spuriously in some languages (e.g. Java and Node.js); on the other hand, if it is too small, the GC would be triggered too often. In our experience, getting as close as 80% (the upper bound) is safe for G1GC and the Node.js GC.

  • maxSampleSize: determines the maximum number of requests processed before the next heap check. If it is too big, the memory consumed between checks might be enough to trigger the GC or make the process run out of memory. If it is too small, the checks might incur too much overhead. So far, gci-proxy does not target very high throughput instances, so one check every 1024 requests seems good enough.

We have just noticed that these two knobs are inversely correlated. If we decrease maxSampleSize (increasing the frequency of heap checks), it is safe to push maxFraction closer to 80%. The converse is also true: if maxSampleSize increases (decreasing the frequency of heap checks), the risk of hitting a peak and consuming too much memory increases, so it is better to decrease maxFraction first.

As overhead drives the choice of both knobs, let's build an algorithm that updates them around it, roughly as sketched below.
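A rough starting point for that algorithm, with illustrative names and constants (nothing here exists in the codebase yet):

package gci

// adjustKnobs nudges the two knobs in opposite directions: frequent
// heap checks (a small sample) allow a threshold close to the 80%
// upper bound, while infrequent checks require a safety margin.
func adjustKnobs(maxFraction float64, maxSampleSize int, overheadTooHigh bool) (float64, int) {
	if overheadTooHigh {
		maxSampleSize *= 2  // check less often...
		maxFraction -= 0.05 // ...but shed earlier to stay safe
	} else {
		if maxSampleSize > 1 {
			maxSampleSize /= 2 // checks are cheap: check more often...
		}
		maxFraction += 0.05 // ...and push toward the 80% bound
	}
	switch {
	case maxFraction > 0.8:
		maxFraction = 0.8
	case maxFraction < 0.5:
		maxFraction = 0.5
	}
	return maxFraction, maxSampleSize
}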
