launchdarkly / ld-relay

License: Other


LaunchDarkly Relay Proxy

About the LaunchDarkly Relay Proxy

The LaunchDarkly Relay Proxy establishes a connection to the LaunchDarkly streaming API, then proxies that stream connection to multiple clients. It lets a number of servers connect to a local stream instead of making a large number of outbound connections to stream.launchdarkly.com.

You can configure the Relay Proxy to proxy multiple environment streams, even across multiple projects. You can also use it as a local proxy that forwards events to events.launchdarkly.com. This can be useful if you are load balancing Relay Proxy instances behind a proxy that times out HTTP connections, such as Elastic Load Balancers.

To learn more, read The Relay Proxy.

When to use the LaunchDarkly Relay Proxy

To learn more about appropriate use cases for the Relay Proxy, read Determining whether to use the Relay Proxy.

Getting started

To learn more about setting up the Relay Proxy, read Implementing the Relay Proxy.

Capabilities

SDKs can connect to the Relay Proxy in one of two modes: proxy mode or daemon mode. To learn more, read Configuring an SDK to use different modes.

Here are the differences between the two modes:

  • In proxy mode, the Relay Proxy simulates the LaunchDarkly service endpoints that LaunchDarkly SDKs use. The SDKs can connect to the Relay Proxy as if it were LaunchDarkly. To learn about proxy mode configuration, read Proxy mode.
  • In daemon mode, the Relay Proxy puts feature flag data into a database and the SDKs use that database instead of making HTTP requests. To learn about daemon mode configuration, read Daemon mode.
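To make the proxy-mode idea concrete, here is a small, hypothetical helper (not part of ld-relay or any SDK) that derives the service base URIs an SDK would use. In proxy mode, streaming, polling, and event delivery all point at the single Relay Proxy address instead of the separate LaunchDarkly hosts:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// relayEndpoints illustrates proxy mode: every service URI an SDK would
// normally resolve to a *.launchdarkly.com host is instead pointed at the
// one Relay Proxy base URI. This type is purely illustrative.
type relayEndpoints struct {
	Streaming, Polling, Events string
}

// endpointsFor validates a relay base URI and returns the three service
// base URIs an SDK would be configured with in proxy mode.
func endpointsFor(relayBase string) (relayEndpoints, error) {
	u, err := url.Parse(strings.TrimRight(relayBase, "/"))
	if err != nil || u.Scheme == "" || u.Host == "" {
		return relayEndpoints{}, fmt.Errorf("invalid relay base URI %q", relayBase)
	}
	base := u.String()
	// In proxy mode the relay simulates all three LaunchDarkly service
	// endpoints, so they share one base URI.
	return relayEndpoints{Streaming: base, Polling: base, Events: base}, nil
}

func main() {
	eps, err := endpointsFor("http://localhost:8030/")
	if err != nil {
		panic(err)
	}
	fmt.Println(eps.Streaming, eps.Polling, eps.Events)
}
```

In daemon mode, by contrast, the SDKs would not use any of these URIs for flag data; they would read directly from the shared persistent store.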

If you provide a mobile key and/or a client-side environment ID in the configuration for an environment, the Relay Proxy can also accept connections from mobile clients and/or JavaScript clients. To learn more, read Client-side and mobile connections.

If you enable event forwarding in the configuration, the Relay Proxy accepts analytics events from SDKs and forwards them to LaunchDarkly. To learn more, read Event forwarding.

There are some special considerations if you use the PHP SDK. To learn more, read Using PHP.

Enterprise capabilities

LaunchDarkly offers additional Relay Proxy features to customers on Enterprise plans: automatic configuration and offline mode.

Automatic configuration

Automatic configuration detects when you create and update environments, removing the need for most manual configuration file changes and application restarts. Instead, you can use a simple in-app UI to manage your Relay Proxy configuration. To learn more, read Automatic configuration.

Offline mode

You can run the Relay Proxy in online mode or offline mode. When running in offline mode, the Relay Proxy gets flag and segment values from an archive on your filesystem, instead of contacting LaunchDarkly's servers.

To run the Relay Proxy in offline mode, your SDKs must be configured for proxy mode. To learn more, read Offline mode.

If you want access to these features but don’t have a LaunchDarkly Enterprise plan, contact our sales team.

Specifying a configuration

There are many configuration options, which can be specified in a file, in environment variables, or both. To learn more, read Configuration.
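As a minimal sketch (key names follow the documented file format, but verify them against the Configuration reference for your Relay Proxy version), the same settings can be given either in a file or as environment variables:

```
; ld-relay.conf -- illustrative sketch, not a complete configuration
[Main]
port = 8030

[Environment "production"]
sdkKey = "sdk-xxxxxxxx"       ; server-side SDK key (placeholder)
mobileKey = "mob-xxxxxxxx"    ; optional: enables mobile client connections
envId = "5cxxxxxxxxxx"        ; optional: enables JavaScript client connections
```

The environment-variable equivalents would be along the lines of `PORT=8030` and `LD_ENV_production=sdk-xxxxxxxx` (the `LD_ENV_*` pattern also appears in the issue reports later on this page).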

Deployment options

There are several ways to deploy the Relay Proxy. They are described, in order from most to least common use, in Deploying the Relay Proxy.

Command-line arguments

Argument               Description
--config FILEPATH      configuration file location
--allow-missing-file   if specified, a --config option for a nonexistent file will be ignored
--from-env             if specified, configuration will be read from environment variables
--version              if specified, print relay's version and stop execution

If none of these are specified, the default is --config /etc/ld-relay.conf.
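The default-and-override behavior described above can be sketched with the standard flag package. This is a hypothetical re-implementation for illustration only, not the relay's actual argument-parsing code, and it covers just two of the four flags:

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseArgs mirrors the documented behavior: --config defaults to
// /etc/ld-relay.conf, and --from-env switches configuration to
// environment variables. Illustrative sketch only.
func parseArgs(args []string) (configPath string, fromEnv bool, err error) {
	fs := flag.NewFlagSet("ld-relay", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // suppress usage output in this sketch
	fs.StringVar(&configPath, "config", "/etc/ld-relay.conf", "configuration file location")
	fs.BoolVar(&fromEnv, "from-env", false, "read configuration from environment variables")
	err = fs.Parse(args)
	return
}

func main() {
	// With no arguments, the documented default config path applies.
	path, fromEnv, _ := parseArgs(nil)
	fmt.Println(path, fromEnv)
}
```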

Persistent storage

You can configure Relay Proxy nodes to persist feature flag settings in Redis, DynamoDB, or Consul. You must use persistent storage to run your SDKs in daemon mode. To learn more, read Using a persistent store.

You can also configure the Relay Proxy to persist segment information for Big Segments in Redis or DynamoDB. To learn more, read Configuring the Relay Proxy for segments.

Segments let you target groups of contexts that encounter feature flags. Big Segments are segments with more than 15,000 targets, or that are synced from external tools. You must use either the Relay Proxy or a persistent store integration if you use server-side SDKs and Big Segments. If supporting segments is your only use case, we recommend using a persistent store integration rather than the Relay Proxy.

For persistent storage configuration details, read Persistent Storage.
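For illustration, Redis persistence might be enabled with environment variables like the following (the same variable names appear in the issue reports later on this page; consult the Persistent Storage documentation for the authoritative list, and treat the hostname as a placeholder):

```
USE_REDIS=1
REDIS_HOST=redis.internal   # hypothetical hostname
REDIS_PORT=6379
CACHE_TTL=30s               # how long SDK data is cached locally
```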

Exporting metrics and traces

The Relay Proxy may be configured to export statistics and route traces to Datadog, Stackdriver, and Prometheus. To learn more, read Metrics integrations.

Logging

To learn about Relay Proxy logging, read Logging.

Service endpoints

The Relay Proxy defines many HTTP/HTTPS endpoints. Most of these are proxies for LaunchDarkly services, to be used by SDKs that connect to the Relay Proxy. Others are specific to the Relay Proxy, such as for monitoring its status.

To learn more, read Service endpoints.

Performance, scaling, and operations

We have done extensive load tests on the Relay Proxy in AWS/EC2. We have also collected a substantial amount of data based on real-world customer use. Based on our experience, we have several recommendations on how to best deploy, operate, and scale the Relay Proxy:

  • Networking performance is the most important consideration. Memory and CPU are not as critical. Deploy the Relay Proxy on boxes with good networking performance. On EC2, we recommend using an instance with Moderate to High networking performance such as m4.xlarge. On an m4.xlarge instance, a single Relay Proxy node can easily manage 20,000 concurrent connections.

  • If you use an Elastic Load Balancer in front of the Relay Proxy, you may need to pre-warm the load balancer whenever connections to the Relay Proxy cycle. This might happen when you deploy a large number of new servers that connect to the Relay Proxy, or upgrade the Relay Proxy itself.

To learn more, read Testing Relay Proxy performance.

Contributing

We encourage pull requests and other contributions from the community. For instructions on how to contribute to this project, read our contributing guidelines.

About LaunchDarkly

  • LaunchDarkly is a continuous delivery platform that provides feature flags as a service and allows developers to iterate quickly and safely. We allow you to easily flag your features and manage them from the LaunchDarkly dashboard. With LaunchDarkly, you can:
    • Roll out a new feature to a subset of your users (like a group of users who opt-in to a beta tester group), gathering feedback and bug reports from real-world use cases.
    • Gradually roll out a feature to an increasing percentage of users, and track the effect that the feature has on key metrics (for instance, how likely is a user to complete a purchase if they have feature A versus feature B?).
    • Turn off a feature that you realize is causing performance problems in production, without needing to re-deploy, or even restart the application with a changed configuration file.
    • Grant access to certain features based on user attributes, like payment plan (e.g., users on the ‘gold’ plan get access to more features than users on the ‘silver’ plan). Disable parts of your application to facilitate maintenance, without taking everything offline.
  • LaunchDarkly provides feature flag SDKs for a wide variety of languages and technologies. For a complete list, read our documentation.
  • Explore LaunchDarkly

ld-relay's People

Contributors

arun251, ashanbrown, atrakh, brooswit, bwoskow-ld, cwaldren-ld, dependabot[bot], drichelson, eli-darkly, github-actions[bot], jkodumal, joshuaeilers, keelerm84, kparkinson-ld, launchdarklyci, launchdarklyreleasebot, ld-repository-standards[bot], louis-launchdarkly, lukasmrtvy, mightyguava, pbzona, samhaldane, sarahlessner, sdif, sgandon, simonkotwicz


ld-relay's Issues

Flags Not Evaluating Real Time

When I set up a client-side flag using the default https://clientstream.launchdarkly.com, flags update in real time when I change the flag value.

When I proxy that through the ld-relay, I have to refresh the page to get the updated flag.

Is this an expected result? Is it possible to get the event stream working via the ld-relay?

Expected Resource Requirements (CPU & Memory) for use in Kubernetes

When running in a container orchestrator such as Kubernetes, it is required to list explicit resource requirements for vCPU and Memory limits.

Can you provide guidance on what the resource requirements are? And do these increase linearly with request volume?

Thanks!

Help with running ld-relay as part of an application

Describe the bug
Could you please tell me what I'm doing wrong? I'm trying to embed ld-relay within an application.

To reproduce

package main

// Note: the import paths below are assumed for ld-relay v6 and may need
// adjusting for your module versions.
import (
	"fmt"
	"log"
	"net/http"

	"github.com/gorilla/mux"
	configtypes "github.com/launchdarkly/go-configtypes"
	"github.com/launchdarkly/ld-relay/v6/config"
	"github.com/launchdarkly/ld-relay/v6/relay"
	"gopkg.in/launchdarkly/go-sdk-common.v2/ldlog"
)

func main() {
	r, err := relay.NewRelay(createRelayConfig(), ldlog.NewDefaultLoggers(), nil)
	if err != nil {
		log.Fatal(fmt.Sprintf("Error creating relay: %s", err))
	}

	router := mux.NewRouter()
	// router.HandleFunc("/test", handleFunc)
	router.PathPrefix("/relay").Handler(r)

	srv := &http.Server{
		Handler: router,
		Addr:    "127.0.0.1:8030",
	}
	srv.ListenAndServe()
}

func createRelayConfig() config.Config {
	var cfg config.Config
	cfg.Main.Port, _ = configtypes.NewOptIntGreaterThanZero(8030)
	cfg.Environment = map[string]*config.EnvConfig{
		"test": {
			SDKKey: config.SDKKey("-redacted-"),
		},
	}
	return cfg
}

Expected behavior
Expect curl http://localhost:8030/relay/status not to return a 404

Attempts to collect Datadog traces when DATADOG_TRACE_ADDR is not provided

Describe the bug
The documentation states that the env var DATADOG_TRACE_ADDR is the

URI of the Datadog trace agent. If not provided, traces will not be collected. Example: localhost:8126

We've found that even if we do not set DATADOG_TRACE_ADDR, the relay still attempts to post traces. In our service, we see many error logs like:
Jun 4 02:08:29: 2021/06/04 02:08:29 Datadog Exporter error: Post "http://localhost:8126/v0.4/traces": dial tcp 127.0.0.1:8126: connect: connection refused.

Below, I do a quick dive into the code that supports this hypothesis.

To reproduce
Don't populate DATADOG_TRACE_ADDR and you should see that we still attempt to post traces.

Expected behavior
We expect ld-relay to not try to emit Datadog traces if DATADOG_TRACE_ADDR is not populated so that it matches the documentation.

SDK version
ld-relay:6.1.0

Additional context
I did a cursory dive into the code which reinforces my belief that this bug exists.
I believe this is where we transform env vars into the MetricsConfig:

reader.ReadStruct(&c.MetricsConfig.Datadog, false)
if c.MetricsConfig.Datadog.Enabled {
	for tagName, tagVal := range reader.FindPrefixedValues("DATADOG_TAG_") {
		c.MetricsConfig.Datadog.Tag = append(c.MetricsConfig.Datadog.Tag, tagName+":"+tagVal)
	}
	sort.Strings(c.MetricsConfig.Datadog.Tag) // for test determinacy
}

We call datadog.NewExporter with datadog.Options that will have the traceAddr - in our use case, traceAddr is empty string.

	TraceAddr: mc.Datadog.TraceAddr,
	StatsAddr: mc.Datadog.StatsAddr,
	Tags:      mc.Datadog.Tag,
}
exporter, err := datadog.NewExporter(options)

Crossing over into opencensus-go-exporter-datadog, this is the signature of NewExporter. We call newTraceExporter. https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/datadog.go#L102-L111

newTraceExporter is defined here. We end up making this call with traceAddr: newTransport(o.TraceAddr).upload - https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/trace.go#L58

newTransport is defined here. The first four lines show that we will default the address to 8126 if traceAddr is empty string:

func newTransport(addr string) *transport {
	if addr == "" {
		addr = defaultTraceAddr
	}

https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/transport.go#L37-L40

The transport's upload method then tries to make HTTP POST requests to the addr. In our case, this is the default address they provide, since addr was empty string. https://github.com/DataDog/opencensus-go-exporter-datadog/blob/9baf37265e837038fae9fc59949a3810bce1417f/transport.go#L74

ld-relay does not know about unintentional data changes in the Redis backend

We are currently using Azure Redis in the Basic tier (no data persistence) for DEV purposes.
Data stored in Redis without persistence (persistence is an enterprise feature of the higher tiers) can be lost when the underlying infrastructure changes:

Redis Data Persistence: The Premium tier allows you to persist the cache data in an Azure Storage account. In a Basic/Standard cache, all the data is stored only in memory. If there are underlying infrastructure issues there can be potential data loss. We recommend using the Redis data persistence feature in the Premium tier to increase resiliency against data loss. Azure Cache for Redis offers RDB and AOF (coming soon) options in Redis persistence. For more information, see How to configure persistence for a Premium Azure Cache for Redis.

As a result, the feature flags are lost and ld-relay knows nothing about the change, which leads to the errors described in #51.

Of course a restart helps and the feature flags are reloaded, but this is just a quick fix.
Is it possible to have ld-relay periodically verify the data in Redis? Or should I consider the Premium tier of Redis (with persistent data) the better approach? (It is about 10x more expensive than the Basic tier.)

Thanks

LD Relay Proxy failing to connect to DynamoDB

I have set up the LD Relay Proxy with DynamoDB persistent storage in AWS, using ECS and an Application Load Balancer. I see the following error messages when the Relay initializes:

2021/02/02 23:43:37.182131 ERROR: [env: ...0b42] Data store returned error: failed to get existing items prior to Init: NoCredentialProviders: no valid providers in chain. Deprecated.
2021/02/02 23:43:37.182160 WARN: [env: ...0b42] Detected persistent store unavailability; updates will be cached until it recovers
2021/02/02 23:43:37.182192 WARN: [env: ...0b42] Unexpected data store error when trying to store an update received from the data source: failed to get existing items prior to Init: NoCredentialProviders: no valid providers in chain. Deprecated.

I have given ECS permissions to read from and write to DynamoDB. Am I missing some settings? Please help.

Add support for additional endpoints

Currently some endpoints return an HTTP 404 from the ld-relay host. This is problematic if running ld-relay in relay proxy mode. Switching to daemon mode can alleviate this in some cases, but adds a dependency on Redis to application code. The alternative is adding logic to the application to make some requests through the relay and others directly to the LaunchDarkly service.

Example request:

$config = [
    'feature_requester' => new LaunchDarkly\Impl\Integrations\GuzzleFeatureRequester(
        'http://ld-relay.example.com:8030',
        $apiKey,
        [] // configured as needed
    )
];
$ldClient = new \LaunchDarkly\LDClient($apiKey, $config);

$ldUser = (new \LaunchDarkly\LDUserBuilder($userId))->build(); // configured as needed
$ldClient->allFlagsState($ldUser);

This adds a couple of entries to the logs and comes back with no flags.

[2019-08-01 09:05:48] LaunchDarkly.ERROR: Received error 404 for GuzzleFeatureRequester::getAll - giving up permanently [] []
[2019-08-01 09:05:48] LaunchDarkly.ERROR: Due to an unrecoverable HTTP error, no further HTTP requests will be made during lifetime of LDClient [] []

Looking in the PHP SDK code to see what URL is being requested shows /sdk/flags. Making that request via curl directly to the ld-relay host confirms what the logging stated before, a missing endpoint.

curl -X GET -H "Authorization: sdk-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" http://ld-relay.example.net:8030/sdk/flags
404 page not found

The same request on the LaunchDarkly app host app.launchdarkly.com returns a JSON blob of current flag data, as expected. Ideally, the request would be answered with the state already stored in ld-relay, and transparently proxied through to LaunchDarkly in the event the flag state isn't currently in memory.

Relay can't handle Moved error from Clustered Redis.

Using clustered Redis on AWS. I get the following error.

[LaunchDarkly Relay (SdkKey ending with 33f4b)] 2018/10/22 16:35:00 Starting LaunchDarkly streaming connection
[LaunchDarkly Relay (SdkKey ending with 33f4b)] 2018/10/22 16:35:00 Connecting to LaunchDarkly stream using URL: https://stream.launchdarkly.com/all
[LaunchDarkly Relay (SdkKey ending with 33f4b)] 2018/10/22 16:35:00 Error initializing store: MOVED 5868 10.100.150.235:6379
[LaunchDarkly Relay (SdkKey ending with 33f4b)] 2018/10/22 16:35:10 Timeout exceeded when initializing LaunchDarkly client.

I would expect the LaunchDarkly client to handle the Redis MOVED error and redirect to the appropriate node. Is no one else using clustered Redis? If not, is there a recommended Redis implementation?

Invalid SDK key when starting up relay host via Docker or from source

Hi,

This is the issue I encountered when setting up the LaunchDarkly relay host via Docker or from source. I get errors when following the steps provided, and get stuck on the same invalid SDK key error. I am running it on an Ubuntu instance.

(screenshot of the invalid SDK key error)

I am new to launchdarkly and would like to test the relay host. Thanks.

Not able to import this as a library

I can't even clone your repo and run make:

internal/events/event-relay.go:19:2: use of internal package gopkg.in/launchdarkly/ld-relay.v5/internal/util not allowed
internal/events/event_publisher.go:16:2: use of internal package gopkg.in/launchdarkly/ld-relay.v5/internal/version not allowed
internal/metrics/events_exporter.go:9:2: use of internal package gopkg.in/launchdarkly/ld-relay.v5/internal/events not allowed

Relay Proxy DynamoDB Access with AWS IRSA

Is your feature request related to a problem? Please describe.
Hi. We'd like to run the Relay Proxy in EKS with DynamoDB through AWS's IRSA (IAM roles for service accounts), which requires at minimum version 1.23.13 (https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html).

Describe the solution you'd like
We believe that upgrading the AWS SDK to a version with IRSA support would address this.

Describe alternatives you've considered

  • Pull request upgrading the AWS SDK version
  • Deploying the Relay Proxy with a sidecar pod that works with IRSA and mounts the role session credentials in a way the Relay Proxy's current SDK version can use

Support specifying Redis database

My read of the application as-is is that the Redis pool will always use the default logical database of 0. It may be useful in some cases to specify a different logical database. In our case, our Redis instance is overprovisioned and we'd like to share it across multiple services while isolating each service to its own database to keep things relatively clean.

I think we should be able to alter this line to call NewRedisFeatureStoreFromUrl instead, and add in a database number (the default would still be 0) specified from environment variables: https://github.com/launchdarkly/ld-relay/blob/5.1.0/relay.go#L224

I could work on throwing this together if it sounds viable.

Forwarding events can produce incorrect user counts

It's come to our attention that when ld-relay is configured to proxy analytics events, it always forwards the events to the same LaunchDarkly endpoint (the one that is normally used by server-side SDKs), even if the events came from a mobile or browser client. LaunchDarkly still accepts the events, but the problem is that your usage statistics are partly based on how the events are received, so if 50% of the users were coming from a server-side client and 50% from a mobile client, they will be incorrectly counted as 100% server-side.

This will be fixed in the next release of ld-relay.

Set path prefix for relay?

Is there an option to specify a path prefix that the provided docker image will be hosted at?

Our internal service hosting system uses "path-based routing" instead of subdomains, so services will be hosted at <domain_name>/s/<service_name>/<path>. So we need the ld-relay system to handle requests in the form of example.com/s/ld-relay/flags, for example.

More verbose logging

Is it possible to make the logging of ld-relay more verbose? I would like to see the requests from clients; it's possible that these requests (or responses) are malformed in my case.
Thanks

Currently running into:

HttpEventPublisher: 2018/12/18 10:11:24 Unexpected error while sending events: Post https://events.launchdarkly.com/bulk: EOF
HttpEventPublisher: 2018/12/18 10:11:24 Will retry posting events after 1 second
[LaunchDarkly Relay (SdkKey ending with dfcde)] 2018/12/18 10:16:05 ERROR: Error encountered processing stream: unexpected EOF
[LaunchDarkly Relay (SdkKey ending with dfcde)] 2018/12/18 10:16:05 Reconnecting in 3.0000 secs
HttpEventPublisher: 2018/12/18 10:26:24 Unexpected error while sending events: Post https://events.launchdarkly.com/bulk: EOF

It is not possible to debug what is wrong, especially since I have no access to the client's (Java) logs.

Standardize log format of relay proxy

The Relay Proxy emits different log formats. Is there any way to standardize the logs to one format, with timestamps at milli-, micro-, or nanosecond precision along with a time zone? Sorry, I'm coming from Java land and these log statements seem non-standard.

We are planning to push Relay Proxy logs to Elasticsearch through fluentd/fluentbit and having one standard helps the log aggregation layer.

Sample of current logs (notice the two different formats):


DEBUG: 2020/01/07 00:17:32 logging.go:143: Request: method=GET url=/all auth=*095f0 status=200 (streaming)

DEBUG: 2020/01/07 00:17:32 logging.go:143: Request: method=GET url=/all auth=*095f0 status=200 (streaming)

[env: test-dev] 2020/01/07 00:17:32 DEBUG: Application requested server-side /all stream

Precompiled executables

Do you have precompiled executables for this project available for download? I need to deploy this on some servers, and I'd love to just download the executable, rather than install the go tooling to perform a go get. Thanks!

Intermittent invalid sdk key error when restarting relay proxy

Describe the bug
If you use the relay proxy automatic configuration profile and restart the relay proxy while some clients are running, you sometimes get a 401 invalid SDK key error. It seems to be some kind of race condition where the relay proxy server doesn't properly initialize itself before it starts listening for connections.

To reproduce

  • Start the relay proxy using an automatic configuration profile (eg. docker run --rm -p 8030:8030 --name ld-relay -e AUTO_CONFIG_KEY="***" launchdarkly/ld-relay)
  • Start an ld client that talks to the relay proxy
  • Restart the relay proxy
  • The client sometimes gets an error ERROR: Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)

Relay version
launchdarkly/ld-relay@sha256:2a3bdc8d8adf774a1631d7f13bacac5a8e9e89cdb2ce84f662104a3bb4be11fa

Client code

package main

import (
	"time"

	ld "gopkg.in/launchdarkly/go-server-sdk.v5"
	"gopkg.in/launchdarkly/go-server-sdk.v5/ldcomponents"
)

func main() {
	var config ld.Config
	relayURI := "http://localhost:8030"
	config.DataSource = ldcomponents.StreamingDataSource().BaseURI(relayURI)
	ld.MakeCustomClient("***", config, 5*time.Second)
}

may be related to #112

Timeout errors to LaunchDarkly with Redis Sentinel enabled

Describe the bug
I am running an ld-relay service from the prebuilt ld-relay docker image. When I start up the service with my SDK key configured via env, the service works fine, but when I point it towards a Redis Sentinel host, I get the Timeout encountered waiting for LaunchDarkly client initialization warning in the docker logs.

I am able to set key values into Sentinel via the Predis PHP client. I am unsure why Redis Sentinel affects the connection to LaunchDarkly.

To reproduce

  1. Start the ld-relay docker image (v6) with environment variables defined for:
LD_ENV_*={SDK_KEY}
USE_REDIS=1
REDIS_TLS=true
REDIS_HOST={SENTINEL_HOST}
REDIS_PORT={SENTINEL_PORT}
CACHE_TTL=30s
LOG_LEVEL=debug
  2. Tail the docker logs

Expected behavior
LD-relay should successfully connect to sentinel and set the environment's feature flag keys.

Logs

2020/11/30 21:27:18.424679 INFO: Starting LaunchDarkly relay version 6.1.0 with configuration from environment variables
2020/11/30 21:27:18.424962 INFO: Using Redis feature store: rediss://REDACTED_HOST:REDACTED_PORT with prefix:
2020/11/30 21:27:18.426017 INFO: [env: ...1387] Starting LaunchDarkly client 5.0.2
2020/11/30 21:27:18.426028 INFO: RedisDataStore: Using URL: rediss://REDACTED_HOST:REDACTED_PORT
2020/11/30 21:27:18.426283 INFO: [env: ...1387] Starting LaunchDarkly streaming connection
2020/11/30 21:27:18.426296 INFO: [env: ...1387] Waiting up to 10000 milliseconds for LaunchDarkly client to start...
2020/11/30 21:27:18.426314 INFO: [env: ...1387] Connecting to LaunchDarkly stream
2020/11/30 21:27:18.427054 DEBUG: [env: ...1387] Sending diagnostic event: {"kind":"diagnostic-init","id":{"diagnosticId":"67c9d367-e916-409d-a7b8-ada0928a2fed","sdkKeySuffix":"**"},"creationDate":1606771638426,"sdk":{"name":"go-server-sdk","version":"5.0.2"},"configuration":{"startWaitMillis":10000,"connectTimeoutMillis":3000,"customBaseURI":false,"diagnosticRecordingIntervalMillis":900000,"usingRelayDaemon":false,"userKeysFlushIntervalMillis":300000,"customEventsURI":false,"socketTimeoutMillis":3000,"eventsCapacity":10000,"eventsFlushIntervalMillis":5000,"allAttributesPrivate":false,"usingProxy":false,"streamingDisabled":false,"customStreamURI":false,"reconnectTimeMillis":1000,"dataStoreType":"custom","inlineUsersInEvents":false,"userKeysCapacity":1000},"platform":{"name":"Go","goVersion":"go1.15.2","osName":"Linux","osArch":"amd64"}}
2020/11/30 21:27:18.427154 INFO: Starting server listening on port 8030
2020/11/30 21:27:19.154240 DEBUG: [env: ...1387] Received all feature flags
2020/11/30 21:27:28.465941 WARN: [env: ...1387] Timeout encountered waiting for LaunchDarkly client initialization
2020/11/30 21:27:28.469060 ERROR: Error initializing LaunchDarkly client for "REDACTED": timeout encountered waiting for LaunchDarkly client initialization


Relay version
https://hub.docker.com/layers/launchdarkly/ld-relay/v6/images/sha256-2a3bdc8d8adf774a1631d7f13bacac5a8e9e89cdb2ce84f662104a3bb4be11fa?context=explore

TLS option not effective

Describe the bug
Setting Tls: true on the redis options is not effective.

To reproduce
Set Tls: true for Redis to connect to a TLS enabled Redis instance.

Expected behavior
LaunchDarkly client initializes.

Logs

INFO: 2019/08/05 19:37:16 relay.go:435: Using Redis feature store: <REDACTED> with prefix: <REDACTED>
INFO: 2019/08/05 19:37:16 redis.go:317: RedisFeatureStore: Using url: redis://<REDACTED>
INFO: 2019/08/05 19:37:16 relay.go:364: Proxying events for
Starting LaunchDarkly streaming connection
Connecting to LaunchDarkly stream using URL: https://stream.launchdarkly.com/all
Timeout exceeded when initializing LaunchDarkly client.

SDK version
4.10.0

Additional context
By passing in the TLS DialOption, specifying host/port instead of URL, redigo will override the TLS dialoption. https://github.com/garyburd/redigo/blob/569eae59ada904ea8db7a225c2b47165af08735a/redis/conn.go#L288

There's an additional issue here: a failure to connect to Redis shows up in the logs only as a failure to connect to LaunchDarkly. There's no indication of a problem with Redis. This is probably more important than the bug itself.

SSL support

Hello,

Would it be possible to add the possibility to expose an HTTPS endpoint?

Thanks,
claudio

LD Relay doesn't repopulate Redis when Redis is restarted

Describe the bug
We use LD and LD relay for PHP apps where the recommended approach is to have LD Relay persist the data in Redis and PHP connect to Redis.

When the Redis instance is restarted, it loses data and the PHP apps will then return false for every evaluation.

After Redis was restarted, LD Relay did not repopulate the data in Redis, and all of our flags returned false for an hour.

A restart of LD relay fixed the issue.

To reproduce

  • Start LD Relay with persistence to Redis
  • Restart Redis
  • The data is lost in Redis and doesn't get repopulated

Expected behavior

  • Start LD Relay with persistence to Redis
  • Restart Redis
  • LD Relay detects the connection reset to Redis / restart and repopulates the data in Redis

Logs
No useful logs were found

SDK version

Language version, developer tools
PHP 7.4

OS/platform

Additional context
Our cache TTL for redis was set at 30s but we were missing flag definitions for almost an hour until we rebooted LD relay which fixed the issue.

Benchmarking/Performance based on the flag size/rules

The relay proxy docs have this:

The Relay Proxy's resource usage increases based on the size of your flags or segments, number of SDK connections, and events it proxies.

Is there currently any way to benchmark the ld-relay process (mostly to understand resource usage) by generating synthetic flag data?

Stop invalid environment keys from preventing relay start

Is your feature request related to a problem? Please describe.
It'd be nice if there was a configuration option to allow the relay to continue startup even when some of the environment SDK keys are invalid.

The situation we ran into at my company was that we were using the relay for multiple projects (about 10 right now, but we expect it to grow, maybe even to about 100 or more). One of the teams either removed an environment or reset their keys, which on the next restart of our relay caused:

ERROR: Received HTTP error 401 (invalid SDK key) for streaming connection - giving up permanently
ERROR: 2019/08/23 10:24:50 relay.go:555: Error initializing LaunchDarkly client for **********: LaunchDarkly client initialization failed

which then broke the relay for all other clients. This was in a non-prod environment, but it got us worried about deploys to prod. We're looking at adding a check in CI to make sure all the keys are valid, but this would only cover our deployments of the relay; it wouldn't help if the service was cycled for any reason on the instance.

It'd be great if there were a CLI option we could enable to have the relay start even with invalid keys, so that teams that need to reset or change SDK keys, or create or delete environments, can't break other apps that rely on the relay.

Describe the solution you'd like
Add a CLI option such as --allow-invalid-keys that would emit warnings when failing to connect to LaunchDarkly but would not stop other environments in the config whose keys do work. Exposing these warnings as metrics that can be sent to Datadog etc. would be good for monitoring too.
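For reference, a multi-environment file config looks roughly like this (section and option names from memory; check the Relay Proxy configuration docs for the exact spelling), and the hypothetical --allow-invalid-keys switch would apply across all listed environments:

```
[Environment "project-a-production"]
    SdkKey = "sdk-key-for-project-a"

[Environment "project-b-production"]
    SdkKey = "sdk-key-for-project-b"
```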

Certificate error when starting docker container

I see this error when trying to start the LD Docker container:

INFO: [main] Starting LaunchDarkly relay version 5.11.0 with configuration from environment variables
INFO: [env: dev] Starting LaunchDarkly client 4.16.2
INFO: [env: dev] Starting LaunchDarkly streaming connection
INFO: [env: dev] Waiting up to 10000 milliseconds for LaunchDarkly client to start...
INFO: [env: dev] Connecting to LaunchDarkly stream
INFO: [main] Starting server listening on port 8030
WARN: [env: dev] Unable to establish streaming connection: Get https://stream.launchdarkly.com/all: x509: certificate signed by unknown authority

Kindly advise.
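For anyone else hitting this: the error usually means the container has no trust root for the certificate chain it received, which is common behind a TLS-intercepting corporate proxy. A hedged sketch of extending the image with an internal root CA (file names and paths are assumptions, not part of the official image):

```dockerfile
# Illustrative only: append an internal root CA to the container's
# trust bundle so outbound TLS to stream.launchdarkly.com verifies.
FROM launchdarkly/ld-relay
COPY internal-root-ca.pem /tmp/internal-root-ca.pem
RUN cat /tmp/internal-root-ca.pem >> /etc/ssl/certs/ca-certificates.crt
```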

Flag Evaluation Route Returns All Flags

I recently installed ld-relay via Docker and noticed that /sdk/evalx/{clientId}/users/{user} returns all flags instead of only client-side flags.

Is this expected? When the client connects to https://app.launchdarkly.com, it only returns flags that have been marked as "Make this flag available to client-side SDKs", but when I access that route via ld-relay, it returns all flags.

Is it possible to only return client-side flags?

LaunchDarkly proxy status page displays Client-side ID in plain text

I have set up the LD Relay Proxy in an AWS environment using ECS and a load balancer. I have enabled persistent storage using DynamoDB and auto-configuration. Everything works well; however, when I open the status page (https:///status) in the browser, I notice that the client ID and the DynamoDB table name are displayed in plain text. Is there a way to stop the Relay Proxy from displaying sensitive info in plain text?

I have used ENV_DATASTORE_PREFIX=ld-flags-$CID for the auto config setting, which causes the client ID to display in plain text. If I don't use $CID, I get an error msg: 2021/02/03 22:47:04.503636 ERROR: Configuration error: when using auto-configuration with database storage, database prefix (or, if using DynamoDB, table name) must be specified and must contain "$CID"

Metrics not getting exposed for prometheus

When running ld-relay and Prometheus on Kubernetes, metrics are not exported by ld-relay.
LD Relay configuration used:

[prometheus]
enabled = true
port = 9100
prefix = ""

K8s conf for the ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ld-service-monitor
  labels:
    app: ld-relay
spec:
  selector:
    matchLabels:
      app: ld-relay
  namespaceSelector:
    matchNames:
      -
  endpoints:
    - port: metrics
      interval: 10s
      honorLabels: true

Prometheus is scraping the specified endpoint, but no data is being exported at localhost:9100/metrics by ld-relay.

Also, no error messages are present in ld-relay logs.
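One thing worth double-checking in a setup like this: the Prometheus Operator's ServiceMonitor selects Services (not pods), and endpoints[].port matches a named Service port. So a Service along these lines, with a port literally named metrics, has to exist (names here are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ld-relay
  labels:
    app: ld-relay        # matched by the ServiceMonitor's selector
spec:
  selector:
    app: ld-relay
  ports:
    - name: metrics      # must match the ServiceMonitor's `port: metrics`
      port: 9100
      targetPort: 9100
```

It's also worth confirming that the relay serves something at :9100/metrics from inside the pod (kubectl exec plus wget) before debugging the scrape config.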

/status endpoint - datastore status should be a top-level object

Is this a support request?

No.

Describe the bug

Currently the /status endpoint returns the health status of the persistent store inside an environment object. Thus, if I have two environments configured, the data store status is included in both environments. I don't think that's correct, since we can only configure one persistent store for the LaunchDarkly relay.

To reproduce

Documentation: https://github.com/launchdarkly/ld-relay/blob/v6/docs/persistent-storage.md

Expected behavior

{
  "environments": {
    "environment1": {
      "sdkKey": "sdk-********-****-****-****-*******99999",
      "envId": "999999999999999999999999",
      "mobileKey": "mob-********-****-****-****-*******99999",
      "status": "connected",
      "connectionStatus": {
        "state": "VALID",
        "stateSince": 10000000
      }
    },
    "environment2": {
      "sdkKey": "sdk-********-****-****-****-*******99999",
      "envId": "999999999999999999999999",
      "mobileKey": "mob-********-****-****-****-*******99999",
      "status": "connected",
      "connectionStatus": {
        "state": "INTERRUPTED",
        "stateSince": 12000000,
        "lastError": {
          "kind": "NETWORK_ERROR",
          "time": 12000000
        }
      }
    }
  },
  "dataStoreStatus": {
    "state": "VALID",
    "stateSince": 10000000,
    "database": "dynamodb",
    "dbTable": "env1"
  },
  "status": "healthy",
  "version": "5.11.1",
  "clientVersion": "4.17.2"
}

Additional context

Metrics related to persistent store operations

I currently don't see any metrics related to persistent store operations. For example:

  • an error counter (for init and upsert operations)
  • a latency metric (for init and upsert operations)

Perhaps others:

  • data store connection pool size

Is this something you folks have considered or have on your radar?

We are able to work on a PR as well if you are in agreement with the idea.

Healthcheck endpoint

Consider a setup where a relay is configured to use a persistent store and is being deployed along with the application as a side-car either in a VM or a pod.

At startup, the relay opens its communication channels with Redis and launchdarkly.com. There is quite a bit of initial bootstrapping that I assume happens in the relay during these first seconds at startup.

Does the relay have a healthcheck endpoint that another service can ping to ensure that the relay is ready to serve flag requests? That is, once all the initialisation has happened.

Uneven connection distribution

Describe the bug

We experienced an ld-relay outage today that we believe was due to very unevenly distributed connections across ld-relay pods; a chart of per-pod connection counts illustrated this quite clearly.


The spread is quite large, ranging from 257 connections down to 51, out of around 1800 total connections.

In addition to the uneven steady state distribution, you can see the large spike in connections during a rolling restart. As pods terminate, their connections are moved to the remaining live pods, which keep them and do not redistribute them to other pods.

To reproduce
Run 10+ instances of ld-relay with many inbound connections.

Expected behavior

Connections to be evenly distributed.

We've experienced similar behaviour with long-running HTTP/2 connections, and our solution was to actively terminate the connections after some time period (e.g. 500ms, 1s). That was for a very high-QPS service, though, so it might defeat the purpose of LD using SSE.

Relay version

	gopkg.in/launchdarkly/ld-relay.v5 v5.0.0-20200122220444-c99cac201df1

Language version, developer tools
go1.13.5

Constant 503 errors from the proxy

Describe the bug
My app uses the Ruby gem to connect to LD. When working with the LD service directly, everything works as expected. When I add the proxy in the middle (running it in a Docker container), the proxy returns 503 errors with the message "Event proxy is not enabled for this environment".
Is there anything I'm missing in the config?

To reproduce
Run a Ruby application configured to work with the proxy's URLs, using the SDK key for one of your environments.
Run the Docker container with
--from-env LD_ENV_your_env=sdkkey

Expected behavior
No errors should be returned by the process

Relay version
5.12.0

Language version, developer tools
Ruby 2.4.1

OS/platform
OSX Mojave, using RubyMine

Please use Go compatible version tags.

Go version tags must have a v prefix. The last valid tagged version that adheres to this is v6.0.3. Tagging versions correctly will help us track ld-relay versions.

Exposing more metrics

I understand that the relay proxy exposes these metrics as documented here:

  • connections
  • newconnections
  • requests

We do have the above metrics reported. But if my reading of the Relay Proxy monitoring docs is correct, then we should be able to monitor CPU and memory utilization too. So my query is twofold:

  1. Can you help expose memory and CPU utilization statistics?
  2. Where can I read more about the above three metrics? We are in a demo/trial phase, so a more in-depth understanding would help.

Allow using Environment ID instead of SDK key in Authorization header for server routes

Is your feature request related to a problem? Please describe.
At Appian we plan to integrate LaunchDarkly into our product using the relay proxy. Customers of Appian have their own stack, so we have several thousand servers running for our customers. We plan to connect all of them to shared instances of relay proxy, and we are using the Java server SDK exclusively.

Our issue arises when we want to rotate our server SDK keys. If we want to rotate the key in our proxy we would also need to update the key in the Java SDK. Since the Java SDK is a dependency inside of our web servers, in order to update our key in the client it would require a web server restart in the running instance of Appian. We cannot restart a production Appian web server without our customer’s consent, since our customer’s application would experience downtime. This means that rotating an SDK key would require a significant amount of work.

Describe the solution you'd like
Looking at the source code for the relay proxy, it appears that the SDK key the Java server SDK provides is only used to identify which environment should be used. Instead of using the SDK key, we would like to be able to identify the environment through a different key, such as the environment ID. This appears to be the existing functionality for the client-side JavaScript SDK.

If we are able to change this behavior then we could rotate the keys for our proxy server without needing the web server restart from our customer’s application, and without having to manage additional secrets (SDK key) inside our production web servers.

Describe alternatives you've considered
We considered using daemon mode, but it looks like we'd be giving up some functionality compared to proxy mode. Alternatively, if the proxy is only configured with a single environment, perhaps no key should be needed at all.

Support for nginx X-Accel headers

Is your feature request related to a problem? Please describe.

It would be extremely beneficial for those of us using nginx in front of ld-relay to have the relay set the X-Accel-Buffering header to inform nginx that the application server wants to stream responses.

Describe the solution you'd like

For ld-relay routes that do streaming, it should set the header X-Accel-Buffering: no

Describe alternatives you've considered

Currently, we are disabling proxy buffering entirely for ld-relay, which is not the most appropriate solution. We are considering mapping each ld-relay endpoint that supports streaming in nginx and configuring those routes for streaming. Because we are simply proxying from nginx to ld-relay, it would be very nice to have the application server inform us of the streaming intent.

Additional context

https://www.nginx.com/resources/wiki/start/topics/examples/x-accel/#x-accel-buffering

Using Heroku, fails when composing from Docker

Describe the bug
Deploying with the Heroku Container Registry fails to start.

To reproduce
Following Heroku instructions to deploy using the Docker container

$ docker tag launchdarkly/ld-relay registry.heroku.com/<app>/web
$ docker push registry.heroku.com/<app>/web

Expected behavior
It should start the docker container successfully and run the Relay Proxy, using the Environment Variables specified in the Heroku app's config.

Logs

2021-04-25T03:39:36.839799+00:00 heroku[web.1]: Starting process with command `/usr/bin/ldr --config /ldr/ld-relay.conf --allow-missing-file --from-env`
2021-04-25T03:39:39.636712+00:00 app[web.1]: Error: Exec format error
2021-04-25T03:39:39.756062+00:00 heroku[web.1]: Process exited with status 126
2021-04-25T03:39:39.861642+00:00 heroku[web.1]: State changed from starting to crashed

SDK version
Using latest, Relay Proxy 6.1.6

Language version, developer tools
N/A

OS/platform
Heroku

Additional context
If we can get this to work (seems like a standard file permissions issue), it'd be a really easy way to spin up Relay Proxy.

DynamoDB config support for local

Please provide configuration support for local DynamoDB. We want to point the relay proxy at a local DynamoDB instance in local/test environments.

We need to pass in 4 dummy configurations

  1. AWS access_key
  2. AWS secret_key
  3. AWS region
  4. DynamoDB endpoint URL

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBLocal.html

A sample Java DynamoDB client:

AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard()
    .withEndpointConfiguration(
        new AwsClientBuilder.EndpointConfiguration("http://localhost:8000", "us-west-2"))
    .build();

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/CodeSamples.Java.html

Include additional configuration information about feature stores in the status endpoint

Is your feature request related to a problem? Please describe.
In our setup, we're using the proxy and SDK in daemon mode with Redis as the shared feature store. When initially setting it up, we noticed that flag values for one of our environments were copied into the wrong prefix store. While troubleshooting, we were unable to verify which Redis host + prefix was mapped to which environment. In the end, we just redeployed the service in our Docker environment and the keys in Redis were synced with the correct environments. We're still not entirely sure whether this was a deployment with incorrect information or something else.

Describe the solution you'd like
Having the non-secret bits of the feature store information available in the status endpoint would make this easier to troubleshoot. Specifically, per environment configured:

  • What feature store is configured
  • Whatever non-secret connection information is available for that feature store: e.g. host, prefix, and TTL for Redis; table name and TTL for DynamoDB; etc.

Perhaps including this same information in the flag store itself would also be useful for verification: some kind of top-level "storeinfo" object holding this same information.
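Concretely, the per-environment addition to /status might look like this (field names are purely a suggestion):

```json
"dataStoreInfo": {
  "type": "redis",
  "host": "redis.internal:6379",
  "prefix": "env-a-flags",
  "cacheTTL": "30s"
}
```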

This information is in the logs on startup, but in our environment, log retention is only a few days and we've had this instance up for a while for testing.

no errors in ld-relay output when redis is down

I have Redis stopped, but I don't see any errors in stdout/stderr:

INFO: 2017/08/28 20:51:50 ld-relay.go:82: Starting LaunchDarkly relay version DEV with configuration file /etc/ld-relay.conf
INFO: 2017/08/28 20:51:50 ld-relay.go:248: Listening on port 8030
INFO: 2017/08/28 20:51:50 ld-relay.go:125: Using Redis Feature Store: localhost:6379 with prefix: 
INFO: 2017/08/28 20:51:50 redis.go:63: RedisFeatureStore: Using url: redis://localhost:6379
INFO: 2017/08/28 20:51:50 redis.go:81: RedisFeatureStore: Using prefix: launchdarkly 
INFO: 2017/08/28 20:51:50 redis.go:84: RedisFeatureStore: Using local cache with timeout: 30s
[LaunchDarkly]2017/08/28 20:51:50 Starting LaunchDarkly streaming connection
[LaunchDarkly]2017/08/28 20:51:50 Connecting to LaunchDarkly stream using URL: https://stream.launchdarkly.com/flags
[LaunchDarkly]2017/08/28 20:51:50 Started LaunchDarkly streaming client
[LaunchDarkly]2017/08/28 20:51:50 Successfully initialized LaunchDarkly client!
INFO: 2017/08/28 20:51:50 ld-relay.go:151: Initialized LaunchDarkly client for stage
INFO: 2017/08/28 20:51:50 ld-relay.go:157: Proxying events for environment stage

ld-relay does not support rotating automatic configuration key

Is your feature request related to a problem? Please describe.

ld-relay does not provide a way to load a new configuration key when rotating secrets. Currently the only way to do this is to restart the ld-relay process.

Describe the solution you'd like

Ideally https://github.com/launchdarkly/ld-relay/blob/v6/relay/relay.go would provide a method for reloading config.

When running ld-relay as a Docker container, the relay could also watch the configuration key file for changes and reload the key when it changes.

Client invalid sdk key error using automatic config profile

Describe the bug
If you rotate an environment's SDK key and restart the relay proxy, clients started with the old SDK key get a 401 invalid SDK key error.

To reproduce

  • Start the relay proxy using an automatic configuration profile (e.g. docker run --rm -p 8030:8030 --name ld-relay -e AUTO_CONFIG_KEY="***" launchdarkly/ld-relay)
  • Start an ld client that talks to the relay proxy
  • Rotate an environment's sdk key so that there are two valid sdk keys (new and old)
  • At this point, the relay proxy is notified of the change in the sdk key and everything still works
  • Restart the relay proxy (you should see a log message like Old SDK key ending in *** for environment *** (*** ***) will expire at *** when the relay proxy starts)
  • The client gets an error ERROR: Error in stream connection (giving up permanently): HTTP error 401 (invalid SDK key)

Relay version
launchdarkly/ld-relay@sha256:2a3bdc8d8adf774a1631d7f13bacac5a8e9e89cdb2ce84f662104a3bb4be11fa
