palantir / witchcraft-go-server

A highly opinionated Go embedded application server for RESTy APIs

License: Apache License 2.0

witchcraft-go-server

witchcraft-go-server is a Go implementation of a Witchcraft server. It provides a way to quickly and easily create servers that work in the Witchcraft ecosystem.

Implementation

Configuration

A witchcraft server is provided with install configuration and runtime configuration. Install configuration specifies configuration values that are static -- they are read in once at server startup and are known to never change. Runtime configuration values are considered refreshable. When file-based configuration is used, whenever the runtime configuration file is updated its contents are loaded and the corresponding values are refreshed. See the section on refreshable configuration for more information on this.

The default configuration uses file-based configuration with the install configuration at var/conf/install.yml and runtime configuration at var/conf/runtime.yml. It is possible to use code to specify different sources of configuration (for example, in-memory providers).

witchcraft-server also supports using encrypted-config-value to automatically decrypt encrypted configuration values. The default configuration expects a key file to be at var/conf/encrypted-config-value.key. It is possible to use code to specify a different source for the key (or to specify that no key should be used). If the configuration does not contain encrypted values, any specified ECV key will not be read. If the install configuration contains encrypted values but the encryption key is missing or malformed, the server will fail to start. If the runtime config contains encrypted values but fails to decrypt them, a warning will be logged and the encrypted values passed to the server.

witchcraft-server defines base configuration for its install and runtime configuration. Servers that want to provide their own install and/or runtime configuration should embed the base configuration structs within the definition of their configuration structs.

Route registration

A witchcraft server is backed by a wrouter.Router and allows authors to register route handlers on the server. The router uses a specific format for path templates to specify path parameters and has rules around the kinds of paths that can be matched. witchcraft is opinionated about the path formats and does not support registering paths that cannot be expressed using its template rules. All witchcraft routes are configured to emit request logs and trace logs and update metrics for the requests using built-in middleware. The context.Context for the http.Request provided to the handlers is configured with all of the standard loggers (service logger, event logger, trace logger, etc.).

When registering routes on the router, it is also possible to specify path/header/query param keys that should be considered "safe" or "forbidden" when used as parameters in logging. These are combined with the default set of safe and forbidden header parameters defined by the req2log package in witchcraft-go-logging.

Liveness, readiness, and health

witchcraft-server registers the endpoints /status/liveness, /status/readiness and /status/health to report the server's liveness, readiness and health. By default, these endpoints use a built-in provider that reports liveness, readiness and health based on the state of the server. It is possible to configure the liveness and readiness providers in code, and health status providers can also be added via code (health supports specifying multiple sources to report health, and the server's built-in health status provider will always be one of them).

The default behavior serves both the user-registered endpoints and the status endpoints from the same server. However, if a "management port" is specified in the server's install configuration and its value differs from the "port" value in configuration, then witchcraft-server starts a second management server on the specified port and serves the status endpoints on that port. This can be useful in scenarios where all of the traffic to the main endpoints requires client certificates for TLS but the status endpoints need to be served without requiring client TLS certificates.

Debug & Diagnostic Routes

Witchcraft servers register a route on the management server at /debug/diagnostic/{diagnosticType}, where diagnosticType represents a payload used for debugging a running server node. The response's Content-Type header specifies the encoding and format of the response. The following types are currently supported:

go.goroutines.v1: Plaintext representation of all running goroutines and their stacktraces
go.profile.cpu.1minute.v1: Returns the pprof-formatted cpu profile. See pprof.Profile.
go.profile.heap.v1: Returns the pprof-formatted heap profile as of the last GC. See pprof.Profile.
go.profile.allocs.v1: Returns the pprof-formatted allocs profile for all allocations in the process lifetime. See pprof.Profile.
go.trace.1minute.v1: Returns the execution trace in binary form. See the runtime/trace package.
metric.names.v1: Records all metric names and tag sets in the process's metric registry.
os.system.clock.v1: Plaintext string representing the current time as measured by the process in the RFC 3339 Nano format.
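
To give a feel for what two of these payloads contain, the following standard-library sketch produces a plaintext goroutine dump and an RFC 3339 Nano timestamp. It is an illustration only, not the server's implementation: the function names goroutineDump and systemClock are hypothetical, and the exact output that witchcraft-server emits for go.goroutines.v1 and os.system.clock.v1 may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// goroutineDump returns a plaintext dump of all goroutines and their stacks.
// (Hypothetical helper; assumed to resemble the go.goroutines.v1 payload.)
func goroutineDump() (string, error) {
	var buf bytes.Buffer
	// debug=2 prints stacks in the same form as an unrecovered panic.
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// systemClock formats t using the RFC 3339 Nano layout.
// (Hypothetical helper; assumed to resemble the os.system.clock.v1 payload.)
func systemClock(t time.Time) string {
	return t.Format(time.RFC3339Nano)
}

func main() {
	dump, err := goroutineDump()
	if err != nil {
		panic(err)
	}
	// Print only the first line of the dump to keep the output short.
	fmt.Println(strings.SplitN(dump, "\n", 2)[0])
	fmt.Println(systemClock(time.Now()))
}
```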

[Deprecated] Pprof routes

The following routes are registered on the management server (if enabled) to aid in debugging and telemetry collection. These are generally deprecated in favor of the diagnostic routes described above.

/debug/pprof: Provides an HTML index of the other endpoints at this route.
/debug/pprof/profile: Returns the pprof-formatted cpu profile. See pprof.Profile.
/debug/pprof/heap: Returns the pprof-formatted heap profile as of the last GC. See pprof.Profile.
/debug/pprof/cmdline: Returns the process's command line invocation as text/plain. See pprof.Cmdline.
/debug/pprof/symbol: Looks up the program counters listed in the request, responding with a table mapping program counters to function names. See pprof.Symbol.
/debug/pprof/trace: Returns the execution trace in binary form. See pprof.Trace.

Context path

If context-path is specified in the install configuration, all of the routes registered on the server will be prefixed with the specified context-path.

Security

witchcraft-server only supports HTTPS. The TLS client authentication type is configurable in code. The base install configuration has fields to specify the location of server key and certificate material for TLS connections.

Although it is not possible to run witchcraft-server using HTTP, it is possible to configure the server in code to use a generated self-signed certificate on start-up. Running the server in this mode and connecting to it using TLS without server certificate verification (equivalent of curl -k or an http.Transport with TLSClientConfig: &tls.Config{InsecureSkipVerify: true}) provides an analog to using HTTP, with the benefit that the traffic itself is still encrypted.
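
As a rough sketch of the client side of this setup, the program below builds an http.Client that skips server certificate verification and uses it against a throwaway TLS server. The httptest server is only a stand-in for a witchcraft server started with a self-signed certificate, and the helper names newInsecureClient, fetch and demo are hypothetical.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newInsecureClient returns an HTTP client that skips server certificate
// verification. Traffic is still encrypted, but the server's identity is not
// checked, so this is only suitable for local development and tests.
func newInsecureClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
		},
	}
}

// fetch issues a GET request and returns the response body as a string.
func fetch(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

// demo starts a TLS test server whose certificate is not trusted by the system
// (a stand-in for a server using a self-signed certificate) and fetches from it.
func demo() (string, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(rw http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(rw, "81")
	}))
	defer srv.Close()
	return fetch(newInsecureClient(), srv.URL+"/myNum")
}

func main() {
	body, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

A client with default settings would reject the same server with a certificate verification error, which is why this configuration should stay out of production code.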

Logging

witchcraft-server is configured with service, event, metric, request and trace loggers from the witchcraft-go-logging project and emits structured JSON logs using zap as the logger implementation. The default behavior emits logs to the var/log directory (var/log/service.log, var/log/request.log, etc.) unless the server is run in a Docker container or has the environment variable $CONTAINER set, in which case the logs are always emitted to stdout. The use-console-log property in the install configuration can also be set to "true" to always output logs to stdout. The runtime configuration supports configuring the log output level for service logs.

The context.Context provided to request handlers is configured with all of the standard loggers (service logger, event logger, trace logger, etc.). All of the handlers are also configured to emit request logs and trace logs.

Service logger origin

By default, the origin field of the service logger is set to be the package path of the package in which the witchcraft-server is started. For example, if the server is started in the file github.com/palantir/project/server/server.go, the origin for all service log lines will be github.com/palantir/project/server.

It is possible to configure the origin to be a different value using code. The origin can be specified to be a string constant or a function can be used that returns a specific package path based on supplied parameters (for example, the function can specify that the caller package's parent package should be used as the origin). The origin can also be set to empty, in which case it is omitted from the log output.

Trace IDs and instrumentation

witchcraft-server supports zipkin-compatible tracing and ensures that every request is instrumented for tracing. witchcraft-server also recognizes that some code will use trace IDs without necessarily using full zipkin-compatible spans, so some allowances are made to support this scenario.

The built-in witchcraft-server middleware that registers loggers on the context also ensures that a zipkin span is started. If the incoming request header has valid zipkin span information (that is, it specifies both an X-B3-TraceId and an X-B3-SpanId in the header), then the span created by the middleware is a child span of the incoming span. If the incoming request does not have a trace ID header, a new root span is created. If the header specifies a trace ID but not a span ID, the middleware creates a new root zipkin span, but ensures that the trace ID of the created span matches what is specified in the header. If an incoming request is routed to a registered endpoint, the built-in router middleware will create another span (which is a child span of the one created by the request middleware) whose span name is the HTTP method and template for the endpoint.
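
The three header cases described above can be summarized as a small decision function. This is a sketch of the rules as stated in this section, not the middleware's actual code: the type spanKind and the functions classifySpan and classify are hypothetical, and the sketch only checks that the headers are present rather than validating their contents.

```go
package main

import (
	"fmt"
	"net/http"
)

// spanKind describes how the request span relates to the incoming headers.
type spanKind string

const (
	childOfIncoming  spanKind = "child span of the incoming span"
	rootWithTraceID  spanKind = "new root span that reuses the incoming trace ID"
	rootWithNewTrace spanKind = "new root span with a new trace ID"
)

// classifySpan applies the rules from the text to a set of request headers.
// Assumption: presence is used as a stand-in for "valid".
func classifySpan(h http.Header) spanKind {
	traceID := h.Get("X-B3-TraceId")
	spanID := h.Get("X-B3-SpanId")
	switch {
	case traceID != "" && spanID != "":
		return childOfIncoming
	case traceID != "":
		return rootWithTraceID
	default:
		return rootWithNewTrace
	}
}

// classify builds headers from plain strings; empty strings mean "not set".
func classify(traceID, spanID string) spanKind {
	h := http.Header{}
	if traceID != "" {
		h.Set("X-B3-TraceId", traceID)
	}
	if spanID != "" {
		h.Set("X-B3-SpanId", spanID)
	}
	return classifySpan(h)
}

func main() {
	fmt.Println(classify("7e43bde2647413fc", "01228e628b3b3d22"))
	fmt.Println(classify("7e43bde2647413fc", ""))
	fmt.Println(classify("", ""))
}
```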

The trace information generated by the middleware is set on the header and will be visible to subsequent handlers. If the request specifies an X-B3-Sampled header, the value specified in that header is used to determine sampling. If this header is not present, whether or not the trace is sampled is determined by the sampling source configured for the witchcraft-server (by default, all traces are sampled). If a trace is not sampled, witchcraft-server will not generate any trace log output for it. However, the infrastructure will still perform all of the trace-related operations (such as creating child spans and setting span information on headers). The install configuration field trace-sample-rate represents a float between 0 and 1 (inclusive) to control the proportion of traces sampled by default. If the WithTraceSampler server option is provided, it overrides this configuration.
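
The precedence between the header and the configured rate can be sketched as follows. This is an illustration under stated assumptions rather than the server's sampler: the function names sampled and sampledFor are hypothetical, the accepted header values ("1"/"0", plus the older "true"/"false" spellings) are an assumption based on the B3 convention, and the WithTraceSampler override is not modeled.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
)

// sampled reports whether a trace should be sampled. An explicit X-B3-Sampled
// header wins; otherwise a random draw in [0, 1) is compared against rate,
// which plays the role of trace-sample-rate.
func sampled(h http.Header, rate float64, random func() float64) bool {
	switch h.Get("X-B3-Sampled") {
	case "1", "true":
		return true
	case "0", "false":
		return false
	}
	return random() < rate
}

// sampledFor is a convenience wrapper for a single header value
// (an empty string means the header is absent).
func sampledFor(headerValue string, rate float64) bool {
	h := http.Header{}
	if headerValue != "" {
		h.Set("X-B3-Sampled", headerValue)
	}
	return sampled(h, rate, rand.Float64)
}

func main() {
	fmt.Println(sampledFor("1", 0)) // header forces sampling despite a rate of 0
	fmt.Println(sampledFor("0", 1)) // header disables sampling despite a rate of 1
	fmt.Println(sampledFor("", 1))  // no header: a rate of 1 samples everything
}
```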

witchcraft-server also ensures that the context for every request has a trace ID. After the logging middleware executes, the request is guaranteed to have a trace ID (either from the incoming request or from the newly generated root span), and that trace ID is registered on the context. The witchcraft.TraceIDFromContext(context.Context) string function can be used to retrieve the trace ID from the context.

Creating new spans/trace log entries

Use the wtracing.StartSpanFromContext function to start a new span. This function will create a new span that is a child span of the span in the provided context. Defer the Finish() function of the returned span to ensure that the span is properly marked as finished (the "finish" operation will also generate a trace log entry if the span is sampled).

Middleware

witchcraft-server supports registering middleware to perform custom handling/augmenting of incoming requests. There are 2 different kinds of middleware: request and route middleware.

Request middleware is executed on every request received by the server. The function signature for request middleware is func(rw http.ResponseWriter, r *http.Request, next http.Handler). Request middleware is the most common kind of middleware. The server has built-in request middleware that adds a panic handler, sets the loggers and trace ID on the request context and updates request-related metrics. Any user-supplied request middleware is run after the built-in request middleware in the order in which they were added (which means that the context has all of the loggers configured). Request middleware is run before the request is handled by the router, which means that it is possible to rewrite the URL and other properties of the request and the router will route the modified request. However, note that the built-in logging middleware extracts the UID, SID, TokenID and TraceID from the request and sets them on the loggers before user-provided middleware is invoked, so if the user-defined middleware modifies the header in a manner that would change any of these values, the middleware should also update the request context to have loggers that use the updated values.
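
The following standard-library sketch shows a function with the request middleware signature rewriting a URL before routing takes place. It does not use witchcraft itself: http.ServeMux stands in for the router, and the names requestMiddleware, chain, rewriteLegacyPath and demo are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requestMiddleware mirrors the request middleware signature described above.
type requestMiddleware func(rw http.ResponseWriter, r *http.Request, next http.Handler)

// chain wraps final so that the middlewares run in the order given.
func chain(final http.Handler, mws ...requestMiddleware) http.Handler {
	h := final
	for i := len(mws) - 1; i >= 0; i-- {
		mw, next := mws[i], h
		h = http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
			mw(rw, r, next)
		})
	}
	return h
}

// rewriteLegacyPath rewrites /oldNum to /myNum before the router sees the request.
func rewriteLegacyPath(rw http.ResponseWriter, r *http.Request, next http.Handler) {
	if r.URL.Path == "/oldNum" {
		r.URL.Path = "/myNum"
	}
	next.ServeHTTP(rw, r)
}

// demo sends a request for path through the middleware and the stand-in router.
func demo(path string) (int, string) {
	mux := http.NewServeMux() // stands in for the witchcraft router
	mux.HandleFunc("/myNum", func(rw http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(rw, "81")
	})
	rec := httptest.NewRecorder()
	chain(mux, rewriteLegacyPath).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := demo("/oldNum")
	fmt.Println(code, body)
}
```

Because the rewrite happens before the router runs, the request for /oldNum is served by the handler registered for /myNum.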

Route middleware is only executed on the routes that are registered on the router. It wraps the handler registered on the route, so it is executed after the path has been matched, the handler for the route has been located, and the path parameters have been extracted and set on the context. The function signature for route middleware is func(rw http.ResponseWriter, r *http.Request, reqVals RequestVals, next RouteRequestHandler). The RequestVals struct stores the path template for the route along with the path parameters and their values. The server has built-in route middleware that records a request log entry after the request has completed and creates a trace log span and logs a trace log entry after the request has completed. Route middleware is run after all of the request middleware has run, and any user-supplied route middleware is run after the built-in route middleware in the order in which they were added. Because route middleware is executed after the routing has been determined, changing the URL of the request will not change the handler that is invoked or the path parameter values that have been extracted/stored (although it may still impact behavior based on the content of the actual handler that is registered). In general, most users will likely use request middleware rather than route middleware. However, if users want to only execute middleware on matched routes and want route-specific information such as the unrendered path template and the path parameter values, then route middleware should be used.
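
To illustrate what route middleware can see that request middleware cannot, the sketch below passes a path template and extracted parameters to a middleware. The RequestVals and RouteRequestHandler types here are simplified local stand-ins with assumed field names, not the definitions from the wrouter package, and the helpers wrap, tagTemplate and demo are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// RequestVals is a local stand-in: it holds the unrendered path template and
// the extracted path parameters. The real wrouter type may differ.
type RequestVals struct {
	PathTemplate  string
	PathParamVals map[string]string
}

// RouteRequestHandler is a local stand-in for the handler type that route
// middleware invokes next.
type RouteRequestHandler func(rw http.ResponseWriter, r *http.Request, reqVals RequestVals)

// routeMiddleware mirrors the route middleware signature described above.
type routeMiddleware func(rw http.ResponseWriter, r *http.Request, reqVals RequestVals, next RouteRequestHandler)

// wrap applies the middlewares around handler in the order given.
func wrap(handler RouteRequestHandler, mws ...routeMiddleware) RouteRequestHandler {
	h := handler
	for i := len(mws) - 1; i >= 0; i-- {
		mw, next := mws[i], h
		h = func(rw http.ResponseWriter, r *http.Request, reqVals RequestVals) {
			mw(rw, r, reqVals, next)
		}
	}
	return h
}

// tagTemplate records the matched path template on a response header.
func tagTemplate(rw http.ResponseWriter, r *http.Request, reqVals RequestVals, next RouteRequestHandler) {
	rw.Header().Set("X-Route-Template", reqVals.PathTemplate)
	next(rw, r, reqVals)
}

// demo simulates a request that has already been matched by a router.
func demo() (template, body string) {
	handler := func(rw http.ResponseWriter, _ *http.Request, reqVals RequestVals) {
		fmt.Fprint(rw, reqVals.PathParamVals["id"])
	}
	// A real router would match the path and extract these values itself.
	vals := RequestVals{PathTemplate: "/users/{id}", PathParamVals: map[string]string{"id": "42"}}
	rec := httptest.NewRecorder()
	wrap(handler, tagTemplate)(rec, httptest.NewRequest(http.MethodGet, "/users/42", nil), vals)
	return rec.Header().Get("X-Route-Template"), rec.Body.String()
}

func main() {
	template, body := demo()
	fmt.Println(template, body)
}
```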

Long-running execution not associated with a route

In some instances, a server may want a long-running task not associated with an endpoint. For example, the server may want a long-running goroutine that performs an operation at some interval for the lifetime of the server.

It is recommended that such goroutines be launched in the initialization function provided to WithInitFunc and use the ctx Context as their context. This context has the same lifecycle as the server and has all of the configured loggers (service loggers, metric loggers, etc.) already configured on it.

The provided context does not have a span or trace ID associated with it. If a trace ID is desired, create a new span with wtracing.StartSpanFromContext and the provided context to derive a new context that has a new root span associated with it. This function also updates any loggers in the context to use the new trace ID (for example, service loggers will include the trace ID).
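
A minimal sketch of such a background task, using only the standard library, might look like the following. The helper runEvery is hypothetical; in a witchcraft server it would be called from the initialization function with the ctx that the server provides, whereas here a cancellable context stands in for the server's lifecycle.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// runEvery calls task once per interval until ctx is cancelled. The returned
// channel is closed when the goroutine has exited.
func runEvery(ctx context.Context, interval time.Duration, task func(ctx context.Context)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return // the server (here: the demo) is shutting down
			case <-ticker.C:
				task(ctx)
			}
		}
	}()
	return done
}

// demo runs a counting task briefly, cancels the context, waits for the
// goroutine to stop and returns how many times the task ran.
func demo() int64 {
	ctx, cancel := context.WithCancel(context.Background())
	var runs int64
	done := runEvery(ctx, 5*time.Millisecond, func(context.Context) {
		atomic.AddInt64(&runs, 1)
	})
	time.Sleep(60 * time.Millisecond)
	cancel()
	<-done
	return atomic.LoadInt64(&runs)
}

func main() {
	fmt.Println("task ran", demo(), "times before cancellation")
}
```

Tying the loop to ctx.Done() means the task needs no separate shutdown hook: it stops when the context it was given is cancelled.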

Metrics

witchcraft-server initializes a metrics registry that uses the github.com/palantir/pkg/metrics package (which uses github.com/rcrowley/go-metrics internally) to track metrics for the server. All of the tracked metrics are emitted as metric log entries once every metric emit interval, which is 60 seconds by default (and can be configured to be a custom interval in the install configuration).

By default, witchcraft-server captures various Go runtime metrics (such as allocations, number of running goroutines, etc.) at the same frequency as the metric emit frequency. The collection of Go runtime statistics can be disabled with the WithDisableGoRuntimeMetrics server method.

SIGQUIT handling

witchcraft-server sets up a SIGQUIT handler such that, if the program is terminated using a SIGQUIT signal (kill -3), a goroutine dump is written as a diagnostic.1 log. This behavior can be disabled using server.WithDisableSigQuitHandler. If server.WithSigQuitHandlerWriter is used, the stacks will also be written in their unparsed form to the provided writer.

Shutdown signal handling

witchcraft-server attempts to drain active connections and gracefully shut down by calling server.Shutdown upon receiving a SIGTERM or SIGINT signal. This behavior can be disabled using server.WithDisableShutdownSignalHandler.
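
The general pattern of draining connections on a shutdown signal can be sketched with the standard library as shown below. This is not witchcraft-server's implementation: serveUntilSignal and demo are hypothetical names, and the demo pushes a synthetic SIGTERM onto the channel instead of registering with the operating system (a real program would feed the channel with signal.Notify from os/signal).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"os"
	"syscall"
	"time"
)

// serveUntilSignal serves on ln until a value arrives on sigs, then drains
// active connections with srv.Shutdown, waiting at most timeout.
func serveUntilSignal(srv *http.Server, ln net.Listener, sigs <-chan os.Signal, timeout time.Duration) error {
	serveErr := make(chan error, 1)
	go func() { serveErr <- srv.Serve(ln) }()

	select {
	case err := <-serveErr:
		return err // the server stopped on its own
	case <-sigs:
	}

	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		return err
	}
	// Serve returns http.ErrServerClosed once Shutdown has been called.
	if err := <-serveErr; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

// demo starts a server on an ephemeral local port and delivers a synthetic
// SIGTERM so that the shutdown path runs immediately.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	sigs := make(chan os.Signal, 1)
	sigs <- syscall.SIGTERM
	return serveUntilSignal(&http.Server{}, ln, sigs, time.Second)
}

func main() {
	if err := demo(); err != nil {
		panic(err)
	}
	fmt.Println("server shut down gracefully")
}
```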

Example server initialization

Basic production server

The following is an example program that launches a witchcraft-server that registers a GET /myNum endpoint that returns a randomly generated number encoded as JSON:

package main

import (
	"context"
	"math/rand"
	"net/http"

	"github.com/palantir/conjure-go-runtime/v2/conjure-go-server/httpserver"
	"github.com/palantir/witchcraft-go-server/v2/witchcraft"
	"github.com/palantir/witchcraft-go-server/v2/wrouter"
)

func main() {
	if err := witchcraft.NewServer().
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
			if err := registerMyNumEndpoint(info.Router); err != nil {
				return nil, err
			}
			return nil, nil
		}).
		Start(); err != nil {
		panic(err)
	}
}

func registerMyNumEndpoint(router wrouter.Router) error {
	return router.Get("/myNum", http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		httpserver.WriteJSONResponse(rw, rand.Intn(100), http.StatusOK)
	}))
}

Creating a witchcraft-server starts with the witchcraft.NewServer function, which returns a new witchcraft server with default configuration. The *witchcraft.Server struct has various With* functions that can be used to configure the server, and the Start() function starts the server using the specified configuration.

The WithInitFunc(InitFunc) function is used to register routes on the server. The initialization function provided to WithInitFunc is of the type witchcraft.InitFunc, which has the following definition: type InitFunc func(ctx context.Context, info InitInfo) (cleanup func(), rErr error).

The ctx provided to the function is valid for the duration of the server and has loggers configured on it. The info struct contains fields that can be used to initialize various state and configuration for the server -- refer to the InitInfo documentation for more information.

In this example, a "GET" endpoint is registered on the router using the "/myNum" path, and the httpserver package is used to write a JSON response.

This example server uses all of the witchcraft defaults -- it looks for install configuration in var/conf/install.yml and uses config.Install as its type, looks for runtime configuration in var/conf/runtime.yml and uses config.Runtime as its type, and looks for an encrypted-config-value key in var/conf/encrypted-config-value.key. The install configuration must also specify paths to key and certificate files to use for TLS.

Basic local/test server

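The defaults for the server make sense for a production environment, but can make running the server locally (or in tests) cumbersome. We can modify the main function as follows to configure the witchcraft server to use in-memory defaults (the config types used here come from github.com/palantir/witchcraft-go-server/v2/config, which must be added to the imports):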

func main() {
	if err := witchcraft.NewServer().
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
			if err := registerMyNumEndpoint(info.Router); err != nil {
				return nil, err
			}
			return nil, nil
		}).
		WithSelfSignedCertificate().
		WithECVKeyProvider(witchcraft.ECVKeyNoOp()).
		WithRuntimeConfig(config.Runtime{}).
		WithInstallConfig(config.Install{
			ProductName: "example-app",
			Server: config.Server{
				Port: 8100,
			},
			UseConsoleLog: true,
		}).
		Start(); err != nil {
		panic(err)
	}
}

The WithSelfSignedCertificate() function configures the server to start using a generated self-signed certificate, which removes the need to specify server TLS material. The WithECVKeyProvider(witchcraft.ECVKeyNoOp()) function configures the server to use an empty ECV key source. The WithRuntimeConfig(config.Runtime{}) function configures the server to use the provided runtime configuration (in this case, it is empty), and the WithInstallConfig function specifies the install configuration that should be used (it specifies that port 8100 should be used and that log output should go to STDOUT).

With this configuration, the program can be run using go run:

➜ go run main.go
{"level":"INFO","time":"2018-11-27T05:47:02.013456Z","message":"Listening to https","type":"service.1","origin":"github.com/palantir/witchcraft-go-server/v2/app_example","params":{"address":":8100","server":"example-app"}}

Issuing a request to this server using curl produces the expected response (note that the -k option is used to skip certificate verification because the server is using a self-signed certificate):

➜ curl -k https://localhost:8100/myNum
81

You can also observe that the server emits trace and request logs based on receiving this request:

{"time":"2018-11-27T05:47:28.313585Z","type":"trace.1","span":{"traceId":"7e43bde2647413fc","id":"01228e628b3b3d22","name":"GET /myNum","parentId":"7e43bde2647413fc","timestamp":1543297648313551,"duration":29000}}
{"time":"2018-11-27T05:47:28.313719Z","type":"request.2","method":"GET","protocol":"HTTP/2.0","path":"/myNum","status":200,"requestSize":0,"responseSize":3,"duration":146,"traceId":"7e43bde2647413fc","params":{"Accept":"*/*","User-Agent":"curl/7.54.0","X-B3-Parentspanid":"7e43bde2647413fc","X-B3-Sampled":"1","X-B3-Spanid":"01228e628b3b3d22","X-B3-Traceid":"7e43bde2647413fc"}}
{"time":"2018-11-27T05:47:28.313802Z","type":"trace.1","span":{"traceId":"7e43bde2647413fc","id":"7e43bde2647413fc","name":"witchcraft-go-server request middleware","timestamp":1543297648313496,"duration":304000}}

Server using install configuration

The previous examples used the built-in install configuration. Most real servers will use custom install configuration that specifies configuration for the server. Any struct can be used as install configuration, but it must support being unmarshaled as YAML and must embed the config.Install struct. The install configuration is loaded once when the server starts (it is never reloaded), so only values that are static for the lifetime of the server should be specified in this configuration.

The following example modifies the previous example so that the endpoint returns the number defined in the install configuration instead of a random number:

package main

import (
	"context"
	"net/http"

	"github.com/palantir/conjure-go-runtime/v2/conjure-go-server/httpserver"
	"github.com/palantir/witchcraft-go-server/v2/config"
	"github.com/palantir/witchcraft-go-server/v2/witchcraft"
	"github.com/palantir/witchcraft-go-server/v2/wrouter"
)

type AppInstallConfig struct {
	config.Install `yaml:",inline"`

	MyNum int `yaml:"my-num"`
}

func main() {
	if err := witchcraft.NewServer().
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
			if err := registerMyNumEndpoint(info.Router, info.InstallConfig.(AppInstallConfig).MyNum); err != nil {
				return nil, err
			}
			return nil, nil
		},
		).
		WithSelfSignedCertificate().
		WithECVKeyProvider(witchcraft.ECVKeyNoOp()).
		WithRuntimeConfig(config.Runtime{}).
		WithInstallConfigType(AppInstallConfig{}).
		WithInstallConfig(AppInstallConfig{
			Install: config.Install{
				ProductName: "example-app",
				Server: config.Server{
					Port: 8100,
				},
				UseConsoleLog: true,
			},
			MyNum: 13,
		}).
		Start(); err != nil {
		panic(err)
	}
}

func registerMyNumEndpoint(router wrouter.Router, num int) error {
	return router.Get("/myNum", http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		httpserver.WriteJSONResponse(rw, num, http.StatusOK)
	}))
}

This example defines the AppInstallConfig struct, which embeds config.Install and also defines a MyNum field. The WithInstallConfigType(AppInstallConfig{}) function call is added to specify AppInstallConfig{} as the install struct, and the initialization function is modified to type-assert the provided info.InstallConfig value to AppInstallConfig and use its MyNum field as the value returned by the endpoint. The WithInstallConfig function is also updated to use configuration that specifies a value for MyNum.

Running the updated program using go run main.go and issuing curl -k https://localhost:8100/myNum returns 13.

A real program will generally read install configuration from disk rather than specifying it directly in code. We can modify the example above to do this by simply removing the WithInstallConfig call:

package main

import (
	"context"
	"net/http"

	"github.com/palantir/conjure-go-runtime/v2/conjure-go-server/httpserver"
	"github.com/palantir/witchcraft-go-server/v2/config"
	"github.com/palantir/witchcraft-go-server/v2/witchcraft"
	"github.com/palantir/witchcraft-go-server/v2/wrouter"
)

type AppInstallConfig struct {
	config.Install `yaml:",inline"`

	MyNum int `yaml:"my-num"`
}

func main() {
	if err := witchcraft.NewServer().
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
			if err := registerMyNumEndpoint(info.Router, info.InstallConfig.(AppInstallConfig).MyNum); err != nil {
				return nil, err
			}
			return nil, nil
		},
		).
		WithSelfSignedCertificate().
		WithECVKeyProvider(witchcraft.ECVKeyNoOp()).
		WithRuntimeConfig(config.Runtime{}).
		WithInstallConfigType(AppInstallConfig{}).
		Start(); err != nil {
		panic(err)
	}
}

func registerMyNumEndpoint(router wrouter.Router, num int) error {
	return router.Get("/myNum", http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		httpserver.WriteJSONResponse(rw, num, http.StatusOK)
	}))
}

By default, the install configuration is read from var/conf/install.yml. Create a file at that path relative to the Go file and provide it with the YAML content for the configuration:

product-name: "example-app"
use-console-log: true
server:
  port: 8100
my-num: 77

Running the updated program using go run main.go and issuing curl -k https://localhost:8100/myNum returns 77.

Server using runtime configuration

Runtime configuration is similar to install configuration. The main difference is that runtime configuration supports reloading configuration. When file-based runtime configuration is used, whenever the configuration file is updated, the associated values are updated as well.

The following example defines a custom runtime configuration struct and returns the refreshable int value in the runtime from its endpoint (the example uses a basic in-memory install configuration for simplicity):

package main

import (
	"context"
	"net/http"

	"github.com/palantir/conjure-go-runtime/v2/conjure-go-server/httpserver"
	"github.com/palantir/pkg/refreshable"
	"github.com/palantir/witchcraft-go-server/v2/config"
	"github.com/palantir/witchcraft-go-server/v2/witchcraft"
	"github.com/palantir/witchcraft-go-server/v2/wrouter"
)

type AppRuntimeConfig struct {
	config.Runtime `yaml:",inline"`

	MyNum int `yaml:"my-num"`
}

func main() {
	if err := witchcraft.NewServer().
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
			myNumRefreshable := refreshable.NewInt(info.RuntimeConfig.Map(func(in interface{}) interface{} {
				return in.(AppRuntimeConfig).MyNum
			}))
			if err := registerMyNumEndpoint(info.Router, myNumRefreshable); err != nil {
				return nil, err
			}
			return nil, nil
		},
		).
		WithSelfSignedCertificate().
		WithECVKeyProvider(witchcraft.ECVKeyNoOp()).
		WithInstallConfig(config.Install{
			ProductName: "example-app",
			Server: config.Server{
				Port: 8100,
			},
			UseConsoleLog: true,
		}).
		WithRuntimeConfigType(AppRuntimeConfig{}).
		Start(); err != nil {
		panic(err)
	}
}

func registerMyNumEndpoint(router wrouter.Router, numProvider refreshable.Int) error {
	return router.Get("/myNum", http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		httpserver.WriteJSONResponse(rw, numProvider.CurrentInt(), http.StatusOK)
	}))
}

The refreshable configuration warrants some closer examination. Note that registerMyNumEndpoint takes a numProvider refreshable.Int as an argument rather than an int and returns the result of CurrentInt(). Conceptually, the numProvider is guaranteed to always return the current value of the number specified in the runtime configuration. Using this pattern removes the need for writing code that listens for updates -- the code can simply assume that the provider always returns the most recent value. refreshable.Int and refreshable.String are helper types that provide functions that return the current value of the correct type. For types without helper functions, the general refreshable.Refreshable should be used, and the interface{} returned by Current() must be explicitly converted to the proper target type (this is required because the Refreshable interface is defined in terms of interface{} rather than a type parameter).

The numProvider passed to registerMyNumEndpoint is derived by applying a mapping function to info.RuntimeConfig, which is a refreshable.Refreshable. Its Map function is provided with a function that, given an updated runtime configuration, returns the portion of the configuration that is required. The input to the mapping function must be explicitly type-asserted to the runtime configuration type (in this case, in.(AppRuntimeConfig)), and then the relevant section can be accessed (or derived) and returned. The result of the Map function is a Refreshable that returns the mapped portion. In this case, because we know the result will always be an int, we wrap the returned Refreshable in a refreshable.NewInt call, which provides the convenience function CurrentInt() that performs the type conversion of the result to an int.

By default, the runtime configuration is read from var/conf/runtime.yml. Create a file at that path relative to the Go file and provide it with the YAML content for the configuration:

my-num: 99

Running the updated program using go run main.go and issuing curl -k https://localhost:8100/myNum returns 99. While the program is still running, update the content of the file to be my-num: 88, save it, then run the curl command again. The output is 88.

Full server example

The following is an example of a server that defines and uses both custom install and runtime configuration:

package main

import (
	"context"
	"net/http"

	"github.com/palantir/conjure-go-runtime/v2/conjure-go-server/httpserver"
	"github.com/palantir/pkg/refreshable"
	"github.com/palantir/witchcraft-go-server/v2/config"
	"github.com/palantir/witchcraft-go-server/v2/witchcraft"
	"github.com/palantir/witchcraft-go-server/v2/wrouter"
)

type AppInstallConfig struct {
	config.Install `yaml:",inline"`

	MyNum int `yaml:"my-num"`
}

type AppRuntimeConfig struct {
	config.Runtime `yaml:",inline"`

	MyNum int `yaml:"my-num"`
}

func main() {
	if err := witchcraft.NewServer().
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
			if err := registerInstallNumEndpoint(info.Router, info.InstallConfig.(AppInstallConfig).MyNum); err != nil {
				return nil, err
			}

			myNumRefreshable := refreshable.NewInt(info.RuntimeConfig.Map(func(in interface{}) interface{} {
				return in.(AppRuntimeConfig).MyNum
			}))
			if err := registerRuntimeNumEndpoint(info.Router, myNumRefreshable); err != nil {
				return nil, err
			}
			return nil, nil
		},
		).
		WithInstallConfigType(AppInstallConfig{}).
		WithRuntimeConfigType(AppRuntimeConfig{}).
		WithSelfSignedCertificate().
		WithECVKeyProvider(witchcraft.ECVKeyNoOp()).
		Start(); err != nil {
		panic(err)
	}
}

func registerInstallNumEndpoint(router wrouter.Router, num int) error {
	return router.Get("/installNum", http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		httpserver.WriteJSONResponse(rw, num, http.StatusOK)
	}))
}

func registerRuntimeNumEndpoint(router wrouter.Router, numProvider refreshable.Int) error {
	return router.Get("/runtimeNum", http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		httpserver.WriteJSONResponse(rw, numProvider.CurrentInt(), http.StatusOK)
	}))
}

With var/conf/install.yml:

product-name: "example-app"
use-console-log: true
server:
  port: 8100
my-num: 7

And var/conf/runtime.yml:

my-num: 13

Querying installNum returns 7, while querying runtimeNum returns 13:

➜ curl -k https://localhost:8100/installNum
7
➜ curl -k https://localhost:8100/runtimeNum
13

In a production server, WithSelfSignedCertificate() and WithECVKeyProvider(witchcraft.ECVKeyNoOp()) would not be called and the proper security and key material would exist in their expected locations.

Refreshable configuration

The runtime configuration for witchcraft-server uses the refreshable.Refreshable interface. Conceptually, a Refreshable is a container that holds a value of a specific type that may be updated/refreshed. The following is the interface definition for Refreshable:

type Refreshable interface {
	// Current returns the most recent value of this Refreshable.
	Current() interface{}

	// Subscribe subscribes to changes of this Refreshable. The provided function is called with the value of Current()
	// whenever the value changes.
	Subscribe(consumer func(interface{})) (unsubscribe func())

	// Map returns a new Refreshable based on the current one that handles updates based on the current Refreshable.
	Map(func(interface{}) interface{}) Refreshable
}

The runtimeConfig refreshable.Refreshable parameter provided to the initialization function specified using WithInitFunc stores the latest unmarshaled runtime configuration as its current value, and the type of the value is specified using the WithRuntimeConfigType function (if this function is not called, config.Runtime is used as the default type).

For example, for the call:

witchcraft.NewServer().
    WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (func(), error) {
        return nil, nil
    }).
    WithRuntimeConfigType(AppRuntimeConfig{})

The WithRuntimeConfigType(AppRuntimeConfig{}) function specifies that the type of the runtime configuration is AppRuntimeConfig, so the value returned by runtimeConfig.Current() in WithInitFunc will have the type AppRuntimeConfig. Because the Refreshable interface is not generic (Current() returns interface{}), the author must make this association manually and convert the current value into the desired type when using it (for example, runtimeConfig.Current().(AppRuntimeConfig)).

The Refreshable interface supports using the Map function to derive a new refreshable based on the value of the current refreshable. This allows downstream functions that are only interested in a subset of the refreshable to observe just the relevant portion.

For example, consider the AppRuntimeConfig definition:

type AppRuntimeConfig struct {
	config.Runtime `yaml:",inline"`

	MyNum int `yaml:"my-num"`
}

A downstream function may only be interested in updates to the MyNum variable -- if updates to config.Runtime are not relevant to the function, there is no need to subscribe to it. The following code derives a new Refreshable from the runtimeConfig refreshable:

myNumRefreshable := runtimeConfig.Map(func(in interface{}) interface{} {
    return in.(AppRuntimeConfig).MyNum
})

The Current() function for myNumRefreshable returns the MyNum field of in.(AppRuntimeConfig), and the derived Refreshable is only updated when the derived value changes. Accessing a field is the most common usage of Map, but any arbitrary logic can be performed in the mapping function. Just note that the mapping will be performed whenever the parent refreshable is updated and the result will be compared using reflect.DeepEqual.
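To make the "only updated when the derived value changes" behavior concrete, the following is a minimal standard-library sketch. The simpleRefreshable type, its Update method and the countMyNumUpdates helper are hypothetical stand-ins written for this illustration; they are not the implementation in github.com/palantir/pkg/refreshable, and unsubscribe handling is omitted for brevity.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

// simpleRefreshable is a hypothetical, simplified stand-in for a Refreshable.
type simpleRefreshable struct {
	mu          sync.Mutex
	current     interface{}
	subscribers []func(interface{})
}

func newSimpleRefreshable(val interface{}) *simpleRefreshable {
	return &simpleRefreshable{current: val}
}

func (r *simpleRefreshable) Current() interface{} {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.current
}

// Update stores val and notifies subscribers, but only if val differs from
// the current value according to reflect.DeepEqual.
func (r *simpleRefreshable) Update(val interface{}) {
	r.mu.Lock()
	if reflect.DeepEqual(r.current, val) {
		r.mu.Unlock()
		return
	}
	r.current = val
	subs := append([]func(interface{}){}, r.subscribers...)
	r.mu.Unlock()
	for _, sub := range subs {
		sub(val)
	}
}

func (r *simpleRefreshable) Subscribe(consumer func(interface{})) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.subscribers = append(r.subscribers, consumer)
}

// Map derives a child from the parent. mapFn runs on every parent update,
// but the child only changes (and notifies) when the mapped result differs.
func (r *simpleRefreshable) Map(mapFn func(interface{}) interface{}) *simpleRefreshable {
	child := newSimpleRefreshable(mapFn(r.Current()))
	r.Subscribe(func(val interface{}) {
		child.Update(mapFn(val))
	})
	return child
}

type appRuntimeConfig struct {
	LogLevel string
	MyNum    int
}

// countMyNumUpdates applies the updates in order and reports how many times
// a subscriber to the derived MyNum value is notified.
func countMyNumUpdates(initial appRuntimeConfig, updates ...appRuntimeConfig) int {
	parent := newSimpleRefreshable(initial)
	myNum := parent.Map(func(in interface{}) interface{} {
		return in.(appRuntimeConfig).MyNum
	})
	count := 0
	myNum.Subscribe(func(interface{}) { count++ })
	for _, u := range updates {
		parent.Update(u)
	}
	return count
}

func main() {
	n := countMyNumUpdates(
		appRuntimeConfig{LogLevel: "info", MyNum: 13},
		appRuntimeConfig{LogLevel: "debug", MyNum: 13}, // MyNum unchanged: no notification
		appRuntimeConfig{LogLevel: "debug", MyNum: 14}, // MyNum changed: one notification
	)
	fmt.Println(n)
}
```

In this sketch the mapping function runs for both parent updates, but the subscriber on the derived value is only called for the second one.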

The general Refreshable interface returns an interface{} and its result must always be converted to the actual underlying type. However, if a Refreshable is known to return an int, string or bool, convenience wrapper types are provided to return typed values. For example, refreshable.NewInt(in Refreshable) returns a refreshable.Int, which is defined as:

type Int interface {
	Refreshable
	CurrentInt() int
}

The CurrentInt() function returns the current value converted to an int, which makes it easier to use in code and alleviates the need for clients to manually remember the type stored in the Refreshable.

If a Refreshable with a particular value/type is used widely throughout a code base, it may make sense to define a similar interface so that clients do not have to manually track the type information. For example, a typed Refreshable for AppRuntimeConfig can be defined as follows:

type RefreshableAppRuntimeConfig interface {
	Refreshable
	CurrentAppRuntimeConfig() AppRuntimeConfig
}

type refreshableAppRuntimeConfig struct {
	Refreshable
}

func (r refreshableAppRuntimeConfig) CurrentAppRuntimeConfig() AppRuntimeConfig {
	return r.Current().(AppRuntimeConfig)
}

Updating refreshable configuration: provider-based vs. push-based

The "provider" model of configuration updates takes the philosophy that executing code simply needs the most up-to-date value of a Refreshable when it executes. This model makes the most sense when the value is read whenever an endpoint is executed or when a long-running or periodically executed background task executes. In these scenarios, the latest value of the Refreshable is only needed when the logic executes. This update model is typically the most common, and is achieved by passing down specific Refreshable providers for the required values to the handlers/routines.

However, in some cases, an application may want to be notified of every update to a field and react to that update immediately -- for example, if updating a specific configuration field triggers an expensive computation that should happen immediately, the logic wants to be notified as soon as the update is made.

In this scenario, the Subscribe function should be used for the Refreshable that has the value for which updates are needed. For example, consider the following configuration:

type AppRuntimeConfig struct {
	config.Runtime `yaml:",inline"`

	AssetURLs []string `yaml:"asset-urls"`
}

The AssetURLs field specifies URLs that should be downloaded by the program whenever the value is updated. This can be handled as follows:

unsubscribe := runtimeConfig.Map(func(in interface{}) interface{} {
	return in.(AppRuntimeConfig).AssetURLs
}).Subscribe(func(in interface{}) {
	assetURLs := in.([]string)
	// perform work
})
// unsubscribe should be deferred or stored and run at shutdown

The Map function returns a new Refreshable that updates only when the AssetURLs field is updated, and the Subscribe function subscribes a listener that performs work as soon as the value is updated. This ensures that the logic runs immediately on every refresh of the value.

License

This project is made available under the Apache 2.0 License.

witchcraft-go-server's Issues

Allow selecting which go metrics you care about to emit

Currently, you can either have or not have the GoMetrics wholesale. This puts daemons in a tough spot where the cost of each additional metric is pretty high, but some of the GoMetrics are more useful than others. Can we allow selection of which GoMetrics we want vs not want?

wresource metrics middleware incompatible with pkg/metrics 0.10.2

See failing build here: https://github.com/palantir/witchcraft-go-server/compare/bm/bump-pkg?expand=1

It looks like we were depending on the behavior before palantir/pkg#153.

We assume that the request context will have tags here: https://github.com/palantir/witchcraft-go-server/blob/v1.6.1/witchcraft/internal/middleware/request.go#L166

This worked previously by setting them here: https://github.com/palantir/witchcraft-go-server/blob/v1.6.1/witchcraft/wresource/resource.go#L92

Reloading runtime configuration with nil logger configuration panics

Start server with runtime configuration that has a non-nil logger configuration block. For example:

my-num: 13
logging:
  level: debug

Update the file to be a configuration that does not contain logger configuration. For example:

my-num: 14

Expected

Logger configuration is either set to default value or not modified

Actual

Program panics:

goroutine 27 [running]:
github.com/palantir/witchcraft-go-server/witchcraft.(*Server).Start.func4(0x14fcd60, 0xc00000c080)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/witchcraft.go:517 +0x4d
github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*DefaultRefreshable).Update(0xc0000a7080, 0x14fcd60, 0xc00000c080, 0xc00000c080, 0x0)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_default.go:48 +0x280
github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*DefaultRefreshable).Map.func1(0x149dca0, 0xc00000c040)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_default.go:81 +0x67
github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*DefaultRefreshable).Update(0xc0000a6f80, 0x149dca0, 0xc00000c040, 0xc00000c040, 0x0)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_default.go:48 +0x280
github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*DefaultRefreshable).Map.func1(0x149dca0, 0xc00000c020)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_default.go:81 +0x67
github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*DefaultRefreshable).Update(0xc0000a6f00, 0x149dca0, 0xc00000c020, 0xc00000c020, 0x24b96f99c8f4fb9a)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_default.go:48 +0x280
github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*fileRefreshable).watchForChanges.func1(0x15fc740, 0xc000170b10, 0xc0000a6f40, 0xc0000b5320, 0x15722cd, 0xb)
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_file.go:92 +0x47a
created by github.com/palantir/witchcraft-go-server/witchcraft/refreshable.(*fileRefreshable).watchForChanges
	/Volumes/git/go/src/github.com/palantir/witchcraft-go-server/witchcraft/refreshable/refreshable_file.go:70 +0x2b7

Extend concurrency limiting middleware for more live-reloading

I went to use the concurrency-limiting middleware package: https://github.com/palantir/witchcraft-go-server/blob/585e4072d7d3732ec36e52380ef6030a9567d217/witchcraft/ratelimit/middleware.go. I found that it didn't quite do everything I wanted to do. It currently allows refreshing the limit value, but it doesn't allow editing the (limit, match function) pairs at runtime, which could be useful. As it stands, it doesn't allow changing the match function logic or adding new limits or match functions, because each middleware is tied to a single (limit, match function) pair and middleware is installed at server initialization time.

For example, say a server installs middleware that uses a match function that matches a specific user agent. If the server wants to, at runtime, also limit concurrency for an additional user agent, it has to restart. (This functionality is currently available in some of our internal Java utilities that are intended to achieve the same goal).

Add Server field to witchcraft.InitInfo

Some initialization functions require a way of launching a goroutine that may eventually call Shutdown on the server. Currently this is being worked around by creating the initFn as a variable after constructing the server, using a reference to the server inside the anonymous function, then setting the initFn on the server before starting it.

Instead, we could provide a Server field on InitInfo. It should have descriptive documentation explaining the state(s) the server can be in when initfn is called.

Default server/client metrics too verbose, no option to trim down

Some of the default request/server metrics being emitted break out into 12 distinct values (15m, 1m, 5m, Count, Max, Mean, meanRate, min, p50, p95, p99, stddev), which are probably not all equally valuable and are expensive, especially when emitted by daemons deployed on every single host.

I propose we move these metrics to a whitelist model, where we have a default filter of values that we care about for a given metric (i.e. if I only care about P99, I should be able to get only that) and allow products to override that filter.

This gives us 1) the ability to have a sane default that we can iterate on over time, depending on the value vs. cost trade-off of each metric, and 2) the ability for products that have different constraints to make different decisions (e.g. daemons with only health endpoints might not care about any of those).

pprof/goroutine returns 404

pprof includes an endpoint named goroutine which returns a stack trace of all current goroutines. However, when trying to access this endpoint, the server returns the message 404 page not found.

Expose listener address when configured for port 0

Configuring the server on port 0 tells the kernel to assign a random port and return the address. We could handle this more nicely:

easy: pass instantiated listener to ServeTLS instead of using ListenAndServeTLS so we can log the address
more involved: expose this info over an API (maybe on the *Server object?)
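The "easy" option can be sketched with the standard library alone. This is a minimal illustration, not the server's code: listenOnAnyPort is a hypothetical helper, and plain HTTP with Serve is used for brevity where the server would call ServeTLS with the same listener.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
)

// listenOnAnyPort opens a TCP listener on a kernel-assigned port and returns
// it together with the port that was actually chosen.
func listenOnAnyPort() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	return ln, ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	ln, port, err := listenOnAnyPort()
	if err != nil {
		panic(err)
	}
	// The resolved address is known before serving starts, so it can be logged.
	fmt.Printf("listening on %s\n", ln.Addr())

	srv := &http.Server{Handler: http.NotFoundHandler()}
	// Serve takes the pre-built listener instead of creating its own.
	go func() { _ = srv.Serve(ln) }()

	resp, err := http.Get(fmt.Sprintf("http://127.0.0.1:%d/", port))
	if err != nil {
		panic(err)
	}
	_ = resp.Body.Close()
	fmt.Println("status:", resp.StatusCode)
	_ = srv.Close()
}
```

Because the listener exists before the server starts, the same value could also back an accessor on the *Server object for the "more involved" option.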

Encrypted values in configuration should be decoded while maintaining YAML structure

What happened?

WGS crashed when attempting to start up using install configuration that contained the following, where the content of the encrypted value is a certificate with line breaks as part of the data:

client-ca-certs:
      test-ca: '${enc:encrypted_bytes}'

Failed to unmarshal install base configuration YAML
yaml: line 13: could not find expected ':'
github.com/palantir/witchcraft-go-server/v2/witchcraft.(*Server).initInstallConfig
	/go/src/.../vendor/github.com/palantir/witchcraft-go-server/v2/witchcraft/witchcraft.go:834
...

What did you want to happen?

WGS should start and load the proper configuration

Analysis

The issue here is that, currently, if the configuration contains any ECV values, all of those values are decrypted within the configuration bytes and then the resulting bytes are unmarshalled. This means that, if the decrypted values render in a manner that semantically changes the meaning of the YAML (for example, by specifying key-value pairs), this impacts the manner in which the unmarshalling occurs.

Instead, the ideal would be to read in the YAML nodes and then, for any text nodes that are encrypted, use the decrypted text as the value.
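A rough sketch of that node-based approach is shown below. It assumes the document has already been decoded into a generic tree of map[string]interface{}, []interface{} and scalars, and it takes decryption as a caller-supplied function. decryptNodes and decryptFunc are hypothetical names for this illustration and do not correspond to the encrypted-config-value API.

```go
package main

import (
	"fmt"
	"strings"
)

// decryptFunc is a hypothetical hook that turns the text inside ${enc:...}
// into plaintext.
type decryptFunc func(encrypted string) (string, error)

// decryptNodes walks an already-decoded document tree and replaces string
// leaves of the form ${enc:...} with their decrypted text. Because the
// structure is parsed first, decrypted content cannot alter it.
func decryptNodes(node interface{}, decrypt decryptFunc) (interface{}, error) {
	switch v := node.(type) {
	case map[string]interface{}:
		out := make(map[string]interface{}, len(v))
		for key, child := range v {
			dec, err := decryptNodes(child, decrypt)
			if err != nil {
				return nil, err
			}
			out[key] = dec
		}
		return out, nil
	case []interface{}:
		out := make([]interface{}, len(v))
		for i, child := range v {
			dec, err := decryptNodes(child, decrypt)
			if err != nil {
				return nil, err
			}
			out[i] = dec
		}
		return out, nil
	case string:
		if strings.HasPrefix(v, "${enc:") && strings.HasSuffix(v, "}") {
			inner := strings.TrimSuffix(strings.TrimPrefix(v, "${enc:"), "}")
			plain, err := decrypt(inner)
			if err != nil {
				return nil, err
			}
			return plain, nil
		}
		return v, nil
	default:
		// Numbers, booleans and nil are returned unchanged.
		return v, nil
	}
}

func main() {
	doc := map[string]interface{}{
		"client-ca-certs": map[string]interface{}{
			"test-ca": "${enc:encrypted_bytes}",
		},
	}
	// Fake decryption that returns multi-line text containing a colon, which
	// would confuse a parser if substituted into the raw bytes.
	fake := func(s string) (string, error) {
		return "-----BEGIN CERTIFICATE-----\nkey: " + s + "\n-----END CERTIFICATE-----", nil
	}
	out, err := decryptNodes(doc, fake)
	if err != nil {
		panic(err)
	}
	certs := out.(map[string]interface{})["client-ca-certs"].(map[string]interface{})
	fmt.Printf("%q\n", certs["test-ca"])
}
```

The tree shape matches what encoding/json produces; some YAML decoders use other map types, so a real implementation would need to handle those as well.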

log canceled request as 499 instead of 500

The server currently responds to any request, even if the request is canceled. This makes it possible for the client to receive an error (for example) related to a request they no longer care about. Based on a current need to not send such errors, the needed change seems like a reasonable behavior to extend to the underlying server for the general case. Nick Miyake was very helpful in researching that the implementation would be doable.

I'm specifically proposing adding an opt-in configuration that if enabled would wrap the http.ResponseWriter to not write a response if the request is canceled (which is bidirectionally correlated to the parent context from the request being canceled as per https://golang.org/pkg/net/http/#Request.Context). To help with telemetry, I'm also proposing request logs with a 499 Client Closed Request status code (part of an nginx extension of HTTP status codes) in these cases.
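As a starting point for discussion, here is a minimal standard-library sketch of such a wrapper. The names cancelAwareWriter, suppressCanceled and loggedStatus are hypothetical, and the logStatus callback stands in for the request logger; none of this is existing witchcraft-go-server API.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// statusClientClosedRequest is the non-standard nginx status code 499.
const statusClientClosedRequest = 499

// cancelAwareWriter drops headers and body once the request context is
// canceled and remembers the status that should appear in the request log.
type cancelAwareWriter struct {
	http.ResponseWriter
	ctx    context.Context
	status int
}

func (w *cancelAwareWriter) WriteHeader(code int) {
	if w.ctx.Err() != nil {
		w.status = statusClientClosedRequest
		return
	}
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

func (w *cancelAwareWriter) Write(b []byte) (int, error) {
	if w.ctx.Err() != nil {
		w.status = statusClientClosedRequest
		// Report success so that handlers do not treat the drop as a failure.
		return len(b), nil
	}
	if w.status == 0 {
		w.status = http.StatusOK
	}
	return w.ResponseWriter.Write(b)
}

// suppressCanceled wraps next so that responses to canceled requests are not
// written and are logged as 499.
func suppressCanceled(next http.Handler, logStatus func(status int)) http.Handler {
	return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
		w := &cancelAwareWriter{ResponseWriter: rw, ctx: req.Context()}
		next.ServeHTTP(w, req)
		if w.status == 0 {
			w.status = http.StatusOK
		}
		logStatus(w.status)
	})
}

// loggedStatus runs a handler that always responds with 500 and returns the
// status that would be logged, with or without a canceled request context.
func loggedStatus(canceled bool) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if canceled {
		cancel()
	}
	req := httptest.NewRequest(http.MethodGet, "/", nil).WithContext(ctx)
	logged := 0
	failing := http.HandlerFunc(func(rw http.ResponseWriter, _ *http.Request) {
		http.Error(rw, "boom", http.StatusInternalServerError)
	})
	suppressCanceled(failing, func(status int) { logged = status }).ServeHTTP(httptest.NewRecorder(), req)
	return logged
}

func main() {
	fmt.Println(loggedStatus(false), loggedStatus(true))
}
```

Whether a dropped Write should report success or return an error to the handler is a design choice that the sketch does not settle.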

Specifically looking for feedback from @k-simons @bmoylan @jdhenke Thanks!

Attempt graceful shutdown on SIGTERM

Witchcraft should attempt to gracefully shutdown servers upon receiving a SIGTERM. A graceful shutdown should include responding false from its readiness endpoint, attempting to drain active connections, and responding to new requests with a 503 Service Unavailable.
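A minimal sketch of the readiness and draining parts, using only the standard library, might look like the following. drainGate, serveUntilDone and readinessStatus are hypothetical names, the /status/readiness handler is a simplified stand-in for the real endpoint, and the sketch does not cover answering new requests with 503 (http.Server.Shutdown closes the listeners instead).

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

// drainGate tracks whether the server has started shutting down.
type drainGate struct{ draining int32 }

func (g *drainGate) startDraining() { atomic.StoreInt32(&g.draining, 1) }

// readiness responds 200 while serving and 503 once draining has begun.
func (g *drainGate) readiness(rw http.ResponseWriter, _ *http.Request) {
	if atomic.LoadInt32(&g.draining) == 1 {
		rw.WriteHeader(http.StatusServiceUnavailable)
		return
	}
	rw.WriteHeader(http.StatusOK)
}

// readinessStatus returns the status code the readiness handler reports.
func readinessStatus(g *drainGate) int {
	rec := httptest.NewRecorder()
	g.readiness(rec, httptest.NewRequest(http.MethodGet, "/status/readiness", nil))
	return rec.Code
}

// serveUntilDone runs srv until ctx is done, then flips readiness and drains
// in-flight requests for up to drainTimeout.
func serveUntilDone(ctx context.Context, srv *http.Server, g *drainGate, drainTimeout time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh <- srv.ListenAndServe() }()
	select {
	case err := <-errCh:
		return err
	case <-ctx.Done():
	}
	g.startDraining()
	shutdownCtx, cancel := context.WithTimeout(context.Background(), drainTimeout)
	defer cancel()
	return srv.Shutdown(shutdownCtx)
}

func main() {
	// SIGTERM cancels sigCtx. The extra timeout exists only so that this demo
	// terminates on its own when no signal is sent.
	sigCtx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM)
	defer stop()
	ctx, cancel := context.WithTimeout(sigCtx, 200*time.Millisecond)
	defer cancel()

	gate := &drainGate{}
	mux := http.NewServeMux()
	mux.HandleFunc("/status/readiness", gate.readiness)
	srv := &http.Server{Addr: "127.0.0.1:0", Handler: mux}

	err := serveUntilDone(ctx, srv, gate, 5*time.Second)
	fmt.Println("shutdown error:", err, "readiness after shutdown:", readinessStatus(gate))
}
```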

Provide mechanism to share logger output locations

witchcraft-go-server has a notion of logging and is opinionated on output -- it decides on the io.Writer to use for writing logs based on factors such as configuration (useConsoleLog) and environment (isDocker or isJail), and when it performs file-based logging it does so using a lumberjack.Logger.

In some instances, a program may want to perform logging before the witchcraft.Server is created or started, but still wants to log in the same manner that the server would.

In order to do so, I propose the following:

Define a new FileWriterProvider interface that, given a path, returns an io.Writer that writes to that path
Define a LumberjackFileWriterProvider implementation that returns a lumberjack.Logger as it is currently defined in newDefaultLogOutput
Define a CreateLogWriter(logOutputPath string, logToStdout bool, stdoutWriter io.Writer, fileWriterProvider FileWriterProvider) io.Writer function
Add a WithFileWriterProvider(FileWriterProvider) function to the builder for witchcraft.Server (if not specified, it will use the LumberjackFileWriterProvider by default)

These primitives provide a general way to do a few things:

It makes it possible to configure the file output writer in code. Right now, a file writer is hard-coded to be a lumberjack.Logger with a specific set of parameters. This change would allow file writers to be customized at a code level.
With the above, a client can provide an implementation of FileWriterProvider that caches results and returns the same instance of a *lumberjack.Logger for a given path
- This means that multiple loggers can be instantiated with the same lumberjack logger, and will thus not need to worry about overwriting each other
With the above, a client can use CreateLogWriter to get the io.Writer that is appropriate for their environment/setup (log to stdout based on environment, etc.)

The only real downside I see is that this does expose a vector for customizing the file-based log output location that did not previously exist. However, this configuration point is purely in code (not in end-user configuration), so I don't think it's a huge risk.

If we really wanted to be stringent about this we could modify the FileWriterProvider API to be more targeted (for example, only allow it to return a lumberjack.Logger and always override portions of that config after it is returned), but I think that's overkill and is over-fitting the specific problem trying to be solved here.
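To illustrate the caching idea, here is a small sketch of the proposed interface with a caching implementation layered over another provider. The method name FileWriter, the cachingProvider type and the in-memory countingProvider (which stands in for a lumberjack-backed provider) are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// FileWriterProvider returns an io.Writer that writes to the given path.
// The method name is an assumption; the proposal only names the interface.
type FileWriterProvider interface {
	FileWriter(path string) io.Writer
}

// countingProvider is an in-memory stand-in for a lumberjack-backed provider
// that counts how many writers it has created.
type countingProvider struct{ created int }

func (p *countingProvider) FileWriter(path string) io.Writer {
	p.created++
	return &bytes.Buffer{}
}

// cachingProvider returns the same writer instance for a given path, so that
// multiple loggers targeting one file share a single writer.
type cachingProvider struct {
	mu       sync.Mutex
	delegate FileWriterProvider
	writers  map[string]io.Writer
}

func newCachingProvider(delegate FileWriterProvider) *cachingProvider {
	return &cachingProvider{delegate: delegate, writers: map[string]io.Writer{}}
}

func (p *cachingProvider) FileWriter(path string) io.Writer {
	p.mu.Lock()
	defer p.mu.Unlock()
	if w, ok := p.writers[path]; ok {
		return w
	}
	w := p.delegate.FileWriter(path)
	p.writers[path] = w
	return w
}

func main() {
	base := &countingProvider{}
	cached := newCachingProvider(base)
	first := cached.FileWriter("var/log/service.log")
	second := cached.FileWriter("var/log/service.log")
	// Both calls return the same writer and the delegate is asked only once.
	fmt.Println(first == second, base.created)
}
```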

Add a default health check for a sliding window of events

For certain controllers one extremely common pattern for health checks is as follows:
The controller registers some events (example: OnSuccess, OnAttempt, OnFailure) and the health check is a function of all the events received within the last x minutes. In other words, there is a sliding window of size x, ending at the current time, over the events that we care about.
Since we have 5 controllers using this pattern, we should have a first class API to deal with this.
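One possible shape for such an API is sketched below. The slidingWindow type and its methods are hypothetical, and the health policy (unhealthy if any failure falls inside the window) is just one example of a function over the windowed events.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type event struct {
	at     time.Time
	failed bool
}

// slidingWindow keeps the events that fall within the last `window` of time.
type slidingWindow struct {
	mu     sync.Mutex
	window time.Duration
	events []event // ordered by time, oldest first
}

func newSlidingWindow(window time.Duration) *slidingWindow {
	return &slidingWindow{window: window}
}

// prune drops events older than the window that ends at now. Callers hold mu.
func (s *slidingWindow) prune(now time.Time) {
	cutoff := now.Add(-s.window)
	i := 0
	for i < len(s.events) && s.events[i].at.Before(cutoff) {
		i++
	}
	s.events = s.events[i:]
}

// record adds an event; events are assumed to arrive in time order.
func (s *slidingWindow) record(at time.Time, failed bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.events = append(s.events, event{at: at, failed: failed})
	s.prune(at)
}

// healthy reports whether no failure occurred in the window ending at now.
func (s *slidingWindow) healthy(now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.prune(now)
	for _, e := range s.events {
		if e.failed {
			return false
		}
	}
	return true
}

// demo records a failure and a later success, then evaluates health while the
// failure is still inside the window and again after it has aged out.
func demo() (duringWindow, afterWindow bool) {
	start := time.Unix(0, 0)
	w := newSlidingWindow(5 * time.Minute)
	w.record(start, true)
	w.record(start.Add(time.Minute), false)
	return w.healthy(start.Add(2 * time.Minute)), w.healthy(start.Add(10 * time.Minute))
}

func main() {
	fmt.Println(demo()) // prints "false true"
}
```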

Why can't this be called just 'witchcraft-go'?

rest errors do not serialize to JSON

See https://github.com/palantir/witchcraft-go-server/compare/bm/rest-test where we're getting the correct error code but empty responses

Allow disabling metric logging via config

Today, if the metric-emit-frequency is set to 0 we use the default of 60s. This is different from trace-sample-rate, which disables logging when set to a value <= 0.

It would be nice to be able to disable metrics via config alone.

Additional logging during server shutdown

Logging during server shutdown is currently relatively sparse. Additional logging during shutdown can provide context helpful in debugging issues both within the given service as well as dependencies of the given service.

Liveness endpoint should return non-empty JSON object

What happened?

Java witchcraft has diverged from the SLS spec and returns a simple "alive" for the liveness endpoint. Consuming services now expect that, while witchcraft-go returns an empty JSON object, which results in some consuming services not considering witchcraft-go services "live"

What did you want to happen?

Witchcraft-go should follow the lead of Java witchcraft and return a non-empty JSON response on the liveness endpoint

(feel free to reach out internally for more context - Lilliput#360)

Change log level of health check content change w/o status change to DEBUG

Services whose health check contents change frequently (particularly when the check is healthy) produce noisy INFO-level logs with these changes. We should consider either emitting the log line at the DEBUG level, or emitting the log line at the DEBUG level if the check is healthy.

See https://github.com/palantir/witchcraft-go-server/blob/develop/status/status.go#L152

Panics in goroutines spawned from `Start` should be logged as errors

Any panics that occur in the Start() function are logged as errors per #10.

However, panics that occur from goroutines started in Start() (notably, goroutine for watching configuration file reloads and goroutine for running management server) are not logged in the same way. This should be fixed.
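One way to approach this is a small helper that every goroutine launched from Start goes through. The sketch below is an illustration only: goLogged is a hypothetical name, logErr stands in for the service logger, and whether the panic should be re-raised after logging is left open.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// goLogged runs fn in a new goroutine and reports any panic, together with
// its stack trace, to logErr. The returned channel is closed when fn is done.
func goLogged(name string, logErr func(msg string), fn func()) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		defer func() {
			if r := recover(); r != nil {
				logErr(fmt.Sprintf("panic in goroutine %q: %v\n%s", name, r, debug.Stack()))
			}
		}()
		fn()
	}()
	return done
}

func main() {
	done := goLogged("config-watcher", func(msg string) {
		fmt.Println("ERROR:", msg)
	}, func() {
		panic("boom")
	})
	<-done
}
```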

Opt-in diagnostic log upon turning unhealthy

Right now, triggering diagnostic logs is a manual process (usually involving getting on the host where the process is running and then pkill -3 <process-name>). The diagnostic log is also time-sensitive (i.e. I want it as soon as I go unhealthy instead of after whatever period of time it takes me to run that pkill command). It'd be nice if we could expose optionality for outputting a diagnostic log as soon as a process's health check returns an error state.

@bmoylan @nmiyake for thoughts
I am happy to tackle implementation if we want to go forward here

Metric loggers should emit 0 value before regular value for certain metric types

What happened?

For certain metric types (specifically, "meter", "timer", and "histogram"), an important consumption model is to monitor when a value has changed. However, because metrics are currently only emitted when they are recorded and a zero value is not explicitly emitted, the first emitted value cannot be analyzed in this manner. For example, if a metric records a value of "1", because the value of "0" was never recorded, the type of analysis described above cannot be performed.

What did you want to happen?

When a metric of one of these types ("meter", "timer" or "histogram") is first emitted, a zero-value entry for the same metric (where a metric is identified by the name, type and set of tags) should be emitted first. This will ensure that a diff can be observed.
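The requested behavior could be sketched roughly as follows. zeroFirstEmitter, metricID and the lines slice (standing in for the metric log) are hypothetical and do not reflect how the metric logger is actually structured.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// needsZero lists the metric types that should get a leading zero value.
var needsZero = map[string]bool{"meter": true, "timer": true, "histogram": true}

// zeroFirstEmitter is a hypothetical emitter; lines stands in for the metric log.
type zeroFirstEmitter struct {
	seen  map[string]bool
	lines []string
}

// metricID identifies a metric by its name, type and sorted set of tags.
func metricID(name, typ string, tags map[string]string) string {
	pairs := make([]string, 0, len(tags))
	for k, v := range tags {
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs)
	return name + "|" + typ + "|" + strings.Join(pairs, ",")
}

// emit writes a zero-value entry before the first real value of a metric
// whose type is listed in needsZero, then writes the value itself.
func (e *zeroFirstEmitter) emit(name, typ string, tags map[string]string, value float64) {
	id := metricID(name, typ, tags)
	if e.seen == nil {
		e.seen = map[string]bool{}
	}
	if needsZero[typ] && !e.seen[id] {
		e.lines = append(e.lines, fmt.Sprintf("%s %v", id, 0.0))
	}
	e.seen[id] = true
	e.lines = append(e.lines, fmt.Sprintf("%s %v", id, value))
}

func main() {
	e := &zeroFirstEmitter{}
	e.emit("server.response", "timer", map[string]string{"path": "/myNum"}, 1)
	e.emit("server.response", "timer", map[string]string{"path": "/myNum"}, 2)
	for _, line := range e.lines {
		fmt.Println(line)
	}
}
```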

server.Start() should svc1log any errors it encounters

server.Start is the de-facto function for application lifecycle. It returns an error that it expects its caller to deal with, but the caller does not have access to the service.1 logger instantiated by Witchcraft. Start should log any errors it encounters before returning (it should still return the error).

In the case that we error before constructing the "real" service logger, we should use the sane defaults that are applied if no configuration is provided.

cc @nmiyake

Periodic health check logic uses time that check function started rather than completed

The resultsWithTimes slice in the doPoll function (status/health/periodic/source.go:132) should record the result of the check with the time the check completed. The logic is currently written as follows:

		resultsWithTimes = append(resultsWithTimes, resultWithTime{
			time:   time.Now(),
			result: check(ctx),
		})

This is incorrect because time.Now() is executed before check(ctx) (because the field expressions of a composite literal are evaluated in order), so the time that is recorded is effectively when the check function was invoked, when semantically it should record when the check function finished.
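A fix along these lines is to run the check first and take the timestamp afterwards. The sketch below uses a simplified resultWithTime with a string result and a hypothetical runCheck helper; the real struct holds a health check result and the check takes a context.

```go
package main

import (
	"fmt"
	"time"
)

// resultWithTime is a simplified stand-in for the struct in the periodic
// health check source.
type resultWithTime struct {
	time   time.Time
	result string
}

// runCheck evaluates the check before reading the clock, so the recorded time
// is the time at which the check completed.
func runCheck(check func() string) resultWithTime {
	result := check()
	return resultWithTime{
		time:   time.Now(),
		result: result,
	}
}

func main() {
	started := time.Now()
	r := runCheck(func() string {
		time.Sleep(20 * time.Millisecond) // simulate a slow check
		return "healthy"
	})
	fmt.Println(r.result, "recorded at least 20ms after start:", r.time.Sub(started) >= 20*time.Millisecond)
}
```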

Reduce default trace rate to 1%

What happened?

Currently, the default trace rate for the tracer on the main server context is 100%. This is rarely the right tracing rate, and accidentally using it can have a significant fan out effect downstream.

What did you want to happen?

We should reduce the default trace rate to 0.01 (1%), which is a reasonable default for most services. We could do this either as a default in the Install config yaml annotations or by changing the fallback sampler to one with 1% sample rate here.

Management server is slow to start

The witchcraft management server sometimes takes a fairly long time (up to 30 seconds) to start after the server start command is issued.

DEBUG [2019-01-22T21:31:07.34280755Z] First logline
INFO  [2019-01-22T21:31:16.742606779Z] Listening to https (address: 0.0.0.0:5559, server: management)
[2019-01-22T21:31:19.440819761Z] "GET /status/liveness HTTP/2.0" 503 3 113
[2019-01-22T21:31:23.340134411Z] "GET /status/readiness HTTP/2.0" 503 3 77

We're initializing a fairly standard server:

return witchcraft.NewServer().
    WithSelfSignedCertificate().
    WithInstallConfigType(config.Install{}).
    WithInstallConfig(installConfig).
    WithRuntimeConfigType(config.Runtime{}).
    WithRuntimeConfigFromFile(viper.GetString(runtimeConfigFlag)).
    WithECVKeyProvider(witchcraft.ECVKeyNoOp()).
    WithOrigin(svc1log.CallerPkg(0, 1)).
    WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (cleanup func(), rErr error) {
        // init
    }).Start()

Ability to shutdown server with error

Feature Request

A common pattern is to run background tasks (goroutines) from an init server function (e.g. a controller). When these background tasks fail (consider the case where a programmer bug causes a panic), there is currently no way to shut the server down so that Start returns an error.

It would be nice to write an init func something like

func main() {
    err := startServer()
    if err != nil {
        os.Exit(1)
    }
    os.Exit(0)
}

func startServer() error {
	return witchcraft.NewServer().
		WithInitFunc(initFunc).
		Start()
}

func initFunc(ctx context.Context, info witchcraft.InitInfo) (cleanup func(), rErr error) {
	go func() {
		err := doBackgroundTask(ctx)

		// ... log / handle error

		if err != nil {
			err := info.ShutdownServerWithError(ctx, err)

			// ... handle err

		} else {
			info.ShutdownServer(ctx)
		}
	}()
	return nil, doRun(ctx, conf)
}

And have server.Start() actually return an error as desired.

Strawman proposal would be to add a new method to InitInfo:

	// ShutdownServerWithError closes the server, waiting for any in-flight requests to finish (or the context to be cancelled).
	// The server is shut down with an error, causing `Start` to return an error.
	ShutdownServerWithError func(ctx context.Context, err error) error

I'd be happy to implement this if this sounds good.

TestNewInflightLimitMiddleware test flakes

The TestNewInflightLimitMiddleware test appears to fail pretty consistently (although not 100%) in CI. Example:

https://circleci.com/gh/palantir/witchcraft-go-server/1022

Sample output:

=== RUN   TestNewInflightLimitMiddleware
--- FAIL: TestNewInflightLimitMiddleware (60.21s)
    ratelimit_test.go:101: 
        	Error Trace:	ratelimit_test.go:101
        	            				ratelimit_test.go:117
        	Error:      	Received unexpected error:
        	            	Post https://localhost:43351/example/post: context deadline exceeded
        	Test:       	TestNewInflightLimitMiddleware
    ratelimit_test.go:101: 
        	Error Trace:	ratelimit_test.go:101
        	            				ratelimit_test.go:112
        	            				asm_amd64.s:1333
        	Error:      	Received unexpected error:
        	            	Post https://localhost:43351/example/post: context deadline exceeded
        	Test:       	TestNewInflightLimitMiddleware
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x84cadd]

goroutine 221 [running]:
testing.tRunner.func1(0xc0003f6100)
	/usr/local/go/src/testing/testing.go:792 +0x387
panic(0x8dd720, 0xd74e20)
	/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/palantir/witchcraft-go-server/integration.TestNewInflightLimitMiddleware(0xc0003f6100)
	/go/src/github.com/palantir/witchcraft-go-server/integration/ratelimit_test.go:118 +0x72d
testing.tRunner(0xc0003f6100, 0x991ea0)
	/usr/local/go/src/testing/testing.go:827 +0xbf
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:878 +0x35c
FAIL	github.com/palantir/witchcraft-go-server/integration                                        	61.522s

Do not require ECV key if there are no encrypted values

Library equivalent to Java's ManagedScheduledTask

What happened?

For our java witchcraft services that use witchcraft-core (sadly cannot link here as this is internal) we have a library and builtin support for ManagedScheduledTasks. This makes it trivial to add background jobs to any (java) witchcraft service where you want some runtime config field to control that job frequency (or even disable it entirely). The library takes care of starting/stopping the job when witchcraft starts/stops and correctly updates the job frequency in response to live changes to the runtime config.

What did you want to happen?

Can we get a go version of this so that I don't need to implement this from scratch myself? Please see internal issue deployability/infra-projects issue 79 for more details.

Missing pprof routes

What happened?

Several pprof routes are missing and 404 from the pprof index. This code fixes it:

	pprofs := []string{"goroutine", "block", "allocs", "mutex", "threadcreate"}
	for _, prof := range pprofs {
		err = info.Router.Get("/debug/pprof/"+prof, pprof.Handler(prof))
		if err != nil {
			return nil, fmt.Errorf("register %v route: %w", prof, err)
		}
	}

The reason is that the witchcraft router does not support handlers accepting subpaths, and the net/http/pprof Index handler tries to automatically handle these routes.

What did you want to happen?

RouteHandlerMiddleware unwieldy for endpoint-specific handling

Existing method signature:

type RouteHandlerMiddleware func(rw http.ResponseWriter, r *http.Request, reqVals RequestVals, next RouteRequestHandler)

When adding RouteHandlerMiddleware, the RouteSpec within RequestVals is difficult to use safely given the path template of a RouteSpec is a construction of:

the server's configured context path
other subrouter paths
the path of the endpoint of interest

Upon my first attempt at writing a RouteHandlerMiddleware, ensuring I matched a specific endpoint proved challenging - I had to choose between either taking an implicit assumption on witchcraft-go-server's usage of the context-path configuration point (I guess this is a pretty safe assumption) or just matching the path using strings.HasSuffix. I am unsure if there's something a little more ergonomic to provide users here.

Separate issue: when used in conjunction with conjure-go generated server code, an additional snag is having to copy the string literal for the route from the generated server code. Perhaps a separate ergonomics improvement there is defining those string literal paths as exported constants in conjure-go-generated code

404s do not produce request logs

When a request path does not match a registered route, RouteHandlerMiddlewares do not run. This means that request logs are not emitted for the request and it is not included in server.response metrics.

As part of fixing this, we should also return a conjure NotFound error body rather than using the default text response provided by the upstream router implementations. I've opened a draft change at #187 for this.

Improved panic logging

What happened?

I was investigating an error in a service which ended up being a failed type coercion that caused a panic. I was surprised that the panic itself was not logged - I only discovered it because the server response included a panic message.

What did you want to happen?

Panics should be logged in recovery, ideally with a relevant traceId. This would have made debugging the case above trivial - the panics would have been discoverable in the logs generally, and drilling down into a specific traceId would have given the full picture of what happened during the request.

HealthReporter should support component removal

Currently, the HealthReporter API only allows for initializing and getting HealthComponents. Sometimes, pieces of an application that we want to measure health for can be live-reloaded away, so HealthReporter should expose a way to unregister the HealthComponents associated with those pieces. Something roughly along the lines of:

func UnregisterHealthComponent(name string) error {
    // Acquire lock
    // Remove from both healthComponents and currentStatus.Checks
}
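
Fleshed out slightly, the idea could look like the sketch below. The reporter type, its fields, and its method names are hypothetical simplifications and do not mirror the real HealthReporter implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// reporter is a simplified, hypothetical stand-in for a health reporter.
type reporter struct {
	mu         sync.Mutex
	components map[string]struct{} // registered component names
	checks     map[string]string   // latest reported state per component
}

func newReporter() *reporter {
	return &reporter{
		components: make(map[string]struct{}),
		checks:     make(map[string]string),
	}
}

// Initialize registers a component and marks it healthy.
func (r *reporter) Initialize(name string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.components[name]; ok {
		return fmt.Errorf("health component %q already initialized", name)
	}
	r.components[name] = struct{}{}
	r.checks[name] = "HEALTHY"
	return nil
}

// Unregister removes a component from both maps under the same lock so that
// readers never observe a status for a component that no longer exists.
func (r *reporter) Unregister(name string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.components[name]; !ok {
		return fmt.Errorf("health component %q is not registered", name)
	}
	delete(r.components, name)
	delete(r.checks, name)
	return nil
}

func main() {
	r := newReporter()
	fmt.Println(r.Initialize("plugin-a"))
	fmt.Println(r.Unregister("plugin-a"))
	fmt.Println(r.Unregister("plugin-a")) // second removal reports an error
}
```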

Runtime config refreshable produces zero value if cannot parse yaml

If var/conf/runtime.yml starts as valid then is changed to something that cannot be parsed as YAML into the specified RuntimeConfigType, the runtime config refreshable provided in the InitFunc starts returning a zero value runtime config type.

Instead, it should return the previous value from the version of the file that was parseable.

Additionally, it might be worth adding a health check that verifies the current version of the config on disk is parseable and in use.

Here's a program that reproduces this behavior:

package main

import (
	"context"
	"fmt"
	"github.com/palantir/witchcraft-go-server/witchcraft"
	"io/ioutil"
	"os"
	"time"
)

type RuntimeConfig struct {
	Foo int
}
/*
This program starts a simple witchcraft server that prints the value of its runtime config's Foo field every second.

It starts with a valid config, then after three seconds changes the file on disk to be an invalid config, after which
the runtime config refreshable produces a zero value for the field rather than the previous valid value, which seems
wrong.

	$ go run main.go
	{"level":"INFO","time":"2019-09-09T19:19:39.165365Z","message":"Listening to https","type":"service.1","origin":"github.palantir.build/jhenke/repro-type-bug","params":{"address":"127.0.0.1:0","server":""}}
	Foo 42
	Foo 42
	Foo 42
	Wrote new config file
	{"level":"ERROR","time":"2019-09-09T19:19:42.100559Z","message":"Failed to unmarshal runtime configuration","type":"service.1","origin":"github.palantir.build/jhenke/repro-type-bug","stacktrace":"yaml: unmarshal errors:\n  line 1: cannot unmarshal !!str `some-ki...` into int"}
	Foo 0
	Foo 0
	Foo 0
	exit status 1
 */
func main() {
	if err := os.MkdirAll("var/conf", 0755); err != nil {
		panic(err)
	}
	if err := ioutil.WriteFile("var/conf/install.yml", []byte("use-console-log: true\nserver:\n  address: 127.0.0.1\n"), 0644); err != nil {
		panic(err)
	}
	if err := ioutil.WriteFile("var/conf/runtime.yml", []byte("foo: 42\n"), 0644); err != nil {
		panic(err)
	}
	go func() {
		time.Sleep(3 * time.Second)
		if err := ioutil.WriteFile("var/conf/runtime.yml", []byte("foo: some-kinda-text\n"), 0644); err != nil {
			panic(err)
		}
		fmt.Println("Wrote new config file")
		time.Sleep(3 * time.Second)
		os.Exit(1)
	}()
	if err := witchcraft.NewServer().
		WithSelfSignedCertificate().
		WithRuntimeConfigType(RuntimeConfig{}).
		WithInitFunc(func(ctx context.Context, info witchcraft.InitInfo) (cleanup func(), rErr error) {
			go func() {
				for range time.Tick(1 * time.Second) {
					fmt.Println("Foo", info.RuntimeConfig.Current().(RuntimeConfig).Foo)
				}
			}()
			return nil, nil
		}).Start(); err != nil {
		panic(err)
	}
}
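
The requested behavior (keep the last value that parsed) can be sketched as follows. This is an illustration under assumptions: the lastGood type is hypothetical, and encoding/json stands in for the YAML parser so that the example needs only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

type RuntimeConfig struct {
	Foo int `json:"foo"`
}

// lastGood holds the most recent configuration that parsed successfully.
type lastGood struct {
	mu      sync.RWMutex
	current RuntimeConfig
}

// Update parses raw and swaps the result in only on success. On failure the
// previous value is kept and the parse error is returned to the caller.
func (l *lastGood) Update(raw []byte) error {
	var next RuntimeConfig
	if err := json.Unmarshal(raw, &next); err != nil {
		return fmt.Errorf("keeping previous runtime config: %w", err)
	}
	l.mu.Lock()
	l.current = next
	l.mu.Unlock()
	return nil
}

// Current returns the last successfully parsed configuration.
func (l *lastGood) Current() RuntimeConfig {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.current
}

func main() {
	var cfg lastGood
	fmt.Println(cfg.Update([]byte(`{"foo": 42}`)))
	fmt.Println(cfg.Update([]byte(`{"foo": "some-kinda-text"}`)))
	fmt.Println("Foo", cfg.Current().Foo)
}
```

The error returned by Update is also a natural input for the health check suggested above, since it records that the file on disk is not the configuration in use.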

Optional middleware for rejecting requests when unhealthy

This could give clients more helpful errors if we know an endpoint will fail

cc @gdearment

Periodic health check does not behave as documented

The polling health check source does not behave as documented.

Based on the documentation (and from what I can gather from the code/messages), if a health check returns "healthy", then even if it returns a non-healthy state after that, the overall health should still be reported as "healthy" as long as the last "healthy" status was within the grace period.

As currently implemented, if a non-success status is returned after a success status, the "last success" status is not preserved, and the check incorrectly reports that there were no successful checks during the grace period.

Separate out trace sampling for application routes and management routes

Right now these use the same trace sampling configuration and tracer.
These should be broken out into their own tracers and made configurable.

connection metrics

It would be useful if witchcraft-go-server came with metrics that would describe details around the number of active connections, failed connections, etc. This would prove useful when interacting with services that establish a lot of connections with proxy services like envoy.

Periodic health check source could use health check source internally

I've got a use case for a health check that I would like to have refresh periodically in some background thread, but have the actual endpoint use some cached value of this check. I saw that there's a periodic package that almost does what I want. I'm wondering if it would make sense to update this package to take a status.HealthCheckSource instead of the current poll function? This would give people control over the parameters fields of the health check result, which can be useful for supplementing the health state.

Time-based log rotation

Some log archival tools only operate on rotated/compressed logfiles. For services which do not log very often, it can take weeks/months to reach the 1GB limit to trigger a rotation.

FR: Allow configuring a maximum logfile age before rotation. This likely requires changing Lumberjack. See some previous discussion on natefinch/lumberjack#17 and natefinch/lumberjack#54.

If we do not think it will be straightforward to change Lumberjack, natefinch/lumberjack#17 proposes the following goroutine to force rotations on a periodic schedule:

log := lumberjack.Logger{ /* some config */ }
go func() {
    for {
        <-time.After(time.Hour*24)
        log.Rotate()
    }
}()

This will not preclude size-based rotations, and if the timer fires soon after a size-based rotation, you may end up with weirdly small files.

Logger runtime configuration reload should only watch logger configuration

Run server that uses runtime configuration with logger configuration
Modify the base runtime configuration in a way that does not change the logger configuration (for example, add or modify the health-checks: shared-secret value)

Expected

Logic that reloads logger configuration should not run

Actual

Logic that reloads logger configuration runs (even though logger configuration has not changed)
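
One way to get the expected behavior is to compare only the logger portion of each new configuration against the last one seen. The types and field names below are hypothetical simplifications, not the witchcraft configuration structs.

```go
package main

import (
	"fmt"
	"reflect"
)

// LoggerConfig and RuntimeConfig are simplified, hypothetical shapes.
type LoggerConfig struct {
	Level string
}

type RuntimeConfig struct {
	Logging      LoggerConfig
	SharedSecret string
}

// loggerWatcher invokes onChange only when the logger portion of the runtime
// configuration differs from the last value it saw.
type loggerWatcher struct {
	last     LoggerConfig
	onChange func(LoggerConfig)
}

// Update should be called on every runtime configuration refresh.
func (w *loggerWatcher) Update(cfg RuntimeConfig) {
	// DeepEqual also handles configs that contain maps or slices.
	if reflect.DeepEqual(w.last, cfg.Logging) {
		return
	}
	w.last = cfg.Logging
	w.onChange(cfg.Logging)
}

func main() {
	reloads := 0
	w := &loggerWatcher{onChange: func(LoggerConfig) { reloads++ }}

	w.Update(RuntimeConfig{Logging: LoggerConfig{Level: "info"}})
	// Only the shared secret changes here, so no logger reload should occur.
	w.Update(RuntimeConfig{Logging: LoggerConfig{Level: "info"}, SharedSecret: "changed"})
	fmt.Println("reloads:", reloads)
}
```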

Custom yaml unmarshaling (e.g. defaults)

witchcraft calls yaml.Unmarshal for you, but this makes custom handling like loading defaults difficult.
