
pilot's Introduction

Development has moved to istio/istio

Istio Pilot


Istio Pilot provides platform-independent service discovery and exposes an interface for configuring rich L7 routing features such as label-based routing across multiple service versions, fault injection, timeouts, retries, and circuit breakers. It translates these configurations into sidecar-specific configuration and dynamically reconfigures the sidecars in the service mesh data plane. Platform-specific eccentricities are abstracted away, and a simplified service discovery interface is presented to the sidecars based on the Envoy data plane API.

Please see Istio's traffic management concepts to learn more about the design of Pilot and the capabilities it provides.

Istio Pilot design gives an architectural overview of its internal components - cluster platform abstractions, service model, and the proxy controllers.

To learn how you can contribute to Istio Pilot, please see the Istio contribution guidelines.

Quick start

  1. Install Bazel: Bazel 0.6.1 or higher. Debian packages are available on Linux. For OS X users, Bazel is available via Homebrew.

NOTE 1: Bazel is still maturing and has several issues that make development hard. While setting up Bazel is mostly smooth, it is common to see cryptic errors that prevent you from getting started. Deleting everything and starting over generally helps.

NOTE 2: If you are developing for the Kubernetes platform, you need access to a working Kubernetes cluster for the end-to-end integration tests.

  2. Setup: Run make setup. It installs the required tools and vendors the dependencies.

  3. Write code using your favorite IDE. Before submitting a PR, make sure to format the code and run it through the Go linters: make fmt formats the code, and make lint runs the linters defined in bin/check.sh.

    If you add any new source files or import new packages in existing code, make sure to run make gazelle to update the Bazel BUILD files.

  4. Build: Run make build to compile the code.

  5. Unit test: Run make test to run unit tests.

    NOTE: If you are running on OS X, //proxy/envoy:go_default_test will fail. You can ignore this failure.

  6. Dockerize: Run make docker HUB=docker.io/<username> TAG=<sometag>. This builds Docker images for Pilot, the sidecar, and other utilities.

  7. Integration test: Run make e2etest HUB=docker.io/<username> TAG=<sometag> with the same image hub and tag you used in the Dockerize step. This runs end-to-end integration tests on Kubernetes.

Detailed instructions for testing are available here.


pilot's Issues

connection errors with loopback http2 upgrade

PR #100 fails in "a" -> "a" case:

kubectl exec -it a-3708111894-q8dx0 client http://a/a
-- error
kubectl exec -it a-3708111894-q8dx0 client http://b/a
-- works fine

Here is the debug log from Envoy:

[2017-02-08 18:06:58.109][17][info][main] [C16] new connection
[2017-02-08 18:06:58.110][17][info][client] [C17] connecting
[2017-02-08 18:06:58.111][17][info][client] [C17] protocol error: The user callback function failed
[2017-02-08T18:06:58.110Z] "GET /a HTTP/1.1" 503 UC 0 57 2 - "-" "Go-http-client/1.1" "06c8f613-0618-408f-9a62-230286b91d2c" "a" "tcp://10.0.0.114:8080"

Data race in platform/kube/kube_test.go

See the snippet below for the output of the Go race detector. CreateRESTConfig modifies api.Scheme, a global variable that is updated and read (indirectly) by all tests in the kube package.

For a production system this isn't an issue, since we only create one client/controller per binary, but the global access does trip the race detector. The short-term fix is to exclude kube_test.go from the race detection check; a sketch of one possible longer-term guard follows the trace below.

$ go test -race istio.io/manager/platform/kube -v
=== RUN   TestCamelKabob
--- PASS: TestCamelKabob (0.00s)
=== RUN   TestConvertProtocol
--- PASS: TestConvertProtocol (0.00s)
=== RUN   TestDecodeIngressRuleName
--- PASS: TestDecodeIngressRuleName (0.00s)
=== RUN   TestIsRegularExpression
--- PASS: TestIsRegularExpression (0.00s)
=== RUN   TestThirdPartyResourcesClient
--- PASS: TestThirdPartyResourcesClient (1.88s)
=== RUN   TestController
--- PASS: TestController (2.07s)
=== RUN   TestControllerCacheFreshness
--- PASS: TestControllerCacheFreshness (0.34s)
=== RUN   TestControllerClientSync
--- PASS: TestControllerClientSync (2.88s)
=== RUN   TestServices
==================
WARNING: DATA RACE
Write at 0x00c4202868a0 by goroutine 39:
  runtime.mapassign1()
      /usr/local/go1.7.4/src/runtime/hashmap.go:442 +0x0
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime.(*Scheme).AddKnownTypes()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/scheme.go:180 +0x2e8
  istio.io/manager/platform/kube.CreateRESTConfig.func1()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/platform/kube/client.go:94 +0x204
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime.(*SchemeBuilder).AddToScheme()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/scheme_builder.go:29 +0x83
  istio.io/manager/platform/kube.CreateRESTConfig()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/platform/kube/client.go:107 +0x361
  istio.io/manager/platform/kube.NewClient()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/platform/kube/client.go:115 +0x49
  istio.io/manager/platform/kube.makeClient()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/platform/kube/kube_test.go:341 +0x381
  istio.io/manager/platform/kube.TestServices()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/platform/kube/kube_test.go:213 +0x86
  testing.tRunner()
      /usr/local/go1.7.4/src/testing/testing.go:610 +0xc9

Previous read at 0x00c4202868a0 by goroutine 38:
  runtime.mapaccess2()
      /usr/local/go1.7.4/src/runtime/hashmap.go:326 +0x0
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime.(*Scheme).New()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/scheme.go:276 +0xa6
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime.UseOrCreateObject()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/codec.go:111 +0x22d
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/serializer/json.(*Serializer).Decode()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/serializer/json/json.go:153 +0x837
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/serializer/versioning.DirectDecoder.Decode()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/serializer/versioning/versioning.go:266 +0xa7
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/serializer/versioning.(*DirectDecoder).Decode()
      <autogenerated>:1 +0xee
  istio.io/manager/vendor/k8s.io/client-go/pkg/runtime.Decode()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/runtime/codec.go:54 +0x9b
  istio.io/manager/vendor/k8s.io/client-go/pkg/watch/versioned.(*Decoder).Decode()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/watch/versioned/decoder.go:61 +0x464
  istio.io/manager/vendor/k8s.io/client-go/pkg/watch.(*StreamWatcher).receive()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/watch/streamwatcher.go:93 +0x139

Goroutine 39 (running) created at:
  testing.(*T).Run()
      /usr/local/go1.7.4/src/testing/testing.go:646 +0x52f
  testing.RunTests.func1()
      /usr/local/go1.7.4/src/testing/testing.go:793 +0xb9
  testing.tRunner()
      /usr/local/go1.7.4/src/testing/testing.go:610 +0xc9
  testing.RunTests()
      /usr/local/go1.7.4/src/testing/testing.go:799 +0x4ba
  testing.(*M).Run()
      /usr/local/go1.7.4/src/testing/testing.go:743 +0x12f
  main.main()
      istio.io/manager/platform/kube/_test/_testmain.go:76 +0x1b8

Goroutine 38 (running) created at:
  istio.io/manager/vendor/k8s.io/client-go/pkg/watch.NewStreamWatcher()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/watch/streamwatcher.go:60 +0x128
  istio.io/manager/vendor/k8s.io/client-go/rest.(*Request).Watch()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/rest/request.go:686 +0xbfd
  istio.io/manager/platform/kube.NewController.func10()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/platform/kube/controller.go:121 +0x246
  istio.io/manager/vendor/k8s.io/client-go/tools/cache.(*ListWatch).Watch()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/tools/cache/listwatch.go:96 +0x83
  istio.io/manager/vendor/k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/tools/cache/reflector.go:292 +0x79f
  istio.io/manager/vendor/k8s.io/client-go/tools/cache.(*Reflector).RunUntil.func1()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/tools/cache/reflector.go:198 +0x4a
  istio.io/manager/vendor/k8s.io/client-go/pkg/util/wait.JitterUntil.func1()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/util/wait/wait.go:96 +0x6f
  istio.io/manager/vendor/k8s.io/client-go/pkg/util/wait.JitterUntil()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/util/wait/wait.go:97 +0xbd
  istio.io/manager/vendor/k8s.io/client-go/pkg/util/wait.Until()
      /usr/local/google/home/jasonyoung/work/src/istio.io/manager/vendor/k8s.io/client-go/pkg/util/wait/wait.go:52 +0x5a
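A minimal sketch of one possible longer-term guard, assuming we are willing to serialize the global scheme mutation (the helper name below is hypothetical, not the fix that was adopted):

package kube

import "sync"

// schemeRegistration ensures the global api.Scheme is mutated at most once per
// process, no matter how many tests create clients concurrently.
var schemeRegistration sync.Once

// registerTypesOnce wraps the scheme mutation performed in CreateRESTConfig.
func registerTypesOnce(register func()) {
    schemeRegistration.Do(register)
}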

Investigate TPR revisions and generations in Registry.Put

Kubernetes has three verbs for updating configuration objects: PUT, PATCH, POST.
We need to understand the semantics of each and, if necessary, record etcd revisions as part of the controller cache. Since we convert TPR objects to Protos, revision metadata is lost in the process.

The goal is to specify consistency guarantees of the concurrent Status (from the controller) and Spec (from the operator or another controller) updates.

Cached registry implementation for Kubernetes

The current implementation of the registry makes an individual request to the kube API for every config get/put.
We should use the Kubernetes work queue machinery to cache all third party resources and watch for changes; a sketch of the caching idea follows.
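A minimal sketch of the caching idea, under the assumption that a watch loop (for example a client-go reflector) feeds change notifications into a channel; all type and function names here are hypothetical:

package registry

import "sync"

// Event is a simplified stand-in for a TPR watch notification.
type Event struct {
    Key     string
    Deleted bool
    Spec    []byte
}

// CachedRegistry serves reads from memory and applies watch events as they arrive,
// so Get calls never make a round trip to the kube API server.
type CachedRegistry struct {
    mu    sync.RWMutex
    cache map[string][]byte
}

func NewCachedRegistry() *CachedRegistry {
    return &CachedRegistry{cache: map[string][]byte{}}
}

// Run consumes watch events until the channel closes.
func (r *CachedRegistry) Run(events <-chan Event) {
    for ev := range events {
        r.mu.Lock()
        if ev.Deleted {
            delete(r.cache, ev.Key)
        } else {
            r.cache[ev.Key] = ev.Spec
        }
        r.mu.Unlock()
    }
}

// Get returns the cached spec for a key, if present.
func (r *CachedRegistry) Get(key string) ([]byte, bool) {
    r.mu.RLock()
    defer r.mu.RUnlock()
    spec, ok := r.cache[key]
    return spec, ok
}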

Consider auto-generating golang structs for envoy configuration from JSON schema

Hand-writing and maintaining the Envoy configuration structs (i.e. proxy/envoy/resources.go) is a bit cumbersome. We could auto-generate Go structs with JSON annotations from Envoy's JSON schema definitions using one of the tools listed here. Envoy's configuration is defined as multiple separate JSON schemas, so we would still end up writing some additional generation code even with an existing JSON-schema-to-Go library. This approach may end up being more trouble than it's worth, but it is worth considering; an illustration of the kind of output follows.
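To illustrate, a generator pointed at a fragment of Envoy's cluster schema might emit something like the following (field set abridged and hypothetical; this is not the actual resources.go content):

package resources

// Cluster mirrors a subset of Envoy's v1 cluster JSON schema.
type Cluster struct {
    Name             string `json:"name"`
    Type             string `json:"type"`
    ConnectTimeoutMs int    `json:"connect_timeout_ms"`
    LbType           string `json:"lb_type"`
    Hosts            []Host `json:"hosts,omitempty"`
}

// Host is a single upstream address in the form "tcp://ip:port".
type Host struct {
    URL string `json:"url"`
}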

nil pointer issue in envoy config generation

The proxy manager crashes when an upstream-cluster resource is posted. The error logs are below; a defensive sketch follows the reproduction steps.

ubuntu@ubuntu-xenial:~/go/src/istio.io/manager/test/integration$ kubectl logs gateway-350896508-4j3vp -c proxy
I0209 04:18:50.550543       1 main.go:59] flags: &main.args{kubeconfig:"", namespace:"", client:(*kube.Client)(nil), server:main.serverArgs{sdsPort:8080}, proxy:envoy.MeshConfig{DiscoveryAddress:"manager:8080", MixerAddress:"", ProxyPort:5001, AdminPort:5000, BinaryPath:"/usr/local/bin/envoy", ConfigPath:"/etc/envoy", RuntimePath:"", AccessLogPath:""}}
I0209 04:18:50.592076       1 client.go:149] Resource already exists: "istio-config.istio.io"
I0209 04:18:50.592109       1 client.go:173] Checking for TPR resources
I0209 04:18:50.594859       1 watcher.go:43] host IPs: map[172.17.0.5:true]
I0209 04:18:50.697672       1 controller.go:129] Event add: key "default/helloworld-v2-3915138836-bjd43"
I0209 04:18:50.797679       1 controller.go:129] Event add: key "default/manager-2809908815-sn55n"
I0209 04:18:50.897205       1 controller.go:129] Event add: key "kube-system/kube-addon-manager-minikube"
I0209 04:18:50.996839       1 controller.go:129] Event add: key "kube-system/kube-dns-v20-xj39d"
I0209 04:18:51.096855       1 controller.go:129] Event add: key "kube-system/kubernetes-dashboard-nps8z"
I0209 04:18:51.196828       1 controller.go:129] Event add: key "default/gateway-350896508-4j3vp"
I0209 04:18:51.297333       1 controller.go:129] Event add: key "default/helloworld-v1-3574023954-74fcd"
I0209 04:18:51.397199       1 controller.go:129] Event add: key "kube-system/kubernetes-dashboard"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x63202e]

goroutine 8 [running]:
panic(0x1353f60, 0xc420016040)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-79/execroot/manager/external/io_bazel_rules_go_toolchain/src/runtime/panic.go:500 +0x1a1
istio.io/manager/proxy/envoy.enumerateServiceVersions(0xc4203b9480, 0x6, 0x8, 0xc420460348, 0x1, 0x1, 0x0, 0x0)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-1/execroot/manager/bazel-out/local-fastbuild/bin/proxy/envoy/go_default_library.a.dir/istio.io/manager/proxy/envoy/config.go:589 +0x36e
istio.io/manager/proxy/envoy.Generate(0xc4204600c8, 0x1, 0x1, 0xc4203b9480, 0x6, 0x8, 0xc420460280, 0x1, 0x1, 0xc420460348, ...)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-1/execroot/manager/bazel-out/local-fastbuild/bin/proxy/envoy/go_default_library.a.dir/istio.io/manager/proxy/envoy/config.go:77 +0x84
istio.io/manager/proxy/envoy.(*watcher).reload(0xc4200df2c0)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-1/execroot/manager/bazel-out/local-fastbuild/bin/proxy/envoy/go_default_library.a.dir/istio.io/manager/proxy/envoy/watcher.go:81 +0x1de
istio.io/manager/proxy/envoy.NewWatcher.func1(0xc4200f6af0, 0x1)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-1/execroot/manager/bazel-out/local-fastbuild/bin/proxy/envoy/go_default_library.a.dir/istio.io/manager/proxy/envoy/watcher.go:56 +0x2a
istio.io/manager/platform/kube.(*Controller).AppendServiceHandler.func1(0x14a2440, 0xc420615000, 0x1, 0x0, 0x0)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-77/execroot/manager/bazel-out/local-fastbuild/bin/platform/kube/go_default_library.a.dir/istio.io/manager/platform/kube/controller.go:453 +0x9d
istio.io/manager/platform/kube.(*chainHandler).apply(0xc42017a840, 0x14a2440, 0xc420615000, 0x1, 0xc4203ca2d0, 0x1)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-77/execroot/manager/bazel-out/local-fastbuild/bin/platform/kube/go_default_library.a.dir/istio.io/manager/platform/kube/queue.go:115 +0x6c
istio.io/manager/platform/kube.(*chainHandler).(istio.io/manager/platform/kube.apply)-fm(0x14a2440, 0xc420615000, 0x1, 0x0, 0x0)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-77/execroot/manager/bazel-out/local-fastbuild/bin/platform/kube/go_default_library.a.dir/istio.io/manager/platform/kube/controller.go:150 +0x48
istio.io/manager/platform/kube.(*queueImpl).Run(0xc4201414a0, 0xc4201b4600)
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-77/execroot/manager/bazel-out/local-fastbuild/bin/platform/kube/go_default_library.a.dir/istio.io/manager/platform/kube/queue.go:96 +0x187
created by istio.io/manager/platform/kube.(*Controller).Run
	/home/ubuntu/.cache/bazel/_bazel_ubuntu/becfd49efb6a468239d86aafc0e60745/bazel-sandbox/9a35f287-e3d0-47ee-9fc1-217cd6e26c78-77/execroot/manager/bazel-out/local-fastbuild/bin/platform/kube/go_default_library.a.dir/istio.io/manager/platform/kube/controller.go:210 +0x68

Steps to reproduce: (from the helloworld_test branch)

kubectl create -f test/integration/manager.yaml
kubectl create -f test/integration/helloworld.yaml
##post two upstream resources
cat test/integration/helloworld-default-upstream.yaml | ../../bazel-bin/cmd/manager/manager config put upstream-cluster helloworld-upstreams
##now all pods crash
ubuntu@ubuntu-xenial:~/go/src/istio.io/manager/test/integration$ kubectl get po
NAME                             READY     STATUS    RESTARTS   AGE
gateway-350896508-z5czt          1/2       Error     1          1m
helloworld-v1-3574023954-plvst   1/2       Error     1          1m
helloworld-v2-3915138836-7ftpp   1/2       Error     0          1m
manager-2809908815-wnph2         1/1       Running   0          1m

## view logs
kubectl logs gateway-350896508-z5czt -c proxy
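A purely illustrative defensive sketch of the kind of guard that avoids the crash (the types and the function below are hypothetical stand-ins, not the real enumerateServiceVersions):

package envoy

// Destination and Rule are hypothetical stand-ins for the config types.
type Destination struct {
    Service string
    Tags    map[string]string
}

type Rule struct {
    Destination *Destination
}

// collectVersions skips partially specified upstream-cluster entries instead
// of dereferencing a nil destination.
func collectVersions(rules []*Rule) []map[string]string {
    var versions []map[string]string
    for _, r := range rules {
        if r == nil || r.Destination == nil {
            continue
        }
        versions = append(versions, r.Destination.Tags)
    }
    return versions
}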

Non-deterministic proxy config

Recent changes introduced a bug where the Envoy config is no longer the same on re-list. This is bad because it produces non-deterministic behavior. For example, here a request from an "a" pod sent to "a" gets routed to another "a" pod; it is ping-ponged across proxies and never reaches the application! (A sketch of one fix follows the config excerpt.)

              "virtual_hosts": [
                {
                  "name": "a.default.svc.cluster.local:http",
                  "domains": [
                    "a:80",...
                  ],
                  "routes": [
                    {
                      "prefix": "/",
                      "cluster": "inbound:8080"
                    }
                  ]
                },
                {
                  "name": "a.default.svc.cluster.local:http-alternative",
                  "domains": [
                    "a:8080",...                  ],
                  "routes": [
                    {
                      "prefix": "/",
                      "cluster": "outbound:a.default.svc.cluster.local:http-alternative"
                    }
                  ]
                },

@rshriram I think it's likely your change. Do you want to roll it back to unblock the tests, or fix this?
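One straightforward way to make the generated config stable across re-lists is to sort every slice before serializing it; a minimal sketch, with a hypothetical stand-in for the VirtualHost struct in proxy/envoy/resources.go:

package envoy

import "sort"

// VirtualHost is a hypothetical stand-in for the generated Envoy config struct.
type VirtualHost struct {
    Name    string   `json:"name"`
    Domains []string `json:"domains"`
}

// sortVirtualHosts orders hosts by name and their domains lexically so that two
// runs over the same service set always serialize to identical JSON.
func sortVirtualHosts(hosts []VirtualHost) {
    sort.Slice(hosts, func(i, j int) bool { return hosts[i].Name < hosts[j].Name })
    for i := range hosts {
        sort.Strings(hosts[i].Domains)
    }
}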

Alternate build system: Glide/scripts

To simplify development on a local machine, we need a build system that is completely independent of Bazel. A potential option is Glide. Once the dependencies are generated using Bazel (a one-time task), we can create glide.yaml and glide.lock files.

The canonical build system used by Jenkins will still be Bazel. We should have scripts that update the glide.yaml files if someone uses Bazel to add additional dependencies (and vice versa).

We also need a simple Makefile that runs gofmt and gometalinter on the code before a commit, and a script that runs the e2e integration tests, invoked via Makefile targets (Bazel already has these targets).

Allow user to explicitly specify service name for pod

The manager code currently does not filter rules based on the source field. This is a bug as per the routing rule specifications (and will break the demo as well).

In order to do source-based routing, we need to know the service to which the pod belongs, so that we can filter out the rules that don't apply to the pod and generate the right Envoy config (and also use that for service_cluster in the Envoy command line args).

Since pods can be late-bound to a service, and a pod can belong to multiple services, there is no way to derive the association automatically in advance, before generating the routing rules. There is a further complication: the pod need not belong to any service at all.

We cannot route by pod names because they do not correlate with the source service field in the route rules. Nor should we complicate the route rules any further by asking users to specify the pod names instead of the source field.

There are two solutions:

  1. Delay generation of routing rules until the pod is bound to a service. This means that the proxy should not accept any traffic from the app until it has all the routing rules in place. Once we get the initial membership information, we also need to continuously re-evaluate the service(s) to which a pod belongs, re-evaluate the rules, and generate new config as this relationship changes.

  2. Ask the user to specify, via env vars or CLI args, the name of the k8s service to which the pod belongs.

The latter option is the easiest to begin with, and is the mode in which we currently operate.

The former option works only if the pod belongs to some service. If the pod does not belong to any service (i.e. a wildcard source, which means all routing rules apply), there is no way to determine this relationship automatically unless it is explicitly indicated by the user, which leads us back to option 2.

At the same time, imposing this automatic inference logic on the end user has correctness implications as well, because we cannot assume that if a pod does not belong to any service, it can route to all services. This might result in unexpected behavior that the user does not want.

Simple example to illustrate the issue above.

The user sets a routing rule: a (tag v1) goes to b tag v1 only.

time t=0: pod a-v1 is launched (service A is not bound yet). Envoy config at pod a-v1: empty

t=1: pods b-v1 and b-v2 are launched and bound to serviceB. Envoy config at pod a-v1: route to b-v1/b-v2 for calls to serviceB . This will not be an issue because pod a-v1 gets no traffic (hopefully).

t=2: serviceA is launched. Information has not propagated to pod a-v1 or the proxy agent in pod a-v1 has not picked up these changes yet. Envoy config at pod a-v1: route to b-v1/b-v2 for calls to serviceB . This will be a problem now, because traffic coming to serviceA enters pod a-v1 and a-v1 will now route to either b-v1 or b-v2 (user wants traffic to go to only b-v1).

t=3: the proxy agent at pod a-v1 picks up the change (that it is bound to serviceA). It recomputes the routing rules and sets up Envoy such that it only routes to b-v1 (this is the route expected by the user).

One could apply similar examples to the mixer logic as well (e.g., I could set an ACL saying serviceA should not talk to serviceB except when cookie is user=shriram).

It is certainly possible to argue that this will be the kubernetes behavior, but it is asinine to assume that an average end user will be able to comprehend all this complexity related to concurrency, eventual consistency, etc.

Bottom line: telling the user that initial route settings (default routes) will be eventually consistent results in a lot of confusion on the end user's part.

cc @zcahana @elevran @louiscryan @mandarjog @kyessenov @frankbu

FWIW, in the previous version of Amalgam8, we asked users to explicitly specify the service to which the pod/container belonged, as a conscious decision to avoid this complexity. In the current k8s integration (similar to the manager), we have the same problem as described above.

A possible middle ground is to set cluster-level policies at the Mixer such that if a pod does not belong to any service, no traffic goes out of it via its proxy. This would result in Mixer needing to know about routing rules as well (at least service versions). (We would also need to revert PR #116, which uses the pod name as the sourceClusterName for the proxy in order to generate the service graph using Prometheus.)

iptables rules interference

This is something that is inevitably going to pop up: how do we deal with multiple sidecars operating on iptables rules? At the very least, we should document how to exclude traffic per port or per container/process from the Istio proxy traffic capture. For a per-node setup, dealing with kube-proxy iptables rules in conjunction with the Istio proxy rules needs to be addressed.

Generate envoy virtual host entry for pod IP and port

Ingress controllers like Nginx use the pod IP instead of the service IP. In that case, Envoy drops the incoming request because the host headers no longer match. Currently the proxy config generator creates virtual hosts only for variations of the service name and the service cluster IP; it needs to take the local pod IP into account as well (see the sketch below).
cc @kyessenov
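A sketch of the idea (the helper below is hypothetical): when assembling the domains list for a service's virtual host, append the local pod IP alongside the service name variants so that requests addressed directly to the pod still match.

package envoy

import "fmt"

// virtualHostDomains builds the Envoy "domains" entries for a virtual host,
// including the pod's own IP so ingress controllers that target pod IPs match.
func virtualHostDomains(serviceNames []string, podIP string, port int) []string {
    var domains []string
    for _, name := range serviceNames {
        domains = append(domains, name, fmt.Sprintf("%s:%d", name, port))
    }
    if podIP != "" {
        domains = append(domains, podIP, fmt.Sprintf("%s:%d", podIP, port))
    }
    return domains
}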

UDP support

The proxy manager should be in charge of routing UDP traffic. Several pieces are needed for this to work:

  • Envoy UDP routing support
  • iptables rules to trap UDP traffic
  • Code to handle UDP protocol in the Manager services model

Attaching service tags (versions) to network endpoints

The Kubernetes API provides two basic discovery methods: get all services and get all network endpoints. For Istio, we introduce the concept of a service tag (previously called versions) to provide finer-grained routing in Envoy (e.g. partitioning service endpoints between A and B for A/B testing).

How do we represent this tag information in the Kubernetes API?
Amalgam8 uses pod labels and performs an inverse lookup from endpoint IP to pod spec. This is a fine approach, but I'm wondering if that's sufficient. Using pod labels has drawbacks:

  • requires deployments to be aware of these labels or pod labels get overwritten
  • not clear if we can change pod tags dynamically without restarting the pod (and not causing replicasets or whatever created the pod to kill it)
  • all ports for the same network endpoint carry the same tags

In the future, we want to use service tags to implement dynamic config updates without pod restarts. So perhaps an approach that uses both pod labels and some other registry is the actual solution.

Uniqueness of destination policies

We define a Destination policy schema that provides routing policies for a service version destination, e.g. use "round_robin" for service "a" version "v1". Currently, you can define multiple such policies for the same service version, with unspecified semantics for what happens then. The right solution would be to enforce uniqueness per service version. This can be accomplished in etcd by using the service name and version as part of the resource name. Should we use the service key (a:version=v1) as the name of the policy?
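A minimal sketch of the proposed naming scheme, assuming we must flatten the service key into a legal resource name (the helper is hypothetical):

package registry

import (
    "fmt"
    "strings"
)

// policyName derives a unique, Kubernetes-legal resource name from a service
// and version, e.g. ("a", "v1") -> "a-version-v1", so at most one destination
// policy can exist per service version.
func policyName(service, version string) string {
    return strings.ToLower(fmt.Sprintf("%s-version-%s", service, version))
}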

stray namespaces

PR #119 leaves namespaces around when tests fail. Over time, there would be a lot of stray namespaces. We need to make the user aware that namespaces will not be deleted after a failed test.

Upstream envoy bug in DNS/kubeDNS

There is something very wrong with DNS resolution in Envoy that creates a 10-minute pause at startup:
[] starting async DNS resolution for manager
[10 minutes after] loading 3 listener(s)

This is happening at commit fa1d9680d809668fef2ec9386769c79486029f04.
I am rolling back Istio proxy to an earlier commit until c-ares DNS properly lands in envoy.

Include GIT SHA in the binary builds

Sometimes it's hard to tell whether Bazel actually rebuilt the binary or not. It would be nice to embed the git commit information in the binary so that we can be sure it was built from the current version.

For example, in amalgam8 we use something like this:

BUILD_SYM	:= github.com/amalgam8/amalgam8/pkg/version
LDFLAGS		+= -X $(BUILD_SYM).version=$(APP_VER)
LDFLAGS		+= -X $(BUILD_SYM).gitRevision=$(shell git rev-parse --short HEAD 2> /dev/null  || echo unknown)
LDFLAGS		+= -X $(BUILD_SYM).branch=$(shell git rev-parse --abbrev-ref HEAD 2> /dev/null  || echo unknown)
LDFLAGS		+= -X $(BUILD_SYM).buildUser=$(shell whoami || echo nobody)@$(shell hostname -f || echo builder)
LDFLAGS		+= -X $(BUILD_SYM).buildDate=$(shell date +%Y-%m-%dT%H:%M:%S%:z)
LDFLAGS		+= -X $(BUILD_SYM).goVersion=$(word 3,$(shell go version))

We need an equivalent in bazel. Could someone help with this please?
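For reference, the Go side that receives these -X values is just a package of string variables; a sketch follows (package path and variable names are illustrative, mirroring the amalgam8 example rather than the actual manager layout). On the Bazel side, rules_go's x_defs attribute combined with workspace status stamping can serve the same purpose, if the rules_go version in use supports it.

package version

// These defaults are overridden at link time via -ldflags "-X ...".
var (
    version     = "unknown"
    gitRevision = "unknown"
    branch      = "unknown"
    buildUser   = "unknown"
    buildDate   = "unknown"
    goVersion   = "unknown"
)

// Info returns a single human-readable build description for logging at startup.
func Info() string {
    return version + " (" + gitRevision + ", " + branch + ", built by " +
        buildUser + " on " + buildDate + " with " + goVersion + ")"
}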

Add pre-submit checks and CI

The tests in Manager require access to a GKE cluster to run the integration tests.
That means Jenkins/GKE is probably a better choice for CI than Travis.

Add pod introspection API

We need to discover the list of service instances co-located in the proxy pod.
This should be based on the downward API in kubernetes.
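A minimal sketch of the downward API side, assuming the pod spec maps metadata.name and metadata.namespace into env vars named POD_NAME and POD_NAMESPACE (these names are a convention chosen in the pod spec, not fixed by Kubernetes):

package main

import (
    "fmt"
    "os"
)

func main() {
    // Values injected via the Kubernetes downward API in the pod spec.
    podName := os.Getenv("POD_NAME")
    podNamespace := os.Getenv("POD_NAMESPACE")
    fmt.Printf("introspecting pod %s/%s for co-located service instances\n",
        podNamespace, podName)
}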

Virtual host with port redirect

According to RFC 2616, the Host header may include the port value as a suffix. It seems that with the recent port redirection, the Envoy "domain" "a" does not accept requests with the header "a:80".

Add TLS certificates configuration

As part of the Mesh Config, we need to distribute certificates to Envoy proxies.
This can be done using Kubernetes Secrets and envoy.MeshConfig fields that are configurable as flags at start-up time. In the future, MeshConfig will be dynamically updated, but as the first step, we need to get certs loaded statically in Envoy.

Support HTTPS routing

We need to investigate how to route HTTPS traffic for pod ingress/egress. TCP-level routing may be sufficient, but SNI-based routing might work better.

Ensure that proxy injection + setup mechanism works for other proxies

Our Envoy solution pulls lots of tricks (iptables, init containers) that are either not applicable to some deployments or not supported by other proxies.
We should make sure that we can support nginx, linkerd, etc. by leaving all the customizations as pluggable hooks that third-party contributors can fill in. Ideally, the choice of proxy mesh should be an implementation detail to the rest of Istio.

Service/cluster/routing discovery component

We need to provide an abstraction layer over the service network for the proxy configuration that can enumerate services and their IP endpoints and work cross-platform.

Although this information is platform-specific status rather than user-provided intent, the Manager should encapsulate it for Istio Proxy.
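A sketch of the kind of platform-neutral interface this implies (names are illustrative, not the manager's actual model package):

package model

// Service is a named endpoint set exposed to the proxy config generator.
type Service struct {
    Name  string
    Ports []int
}

// Instance is a single network endpoint backing a service, with its tags (versions).
type Instance struct {
    Service string
    IP      string
    Port    int
    Tags    map[string]string
}

// ServiceDiscovery abstracts the underlying platform (Kubernetes, etc.) so the
// proxy configuration code never talks to platform APIs directly.
type ServiceDiscovery interface {
    Services() ([]Service, error)
    Instances(service string) ([]Instance, error)
}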

Egress proxy for external services

For services that don't have pods backing them, there's just an external IP and no deployment with a proxy. We need to deploy a set of proxies to handle cluster egress traffic, and configure them to capture and route traffic.

@vaikas-google

Implement Ingress Controller for Envoy

As a proof of concept of the abstraction model we should implement standard Ingress support for Envoy. Should not need any additional configuration artifacts beyond what K8S supports today.

Manager and proxy mesh self-monitoring

This is a tracking issue to investigate mechanisms for self-monitoring in Manager and Proxies.
We would like to monitor the health of the manager service, proxies, and proxy agents, as well as the completion of config reloads across an Istio deployment, in a consistent way.

Potential options to explore:

  • align with Mixer to trigger alerts and log status
  • align with Mixer's way of self-monitoring
  • utilize platform means (pod annotations or events in Kubernetes)

There are work items spread around the code to handle failure cases that need to be addressed.

Proxy injection without k8s admission controller

/issues/57 is tracking long-term support for transparent proxy injection in the per-pod case via a k8s admission controller. We need an interim solution that does not require a special k8s build for managed k8s, e.g. GKE. This could simply consist of manually inserting additional pre-prepared containers into each pod spec, e.g. istio-init-container, istio-proxy-container. There is precedent for this in other sidecar implementations, e.g. linkerd.

Update proxy agent config generation to TCP src/dst routing

envoyproxy/envoy/pull/377 allows the tcp_proxy filter to pick the destination cluster based on a combination of L4 connection parameters (source/destination IP address/port). As the PR notes, this is a breaking change and old tcp_proxy configs will be rejected. The proxy agent needs to be updated to generate the newer tcp_proxy filter configuration.

This change must be synchronized with rebasing istio/proxy to pick up the latest version of lyft/envoy.

cc @qiwzhang, @kyessenov

Ingress route rule compatibility

Tracking some issues with mismatch between ingress and route rule:

  • ingress consists of many routes; do we map ingress to a list of rules or make route rule a list?
  • ingress points to a port name on the backend service; do we put port names into the core model?
  • defaulting is different for ingress proxy; it always replies with 404 and doesn't use sidecar default route rules.

Rename copyright headers to Istio Authors

We need to change all copyright headers in source files to "Istio Authors" instead of using company-specific headers, since code gets refactored back and forth, with contributions from different companies in the same files.

The Kubernetes community follows the same format; a full example header is shown below.

Copyright 2017 The Istio Authors.

Assigning this to everyone on the team so that folks are aware of this for future purposes. Need to put this in the README.

@istio/manager-hackers @istio/mixer-hackers @istio/proxy-hackers @istio/api-hackers @istio/istio-hackers
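For Go sources, the full header would then be the standard Apache 2.0 boilerplate with the new attribution line, for example:

// Copyright 2017 The Istio Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.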

Support TCP routing

We need to route ingress/egress TCP traffic based on source or destination IP in the iptables solution we have.

Naming services and service versions in Kubernetes

The routing rules (see #12) will refer to multiple versions of the same logical service.
If the logical service A has versions "v1" and "v2", we will need to represent both versions as Kubernetes services.

There are two approaches:
(a) The natural approach is to identify Kubernetes service A with Istio service A. Now each pod in service A is identical and runs the same proxy container. The only way to differentiate between them is to register them in the Manager and assign pod IPs to versions "v1" and "v2" in the Manager. This approach does not work if the service container is different between "v1" and "v2".
CORRECTION: we can deal with this by binding version to the pod template in the deployment.

(b) The other approach is to instantiate two Kubernetes services "A-v1" and "A-v2". Then the name encodes both the logical service name and version. Due to restrictions in the naming scheme, "A-v1" is likely to be the choice here.

@rshriram any insights on this question?
@mjog how do we map "subjects" in the rules to the concrete service and pod name?

Development guidelines documentation

Things to cover:

  • build instructions for various set-ups
    • dependencies on golang, bazel
    • brief intro to bazel
  • code organization
  • code style
    • documentation for new features
    • naming conventions for files, structs, interfaces
    • logging framework
  • git workflow
    • policy on review
    • policy on merging PRs
  • testing infrastructure
    • coverage requirements for unit tests
    • integration tests overview
      • troubleshooting
    • docker images

Anything else I left out?

(dumb) proxy agent for proxy management

The configuration for the layer 7 proxy would be available at a third party resource endpoint such as /proxyconfig/<proxyID>. We need an agent to watch this endpoint for changes, obtain the new config, and reload the proxy when needed. The agent should also take care of managing the proxy process, reaping zombies, and properly exiting the container. (A rough sketch of such an agent loop follows the note below.)

cc @kyessenov (feel free to edit)

p.s.: can the same channel (third party resource endpoint) be used for obtaining status info about the proxy? (i.e., whether the config was successfully applied or not).
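A rough sketch of such an agent loop (the endpoint URL, config path, poll interval, and reload command are all hypothetical placeholders):

package main

import (
    "bytes"
    "io/ioutil"
    "log"
    "net/http"
    "os/exec"
    "time"
)

func main() {
    var last []byte
    for {
        // Fetch the proxy's config from the (hypothetical) third party resource endpoint.
        resp, err := http.Get("http://manager:8080/proxyconfig/my-proxy-id")
        if err != nil {
            log.Printf("config fetch failed: %v", err)
            time.Sleep(5 * time.Second)
            continue
        }
        body, err := ioutil.ReadAll(resp.Body)
        resp.Body.Close()
        if err != nil {
            log.Printf("config read failed: %v", err)
            time.Sleep(5 * time.Second)
            continue
        }
        // Only rewrite the file and reload the proxy when the config actually changed.
        if !bytes.Equal(body, last) {
            if err := ioutil.WriteFile("/etc/envoy/envoy.json", body, 0644); err != nil {
                log.Printf("config write failed: %v", err)
            } else if err := exec.Command("reload-proxy.sh").Run(); err != nil {
                log.Printf("proxy reload failed: %v", err)
            } else {
                last = body
            }
        }
        time.Sleep(5 * time.Second)
    }
}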

Investigate Envoy watcher race

Integration tests sometimes fail with pods going into a crash loop. I had a look and it seems like an Envoy restart failure, with epochs n and n+2 running but not n+1. We need to narrow down the root cause.

IP table rules for a single port deployment

Once Envoy receives support for indirect port listeners, we should add a script to program iptables along with the proxy. We should do that work in the proxy directory.
