apollographql / router

A configurable, high-performance routing runtime for Apollo Federation 🚀

Home Page: https://www.apollographql.com/docs/router/

License: Other

Shell 0.20% Rust 99.16% Dockerfile 0.03% Handlebars 0.03% HTML 0.07% TLA 0.34% Smarty 0.06% TypeScript 0.11%
graphql federation apollo graph-router federated-graph gateway gateway-api

router's Issues

Support for reporting of HTTP response headers to Studio as part of a trace

On our infrastructure side, we have built API surface area (and a corresponding UI) that supports showing the response headers of traces; however, we don't currently send response headers as part of the trace. We can and should allow this, but much in the same way we do for request headers, we should consider the additional data privacy concerns that need to be addressed with any implementation here.

Improving caching on CircleCI

We should change this (for all 3 platforms):

            - rust/build:
                # This is prefixed in the orb with 'cargo-'
                cache_version: v2-macos
                crate: --workspace --tests

To do:

  • cargo build --workspace --tests --no-default-features --features otlp-tonic
  • cargo build --workspace --tests --no-default-features --features otlp-tonic,tls
  • cargo build --workspace --tests --no-default-features --features otlp-grpcio (not on windows)
  • cargo build --workspace --tests --no-default-features --features otlp-http

Set up `cargo-deny`

Requirements

We should make sure we're only including packages whose licenses are compatible with the project. For simplicity of implementation, let's allow only MIT to start with; we can consider adding more licenses in the future.

One suggestion for a tool is cargo-deny.
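For example, `cargo deny init` scaffolds a `deny.toml`, and `cargo deny check licenses` then validates every dependency's license against the allow-list configured there (which would start as MIT-only, per the above).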

Add integration test for Open Telemetry

Describe the solution you'd like
OpenTelemetry always works.

Additional context
We recently had an uncaught regression as there were no integration tests for Otel.
We should take the time to add a test for each type of collector.

WASM integration POC

Is your feature request related to a problem? Please describe.
Users need to be able to perform limited processing on requests without using in-tree extensions.
The advantages are that it hugely reduces the surface area we need to support, and that we can leverage https://github.com/tetratelabs/proxy-wasm-go-sdk

Describe the solution you'd like
Have a go creating a simple filter using https://github.com/tetratelabs/proxy-wasm-go-sdk
Demonstrate that there is a path for us to provide extensions specific to the router.
Provide a simple description of what the user workflow would look like when creating and using such a filter.

Describe alternatives you've considered
In-tree extensions: these will be difficult to support.
Hard-coded functionality: we should not rule this out, but ideally we have a single mechanism for manipulating requests, plus a set of bundled WASM filters that users can use.

Schema update is not propagated to the HTTP server

Describe the bug
maybe I'm missing something, but it looks like UpdateSchema messages change the schema in the state machine: https://github.com/apollographql/router/blob/07dfdddedc3e90cf6a7e0124ceb9f1a23446cae6/crates/apollo-router/src/state_machine.rs#L150-L172

but do not pass it to the HTTP server. It is only applied here: https://github.com/apollographql/router/blob/07dfdddedc3e90cf6a7e0124ceb9f1a23446cae6/crates/apollo-router/src/state_machine.rs#L193-L196

To Reproduce

At this line: https://github.com/apollographql/router/blob/07dfdddedc3e90cf6a7e0124ceb9f1a23446cae6/crates/apollo-router/src/lib.rs#L579

add the following:

// Pass an empty schema as the initial schema stream.
SchemaKind::Instance(Schema::from_str("").unwrap())
    .into_stream()
    .boxed(),

This will first set up a server with the configuration and an empty schema, then try to update the schema. Any query to the router will fail because the schema used is still the empty one.

start a configuration reload from a HUP signal

Right now we support configuration reload through filesystem watch. This can sometimes be unreliable (missed updates), and it will try to reload files on each save even if we're not done modifying them (the router should validate the configuration file and schema before replacing them, but that's another topic).

Should we support a signal to tell the router to reload its configuration? Specifically, SIGHUP is commonly accepted for that. That would fit well with tools like systemd and its ExecReload command.
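A minimal sketch of what listening for SIGHUP could look like with tokio's Unix signal support (illustrative only; the reload itself would reuse the router's existing validation and hot-reload path):

use tokio::signal::unix::{signal, SignalKind};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Register a SIGHUP handler (requires tokio's "signal" feature).
    let mut hup = signal(SignalKind::hangup())?;
    loop {
        hup.recv().await;
        // Hypothetical hook: trigger the same validation + hot-reload path
        // that the filesystem watcher uses today.
        println!("SIGHUP received: reloading configuration and schema");
    }
}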

studio: Schema reporting

Requirements

We need to be able to report the currently active schema to Apollo Studio. There's some existing art and a TypeScript reference implementation to follow along with here, so it might be worth looking at, e.g., apollographql/apollo-server#5187 for inspiration and talking points in building out the design.

Aliasing support

The query plan includes aliases in its selections format.
Currently we are not using them.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Repository problems

These problems occurred while renovating this repository. View logs.

  • WARN: Found renovate config warnings

Pending Approval

These branches will be created by Renovate only once you click their checkbox below.

  • fix(deps): update cargo tracing packages (patch) (tracing, tracing-core)
  • chore(deps): update cargo tracing packages (minor) (opentelemetry, opentelemetry-aws, opentelemetry-datadog, opentelemetry-http, opentelemetry-jaeger, opentelemetry-otlp, opentelemetry-prometheus, opentelemetry-semantic-conventions, opentelemetry-stdout, opentelemetry-zipkin, opentelemetry_sdk, tracing-opentelemetry)
  • 🔐 Create all pending approval PRs at once 🔐

Rate-Limited

These updates are currently rate-limited. Click on a checkbox below to force their creation now.

  • fix(deps): update cargo pre-1.0 packages (patch) (async-compression, async-trait, base64, basic-toml, chrono, cookie, http, libc, linkme, log, lru, num-traits, parking_lot, petgraph, pin-project-lite, prometheus, prost, prost-types, reqwest, rustls, schemars, serde_derive_default, strum_macros, test-log, time, tokio-stream, tokio-util, toml, zstd)
  • chore(deps): update jaegertracing/all-in-one docker tag to v1.57.0
  • chore(deps): update openzipkin/zipkin docker tag to v2.27.1
  • chore(deps): update slack orb to v4.13.2
  • chore(deps): update openzipkin/zipkin docker tag to v3.3.0
  • chore(deps): update rust crate fred to v9
  • fix(deps): update rust crate async-channel to v2
  • fix(deps): update rust crate brotli to v6
  • fix(deps): update rust crate zip to v1
  • 🔐 Create all rate-limited PRs at once 🔐

Edited/Blocked

These updates have been manually edited so Renovate will no longer make changes. To discard all commits and start over, click on a checkbox.

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

cargo
Cargo.toml
  • apollo-compiler =1.0.0-beta.16
  • apollo-parser 0.7.6
  • apollo-smith 0.5.0
  • async-trait 0.1.77
  • http 0.2.11
  • insta 1.38.0
  • once_cell 1.19.0
  • reqwest 0.11.24
  • schemars 0.8.16
  • serde 1.0.197
  • serde_json 1.0.114
  • serde_json_bytes 0.2.2
  • tokio 1.36.0
  • tower 0.4.13
apollo-federation/Cargo.toml
  • time 0.3.34
  • derive_more 0.99.17
  • indexmap 2.1.0
  • lazy_static 1.4.0
  • multimap 0.10.0
  • petgraph 0.6.4
  • strum 0.26.0
  • strum_macros 0.26.0
  • thiserror 1.0
  • url 2
apollo-federation/cli/Cargo.toml
  • clap 4.5.1
apollo-router-benchmarks/Cargo.toml
  • criterion 0.5
  • memory-stats 1.1.0
  • arbitrary 1.3.2
apollo-router-scaffold/Cargo.toml
  • anyhow 1.0.80
  • clap 4.5.1
  • cargo-scaffold 0.11.0
  • regex 1
  • str_inflector 0.12.0
  • toml 0.8.10
  • tempfile 3.10.0
  • copy_dir 0.1.3
apollo-router/Cargo.toml
  • askama 0.12.1
  • access-json 0.1.0
  • anyhow 1.0.80
  • arc-swap 1.6.0
  • async-channel 1.9.0
  • async-compression 0.4.6
  • axum 0.6.20
  • base64 0.21.7
  • bloomfilter 1.0.13
  • buildstructor 0.5.4
  • bytes 1.5.0
  • clap 4.5.1
  • console-subscriber 0.2.0
  • cookie 0.18.0
  • ci_info 0.14.14
  • dashmap 5.5.3
  • derivative 2.2.0
  • derive_more 0.99.17
  • dhat 0.3.3
  • diff 0.1.13
  • directories 5.0.1
  • displaydoc 0.2
  • flate2 1.0.28
  • fred 7.1.2
  • futures 0.3.30
  • graphql_client 0.13.0
  • hex 0.4.3
  • http-body 0.4.6
  • heck 0.4.1
  • humantime 2.1.0
  • humantime-serde 1.1.1
  • hyper 0.14.28
  • hyper-rustls 0.24.2
  • indexmap 2.2.3
  • itertools 0.12.1
  • jsonpath_lib 0.3.0
  • jsonpath-rust 0.3.5
  • jsonschema 0.17.1
  • jsonwebtoken 9.2.0
  • lazy_static 1.4.0
  • libc 0.2.153
  • linkme 0.3.23
  • lru 0.12.2
  • maplit 1.0.2
  • mediatype 0.19.18
  • mockall 0.11.4
  • mime 0.3.17
  • multer 2.1.0
  • multimap 0.9.1
  • notify 6.1.1
  • nu-ansi-term 0.49
  • once_cell 1.19.0
  • opentelemetry 0.20.0
  • opentelemetry_sdk 0.20.0
  • opentelemetry_api 0.20.0
  • opentelemetry-aws 0.8.0
  • opentelemetry-datadog 0.8.0
  • opentelemetry-http 0.9.0
  • opentelemetry-jaeger 0.19.0
  • opentelemetry-otlp 0.13.0
  • opentelemetry-semantic-conventions 0.12.0
  • opentelemetry-zipkin 0.18.0
  • opentelemetry-prometheus 0.13.0
  • paste 1.0.14
  • pin-project-lite 0.2.13
  • prometheus 0.13
  • prost 0.12.3
  • prost-types 0.12.3
  • proteus 0.5.0
  • rand 0.8.5
  • rhai =1.17.1
  • regex 1.10.3
  • router-bridge =0.5.21+v2.7.5
  • rust-embed 8.2.0
  • rustls 0.21.11
  • rustls-native-certs 0.6.3
  • rustls-pemfile 1.0.4
  • shellexpand 3.1.0
  • sha2 0.10.8
  • semver 1.0.22
  • serde_derive_default 0.1
  • serde_urlencoded 0.7.1
  • serde_yaml 0.8.26
  • static_assertions 1.1.0
  • strum_macros 0.25.3
  • sys-info 0.9.1
  • thiserror 1.0.57
  • tokio-stream 0.1.14
  • tokio-util 0.7.10
  • tonic 0.9.2
  • tower-http 0.4.4
  • tower-service 0.3.2
  • tracing 0.1.37
  • tracing-core 0.1.31
  • tracing-futures 0.2.5
  • tracing-subscriber 0.3.18
  • trust-dns-resolver 0.23.2
  • url 2.5.0
  • urlencoding 2.1.3
  • uuid 1.7.0
  • yaml-rust 0.4.5
  • wiremock 0.5.22
  • wsl 0.1.0
  • tokio-tungstenite 0.20.1
  • tokio-rustls 0.24.1
  • http-serde 1.1.3
  • hmac 0.12.1
  • parking_lot 0.12.1
  • memchr 2.7.1
  • brotli 3.4.0
  • zstd 0.13.0
  • zstd-safe 7.0.0
  • aws-sigv4 1.1.6
  • aws-credential-types 1.1.6
  • aws-config 1.1.6
  • aws-types 1.1.6
  • aws-smithy-runtime-api 1.1.6
  • sha1 0.10.6
  • tracing-serde 0.1.3
  • time 0.3.34
  • similar 2.4.0
  • console 0.15.8
  • bytesize 1.3.0
  • axum 0.6.20
  • ecdsa 0.16.9
  • fred 7.1.2
  • futures-test 0.3.30
  • maplit 1.0.2
  • memchr 2.7.1
  • mockall 0.11.4
  • num-traits 0.2.18
  • opentelemetry-stdout 0.1.0
  • opentelemetry 0.20.0
  • opentelemetry-proto 0.5.0
  • p256 0.13.2
  • rand_core 0.6.4
  • reqwest 0.11.24
  • rhai 1.17.1
  • serial_test 3.0.0
  • tempfile 3.10.0
  • test-log 0.2.14
  • test-span 0.7
  • basic-toml 0.1
  • tower-test 0.4.0
  • tracing-subscriber 0.3
  • tracing-opentelemetry 0.21.0
  • tracing-test 0.2.4
  • walkdir 2.4.0
  • wiremock 0.5.22
  • tonic-build 0.9.2
  • basic-toml 0.1
  • uname 0.1.1
  • uname 0.1.1
  • hyperlocal 0.8.0
  • hyperlocal 0.8.0
  • tikv-jemallocator 0.5
  • rstack 0.3.3
fuzz/Cargo.toml
  • libfuzzer-sys 0.4
  • env_logger 0.10.2
  • log 0.4
  • router-bridge =0.5.21+v2.7.5
  • anyhow 1
xtask/Cargo.toml
  • anyhow 1
  • camino 1
  • clap 4.5.1
  • cargo_metadata 0.18.1
  • chrono 0.4.34
  • console 0.15.8
  • dialoguer 0.11.0
  • flate2 1
  • graphql_client 0.13.0
  • itertools 0.12.1
  • libc 0.2
  • memorable-wordlist 0.1.7
  • nu-ansi-term 0.49
  • once_cell 1
  • regex 1.10.3
  • reqwest 0.11
  • serde 1.0.197
  • serde_json 1
  • tar 0.4
  • tempfile 3
  • tinytemplate 1.2.1
  • tokio 1.36.0
  • which 6.0.1
  • walkdir 2.4.0
  • insta 1.35.1
  • base64 0.21
  • zip 0.6
circleci
.circleci/config.yml
  • gh 2.3.0
  • slack 4.12.6
  • secops 2.0.7
  • cimg/redis 7.2.4
  • jaegertracing/all-in-one 1.54.0
  • openzipkin/zipkin 2.23.2
docker-compose
docker-compose.yml
dockerfiles/docker-compose-redis.yml
  • docker.io/bitnami/redis-cluster 7.2
  • docker.io/bitnami/redis-cluster 7.2
  • docker.io/bitnami/redis-cluster 7.2
  • docker.io/bitnami/redis-cluster 7.2
  • docker.io/bitnami/redis-cluster 7.2
  • docker.io/bitnami/redis-cluster 7.2
dockerfiles/tracing/docker-compose.datadog.yml
  • ghcr.io/apollographql/router v1.46.0
dockerfiles/tracing/docker-compose.jaeger.yml
  • ghcr.io/apollographql/router v1.46.0
dockerfiles/tracing/docker-compose.zipkin.yml
  • openzipkin/zipkin 3.0.6
fuzz/docker-compose.yml
dockerfile
apollo-router-scaffold/templates/base/Dockerfile
dockerfiles/Dockerfile.router
dockerfiles/diy/dockerfiles/Dockerfile.repo
dockerfiles/tracing/datadog-subgraph/Dockerfile
  • node 20-bullseye
dockerfiles/tracing/jaeger-subgraph/Dockerfile
  • node 20-bullseye
dockerfiles/tracing/router/Dockerfile
dockerfiles/tracing/zipkin-subgraph/Dockerfile
  • node 20-bullseye
github-actions
.github/workflows/docs-publish.yml
.github/workflows/github_projects_tagger.yml
  • abernix/github-issue-pull-api-hook v2.0.1
.github/workflows/main.yml
  • mskelton/changelog-reminder-action v3.0.0
.github/workflows/update_apollo_protobuf.yaml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • peter-evans/create-pull-request v6
.github/workflows/update_uplink_schema.yml
  • actions/checkout v4@b4ffde65f46336ab88eb53be808477a3936bae11
  • peter-evans/create-pull-request v6
helm-values
helm/chart/router/values.yaml
npm
dockerfiles/tracing/datadog-subgraph/package.json
  • @apollo/federation ^0.38.0
  • @apollo/server ^4.0.0
  • dd-trace ^5.0.0
  • express ^4.18.1
  • graphql ^16.5.0
  • typescript 5.3.3
dockerfiles/tracing/jaeger-subgraph/package.json
  • @apollo/federation ^0.38.0
  • @apollo/server ^4.0.0
  • express ^4.18.1
  • graphql ^16.5.0
  • graphql-tag ^2.12.6
  • jaeger-client ^3.19.0
  • opentracing ^0.14.7
  • typescript 5.3.3
dockerfiles/tracing/zipkin-subgraph/package.json
  • @apollo/federation ^0.38.0
  • @apollo/server ^4.0.0
  • express ^4.18.1
  • graphql ^16.5.0
  • graphql-tag ^2.12.6
  • jaeger-client ^3.19.0
  • opentracing ^0.14.7
  • zipkin-javascript-opentracing ^3.0.0
  • typescript 5.3.3
regex
apollo-router-scaffold/templates/base/Dockerfile
  • rust 1.76.0
dockerfiles/diy/dockerfiles/Dockerfile.repo
  • rust 1.76.0
docs/source/customizations/custom-binary.mdx
  • rust 1.76.0
rust-toolchain.toml
  • rust 1.76.0

  • Check this box to trigger a request for Renovate to run again on this repository

Dynamic CORS header configuration

Description

This is perhaps something that will require out-of-tree extensions or more extensive YAML configuration, as it asks for the ability to run conditionals on the origin (see the sketch after this list). E.g.:

  • If the domain ends with zyx.com #965
  • If the path includes /admin/ #1444
  • Using Rhai script
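A sketch of the kind of conditional those bullets describe, written as a plain Rust predicate (hypothetical helper, not router API; in practice this logic would live in a Rhai script or an out-of-tree extension, and the particular combination of rules here is only an example):

/// Hypothetical origin check combining the conditions listed above.
fn is_allowed_origin(origin: &str, path: &str) -> bool {
    // If the domain ends with zyx.com (#965)...
    let domain_ok = origin.ends_with(".zyx.com") || origin == "https://zyx.com";
    // ...but not for admin paths (#1444), purely as an example combination.
    let admin_path = path.contains("/admin/");
    domain_ok && !admin_path
}

fn main() {
    assert!(is_allowed_origin("https://app.zyx.com", "/graphql"));
    assert!(!is_allowed_origin("https://app.zyx.com", "/admin/graphql"));
    assert!(!is_allowed_origin("https://evil.example", "/graphql"));
}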

Response construction

Is your feature request related to a problem? Please describe.
The response generated from subgraph responses does not match the query.
Examples with the federation demo:

field order

query ExampleQuery {
  me {
    identifiant: id
    reviews { body }
    id
  }
}

returns:

"data": {
    "moi": {
        "identifiant": "1",
        "__typename": "User",
        "id": "1",
        "reviews": [{
            "body": "Love it!"
        }, {
            "body": "Too expensive."
        }]
    }
}
}

The id field should appear after the reviews field, but because a first query to the accounts subgraph must be performed to obtain the id key before requesting the reviews, the id is added to the response before the reviews.

unnecessary data

query ExampleQuery {
  me {
    reviews { body }
  }
}

returns

{
    "data": {
        "me": {
            "__typename": "User",
            "id": "1",
            "reviews": [{
                "body": "Love it!"
            }, {
                "body": "Too expensive."
            }]
        }
    }
}

The id field should not be there, but since it was used for the join, it was added from the accounts subgraph's response.

These inconsistencies in the responses will block the work on integration testing #47, because they will generate a lot of differences between the gateway and router responses.

Describe the solution you'd like
The response should be constructed from the query plan, with only the required fields, in order; subgraph responses would then be merged in only where necessary.
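A rough sketch of that projection step, assuming serde_json with its preserve_order feature so that object insertion order is kept (the real implementation would walk the query plan and handle nested selections, aliases, and lists):

use serde_json::{json, Map, Value};

/// Keep only the requested fields, in the order they were requested.
fn project(requested: &[&str], source: &Value) -> Value {
    let mut out = Map::new();
    if let Value::Object(obj) = source {
        for field in requested {
            if let Some(v) = obj.get(*field) {
                out.insert((*field).to_string(), v.clone());
            }
        }
    }
    Value::Object(out)
}

fn main() {
    let merged_from_subgraphs = json!({
        "__typename": "User",
        "id": "1",
        "reviews": [{ "body": "Love it!" }, { "body": "Too expensive." }]
    });
    // The client only asked for `reviews`, so `__typename` and `id` are dropped.
    println!("{}", project(&["reviews"], &merged_from_subgraphs));
}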

Describe alternatives you've considered
This work could be done as part of the future Rust query planner, but it looks like it can be done independently, only using the common query plan format.

Event metrics

Description

The Router should be able to emit metrics that enable gauges showing the server's performance. As a baseline suggestion to implementors, we might recommend routing them through the OpenTelemetry Collector as a central point for traces, metrics, and logs.

Some variables to consider in the implementation include:

Metric candidates

  • Request response time
  • Request totals
  • Request error totals
  • Downstream response time per service
  • Downstream error total
  • Downstream total requests per service

Metric attribute candidates

Each metric candidate may also have metric attributes attached to it. Some plausible options here include:

  • Apollo Studio Graph Ref for each metric
  • Client ID for each metric
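As one illustration, a request counter carrying a couple of those attributes could be emitted through the OpenTelemetry metrics API roughly as below; the metric and attribute names are placeholders, and exact recording signatures vary between opentelemetry versions:

use opentelemetry::{global, KeyValue};

fn main() {
    // A named meter for the router; the exporter/collector wiring is omitted here.
    let meter = global::meter("apollo-router");
    // Candidate metric: total requests, tagged with graph ref and client id.
    let requests = meter.u64_counter("http.requests.total").init();
    requests.add(
        1,
        &[
            KeyValue::new("apollo.graph.ref", "my-graph@current"),
            KeyValue::new("client.id", "ios-app"),
        ],
    );
}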

Metric format candidates

CORS default configuration

Follow-up to the previous work in 5b36237.

Comments on the introspection efforts raised the fact that we're not quite aligned on what we would like the CORS default settings to be, or how we would like them to be customizable.
Allowing any origin might ease use and setup, but it might not be considered a safe default.

Let's write our expected CORS behavior down so I can add unit tests. We can of course revisit it anytime.

Add selections subtype checking (Phase 2)

This follows up ae9c728 (which implemented Phase 1) but is currently blocked awaiting the apollo-rs primitives.

Goal

Follow up with Phase 2 of the project, once we have apollo-rs.

Phase 1 (Already completed / BEFORE THIS ISSUE / Stopgap)

  1. apollo-rs will provide an AST API
  2. Parse schema using apollo-rs at existing schema load point, and pass into federated execution.
  3. Selection construction will currently check __typename for an exact match against the query plan. This will be relaxed to also include subtypes.
  4. We will navigate the AST manually to check subtypes.

Phase 2 (THIS ISSUE)

  1. apollo-rs will eventually provide a higher-level intermediate representation (HIR) and API. At minimum, the following features are required:
    1. Get a GraphQL type by name
    2. isSubtypeOf method to check if a type is a subtype of another type.
  2. Rework the router code to use the HIR.

This issue tracks the implementation of Phase 2.
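A minimal sketch of the check the router-side code would lean on once the HIR exposes it (the names here are hypothetical stand-ins, not the apollo-rs API):

use std::collections::{HashMap, HashSet};

/// Stand-in for the HIR capability Phase 2 needs: look up a type by name and
/// ask whether it is a subtype of another type (an object implementing an
/// interface, or a member of a union).
struct SubtypeIndex {
    subtypes: HashMap<String, HashSet<String>>,
}

impl SubtypeIndex {
    fn is_subtype_of(&self, maybe_subtype: &str, supertype: &str) -> bool {
        maybe_subtype == supertype
            || self
                .subtypes
                .get(supertype)
                .map_or(false, |set| set.contains(maybe_subtype))
    }
}

fn main() {
    let mut subtypes = HashMap::new();
    subtypes.insert(
        "Node".to_string(),
        HashSet::from(["User".to_string(), "Review".to_string()]),
    );
    let index = SubtypeIndex { subtypes };
    // A __typename of "User" should satisfy a query-plan selection expecting "Node".
    assert!(index.is_subtype_of("User", "Node"));
    assert!(!index.is_subtype_of("Product", "Node"));
}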

Conditional executable directive (`@skip/@include`) support

This is to make sure we consider our support for the conditional executable directives specified by the GraphQL specification: @skip and @include.

query Hero($episode: Episode, $withFriends: Boolean!) {
  hero(episode: $episode) {
    name
    friends @include(if: $withFriends) {
      name
    }
  }
}
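Per the specification, a field is selected only when @include(if:) is true (or absent) and @skip(if:) is false (or absent). A minimal sketch of that evaluation, once the argument values have been resolved from variables:

/// Evaluate @include / @skip for one selection, given the resolved `if:` arguments.
fn should_include(include_if: Option<bool>, skip_if: Option<bool>) -> bool {
    include_if.unwrap_or(true) && !skip_if.unwrap_or(false)
}

fn main() {
    // friends @include(if: $withFriends) with { "withFriends": false } is dropped.
    assert!(!should_include(Some(false), None));
    // No conditional directives: the field is always selected.
    assert!(should_include(None, None));
    // With both present, the field is selected only if @skip is false and @include is true.
    assert!(!should_include(Some(true), Some(true)));
}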

Not in scope

  • apollographql/router#71

Error handling audit

Create a comprehensive list of all errors.
Ensure that error messages have:

  1. Information about what happened, in user-friendly language.
  2. Information about what to do about the error, in user-friendly language.

Sensitive information must not be leaked.
Consider metrics for infrastructure errors.
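One illustration of the shape such errors might standardize on, using the thiserror crate already in the dependency list (the variants and wording are placeholders, not the router's actual errors):

use thiserror::Error;

/// Placeholder error type: each variant carries a user-facing message that says
/// what happened and what to do, without leaking sensitive detail.
#[derive(Debug, Error)]
enum RouterError {
    #[error("could not load the supergraph schema; check that the schema file exists and is valid SDL")]
    SchemaLoad,
    #[error("a subgraph request failed; check that the subgraph is reachable and healthy")]
    SubgraphRequest,
}

fn main() {
    // The Display output is what an operator or client would see.
    println!("{}", RouterError::SchemaLoad);
    println!("{}", RouterError::SubgraphRequest);
}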

Faster Caching

We're currently using a HashMap to cache introspection results and parsed queries, which might not be the best choice (the default hashing function is SipHash 1-3, which is robust but not the fastest).

We might need to investigate better hash functions, or different caching crates altogether.
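As a sketch of the first option, the cache map could be parameterized over a faster hasher; this assumes the rustc_hash crate (not currently a dependency) and trades SipHash's HashDoS resistance for speed, which needs care since cache keys such as query strings are attacker-controlled:

use std::collections::HashMap;
use std::hash::BuildHasherDefault;

use rustc_hash::FxHasher;

// A query cache keyed by the raw query string, using FxHash instead of SipHash.
type QueryCache<V> = HashMap<String, V, BuildHasherDefault<FxHasher>>;

fn main() {
    let mut cache: QueryCache<&str> = HashMap::default();
    cache.insert("query { me { id } }".to_string(), "<cached plan>");
    assert!(cache.contains_key("query { me { id } }"));
}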

GraphQL subgraph error forwarding

We're currently forwarding subgraph errors to the router caller as-is, which is inconsistent when it comes to the error path, since the query the caller performed and the queries the router made to subgraphs differ.

We need to have a look at what the Gateway does and port it if it makes sense. We might uncover some additional things that would come in handy in the process.

Apollo Tracing (e.g., inline tracing support)

Adds FTV1 support.

A new open telemetry exporter has been added that will convert regular traces to Apollo traces.

A buffer of spans is collected on the server side which will retain spans until the root request span is completed.
Once a request is completed the trace will be reconstructed and sent to Apollo.

Span attributes that are only relevant to Apollo tracing are prefixed with apollo_private. and are filtered out of other APM data.

@glasser has given some guidance on how we should improve tracing, but this'll be left to follow-up tickets as this PR is large and has been ongoing for a significant period.

#1728.
#1729.
As an aside, this PR demonstrates that spans can be used for Apollo tracing, and that we could move to a native Otel based solution in future.

query sanitization for tracing

Is your feature request related to a problem? Please describe.
We would like to send the complete query with tracing spans (could be used by Studio for analysis). Unfortunately, queries can contain inline sensitive data in input arguments, so we cannot send them as is.

Describe the solution you'd like
We need a way to sanitize queries and remove private data. The future query planner using apollo-rs could be used to recognize raw input values, replace them with variables in the query, and put the values in the variables. That is apparently possible, but we don't know what impact this will have on our users.

Describe alternatives you've considered
The current way in the server is to modify the AST before sending the usage report: https://github.com/apollographql/apollo-tooling/blob/b1bd747861bcdb733a5e357c019885a6c0293ec7/packages/apollo-graphql/src/operationId.ts#L69-L78

Additional context
We might need to make query reporting more configurable, with options to send (or not send) the query depending on the operation, and to decide whether to send variables, as is done in apollo-server: https://www.apollographql.com/docs/apollo-server/api/plugin/usage-reporting/

Compare apollo-rs with the JS query planner on every query

Describe the solution you'd like
Now that apollo-rs is available in the list of dependencies to parse the schema, we should also use it at the query planning stage, to compare its results (and run time) with the JS planner's: at first in a non-blocking way, only reporting differences, then gradually replacing the JS code.

I started that work in a branch last week; I'm now updating it to the public apollo-rs release.
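A sketch of the non-blocking comparison step (hypothetical helper; the real harness would invoke both planners and compare their serialized plans):

// Hypothetical: given the two serialized plans for the same operation, log a
// warning on divergence but never fail the request.
fn compare_plans(operation_name: &str, js_plan: &str, rust_plan: &str) {
    if js_plan != rust_plan {
        tracing::warn!(
            operation = operation_name,
            "Rust query planner output diverged from the JS planner"
        );
    }
}

fn main() {
    tracing_subscriber::fmt::init();
    compare_plans("ExampleQuery", r#"{"kind":"Fetch"}"#, r#"{"kind":"Sequence"}"#);
}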

Studio Explorer Boilerplate HTML

Requirements

This replaces the behavior of redirecting directly to Studio with a more complete implementation: a landing page that is served locally and offers a redirect to Studio. This will match the behavior of Apollo Server and Apollo Gateway today, exactly. It offers more transparency to the user about what is about to happen (the redirect) and allows them to optionally make the behavior sticky (by way of a browser cookie) for future requests.

  • Rather than redirecting on the Router's configured endpoint, this replaces that redirect with serving HTML boilerplate (with content-type: text/html) when the request's Accept header includes a satisfying text/html value (see the sketch below).
  • Renders the same boilerplate HTML that Apollo Server uses; the implementation for it should be found here
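A sketch of that content negotiation using axum 0.6 from the dependency list (illustrative only, not the router's actual handler; the HTML and fallback text are placeholders):

use axum::{
    http::{header, HeaderMap},
    response::{Html, IntoResponse},
    routing::get,
    Router,
};

// Serve the landing-page boilerplate only when the client accepts text/html;
// otherwise fall through to normal GraphQL handling (stubbed here).
async fn root(headers: HeaderMap) -> axum::response::Response {
    let accepts_html = headers
        .get(header::ACCEPT)
        .and_then(|v| v.to_str().ok())
        .map_or(false, |accept| accept.contains("text/html"));
    if accepts_html {
        Html("<!doctype html><!-- Apollo landing page boilerplate -->").into_response()
    } else {
        "GraphQL endpoint; send an application/json request".into_response()
    }
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/", get(root));
    axum::Server::bind(&"127.0.0.1:4000".parse().unwrap())
        .serve(app.into_make_service())
        .await
        .unwrap();
}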

Effectively, this should produce the landing-page experience shown in the issue's screenshot (not reproduced here).

Will Resolve #380

Decide Runtime Targets

We want the Router to be able to run "Anywhere", but we need to start with a practical list of targets that we can ship, and be cognizant of the constraints that some "nice to haves" would put on us.

Candidates

As a conversation starter, I suggest we have only Tier 1 target architectures as candidates. Lifting that list from the linked page, it's:

  • aarch64-unknown-linux-gnu: ARM64 Linux (kernel 4.2, glibc 2.17+)
  • i686-pc-windows-gnu: 32-bit MinGW (Windows 7+)
  • i686-pc-windows-msvc: 32-bit MSVC (Windows 7+)
  • i686-unknown-linux-gnu: 32-bit Linux (kernel 2.6.32+, glibc 2.11+)
  • x86_64-apple-darwin: 64-bit macOS (10.7+, Lion+)
  • x86_64-pc-windows-gnu: 64-bit MinGW (Windows 7+)
  • x86_64-pc-windows-msvc: 64-bit MSVC (Windows 7+)
  • x86_64-unknown-linux-gnu: 64-bit Linux (kernel 2.6.32+, glibc 2.11+)

Note that this list does not include aarch64-unknown-linux-musl. That omission doesn't by itself preclude us from running on Alpine, though we are further constrained in this regard by our dependence on V8 for the query planner. This is only intended to be a constraint until the point that we can re-write the query planner in Rust natively, or until the rusty-v8 project can run more easily on MUSL. See this issue. It's surmountable, but it's probably more work than it's worth for experimental phases.

In Scope

TBD

Out of Scope

TBD

HealthCheck Support

Is your feature request related to a problem? Please describe.
The TS server implementations provide a healthcheck endpoint and an onHealthCheck callback; we might want to provide one as well.

Describe the solution you'd like
We could provide a healthcheck endpoint that returns a simple 200 status code and { status: 'pass' } for now.
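A minimal sketch of that endpoint with axum and serde_json (the path and payload follow the suggestion above and are not a settled API):

use axum::{routing::get, Json, Router};
use serde_json::{json, Value};

// Health check: always 200 with { "status": "pass" }.
async fn health() -> Json<Value> {
    Json(json!({ "status": "pass" }))
}

#[tokio::main]
async fn main() {
    let app = Router::new().route("/health", get(health));
    axum::Server::bind(&"127.0.0.1:8088".parse().unwrap())
        .serve(app.into_make_service())
        .await
        .unwrap();
}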

Describe alternatives you've considered
As a follow-up we could expose user-defined callbacks (impl Future<Output = ()>) that we await with a timeout, or directives, or anything else; this would probably require deeper thought, though.

studio: Field usage reporting

  • Add field usage to spaceport.
  • Extract exporter to metrics exporter
  • Send field usage and field stats to the metrics exporter

preview @defer support

Current plan

Implement defer only for now, in the router, with some query planner modification.

@defer RFC

Related issues:

To be defined:

  • response stream format: right now the stream is apparent in all signatures; maybe we could instead have response types that hold the headers plus a (generic) stream of GraphQL responses (see the sketch after this list)
  • adapting telemetry: telemetry currently registers one operation per response (deferred or not), which might not be what we want in the end. Also, telemetry does not really have a concept of deferred responses
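A sketch of the response type floated in the first bullet above (an assumption about the shape, not a decided API): headers available up front, followed by a stream of GraphQL responses, the primary payload plus any deferred ones:

use futures::stream::{self, BoxStream, StreamExt};
use http::HeaderMap;
use serde_json::{json, Value};

// Headers are known immediately; the body is a stream of GraphQL responses.
struct DeferredResponse {
    headers: HeaderMap,
    responses: BoxStream<'static, Value>,
}

#[tokio::main]
async fn main() {
    let mut response = DeferredResponse {
        headers: HeaderMap::new(),
        // Payload shapes loosely follow the incremental-delivery RFC and are illustrative.
        responses: stream::iter(vec![
            json!({ "data": { "me": { "id": "1" } }, "hasNext": true }),
            json!({ "incremental": [{ "path": ["me"], "data": { "reviews": [] } }], "hasNext": false }),
        ])
        .boxed(),
    };
    println!("{} header(s)", response.headers.len());
    while let Some(part) = response.responses.next().await {
        println!("{part}");
    }
}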

Update Router to use new Fed2 query planner

Requirements

This is currently blocked (pending this functionality landing in another Federation repository), but this is just a tracking item to make sure we pick it up.

In theory, this should be as simple as updating to a newer version of the crate. Currently, that crate is harmonizer, but depending on whether we de-couple that into distinct "composer" and "query planner" crates (which I personally recommend), it may be slightly more involved (perhaps the interface changes slightly). It is now just a matter of updating router-bridge, which is a dedicated package after apollographql/federation#1090.

Anyhow, I suspect this is a size/small.

Success criteria

Under the hood the requirement is that we ensure we're using the new @apollo/[email protected] npm package, but that's an implementation detail of the crate. It should be verified though!

Studio Agent

Requirements

TBD. An agent which communicates with Studio and acts as the foundation for operation (signature-based), field (shape) and latency (trace) stats to power Apollo Studio functionality.

Relevant Subtasks / Links

These are likely components/candidates of this agent!

  • #442
  • #67
  • #70
  • #127
  • Tracing support (incl. Resolver timing; might be implemented via opentelemetry tracing or #74)

Documentation

Description

This is an overall tracking issue for the documentation and documentation infrastructure that we need to put together before general availability (GA). We can do without this for pre-alpha and for much of our alpha and preview phases, but not for GA.

Components

  • Set up documentation deployment (Gatsby to Netlify)
  • More TBD

Opentelemetry-jaeger scalability

Describe the solution you'd like
When running benchmarks with Jaeger as the trace collector, opentelemetry generates a lot of errors, of two kinds:

  • thrift agent failed with message too long
  • OpenTelemetry trace error occurred. cannot send span to the batch span processor because the channel is full

To understand the errors, here's how the current tracing system works (the original issue includes a diagram of the pipeline here):

So the first error message happens because when the batch is sent, we have no way of knowing if it will be too large for a single UDP packet (the default limit in the crate is 65000 bytes). This can be solved in two ways:

In my tests auto_split works most of the time, except for very large spans that would be larger than a UDP packet when serialized. Is there a way to send a span in multiple pieces and let Jaeger aggregate them?

The second error happens because the jaeger exporter is spending too much time serializing the batch, especially in the build_span_tags function: https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-jaeger/src/exporter/mod.rs#L693-L716 (see https://gist.github.com/Geal/0b9588bdeaa05e1494636e63ef431f96 for example code reproducing our tracing pipeline and a flamegraph; the tallest column is the serialization task). This function could be optimized a bit, but I don't know by how much.

We're bound to run into more issues with a large number of traces, so we should investigate sampling:
https://docs.rs/opentelemetry/0.16.0/opentelemetry/sdk/trace/enum.Sampler.html

It can be set up when creating the pipeline:

use opentelemetry::sdk::trace::{self, Sampler};

let tracer = opentelemetry_jaeger::new_pipeline()
    .with_trace_config(trace::config().with_sampler(Sampler::AlwaysOn))
    .install_batch(opentelemetry::runtime::Tokio)?;
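For heavier sampling of routine traffic, the SDK's built-in ratio sampler could be dropped into the same builder (illustrative; the per-outcome sampling described below would still need a custom ShouldSample):

// e.g. keep roughly 10% of traces instead of all of them:
.with_sampler(Sampler::TraceIdRatioBased(0.1))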

Sampling can be controlled by a custom implementation of ShouldSample. Unfortunately, the sampling decision is made before we go through the span, probably to avoid collecting too much data. I'd like to have heavy sampling for successful queries (since they would be the most frequent ones), but no sampling at all for failing queries (we want to know about those). That might require patching opentelemetry.

Variable types are not validated

Describe the bug
When variables are passed in, their types should be validated against the schema, and the entire request rejected if they are incorrect.

Requires schema parsing to complete.
