TL;DR I see a path where CCN’s ?
could be leveraged by smart clients to safely expose the true resolver-nullability of fields directly to product code.
Prelude: CCN Behavior Definition
Since Client Controlled Nullability (CCN) may have a different meaning to different people, I’ll start by specifying my hope for how CCN will work. The rest of this post assumes this behavior:
Under CCN, the !
and ?
annotations would allow the query to override the schema nullability of a field within a selection for the purposes of the execution of that selection.
!
means: Treat this field, in this selection, as if it were non-nullable
?
means: Treat this field, in this selection, as nullable
In other words, for the purposes of executing a query selection, every place the spec refers a field’s schema nullability, it would instead refer to the field’s nullability within the selection, which may or may not have been modified by CCN annotations in the query. Beyond that, all error handling and null bubbling behaviors of the current spec would be unchanged. Note that this includes the fact that errors thrown by a ?
field would still be included in the response errors metadata.
Client-defined resiliency
GraphQL’s current recommended approach to providing response resiliency in the face of resolver errors is to make fields in the schema nullable by default. Unfortunately, this has the effect of obscuring the true nullability of fields. Clients, and even users, can’t tell from the schema alone if null is expected as a possible value, or if the field will only return null in exceptional (error) cases.
In this world of nullable-by-default schemas, the Client Controlled Nullability (CCN) proposal is primarily a tool to add assertions via !
. While this is a marked ergonomic improvement, these assertions must be added blindly, without knowing if null is an expected value or not. This is at best awkward and at worst dangerous.
However, CCN’s ?
, opens up the up the possibility of a different mechanism to achieve request resiliency. One which avoids obscuring the true nullability of fields. Specifically, an approach where we shift expectation from “it is the server/schema’s responsibility to make requests resilient to errors by typing fields as nullable” to “it is the client/queries responsibility to make requests resilient to errors by annotating non-nullable fields with ?
”.
With this approach to resiliency, the schema could specify the “true” nullability of the fields.
For simplistic clients, e.g. Curl, the client/user can now see the true nullability of each field in the schema and add the appropriate amount of resilience for their use case using CCN’s ?
. In a sense this is the same as CCN’s !
applied to a fully nullable schema, in that the client is empowered to declare which fields it can manage without, and which fields it requires. We’ve “simply” inverted the default. Of course, defaults are tremendously powerful and this tradeoff should be considered carefully. See “The power of defaults” below.
The opportunity for smart clients
For smart clients, this approach can not only let users “see” the true nullability, it can actually let product code interacted with generate types that model this true nullability. I see this as a fundamental solution of the actual problem that CCN initially set out to solve.
If smart clients can transform errored fields into contained thrown exceptions, that would mean product code should never encounter a null value due to a resolver error. In that case, the types that the smart client generates for its fragments/queries could safely express the true nullability of those fields on the server.
This is something that we are currently, actively exploring for Relay. I’d encourage you to read the linked issue, but in short rather than containing errors with null bubbling, we contain errors with error boundaries. For cases where the user wants to imperatively handle the error case, they may add a @catch
directive to the field which behaves very similarly to CCNs
?` and would hopefully some day be subsumed by it.
Note that compiler-based smart clients like Relay transform the queries/fragments defined by the user before sending them to the server. This means Relay can auto-insert ?
s on all non-nullable fields, ensuring resiliency is the default behavior and we will always render as much of the UI as possible, given the data that the server was able to send.
So, Relay would use ?
in two different was:
- As a hidden implementation detail used to ask the server to not apply null bubbling
- As a user-facing feature to allow components to locally handle errors instead of relying on error boundaries
A pattern not a feature
One appealing aspect of this vision is that it’s simply composed from existing, or at least proposed, GraphQL spec primitives. It does not require any additional spec changes, and can be optionally adopted by those who find it a good tradeoff.
Appendix/Caveats
This solution is not a silver bullet. It may not be viable for other clients, and even for Relay there are significant challenges that would need to be solved first. I propose it here more as a long-term vision than as an immediate next step. Here are list of concerns/caveats/complicating factors:
Missing Data
In Relay, there are actually two reasons that we type all fields as optional:
- The field might return null due to error
- The field might be missing due to normalization
To make Relay fields non-nullable by default, we’ll need to first provide a mechanism for Relay to handle refetching (or erroring) in the face of missing data. I believe missing data is a fundamental gap in Relay today and is deserving of a project to resolve that gap.
The power of defaults
Shifting responsibility from the server to the client makes it harder to enforce this best practice of resiliency. Opinionated smart client frameworks may be able take over the role of enforcing resiliency by auto-inserting ?
s, but the story for simplistic clients is less clear.
Users will instinctively take the path of least resistance. If adding resiliency is extra work that is not forced upon them by the server or a client framework, it is likely that client code will tend not to go the extra mile to handle potential errors.
Error boundaries
This approach is dependent upon having a client architecture that allows product code to contain errors thrown during render. React Error Boundaries provide this primitive, but client architectures without such a feature may not have a clear path to adding explicit error handling, which is a necessary ingredient for this approach to work.
Even in Relay, explicit error handling has not yet been validated, though we hope to ship it to production soon.
Breaking changes
Another reason that GraphQL recommends that all fields be nullable, even if their current implementation is non-nullable, is that it allows us to turn a non-nullable field into a nullable field as a non-breaking change. This is especially important on mobile where clients live essentially forever. Being able to make a field nullable can be key to being able to delete code.
I don’t have a solution to this problem, but I am curious to learn how well it works in practice. Have users of this approach actually be able to routinely make fields nullable without breaking old clients? Are product engineers really designing apps that gracefully degrade in the face of any field being null? The convergent evolution of @required
and CCN’s !
makes me wonder.
Worst case, the approach I outline here would only be viable for clients with a finite support window.
Alternatives to CCN
Our use of CCN to enable this new model, is more opportunistic than designed. CCN offers primitives that smart clients can leverage behind the scenes as a compiler implementation detail. The core behavior we really want is:
- A schema that exposes the true nullability of fields, at the same time as…
- An execution model that performs no null bubbling
This works because we can expect the smart client to intercept error fields before they reach product code, shielding it from nulls in non-nullable locations.
If we think this model is broadly valuable, it’s possible we would want to explore a more explicit mechanism to enable this execution model rather than simply allowing smart clients to fake this execution model via compiler-inserted CCN annotations.