graphql-crunch

Optimizes JSON responses by minimizing duplication and improving compressibility.

On Banter.fm, we see a 76% reduction in raw JSON size and a 30% reduction in gzip'd size. This leads to reduced transfer time and faster JSON parsing on mobile.

Client support

graphql-crunch is client agnostic and can be used anywhere that sends or receives JSON. We provide examples for integration with apollo-client as we use this in a GraphQL environment.

Installation

This library is distributed on npm. In order to add it as a dependency, run the following command:

$ npm install graphql-crunch --save

or with Yarn:

$ yarn add graphql-crunch

How does it work?

We flatten the object hierarchy into an array using a post-order traversal of the object graph. As we traverse, we efficiently check whether we've come across a value before, including arrays and objects, and if so replace it with a reference to its earlier occurrence. Each value is present in the array only once.

Note: Crunching and uncrunching is an entirely lossless process. The final payload exactly matches the original.
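The traversal described above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the library's actual implementation: the name sketchCrunch is hypothetical, and duplicate detection here uses a canonical JSON string as a lookup key, which is simple rather than efficient.

```javascript
// Illustrative sketch only; `sketchCrunch` is a hypothetical name, not the library's API.
function sketchCrunch(root) {
  const values = [];       // flattened output: every distinct value appears once
  const seen = new Map();  // canonical JSON of an encoded value -> its index in `values`

  function visit(node) {
    let encoded;
    if (Array.isArray(node)) {
      // Post-order: children are visited (and stored) before their parent.
      encoded = node.map((child) => visit(child));
    } else if (node !== null && typeof node === "object") {
      encoded = {};
      // Sorting keys makes equal objects produce identical canonical strings.
      for (const key of Object.keys(node).sort()) {
        encoded[key] = visit(node[key]);
      }
    } else {
      encoded = node; // string, number, boolean, or null
    }

    const canonical = JSON.stringify(encoded);
    if (!seen.has(canonical)) {
      seen.set(canonical, values.length);
      values.push(encoded);
    }
    return seen.get(canonical); // the parent stores this index as a reference
  }

  visit(root);
  return values;
}

const crunched = sketchCrunch({
  people: [
    { name: "Luke", gender: "male" },
    { name: "Luke", gender: "male" },
  ],
});
console.log(JSON.stringify(crunched));
// ["male","Luke",{"gender":0,"name":1},[2,2],{"people":3}]
```

The duplicated person object is stored once at index 2, and the array simply references that index twice.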

Motivation

Large JSON blobs can be slow to parse on some mobile platforms, especially older Android phones, so we set out to improve that. Along the way, we also made the payloads more amenable to gzip compression. GraphQL and RESTful API responses tend to contain a lot of duplication, which leads to huge payload sizes.

Example

In these examples, we use the SWAPI GraphQL demo.

Small Example

Using this query we'll fetch the first 2 people and their first 2 films and the first 2 characters in each of those films. We limit the connections to the first two items to keep the payload small:

{
  allPeople(first: 2) {
    people {
      name
      gender
      filmConnection(first: 2) {
        films {
          title
          characterConnection(first: 2) {
            characters {
              name
              gender
            }
          }
        }
      }
    }
  }
}

We get this response:

{
  "data": {
    "allPeople": {
      "people": [
        {
          "name": "Luke Skywalker",
          "gender": "male",
          "filmConnection": {
            "films": [
              {
                "title": "A New Hope",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              },
              {
                "title": "The Empire Strikes Back",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "name": "C-3PO",
          "gender": "n/a",
          "filmConnection": {
            "films": [
              {
                "title": "A New Hope",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              },
              {
                "title": "The Empire Strikes Back",
                "characterConnection": {
                  "characters": [
                    {
                      "name": "Luke Skywalker",
                      "gender": "male"
                    },
                    {
                      "name": "C-3PO",
                      "gender": "n/a"
                    }
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
}

After we crunch it, we get:

{
  "data": [
    "male",
    "Luke Skywalker",
    { "gender": 0, "name": 1 },
    "n/a",
    "C-3PO",
    { "gender": 3, "name": 4 },
    [2, 5],
    { "characters": 6 },
    "A New Hope",
    { "characterConnection": 7, "title": 8 },
    "The Empire Strikes Back",
    { "characterConnection": 7, "title": 10 },
    [9, 11],
    { "films": 12 },
    { "filmConnection": 13, "gender": 0, "name": 1 },
    { "filmConnection": 13, "gender": 3, "name": 4 },
    [14, 15],
    { "people": 16 },
    { "allPeople": 17 }
  ]
}

The transformed payload is substantially smaller. After converting both payloads to JSON (with formatting removed), the transformed payload is 49% smaller in bytes.

When the client receives this, we simply uncrunch it and get back the exact original version for the client to handle.
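To make the reverse step concrete, here is a minimal sketch of how a payload laid out like the crunched example above could be expanded. It assumes that every array or object entry holds only indices of earlier entries and that the last entry is the root; sketchUncrunch is a hypothetical name, and the library's own uncrunch may work differently.

```javascript
// Illustrative sketch only; assumes the index-based layout shown in the example above.
function sketchUncrunch(values) {
  const restored = []; // restored[i] is the fully expanded form of values[i]

  for (const entry of values) {
    if (Array.isArray(entry)) {
      // Every element is an index of an earlier, already restored entry.
      restored.push(entry.map((index) => restored[index]));
    } else if (entry !== null && typeof entry === "object") {
      const object = {};
      for (const [key, index] of Object.entries(entry)) {
        object[key] = restored[index];
      }
      restored.push(object);
    } else {
      restored.push(entry); // primitives are stored as-is
    }
  }

  // The root of the original payload is the last entry written by the post-order walk.
  return restored[restored.length - 1];
}

// The first eight entries of the crunched example above.
const restored = sketchUncrunch([
  "male",
  "Luke Skywalker",
  { gender: 0, name: 1 },
  "n/a",
  "C-3PO",
  { gender: 3, name: 4 },
  [2, 5],
  { characters: 6 },
]);
console.log(JSON.stringify(restored));
// {"characters":[{"gender":"male","name":"Luke Skywalker"},{"gender":"n/a","name":"C-3PO"}]}
```

Note that in this sketch repeated values share a single object instance after expansion, so callers that mutate the result would need to copy it first.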

Large Example

In real-world scenarios, we'll have modularized our schema with fragments and have connections that contain more than two items. Here's a query similar to the one above, except we don't limit the size of the connections and we request a standard set of selections on Person objects.

{
  allPeople {
    people {
      ...PersonFragment
      filmConnection {
        films {
          ...FilmFragment
        }
      }
    }
  }
}

fragment PersonFragment on Person {
  name
  birthYear
  eyeColor
  gender
  hairColor
  height
  mass
  skinColor
  homeworld {
    name
    population
  }
}

fragment FilmFragment on Film {
  title
  characterConnection {
    characters {
      ...PersonFragment
    }
  }
}

The resulting response from this query is roughly 1MB of JSON (989,946 bytes), but with tons of duplication. Here is how crunching impacts the payload size:

Metric       Raw        Crunched   Improvement
Size         989,946B   28,220B    97.1%
GZip'd Size  22,240B    5,069B     77.2%

This is an admittedly extreme result, but highlights the potential for crunching payloads with large amounts of duplication.
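The improvement figures in the table follow directly from the byte counts. The sketch below shows the arithmetic and one way to measure a payload's serialized size yourself; improvementPercent and jsonByteLength are hypothetical helper names, not part of graphql-crunch.

```javascript
// Percentage improvement, rounded to one decimal place.
function improvementPercent(rawBytes, crunchedBytes) {
  return Math.round((1 - crunchedBytes / rawBytes) * 1000) / 10;
}

// UTF-8 byte length of a value once serialized as compact JSON.
function jsonByteLength(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

console.log(improvementPercent(989946, 28220)); // 97.1
console.log(improvementPercent(22240, 5069));   // 77.2
```

To reproduce this kind of measurement on your own data, pass the raw and crunched payloads through jsonByteLength and feed the two results to improvementPercent.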

Usage

Server-side

With apollo-server you can supply a custom formatResponse function. We use this to crunch the data field of the response before sending it over the wire.

import { ApolloServer } from "apollo-server";
import { crunch } from "graphql-crunch";

const server = new ApolloServer({
  // schema, context, etc...
  formatResponse: (response) => {
    // Crunch only the data field; errors and extensions are left untouched.
    if (response.data) {
      response.data = crunch(response.data);
    }
    return response;
  },
});

server.listen({ port: 80 });

To maintain compatibility with clients that aren't expecting crunched payloads, we recommend conditioning the crunch on a query param, like so:

import url from "url";
import querystring from "querystring";
import { ApolloServer } from "apollo-server";
import { crunch } from "graphql-crunch";

const server = new ApolloServer({
  // schema, context, etc...
  // Note: this assumes your context function exposes the incoming request as `context.request`.
  formatResponse: (response, options) => {
    const parsed = url.parse(options.context.request.url);
    const query = querystring.parse(parsed.query);

    // Only crunch when the client opted in via the `crunch` query parameter.
    if (query.crunch && response.data) {
      const version = parseInt(query.crunch, 10) || 1;
      response.data = crunch(response.data, version);
    }

    return response;
  },
});

server.listen({ port: 80 });

Now only clients that opt in to crunched payloads via a query parameter such as ?crunch=2 will receive them.

Your client can specify the version of the crunch format to use in the query parameter. If the version isn't specified, or an unknown version is supplied, we default to v1.0.
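The version handling can also be isolated into a small helper that is easy to test on its own. The sketch below mirrors the fallback logic of the example above but uses the standard URL API instead of url.parse; parseCrunchVersion is a hypothetical helper name, and the placeholder base URL exists only so that relative request paths can be parsed.

```javascript
// Hypothetical helper, not part of graphql-crunch.
// Returns null when the client did not opt in, otherwise the requested version (default 1).
function parseCrunchVersion(requestUrl) {
  // The base is a placeholder so relative paths like "/graphql?crunch=2" parse correctly.
  const { searchParams } = new URL(requestUrl, "http://localhost");
  const raw = searchParams.get("crunch");
  if (!raw) {
    return null; // parameter missing or empty: send an uncrunched response
  }
  // Non-numeric or zero values fall back to version 1.
  return parseInt(raw, 10) || 1;
}

console.log(parseCrunchVersion("/graphql?crunch=2"));   // 2
console.log(parseCrunchVersion("/graphql?crunch=abc")); // 1
console.log(parseCrunchVersion("/graphql"));            // null
```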

Client-side

On the client, we uncrunch the server response before the GraphQL client processes it.

With apollo-client, use a link configuration to set up an afterware, e.g.

import { ApolloClient } from 'apollo-client';
import { ApolloLink, concat } from 'apollo-link';
import { HttpLink } from 'apollo-link-http';
import { uncrunch } from 'graphql-crunch';

const http = new HttpLink({
  credentials: 'include',
  uri: '/api'
});

const uncruncher = new ApolloLink((operation, forward) =>
  forward(operation).map((response) => {
    // Restore the original payload before Apollo processes it.
    response.data = uncrunch(response.data);
    return response;
  })
);

const client = new ApolloClient({link: concat(uncruncher, http)});

graphql-crunch's People

Contributors

andrewprins, binchik, dependabot[bot], jamesreggio, jekiwijaya, kachkaev, kpman, lfades, mistereo, rich-harris, stevekrenzel, zephraph

graphql-crunch's Issues

Integrate directly into resolvers

I really like graphql-crunch; great work! Out of curiosity, have you considered extending the "crunching" to go all the way into the resolver runtime itself?

I.e. if I've already called authorResolver.name(a1), just don't call that name for a1 again during the rest of this request.

Granted, this would need to rely on __typename+id semantics for identity, and also an understanding from resolvers that they won't change their output based on "where in the graph the object was fetched", i.e. anything in the 4th/info param of the resolver.

But, assuming that was the case, it seems like this could slice off a whole slew of work that apollo/graphql tools puts into making a giant JSON-with-repeated-info tree that graphql-crunch is going to immediately throw away ~70% of.

Thanks!

Not compiled to es5

graphql-crunch doesn't follow the convention of publishing compiled libraries as es5. This can lead to a site appearing to function normally while bombing outright on Microsoft Edge. :-(

Please follow this convention. It prevents accidents.

How do you use version 2 of api with apollo?

I found that Apollo errors on a { crunched: Object, version: number } response because its parseAndCheckHttpResponse method looks for data and errors keys and finds the keys above instead. To make things worse, we have to make it work with batchedHttpLink, where it crunches across all responses in the batch array. What do you use crunch v2 with?

Further Crunching On Key Level

Hey there!

I noticed that the values are given numerical values, but not keys. There seems to be a lot of space savings on that front too.

What was the reason for not applying the same optimizations to keys? GraphQL properties can only be alphabetical characters, so we know that the keys can never be numbers.

Thanks for open sourcing this!

Wrong response if response in date string

Actual response

{
  "data": {
    "notifications": [
      {
        "id": "16",
        "createdAt": "2019-05-04T17:10:40.509Z"
      }
    ]
  }
}

crunched response:

{
  "data": [
    "16",
    {},
    {
      "id": 0,
      "createdAt": 1
    },
    [
      2
    ]
  ]
}

Somehow the createdAt value is converted into an empty object.

version: 1
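One possible explanation, which the report does not confirm, is that the resolver returned a JavaScript Date object rather than a string. A Date has no enumerable own properties, so any traversal that treats it as a plain object sees it as empty, whereas JSON serialization turns it into an ISO string first. The sketch below illustrates that difference in plain JavaScript; whether graphql-crunch behaves this way internally is an assumption.

```javascript
const payload = { id: "16", createdAt: new Date("2019-05-04T17:10:40.509Z") };

// A Date exposes no enumerable own keys, so it looks like an empty object.
console.log(Object.keys(payload.createdAt).length); // 0

// Round-tripping through JSON converts the Date to its ISO string first.
const plain = JSON.parse(JSON.stringify(payload));
console.log(plain.createdAt); // "2019-05-04T17:10:40.509Z"
```

If this is the cause, converting the response to plain JSON values before crunching would be one way to work around it.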

Hi, I am facing a problem in my Apollo Client config

I am getting the following error:

Error: no value resolved
at Object.complete (chrome-extension://jdkknkkbebbapilgoeccciglkfbmbnfm/dist/devtools.js:1:695191)
at complete (chrome-extension://jdkknkkbebbapilgoeccciglkfbmbnfm/dist/devtools.js:1:376930)
at t.s (chrome-extension://jdkknkkbebbapilgoeccciglkfbmbnfm/dist/devtools.js:1:673115)
at t.n.emit (chrome-extension://jdkknkkbebbapilgoeccciglkfbmbnfm/dist/devtools.js:1:307830)
at chrome-extension://jdkknkkbebbapilgoeccciglkfbmbnfm/dist/devtools.js:1:381756

please suggest.

Thanks & Regards
gmchaturvedi

Unexpected efficiency results with crunching

Hi, I tried graphql-crunch on our largest json response: a newsfeed with posts. Common repeating elements are the posting user and associated groups.

Version 1 crunch

  • Seems to extract the common elements properly
  • The response file size seems actually larger than without crunching

Version 2 crunch

  • It extracts common elements at a lower level when extracting at a higher level would be valid and more efficient.
  • E.g. a postUser contains a user and an optional impersonated group. Instead of referencing the post user as a whole, it only references the user and impersonated group, repeating the post user structure everywhere.

I do like the fact that v2 is much more readable.

I've attached example files.

posts+crunch.zip

[Feature-Request][How-To] Use with Apollo-Server-Express

Is it or would it be possible to use with Apollo-Server-Express? I have it setup as follows

Server:

const server = new ApolloServer({
  typeDefs,
  resolvers,
  context: ({ req, res }) => ({
    models,
    user: req.user,
    req,
    res
  }),
  introspection: process.env.NODE_ENV === 'production' ? false : true,
  onHealthCheck: async () => {
    try {
      const result = await onHealthCheck(req, res)
      res.json({ status: 'pass', ...(result || {}) })
    } catch (err) {
      res.status(503).json({ status: 'fail' });
    }
  },
  formatResponse: (response) => {
    if (response.data) {
      response.data = crunch(response.data);
      return response;
    }
  }
});

Client:

const middlewareLink = new ApolloLink((operation, forward) => {
  operation.setContext({
    headers: {
      accessToken: localStorage.getItem('x-access-token') || null,
      refreshToken: localStorage.getItem('x-refresh-token') || null
    }
  });
  // return forward(operation);
  return forward(operation).map((response) => {return uncrunch(response.data)});
});

But am getting the following error message when I attempt to query data
{"errors":[{"message":"crunch is not defined","extensions":{"code":"INTERNAL_SERVER_ERROR","exception":{"stacktrace":["ReferenceError: crunch is not defined"," at Object.formatResponse (D:\\Repos\\DBI\\graphql-apollo\\src\\server.js:71:7)"," at Object.<anonymous> (D:\\Repos\\DBI\\graphql-apollo\\node_modules\\apollo-server-core\\dist\\requestPipeline.js:193:50)"," at Generator.next (<anonymous>)"," at fulfilled (D:\\Repos\\DBI\\graphql-apollo\\node_modules\\apollo-server-core\\dist\\requestPipeline.js:5:58)"," at processTicksAndRejections (internal/process/task_queues.js:97:5)"]}}}]}

Integrate a chrome extension for previews

This is more of a suggestion, but it would be really nice to have a preview of the API response when using this with Apollo GraphQL. It would make debugging way easier.

Explanation of the v2 `encode` function

Hey!
I have most of my (GraphQL) servers written in Rust and wanted to port the algorithm used here to make it compatible with the JS world.

However, I do not really understand the need of the encode function when it comes down to numbers. Why are they multiplied by 2 (and sometimes incremented by 1)?
Additionally, wouldn't multiplying the value by 2 decrease the available number space to half its normal size? Meaning that servers can't use the whole number range?

formatResponse(response, options) field options.context.request is undefined

I want to access the request URL inside formatResponse handler.

const server = new ApolloServer({
  // schema, context, etc...
  formatResponse: (response, options) => {
    const parsed = url.parse(options.context.request.url);
    const query = querystring.parse(parsed.query);

    if(query.crunch && response.data) {
      const version = parseInt(query.crunch) || 1;
      response.data = crunch(response.data, version);
    }

    return response;
  },
});

But the request field is undefined in const parsed = url.parse(options.context.request.url);

Even TypeScript complains about it.

I know it's related to apollo-server, but since this is an example provided in graphql-crunch, I think there may be something wrong with the example or with apollo-server.
