mbostock / arquero

This project forked from uwdata/arquero

Query processing and transformation of array-backed data tables.

Home Page: https://uwdata.github.io/arquero

License: BSD 3-Clause "New" or "Revised" License

Arquero

Arquero is a JavaScript library for query processing and transformation of array-backed data tables. Following the relational algebra and inspired by the design of dplyr, Arquero provides a fluent API for manipulating column-oriented data frames. Arquero supports a range of data transformation tasks, including filter, sample, aggregation, window, join, and reshaping operations.

  • Fast: process data tables with million+ rows.
  • Flexible: query over arrays, typed arrays, array-like objects, or Apache Arrow columns.
  • Full-Featured: perform a variety of wrangling and analysis tasks.
  • Extensible: add new column types or functions, including aggregate & window operations.
  • Lightweight: small size, minimal dependencies.

To get up and running, start with the Introducing Arquero tutorial, part of the Arquero notebook collection.

Arquero is Spanish for "archer": if datasets are arrows, Arquero helps their aim stay true. 🏹 Arquero also refers to a goalkeeper: safeguard your data from analytic "own goals"! 🥅 ✋ ⚽

API Documentation

  • Top-Level API - All methods in the top-level Arquero namespace.
  • Table - Table access and output methods.
  • Verbs - Table transformation verbs.
  • Op Functions - All functions, including aggregate and window functions.
  • Expressions - Parsing and generation of table expressions.
  • Extensibility - Extend Arquero with new expression functions or table verbs.

Example

The core abstractions in Arquero are data tables, which model each column as an array of values, and verbs that transform data and return new tables. Verbs are table methods, allowing method chaining for multi-step transformations. Though each table is unique, many verbs reuse the underlying columns to limit duplication.
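To make the column-oriented model concrete, here is a minimal plain-JavaScript sketch of the idea. It is not Arquero's implementation or API: the `select` and `derive` helpers and the object-of-arrays "table" are hypothetical stand-ins, written only to show how a verb can return a new table while sharing unchanged column arrays with its input.

```javascript
// Illustrative only: a "table" here is a plain object of equal-length column arrays.
// The helpers `select` and `derive` are hypothetical stand-ins, not Arquero's API.

// Return a new table containing only the named columns.
// The column arrays are shared with the input rather than copied.
function select(columns, names) {
  const out = {};
  for (const name of names) out[name] = columns[name];
  return out;
}

// Return a new table with one extra column computed row by row.
// Existing columns are reused by reference; only the new column is allocated.
function derive(columns, name, fn) {
  const names = Object.keys(columns);
  const length = names.length ? columns[names[0]].length : 0;
  const values = new Array(length);
  for (let i = 0; i < length; ++i) {
    const row = {};
    for (const key of names) row[key] = columns[key][i];
    values[i] = fn(row);
  }
  return { ...columns, [name]: values };
}

const base = { a: [1, 2, 3], b: [10, 20, 30] };
const result = select(derive(base, 'sum', d => d.a + d.b), ['a', 'sum']);

console.log(result.sum);          // [ 11, 22, 33 ]
console.log(result.a === base.a); // true: the column array is shared
console.log('sum' in base);       // false: the input table is unchanged
```

Because each helper returns a new object, calls can be nested or chained without mutating earlier results, which is the property that makes multi-step pipelines safe to compose.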

import { all, desc, op, table } from 'arquero';

// Average hours of sunshine per month, from https://usclimatedata.com/.
const dt = table({
  'Seattle': [69,108,178,207,253,268,312,281,221,142,72,52],
  'Chicago': [135,136,187,215,281,311,318,283,226,193,113,106],
  'San Francisco': [165,182,251,281,314,330,300,272,267,243,189,156]
});

// Sorted differences between Seattle and Chicago.
// Table expressions use arrow function syntax.
dt.derive({
    month: d => op.row_number(),
    diff:  d => d.Seattle - d.Chicago
  })
  .select('month', 'diff')
  .orderby(desc('diff'))
  .print();

// Is Seattle more correlated with San Francisco or Chicago?
// Operations accept column name strings outside a function context.
dt.rollup({
    corr_sf:  op.corr('Seattle', 'San Francisco'),
    corr_chi: op.corr('Seattle', 'Chicago')
  })
  .print();

// Aggregate statistics per city, as output objects.
// Reshape (fold) the data to a two column layout: city, sun.
dt.fold(all(), { as: ['city', 'sun'] })
  .groupby('city')
  .rollup({
    min:  d => op.min(d.sun), // functional form of op.min('sun')
    max:  d => op.max(d.sun),
    avg:  d => op.average(d.sun),
    med:  d => op.median(d.sun),
    // functional forms permit flexible table expressions
    skew: ({sun: s}) => (op.mean(s) - op.median(s)) / op.stdev(s) || 0
  })
  .objects();
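The fold step in the last pipeline above may be the least familiar: it reshapes the wide table (one column per city) into a long one with a `city` column and a `sun` column, so that grouping by city becomes possible. The sketch below reproduces that reshaping and a few of the per-city statistics in plain JavaScript, without Arquero. The helpers `foldColumns` and `summarize` are hypothetical names written for this illustration, and the row order they produce is not meant to match the order Arquero emits.

```javascript
// Plain-JavaScript illustration of a wide-to-long "fold"; it does not use Arquero.
// `foldColumns` and `summarize` are hypothetical helpers written for this sketch.
const sunshine = {
  'Seattle': [69, 108, 178, 207, 253, 268, 312, 281, 221, 142, 72, 52],
  'Chicago': [135, 136, 187, 215, 281, 311, 318, 283, 226, 193, 113, 106],
  'San Francisco': [165, 182, 251, 281, 314, 330, 300, 272, 267, 243, 189, 156]
};

// Emit one { [keyName], [valueName] } row per (column, value) pair.
function foldColumns(columns, [keyName, valueName]) {
  const rows = [];
  for (const [name, values] of Object.entries(columns)) {
    for (const value of values) {
      rows.push({ [keyName]: name, [valueName]: value });
    }
  }
  return rows;
}

// Group the long rows by key and compute min, max, and mean per group.
function summarize(rows, keyName, valueName) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[keyName];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row[valueName]);
  }
  return Array.from(groups, ([key, values]) => ({
    [keyName]: key,
    min: Math.min(...values),
    max: Math.max(...values),
    avg: values.reduce((a, b) => a + b, 0) / values.length
  }));
}

const longRows = foldColumns(sunshine, ['city', 'sun']);
const stats = summarize(longRows, 'city', 'sun');

console.log(longRows.length); // 36 rows: 3 cities x 12 months
console.log(stats[0]);        // { city: 'Seattle', min: 52, max: 312, avg: 180.25 }
```

The long layout is what lets a single grouped aggregation cover every city, instead of writing one aggregate expression per column.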

Usage

In Browser

To use in the browser, you can load Arquero from a content delivery network:

<script src="https://cdn.jsdelivr.net/npm/arquero@latest"></script>

Arquero will be imported into the aq global object.

Alternatively, you can build and import arquero.min.js from the build directory, or build your own application package.

Arquero uses modern JavaScript features, and so will not work with some outdated browsers. To use Arquero with older browsers including Internet Explorer, set up your project with a transpiler such as Babel.

In Node.js or Application Bundles

First install arquero as a dependency, via npm install arquero --save or yarn add arquero. Arquero assumes Node version 12 or higher.

Import using CommonJS module syntax:

const aq = require('arquero');

Import using ES module syntax, collecting all exports into a single object:

import * as aq from 'arquero';

Import using ES module syntax, with targeted imports:

import { op, table } from 'arquero';

Build Instructions

To build and develop Arquero locally:

Contributors

dworthen, jheer, john-guerra, mbostock, natoverse, pgte, t829702
