
Quiver is a distributed graph learning library for PyTorch Geometric (PyG). The goal of Quiver is to make distributed graph learning easy to use and high-performance.


Why Quiver?


The primary motivation for this project is to make it easy to take a PyG program and scale it across many GPUs and CPUs. In a typical scenario, users develop graph learning programs with PyG's easy-to-use APIs and rely on Quiver to run those programs at large scale. To make such scaling effective, Quiver has several novel features:

  • High performance: Quiver enables GPUs to be used effectively for the performance-critical graph learning tasks: graph sampling, feature collection and data-parallel training. As a result, Quiver often significantly outperforms PyG and DGL even on a single GPU (see the benchmark results below), especially when processing large-scale datasets and models.

  • High scalability: Quiver can achieve linear, and in some cases super-linear, scalability in distributed graph learning. This comes from Quiver's novel adaptive data/feature/processor management techniques and its effective use of fast networking technologies (e.g., NVLink and RDMA).

  • Easy to use: To use Quiver, developers only need to add a few lines of code to existing PyG programs. Quiver is thus easy for PyG users to adopt and to deploy in production clusters.

The chart below shows a benchmark that compares the performance of Quiver, PyG (2.0.1) and DGL (0.7.0) on a 4-GPU server running the Open Graph Benchmark.

[Figure: e2e_benchmark (end-to-end benchmark chart)]

We will add multi-node results soon.

For system design details, see Quiver's design overview (Chinese version: 设计简介).

Install


Pip Install

To install Quiver:

  1. Install PyTorch
  2. Install PyG
  3. Install the Quiver pip package
$ pip install torch-quiver

We have tested Quiver with the following setup:

  • OS: Ubuntu 18.04, Ubuntu 20.04
  • CUDA: 10.2, 11.1
  • GPU: P100, V100, Titan X, A6000

Test Install

You can download Quiver's examples to test installation:

$ git clone git@github.com:quiver-team/torch-quiver.git && cd torch-quiver
$ python3 examples/pyg/reddit_quiver.py

A successful run should contain the following line:

Epoch xx, Loss: xx.yy, Approx. Train: xx.yy
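
To check a run automatically, one option is to scan the script's output for that line. The sketch below assumes the `xx` placeholders stand for an integer epoch and decimal numbers; the regular expression and the helper name `find_epoch_lines` are illustrative and not part of Quiver. The sample text in the usage section is made up to match the pattern and is not real Quiver output.

```python
import re

# Assumed shape of the success line; the exact number formats are a guess.
EPOCH_LINE = re.compile(
    r"Epoch\s+(\d+),\s+Loss:\s+(\d+(?:\.\d+)?),\s+Approx\.\s+Train:\s+(\d+(?:\.\d+)?)"
)


def find_epoch_lines(output: str):
    """Return (epoch, loss, train_accuracy) tuples for every matching line."""
    results = []
    for line in output.splitlines():
        match = EPOCH_LINE.search(line)
        if match:
            epoch, loss, train_acc = match.groups()
            results.append((int(epoch), float(loss), float(train_acc)))
    return results


if __name__ == "__main__":
    # Hypothetical sample text shaped like the pattern above.
    sample = "Epoch 01, Loss: 0.52, Approx. Train: 0.91\nsome other log line"
    found = find_epoch_lines(sample)
    print("install looks OK" if found else "no epoch line found")
```

In practice the `output` string would come from capturing the standard output of `python3 examples/pyg/reddit_quiver.py`.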

Install from source

To build Quiver from source:

$ git clone git@github.com:quiver-team/torch-quiver.git && cd torch-quiver
$ sh ./install.sh

Use Quiver with Docker

Docker is the simplest way to use Quiver. Check the guide for details.

Quick Start

To use Quiver, you need to replace PyG's graph sampler and feature collector with quiver.Sampler and quiver.Feature. The replacement usually requires only a few changes in existing PyG programs.

Use Quiver in Single-GPU PyG Scripts

Only three steps are required to enable Quiver in a single-GPU PyG script:

import torch
import quiver

...

## Step 1: Replace PyG graph sampler
# train_loader = NeighborSampler(data.edge_index, ...) # Comment out PyG sampler
train_loader = torch.utils.data.DataLoader(train_idx) # Quiver: PyTorch Dataloader
quiver_sampler = quiver.pyg.GraphSageSampler(quiver.CSRTopo(data.edge_index), sizes=[25, 10]) # Quiver: Graph sampler

...

## Step 2: Replace PyG feature collectors
# feature = data.x.to(device) # Comment out PyG feature collector
quiver_feature = quiver.Feature(rank=0, device_list=[0]).from_cpu_tensor(data.x) # Quiver: Feature collector

...
  
## Step 3: Train PyG models with Quiver
# for batch_size, n_id, adjs in train_loader: # Comment out PyG training loop
for seeds in train_loader: # Use PyTorch training loop in Quiver
  n_id, batch_size, adjs = quiver_sampler.sample(seeds)  # Use Quiver graph sampler
  batch_feature = quiver_feature[n_id]  # Use Quiver feature collector
  ...
...
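
To make the roles of `quiver.CSRTopo` and `sizes=[25, 10]` more concrete, here is a conceptual sketch of layer-wise neighbour sampling over a CSR graph in plain Python. It is not Quiver's implementation (which runs on GPUs); the helpers `build_csr` and `sample_layers` are hypothetical names, and details such as the ordering of the returned `adjs` may differ from what Quiver and PyG return.

```python
import random


def build_csr(edge_index, num_nodes):
    """Build CSR arrays (indptr, indices) from a [src_list, dst_list] pair."""
    src, dst = edge_index
    counts = [0] * num_nodes
    for s in src:
        counts[s] += 1
    indptr = [0]
    for c in counts:
        indptr.append(indptr[-1] + c)
    indices = [0] * len(src)
    cursor = indptr[:-1].copy()  # next free slot for each source node
    for s, d in zip(src, dst):
        indices[cursor[s]] = d
        cursor[s] += 1
    return indptr, indices


def sample_layers(indptr, indices, seeds, sizes, rng):
    """Sample up to sizes[i] neighbours per node for hop i, starting from seeds."""
    n_id = list(seeds)  # every node touched so far, seeds first
    seen = set(n_id)
    adjs = []  # one list of (node, neighbour) edges per hop, in hop order
    for fanout in sizes:
        edges = []
        for node in list(n_id):  # snapshot: expand all nodes collected so far
            neighbours = indices[indptr[node]:indptr[node + 1]]
            if len(neighbours) > fanout:
                neighbours = rng.sample(neighbours, fanout)
            for nb in neighbours:
                edges.append((node, nb))
                if nb not in seen:
                    seen.add(nb)
                    n_id.append(nb)
        adjs.append(edges)
    return n_id, len(seeds), adjs


if __name__ == "__main__":
    # Tiny made-up graph with 4 nodes and 6 directed edges.
    edge_index = [[0, 0, 1, 2, 2, 3], [1, 2, 3, 0, 3, 0]]
    indptr, indices = build_csr(edge_index, num_nodes=4)
    n_id, batch_size, adjs = sample_layers(
        indptr, indices, seeds=[0], sizes=[2, 1], rng=random.Random(0)
    )
    print(n_id, batch_size, len(adjs))
```

The returned `n_id` lists the seed nodes first, so the first `batch_size` rows of the gathered features correspond to the seeds, which is the same convention the training loop above relies on.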

Use Quiver in Multi-GPU PyG Scripts

To use Quiver in multi-GPU PyG scripts, we can simply pass quiver.Feature and quiver.Sampler as arguments to the child processes launched in PyTorch's DDP training, as shown below:

import torch.multiprocessing as mp
import quiver

# PyG DDP function that trains GNN models
def ddp_train(rank, feature, sampler):
  ...

# Replace PyG graph sampler and feature collector with Quiver's alternatives
quiver_sampler = quiver.pyg.GraphSageSampler(...)
quiver_feature = quiver.Feature(...)

mp.spawn(
      ddp_train, 
      args=(quiver_feature, quiver_sampler), # Pass Quiver components as arguments
      nprocs=world_size,
      join=True
  )

A full multi-GPU Quiver example is here.
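
In data-parallel training, each process launched by `mp.spawn` usually works on its own slice of the training nodes. The sketch below shows one simple way to split seed node IDs by rank in plain Python. `shard_seeds` is a hypothetical helper, not a Quiver or PyG function, and the full example may split `train_idx` differently.

```python
def shard_seeds(train_idx, rank, world_size):
    """Return the contiguous slice of train_idx owned by `rank`.

    Slices differ in length by at most one element, so every seed is
    assigned to exactly one rank.
    """
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    base, extra = divmod(len(train_idx), world_size)
    # The first `extra` ranks each take one additional element.
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return train_idx[start:end]


if __name__ == "__main__":
    train_idx = list(range(10))  # made-up seed node IDs
    for rank in range(4):
        print(rank, shard_seeds(train_idx, rank, world_size=4))
```

Inside `ddp_train`, the per-rank slice would then be wrapped in a `torch.utils.data.DataLoader` and fed to the sampler, as in the single-GPU loop.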

Run Quiver

Below is an example command that runs the Quiver example script examples/pyg/reddit_quiver.py:

$ python3 examples/pyg/reddit_quiver.py

Quiver has the same launch command on both single-GPU servers and multi-GPU servers. We will provide multi-node examples soon.

Examples

We provide rich examples to show how to enable Quiver in real-world PyG scripts:

Documentation

Quiver provides many parameters to optimise the performance of its graph samplers (e.g., GPU-local or CPU-GPU hybrid) and feature collectors (e.g., feature replication/sharding strategies). Check Documentation for details.
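
As a rough illustration of the trade-off between replicating and sharding features, the toy sketch below stores feature rows either on every device or on exactly one device, and counts how many rows a lookup must fetch from a remote device. `ToyFeatureStore`, its `policy` argument and the round-robin placement are assumptions made for this example; they are not Quiver's API or its actual placement strategy.

```python
class ToyFeatureStore:
    """Holds feature rows replicated on every device or sharded across them."""

    def __init__(self, features, num_devices, policy="shard"):
        if policy not in ("shard", "replicate"):
            raise ValueError("policy must be 'shard' or 'replicate'")
        self.policy = policy
        self.num_devices = num_devices
        # One dict per simulated device: node id -> feature row.
        self.devices = [dict() for _ in range(num_devices)]
        for node_id, row in enumerate(features):
            if policy == "replicate":
                for dev in self.devices:
                    dev[node_id] = row
            else:
                # Round-robin placement (an assumption for this sketch).
                self.devices[node_id % num_devices][node_id] = row

    def owner(self, node_id, local_device):
        """Device that serves node_id for a reader on local_device."""
        if self.policy == "replicate":
            return local_device
        return node_id % self.num_devices

    def gather(self, n_id, local_device):
        """Return (rows, remote_count) for the requested node ids."""
        rows, remote = [], 0
        for node_id in n_id:
            dev = self.owner(node_id, local_device)
            if dev != local_device:
                remote += 1
            rows.append(self.devices[dev][node_id])
        return rows, remote


if __name__ == "__main__":
    features = [[0.0], [1.0], [2.0], [3.0]]  # made-up feature rows
    for policy in ("shard", "replicate"):
        store = ToyFeatureStore(features, num_devices=2, policy=policy)
        rows, remote = store.gather([0, 1, 3], local_device=0)
        print(policy, rows, "remote fetches:", remote)
```

Replication avoids remote fetches at the cost of storing every row on every device, while sharding fits larger feature tables but pays for cross-device access.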

Community

We welcome contributors to join the development of Quiver. Quiver is currently maintained by researchers from the University of Edinburgh, Imperial College London, Tsinghua University and the University of Waterloo. The development of Quiver has received support from Alibaba and Lambda Labs.
