gdn0101 / bend

This project forked from higherorderco/bend

A massively parallel, high-level programming language

Home Page: https://higherorderco.com

License: Apache License 2.0

Rust 99.84% Just 0.16%

Bend

Bend is a massively parallel, high-level programming language.

Unlike low-level alternatives like CUDA and Metal, Bend has the feeling and features of expressive languages like Python and Haskell, including fast object allocations, higher-order functions with full closure support, unrestricted recursion, and even continuations. Yet it runs on massively parallel hardware like GPUs, with near-linear speedup based on core count, and zero explicit parallel annotations: no thread spawning, no locks, mutexes, or atomics. Bend is powered by the HVM2 runtime.

A Quick Demo

Bend live demo

Using Bend

Bend does not currently work on Windows; please use WSL2 as a workaround.

First, install Rust nightly. Then, install both HVM2 and Bend with:

cargo +nightly install hvm
cargo +nightly install bend-lang

Finally, write a Bend file and run it with one of these commands:

bend run    <file.bend> # uses the Rust interpreter (sequential)
bend run-c  <file.bend> # uses the C interpreter (parallel)
bend run-cu <file.bend> # uses the CUDA interpreter (massively parallel)

You can also compile Bend to standalone C/CUDA files with gen-c and gen-cu, for maximum performance. But keep in mind that our code gen is still in its infancy, and is nowhere near as mature as SOTA compilers like GCC and GHC.

Parallel Programming in Bend

To write parallel programs in Bend, all you have to do is... nothing. Other than not making it inherently sequential! For example, the expression:

(((1 + 2) + 3) + 4)

Cannot run in parallel, because +4 depends on +3 which depends on (1+2). But the following expression:

((1 + 2) + (3 + 4))

Can run in parallel, because (1+2) and (3+4) are independent; and it will, per Bend's fundamental pledge:

Everything that can run in parallel, will run in parallel.
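
To make the dependency argument concrete, here is a small Python model (not Bend code) that measures how many additions must happen one after another in each expression. The nested-tuple representation and the names evaluate and critical_path are assumptions made for this sketch only.

```python
# Expressions are modeled as nested tuples (left, right) meaning left + right.
# This representation is an assumption for illustration, not Bend syntax.

def evaluate(expr):
    """Compute the value of a nested-tuple addition expression."""
    if isinstance(expr, int):
        return expr
    left, right = expr
    return evaluate(left) + evaluate(right)

def critical_path(expr):
    """Number of additions that must happen one after another."""
    if isinstance(expr, int):
        return 0
    left, right = expr
    # The two operands are independent, so only the deeper one delays this addition.
    return 1 + max(critical_path(left), critical_path(right))

sequential = (((1, 2), 3), 4)   # (((1 + 2) + 3) + 4)
balanced = ((1, 2), (3, 4))     # ((1 + 2) + (3 + 4))

print(evaluate(sequential), critical_path(sequential))  # prints "10 3"
print(evaluate(balanced), critical_path(balanced))      # prints "10 2"
```

Both expressions perform three additions and produce 10, but the balanced one needs only two sequential steps, because (1+2) and (3+4) can be computed at the same time.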

For a more complete example, consider:

# Sorting Network = just rotate trees!
def sort(d, s, tree):
  switch d:
    case 0:
      return tree
    case _:
      (x,y) = tree
      lft   = sort(d-1, 0, x)
      rgt   = sort(d-1, 1, y)
      return rots(d, s, (lft, rgt))

# Rotates sub-trees (Blue/Green Box)
def rots(d, s, tree):
  switch d:
    case 0:
      return tree
    case _:
      (x,y) = tree
      return down(d, s, warp(d-1, s, x, y))

(...)

This file implements a bitonic sorter with immutable tree rotations. It is not the kind of algorithm you'd expect to run fast on GPUs. Yet, since it uses a divide-and-conquer approach, which is inherently parallel, Bend will run it multi-threaded. Some benchmarks:

  • CPU, Apple M3 Max, 1 thread: 12.15 seconds

  • CPU, Apple M3 Max, 16 threads: 0.96 seconds

  • GPU, NVIDIA RTX 4090, 16k threads: 0.21 seconds

That's a 57x speedup by doing nothing. No thread spawning, no explicit management of locks or mutexes. We just asked Bend to run our program on the RTX 4090, and it did. Simple as that.
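
The speedup figure can be checked from the benchmark numbers with a few lines of Python. This is only arithmetic on the timings listed above; the dictionary labels and the speedup helper are names chosen for this sketch.

```python
# Wall-clock times in seconds, copied from the benchmark list above.
timings = {
    "CPU, 1 thread": 12.15,
    "CPU, 16 threads": 0.96,
    "GPU, 16k threads": 0.21,
}

baseline = timings["CPU, 1 thread"]

def speedup(seconds):
    """How many times faster a run is than the single-threaded baseline."""
    return baseline / seconds

for label, seconds in timings.items():
    print(f"{label}: {speedup(seconds):.2f}x")
```

This gives roughly 12.66x for the 16-thread CPU run and 57.86x for the GPU run, which is the 57x quoted above once truncated.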

Bend isn't limited to a specific paradigm, like tensors or matrices. Any concurrent system, from shaders to Erlang-like actor models, can be emulated on Bend. For example, to render images in real time, we could simply allocate an immutable tree on each frame:

# given a shader, returns a square image
def render(depth, shader):
  bend d = 0, i = 0:
    when d < depth:
      color = (fork(d+1, i*2+0), fork(d+1, i*2+1))
    else:
      width = 1 << (depth / 2)
      color = shader(i % width, i / width)
  return color

# given a position, returns a color
# for this demo, it just busy loops
def demo_shader(x, y):
  bend i = 0:
    when i < 5000:
      color = fork(i + 1)
    else:
      color = 0x000001
  return color

# renders a 256x256 image using demo_shader
def main:
  return render(16, demo_shader)

And it would actually work. Even involved algorithms parallelize well on Bend. Long-distance communication is performed by global beta-reduction (as per the Interaction Calculus), and synchronized correctly and efficiently by HVM2's atomic linker.
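
To see how the bend block in render assigns one leaf to each pixel, here is a sequential Python model of it (not Bend code, and with none of the parallelism). It assumes a square image whose side is 2^(depth/2) pixels, which is what a depth of 16 needs to yield 256x256. The helpers go, leaves and position_shader are hypothetical names introduced for this sketch.

```python
def render(depth, shader):
    """Build a binary tree with 2**depth leaves, one leaf per pixel."""
    # Assumption: the image is square, so each side has 2**(depth // 2) pixels.
    width = 1 << (depth // 2)

    def go(d, i):
        if d < depth:
            # Each branch appends one bit to the index, mirroring fork in the Bend version.
            return (go(d + 1, i * 2 + 0), go(d + 1, i * 2 + 1))
        return shader(i % width, i // width)

    return go(0, 0)

def leaves(tree):
    """Flatten the tree into a list of leaves, left to right."""
    if isinstance(tree, tuple):
        left, right = tree
        return leaves(left) + leaves(right)
    return [tree]

# Hypothetical shader: encodes the pixel position of a 4x4 image as one integer.
def position_shader(x, y):
    return y * 4 + x

image = render(4, position_shader)  # depth 4 -> 16 leaves -> 4x4 image
print(leaves(image))  # prints the integers 0 through 15 in order
```

Each of the two recursive calls in go is independent of the other, which is the property that lets Bend evaluate the corresponding fork calls in parallel.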

Note

It is very important to reinforce that, while Bend does what it was built to do (i.e., scale in performance with cores, up to 10000+ concurrent threads), its single-core performance is still extremely sub-par. This is the first version of the system, and we haven't put much effort into a proper compiler yet. You can expect the raw performance to substantially improve on every release, as we work towards a proper codegen (including a constellation of missing optimizations). Meanwhile, you can use the interpreters today to get a glimpse of what massively parallel programming looks like, through the lens of a Pythonish, high-level language!

Contributors

imaqtkatt, developedby, lunaamora, victortaelin, tjjfvi, kings177, franchufranchu, kpcofgs, sipher, pfauljulian, edusporto, davisuga, eltociear, epicguru, janiczek, themhv, 99991, warpwing, renxida, eduhenke
