cake's Introduction

Join the project community on our server!


Cake is a Rust framework for distributed inference of large models like LLama3, built on Candle. The goal of the project is to run big (70B+) models by repurposing consumer hardware into a heterogeneous cluster of iOS, Android, macOS, Linux and Windows devices, effectively leveraging planned obsolescence as a tool to make AI more accessible and democratic.

⚠ This is experimental code that's being actively developed and changed very quickly, expect bugs ⚠

The idea is to shard the transformer blocks across multiple devices so that inference can run on models that wouldn't normally fit in the GPU memory of a single device. Inference over contiguous transformer blocks assigned to the same worker is batched in order to minimize the latency caused by data transfer.

Support

OS            Architectures         Acceleration   Status
GNU/Linux     arm, arm64, x86_64    -
GNU/Linux     arm, arm64, x86_64    CUDA
GNU/Linux     arm, arm64, x86_64    BLAS
Windows       x86_64                BLAS           untested
Windows       x86_64                CUDA           untested
macOS         x86_64                -
macOS         aarch64               -
macOS         aarch64               Metal
Android       arm, arm64, x86_64    -
Android       arm, arm64, x86_64    CUDA           untested
iOS / iPadOS  aarch64               -
iOS / iPadOS  aarch64               Metal          🛠️ 90% done, WIP
Web           -                     WebGPU         in theory possible, not done

CUDA >= 12.2 is required for CUDA-accelerated systems.
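You can verify the installed toolkit and driver versions before building with CUDA support (these are standard CUDA utilities, not Cake commands):

nvcc --version   # CUDA toolkit version, must be >= 12.2
nvidia-smi       # driver version and the highest CUDA version it supports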

Compile

With Rust installed, you can build the core library and the CLI utilities with different accelerations.

Without acceleration (will use CPU):

cargo build --release

With Metal acceleration for Apple Silicon:

cargo build --release --features metal

With CUDA acceleration:

cargo build --release --features cuda

To generate the iOS bindings in the app, which can then be compiled and deployed via Xcode:

make ios

Using

Run a worker node:

cake-cli --model /path/to/Meta-Llama-3-8B \ # model path, read below on how to optimize model size for workers
         --mode worker \                    # run as worker
         --name worker0 \                   # worker name in topology file
         --topology topology.yml \          # topology
         --address 0.0.0.0:10128            # bind address

Run a master node with an OpenAI compatible REST API:

cake-cli --model /path/to/Meta-Llama-3-8B \ # model path
         --api 0.0.0.0:8080               \ # API bind address
         --topology topology.yml            # topology file

Where topology.yml determines which layers are served by which worker (you can find a list of all the layers of a model in its tensor index file; see the snippet after the example below):

linux_server_1:
  host: 'linux_server.host:10128'
  description: 'NVIDIA Titan X Pascal (12GB)'
  layers:
    - 'model.layers.0-5'

linux_server_2:
  host: 'linux_server2.host:10128'
  description: 'NVIDIA GeForce 3080 (10GB)'
  layers:
    - 'model.layers.6-16'

iphone:
  host: 'iphone.host:10128'
  description: 'iPhone 15 Pro Max'
  layers:
    - 'model.layers.17'

ipad:
  host: 'ipad.host:10128'
  description: 'iPad'
  layers:
    - 'model.layers.18-19'

macbook:
  host: 'macbook.host:10128'
  description: 'M1 Max'
  layers:
    - 'model.layers.20-31' 
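
To enumerate the layers a model actually has (and therefore which ranges you can use in the topology), you can extract the unique layer names from the tensor index file. A minimal sketch, assuming the standard Hugging Face safetensors layout and that jq is available:

# list the unique transformer layer names from the tensor index file
jq -r '.weight_map | keys[]' /path/to/Meta-Llama-3-8B/model.safetensors.index.json \
  | grep -oE 'model\.layers\.[0-9]+' \
  | sort -t . -k 3 -n -u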

You can now interact with the cluster via the REST API:

curl http://master-ip:8080/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
        {   
            "role": "system",
            "content": "You are a helpful AI assistant."
        },  
        {   
            "role": "user",
            "content": "Why is the sky blue?"
        }
    ]
}'

Splitting the Model

As a memory and disk space optimization, you might want to give each worker only the data it actually needs from the model instead of the whole folder, in which case you can use the cake-split-model utility. For instance, to generate a smaller version of the LLama3 safetensors:

cake-split-model --model-path path/to/Meta-Llama-3-8B \ # source model to split
                 --topology path/to/topology.yml \      # topology file
                 --output output-folder-name            # output folder where all the workers data bundles will be saved

This will create a smaller folder containing only the layer tensors required by the specific worker, along with the topology file. Remember to also copy the other model contents (config.json, tokenizer.json, etc.) into the worker bundle before deploying it.
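
For example, assuming cake-split-model writes one sub-folder per worker inside the output folder (a hypothetical layout, so check the actual output on disk), copying the auxiliary files could look like this:

# copy the non-tensor model files into each worker bundle
# (assumes one sub-folder per worker under the output folder)
for bundle in output-folder-name/*/; do
  cp path/to/Meta-Llama-3-8B/config.json \
     path/to/Meta-Llama-3-8B/tokenizer.json "$bundle"
done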

License

Released under the GPL 3 license. To see the licenses of the project dependencies, install cargo-license with cargo install cargo-license and then run cargo license.

cake's People

Contributors

evilsocket, b0xtch, yaojunluo

cake's Issues

Inquiries about the possibility of supporting Windows systems

Hello developers, I came across this project and it's really awesome. I've been struggling with the lack of performance of my devices and don't have the money to buy an A100 graphics card (I need to buy milk powder for my child, haha). I'd like to ask whether there is any intention to support Windows: I see that macOS, Linux and Android are all supported, but on our side we mainly run Windows 7. We have six computers, and it would be nice to have a cluster that supports Windows.

The second request fails with an error

Hello, the first request produces output normally, but the second request fails with an error and the master node's service terminates as well.
Worker node command:

CUDA_VISIBLE_DEVICES=3 ./cake-cli --model /sdc/pre_trained_model/Llama3-Chinese-8B-Instruct --mode worker --name worker0 --topology /sdc/jky/cake/topology.yml --address 0.0.0.0:10128

Master node command:

CUDA_VISIBLE_DEVICES=3,4,5,6,7 ./cake-cli --model /home/pre_trained_model/Llama3-Chinese-8B-Instruct --api 0.0.0.0:8080 --topology /home/jky/cake/topology.yml

The error is as follows:

thread 'tokio-runtime-worker' panicked at /sdc/jky/cake/cake-core/src/cake/worker.rs:215:26:
called `Result::unwrap()` on an `Err` value: cannot broadcast [29, 29] to [1, 32, 29, 170]
   0: candle_core::error::Error::bt
   1: candle_core::layout::Layout::broadcast_as
   2: candle_core::tensor::Tensor::broadcast_as
   3: cake_core::models::llama3::cache::Cache::apply_attention_mask
   4: cake_core::models::llama3::attention::CausalSelfAttention::forward
   5: <cake_core::models::llama3::transformer::Transformer as cake_core::cake::Forwarder>::forward::{{closure}}
   6: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   7: tokio::runtime::task::core::Core<T,S>::poll
   8: tokio::runtime::task::harness::Harness<T,S>::poll
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  10: tokio::runtime::scheduler::multi_thread::worker::Context::run
  11: tokio::runtime::context::set_scheduler
  12: tokio::runtime::context::runtime::enter_runtime
  13: tokio::runtime::scheduler::multi_thread::worker::run
  14: tokio::runtime::task::core::Core<T,S>::poll
  15: tokio::runtime::task::harness::Harness<T,S>::poll
  16: tokio::runtime::blocking::pool::Inner::run
  17: std::sys_common::backtrace::__rust_begin_short_backtrace
  18: core::ops::function::FnOnce::call_once{{vtable.shim}}
  19: std::sys::pal::unix::thread::Thread::new::thread_start
  20: <unknown>
  21: <unknown>


Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: <cake_core::models::llama3::transformer::Transformer as cake_core::cake::Forwarder>::forward::{{closure}}
   2: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   3: tokio::runtime::task::core::Core<T,S>::poll
   4: tokio::runtime::task::harness::Harness<T,S>::poll
   5: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   6: tokio::runtime::scheduler::multi_thread::worker::Context::run
   7: tokio::runtime::context::set_scheduler
   8: tokio::runtime::context::runtime::enter_runtime
   9: tokio::runtime::scheduler::multi_thread::worker::run
  10: tokio::runtime::task::core::Core<T,S>::poll
  11: tokio::runtime::task::harness::Harness<T,S>::poll
  12: tokio::runtime::blocking::pool::Inner::run
  13: std::sys_common::backtrace::__rust_begin_short_backtrace
  14: core::ops::function::FnOnce::call_once{{vtable.shim}}
  15: std::sys::pal::unix::thread::Thread::new::thread_start
  16: <unknown>
  17: <unknown>
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   4: tokio::runtime::task::core::Core<T,S>::poll
   5: tokio::runtime::task::harness::Harness<T,S>::poll
   6: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   7: tokio::runtime::scheduler::multi_thread::worker::Context::run
   8: tokio::runtime::context::set_scheduler
   9: tokio::runtime::context::runtime::enter_runtime
  10: tokio::runtime::scheduler::multi_thread::worker::run
  11: tokio::runtime::task::core::Core<T,S>::poll
  12: tokio::runtime::task::harness::Harness<T,S>::poll
  13: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Error in model.forward: error in forward batch operation for block

The first time I call the API, it works fine. However, when I call the REST API for the second time, the master node reports the error:
cake/api/mod.rs:98:10: called Result::unwrap() on an Err value: error in model.forward: error in forward batch operation for block 29: error receiving response for Batch
Additionally, one of my workers will also trigger an error and then stop:
src/cake/worker.rs:225:26: called Result::unwrap() on an Err value: cannot broadcast [28, 28] to [1, 32, 28, 65]

May I ask why I am unable to use the model downloaded through Hugging Face

root@llama01:/www/cake# /www/cake/target/release/cake-cli --model /www/llama --mode worker --name linux_server_1 --address 0.0.0.0:9527 --topology /www/cake/topology.yml
[2024-08-08T16:11:12Z INFO ] [Worker] dtype=F16 device=Cpu mem=5.3 MiB
[2024-08-08T16:11:12Z INFO ] loading configuration from /www/llama/config.json
[2024-08-08T16:11:12Z INFO ] loading topology from /www/cake/topology.yml
[2024-08-08T16:11:12Z INFO ] loading tensors in /www/llama/model.safetensors.index.json
[2024-08-08T16:11:12Z INFO ] loading tensors from /www/llama/model.safetensors.index.json ...
[2024-08-08T16:11:12Z INFO ] loading model-00002-of-00004.safetensors ...
Error: cannot find tensor model-00002-of-00004.safetensors.self_attn.q_proj.weight

Unable to build without CUDA

Tried on a Debian server and on Termux; the results are the same.

CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_DEBUG=true RUST_BACKTRACE=full cargo build --release
warning: /home/dankcat/cake/cake-ios/Cargo.toml: `crate_type` is deprecated in favor of `crate-type` and will not work in the 2024 edition
(in the `cake` library target)
   Compiling cudarc v0.11.7
   Compiling candle-kernels v0.6.0
   Compiling zstd-sys v2.0.12+zstd.1.5.6
   Compiling block-buffer v0.10.4
error: failed to run custom build command for `candle-kernels v0.6.0`

Caused by:
  process didn't exit successfully: `/home/dankcat/cake/target/release/build/candle-kernels-15ec0a2c0042f062/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  cargo:rerun-if-changed=src/compatibility.cuh
  cargo:rerun-if-changed=src/cuda_utils.cuh
  cargo:rerun-if-changed=src/binary_op_macros.cuh
  cargo:info=["/usr", "/usr/local/cuda", "/opt/cuda", "/usr/lib/cuda", "C:/Program Files/NVIDIA GPU Computing Toolkit", "C:/CUDA"]
  cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP

  --- stderr
  thread 'main' panicked at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bindgen_cuda-0.1.5/src/lib.rs:489:18:
  `nvidia-smi` failed. Ensure that you have CUDA installed and that `nvidia-smi` is in your PATH.: Os { code: 2, kind: NotFound, message: "No such file or directory" }
  stack backtrace:
     0:     0x55c93d687785 - std::backtrace_rs::backtrace::libunwind::trace::h1a07e5dba0da0cd2
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/../../backtrace/src/backtrace/libunwind.rs:105:5
     1:     0x55c93d687785 - std::backtrace_rs::backtrace::trace_unsynchronized::h61b9b8394328c0bc
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
     2:     0x55c93d687785 - std::sys_common::backtrace::_print_fmt::h1c5e18b460934cff
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:68:5
     3:     0x55c93d687785 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1e1a1972118942ad
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:44:22
     4:     0x55c93d6ac13b - core::fmt::rt::Argument::fmt::h07af2b4071d536cd
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/fmt/rt.rs:165:63
     5:     0x55c93d6ac13b - core::fmt::write::hc090a2ffd6b28c4a
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/fmt/mod.rs:1157:21
     6:     0x55c93d68420f - std::io::Write::write_fmt::h8898bac6ff039a23
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/io/mod.rs:1832:15
     7:     0x55c93d68755e - std::sys_common::backtrace::_print::h4e80c5803d4ee35b
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:47:5
     8:     0x55c93d68755e - std::sys_common::backtrace::print::ha96650907276675e
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:34:9
     9:     0x55c93d688a49 - std::panicking::default_hook::{{closure}}::h215c2a0a8346e0e0
    10:     0x55c93d68878d - std::panicking::default_hook::h207342be97478370
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:298:9
    11:     0x55c93d688ee3 - std::panicking::rust_panic_with_hook::hac8bdceee1e4fe2c
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:795:13
    12:     0x55c93d688dc4 - std::panicking::begin_panic_handler::{{closure}}::h00d785e82757ce3c
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:664:13
    13:     0x55c93d687c49 - std::sys_common::backtrace::__rust_end_short_backtrace::h1628d957bcd06996
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:171:18
    14:     0x55c93d688af7 - rust_begin_unwind
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
    15:     0x55c93d5e10f3 - core::panicking::panic_fmt::hdc63834ffaaefae5
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
    16:     0x55c93d5e1546 - core::result::unwrap_failed::h82b551e0ff2b2176
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1654:5
    17:     0x55c93d5f00d8 - core::result::Result<T,E>::expect::h0d780f1427a920a0
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1034:23
    18:     0x55c93d6058fc - bindgen_cuda::compute_cap::h544f29d1dbea88ae
                                 at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bindgen_cuda-0.1.5/src/lib.rs:485:19
    19:     0x55c93d60216f - <bindgen_cuda::Builder as core::default::Default>::default::hc8d3c33e79e06ed7
                                 at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/bindgen_cuda-0.1.5/src/lib.rs:48:27
    20:     0x55c93d5e2e5f - build_script_build::main::h601c987ee98bf43b
                                 at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/candle-kernels-0.6.0/build.rs:7:19
    21:     0x55c93d5e270b - core::ops::function::FnOnce::call_once::h3413b6fc62df34af
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:250:5
    22:     0x55c93d5e1e6e - std::sys_common::backtrace::__rust_begin_short_backtrace::hbdfe41c52daab1ec
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:155:18
    23:     0x55c93d5e22d1 - std::rt::lang_start::{{closure}}::h51c795f7d1b1d218
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:159:18
    24:     0x55c93d67ead0 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h6abeee5a7794ceb5
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:284:13
    25:     0x55c93d67ead0 - std::panicking::try::do_call::hd6e966bb06877057
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:559:40
    26:     0x55c93d67ead0 - std::panicking::try::hc9b3807f5768cb19
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:523:19
    27:     0x55c93d67ead0 - std::panic::catch_unwind::h94a757c154076c6e
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panic.rs:149:14
    28:     0x55c93d67ead0 - std::rt::lang_start_internal::{{closure}}::hc5223fb36050c743
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:141:48
    29:     0x55c93d67ead0 - std::panicking::try::do_call::hddf7b4e1ebeb3f69
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:559:40
    30:     0x55c93d67ead0 - std::panicking::try::h1842860a1f941a31
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:523:19
    31:     0x55c93d67ead0 - std::panic::catch_unwind::h009016ccf811d4c3
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panic.rs:149:14
    32:     0x55c93d67ead0 - std::rt::lang_start_internal::h3ed4fe7b2f419135
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:141:20
    33:     0x55c93d5e22aa - std::rt::lang_start::hff6e3b582a875b8d
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:158:17
    34:     0x55c93d5e306e - main
    35:     0x7f40ce51524a - <unknown>
    36:     0x7f40ce515305 - __libc_start_main
    37:     0x55c93d5e1761 - _start
    38:                0x0 - <unknown>
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for `cudarc v0.11.7`

Caused by:
  process didn't exit successfully: `/home/dankcat/cake/target/release/build/cudarc-5c6a5152ed8f4c4d/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=build.rs
  cargo:rerun-if-env-changed=CUDA_ROOT
  cargo:rerun-if-env-changed=CUDA_PATH
  cargo:rerun-if-env-changed=CUDA_TOOLKIT_ROOT_DIR

  --- stderr
  thread 'main' panicked at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.11.7/build.rs:55:10:
  Failed to execute `nvcc`: Os { code: 2, kind: NotFound, message: "No such file or directory" }
  stack backtrace:
     0:     0x564532b54cf5 - std::backtrace_rs::backtrace::libunwind::trace::h1a07e5dba0da0cd2
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/../../backtrace/src/backtrace/libunwind.rs:105:5
     1:     0x564532b54cf5 - std::backtrace_rs::backtrace::trace_unsynchronized::h61b9b8394328c0bc
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
     2:     0x564532b54cf5 - std::sys_common::backtrace::_print_fmt::h1c5e18b460934cff
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:68:5
     3:     0x564532b54cf5 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h1e1a1972118942ad
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:44:22
     4:     0x564532b75a2b - core::fmt::rt::Argument::fmt::h07af2b4071d536cd
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/fmt/rt.rs:165:63
     5:     0x564532b75a2b - core::fmt::write::hc090a2ffd6b28c4a
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/fmt/mod.rs:1157:21
     6:     0x564532b5290f - std::io::Write::write_fmt::h8898bac6ff039a23
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/io/mod.rs:1832:15
     7:     0x564532b54ace - std::sys_common::backtrace::_print::h4e80c5803d4ee35b
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:47:5
     8:     0x564532b54ace - std::sys_common::backtrace::print::ha96650907276675e
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:34:9
     9:     0x564532b55d89 - std::panicking::default_hook::{{closure}}::h215c2a0a8346e0e0
    10:     0x564532b55acd - std::panicking::default_hook::h207342be97478370
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:298:9
    11:     0x564532b56223 - std::panicking::rust_panic_with_hook::hac8bdceee1e4fe2c
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:795:13
    12:     0x564532b56104 - std::panicking::begin_panic_handler::{{closure}}::h00d785e82757ce3c
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:664:13
    13:     0x564532b551b9 - std::sys_common::backtrace::__rust_end_short_backtrace::h1628d957bcd06996
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:171:18
    14:     0x564532b55e37 - rust_begin_unwind
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:652:5
    15:     0x564532b25f53 - core::panicking::panic_fmt::hdc63834ffaaefae5
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/panicking.rs:72:14
    16:     0x564532b26366 - core::result::unwrap_failed::h82b551e0ff2b2176
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1654:5
    17:     0x564532b2c438 - core::result::Result<T,E>::expect::h33784a2d338b94a7
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1034:23
    18:     0x564532b316f6 - build_script_build::cuda_version_from_build_system::h4a38442c7c737c00
                                 at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.11.7/build.rs:52:18
    19:     0x564532b3133a - build_script_build::main::h77dc56d88b14ee07
                                 at /home/dankcat/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.11.7/build.rs:37:34
    20:     0x564532b2e5cb - core::ops::function::FnOnce::call_once::h2274ad654a6bbd1b
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:250:5
    21:     0x564532b340fe - std::sys_common::backtrace::__rust_begin_short_backtrace::hff1eff237bf98703
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/sys_common/backtrace.rs:155:18
    22:     0x564532b2b3d1 - std::rt::lang_start::{{closure}}::h214b04bede10fd10
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:159:18
    23:     0x564532b4f850 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h6abeee5a7794ceb5
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/ops/function.rs:284:13
    24:     0x564532b4f850 - std::panicking::try::do_call::hd6e966bb06877057
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:559:40
    25:     0x564532b4f850 - std::panicking::try::hc9b3807f5768cb19
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:523:19
    26:     0x564532b4f850 - std::panic::catch_unwind::h94a757c154076c6e
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panic.rs:149:14
    27:     0x564532b4f850 - std::rt::lang_start_internal::{{closure}}::hc5223fb36050c743
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:141:48
    28:     0x564532b4f850 - std::panicking::try::do_call::hddf7b4e1ebeb3f69
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:559:40
    29:     0x564532b4f850 - std::panicking::try::h1842860a1f941a31
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panicking.rs:523:19
    30:     0x564532b4f850 - std::panic::catch_unwind::h009016ccf811d4c3
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/panic.rs:149:14
    31:     0x564532b4f850 - std::rt::lang_start_internal::h3ed4fe7b2f419135
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:141:20
    32:     0x564532b2b3aa - std::rt::lang_start::ha16ce9452477e973
                                 at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/std/src/rt.rs:158:17
    33:     0x564532b331fe - main
    34:     0x7fec3bef624a - <unknown>
    35:     0x7fec3bef6305 - __libc_start_main
    36:     0x564532b26541 - _start
    37:                0x0 - <unknown>

Building on Ubuntu errors with `cuMemAdvise_v2` on CUDA 12.1

Compiling tracing-core v0.1.32
error[E0599]: no method named `cuMemAdvise_v2` found for reference `&'static driver::sys::sys_12010::Lib` in the current scope
   --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.11.7/src/driver/result.rs:613:10
    |
612 | /     lib()
613 | |         .cuMemAdvise_v2(dptr, num_bytes, advice, location)
    | |_________-^^^^^^^^^^^^^^
    |
help: there is a method `cuMemAdvise` with a similar name
    |
613 |         .cuMemAdvise(dptr, num_bytes, advice, location)
    |          ~~~~~~~~~~~

error[E0599]: no method named `cuMemPrefetchAsync_v2` found for reference `&'static driver::sys::sys_12010::Lib` in the current scope
     --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.11.7/src/driver/result.rs:628:10
      |
627   | /     lib()
628   | |         .cuMemPrefetchAsync_v2(dptr, num_bytes, location, 0, stream)
      | |_________-^^^^^^^^^^^^^^^^^^^^^
      |
help: there is a method `cuMemPrefetchAsync` with a similar name, but with different arguments
     --> /home/ubuntu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/cudarc-0.11.7/src/driver/sys/sys_12010.rs:13548:5
      |
13548 | /     pub unsafe fn cuMemPrefetchAsync(
13549 | |         &self,
13550 | |         devPtr: CUdeviceptr,
13551 | |         count: usize,
13552 | |         dstDevice: CUdevice,
13553 | |         hStream: CUstream,
13554 | |     ) -> CUresult {
      | |_________________^

Is it possible to use quantized models?

First of all, I want to thank you for your hard work. I love this project and I think it's awesome to be able to handle inference on different devices.
As for me, the point of splitting a model among different devices lies in my current RAM limitations, so I guess it would make much more sense to be able to use quantized versions of the big models.

The specified file cannot be found

The model path is correct; I don't know which file this error is saying cannot be found.

[Worker] dtype=F16 device=Cuda(CudaDevice(DeviceId(1))) mem=207.4 MiB
 loading topology from topology.yml
loading configuration from /sdc/pre_trained_model/Llama3-Chinese-8B-Instruct/config.json
Error: No such file or directory (os error 2)

Thanks for the FOSS! Suggestions for possible future backend runtimes: Vulkan, OpenCL, SYCL/OpenVINO/Intel GPU, AMD GPU/ROCm/HIP

Thanks for the FOSS!

Suggestions for possible future backend runtimes: Vulkan, OpenCL, SYCL/OpenVINO/Intel GPU, and AMD GPU/ROCm/HIP.

Vulkan and OpenCL both have the potential to be very portable across GPUs, and to some extent across CPUs with supporting software.

SYCL can run on various CPU/GPU platforms; together with OpenVINO, it is the primary and ideal target for supporting Intel GPUs.

About the reason for having cluster nodes

Thanks for your valuable contribution.
I have a question that needs some clarification; the answer would probably also be worth mentioning in the README.
From my basic understanding, Cake splits the model into its layers and distributes those layers to separate nodes because a huge 70B model will not fit into a single normal GPU. So my question is: what would be the benefit of having a cluster of these nodes on our network, instead of having a single worker that loads and offloads each layer of the model one by one? My understanding is that model inference is sequential, so one node has to wait for the previous layers to finish before starting its own work, which makes multiple nodes appear redundant, unless there is some sort of pipelining mechanism that feeds batches to the nodes one at a time. Is that the intention here? Could you please provide some guidance and explanation on this? Thanks again.

Dockerfile support

I have successfully compiled your project with Docker, and I'm willing to share the setup with anyone struggling to do the same.

Since this software is in alpha, I advise the author to use this as a reference and build an official Docker image for the project, before moving on to static linking and an AppImage.

The filesystem structure is:

├── build.sh # build script
├── cake # cloned repository
├── cargo_config.toml # cargo mirror config
├── Dockerfile_intermediate # building intermediate image
└── run.sh # run the final container

Content of build.sh:

INTERMEDIATE_IMAGE_NAME=cake_llm_intermediate
IMAGE_NAME=cake_llm

INTERMEDIATE_CONTAINER_NAME=cake_container_intermediate
CONTAINER_NAME=cake_container

git clone https://github.com/evilsocket/cake

docker kill $CONTAINER_NAME
docker rm $CONTAINER_NAME
docker rmi $INTERMEDIATE_IMAGE_NAME

docker build -t $INTERMEDIATE_IMAGE_NAME -f Dockerfile_intermediate .


read -p "Do you want to continue? (y/n): " answer

case $answer in
    [Yy]* ) echo "You chose yes.";;
    [Nn]* ) echo "You chose no."; exit 1;;
    * ) echo "Please answer yes or no."; exit 1;;
esac

docker kill $INTERMEDIATE_CONTAINER_NAME
docker rm $INTERMEDIATE_CONTAINER_NAME

docker rmi $IMAGE_NAME
docker run -d --privileged --gpus 1 --name $INTERMEDIATE_CONTAINER_NAME $INTERMEDIATE_IMAGE_NAME tail -f /dev/null
docker exec -w /root/cake $INTERMEDIATE_CONTAINER_NAME cargo build
docker commit $INTERMEDIATE_CONTAINER_NAME $IMAGE_NAME 

docker kill $INTERMEDIATE_CONTAINER_NAME
docker rm $INTERMEDIATE_CONTAINER_NAME

Content of Dockerfile_intermediate:

FROM nvidia/cuda:12.4.0-base-ubuntu22.04

RUN rm /etc/apt/apt.conf.d/docker-clean
RUN apt update
RUN apt install -y build-essential curl

RUN apt install -y cuda-nvcc-12-4 cuda-nvrtc-dev-12-4 libcublas-dev-12-4 libcurand-dev-12-4

RUN apt install -y cargo

COPY cake /root/cake

COPY cargo_config.toml /root/.cargo/config.toml

Content of run.sh:

IMAGENAME=cake_llm
CONTAINER_NAME=cake_container

docker kill $CONTAINER_NAME
docker rm $CONTAINER_NAME

MODEL_PATH=/root/data/Meta-Llama-3-8B-Instruct
TOPOFILE=/root/data/topology.yaml

docker run -it --rm --mount type=bind,source=<source_path>,target=/root/data,ro -e LD_LIBRARY_PATH=/usr/local/cuda-12.4/targets/x86_64-linux/lib/ --name $CONTAINER_NAME --privileged --gpus 1 $IMAGENAME /root/cake/target/debug/cake-cli --model $MODEL_PATH --topology $TOPOFILE 

The PTX code was compiled with an unsupported toolchain

Hello, I ran into a new problem while using Cake.
Command:

RUST_LOG=debug CUDA_VISIBLE_DEVICES=2 ./cake-cli --model /data1/pre_trained_model/Llama-3-8B-Instruct --topology /sdc/jky/cake/topology.yml

The error is as follows:

[2024-07-17T06:24:01Z DEBUG] device is cuda 0
[2024-07-17T06:24:01Z INFO ] [Master] dtype=F16 device=Cuda(CudaDevice(DeviceId(1))) mem=220.7 MiB
[2024-07-17T06:24:01Z INFO ] loading configuration from /data1/pre_trained_model/Llama-3-8B-Instruct/config.json
[2024-07-17T06:24:01Z INFO ] loading topology from /sdc/jky/cake/topology.yml
[2024-07-17T06:24:01Z DEBUG] cache::n_elem = 128
[2024-07-17T06:24:01Z DEBUG] cache::theta = [ 1.0000e0, 8.1462e-1, 6.6360e-1, 5.4058e-1, 4.4037e-1, 3.5873e-1, 2.9223e-1,
     2.3805e-1, 1.9392e-1, 1.5797e-1, 1.2869e-1, 1.0483e-1, 8.5397e-2, 6.9566e-2,
     5.6670e-2, 4.6164e-2, 3.7606e-2, 3.0635e-2, 2.4955e-2, 2.0329e-2, 1.6560e-2,
     1.3490e-2, 1.0990e-2, 8.9523e-3, 7.2927e-3, 5.9407e-3, 4.8394e-3, 3.9423e-3,
     3.2114e-3, 2.6161e-3, 2.1311e-3, 1.7360e-3, 1.4142e-3, 1.1520e-3, 9.3847e-4,
     7.6450e-4, 6.2277e-4, 5.0732e-4, 4.1327e-4, 3.3666e-4, 2.7425e-4, 2.2341e-4,
     1.8199e-4, 1.4825e-4, 1.2077e-4, 9.8381e-5, 8.0143e-5, 6.5286e-5, 5.3183e-5,
     4.3324e-5, 3.5292e-5, 2.8750e-5, 2.3420e-5, 1.9078e-5, 1.5542e-5, 1.2660e-5,
     1.0313e-5, 8.4015e-6, 6.8440e-6, 5.5752e-6, 4.5417e-6, 3.6997e-6, 3.0139e-6,
     2.4551e-6]
    Tensor[[64], f32, cuda:0]
Error: DriverError(CUDA_ERROR_UNSUPPORTED_PTX_VERSION, "the provided PTX was compiled with an unsupported toolchain.") when loading cast_u32_f32

Unable to compile successfully

C:\Users\Administrator\Desktop\cake>cargo build --release
warning: C:\Users\Administrator\Desktop\cake\cake-ios\Cargo.toml: crate_type is deprecated in favor of crate-type and will not work in the 2024 edition
(in the cake library target)
Compiling cudarc v0.11.7
Compiling candle-kernels v0.6.0
Compiling clap_lex v0.7.1
Compiling bit-vec v0.6.3
Compiling strsim v0.11.1
Compiling nom v7.1.3
Compiling console v0.15.8
Compiling esaxx-rs v0.1.10
error: failed to run custom build command for cudarc v0.11.7
note: To improve backtraces for build dependencies, set the CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_DEBUG=true environment variable to enable debug information generation.

Caused by:
process didn't exit successfully: C:\Users\Administrator\Desktop\cake\target\release\build\cudarc-95f6bdd5c33de08a\build-script-build (exit code: 101)
--- stdout
cargo:rerun-if-changed=build.rs
cargo:rerun-if-env-changed=CUDA_ROOT
cargo:rerun-if-env-changed=CUDA_PATH
cargo:rerun-if-env-changed=CUDA_TOOLKIT_ROOT_DIR

--- stderr
thread 'main' panicked at C:\Users\Administrator.cargo\registry\src\index.crates.io-6f17d22bba15001f\cudarc-0.11.7\build.rs:82:14:
Unsupported cuda toolkit version: 11.0. Please raise a github issue.
stack backtrace:
0: std::panicking::begin_panic_handler
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\panicking.rs:652
1: core::panicking::panic_fmt
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\panicking.rs:72
2: <alloc::vec::Vec as core::iter::traits::collect::FromIterator>::from_iter
3: <alloc::vec::Vec as core::iter::traits::collect::FromIterator>::from_iter
4: core::ops::function::FnOnce::call_once
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
warning: build failed, waiting for other jobs to finish...
error: failed to run custom build command for candle-kernels v0.6.0
note: To improve backtraces for build dependencies, set the CARGO_PROFILE_RELEASE_BUILD_OVERRIDE_DEBUG=true environment variable to enable debug information generation.

Caused by:
process didn't exit successfully: C:\Users\Administrator\Desktop\cake\target\release\build\candle-kernels-644872f2b8f06ed1\build-script-build (exit code: 101)
--- stdout
cargo:rerun-if-changed=build.rs
cargo:rerun-if-changed=src/compatibility.cuh
cargo:rerun-if-changed=src/cuda_utils.cuh
cargo:rerun-if-changed=src/binary_op_macros.cuh
cargo:info=["/usr", "/usr/local/cuda", "/opt/cuda", "/usr/lib/cuda", "C:/Program Files/NVIDIA GPU Computing Toolkit", "C:/CUDA"]
cargo:rerun-if-env-changed=CUDA_COMPUTE_CAP

--- stderr
thread 'main' panicked at C:\Users\Administrator.cargo\registry\src\index.crates.io-6f17d22bba15001f\bindgen_cuda-0.1.5\src\lib.rs:492:9:
assertion left == right failed
left: "Field "compute_cap" is not a valid field to query."
right: "compute_cap"
stack backtrace:
0: std::panicking::begin_panic_handler
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\panicking.rs:652
1: core::panicking::panic_fmt
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\panicking.rs:72
2: core::panicking::assert_failed_inner
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\panicking.rs:409
3: core::panicking::assert_failed
4: bindgen_cuda::cuda_include_dir::{{closure}}
5: <bindgen_cuda::Builder as core::default::Default>::default
6: std::rt::lang_start
7: std::rt::lang_start
8: __ImageBase
9: std::rt::lang_start
10: std::rt::lang_start_internal
at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\rt.rs:141
11: std::rt::lang_start
12: main
13: invoke_main
at D:\a_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
14: __scrt_common_main_seh
at D:\a_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
15: BaseThreadInitThunk
16: RtlUserThreadStart
note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.

C:\Users\Administrator\Desktop\cake>nvidia-smi
Tue Jul 16 01:15:46 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 457.30 Driver Version: 457.30 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... WDDM | 00000000:01:00.0 On | N/A |
|100% 29C P8 16W / 250W | 555MiB / 11264MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1340 C+G Insufficient Permissions N/A |
| 0 N/A N/A 12420 C+G C:\Windows\explorer.exe N/A |
| 0 N/A N/A 12928 C+G ...m Files\ToDesk\ToDesk.exe N/A |
| 0 N/A N/A 13352 C+G ...artMenuExperienceHost.exe N/A |
| 0 N/A N/A 13940 C+G ...d\runtime\WeChatAppEx.exe N/A |
| 0 N/A N/A 14476 C+G ...y\ShellExperienceHost.exe N/A |
| 0 N/A N/A 15492 C+G ...2txyewy\TextInputHost.exe N/A |
| 0 N/A N/A 17964 C+G ...ray\lghub_system_tray.exe N/A |
| 0 N/A N/A 18256 C+G ...e\PhoneExperienceHost.exe N/A |
| 0 N/A N/A 18744 C+G ...5n1h2txyewy\SearchApp.exe N/A |
| 0 N/A N/A 19444 C+G ...lPanel\SystemSettings.exe N/A |
+-----------------------------------------------------------------------------+

C:\Users\Administrator\Desktop\cake>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Thu_Jun_11_22:26:48_Pacific_Daylight_Time_2020
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.relgpu_drvr445TC445_37.28540450_0

bug with tokenizer and gibberish output

The tokenizer has issues resolving a few tokens, including special ones (they will be shown in the output as ), which causes all sorts of gibberish output ... it's probably a matter of parsing the model/tokenizer.json properly.
