
Future ARM support? (multipy issue, open)

pytorch commented on August 15, 2024
Future ARM support?

from multipy.

Comments (6)

d4l3k commented on August 15, 2024

I got multipy working on aarch64 in a bit of a hacky way but we can polish this up so it does things correctly.

(venv-multipy) ubuntu@ip-172-31-38-182 ~/m/m/r/build (main)> ./interactive_embedded_interpreter
Registering torch::deploy builtin library tensorrt (idx 0) with 0 builtin modules
torch::deploy builtin tensorrt contains 0 modules
Registering torch::deploy builtin library cpython_internal (idx 1) with 0 builtin modules
torch::deploy builtin cpython_internal contains 6 modules
Registering torch::deploy builtin library tensorrt (idx 0) with 0 builtin modules
torch::deploy builtin tensorrt contains 0 modules
Registering torch::deploy builtin library cpython_internal (idx 1) with 0 builtin modules
torch::deploy builtin cpython_internal contains 6 modules
[W OperatorEntry.cpp:150] Warning: Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::get_gradients(int context_id) -> Dict(Tensor, Tensor)
    registered at /home/ubuntu/pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
  dispatch key: (catch all)
  previous kernel: registered at /home/ubuntu/pytorch/torch/csrc/jit/runtime/register_distributed_ops.cpp:278
       new kernel: registered at /home/ubuntu/pytorch/torch/csrc/jit/runtime/register_distributed_ops.cpp:278 (function registerKernel)
--Return--
> /home/ubuntu/.pyenv/versions/3.9.13/lib/python3.9/pdb.py(1626)set_trace()->None
-> pdb.set_trace(sys._getframe().f_back)
(Pdb) import torch
(Pdb) torch.zeros(100)
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0.])
(Pdb) import platform
(Pdb) platform.machine()
'aarch64'

Caveats:

  • this only works with Python/torch/C extensions that are compiled with -mtls-dialect=trad, since we don't support ARM64's TLSDESC.
  • the handling of DTP* relocations is also pretty hacky (it just sets module_id to 0), but that doesn't seem to be causing any issues.
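
The first caveat can be dealt with at build time. Below is a minimal sketch, assuming the extensions are built with GCC (its AArch64 backend accepts `-mtls-dialect=trad`). The helper name `tls_compile_args` is hypothetical and not part of multipy.

```python
import platform


def tls_compile_args(machine=None):
    """Return extra compiler flags for extensions the prototype should load.

    Hypothetical helper: on 64-bit ARM Linux it asks GCC for the traditional
    TLS dialect instead of TLS descriptors; elsewhere it adds nothing.
    """
    if machine is None:
        machine = platform.machine()
    if machine == "aarch64":
        return ["-mtls-dialect=trad"]
    return []


if __name__ == "__main__":
    # Show the flags that would be used on the current machine.
    print(tls_compile_args())
```

The returned list could be passed as `extra_compile_args` to a setuptools `Extension`, or appended to `CFLAGS` when building Python and torch themselves.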

Links:

I've been testing on a Graviton3 instance


d4l3k commented on August 15, 2024

We'd love to add ARM support, but I don't have any specific timeline for when that might happen right now. We'd be happy to work with someone to add support if there's any community interest in contributing.

Can you share more details about what type of hardware you want to use multipy/deploy with? Are you targeting mobile/embedded devices or desktop-style hardware (e.g. Macs, Graviton, etc.)?

The main changes would be to improve the loaders depending on the environment:

I believe the loader was adapted from the Android linker implementation with the non-x86 bits removed. It's feasible to add them back in, though it may be quite a bit of work when dealing with the full end-to-end PyTorch/Python build.

https://android.googlesource.com/platform/bionic/+/android-6.0.1_r1/linker/linker.cpp
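
Part of that work is presumably handling another architecture's relocation types. As a small illustration, and not multipy's actual loader code, the sketch below decodes raw `Elf64_Rela` entries and counts the `R_AARCH64_TLSDESC` ones, which is the kind the prototype mentioned earlier in this thread cannot handle. The function name `count_tlsdesc_relocs` is hypothetical, the input is assumed to be the bytes of a little-endian RELA table that has already been extracted from a shared object, and the value 1031 is the one glibc's `elf.h` defines.

```python
import struct

# One Elf64_Rela entry: r_offset (u64), r_info (u64), r_addend (i64) = 24 bytes.
RELA_ENTRY = struct.Struct("<QQq")

# Relocation type number for AArch64 TLS descriptors (from glibc's elf.h).
R_AARCH64_TLSDESC = 1031


def count_tlsdesc_relocs(rela_bytes):
    """Count TLSDESC entries in the raw bytes of a little-endian ELF64 RELA table."""
    count = 0
    for _offset, info, _addend in RELA_ENTRY.iter_unpack(rela_bytes):
        reloc_type = info & 0xFFFFFFFF  # ELF64_R_TYPE: low 32 bits of r_info
        if reloc_type == R_AARCH64_TLSDESC:
            count += 1
    return count


if __name__ == "__main__":
    # Synthetic two-entry table: one TLSDESC relocation, one of another type.
    table = RELA_ENTRY.pack(0x1000, (5 << 32) | R_AARCH64_TLSDESC, 0)
    table += RELA_ENTRY.pack(0x2000, (7 << 32) | 1026, 0)
    print(count_tlsdesc_relocs(table))
```

A non-zero count for an extension's dynamic relocation table would suggest it was not built with `-mtls-dialect=trad`.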


saareliad commented on August 15, 2024

Hi @d4l3k, I'm actually targeting datacenter / near-edge servers.
The ARM cores will mostly run very lightweight pre-/post-processing functions (e.g., tokenization in NLP), which will be added as custom ops (as done in libraries like torchtext/torchaudio/...), while the rest of the compute will be offloaded to computational accelerators.


d4l3k commented on August 15, 2024

@saareliad can you share what ARM architecture you're using? 64-bit? v8?


saareliad commented on August 15, 2024

> @saareliad can you share what ARM architecture you're using? 64-bit? v8?

Yes, 64-bit v8.2 (N1).


d4l3k commented on August 15, 2024

@saareliad sounds good, that should work with this prototype code. 32-bit ARM won't.
