pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch

Home Page: https://pytorch.org/executorch/

License: Other

Languages: C++ 48.40%, Python 38.19%, C 0.52%, Shell 0.95%, Starlark 3.34%, Dockerfile 0.03%, CMake 1.16%, Objective-C++ 5.08%, Objective-C 0.96%, Java 0.43%, GLSL 0.93%
Topics: deep-learning, embedded, machine-learning, mobile, neural-network, tensor, gpu

executorch's Introduction

ExecuTorch

ExecuTorch is an end-to-end solution for enabling on-device inference across mobile and edge devices, including wearables, embedded devices, and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.

Key value propositions of ExecuTorch are:

  • Portability: Compatibility with a wide variety of computing platforms, from high-end mobile phones to highly constrained embedded systems and microcontrollers.
  • Productivity: Enabling developers to use the same toolchains and SDK from PyTorch model authoring and conversion through debugging and deployment across a wide variety of platforms.
  • Performance: Providing end users with a seamless, high-performance experience thanks to a lightweight runtime that utilizes full hardware capabilities such as CPUs, NPUs, and DSPs.

For a comprehensive technical overview of ExecuTorch and step-by-step tutorials, please visit our documentation website for the latest release (or the main branch).

Important: This is a preview release

This is a preview version of ExecuTorch and should be used for testing and evaluation purposes only. It is not recommended for use in production settings. We welcome any feedback, suggestions, and bug reports from the community to help us improve the technology. Please use the PyTorch Forums for discussion and feedback about ExecuTorch using the ExecuTorch category, and our GitHub repository for bug reporting.

The ExecuTorch code and APIs are still changing quickly, and there are not yet any guarantees about forward/backward source compatibility. We recommend using the latest v#.#.# release tag from the Releases page when experimenting with this preview release.

Directory Structure

executorch
├── backends                        #  Backend delegate implementations.
├── build                           #  Utilities for managing the build system.
├── bundled_program                 #  Utilities for attaching reference inputs and outputs to models. TODO move to extension
├── codegen                         #  Tooling to autogenerate bindings between kernels and the runtime. TODO move to tool
├── configurations                  #  TODO delete this
├── docs                            #  Static docs tooling
├── examples                        #  Examples of various user flows, such as model export, delegates, and runtime execution.
├── exir                            #  Ahead-of-time library: model capture and lowering APIs.
|   ├── _serialize                  #  Serialize final export artifact.
|   ├── backend                     #  Backend delegate ahead of time APIs
|   ├── capture                     #  Program capture.
|   ├── dialects                    #  Op sets for various dialects in the export process.
|   ├── emit                        #  Conversion from ExportedProgram to ExecuTorch execution instructions.
|   ├── passes                      #  Built-in compiler passes.
|   ├── program                     #  Export artifacts.
|   ├── verification                #  IR verification.
├── extension                       #  Extensions built on top of the runtime.
|   ├── aten_util
|   ├── data_loader                 #  1st party data loader implementations.
|   ├── memory_allocator            #  1st party memory allocator implementations.
|   ├── pybindings                  #  Python API for the executorch runtime.
|   ├── pytree                      #  C++ and Python flattening and unflattening lib for pytrees.
|   ├── testing_util
├── kernels                         #  1st party kernel implementations.
|   ├── aten
|   ├── optimized
|   ├── portable                    #  Reference implementations of ATen operators.
|   ├── prim_ops                    #  Special ops used in executorch runtime for control flow and symbolic primitives.
|   ├── quantized
├── profiler                        #  Utilities for profiling. TODO delete in favor of ETDump in sdk/
├── runtime                         #  Core C++ runtime of ExecuTorch.
|   ├── backend                     #  Backend delegate runtime APIs
|   ├── core                        #  Core structures used across all levels of the runtime
|   ├── executor                    #  Model loading, initialization, and execution.
|   ├── kernel                      #  Kernel registration and management.
|   ├── platform                    #  Layer between architecture specific code and user calls.
├── schema                          #  ExecuTorch program definition, TODO move under serialization/
├── scripts                         #  Utility scripts for size management, dependency management, etc.
├── sdk                             #  Model profiling, debugging, and introspection.
├── shim                            #  Compatibility layer between OSS and Internal builds
├── test                            #  Broad scoped end2end tests
├── third-party                     #  third-party dependencies
├── util                            #  TODO delete this

License

ExecuTorch is BSD licensed, as found in the LICENSE file.

executorch's People

Contributors

angelayi, cccclai, cymbalrush, dbort, denisvieriu97, digantdesai, gasoonjia, gregorycomer, guangy10, huydhn, jack-khuu, jacobszwejbka, jerryzh168, jorgep31415, kimishpatel, kirklandsign, larryliu0820, lucylq, manuelcandales, mcr229, mergennachin, mikekgfb, olivia-liu, shoumikhin, ss-jia, svekars, tarun292, ydwu4, yipjustin, zhxchen17


executorch's Issues

android xnnpack example fails

python3 -m examples.xnnpack.aot_compiler --model_name="dl3" --delegate
fails with

Command '['flatc', '--binary', '-o', '/tmp/tmp8mrxbnc_', '/tmp/tmp8mrxbnc_/program.fbs', '/tmp/tmp8mrxbnc_/data.json']' died with <Signals.SIGKILL: 9>.

Upcoming changes to export API in ExecuTorch (published on 9/12/2023)

Where are we?

Exporting a PyTorch model for the ExecuTorch runtime goes through multiple AoT (ahead-of-time) stages.
At a high level, there are three stages.

  1. exir.capture: This captures the model's graph using ATen IR.
  2. to_edge: Translates the ATen dialect into the edge dialect with dtype specialization.
  3. to_executorch: Translates the edge dialect to the executorch dialect, running various passes (e.g., out-variant conversion, memory planning) to make the model ready for the executorch runtime.

Two important stops in the model's journey to the executorch runtime are: a) quantization and b) delegation.

Entry points for quantization are between steps 1 and 2. Thus quantization APIs consume ATen IR and are not edge/executorch specific.

Entry points for delegation are between steps 2 and 3. Thus delegation APIs consume edge dialect IR.
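
As a rough sketch of where these entry points sit in code (assuming the v0.1-era exir.capture flow used elsewhere in this post; exact names and signatures may have changed since):

import torch
import executorch.exir as exir

model = MyModel().eval()                        # assumption: any torch.nn.Module
example_inputs = (torch.randn(1, 3, 224, 224),)

prog = exir.capture(model, example_inputs)      # stage 1: capture the graph in ATen IR
# quantization entry point: quantization APIs consume the ATen IR here
edge = prog.to_edge()                           # stage 2: ATen dialect -> edge dialect
# delegation entry point: to_backend()/partitioners consume the edge dialect here
et_prog = edge.to_executorch()                  # stage 3: passes + executorch dialect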

Need for the export API change.

The quantization workflow is built on top of exir.capture, which is built on top of the torch.export API. In order to support QAT, such exported models need to work with eager-mode autograd. The current export of step 1 above emits ATen IR with core ATen ops. This is not autograd safe, meaning it is not safe to run such an exported model in eager mode (e.g., in Python) and expect the autograd engine to work. Thus training APIs, such as calculating loss on the output and calling backward on the loss, are not guaranteed to work with this IR.

It is important that quantization APIs, for QAT as well as PTQ, work on the same IR, because a) it provides a better UX to users, and b) it provides a single IR that backend-specific quantizers (read more here) can target.

For this reason, we aligned on a two-stage export rooted in the idea of progressive lowering. The two stages are:

  1. Export emits pre-dispatch ATen IR
  2. Pre-dispatch ATen IR is lowered to core ATen IR.

The output of stage 1 is autograd safe, and thus models exported at stage 1 can be trained via the eager-mode autograd engine.
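
For instance (a minimal sketch, assuming the capture API introduced below; the point is only that the stage-1 graph behaves like an eager module under autograd):

gm = capture_pre_autograd_graph(m, example_inputs)  # stage 1: pre-dispatch ATen IR
out = gm(*example_inputs)                           # runs in eager mode
loss = out.sum()                                    # toy loss for illustration
loss.backward()                                     # autograd engine works on this IR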

New export API.

We are rolling out the changes related to the new export API in three stages.

Stage 1 (landed):

As shown in the figure below, exir.capture is broken down into:

  • capture_pre_autograd_graph
  • exir.capture

(figure: exir.capture broken down into capture_pre_autograd_graph followed by exir.capture)

Example of exporting model without quantization:

gm = export.capture_pre_autograd_graph(m)
ep = exir.capture(gm) # to be replaced with torch.export

Example of exporting model with quantization:

gm = export.capture_pre_autograd_graph(m)
quantized_gm = calls_to_quantization_api(gm)
ep = exir.capture(quantized_gm) # to be replaced with torch.export

You can see these changes here and here for how quantization APIs fit in.

Stage 2 (coming soon):

We will deprecate exir.capture in favor of directly using torch.export. More updates on this will be posted soon.

Stage 3 (timeline is to be determined):

The two APIs listed in stage 1 will be renamed to:

  • torch.export
  • to_core_aten

torch.export will export a graph with ATen IR and the full ATen opset that is autograd safe, while to_core_aten will transform the output of torch.export into core ATen IR that is NOT autograd safe.

Example of exporting model without quantization:

ep = torch.export(model)
ep = ep.to_core_aten()

Example of exporting model with quantization:

ep = torch.export(model)

gm = ep.module() # obtain fx.GraphModule. API name may change
quantized_gm = calls_to_quantization_api(gm)
quantized_ep = torch.export(quantized_gm) # re-export for API compatibility

ep = quantized_ep.to_core_aten()

Timeline for this is to be determined, but this will NOT happen before PyTorch conference on 10/16/2023.

Why this change?

There are a couple of reasons.

This change aligns well with the long-term state, where capture_pre_autograd_graph is replaced with torch.export to obtain autograd-safe ATen IR, and the current use of exir.capture (or torch.export once replaced) is replaced with to_core_aten to obtain ATen IR with the core ATen opset.

In the long term, export for quantization won't be separate. Quantization will be an optional step, like delegation, in the export journey. Thus aligning with that in the short term helps because:

  • it gives users a correct mental model of how quantization fits in the export workflow, and
  • export problems don't become quantization problems.

Why the change now?

To minimize migration pain later and to align better with the long-term changes.

Docs previews do not delete files from previous PR versions

During the lifetime of #1082, I renamed a couple of .md files.

In the latest version of the docs preview, I can see the new file name at https://docs-preview.pytorch.org/pytorch/executorch/1082/backend-delegates-integration.html

But, I can also see an old, stale version of the previous file name at https://docs-preview.pytorch.org/pytorch/executorch/1082/native-delegates-integration.html

It seems like the preview job doesn't delete the old files when running again.

Leaving old files around could lead to unintentional breakages. E.g., if I had renamed the files but forgotten to update a link to those files, my testing might have led me to believe that the links worked, because they pointed to actual files and didn't 404, even if they weren't technically the right files.

Fix Permute_Memory_Format_Pass

The TOSA backend requires the channels-last memory format (NHWC) for operators like Conv2d.

The Permute_Memory_Format_Pass will be needed to satisfy the channels-last requirement in the lowering stage.

However, this pass is still WIP, and the shapes of neighboring operators are not yet properly updated. Creating an issue ticket here to track the fix and development.
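
For reference, a minimal illustration of the channels-last (NHWC) memory format the pass needs to satisfy, using the standard PyTorch API (shown only to pin down the requirement, not as part of the pass itself):

import torch

x = torch.randn(1, 8, 32, 32)                   # contiguous NCHW tensor
x_cl = x.to(memory_format=torch.channels_last)  # same shape, NHWC in memory
print(x_cl.stride())                            # strides reveal the channels-last layout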

Delegating to a backend

Hi,

I am taking the steps to run my model for inference in my Android app.

As I understand after building the exported model, the next step is to delegate to a backend. Is this right?

I have a couple of questions regarding the sample code for lowering a module in

a) Lowering the Whole Module (https://pytorch.org/executorch/stable/tutorials/export-to-executorch-tutorial.html#lowering-the-whole-module)
and
b) Lowering a model to XNNPACK (https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html#lowering-a-model-to-xnnpack)

Specifically:

  1. In a), the file that is saved is the lowered module, while in b) it is the result of to_executorch(). Aren't they different? Which is right? What am I missing?

  2. The call to to_backend in a) takes a number of arguments, while in b) its only argument is the backend name. Can I use either one? Is one preferable over the other?

I intend to use XNNPACK. Should I just go with code like the one in b), or does it not really matter, since apart from the two points described above I did not notice significant differences?

Thanks

From https://pytorch.org/executorch/stable/tutorials/export-to-executorch-tutorial.html#lowering-the-whole-module

class LowerableModule(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.sin(x)


# Export and lower the module to Edge Dialect
example_args = (torch.ones(1),)
pre_autograd_aten_dialect = capture_pre_autograd_graph(LowerableModule(), example_args)
aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, example_args)
edge_program: EdgeProgramManager = to_edge(aten_dialect)
to_be_lowered_module = edge_program.exported_program()

from executorch.exir.backend.backend_api import LoweredBackendModule, to_backend

# Import the backend
from executorch.exir.backend.test.backend_with_compiler_demo import (  # noqa
    BackendWithCompilerDemo,
)

# Lower the module
lowered_module: LoweredBackendModule = to_backend(
    "BackendWithCompilerDemo", to_be_lowered_module, []
)
print(lowered_module)
print(lowered_module.backend_id)
print(lowered_module.processed_bytes)
print(lowered_module.original_module)

# Serialize and save it to a file
save_path = "delegate.pte"
with open(save_path, "wb") as f:
    f.write(lowered_module.buffer())

From https://pytorch.org/executorch/stable/tutorial-xnnpack-delegate-lowering.html#lowering-a-model-to-xnnpack

import torch
import torchvision.models as models

from torch.export import export
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner
from executorch.exir import to_edge


mobilenet_v2 = models.mobilenetv2.mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT).eval()
sample_inputs = (torch.randn(1, 3, 224, 224), )

edge = to_edge(export(mobilenet_v2, sample_inputs))

edge = edge.to_backend(XnnpackPartitioner)

exec_prog = edge.to_executorch()

with open("xnnpack_mobilenetv2.pte", "wb") as file:
    file.write(exec_prog.buffer)

Ninja: build stopped: subcommand failed.

Cannot run the command python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" for "exporting and bundling the MobileNet v3 model" in the ExecuTorch iOS Demo App tutorial.

Env: macOS terminal

I have installed all the additional dependencies as per the previous steps, but I'm getting an error at this stage. Is there something I might have missed?

error:
(executorch) qiuyicheng@qiuyc-MacBook-Pro executorch % python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3"
Traceback (most recent call last):
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
subprocess.run(
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/qiuyicheng/QIU/Code/pytorch/executorch/examples/apple/mps/scripts/mps_example.py", line 15, in
from executorch.backends.apple.mps.mps_preprocess import MPSBackend
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/backends/apple/mps/mps_preprocess.py", line 10, in
from executorch.backends.apple.mps.utils.graph_bindings import graph_bindings
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/executorch/backends/apple/mps/utils/graph_bindings.py", line 37, in
graph_bindings = torch.utils.cpp_extension.load(
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1308, in load
return _jit_compile(
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1710, in _jit_compile
_write_ninja_file_and_build_library(
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1823, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'MPSGraphBindings': [1/2] c++ -MMD -MF MPSGraphInterface.o.d -DTORCH_EXTENSION_NAME=MPSGraphBindings -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -I/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include/TH -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include/THC -isystem /Users/qiuyicheng/miniconda3/envs/executorch/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm -o MPSGraphInterface.o
FAILED: MPSGraphInterface.o
c++ -MMD -MF MPSGraphInterface.o.d -DTORCH_EXTENSION_NAME=MPSGraphBindings -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_clang" -DPYBIND11_STDLIB="_libcpp" -DPYBIND11_BUILD_ABI="_cxxabi1002" -I/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include/TH -isystem /Users/qiuyicheng/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/include/THC -isystem /Users/qiuyicheng/miniconda3/envs/executorch/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -c /Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm -o MPSGraphInterface.o
/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm:118:3: error: unknown type name 'MPSGraphExecutableSerializationDescriptor'; did you mean 'MPSGraphExecutableExecutionDescriptor'?
MPSGraphExecutableSerializationDescriptor *serializationDescriptor = [MPSGraphExecutableSerializationDescriptor new];
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MPSGraphExecutableExecutionDescriptor
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/MetalPerformanceShadersGraph.framework/Headers/MPSGraphExecutable.h:33:12: note: 'MPSGraphExecutableExecutionDescriptor' declared here
@interface MPSGraphExecutableExecutionDescriptor : NSObject
^
/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm:118:73: error: use of undeclared identifier 'MPSGraphExecutableSerializationDescriptor'
MPSGraphExecutableSerializationDescriptor *serializationDescriptor = [MPSGraphExecutableSerializationDescriptor new];
^
/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm:119:27: error: property 'deploymentPlatform' not found on object of type 'MPSGraphExecutableExecutionDescriptor '
serializationDescriptor.deploymentPlatform = MPSGraphDeploymentPlatformMacOS;
^
/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm:119:48: error: use of undeclared identifier 'MPSGraphDeploymentPlatformMacOS'
serializationDescriptor.deploymentPlatform = MPSGraphDeploymentPlatformMacOS;
^
/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm:120:27: error: property 'minimumDeploymentTarget' not found on object of type 'MPSGraphExecutableExecutionDescriptor '
serializationDescriptor.minimumDeploymentTarget = @"14.0.0";
^
/Users/qiuyicheng/QIU/Code/pytorch/executorch/backends/apple/mps/utils/MPSGraphInterface.mm:121:9: warning: instance method '-serializeToMPSGraphPackageAtURL:descriptor:' not found (return type defaults to 'id') [-Wobjc-method-access]
[exec serializeToMPSGraphPackageAtURL:bundleURL descriptor:serializationDescriptor];
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/MetalPerformanceShadersGraph.framework/Headers/MPSGraphExecutable.h:80:12: note: receiver is instance of class declared here
@interface MPSGraphExecutable : NSObject
^
1 warning and 5 errors generated.
ninja: build stopped: subcommand failed.

Build runtime - required headers

What C++ headers should I have on my system before building the runtime? I'm on Ubuntu, so it would be great to have a list of devel packages providing those headers.

--edit--
For Ubuntu 22.04, which clang version do you suggest?
--edit2--
Should I install the corresponding LLVM?

--edit3--
I have installed these packages:

clang-15 
clang-tools-15 
libclang-15-dev
libclang-cpp15
libclang-cpp15-dev
libclang-common-15-dev 
libc++-15-dev
llvm-15 
llvm-15-dev 
lld-15 
liblldb-15-dev 
liblld-15

Questions on deploying Quantized models ...

Hi,

This is more of a question than an issue, but I couldn't find documentation or source code examples that address this. We have a backend that only supports fixed-point operators, and I am trying to evaluate using ExecuTorch to deploy to our platform. I am new to using PyTorch as a deployment platform, so please bear with me if my question is too basic.

When I use PyTorch quantization, I see that it creates a graph in the following format, where each operator is sandwiched between dequant and quant ops:

  ... -> dequant -> opX -> quant -> dequant -> opY -> quant -> ...

So, when I use executorch partitioning, is the expectation that we pattern match dequant -> opX -> quant for lowering into some fixed-point primitive supported on the backend?
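
(For context, a minimal sketch of what I imagine that pattern matching could look like over the exported fx graph; the quantize/dequantize targets below are the PT2E decomposed variants and are my assumption, since the exact op names depend on the quantization flow used.)

import torch

Q = torch.ops.quantized_decomposed.quantize_per_tensor.default
DQ = torch.ops.quantized_decomposed.dequantize_per_tensor.default

def find_dq_op_q(gm: torch.fx.GraphModule, op):
    # Yield (dequant inputs, op node, quant users) triples: the
    # dequant -> op -> quant sandwich a backend could lower to a
    # fixed-point primitive.
    for node in gm.graph.nodes:
        if node.op == "call_function" and node.target == op:
            dq_inputs = [a for a in node.args
                         if isinstance(a, torch.fx.Node) and a.target == DQ]
            q_users = [u for u in node.users if u.target == Q]
            if dq_inputs and q_users:
                yield dq_inputs, node, q_users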

Suppose I have a Python model of each fixed-point op; is there any straightforward way to run the executorch program directly in Python by substituting the Python model for the corresponding lowered module? Since the graph schema is known, it should be possible to do this myself, but I am wondering if someone has already solved this problem.

If I lower the entire graph onto the backend as a single lowered module, I suppose that the memory planning doesn't apply inside the lowered module - i.e., the lowered module needs to take care of memory planning of tensors inside the module?

Finally, is there an example that shows how I can pass already quantized inputs to the executorch program? For example, if I use fixed quantization for inputs and outputs, clients can directly pass quantized inputs and outputs without the need to deal with floating point data. Is this possible with executorch?

Appreciate your help with my questions. This is an impressive platform!

Thanks,
Vijay.

Add `POET (Private Optimal Energy Training)` to `EXECUTORCH`

motivation and pitch

POET enables the training of state-of-the-art memory-hungry ML models on smartphones and other edge devices. POET (Private Optimal Energy Training) exploits the twin techniques of integrated tensor rematerialization and paging in/out of secondary storage (as detailed in our paper at ICML 2022) to optimize models for training with limited memory. POET's Mixed Integer Linear Programming (MILP) formulation ensures the solutions are provably optimal!

implementation available

https://github.com/ShishirPatil/poet
https://arxiv.org/abs/2207.07697

[v0.1.0] Release Tracker

Release Tracker for 0.1.0 release

UPDATE (Oct 16, 2023): We are done accepting cherry-pick fixes. A new release tag has been created.

Will use this thread to do QA and cherry-pick fixes.

  • For doc-only fixes (i.e., changes solely in executorch/docs), we will periodically (i.e., once a day) cherry-pick them as a batch. You don't need to do anything.
  • For bug fixes that need cherry-picking, here's how you can proceed.

Reference: HUD dashboard for CI results in the release branch.

Cherry-pick process:

  • Land to main branch via the normal phabricator flow.
  • Cherry-pick to release/0.1, but do not land it. See details below.
     # in main branch
     git log # find the commit hash of the cherry-pick


     git fetch origin release/0.1
     git checkout release/0.1
     git checkout -b ${USER}/my_fork 
     git cherry-pick <commit_hash> 

     git push
     # may need to do git push --set-upstream origin ${USER}/my_fork 
     
     # Create a pull request for 'my_fork' on GitHub by visiting
     # Ex: https://github.com/pytorch/executorch/pull/new/${USER}/my_fork 
     
     # Example UI: https://www.internalfb.com/intern/px/p/3zxVb
  • Post here the main branch PR, the release branch PR, and a brief description of why you need this. Note that if you use - to make a list, the GitHub UI will show a richer version of the PR link instead of just #123.

Tasks

No tasks being tracked yet.

Build on windows

Does it support building on Windows and linking ExecuTorch for Windows inference (without libtorch)?

"Build doc" flaky failure copying to /docs

Noticed this docs failure on the CI hud: https://github.com/pytorch/executorch/actions/runs/6266414923/job/17017344924#step:11:216

The failing script line is https://github.com/pytorch/executorch/blob/main/.github/workflows/doc-build.yml#L47

RUNNER_DOCS_DIR is set to /docs in this case, but it looks like the runner doesn't have permission to create files/dirs under that directory.

This job succeeds on other commits, so it looks flaky. We should figure out how to make it more robust.

git submodule update --init fails

git clone --branch v0.1.0 https://github.com/pytorch/executorch.git
cd executorch
git submodule sync
git submodule update --init

fails with

fatal: unable to access 'https://git.mlplatform.org/ml/ethos-u/ethos-u-core-driver.git/': SSL: no alternative certificate subject name matches target host name 'git.mlplatform.org'
fatal: clone of 'https://git.mlplatform.org/ml/ethos-u/ethos-u-core-driver.git' into submodule path 'C:/Users/some_path/executorch/backends/arm/third-party/ethos-u-core-driver' failed
Failed to clone 'backends/arm/third-party/ethos-u-core-driver' a second time, aborting

Pybind support

After enabling pybind, we should be able to migrate a lot more internal tests to OSS. Currently the pybind repo is checked into third-party/, but it needs some wiring to work with the python_cpp_extension BUCK macros.

[CI]LintRunner Version Mismatch

Inside public CI, the version of lintrunner is 0.11.0, while lintrunner-adapters is 0.9.0.

However, the latest version of lintrunner-adapters is also 0.11.0. Is there a specific reason for using a slightly older version of lintrunner-adapters, or has it just not been updated yet in the Docker image?

The latest lintrunner version: https://pypi.org/project/lintrunner/#history
The latest lintrunner-adapters version: https://pypi.org/project/lintrunner-adapters/#history

Requirement already satisfied: lintrunner==0.11.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from -r requirements-lintrunner.txt (line 2)) (0.11.0)
Requirement already satisfied: lintrunner-adapters==0.9.0 in /opt/conda/envs/py_3.10/lib/python3.10/site-packages (from -r requirements-lintrunner.txt (line 3)) (0.9.0)

Error with lowering ViT to QNN backend

Please use the mobilenet_v2.py script to repro, replacing mv2 with vit, then run the script. I skipped the adb model-run part.

Got the following error:

[INFO][Qnn ExecuTorch] create QNN Logger with log_level 2
[WARNING]QnnDsp <W> Initializing HtpProvider

[INFO][Qnn ExecuTorch] Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO][Qnn ExecuTorch] Caching: Caching is in SAVE MODE.
[INFO][Qnn ExecuTorch] Running level=3 optimization.
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.view_copy.default | True
[ERROR]QnnDsp <E> validateNativeOps aten_linear_default_2:qti.aisw:FullyConnected op validator (quantized and FP16) failed 3110 and 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_linear_default_2 with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.linear.default | False
[QNN Partitioner Op Support]: aten.view_copy.default | True

[... the same validation failure repeats for aten_linear_default_3, _6, _7, ..., _46, _47, each followed by "aten.linear.default | False" and "aten.view_copy.default | True" ...]
[QNN Partitioner Op Support]: aten.linear.default | True
[QNN Partitioner Op Support]: aten.linear.default | True
[ERROR]Tensor 0 and 0 have mismatching datatypes. 0x408 != 0x232.

[ERROR]Op specific validation failed.

[ERROR]QnnDsp <E> validateNativeOps master op validator aten_select_copy_int_36:qti.aisw:StridedSlice failed 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_select_copy_int_36 with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.select_copy.int | False
Traceback (most recent call last):
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/hsz/executorch/examples/qualcomm/scripts/vit.py", line 132, in <module>
    build_executorch_binary(
  File "/home/hsz/executorch/examples/qualcomm/scripts/utils.py", line 169, in build_executorch_binary
    edge_prog.exported_program = to_backend(edge_prog.exported_program, QnnPartitioner)
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/functools.py", line 878, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/hsz/executorch/exir/backend/backend_api.py", line 296, in _
    partitioner_result = partitioner_instance(copied_edge_program)
  File "/home/hsz/executorch/exir/backend/partitioner.py", line 57, in __call__
    return self.partition(exported_program)
  File "/home/hsz/executorch/backends/qualcomm/partition/qnn_partitioner.py", line 127, in partition
    partitions = self.generate_partitions(graph_module)
  File "/home/hsz/executorch/backends/qualcomm/partition/qnn_partitioner.py", line 111, in generate_partitions
    return generate_partitions_from_list_of_nodes(
  File "/home/hsz/executorch/exir/backend/canonical_partitioners/pattern_op_partitioner.py", line 50, in generate_partitions_from_list_of_nodes
    partition_list = capability_partitioner.propose_partitions()
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/passes/infra/partitioner.py", line 150, in propose_partitions
    if self.__is_node_supported(node) and node not in assignment:
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/passes/infra/partitioner.py", line 55, in __is_node_supported
    self.operator_support.is_node_supported(dict(self.graph_module.named_modules()), node)
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/passes/operator_support.py", line 145, in is_node_supported
    return is_node_supported(submodules, node)
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/passes/operator_support.py", line 170, in _any_chain
    return any(
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/fx/passes/operator_support.py", line 171, in <genexpr>
    x.is_node_supported(submods, node)
  File "/home/hsz/executorch/backends/qualcomm/partition/qnn_partitioner.py", line 50, in is_node_supported
    op_wrapper = self.node_visitors[node.target.__name__].define_node(
KeyError: 'aten.native_layer_norm.default'
[INFO][Qnn ExecuTorch] Destroy Qnn context
[INFO][Qnn ExecuTorch] Destroy Qnn device
[INFO][Qnn ExecuTorch] Destroy Qnn backend

Support complex data type in ExecuTorch

Currently the complex data type is not supported in the ExecuTorch runtime. Although a model may be rewritten to use real-number operations, doing so would hurt the user's development efficiency.
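
For illustration, a minimal sketch of the kind of manual rewrite referred to above: a complex multiply carried as separate real and imaginary tensors (a hypothetical helper, not an ExecuTorch API):

import torch

def complex_mul(ar, ai, br, bi):
    # (ar + i*ai) * (br + i*bi), with real and imaginary parts as real tensors
    return ar * br - ai * bi, ar * bi + ai * br

real, imag = complex_mul(torch.randn(4), torch.randn(4),
                         torch.randn(4), torch.randn(4))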

Vulkan support

I can see a module with the Vulkan backend, but it looks unfinished. Are there plans to implement it?

[pt2e to tosa] facing AttributeError

Hi @Jerry-Ge ,

I have run the https://github.com/pytorch/executorch/blob/main/examples/arm/run.sh example successfully, and now I am trying to modify it to run a quantized int8 PyTorch model, which needs to pass through Vela on the FVP using the Arm Ethos-U55.

I used the PyTorch MNIST classification CNN model and quantized it to int8 via convert_pt2e. The results of the int8 model seem correct.
I then want to export it to ExecuTorch with the Arm U55 backend, but I hit AttributeError: 'ReshapeAttribute' object has no attribute 'NewshapeAsNumpy'. Did you mean: 'NewShapeAsNumpy'? while doing edge = edge.to_backend(ArmPartitioner).
How can I fix it?

The following is my export code:

from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.optim.lr_scheduler import StepLR
from torch.ao.quantization import get_default_qconfig_mapping
from torch.quantization.quantize_fx import prepare_fx, convert_fx
from torch.ao.quantization import QuantStub, DeQuantStub
import cv2
import numpy as np

import logging

import torch._export as export

from executorch.backends.arm.arm_backend import ArmPartitioner
from executorch.exir import EdgeCompileConfig

from ..portable.utils import export_to_edge, save_pte_program
class Net(nn.Module):
    def __init__(self): 
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 8, 3, 1)
        self.conv2 = nn.Conv2d(8, 16, 3, 1)
        self.conv3 = nn.Conv2d(16, 32, 5, 1)
        self.fc1 = nn.Linear(32, 64)
        self.fc2 = nn.Linear(64, 10)
    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2,stride=2)
        x = self.conv2(x)
        x = F.max_pool2d(x, 2,stride=2)
        x = self.conv3(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = self.fc2(x)
        output = F.softmax(x, dim=1)
        return output


def train(args, model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.cross_entropy(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % args.log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            if args.dry_run:
                break


def test(model, device, test_loader):
    # model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))

def calibrate(model, data_loader):  
    # model.eval()
    with torch.no_grad():
        for image, target in data_loader:
            model(image)
def main():
    # Training settings
    parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
    parser.add_argument('--batch-size', type=int, default=64, metavar='N',
                        help='input batch size for training (default: 64)')
    parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
                        help='input batch size for testing (default: 1000)')
    parser.add_argument('--epochs', type=int, default=14, metavar='N',
                        help='number of epochs to train (default: 14)')
    parser.add_argument('--lr', type=float, default=1.0, metavar='LR',
                        help='learning rate (default: 1.0)')
    parser.add_argument('--gamma', type=float, default=0.7, metavar='M',
                        help='Learning rate step gamma (default: 0.7)')
    parser.add_argument('--no-cuda', action='store_true', default=False,
                        help='disables CUDA training')
    parser.add_argument('--dry-run', action='store_true', default=False,
                        help='quickly check a single pass')
    parser.add_argument('--seed', type=int, default=1, metavar='S',
                        help='random seed (default: 1)')
    parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                        help='how many batches to wait before logging training status')
    parser.add_argument('--save-model', action='store_true', default=False,
                        help='For Saving the current Model')
    parser.add_argument(
        "-d",
        "--delegate",
        action="store_true",
        required=False,
        default=False,
        help="Flag for producing ArmBackend delegated model",
    )
    args = parser.parse_args()
    use_cuda = not args.no_cuda and torch.cuda.is_available()

    torch.manual_seed(args.seed)

    device = torch.device("cuda" if use_cuda else "cpu")

    train_kwargs = {'batch_size': args.batch_size}
    test_kwargs = {'batch_size': args.test_batch_size}
    if use_cuda:
        cuda_kwargs = {'num_workers': 1,
                       'pin_memory': True,
                       'shuffle': True}
        train_kwargs.update(cuda_kwargs)
        test_kwargs.update(cuda_kwargs)

    transform=transforms.Compose([
        transforms.ToTensor()
        ])
    dataset1 = datasets.MNIST('./data', train=True, download=True,
                       transform=transform)
    dataset2 = datasets.MNIST('./data', train=False,
                       transform=transform)
    train_loader = torch.utils.data.DataLoader(dataset1,**train_kwargs)
    test_loader = torch.utils.data.DataLoader(dataset2, **test_kwargs)

    float_model = Net().to(device)
    float_model.load_state_dict(torch.load("./pytorch_mnist_cnn_floating.pt"))
    float_model.eval()  
    model_to_quantize = Net().to(device)
    model_to_quantize.load_state_dict(torch.load("./pytorch_mnist_cnn_floating.pt"))
    model_to_quantize.eval()

    from torch._export import capture_pre_autograd_graph

    example_inputs = (torch.randn(1, 1, 28,28),)
    exported_model = capture_pre_autograd_graph(model_to_quantize, example_inputs)
    # or capture with dynamic dimensions
    # from torch._export import dynamic_dim
    # exported_model = capture_pre_autograd_graph(model_to_quantize, example_inputs, constraints=[dynamic_dim(example_inputs[0], 0)])
    from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
    )
    quantizer = XNNPACKQuantizer()
    quantizer.set_global(get_symmetric_quantization_config())

    from torch.ao.quantization.quantize_pt2e import (
    prepare_pt2e,
    convert_pt2e,
    )
    prepared_model = prepare_pt2e(exported_model, quantizer)
    print(prepared_model.graph)

    calibrate(prepared_model, train_loader)  

    quantized_model = convert_pt2e(prepared_model)

    ################################################################
    ################################################################
    # pre-autograd export. eventually this will become torch.export
    # model = export.capture_pre_autograd_graph(quantized_model, example_inputs)
    print("convert_pt2e(prepared_model)done ")
    edge = export_to_edge(
        quantized_model,
        example_inputs,
        edge_compile_config=EdgeCompileConfig(
            _check_ir_validity=False,
        ),
    )
    print("export_to_edge done ")
    logging.info(f"Exported graph:\n{edge.exported_program().graph}")

    delegate = args.delegate
    model_name = "pytorch_mnist_cnn_ptq_qnnpack"
    if delegate is True:
        edge = edge.to_backend(ArmPartitioner)
        logging.info(f"Lowered graph:\n{edge.exported_program().graph}")
    print("edge.to_backend(ArmPartitioner) done ")
    exec_prog = edge.to_executorch()
    print("edge.to_executorch() done ")
    model_name = f"{model_name}" + (
        "_arm_delegate" if delegate is True else ""
    )
    save_pte_program(exec_prog.buffer, model_name)

    # delegate = args.delegate
    # # model_name = args.model_name + str_qconfig_mapping
    # model_name = args.model_name
    # if delegate is True:
    #     edge = edge.to_backend(ArmPartitioner)
    #     logging.info(f"Lowered graph:\n{edge.exported_program().graph}")

    # exec_prog = edge.to_executorch()

    # model_name = f"{model_name}" + (
    #     "_arm_delegate" if delegate is True else ""
    # )
    # save_pte_program(exec_prog.buffer, model_name)



if __name__ == '__main__':
    main()

(Screenshots attached: 2023-11-07 16-54-56 and 2023-11-07 16-48-22.)

Exporting mobilenetv2 to pte

Hi All,

first of all, ExecuTorch looks really interesting. Thank you for releasing it!

To give it a try, I wanted to export the simple MobileNetV2 architecture to .pte and run it using the ExecuTorch engine.

Following the exporting tutorial, I used the script from here https://pytorch.org/executorch/stable/tutorials/export-to-executorch-tutorial.html#saving-to-a-file

However, I get an error:

torch._export.verifier.SpecViolationError: Operator torch._ops.aten._native_batch_norm_legit_functional.default is not Aten Canonical.

Does it mean that for BatchNorm I need some extra handling?

Exporting simple example without BatchNorm works for me.

The minimal reproducible example:

import torch
import torchvision

import executorch.exir as exir

from torch._export import capture_pre_autograd_graph
from torch.export import export, ExportedProgram

m = torchvision.models.MobileNetV2()

example_args = (torch.randn(1, 3, 224, 224),)
pre_autograd_aten_dialect = capture_pre_autograd_graph(m, example_args)

# Optionally do quantization:
# pre_autograd_aten_dialect = convert_pt2e(prepare_pt2e(pre_autograd_aten_dialect, CustomBackendQuantizer))

aten_dialect: ExportedProgram = export(pre_autograd_aten_dialect, example_args)
edge_program: exir.EdgeProgramManager = exir.to_edge(aten_dialect)

# Optionally do delegation:
# edge_program = edge_program.to_backend(CustomBackendPartitioner)

executorch_program: exir.ExecutorchProgramManager = edge_program.to_executorch(
    exir.ExecutorchBackendConfig(
        passes=[],  # User-defined passes
    )
)

with open("mobilenet_v2.pte", "wb") as file:
    file.write(executorch_program.buffer)
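
One guess on my side (untested): the model above is constructed in training mode, while the export tutorials seem to use eval mode, e.g.:

m = torchvision.models.MobileNetV2().eval()  # assumption: eval() may avoid the training-mode batch norm op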

The environment:

I installed executorch using git clone --branch v0.1.0 https://github.com/pytorch/executorch.git (as in the docs)
Versions:

torch==2.2.0.dev20231010+cpu
torchvision==0.17.0.dev20231010+cpu

Python 3.10.

If you need any more details, please let me know.

Thank you for any suggestions.

Best regards,
Michał

Error lowering inception v3 to QNN

Please use the mobilenet_v2.py script to reproduce, replacing mv2 with ic3.
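Based on the traceback at the end of the log, the script is invoked as a module; the repro command (device and build flags omitted here, since they depend on the local setup) looks like:

python -m examples.qualcomm.scripts.inception_v3 [usual device/build flags]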

Got the following error:

[INFO][Qnn ExecuTorch] create QNN Logger with log_level 2
[WARNING]QnnDsp <W> Initializing HtpProvider

[INFO][Qnn ExecuTorch] Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO][Qnn ExecuTorch] Caching: Caching is in SAVE MODE.
[INFO][Qnn ExecuTorch] Running level=3 optimization.
[QNN Partitioner Op Support]: aten.convolution.default | True
(the line above repeats 94 times in the original log, once per convolution op)
[ERROR]QnnDsp <E> validateNativeOps aten_linear_default:qti.aisw:FullyConnected op validator (quantized and FP16) failed 3110 and 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_linear_default with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.linear.default | False
[ERROR]QnnDsp <E> validateNativeOps aten_linear_default:qti.aisw:FullyConnected op validator (quantized and FP16) failed 3110 and 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_linear_default with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.linear.default | False
[QNN Partitioner Op Support]: aten.view_copy.default | True
[QNN Partitioner Op Support]: aten.mean.dim | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.max_pool2d_with_indices.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.max_pool2d_with_indices.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.avg_pool2d.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.max_pool2d_with_indices.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.max_pool2d_with_indices.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.relu.default | True
[QNN Partitioner Op Support]: aten.convolution.default | True
[QNN Partitioner Op Support]: aten.cat.default | True
[QNN Partitioner Op Support]: aten.add.Tensor | True
[QNN Partitioner Op Support]: aten.mul.Tensor | True
[ERROR]Tensor 0 and 0 have mismatching datatypes. 0x232 != 0x408.

[ERROR]Op specific validation failed.

[ERROR]QnnDsp <E> validateNativeOps master op validator aten_unsqueeze_copy_default_2:qti.aisw:ExpandDims failed 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_unsqueeze_copy_default_2 with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.unsqueeze_copy.default | False
[QNN Partitioner Op Support]: aten.select_copy.int | True
[QNN Partitioner Op Support]: aten.add.Tensor | True
[QNN Partitioner Op Support]: aten.mul.Tensor | True
[ERROR]Tensor 0 and 0 have mismatching datatypes. 0x232 != 0x408.

[ERROR]Op specific validation failed.

[ERROR]QnnDsp <E> validateNativeOps master op validator aten_unsqueeze_copy_default_1:qti.aisw:ExpandDims failed 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_unsqueeze_copy_default_1 with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.unsqueeze_copy.default | False
[QNN Partitioner Op Support]: aten.select_copy.int | True
[QNN Partitioner Op Support]: aten.add.Tensor | True
[QNN Partitioner Op Support]: aten.mul.Tensor | True
[ERROR]Tensor 0 and 0 have mismatching datatypes. 0x232 != 0x408.

[ERROR]Op specific validation failed.

[ERROR]QnnDsp <E> validateNativeOps master op validator aten_unsqueeze_copy_default:qti.aisw:ExpandDims failed 3110

[ERROR]QnnDsp <E> QnnBackend_validateOpConfig failed 3110

[ERROR]QnnDsp <E> Failed to validate op aten_unsqueeze_copy_default with error 0xc26

[WARNING][Qnn ExecuTorch] Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.unsqueeze_copy.default | False
[QNN Partitioner Op Support]: aten.select_copy.int | True
[INFO][Qnn ExecuTorch] create QNN Logger with log_level 2
[INFO][Qnn ExecuTorch] Initialize Qnn backend parameters for Qnn executorch backend type 2
[INFO][Qnn ExecuTorch] Caching: Caching is in SAVE MODE.
[INFO][Qnn ExecuTorch] Running level=3 optimization.
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_1, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_2, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_3, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_4, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_5, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_6, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_7, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_quantize_per_tensor_default_8, quantized_decomposed.quantize_per_tensor.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_mul_tensor, aten.mul.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_mul_tensor_1, aten.mul.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_mul_tensor_2, aten.mul.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_add_tensor, aten.add.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_add_tensor_1, aten.add.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_add_tensor_2, aten.add.Tensor
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_1, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_1, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_2, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_2, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_max_pool2d_with_indices_default, aten.max_pool2d_with_indices.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: getitem, getitem
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_3, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_3, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_4, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_4, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_max_pool2d_with_indices_default_1, aten.max_pool2d_with_indices.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: getitem_1, getitem
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_5, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_6, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_8, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_5, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_6, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_8, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_11, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_7, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_9, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_11, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_7, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_9, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_10, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_10, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_1, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_12, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_13, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_15, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_1, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_12, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_13, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_15, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_18, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_14, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_16, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_18, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_14, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_16, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_17, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_17, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_2, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_19, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_20, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_22, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_2, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_19, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_20, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_22, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_25, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_21, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_23, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_25, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_21, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_23, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_24, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_24, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_3, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_26, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_27, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_max_pool2d_with_indices_default_2, aten.max_pool2d_with_indices.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_26, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_27, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: getitem_2, getitem
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_28, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_28, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_29, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_29, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_4, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_30, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_31, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_34, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_3, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_30, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_31, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_34, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_39, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_32, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_35, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_39, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_32, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_35, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_33, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_36, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_33, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_36, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_37, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_37, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_38, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_38, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_5, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_40, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_41, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_44, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_4, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_40, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_41, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_44, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_49, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_42, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_45, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_49, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_42, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_45, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_43, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_46, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_43, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_46, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_47, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_47, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_48, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_48, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_6, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_50, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_51, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_54, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_5, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_50, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_51, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_54, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_59, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_52, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_55, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_59, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_52, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_55, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_53, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_56, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_53, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_56, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_57, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_57, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_58, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_58, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_7, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_60, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_61, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_64, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_6, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_60, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_61, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_64, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_69, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_62, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_65, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_69, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_62, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_65, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_63, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_66, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_63, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_66, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_67, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_67, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_68, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_68, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_8, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_70, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_72, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_max_pool2d_with_indices_default_3, aten.max_pool2d_with_indices.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_70, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_72, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: getitem_3, getitem
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_71, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_73, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_71, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_73, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_74, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_74, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_75, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_75, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_9, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_76, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_77, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_80, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_7, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_76, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_77, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_80, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_84, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_78, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_79, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_81, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_84, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_78, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_79, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_81, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_10, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_82, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_83, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_82, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_83, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_11, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_12, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_85, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_86, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_89, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_avg_pool2d_default_8, aten.avg_pool2d.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_85, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_86, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_89, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_93, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_87, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_88, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_90, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_93, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_87, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_88, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_90, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_13, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_91, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_convolution_default_92, aten.convolution.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_91, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_relu_default_92, aten.relu.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_14, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_cat_default_15, aten.cat.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_mean_dim, aten.mean.dim
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_permute_copy_default, aten.permute_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: aten_view_copy_default, aten.view_copy.default
INFO:executorch.backends.qualcomm.qnn_preprocess:Visiting: quantized_decomposed_dequantize_per_tensor_tensor, quantized_decomposed.dequantize_per_tensor.tensor
concat_opt.cc:303:ERROR:Concat node 10ab0000001f: input sizes on dim 3 add up to 672, output size is 3
graph_prepare.cc:4280:ERROR:Exception during graph prepare. bad concat detected
[ERROR]QnnDsp <E> graph prepare failed 99

[ERROR]QnnDsp <E> Failed to finalize graph executorch with err: 1002

[ERROR]QnnDsp <E> Failed to finalize graph (id: 2) with err 1002

[ERROR][Qnn ExecuTorch] Failed to finalize Qnn Graph with error: 1002
Traceback (most recent call last):
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/hsz/executorch/examples/qualcomm/scripts/inception_v3.py", line 131, in <module>
    build_executorch_binary(
  File "/home/hsz/executorch/examples/qualcomm/scripts/utils.py", line 169, in build_executorch_binary
    edge_prog.exported_program = to_backend(edge_prog.exported_program, QnnPartitioner)
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/functools.py", line 878, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/hsz/executorch/exir/backend/backend_api.py", line 312, in _
    tagged_graph_module = _partition_and_lower(
  File "/home/hsz/executorch/exir/backend/backend_api.py", line 189, in _partition_and_lower
    lowered_submodule = to_backend(
  File "/home/hsz/local/miniconda3/envs/executorch/lib/python3.10/functools.py", line 878, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/home/hsz/executorch/exir/backend/backend_api.py", line 110, in _
    preprocess_result: PreprocessResult = cls.preprocess(
  File "/home/hsz/executorch/backends/qualcomm/qnn_preprocess.py", line 77, in preprocess
    assert len(qnn_context_binary) != 0, "Failed to generate Qnn context binary."
AssertionError: Failed to generate Qnn context binary.
[INFO][Qnn ExecuTorch] Destroy Qnn context
[INFO][Qnn ExecuTorch] Destroy Qnn device
[INFO][Qnn ExecuTorch] Destroy Qnn backend
[INFO][Qnn ExecuTorch] Destroy Qnn context
[INFO][Qnn ExecuTorch] Destroy Qnn device
[INFO][Qnn ExecuTorch] Destroy Qnn backend

[build: Error initializing DaemonStateData] how to fix it?

hi,

I followed the tutorial to install buck2-x86_64-unknown-linux-musl.zst on my PC.
Then I tried to build

/tmp/buck2 build //examples/portable/executor_runner:executor_runner --show-output

but the build failed.

I also tried killall, as suggested in https://stackoverflow.com/questions/76771689/buck2-cant-create-inotify-watchers, and built again.
But the build still fails. Could somebody help me?
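For reference, the workaround discussed in that Stack Overflow thread amounts to killing the buck2 daemon and raising the inotify limits; a sketch (the limit values are the commonly suggested ones, not specific to this setup):

killall buck2
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512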

[Screenshot from 2023-10-31 attached in the original issue]

OS: Linux Ubuntu 20.04.4 LTS x86_64
buck2 version: 2023-07-18

Thanks,
Kris

Building executorch runtime

Sorry for this question.

From
https://pytorch.org/executorch/stable/demo-apps-android.html#xnnpack

it is not clear where in the directory tree to run the commands listed in this section.

I tried running them in a number of locations, including executorch/examples/demo-apps/android/jni, where the error was:

  add_subdirectory given source
  "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/examples/demo-apps/android/../../../third-party/fbjni"
  which is not an existing directory.

I guess it should be "obvious". It isn't to me. Sorry.

Thanks
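One plausible cause, inferred only from the error text above: third-party/fbjni is a git submodule, and the error would look like this if submodules were never checked out. Initializing them from the repository root (the same commands used elsewhere in this thread) should create that directory:

cd executorch
git submodule sync
git submodule update --init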

Compiling model for MPS delegate failed

  • System: macOS Sonoma 14.0
  • Hardware: Apple Silicon M2
/AppleInternal/Library/BuildRoots/75428952-3aa4-11ee-8b65-46d450270006/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphComputePackage.mm:271: failed assertion `Error: creating .mpsgraphpackage directory failed'
[1]    55399 abort      python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3"
/Users/ourfor/anaconda3/envs/executorch/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

How to visualize the pte model?

Hi,

I am now working on ExecuTorch. I want to see the model architecture of a .pte file, which would make it easier for us to debug.
However, I cannot find a visualization tool. Netron does not support the .pte format yet.
Could ExecuTorch support visualizing the .pte format model?

Besides, I wonder whether the export function will translate the ops in the PyTorch model to specific ops in the .pte format?

Thanks!!!
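Until a viewer exists for the serialized file, one workaround (a sketch that reuses the APIs shown in the MobileNetV2 example earlier in this thread; edge_program stands for an exir.EdgeProgramManager built the same way) is to print the graph before it is written out:

# edge_program: exir.EdgeProgramManager, built as in the MobileNetV2 example above
print(edge_program.exported_program().graph)              # Edge-dialect ops, one per line
print(edge_program.exported_program().graph_module.code)  # the same graph as Python-like code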


Executorch OSS CI tracking issue (2023)

I'm creating this initial issue to keep track of ExecuTorch OSS CI progress in 2023. P0 items need to be done before October.

  • (P0) Integrate executorch with GHA and PyTorch-org shareable runners (AWS, linux). This is needed to provide the infrastructure to run OSS CI.
  • (P0) Integrate executorch with GitHub webhook and Rockset DB. This is needed to keep track of what is run on OSS CI.
  • (P0) Set up e2e build and test on a Linux runner. This requires BUCK OSS to run.
    • (P0) Check with the ExecuTorch team whether they need support getting BUCK OSS set up.
    • (P0) Also set one up for Mac M1.
    • (P0) Add e2e testing for the rest of the sample models T160756042.
  • (P1) Set up devices (Android, iOS) to run benchmark tests. The preferred option is https://aws.amazon.com/device-farm
  • (P1) Sync with ExecuTorch and TorchBench on how to benchmark ExecuTorch.
  • (P1) Set up benchmark CI for ExecuTorch.
  • (P1) Add OSS alert for Executorch when CI workflow fails

The following are probably not needed for now; they are listed here only for reference.

  • (P2) Linter
  • (P2) Binaries release
  • (P2) GitHub First features such as merge bot, Dr.CI

cc @malfet @seemethere @mergennachin @dbort @clee2000

Issue with getting started example

Hi,

I followed the getting started document and tried to build the executor_runner binary but got the following error:

Action failed: root//examples/portable/executor_runner:executor_runner (cxx_link_executable)
Local command returned non-zero exit code 1
Reproduce locally: clang++ @buck-out/v2/gen/root/524f8da68ea2a374/examples/portable/executor_runner/__executor_runner__/executor_runner.linker.argsfile
stdout:
stderr: clang: error: invalid linker name in argument '-fuse-ld=lld'
Build ID: 3f6d5a07-3e35-4a51-9706-d5c1499375ba
Network: Up: 0 B  Down: 0 B
Jobs completed: 3. Time elapsed: 0.0s. Cache hits: 0%. Commands: 1 (cached: 0, remote: 0, local: 1)
BUILD FAILED
Failed to build 'root//examples/portable/executor_runner:executor_runner (prelude//platforms:default#524f8da68ea2a374)'

I'm using Ubuntu 22.04 and below is a list of my installed packages in the conda environment. Can you help me with this?

Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
attrs 23.1.0 pypi_0 pypi
bzip2 1.0.8 h7f98852_4 conda-forge
c-ares 1.21.0 hd590300_0 conda-forge
ca-certificates 2023.7.22 hbcca054_0 conda-forge
certifi 2023.7.22 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
cmake 3.27.6 hcfe8598_0 conda-forge
exceptiongroup 1.1.3 pypi_0 pypi
execnet 2.0.2 pypi_0 pypi
executorch 0.1.0 pypi_0 pypi
expecttest 0.1.6 pypi_0 pypi
filelock 3.13.1 pypi_0 pypi
flatbuffers 23.5.26 pypi_0 pypi
fsspec 2023.10.0 pypi_0 pypi
huggingface-hub 0.17.3 pypi_0 pypi
hypothesis 6.88.1 pypi_0 pypi
icu 73.2 h59595ed_0 conda-forge
idna 3.4 pypi_0 pypi
iniconfig 2.0.0 pypi_0 pypi
jinja2 3.1.2 pypi_0 pypi
keyutils 1.6.1 h166bdaf_0 conda-forge
krb5 1.21.2 h659d440_0 conda-forge
ld_impl_linux-64 2.40 h41732ed_0 conda-forge
libcurl 8.4.0 hca28451_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 h516909a_1 conda-forge
libexpat 2.5.0 hcb278e6_1 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 13.2.0 h807b86a_2 conda-forge
libgomp 13.2.0 h807b86a_2 conda-forge
libiconv 1.17 h166bdaf_0 conda-forge
libllvm17 17.0.4 h5cf9203_0 conda-forge
libnghttp2 1.55.1 h47da74e_0 conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libsqlite 3.44.0 h2797004_0 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 13.2.0 h7e041cc_2 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libuv 1.46.0 hd590300_0 conda-forge
libxml2 2.11.5 h232c23b_1 conda-forge
libzlib 1.2.13 hd590300_5 conda-forge
lld 17.0.4 hcfcaf08_0 conda-forge
markupsafe 2.1.2 pypi_0 pypi
mpmath 1.2.1 pypi_0 pypi
ncurses 6.4 h59595ed_2 conda-forge
networkx 3.0rc1 pypi_0 pypi
numpy 1.26.1 pypi_0 pypi
openssl 3.1.4 hd590300_0 conda-forge
packaging 23.2 pypi_0 pypi
pandas 2.1.2 pypi_0 pypi
parameterized 0.9.0 pypi_0 pypi
pillow 9.3.0 pypi_0 pypi
pip 23.3.1 pyhd8ed1ab_0 conda-forge
pluggy 1.3.0 pypi_0 pypi
pytest 7.4.3 pypi_0 pypi
pytest-xdist 3.3.1 pypi_0 pypi
python 3.10.0 h543edf9_3_cpython conda-forge
python-dateutil 2.8.2 pypi_0 pypi
pytz 2023.3.post1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
readline 8.2 h8228510_1 conda-forge
regex 2023.10.3 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
rhash 1.4.4 hd590300_0 conda-forge
ruamel-yaml 0.18.5 pypi_0 pypi
ruamel-yaml-clib 0.2.8 pypi_0 pypi
safetensors 0.4.0 pypi_0 pypi
setuptools 68.2.2 pyhd8ed1ab_0 conda-forge
six 1.16.0 pypi_0 pypi
sortedcontainers 2.4.0 pypi_0 pypi
sqlite 3.44.0 h2c6b66d_0 conda-forge
sympy 1.11.1 pypi_0 pypi
tabulate 0.9.0 pypi_0 pypi
timm 0.6.13 pypi_0 pypi
tk 8.6.13 h2797004_0 conda-forge
tokenizers 0.14.1 pypi_0 pypi
tomli 2.0.1 pypi_0 pypi
torch 2.2.0.dev20231010+cpu pypi_0 pypi
torchaudio 2.2.0.dev20231010+cpu pypi_0 pypi
torchsr 1.0.4 pypi_0 pypi
torchvision 0.17.0.dev20231010+cpu pypi_0 pypi
tqdm 4.66.1 pypi_0 pypi
transformers 4.34.0 pypi_0 pypi
typing-extensions 4.8.0 pypi_0 pypi
tzdata 2023.3 pypi_0 pypi
urllib3 2.0.7 pypi_0 pypi
wheel 0.41.3 pyhd8ed1ab_0 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
zstd 1.5.5.1 pypi_0 pypi
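
The error usually means that clang cannot find an executable named ld.lld when it expands -fuse-ld=lld. The environment above does include lld 17.0.4 from conda-forge, so one hedged diagnostic (an assumption about the cause, not an official fix) is to check whether the conda env's bin directory is on the PATH seen by the build:

import shutil

# Hedged diagnostic sketch: clang resolves -fuse-ld=lld by looking for an
# "ld.lld" (or "lld") executable on its program path / PATH. If which()
# returns None here, the conda-forge lld binary is not visible to the shell
# running buck2, which would explain the "invalid linker name" error.
for name in ("ld.lld", "lld", "clang++"):
    print(f"{name}: {shutil.which(name)}")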

Executorch OSS tests will be failing until the next pytorch nightly

I'm working on landing these two pytorch diffs: pytorch/pytorch#109160 and pytorch/pytorch#109382 which will include the following changes to executorch: #314 and #359.

These are tested and green in the internal CI (D49218534 and D49207492), but because the executorch OSS pin is on a pytorch nightly, and no existing pytorch nightly build contains these changes, executorch OSS tests will be failing until the next pytorch nightly is built and we update executorch's pin to that nightly.

cc @mergennachin

CoreML backend delegate example does not work

I have been trying to run the example code snippet shown for the CoreML delegate here. I have set up executorch and its CoreML backend as described here, followed by here. But when running the following example:

import executorch.exir as exir
import torch

from executorch.exir.backend.backend_api import to_backend

from executorch.backends.coreml.compiler import CoreMLBackend

class LowerableSubModel(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        return torch.sin(x)

# Convert the lowerable module to Edge IR Representation
to_be_lowered = LowerableSubModel()
example_input = (torch.ones(1), )
to_be_lowered_exir_submodule = exir.capture(to_be_lowered, example_input).to_edge()

# Lower to Core ML backend
lowered_module = to_backend('CoreMLBackend', to_be_lowered_exir_submodule, [])

I get a module not found error:

Traceback (most recent call last):
  File "/Users/abhiroop/Developer/torch_to_coreml/executorch_example.py", line 6, in <module>
    from executorch.backends.coreml.compiler import CoreMLBackend
ModuleNotFoundError: No module named 'executorch.backends.coreml'

Here is the full list of commands I ran to set up Executorch and its CoreML backend:

git clone --branch v0.1.0 https://github.com/pytorch/executorch.git

cd executorch
git submodule sync
git submodule update --init

conda create -yn executorch python=3.10.0
conda activate executorch

conda install cmake

./install_requirements.sh

./build/install_flatc.sh

./backends/apple/coreml/scripts/install_requirements.sh 

Have I missed a step in the installation procedure? Some help in figuring out what I am missing would be much appreciated!
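
One thing worth checking: the repository tree places this backend at backends/apple/coreml, which suggests the Python module path includes an apple segment. A minimal sketch, assuming the installed package mirrors the source tree (the exact import path is an assumption to verify):

# Assumption: the module path mirrors the source layout backends/apple/coreml,
# so the import may need the "apple" segment rather than backends.coreml.
from executorch.backends.apple.coreml.compiler import CoreMLBackend

# Quick way to inspect which backend packages were actually installed:
import executorch.backends as backends
print(list(backends.__path__))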

Mac M1 CMake build failure: "variable 'i' set but not used" error in third-party/flatbuffers/src/idl_gen_rust.cpp

When building with CMake at 77175d7 on an M1 Mac:

(executorch) executorch % cmake --build cmake-out -j33

...

[ 17%] Building CXX object third-party/flatbuffers/CMakeFiles/flatc.dir/grpc/src/compiler/swift_generator.cc.o
/Users/@@@@@@@/executorch/third-party/flatbuffers/src/idl_gen_rust.cpp:397:12: error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]
    size_t i = 0;
           ^
[ 17%] Linking CXX static library libgflags_nothreads.a
[ 17%] Built target gflags_nothreads_static
1 error generated.
make[2]: *** [third-party/flatbuffers/CMakeFiles/flatc.dir/src/idl_gen_rust.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [third-party/flatbuffers/CMakeFiles/flatc.dir/all] Error 2
make: *** [all] Error 2

Compiler info

% c++ --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.6.0
Thread model: posix

Reported by @mergennachin

Build error with ExecuTorch on ARM Cortex-M55 + Ethos-U55

Hi,

I followed the ExecuTorch on ARM Cortex-M55 + Ethos-U55 example workflow and ran ./run.sh [same-optional-scratch-dir-as-before].

But I hit an error like this:

/tmp/ccFX9kwp.s: Assembler messages:
/tmp/ccFX9kwp.s:87: Error: syntax error -- `vcvtne.f64.f32 d0,s0'
make[2]: *** [kernels/portable/CMakeFiles/portable_kernels.dir/build.make:1322: kernels/portable/CMakeFiles/portable_kernels.dir/cpu/op_remainder.cpp.obj] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:247: kernels/portable/CMakeFiles/portable_kernels.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

Hi @cccclai,
It seems that building portable_kernels.dir/cpu/op_remainder.cpp fails while executing the following cmake command in ./run.sh:

cmake                                                 \
        -DBUCK2=${buck2}                                  \
        -DEXECUTORCH_BUILD_XNNPACK=OFF                    \
        -DEXECUTORCH_BUILD_GFLAGS=OFF                     \
        -DEXECUTORCH_BUILD_EXECUTOR_RUNNER=OFF            \
        -DEXECUTORCH_BUILD_HOST_TARGETS=OFF               \
        -DEXECUTORCH_BUILD_SDK=OFF                        \
        -DEXECUTORCH_BUILD_ARM_BAREMETAL=ON               \
        -DCMAKE_BUILD_TYPE=Release                        \
        -DEXECUTORCH_ENABLE_LOGGING=ON                    \
        -DEXECUTORCH_SELECT_OPS_LIST="aten::_softmax.out" \
        -DFLATC_EXECUTABLE="$(which flatc)"               \
        -DCMAKE_TOOLCHAIN_FILE="${toolchain_cmake}"       \
        "${et_root_dir}"

How can I fix it?


OS: Linux Ubuntu 20.04.4 LTS x86_64
buck2 version: 2023-07-18 (buck2-x86_64-unknown-linux-musl.zst)
cmake version 3.27.7
python version: 3.10.13
executorch: tag 0.1.0

Thanks,
Kris

Improve memory usage in EXIR emitter

  • Parameters are embedded in the JSON file, and the JSON string ends up ~5x larger than the raw parameter data. One option is to move parameters into segments, in [internal T165017481]; a toy illustration of the inflation follows this list.
  • Redundant copies across different artifacts: even before the emitter runs, memory usage is more than 2x the model size.
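
A toy illustration of the first bullet (a sketch with illustrative numbers, not emitter code): serializing float32 data as JSON text inflates each 4-byte element to roughly 20 characters of decimal text, which is where a ~5x blow-up comes from.

import json
import torch

# Toy illustration: a float32 tensor costs 4 bytes per element in raw binary,
# but ~18-20 characters per element once rendered as decimal text in JSON.
t = torch.randn(1024)
raw = t.numpy().tobytes()
as_json = json.dumps(t.tolist())
print(len(raw), len(as_json))  # e.g. 4096 vs ~20000 -- roughly 5x larger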

RuntimeError: Missing out variants: {'aten::alias'}

I was able to get exir.capture to trace my model (I think). However, the code now fails with the error listed below. Could you please take a look and let me know what you think I am doing wrong and what I should do next?
Thanks

<executorch.exir.program._program.ExirExportedProgram object at 0x7f59c4f14f40>
  0%|                                                                                                       | 0/25 [00:48<?, ?it/s]
Traceback (most recent call last):
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/fx/passes/infra/pass_manager.py", line 270, in __call__
    res = fn(module)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/fx/passes/infra/pass_base.py", line 41, in __call__
    self.ensures(graph_module)
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/passes/__init__.py", line 311, in ensures
    raise RuntimeError(f"Missing out variants: {self.missing_out_vars}")
RuntimeError: Missing out variants: {'aten::alias'}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/train.py", line 318, in <module>
    open("tfmodel.pte", "wb").write(exir.capture(m, (enc_input, dec_input, dec_source_mask, dec_target_mask))
  File "/home/adonnini1/Development/ContextQSourceCode/NeuralNetworks/trajectory-prediction-transformers-master/executorch/exir/program/_program.py", line 181, in to_executorch
    new_prog = ep._transform(*edge_to_executorch_passes(config))
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/export/exported_program.py", line 569, in _transform
    res = pm(self.graph_module)
  File "/home/adonnini1/anaconda3/lib/python3.9/site-packages/torch/fx/passes/infra/pass_manager.py", line 296, in __call__
    raise Exception(msg) from e
Exception: An error occurred when running the 'ToOutVarPass' pass after the following passes: ['SpecPropPass', 'EdgeToBackendOpsPass', 'RemoveAssertAsyncPass', 'HintBasedSymShapeEvalPass']
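
For context, to_executorch() runs a ToOutVarPass that rewrites each operator to its out= variant, and the failure above means no out variant is registered for aten::alias. A minimal sketch of the flow being attempted, using a placeholder module (the module and file names below are illustrative, and whether a given model avoids aten::alias is model-dependent):

import torch
import executorch.exir as exir

# Placeholder module standing in for the reporter's transformer model.
class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

m = M()
example_inputs = (torch.ones(2),)

# to_executorch() runs ToOutVarPass; it raises "Missing out variants" when
# any op left in the edge graph (here aten::alias) has no out= counterpart.
prog = exir.capture(m, example_inputs).to_edge().to_executorch()
open("model.pte", "wb").write(prog.buffer)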

GPU backends

Hello,

It would be great to have a way to utilize GPUs with executorch. I can see an unfinished Vulkan backend. Are there plans to enable Vulkan, OpenGL, or OpenCL?
