llvm / torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
License: Other
Here is a simple python script that reproduces the issue.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_mlir
torch_mlir.debug_trace_to_stderr()
N = 3
Cin = 16
Cout = 4
w = 10
h = 10
class Net(nn.Module):
    def __init__(self, Cin, Cout):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(Cin, Cout, (3,3))
    def forward(self, x):
        x = self.conv1(x)
        output = F.log_softmax(x, dim=1)
        return output
model = Net(Cin, Cout)
inputs = torch.ones((N,Cin,h,w))
criterion = torch.nn.NLLLoss()
target = torch.empty(N, 8, 8, dtype=torch.long).random_(0, Cout)
optimizer = torch.optim.Adadelta(model.parameters(), lr=1e-3)
mb = torch_mlir.ModuleBuilder()
with mb.capture_function("adadelta_test", [inputs, target]) as f:
    optimizer.zero_grad()
    loss = criterion(model(inputs), target)
    loss.backward()
    optimizer.step()
    f.returns([loss])
mb.module.operation.print(large_elements_limit=2)
When I run this, I get the following output.
TORCH_MLIR TRACE: Convolution (unboxed) dispatch: aten::convolution(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::_log_softmax(Tensor self, int dim, bool half_to_float) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::nll_loss2d_forward(Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index) -> (Tensor output, Tensor total_weight)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::nll_loss2d_backward(Tensor grad_output, Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index, Tensor total_weight) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::_log_softmax_backward_data(Tensor grad_output, Tensor output, int dim, Tensor self) -> (Tensor)
TORCH_MLIR TRACE: mkldnn_convolution_backward dispatch: aten::mkldnn_convolution_backward(Tensor self, Tensor grad_output, Tensor weight, int[] padding, int[] stride, int[] dilation, int groups, bool[3] output_mask) -> (Tensor, Tensor, Tensor)
TORCH_MLIR TRACE: copy_ dispatch: aten::copy_(Tensor(a!) self, Tensor src, bool non_blocking=False) -> (Tensor(a!))
TORCH_MLIR TRACE: copy_ dispatch: aten::copy_(Tensor(a!) self, Tensor src, bool non_blocking=False) -> (Tensor(a!))
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::zero_(Tensor(a!) self) -> (Tensor(a!))
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::zero_(Tensor(a!) self) -> (Tensor(a!))
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::mul_.Tensor(Tensor(a!) self, Tensor other) -> (Tensor(a!))
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::addcmul.out(Tensor self, Tensor tensor1, Tensor tensor2, *, Scalar value=1, Tensor(a!) out) -> (Tensor(a!))
Traceback (most recent call last):
File "models/conv2d.py", line 34, in <module>
optimizer.step()
File "/pytorch_nightly/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/pytorch_nightly/lib/python3.8/site-packages/torch/optim/adadelta.py", line 74, in step
square_avg.mul_(rho).addcmul_(grad, grad, value=1 - rho)
RuntimeError: isTensor() INTERNAL ASSERT FAILED at "/pytorch/aten/src/ATen/core/ivalue_inl.h":130, please report a bug to PyTorch. Expected Tensor but got Double
Any ideas on what could be causing this?
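For context, the failing line updates Adadelta's running average of squared gradients. The sketch below reproduces that arithmetic in plain Python, assuming the documented semantics of addcmul_ (self + value * tensor1 * tensor2). The helper name update_square_avg is hypothetical and lists stand in for tensors. Note that value is a plain float, i.e. a Scalar rather than a Tensor, which may be related to the "Expected Tensor but got Double" message.

```python
def update_square_avg(square_avg, grad, rho):
    """Element-wise: square_avg = rho * square_avg + (1 - rho) * grad * grad."""
    value = 1 - rho  # a Python float (Scalar), not a Tensor
    return [rho * s + value * g * g for s, g in zip(square_avg, grad)]

# Lists stand in for tensors; this mirrors
# square_avg.mul_(rho).addcmul_(grad, grad, value=1 - rho)
result = update_square_avg([0.0, 1.0], [2.0, 3.0], rho=0.5)
print(result)
```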
In general the doc for these ops is a bit terse right now, can you revisit these? Possibly linking to an official documentation page would be fine as well I guess.
Originally posted by @joker-eph in #16
Aim: Trace the computation graph of one training iteration from PyTorch to MLIR using ModuleBuilder.
Models tried: a simple conv MNIST model, DLRM, and a very simple 2-layer fully connected network.
Current status: The forward pass works great! For the backward pass, however, it uses the loss tensors generated during the trace.
Way to spot:
We can see that the operand assigned to the computed loss is only used at its assignment and in the function's return.
Sample example:
Python script + mlir file generated is stored:
simple_example.zip
Thanks yall :)
Hello guys,
I have been reading the code for the graph lowering path. Basically, the path is as follows:
TorchScript -> translate into the Torch MLIR dialect -> lower into Linalg (computation ops) / Std (basic ops) / SCF (loops/control flow) -> call the IREE backend
Could someone give me the basic rationale for this path? Why lower directly to Linalg instead of to HLO, given that HLO has optimization paths that could be reused, such as operation fusion?
Thanks in advance,
Yang
ATen Passes are built using the old explicit registration patterns. They should get updated to be consistent with other NPCOMP passes.
Originally posted by @stellaraccident in #16
easy_emb.zip
Hey everybody,
I am trying to trace a network with an embedding bag in it, but I found a bug during the backward pass (aside from the caching of tensors). When computing the gradient, it does index_add -> cumsum_ -> resize and then index_select, but I think it is missing a step that reduces the value after cumsum_ by 1, because the index_select goes past the end of the vector it is accessing by 1. I have attached a unit test.
To generate MLIR:
cd easy_emb
python emb.py
vim/(your favourite editor) embedding.mlir
Alternatively, open the pre-generated file inside the zip and look at lines 59 to 66 in particular.
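The suspected off-by-one can be illustrated without PyTorch. This is only a sketch, under the assumption that the backward pass derives per-bag end positions from an inclusive cumulative sum of bag sizes. The names bag_sizes and last_index_per_bag are hypothetical and are not taken from the generated trace.

```python
from itertools import accumulate

def last_index_per_bag(bag_sizes):
    """Index of the last element of each bag in the flattened index vector."""
    ends = list(accumulate(bag_sizes))  # inclusive cumsum: one past each bag's last element
    return [e - 1 for e in ends]        # subtract 1 to get a valid index

bag_sizes = [2, 3, 1]
total = sum(bag_sizes)                  # 6 elements, so valid indices are 0..5
ends = list(accumulate(bag_sizes))      # the final entry equals total, which is out of range
print(ends, last_index_per_bag(bag_sizes))
```

Using the raw cumulative sums as indices overshoots the vector by exactly one at the last bag, which is the symptom described above.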
https://buildkite.com/iree/mlir-npcomp-standalone/builds/27#6fba062f-0357-4195-9d16-4daa9f651d75
INTERFACE_LIBRARY targets may only have whitelisted properties. The
property "LINK_LIBRARIES" is not allowed.
Call Stack (most recent call first):
/work/install/llvm-project/mlir-generic-rtti/lib/cmake/mlir/AddMLIR.cmake:213 (mlir_check_link_libraries)
lib/Python/CMakeLists.txt:54 (mlir_check_all_link_libraries)
Some 3.7+ language features snuck in and should be removed.
File "/work/.mmrepo/universe/github.com/llvm/mlir-npcomp.git/test/Python/Compiler/comparisons.py", line 3, in
from npcomp.compiler import test_config
File "/work/build/npcomp_default/python/npcomp/init.py", line 5, in
from . import tracing
File "/work/build/npcomp_default/python/npcomp/tracing/init.py", line 3, in
from .mlir_trace import *
File "/work/build/npcomp_default/python/npcomp/tracing/mlir_trace.py", line 15, in
from npcomp.tracing.emitters import *
File "/work/build/npcomp_default/python/npcomp/tracing/emitters.py", line 22, in
defaults=(TraceValueType.NDARRAY,))):
TypeError: namedtuple() got an unexpected keyword argument 'defaults'
The CI runs with python 3.6.
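One way to keep default field values while staying compatible with Python 3.6 is to set __new__.__defaults__ on the generated class instead of passing defaults= to namedtuple(). The sketch below uses hypothetical names (TraceValue, its two fields, and a stand-in TraceValueType enum); the actual definitions in emitters.py may differ.

```python
from collections import namedtuple
from enum import Enum

class TraceValueType(Enum):  # stand-in for the real enum (assumption)
    NDARRAY = 1

# Python 3.7+ only: namedtuple("TraceValue", ["value", "type"], defaults=(...,))
TraceValue = namedtuple("TraceValue", ["value", "type"])
# Also works on 3.6: the defaults apply to the rightmost fields.
TraceValue.__new__.__defaults__ = (TraceValueType.NDARRAY,)

tv = TraceValue("x")  # "type" falls back to the default
print(tv)
```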
do we have tests for this pass? I didn't see return-elimination in any tests.
Originally posted by @silvasean in #16
I suspect that the AcapDispatch code for materializing a const bool tensor may have some issues, but we lack the facilities to exercise it properly. Add a test specifically for this at the appropriate point.
We generally prefer the "let results =" style vs Results inheritance, especially since you use the let-form for arguments.
Since the form you have it in seems consistent through the file, let's not change now. We can do a cleanup to the let form in a followup if desired.
Originally posted by @stellaraccident in #16
I have successfully built LLVM, MLIR, and the PyTorch frontend, and generated the _mlir and _torch_mlir *.so files. I have also added them to PYTHONPATH. But when I test, I get an error:
Import Error
import npcomp.frontends.pytorch as torch_mlir
File "mlir-npcomp/build/python/npcomp/frontends/pytorch/__init__.py", line 8, in <module>
from _torch_mlir import _get_mlir
ImportError: cannot import name '_get_mlir' from '_torch_mlir' (mlir-npcomp/build/python/_torch_mlir.cpython-37m-x86_64-linux-gnu.so)
The _torch_mlir module can be imported successfully in Python. I have tried help(_torch_mlir) and type(_torch_mlir), but the printed output does not contain a _get_mlir function.
code
import torch
import npcomp.frontends.pytorch as torch_mlir
dev = torch_mlir.mlir_device()
t0 = torch.randn((4,4), device=dev)
t1 = torch.randn((4,4)).to(dev)
t2 = t0 + t1
t2_mlir = torch_mlir.get_mlir( t2 )
t2_cpu = t2.to('cpu')
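A quick way to see which file Python actually loaded, and whether it exports a given name, is sketched below. The helper missing_names is hypothetical, and it is demonstrated on the standard math module because _torch_mlir only exists in a local build. If the reported path pointed at a stale .so from an earlier build, that could explain a missing _get_mlir, though this is only a guess.

```python
import importlib

def missing_names(module_name, names):
    """Return (path, names not exported) for an importable module."""
    mod = importlib.import_module(module_name)
    path = getattr(mod, "__file__", None)  # built-in modules may have no file
    return path, [n for n in names if not hasattr(mod, n)]

# With a local build one would call: missing_names("_torch_mlir", ["_get_mlir"])
path, missing = missing_names("math", ["sqrt", "_get_mlir"])
print(path, missing)
```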
Hello!
I am trying to go through the build instructions in the README. I built the PyTorch frontend in a docker container following the instructions and then installed IREE via pip3 install. But when I try to run the e2e test targeting the IREE backend, I get the following error:
root@0e8aafd709c6:/src/mlir-npcomp# python frontends/pytorch/e2e_testing/torchscript/main.py --config=iree
Traceback (most recent call last):
File "frontends/pytorch/e2e_testing/torchscript/main.py", line 9, in <module>
from torch_mlir.torchscript.e2e_test.framework import run_tests
ModuleNotFoundError: No module named 'torch_mlir'
Could anyone help me? I think I didn't miss any instructions in the README...
This should be generalized.
Running into an issue
silvasean@silvasean0:~/pg/mlir-npcomp/mlir-npcomp$ source .env; python frontends/pytorch/test/graph_export/test_script_add3.py
Traceback (most recent call last):
File "frontends/pytorch/test/graph_export/test_script_add3.py", line 21, in <module>
def add3(t0, t1, t2):
TypeError: import_function(): incompatible function arguments. The following argument types are supported:
1. (self: _torch_mlir.ModuleBuilder, arg0: torch::jit::StrongFunctionPtr) -> torch::jit::StrongFunctionPtr
Invoked with: <_torch_mlir.ModuleBuilder object at 0x7f81c3e8d0b0>, <torch.jit.ScriptFunction object at 0x7f81a8cb36d0>
/usr/local/google/home/silvasean/.local/lib/python3.8/site-packages/pybind11/include/pybind11/pybind11.h
/usr/local/google/home/silvasean/.local/lib/python3.8/site-packages/torch/include/pybind11/pybind11.h
I created a simple example as a stepping stone towards the backward pass. It builds on the conv2d forward pass and just adds the negative log likelihood loss.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_mlir
N = 3
Cin = 16
Cout = 4
w = 10
h = 10
class Net(nn.Module):
    def __init__(self, Cin, Cout):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(Cin, Cout, (3,3))
    def forward(self, x):
        x = self.conv1(x)
        output = F.log_softmax(x, dim=1)
        return output
model = Net(Cin, Cout)
inputs = torch.ones((N,Cin,h,w))
loss = torch.nn.NLLLoss()
target = torch.empty(N, 8, 8, dtype=torch.long).random_(0, Cout)
mb = torch_mlir.ModuleBuilder()
with mb.capture_function("resa", [inputs]) as f:
    #f.returns([model(inputs)]) # This works
    f.returns([loss(model(inputs), target)]) # This does not work
mb.module.operation.print(large_elements_limit=2)
When I try to run this on 30adf9e, I get the following error.
Traceback (most recent call last):
File "models/conv2d.py", line 29, in <module>
f.returns([loss(model(inputs), target)]) # This does not work
File "/pytorch_nightly/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
result = self.forward(*input, **kwargs)
File "/pytorch_nightly/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 213, in forward
return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
File "/pytorch_nightly/lib/python3.8/site-packages/torch/nn/functional.py", line 2237, in nll_loss
ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: unsupported PyTorch scalar type: UNKNOWN_SCALAR
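As a sanity check on the shapes used in this example: a 3x3 convolution with stride 1 and no padding shrinks each 10x10 input to 8x8, which is why target has shape (N, 8, 8). The sketch below applies the standard convolution output-size formula; conv_out_size is a hypothetical helper and not part of torch_mlir.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

h = w = 10
out_h, out_w = conv_out_size(h, 3), conv_out_size(w, 3)
print(out_h, out_w)
```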
The instructions in the README and docker image are now up to date. It would be nice to get the CI going for it. I'm not entirely certain how to adapt the LLVM install caching to building within a container.
We might want to wait too until we get closer to PyTorch head: I suspect we'll be successful then at just installing an appropriate version and building against it (and can forgo the container in the CI).
TODOs and cleaned up comments with explanation of what isn't supported yet?
Probably we want a Type Interface that supports mangling.
Originally posted by @stellaraccident in https://github.com/_render_node/MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NDY1MzAzMTgx/pull_request_reviews/more_threads
Dumping here the repro steps that got ASan working on an Ubuntu 20.04 system. The default way that LLVM handles this does not seem to play well with shared libraries.
unset CC
unset CXX
export CC
export CXX
./build_tools/install_mlir.sh -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DLLVM_USE_SANITIZER=Address
./build_tools/cmake_configure.sh -DCMAKE_CXX_COMPILER_LAUNCHER=ccache -DLLVM_USE_SANITIZER=Address
cd build
ninja
export LSAN_OPTIONS=detect_leaks=0
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libasan.so.5
ninja check
ninja check-frontends-pytorch
The issue with naively attempting it was as noted here: google/sanitizers#796 (comment)
I suspect that when building with clang, there should be a -shared-libsan on appropriate link lines.
RuntimeError: false INTERNAL ASSERT FAILED at "/pytorch/aten/src/ATen/core/boxing/impl/boxing.h":48, please report a bug to PyTorch. Tried to call KernelFunction::call() for a kernel that only has a boxed kernel and doesn't support calling from an unboxed API yet.
When hooking the dispatcher, not all kernels are supported with registration via a boxed fallback kernel. Need to special case/skip many of these. Moving out of the capture closure usually gets things moving for now.
Prefer to use an OperandAdaptor
Originally posted by @stellaraccident in https://github.com/_render_node/MDE3OlB1bGxSZXF1ZXN0UmV2aWV3NDY1MzAzMTgx/pull_request_reviews/more_threads
We have an in-npcomp type range that we should branch off of, but since these are being refactored away upstream, let's not bother updating this.
Originally posted by @stellaraccident in #16
On llvm/circt#767, they were experiencing some of the same issues we have intermittently observed with respect to the effects of TypeID multiple-definition issues, resulting in equality of types being dependent on which shared library does the check.
It is my belief that the way that libLLVM.so/libMLIR.so/libNPCOMP.so are "over-linked" (and order inverted on the link command line) creates the conditions for this kind of issue to surface (although I have never actually managed to nail it down to a specific smoking gun; it is more of a "that is clearly not the right way to do it and would result in this kind of issue easily" kind of judgment).
I suggested that @mikeurbach try to use BUILD_SHARED_LIBS mode because it gets the shared-library layering correct, and this resolved the mismatch for them. I suggest we switch npcomp to the same regime and remove support for linking against libMLIR.so. I am very slowly trying to complete https://reviews.llvm.org/D94387, which should fix the situation for the aggregate dylib linking modes, which would be nice for an eventual production release. In that new world, the dylib building mode is a specialization of BUILD_SHARED_LIBS, so we would need to switch locally regardless.
Failed Tests (9):
FRONTENDS_PYTORCH :: test_export_ResA.py
FRONTENDS_PYTORCH :: test_export_add3.py
FRONTENDS_PYTORCH :: test_export_batchnorm.py
FRONTENDS_PYTORCH :: test_export_conv2d_back.py
FRONTENDS_PYTORCH :: test_export_multi_out.py
FRONTENDS_PYTORCH :: test_export_resnet18.py
FRONTENDS_PYTORCH :: test_export_vgg11.py
FRONTENDS_PYTORCH :: test_op_report_conv2d.py
FRONTENDS_PYTORCH :: test_op_report_vgg_style_lenet.py
Sample of errors:
error: 'aten.relu_' op operand #0 must be tensor of any type values, but got 'memref<32x64x32x32xf32>'
aten to loops conversion failed error: 'std.call' op 'native_batch_norm_4F32_1F32_1F32_4F32_1F32_1F32_1F32_1F32_out' does not reference a valid function
error: unsupported or non-LLVM operation: aten.constant
JIT session error: Symbols not found: [ _mlir_ciface_as_strided_1F32_4F32_out ]
error: 'aten.div' op operand #0 must be tensor of any type values, but got 'f32'
Hi,
I am trying to run the example scripts under frontends/pytorch/examples but I got two kinds of errors:
When I run scripts using capture_function, I got:
The same error occurs when I run cos_e2e.py, div_inplace_e2e.py, mm_e2e.py, mul_maximum_e2e.py, and tanh_out_e2e.py. Could someone tell me how to generate these op definitions and where to put them?
I am still learning the code, so this might be a trivial question, or maybe I just missed some steps?
I went through all the build steps in the README and successfully ran tools/torchscript_e2e_test.sh. I installed the IREE backend using pip.
Following the work done in the rest of the org. This project is small enough that I don't think it requires special coordination. I'll make the change sometime in the next couple of days.
There are two problems I ran into today after fetching the latest code:
cmake --build /build/npcomp --target check-npcomp check-frontends-pytorch
in the docker environment following README.md in the top directory, I got:
/build such as llvm-install, llvm-build, npcomp. Now there is only /build/npcomp and the directory is almost empty except for a .env file and a few newly created directories.
I noticed some changes in the build process in #251 and #258. Could someone update the README file to point to the correct build output directory?
tools/torchscript_e2e_test.sh --config=iree
This command reports that it cannot find the corresponding test package:
Thanks!
Unfortunately, we don't have the knobs exposed yet through the Python API to do abbreviated printouts.
When using a backend dispatch key (i.e. PrivateUse3), aten::conv2d calls are never recorded; however, when using AutogradPrivateUse3, they are (but this has other problems). conv is special in a number of ways, and we need to check with the PyTorch devs regarding how to resolve this.
The MLIR generated for torch.cos via ModuleBuilder ignores %arg0. Instead, it returns a constant with the result of torch.kernel_call "aten::mm" %arg0 for the %arg0 used during capture_function.
This can be reproduced with the code below or via python3 frontends/pytorch/examples/cos_e2e.py after #134 is submitted.
import torch
import torch_mlir
torch.manual_seed(0)
input = torch.rand(2, 3)
mb = torch_mlir.ModuleBuilder()
with mb.capture_function("cos", [input]) as f:
    result = torch.cos(input)
    f.returns([result])
print(mb.module)
module {
func @cos(%arg0: !numpy.ndarray<[2,3]:f32>) -> !numpy.ndarray<[2,3]:f32> {
%cst = constant dense<[[0.879371106, 0.719147384, 0.996088445], [0.991296648, 0.953116595, 0.805617868]]> : tensor<2x3xf32>
%0 = numpy.create_array_from_tensor %cst : (tensor<2x3xf32>) -> !numpy.ndarray<[2,3]:f32>
return %0 : !numpy.ndarray<[2,3]:f32>
}
}
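A text-level check for this symptom is to look for function arguments that never appear in the function body. The sketch below is a rough regex heuristic rather than an MLIR parser. It assumes the text contains a single func, unused_args is a hypothetical helper, and the sample is an abbreviated version of the module above (the dense constant is shortened).

```python
import re

def unused_args(mlir_text):
    """Return %argN names declared in the func signature but never used in its body."""
    start = mlir_text.index("func")                 # skip an enclosing "module {"
    header, _, body = mlir_text[start:].partition("{")
    declared = re.findall(r"%arg\d+", header)
    used = set(re.findall(r"%arg\d+", body))
    return [a for a in declared if a not in used]

sample = """module {
func @cos(%arg0: !numpy.ndarray<[2,3]:f32>) -> !numpy.ndarray<[2,3]:f32> {
  %cst = constant dense<0.5> : tensor<2x3xf32>
  %0 = numpy.create_array_from_tensor %cst : (tensor<2x3xf32>) -> !numpy.ndarray<[2,3]:f32>
  return %0 : !numpy.ndarray<[2,3]:f32>
}
}"""
print(unused_args(sample))
```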
Hello,
I replaced the refjit backend with the IREE backend in torchscript_resnet18_e2e.py:
backend = iree.IreeNpcompBackend()
#backend = refjit.RefjitNpcompBackend()
I know this probably won't run successfully, but it was worth a try. I got this error:
I also ran the iree-translate command alone (I copied the command from the last line of the screenshot above, dumped the input file using mb.module.operation.get_asm and f.write, and then passed that file as iree-translate's parameter):
My questions are:
Thanks in advance to anyone who helps me with these questions!
First of all, awesome project!
It would be interesting to see the results of the NPComp infrastructure vs. other python compilers, such as Numba, on scientific python apps. NPBench has a wide variety of HPC and computational science apps written in numpy. It'd be great if you had an implementation/results there!
I have followed the instructions:
./build_tools/cmake_configure.sh
is fine. But then both give me:
ninja: error: loading 'build.ninja': No such file or directory
Do you have a recommendation for how this ninja error can be fixed?
Here is a simple example showing the problem. The python code is shown below.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_mlir
torch_mlir.debug_trace_to_stderr()
N = 3
Cin = 16
Cout = 4
w = 10
h = 10
class Net(nn.Module):
    def __init__(self, Cin, Cout):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(Cin, Cout, (3,3))
    def forward(self, x):
        x0 = self.conv1(x)
        x1 = self.conv1(x)
        z = torch.cat([x0, x1])
        output = F.log_softmax(z, dim=1)
        return output
model = Net(Cin, Cout)
inputs = torch.ones((N,Cin,h,w))
weight = torch.randn(Cout)
loss = torch.nn.NLLLoss()
target = torch.empty(2*N, 8, 8, dtype=torch.long).random_(0, Cout)
mb = torch_mlir.ModuleBuilder()
with mb.capture_function("cat_test", [inputs, target]) as f:
    result = loss(model(inputs), target)
    f.returns([result])
mb.module.operation.print(large_elements_limit=2)
This results in the following output.
TORCH_MLIR TRACE: Convolution (unboxed) dispatch: aten::convolution(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups) -> (Tensor)
TORCH_MLIR TRACE: Convolution (unboxed) dispatch: aten::convolution(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::_cat(Tensor[] tensors, int dim=0) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::_log_softmax(Tensor self, int dim, bool half_to_float) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::nll_loss2d_forward(Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index) -> (Tensor output, Tensor total_weight)
module {
func @resa(%arg0: !numpy.ndarray<[3,16,10,10]:f32>, %arg1: !numpy.ndarray<[6,8,8]:i64>) -> !numpy.ndarray<[]:f32> {
%cst = constant opaque<"", "0xDEADBEEF"> : tensor<4x16x3x3xf32>
%cst_0 = constant opaque<"", "0xDEADBEEF"> : tensor<4xf32>
%c1_i64 = constant 1 : i64
%c1_i64_1 = constant 1 : i64
%0 = basicpy.build_list %c1_i64, %c1_i64_1 : (i64, i64) -> !basicpy.ListType
%c0_i64 = constant 0 : i64
%c0_i64_2 = constant 0 : i64
%1 = basicpy.build_list %c0_i64, %c0_i64_2 : (i64, i64) -> !basicpy.ListType
%c1_i64_3 = constant 1 : i64
%c1_i64_4 = constant 1 : i64
%2 = basicpy.build_list %c1_i64_3, %c1_i64_4 : (i64, i64) -> !basicpy.ListType
%false = constant false
%c0_i64_5 = constant 0 : i64
%c0_i64_6 = constant 0 : i64
%3 = basicpy.build_list %c0_i64_5, %c0_i64_6 : (i64, i64) -> !basicpy.ListType
%c1_i64_7 = constant 1 : i64
%c1_i64_8 = constant 1 : i64
%c1_i64_9 = constant 1 : i64
%4 = basicpy.build_list %c1_i64_8, %c1_i64_9 : (i64, i64) -> !basicpy.ListType
%c0_i64_10 = constant 0 : i64
%c0_i64_11 = constant 0 : i64
%5 = basicpy.build_list %c0_i64_10, %c0_i64_11 : (i64, i64) -> !basicpy.ListType
%c1_i64_12 = constant 1 : i64
%c1_i64_13 = constant 1 : i64
%6 = basicpy.build_list %c1_i64_12, %c1_i64_13 : (i64, i64) -> !basicpy.ListType
%false_14 = constant false
%c0_i64_15 = constant 0 : i64
%c0_i64_16 = constant 0 : i64
%7 = basicpy.build_list %c0_i64_15, %c0_i64_16 : (i64, i64) -> !basicpy.ListType
%c1_i64_17 = constant 1 : i64
%8 = basicpy.build_list %12, %13 : (!numpy.ndarray<[3,4,8,8]:f32>, !numpy.ndarray<[3,4,8,8]:f32>) -> !basicpy.ListType
%c0_i64_18 = constant 0 : i64
%c1_i64_19 = constant 1 : i64
%false_20 = constant false
%9 = basicpy.singleton : !basicpy.NoneType
%c1_i64_21 = constant 1 : i64
%c-100_i64 = constant -100 : i64
%10 = numpy.create_array_from_tensor %cst : (tensor<4x16x3x3xf32>) -> !numpy.ndarray<[4,16,3,3]:f32>
%11 = numpy.create_array_from_tensor %cst_0 : (tensor<4xf32>) -> !numpy.ndarray<[4]:f32>
%12 = torch.kernel_call "aten::convolution" %arg0, %10, %11, %0, %1, %2, %false, %3, %c1_i64_7 : (!numpy.ndarray<[3,16,10,10]:f32>, !numpy.ndarray<[4,16,3,3]:f32>, !numpy.ndarray<[4]:f32>, !basicpy.ListType, !basicpy.ListType, !basicpy.ListType, i1, !basicpy.ListType, i64) -> !numpy.ndarray<[3,4,8,8]:f32> {sigArgTypes = ["Tensor", "Tensor", "Tensor?", "int[]", "int[]", "int[]", "bool", "int[]", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
%13 = torch.kernel_call "aten::convolution" %arg0, %10, %11, %4, %5, %6, %false_14, %7, %c1_i64_17 : (!numpy.ndarray<[3,16,10,10]:f32>, !numpy.ndarray<[4,16,3,3]:f32>, !numpy.ndarray<[4]:f32>, !basicpy.ListType, !basicpy.ListType, !basicpy.ListType, i1, !basicpy.ListType, i64) -> !numpy.ndarray<[3,4,8,8]:f32> {sigArgTypes = ["Tensor", "Tensor", "Tensor?", "int[]", "int[]", "int[]", "bool", "int[]", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
%14 = torch.kernel_call "aten::_cat" %8, %c0_i64_18 : (!basicpy.ListType, i64) -> !numpy.ndarray<[6,4,8,8]:f32> {sigArgTypes = ["Tensor[]", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
%15 = torch.kernel_call "aten::_log_softmax" %14, %c1_i64_19, %false_20 : (!numpy.ndarray<[6,4,8,8]:f32>, i64, i1) -> !numpy.ndarray<[6,4,8,8]:f32> {sigArgTypes = ["Tensor", "int", "bool"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
%16:2 = torch.kernel_call "aten::nll_loss2d_forward" %15, %arg1, %9, %c1_i64_21, %c-100_i64 : (!numpy.ndarray<[6,4,8,8]:f32>, !numpy.ndarray<[6,8,8]:i64>, !basicpy.NoneType, i64, i64) -> (!numpy.ndarray<[]:f32>, !numpy.ndarray<[]:f32>) {sigArgTypes = ["Tensor", "Tensor", "Tensor?", "int", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor", "Tensor"]}
return %16#0 : !numpy.ndarray<[]:f32>
}
The problem with this output, however, is here:
%8 = basicpy.build_list %12, %13 : (!numpy.ndarray<[3,4,8,8]:f32>, !numpy.ndarray<[3,4,8,8]:f32>) -> !basicpy.ListType
...
%12 = torch.kernel_call "aten::convolution" %arg0, %10, %11, %0, %1, %2, %false, %3, %c1_i64_7 : (!numpy.ndarray<[3,16,10,10]:f32>, !numpy.ndarray<[4,16,3,3]:f32>, !numpy.ndarray<[4]:f32>, !basicpy.ListType, !basicpy.ListType, !basicpy.ListType, i1, !basicpy.ListType, i64) -> !numpy.ndarray<[3,4,8,8]:f32> {sigArgTypes = ["Tensor", "Tensor", "Tensor?", "int[]", "int[]", "int[]", "bool", "int[]", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
%13 = torch.kernel_call "aten::convolution" %arg0, %10, %11, %4, %5, %6, %false_14, %7, %c1_i64_17 : (!numpy.ndarray<[3,16,10,10]:f32>, !numpy.ndarray<[4,16,3,3]:f32>, !numpy.ndarray<[4]:f32>, !basicpy.ListType, !basicpy.ListType, !basicpy.ListType, i1, !basicpy.ListType, i64) -> !numpy.ndarray<[3,4,8,8]:f32> {sigArgTypes = ["Tensor", "Tensor", "Tensor?", "int[]", "int[]", "int[]", "bool", "int[]", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
because the build_list op is referencing %12 and %13 when they don't exist yet. When I feed this to npcomp-opt, I get
conv2d.mlir:33:10: error: operand #0 does not dominate this use
%8 = basicpy.build_list %12, %13 : (!numpy.ndarray<[3,4,8,8]:f32>, !numpy.ndarray<[3,4,8,8]:f32>) -> !basicpy.ListType
^
conv2d.mlir:33:10: note: see current operation: %8 = "basicpy.build_list"(%12, %13) : (!numpy.ndarray<[3,4,8,8]:f32>, !numpy.ndarray<[3,4,8,8]:f32>) -> !basicpy.ListType
conv2d.mlir:42:11: note: operand defined here
%12 = torch.kernel_call "aten::convolution" %arg0, %10, %11, %0, %1, %2, %false, %3, %c1_i64_7 : (!numpy.ndarray<[3,16,10,10]:f32>, !numpy.ndarray<[4,16,3,3]:f32>, !numpy.ndarray<[4]:f32>, !basicpy.ListType, !basicpy.ListType, !basicpy.ListType, i1, !basicpy.ListType, i64) -> !numpy.ndarray<[3,4,8,8]:f32> {sigArgTypes = ["Tensor", "Tensor", "Tensor?", "int[]", "int[]", "int[]", "bool", "int[]", "int"], sigIsMutable = false, sigIsVararg = false, sigIsVarret = false, sigRetTypes = ["Tensor"]}
Manually moving the definition of %8 after the definition of %12 and %13 fixes the problem. This seems to suggest that when the build_list operator is being constructed it is not being inserted in the right location.
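The ordering problem can also be spotted mechanically before running npcomp-opt. Below is a rough sketch that walks a single-block function line by line and reports SSA values used before they are defined. It is a regex heuristic under simplifying assumptions (one block, one op per line, no nested regions), and use_before_def is a hypothetical helper, not part of npcomp. The sample lines are abbreviated from the output above.

```python
import re

SSA = re.compile(r"%[\w.\-]+")  # SSA names such as %12, %c1_i64, %c-100_i64
RESULTS = re.compile(r"\s*((?:%[\w.\-]+(?::\d+)?\s*,?\s*)+)=")  # "%a, %b:2 ="

def use_before_def(lines, func_args=()):
    """Yield (line_number, value) for operands used before their definition."""
    defined = set(func_args)
    for lineno, line in enumerate(lines, start=1):
        m = RESULTS.match(line)
        results = SSA.findall(m.group(1)) if m else []
        rest = line[m.end():] if m else line
        for operand in SSA.findall(rest):
            if operand not in defined:
                yield lineno, operand
        defined.update(results)  # results become visible to later lines only

sample = [
    '%8 = basicpy.build_list %12, %13 : (...) -> !basicpy.ListType',
    '%12 = torch.kernel_call "aten::convolution" %arg0 : (...) -> (...)',
    '%13 = torch.kernel_call "aten::convolution" %arg0 : (...) -> (...)',
    '%14 = torch.kernel_call "aten::_cat" %8 : (...) -> (...)',
]
problems = list(use_before_def(sample, func_args=["%arg0"]))
print(problems)
```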
We added support in #209 for some limited cases:
Adding test cases / implementing support / verifying that they work for these other cases would be welcome.
Test cases likely would involve calling torch.nn.functional.linear directly (instead of using torch.nn.Linear):
https://pytorch.org/docs/master/generated/torch.nn.functional.linear.html#torch.nn.functional.linear
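When writing such tests, it may help to have a tiny reference for what torch.nn.functional.linear computes, namely y = x W^T + b with weight of shape (out_features, in_features). The pure-Python sketch below uses nested lists instead of tensors; linear_ref is a hypothetical helper intended only for checking small cases by hand.

```python
def linear_ref(x, weight, bias=None):
    """Reference y = x @ weight.T + bias for a 2-D x given as nested lists."""
    out = []
    for row in x:
        # One dot product per output feature (per row of weight).
        y = [sum(a * w for a, w in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y = [v + b for v, b in zip(y, bias)]
        out.append(y)
    return out

x = [[1.0, 2.0]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # out_features=3, in_features=2
y = linear_ref(x, weight, bias=[0.5, 0.5, 0.5])
print(y)
```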
I am trying to build mlir-npcomp and it fails with the following error:
[27/126] Building TCPOps.h.inc...
FAILED: include/npcomp/Dialect/TCP/IR/TCPOps.h.inc
cd .../mlir-npcomp/include/npcomp/Dialect/TCP/IR/TCPOps.td:42:50: error: Variable not defined: 'Shape_ExtentTensorType'
let arguments = (ins AnyRankedTensor:$operand, Shape_ExtentTensorType:$shape);
^
[28/126] Building TCPOps.cpp.inc...
FAILED: include/npcomp/Dialect/TCP/IR/TCPOps.cpp.inc
.../mlir-npcomp/include/npcomp/Dialect/TCP/IR/TCPOps.td:42:50: error: Variable not defined: 'Shape_ExtentTensorType'
let arguments = (ins AnyRankedTensor:$operand, Shape_ExtentTensorType:$shape);
^
[32/126] Building TCPOpsDialect.h.inc...
It also fails further down with aten_ops. I am not sure whether the error above is related to this one.
../frontends/pytorch/lib/aten_ops.cpp:107:19: error: no member named 'addmm' in namespace 'at::native'; did you mean 'addmv'?
at::native::addmm(torch_a, torch_b, torch_c, alpha, beta).clone();
~~~~~~~~~~~~^~~~~
addmv
Hey guys,
I tried updating to this patch, but when I tried tracing with acap_dispatch, it still seems to be using "add" instead of "add_". I have attached a small unit test and the generated output:
add_test.zip
Any help/comments would be appreciated! :)
Below is a simple python script that reproduces the issue.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_mlir
torch_mlir.debug_trace_to_stderr()
N = 3
Cin = 16
Cout = 4
w = 10
h = 10
class Net(nn.Module):
    def __init__(self, Cin, Cout):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(Cin, Cout, (3,3))
    def forward(self, x):
        x = self.conv1(x)
        indices = torch.arange(N)
        x = x[indices, :, :, :]
        output = F.log_softmax(x, dim=1)
        return output
model = Net(Cin, Cout)
inputs = torch.ones((N,Cin,h,w))
loss = torch.nn.NLLLoss()
target = torch.empty(N, 8, 8, dtype=torch.long).random_(0, Cout)
mb = torch_mlir.ModuleBuilder()
with mb.capture_function("arange_test", [inputs, target]) as f:
    result = loss(model(inputs), target)
    result.backward()
    f.returns([result] + [p.grad for p in model.parameters()])
mb.module.operation.print(large_elements_limit=2)
When I try to run this, I get the following output.
TORCH_MLIR TRACE: Convolution (unboxed) dispatch: aten::convolution(Tensor input, Tensor weight, Tensor? bias, int[] stride, int[] padding, int[] dilation, bool transposed, int[] output_padding, int groups) -> (Tensor)
TORCH_MLIR TRACE: Fallback (boxed) dispatch: aten::arange.start_out(Scalar start, Scalar end, Scalar step=1, *, Tensor(a!) out) -> (Tensor(a!))
Traceback (most recent call last):
File "models/conv2d.py", line 32, in <module>
result = loss(model(inputs), target)
File "/pytorch_nightly/lib/python3.8/site-packages/torch/nn/modules/module.py", line 744, in _call_impl
result = self.forward(*input, **kwargs)
File "models/conv2d.py", line 20, in forward
indices = torch.arange(N)
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::arange.start_out. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, PrivateUse2, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
CPU: registered at /pytorch/build/aten/src/ATen/CPUType.cpp:2154 [kernel]
PrivateUse2: registered at /mlir-npcomp/frontends/pytorch/csrc/c10_dispatch/acap_dispatch.cpp:645 [backend fallback]
BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
AutogradCPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
AutogradCUDA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
AutogradXLA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
AutogradPrivateUse1: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
AutogradPrivateUse2: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
AutogradPrivateUse3: registered at /pytorch/torch/csrc/autograd/generated/VariableType_1.cpp:8628 [autograd kernel]
Tracer: registered at /pytorch/torch/csrc/autograd/generated/TraceType_1.cpp:10219 [kernel]
Autocast: fallthrough registered at /pytorch/aten/src/ATen/autocast_mode.cpp:254 [backend fallback]
Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:527 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
The error states that no fallback kernel is defined for this op (arange.start_out). However, I noticed that type_dispatch/aten_mlir_type_default.cpp has a function, RegisterAtenTypeFunctions, that registers arange.start_out:
.op(torch::RegisterOperators::options()
.schema("aten::arange.start_out(Scalar start, Scalar end, "
"Scalar step=1, *, Tensor(a!) out) -> Tensor(a!)")
.impl_unboxedOnlyKernel<at::Tensor &(at::Tensor &, at::Scalar,
at::Scalar, at::Scalar),
&ATenMLIRTypeDefault::arange_out>(
at::TensorTypeId::XLATensorId)
.aliasAnalysis(c10::AliasAnalysisKind::FROM_SCHEMA))
But the error also says that I passed in an empty list of Tensors. Does that mean we need another definition of this op, or is the problem in how optional arguments are handled, since I only passed the end argument in the call above? Or do you think the root cause is something else entirely?
Any chance we could get actual symbolic argument names? Esp for BatchNorm which has a lot of parameters.
Originally posted by @stellaraccident in #16
Importing the PyTorch frontend complains of a missing _get_mlir method in the _torch_mlir.so binary. I am currently on master (81dd571).
# Build script.
LLVM_VERSION=10
export CC=clang-$LLVM_VERSION
export CXX=clang++
export LDFLAGS=-fuse-ld=$(which ld.lld)
sh ./build_tools/install_mlir.sh
sh ./build_tools/cmake_configure.sh
# Build and run tests
cd build
ninja
ninja check-npcomp
# Produce import error.
PYTHONPATH=build/python:build/frontends/pytorch/csrc python
Python 3.8.5 (default, Jul 27 2020, 10:09:03)
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import npcomp.frontends.pytorch
2020-09-09 13:13:57.642115: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "build/python/npcomp/frontends/pytorch/__init__.py", line 8, in <module>
from _torch_mlir import _get_mlir
ImportError: cannot import name '_get_mlir' from '_torch_mlir' (build/frontends/pytorch/csrc/_torch_mlir.so)
I'm currently studying the source code, and I saw the following line in TCFToLinalg.cpp:67:
auto heightPlusTwicePadding = builder.create<SubIOp>(op->getLoc(), height, twicePaddingHeight);
According to the PyTorch conv2d documentation, I think it should be:
auto heightPlusTwicePadding = builder.create<AddIOp>(op->getLoc(), height, twicePaddingHeight);
I've only just started learning PyTorch and MLIR, so I'm not sure whether I'm correct.
I also have another question: does the conversion (tcf::ConvNCHWOp -> linalg::ConvNCHWOp) support the padding, dilation, and stride parameters?
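For reference, the PyTorch Conv2d documentation gives the output height as floor((H_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1), so twice the padding is added to the input height rather than subtracted. A minimal Python sketch of that formula follows. The function name conv2d_out_size is hypothetical and is not part of npcomp or PyTorch; it only illustrates the arithmetic.

```python
def conv2d_out_size(in_size, kernel_size, padding=0, stride=1, dilation=1):
    """Output extent along one spatial dimension, following the
    formula in the torch.nn.Conv2d documentation."""
    # Padding is applied on both sides, so it is ADDED twice to the input size.
    padded = in_size + 2 * padding
    # A dilated kernel spans dilation * (kernel_size - 1) + 1 input elements.
    effective_kernel = dilation * (kernel_size - 1) + 1
    # Integer floor division matches the floor() in the documented formula.
    return (padded - effective_kernel) // stride + 1


if __name__ == "__main__":
    print(conv2d_out_size(10, 3))             # no padding
    print(conv2d_out_size(10, 3, padding=1))  # "same"-style padding for a 3x3 kernel
```

With a height of 10 and a 3x3 kernel without padding, this yields 8; with padding=1 the output height stays at 10, which would not be possible if the padding were subtracted.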
When I run ninja check-npcomp, there are some errors:
Testing Time: 2.29s
Unsupported: 1
Passed : 2
Failed : 74
FAILED: test/CMakeFiles/check-npcomp-lit
Why do these tests fail?
RuntimeError: TODO: implement tensor import for non-arg tensors
Will require building out DenseElementsAttr in the C/Python API.
There doesn't seem to be anything in the verifier for the convolution ops that guarantees a particular rank here, so this could easily segfault/abort.
Originally posted by @silvasean in #16
Would you mind adding a comment describing the invariants, and conversions that are supported (and what is currently not supported).
Originally posted by @stellaraccident in #16
In test_export_conv2d_back.py:
Traceback (most recent call last):
File "/src/mlir-npcomp/frontends/pytorch/test/acap_export/test_export_conv2d_back.py", line 30, in <module>
result = model(tensor)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 419, in forward
return self._conv_forward(input, self.weight)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py", line 415, in _conv_forward
return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Unsupported capture value returned from kernel (Bool): True
Now that #154 is resolved, we get further on MacOS, exposing a couple of other issues. Tracking them here.
This looks like a quite heavy type to return here.
It isn't clear to me what this is doing right now, but taking a StringMap (or, better if possible, a DenseMap<StringRef, uint64_t>) as an output operand would avoid constant reallocation by letting the call site reuse the same map over and over.
Originally posted by @joker-eph in https://github.com/_render_node/MDIzOlB1bGxSZXF1ZXN0UmV2aWV3VGhyZWFkMjkyOTIwMTA4OnYy/pull_request_review_threads/discussion
Hi,
I'm thinking of contributing a build feature to the npcomp CMake build script, and would like some advice before doing so.
Here is our problem.
We develop a project, Foo, which uses MLIR's Python bindings from a modified llvm-project. This llvm-project is kept up to date with upstream and shipped to us as a prebuilt library.
Now we also want to use npcomp in our project, but the current build process needs to build LLVM from source and creates a bundled mlir package, so it doesn't work for us.
I think npcomp should be able to work in a more decoupled way. What I want to do is:
cmake -DLLVM_INSTALL_ROOT
import mlir will import mlir from the prebuilt MLIR, and import npcomp will just import the things npcomp needs.
Is it fine to do so?
Sincerely
With PyTorch nightly (1.9.0.dev20210216) installed via conda, I get the following error
static_assert(c10::guts::false_t<Func>(), ".impl_UNBOXED(...) was removed. Please use .impl(...) instead.");
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~
../frontends/pytorch/csrc/builder/acap_dispatch.cpp:596:5: note: in instantiation of function template specialization 'torch::Library::impl_UNBOXED<const char *, at::Tensor (const at::Tensor &, const at::Tensor &, const c10::optional<at::Tensor> &, c10::ArrayRef<long>, c10::ArrayRef<long>, c10::ArrayRef<long>, bool, c10::ArrayRef<long>, long)>' requested here
m.impl_UNBOXED("convolution", &AcapController::convolutionKernel);
I could do a bisection but it would be better if you could point me to a version that is known to work. Thank you!