Comments (3)
Could be related to #110758
from pytorch.
Works on the aot_eager backend, so this looks like a stride mismatch? Strides seen in _fused_adam under each backend, collected with the patch below:
{(1,), (3, 1, 1, 1), (128, 1, 1, 1)} # aot_eager
{(1,), (3, 1, 3, 3), (128, 1, 128, 128), (128, 1, 1, 1)} # inductor
diff --git a/torch/optim/adam.py b/torch/optim/adam.py
index fba4b2027b0..f7f13029cef 100644
--- a/torch/optim/adam.py
+++ b/torch/optim/adam.py
@@ -672,6 +672,13 @@ def _fused_adam(
             lr_dict[device] = lr.to(device=device, non_blocking=True) # type: ignore[union-attr]
             lr = lr_dict[device]
         torch._foreach_add_(device_state_steps, 1)
+
+        strides = set()
+        for l in [device_params, device_grads, device_exp_avgs, device_exp_avg_sqs]:
+            for x in l:
+                strides.add(x.stride())
+        breakpoint()
+
         torch._fused_adam_(
             device_params,
             device_grads,
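
One possible reading of the two stride sets above, sketched in plain Python without torch: the extra inductor entries (3, 1, 3, 3) and (128, 1, 128, 128) are what the channels-last stride formula gives for 4-D tensors with H = W = 1, while (3, 1, 1, 1) and (128, 1, 1, 1) are the row-major strides for the same shapes. The shapes used here, (8, 3, 1, 1) and (16, 128, 1, 1), are assumptions; only the strides appear in the thread, and the batch sizes 8 and 16 are made up. The helper names are hypothetical too. Whether this layout difference is what actually breaks the fused kernel is not established here.

```python
from itertools import product


def contiguous_strides(shape):
    """Row-major (default contiguous) strides, in elements, for a shape."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides of an NCHW-shaped tensor stored in NHWC (channels-last) order."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def same_offsets(shape):
    """True if both stride tuples map every index to the same memory offset."""
    a = contiguous_strides(shape)
    b = channels_last_strides(shape)
    for index in product(*(range(size) for size in shape)):
        offset_a = sum(i * s for i, s in zip(index, a))
        offset_b = sum(i * s for i, s in zip(index, b))
        if offset_a != offset_b:
            return False
    return True


# Assumed shapes: only the strides are given in the thread.
for shape in [(8, 3, 1, 1), (16, 128, 1, 1)]:
    print(shape, contiguous_strides(shape), channels_last_strides(shape),
          same_offsets(shape))
```

With H = W = 1 the index along those two dimensions is always 0, so both stride tuples address identical memory and differ only in metadata. A check that compares stride tuples literally would still treat them as different layouts.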
@shunting314 isn't there a stride optimization in inductor? IIRC there was a similar issue with distributed.