Comments (4)
We should fix this ASAP. A hotfix has already landed, but this arguably should still be high priority. I will look at it soon.
from pytorch.
here is a smaller repro: https://github.com/pytorch-labs/float8_experimental/pull/260/files
pytorch-labs/float8_experimental#262 is a hotfix for float8, although from talking to @eellison it sounds like we can potentially fix this in inductor by having inductor not convert views into non-views when we know they came directly from a collective
I'm not sure why the kernel triton_poi_fused_view_3 is needed. The code responsible for generating this kernel can be found here.
The op torch.ops.aten.view.dtype(all_gather_into_tensor_2, torch.float8_e5m2) is a view, but it should be a no-op, since it merely reinterprets between uint8 and an fp8 dtype, both of which have the same bit width. This code was used to work around the fact that NCCL (or at least our bindings to NCCL in PyTorch) doesn't currently support comms for fp8 dtypes. Perhaps we need to allow fp8 dtypes in all_gather and do the uint trick down there?