Comments (4)
The thing is that almost all ops, including sum, return a tensor that has the same type as the original tensor. Having the returned tensor type depend on the values in the argument would feel a bit dodgy to me, hence this wrapping-around behavior. Maybe you could just do the casting on your side via `.eq(1u32)?.to_dtype(DType::U32)?.sum(0)?`; this would have the benefit of being more explicit about what actually takes place.
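To make the wrap-around concrete, here is a minimal sketch in plain Rust (it does not use candle). It assumes the comparison result is stored as 8-bit values and that the reduction accumulates in the same type as its input; the function names `sum_same_dtype` and `sum_widened` are hypothetical and only serve the illustration.

```rust
/// Sums `u8` values with a `u8` accumulator, mimicking a reduction whose
/// output type matches its input type. The result wraps modulo 256.
fn sum_same_dtype(values: &[u8]) -> u8 {
    values.iter().fold(0u8, |acc, &v| acc.wrapping_add(v))
}

/// Widens every element to `u32` before summing, which is the plain-Rust
/// analogue of casting first and reducing afterwards.
fn sum_widened(values: &[u8]) -> u32 {
    values.iter().map(|&v| u32::from(v)).sum()
}
```

With 300 elements equal to 1, the first function wraps past 255 while the second returns the full count, which is why casting before the sum gives the expected result.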
from candle.
That solution is adequate for me. We can close this, since it sounds like this is intended behavior.
One thing that'd be nice is to detect overflows somehow and crash when they happen. Perhaps this already occurs on the CPU? If not, it would be nice to at least bring that behavior to the CPU, and ideally to other devices too, at least for the most basic arithmetic kernels.
In Rust, a panic on integer overflow typically happens only when compiling in debug mode. When optimizations are enabled, arithmetic wraps around on overflow instead, as the overflow check would have some performance impact.
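For code that should detect overflow regardless of build mode, the standard integer methods can be called explicitly. The following is a minimal sketch in plain Rust, not candle's kernels; `checked_sum` is a hypothetical helper name used only for illustration.

```rust
/// Sums `u8` values and returns `None` as soon as an addition would
/// overflow, in both debug and release builds.
fn checked_sum(values: &[u8]) -> Option<u8> {
    values.iter().try_fold(0u8, |acc, &v| acc.checked_add(v))
}

/// Adds two `u8` values and returns the wrapped result together with a
/// flag that tells whether overflow occurred.
fn add_with_flag(a: u8, b: u8) -> (u8, bool) {
    a.overflowing_add(b)
}
```

`checked_add` makes the failure visible to the caller, while `overflowing_add` keeps the wrapped value and reports the overflow separately. Cargo also accepts `overflow-checks = true` in a release profile to keep the panicking checks for the plain `+` operator, at the performance cost mentioned above.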