Comments (3)
@cwpearson, you are right. oneapi::mkl::sparse::optimize_gemv() was a no-op in previous versions of oneMKL. In the last few releases, though, it stopped being a no-op. In the current release of oneMKL (2024.0), if you expect to call oneapi::mkl::sparse::gemv() only a single time with a given matrix handle, then calling oneapi::mkl::sparse::optimize_gemv() would very likely be detrimental to performance, and you could remove that optimize_gemv() call. The cost of the call to optimize_gemv() is generally amortized over multiple calls to gemv() with the same matrix handle (e.g., you call optimize_gemv() once, followed by, say, a hundred gemv() calls). The oneMKL team will try to reduce the cost of optimize_gemv() in a future oneMKL release.
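To make the amortization argument concrete, here is a minimal sketch of a break-even calculation. It does not call oneMKL. The struct SpmvCosts, its fields, and the three functions are hypothetical names introduced only for illustration, and the timings fed into them are assumptions that would have to be measured on the target system.

```cpp
#include <cmath>

// Hypothetical cost model (not part of oneMKL). All times share one unit,
// e.g. milliseconds, and must come from measurements on the target system.
struct SpmvCosts {
    double optimize;        // one-time cost of optimize_gemv()
    double gemv_plain;      // one gemv() call without a prior optimize_gemv()
    double gemv_optimized;  // one gemv() call after optimize_gemv()
};

// Total time for n gemv() calls when optimize_gemv() is called once up front.
double total_with_optimize(const SpmvCosts& c, long n) {
    return c.optimize + static_cast<double>(n) * c.gemv_optimized;
}

// Total time for n gemv() calls when optimize_gemv() is skipped.
double total_without_optimize(const SpmvCosts& c, long n) {
    return static_cast<double>(n) * c.gemv_plain;
}

// Smallest number of gemv() calls for which calling optimize_gemv() first is
// strictly faster overall. Returns -1 if the optimized gemv() is not faster
// per call, in which case the optimize step never pays off.
long break_even_calls(const SpmvCosts& c) {
    const double saving_per_call = c.gemv_plain - c.gemv_optimized;
    if (saving_per_call <= 0.0) {
        return -1;
    }
    return static_cast<long>(std::floor(c.optimize / saving_per_call)) + 1;
}
```

With made-up numbers such as an optimize cost of 50, a plain gemv() cost of 1.0 and an optimized gemv() cost of 0.5, the optimize step only pays off from the 101st gemv() call onward. For a single gemv() call, skipping optimize_gemv() is cheaper under this model whenever the optimize cost exceeds the per-call saving.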
from kokkos-kernels.
Thank you @gajanan-choudhary! We have a PR in progress to remove optimize_gemv for our Aurora users. We are considering a two-phase SpMV API and would revisit this then, but that API does not exist yet.
Hi @cwpearson, just a quick update: I recently worked on improving the performance of oneapi::mkl::sparse::optimize_gemv(). I was able to significantly improve the performance (at least for "non-tiny" matrices) for the sycl::buffer APIs. Some bottleneck still exists in the USM APIs that we are still trying to figure out. These improvements will appear in the oneMKL 2024.2 release (not the upcoming 2024.1 release) in a few months.
We will continue looking into improving the performance of both optimize_gemv() and one-shot sparse::gemv() without optimize_gemv() in future releases.