Generic system-wide modern C++ for heterogeneous platforms with SYCL from Khronos Group

License: Other

Makefile 0.91% Shell 0.19% C++ 57.71% C 39.10% CMake 2.08% Emacs Lisp 0.02%
opencl sycl gpu-computing fpga heterogeneous-parallel-programming spir cpp cpp20 trisycl

trisycl's Introduction

triSYCL

ACAP++: C++ extensions for AMD Versal ACAP AIE1 architecture

See tests/acap for some code samples.

Look at doc/acap.rst to know more about how to install/use the ACAP++ environment.

Introduction

triSYCL is a research project to experiment with the specification of the SYCL standard and to give feedback to the Khronos Group SYCL committee and also to the ISO C++ committee.

Because of lack of resources this SYCL implementation is very incomplete and should not be used by a normal end-user. Fortunately there are now many other implementations of SYCL available, including some strong implementations like ComputeCpp, DPC++ or hipSYCL that can be used on various targets.

This implementation is mainly based on C++23 features backed with OpenMP or TBB for parallel execution on the CPU, with Boost.Compute for the non single-source OpenCL interoperability layer and with an experimental LLVM/Clang version for the device compiler (from 2017-2018 which is now obsolete) providing full single-source SYCL experience, typically targeting a SPIR device. Since in SYCL there is a host fall-back, this CPU implementation can be seen as an implementation of this fall-back too.

Since around 2018, Intel has put a lot of effort into its own oneAPI DPC++ SYCL project to up-stream SYCL into LLVM/Clang. There is another project about merging the oneAPI DPC++ SYCL implementation with triSYCL at https://github.com/triSYCL/sycl to give a better user experience for Xilinx FPGA instead of using our obsolete, experimental and clunky device compiler. But this is still very experimental because the Xilinx tool-chain is based on old incompatible versions of LLVM/Clang and none of this is supported by the Xilinx product teams.

Most of our efforts are focused on extensions, such as targeting Xilinx FPGA and Versal ACAP CGRA with internal developments on https://gitenterprise.xilinx.com/rkeryell/acappp.

triSYCL has been used to experiment and provide feedback for SYCL 1.2, 1.2.1, 2.2, 2020 and even the OpenCL C++ 1.0 kernel language from OpenCL 2.2.

This is provided as is, without any warranty, with the same license as LLVM/Clang.

Technical lead: Ronan at keryell point FR. Development started at AMD, was then mainly funded by Xilinx, and is now funded by AMD again since AMD bought Xilinx in 2022.

It is possible to have a paid internship around triSYCL, if you have some skills related to this project. Contact the technical lead about this. AMD is also hiring in this area... :-)

SYCL

SYCL is a single-source modern C++-based DSEL (Domain Specific Embedded Language) and open standard from Khronos aimed at facilitating the programming of heterogeneous accelerators by leveraging existing concepts inspired by OpenCL, CUDA, C++AMP, OpenMP...

A typical kernel with its launch looks like this pure modern C++ code:

queue {}.submit([&](handler &h) {
    auto accA = bufA.get_access<access::mode::read>(h);
    auto accB = bufB.get_access<access::mode::write>(h);
    h.parallel_for<class myKernel>(myRange, [=](item i) {
        accB[i] = accA[i] + 1;
    });
});

Look for example at https://github.com/triSYCL/triSYCL/blob/master/tests/examples/demo_parallel_matrix_add.cpp for a complete example.

SYCL is developed inside the Khronos SYCL committee and thus, for more information on SYCL, look at https://www.khronos.org/sycl

Note that even if the concepts behind SYCL are inspired by OpenCL concepts, the SYCL programming model is a very general asynchronous task graph model for heterogeneous computing targeting various frameworks and APIs and has no relation to OpenCL itself, except when using the OpenCL API interoperability mode, like any other target.

For the SYCL ecosystem, look at https://sycl.tech

Documentation

Some reasons to use SYCL

Please see "About SYCL" for some context, a list of presentations and some related projects.

Installation & testing

triSYCL is a template library, so no real installation is required.

There are some examples you can build however.

See Testing.

Architecture of triSYCL runtime and compiler

Architecture of triSYCL runtime and compiler describes the code base with some high-level diagrams but also how it was possible to compile and use the obsolete device compiler on some Xilinx FPGA for example. Now look at https://github.com/triSYCL/sycl instead.

CMake infrastructure

Some details about CMake configuration and organization can be found in CMake.

Pre-processor macros used in triSYCL

Yes, there are some macros used in triSYCL! Look at Pre-processor macros used in triSYCL to discover some of them.

Environment variables used in triSYCL

See Environment variables with triSYCL.

Possible futures

See Possible futures.

triSYCL code documentation

The documentation of the triSYCL implementation itself can be found in https://trisycl.github.io/triSYCL/Doxygen/triSYCL/html and https://trisycl.github.io/triSYCL/Doxygen/triSYCL/triSYCL-implementation-refman.pdf

There is also some internal documentation at https://pages.gitenterprise.xilinx.com/rkeryell/acappp/Doxygen/acappp/html

News

  • 2023/06/09: merge the 5-year old branch experimenting with ACAP++ SYCL CPU model extensions for AMD Versal ACAP AIE1 CGRA like the XCVC1902 used in VCK190 or VCK5000 boards.
  • 2018/03/12: the long-going device compiler branch has been merged in to provide experimental support for SPIR-df friendly devices, such as PoCL or Xilinx FPGA. This is only for the brave for now.
  • 2018/02/01: there is now some documentation about the architecture of triSYCL on GPU and accelerators with its device compiler based on Clang/LLVM in doc/architecture.rst. While this is wildly experimental, there is a growing interest around it and it is always useful to get started as a contributor.
  • 2018/01/05: there are some internship openings at Xilinx to work on triSYCL for FPGA https://xilinx.referrals.selectminds.com/jobs/compiler-engineer-intern-on-sycl-for-fpga-4685 and more generally Xilinx is hiring in compilation, runtime, C++, SYCL, OpenCL, machine-learning...
  • 2017/12/06: the brand-new SYCL 1.2.1 specification is out and triSYCL starts moving to it
  • 2017/11/17: the presentations and videos from SC17 on SYCL and triSYCL are now online https://www.khronos.org/news/events/supercomputing-2017
  • 2017/09/19: there is a prototype of device compiler based on Clang/LLVM generating SPIR 2.0 "de facto" (SPIR-df) and working at least with PoCL and Xilinx SDx xocc for FPGA.
  • 2017/03/03: triSYCL can use CMake & ctest and works on Windows 10 with Visual Studio 2017. It works also with Ubuntu WSL on Windows. :-) More info
  • 2017/01/12: Add test case using the Xilinx compiler for FPGA
  • 2016/11/18: If you missed the free SYCL T-shirt on the Khronos booth during SC16, you can always buy some on https://teespring.com/khronos-hpc (lady's sizes available, so no excuse! :-) )
  • 2016/08/12: OpenCL kernels can be run with OpenCL kernel interoperability mode now.
  • 2016/04/18: SYCL 2.2 provisional specification is out. This version implements SYCL 2.2 pipes and reservations plus the blocking pipe extension from Xilinx.

trisycl's People

Contributors

a-doumoulakis, agozillon, ahonorat, airlied, benoitsteiner, chriscummins, ghostkeeper, hughperkins, j-stephan, jeffamstutz, joanthibault, keryell, krasznaa, lforg37, mathiasmagnus, nazavode, pkeir, psalz, ralender, szellmann, thijswithaar, ville-k, xlnx-hyunkwon, yu810226

trisycl's Issues

Fix read-only status of a buffer

As mentioned in #11 by @MathiasMagnus, there is an issue with how the read-only status is computed.
Actually tests/buffer/read_write_buffer.cpp does not check the results. :-(
Furthermore this read-only status seems to have changed in the specification in the meantime.
So the implementation has to be fixed and cleaned up.

Move the project to its own GitHub organization

Since triSYCL is growing, with a specific Clang and LLVM version for the outlining compiler, I plan to create a GitHub organization and move all the related repositories under it.
In that way it could be easier to have a more collaborative control of the repositories.

Implement buffer allocators

It is used in TensorFlow SYCL.
Unfortunately the current triSYCL accessor design would leak in its type the allocator type.
So some deeper refactoring is required.

Question: Architecture?

What is the architecture?

Concretely, if I wanted to get a sycl program converted into some kind of llvm bytecode-thing, but not yet converted into openmp/xilinx/etc, how would I do that?

Do we want to compile warning free?

I know I have a fetish for writing warning-free, cross-platform, cross-compiler C++ code, but it might not be a goal. It does bug me though that the console literally explodes when compiling the unit tests.

MSVC cries over its Debug Iterators a lot inside the SYCL short vector classes, GCC keeps whining about deprecated function declarations referred to inside stock Ubuntu CL/cl.h and Clang... for just about everything else.

It is very hard to test a new feature of SYCL, whether it provokes even just a warning in any of the tests, when the console is just beyond being of any use.

Should there be some aim like: GCC 6+, Clang 4+, VS2017+, all with -Wall -std=c++1z or /W4 /std:latest be warning-free? (/Wall with MSVC is useless and a joke)

Building triSYCL without boost.Compute

I am trying to use travis to build the SYCL Khronos Parallel STL with triSYCL automatically on each commit. However, I cannot set up OpenCL on travis (or at least not easily), so I cannot use boost compute.
It seems that triSYCL should not require boost compute to work, however, when trying to build without it it fails, for example https://travis-ci.org/Ruyk/SyclParallelSTL/jobs/221413128

This seems to be caused by boost compute headers not being #ifdef-ed out in some headers, for example in CL/sycl/opencl_type.hpp
Is it still the case that triSYCL can be built without boost compute and without OpenCL? That would be great since it would simplify my travis integration...
Thanks!

All the OpenCL interoperability .get() methods should retain the OpenCL object before returning

Double-check that all the .get() methods really increment the OpenCL reference count.

The 1.2.1 specification says:

2.6.9 Managing object lifetimes

When an OpenCL object that is encapsulated in a SYCL object is copied in C++, then the underlying OpenCL
object is not duplicated, but its OpenCL reference count is incremented. When the original or copied SYCL object
is destroyed, then the OpenCL reference count is decremented.
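
The lifetime rule quoted above can be sketched without any OpenCL dependency. In this minimal sketch, fake_cl_object, fake_retain, fake_release and wrapper are hypothetical stand-ins for an OpenCL handle, the clRetain*/clRelease* family of calls and a SYCL object; none of them is triSYCL code, and the choice to retain inside get() only mirrors what this issue asks for.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for an OpenCL object with its reference count
struct fake_cl_object {
  int ref_count = 1;
};

// Hypothetical equivalents of the clRetain*/clRelease* calls
inline void fake_retain(fake_cl_object *o) { ++o->ref_count; }
inline void fake_release(fake_cl_object *o) { --o->ref_count; }

// Hypothetical SYCL-like wrapper sharing the underlying object on copy
class wrapper {
  fake_cl_object *obj;

public:
  // Wrap the caller's handle and take an additional reference on it
  explicit wrapper(fake_cl_object *o) : obj { o } { fake_retain(obj); }

  // Copying does not duplicate the object, it only bumps the count
  wrapper(const wrapper &other) : obj { other.obj } { fake_retain(obj); }

  wrapper &operator=(const wrapper &other) {
    if (this != &other) {
      fake_retain(other.obj);
      fake_release(obj);
      obj = other.obj;
    }
    return *this;
  }

  // Destroying a wrapper gives its reference back
  ~wrapper() { fake_release(obj); }

  // Retain before returning, so the caller owns one reference and
  // is responsible for releasing it
  fake_cl_object *get() const {
    fake_retain(obj);
    return obj;
  }
};

// Return the counts observed after a copy and after a get()
inline std::pair<int, int> demo_counts() {
  fake_cl_object o;              // count == 1, owned by the caller
  wrapper a { &o };              // count == 2
  wrapper b { a };               // count == 3
  int after_copy = o.ref_count;
  fake_cl_object *raw = b.get(); // count == 4
  int after_get = raw->ref_count;
  fake_release(raw);             // the caller gives its reference back
  return { after_copy, after_get };
}
```

With this convention a caller that forgets to release the handle returned by get() leaks one reference, which is why the retain has to be documented as part of the interoperability contract.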

Fails to build with GCC 7.1.0

It's pretty simple, it's failing to recognize {} as a general initialization form instead of as an initializer list. It might be a GCC bug, as clang 4.0.1 handles it fine.
I'm using Ubuntu 16.04, with hand-installed GCC and Clang.

.../triSYCL/build> g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/local/gcc-7.1/libexec/gcc/x86_64-pc-linux-gnu/7.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-7.1.0/configure --prefix=/usr/local/gcc-7.1 --with-gmp=/usr/local/gmp --with-mpfr=/usr/local/mpfr --disable-multilib
Thread model: posix
gcc version 7.1.0 (GCC) 

error:

.../triSYCL/build> make
cd /space/software/triSYCL/build/tests/buffer && /usr/local/gcc/bin/g++   -I/space/software/triSYCL/include -I/space/software/triSYCL/tests/common -I/usr/include/compute  -Wall -Wextra -Wno-ignored-attributes -Wno-sign-compare -Wno-deprecated-declarations -Wno-ignored-qualifiers -Wno-unused-parameter -fopenmp -std=gnu++1z -o CMakeFiles/buffer_buffer_unique_ptr.dir/buffer_unique_ptr.cpp.o -c /space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp
/space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp: In function ‘int test_main(int, char**)’:
/space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp:20:38: error: no matching function for call to ‘cl::sycl::buffer<int>::buffer(<brace-enclosed initializer list>)’
   buffer<int> a { std::move(init), N };
                                      ^
In file included from /space/software/triSYCL/include/CL/sycl.hpp:40:0,
                 from /space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp:8:
/space/software/triSYCL/include/CL/sycl/buffer.hpp:321:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(cl::sycl::buffer<T, Dimensions, Allocator>&, const cl::sycl::id<Dimensions>&, const cl::sycl::range<Dimensions>&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>]
   buffer(buffer<T, Dimensions, Allocator> &b,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:321:3: note:   candidate expects 4 arguments, 2 provided
/space/software/triSYCL/include/CL/sycl/buffer.hpp:299:3: note: candidate: template<class InputIterator, class ValueType> cl::sycl::buffer<T, Dimensions, Allocator>::buffer(InputIterator, InputIterator, Allocator)
   buffer(InputIterator start_iterator,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:299:3: note:   template argument deduction/substitution failed:
/space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp:20:38: note:   deduced conflicting types for parameter ‘InputIterator’ (‘std::unique_ptr<int []>’ and ‘long unsigned int’)
   buffer<int> a { std::move(init), N };
                                      ^
In file included from /space/software/triSYCL/include/CL/sycl.hpp:40:0,
                 from /space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp:8:
/space/software/triSYCL/include/CL/sycl/buffer.hpp:254:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(cl::sycl::unique_ptr_class<T>&&, const cl::sycl::range<Dimensions>&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>; cl::sycl::unique_ptr_class<T> = std::unique_ptr<int [], std::default_delete<int> >]
   buffer(unique_ptr_class<T> &&host_data,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:254:3: note:   no known conversion for argument 1 from ‘std::remove_reference<std::unique_ptr<int []>&>::type {aka std::unique_ptr<int []>}’ to ‘cl::sycl::unique_ptr_class<int, std::default_delete<int> >&& {aka std::unique_ptr<int [], std::default_delete<int> >&&}’
/space/software/triSYCL/include/CL/sycl/buffer.hpp:227:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(cl::sycl::shared_ptr_class<T>, const cl::sycl::range<Dimensions>&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>; cl::sycl::shared_ptr_class<T> = std::shared_ptr<int>]
   buffer(shared_ptr_class<T> host_data,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:227:3: note:   no known conversion for argument 1 from ‘std::remove_reference<std::unique_ptr<int []>&>::type {aka std::unique_ptr<int []>}’ to ‘cl::sycl::shared_ptr_class<int> {aka std::shared_ptr<int>}’
/space/software/triSYCL/include/CL/sycl/buffer.hpp:199:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(cl::sycl::shared_ptr_class<T>&, const cl::sycl::range<Dimensions>&, cl::sycl::mutex_class&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>; cl::sycl::shared_ptr_class<T> = std::shared_ptr<int>; cl::sycl::mutex_class = std::mutex]
   buffer(shared_ptr_class<T> &host_data,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:199:3: note:   candidate expects 4 arguments, 2 provided
/space/software/triSYCL/include/CL/sycl/buffer.hpp:170:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(T*, const cl::sycl::range<Dimensions>&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>]
   buffer(T *host_data,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:170:3: note:   no known conversion for argument 1 from ‘std::remove_reference<std::unique_ptr<int []>&>::type {aka std::unique_ptr<int []>}’ to ‘int*’
/space/software/triSYCL/include/CL/sycl/buffer.hpp:146:3: note: candidate: template<class Dependent, class> cl::sycl::buffer<T, Dimensions, Allocator>::buffer(const T*, const cl::sycl::range<Dimensions>&, Allocator)
   buffer(const T *host_data,
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:146:3: note:   template argument deduction/substitution failed:
/space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp:20:28: note:   cannot convert ‘std::move<std::unique_ptr<int []>&>(init)’ (type ‘std::remove_reference<std::unique_ptr<int []>&>::type {aka std::unique_ptr<int []>}’) to type ‘const int*’
   buffer<int> a { std::move(init), N };
                   ~~~~~~~~~^~~~~~
In file included from /space/software/triSYCL/include/CL/sycl.hpp:40:0,
                 from /space/software/triSYCL/tests/buffer/buffer_unique_ptr.cpp:8:
/space/software/triSYCL/include/CL/sycl/buffer.hpp:113:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(const cl::sycl::range<Dimensions>&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>]
   buffer(const range<Dimensions> &r, Allocator allocator = {})
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:113:3: note:   no known conversion for argument 1 from ‘std::remove_reference<std::unique_ptr<int []>&>::type {aka std::unique_ptr<int []>}’ to ‘const cl::sycl::range<1>&’
/space/software/triSYCL/include/CL/sycl/buffer.hpp:98:3: note: candidate: constexpr cl::sycl::buffer<T, Dimensions, Allocator>::buffer() [with T = int; int Dimensions = 1; Allocator = std::allocator<int>]
   buffer() = default;
   ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:98:3: note:   candidate expects 0 arguments, 2 provided
/space/software/triSYCL/include/CL/sycl/buffer.hpp:62:7: note: candidate: cl::sycl::buffer<int>::buffer(const cl::sycl::buffer<int>&)
 class buffer
       ^~~~~~
/space/software/triSYCL/include/CL/sycl/buffer.hpp:62:7: note:   candidate expects 1 argument, 2 provided
/space/software/triSYCL/include/CL/sycl/buffer.hpp:62:7: note: candidate: cl::sycl::buffer<int>::buffer(cl::sycl::buffer<int>&&)
/space/software/triSYCL/include/CL/sycl/buffer.hpp:62:7: note:   candidate expects 1 argument, 2 provided
tests/buffer/CMakeFiles/buffer_buffer_unique_ptr.dir/build.make:65: recipe for target 'tests/buffer/CMakeFiles/buffer_buffer_unique_ptr.dir/buffer_unique_ptr.cpp.o' failed

Multi-array build error

When trying to build Parallel STL with TriSYCL I get the following error:

/home/ruyman/Projects/ruyk-parallel-stl/include/experimental/algorithm:287:38:   required from ‘InputIt std::experimental::parallel::find(ExecutionPolicy&&, InputIt, InputIt, T) [with ExecutionPolicy = sycl::sycl_execution_policy<FindAlgorithm>&; InputIt = __gnu_cxx::__normal_iterator<float*, std::vector<float> >; T = float]’
/home/ruyman/Projects/ruyk-parallel-stl/tests/find.cpp:63:59:   required from here
/usr/include/boost/multi_array.hpp:477:30: error: no matching function for call to ‘sycl::impl::search_result::search_result()’
     std::uninitialized_fill_n(base_,allocated_elements_,T());

This seems to be a problem with the multi_array interface. Which version of multi-array is required for triSYCL? I am using 1.58.

Travis Parallel STL Job output:
https://travis-ci.org/Ruyk/SyclParallelSTL/jobs/221684886

Implement context to optimize OpenCL transfers

The current implementation of context is pretty minimal.

OpenCL interoperability

There is a problem with the buffer accessor that holds the boost::optional<boost::compute::buffer> cl_buf: every time a new accessor is created (e.g. when passing arguments to the kernel), a transfer to the device is triggered even if a previous boost::compute::buffer could have been reused.

A solution would be a caching mechanism inside of the buffer that associates a cl::sycl::context to a boost::compute::buffer in an unordered_map.
Implementing cl::sycl::context as a detail::shared_ptr_implementation would allow it to be hashed and used inside an unordered_map.
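
The caching idea can be sketched as follows. This is only an illustration under stated assumptions: context, device_buffer, context_hash and cached_buffer are hypothetical simplified types, not the triSYCL or Boost.Compute classes, and the "transfer" is reduced to a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for boost::compute::buffer
struct device_buffer {
  std::size_t size;
};

// Hypothetical context handle: copies share one implementation object,
// so the identity of that object provides equality and hashing
class context {
  struct impl {};
  std::shared_ptr<impl> p = std::make_shared<impl>();

public:
  bool operator==(const context &other) const { return p == other.p; }
  std::size_t hash() const { return std::hash<std::shared_ptr<impl>> {}(p); }
};

// Hash functor so that context can be a key of an unordered_map
struct context_hash {
  std::size_t operator()(const context &c) const { return c.hash(); }
};

// Hypothetical buffer keeping at most one device buffer per context
class cached_buffer {
  std::size_t size;
  std::unordered_map<context, device_buffer, context_hash> cache;

public:
  int transfers = 0; // Counts the simulated host-to-device transfers

  explicit cached_buffer(std::size_t s) : size { s } {}

  // Reuse the device buffer of this context or create it on first use
  device_buffer &get_device_buffer(const context &c) {
    auto it = cache.find(c);
    if (it == cache.end()) {
      ++transfers; // A real implementation would copy the data here
      it = cache.emplace(c, device_buffer { size }).first;
    }
    return it->second;
  }
};

// Count the transfers caused by 4 requests spread over 2 contexts
inline int demo_transfers() {
  context c1, c2;
  context c1_copy = c1; // Same underlying context as c1
  cached_buffer b { 1024 };
  b.get_device_buffer(c1);
  b.get_device_buffer(c1_copy);
  b.get_device_buffer(c2);
  b.get_device_buffer(c1);
  return b.transfers;
}
```

A real cache would also need an invalidation rule when the host or another context writes to the buffer, which this sketch leaves out.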

Context implementation

A basic implementation of context could be similar to device or queue, with a context/detail/host_context for the host, a context/detail/opencl_context to hold the boost::compute::context and a context/detail/context that holds the virtual functions.

Compiler Error in buffer_unique_ptr test

This is a revision of PR #46 and issue #47. At first, I thought the problem was just a minor typo or mistake, but it turns out that there seems to be a fundamental error in the buffer class. (Note: I'm testing with GCC / libstdc++ 7.1.1.)

First, let's look at the original error message:

/root/triSYCL/tests/buffer/buffer_unique_ptr.cpp: In function 'int test_main(int, char**)':
/root/triSYCL/tests/buffer/buffer_unique_ptr.cpp:20:38: error: no matching function for call to 'cl::sycl::buffer<int>::buffer(<brace-enclosed initializer list>)'
   buffer<int> a { std::move(init), N };

We want to construct the buffer from a unique_ptr, so let's see what the compiler complained about for that ctor:

/root/triSYCL/include/CL/sycl/buffer.hpp:254:3: note: candidate: cl::sycl::buffer<T, Dimensions, Allocator>::buffer(cl::sycl::unique_ptr_class<T>&&, const cl::sycl::range<Dimensions>&, Allocator) [with T = int; int Dimensions = 1; Allocator = std::allocator<int>; cl::sycl::unique_ptr_class<T> = std::unique_ptr<int [], std::default_delete<int> >]
   buffer(unique_ptr_class<T> &&host_data,
   ^~~~~~
/root/triSYCL/include/CL/sycl/buffer.hpp:254:3: note:   no known conversion for argument 1 from 'std::remove_reference<std::unique_ptr<int []>&>::type {aka std::unique_ptr<int []>}' to 'cl::sycl::unique_ptr_class<int, std::default_delete<int> >&& {aka std::unique_ptr<int [], std::default_delete<int> >&&}'

OK, what about modifying line 17 in the test into the following?

std::unique_ptr<int[], std::default_delete<int>> init { new int[N] };

Now it can find the desired buffer ctor, but it then fails on instantiating one of the shared_ptr type member in detail::buffer:

/root/triSYCL/tests/buffer/buffer_unique_ptr.cpp:20:38:   required from here
/root/triSYCL/include/CL/sycl/buffer/detail/buffer.hpp:181:22: error: no matching function for call to 'std::shared_ptr<int>::shared_ptr(<brace-enclosed initializer list>)'

(It's actually line 180 in include/CL/sycl/buffer/detail/buffer.hpp)

Our desired shared_ptr ctor is the one that takes a unique_ptr, so let's see what the compiler complained about for it:

/usr/include/c++/7.1.1/bits/shared_ptr.h:277:2: note: candidate: template<class _Yp, class _Del, class> std::shared_ptr<_Tp>::shared_ptr(std::unique_ptr<_Up, _Ep>&&)
  shared_ptr(unique_ptr<_Yp, _Del>&& __r)
  ^~~~~~~~~~
/usr/include/c++/7.1.1/bits/shared_ptr.h:277:2: note:   template argument deduction/substitution failed:

(Yes, there is nothing after the trailing colon.)

Note that the first template parameter of the ctor, _Yp, should equal the first template parameter of the unique_ptr ctor argument, but here they are different (_Yp vs _Up).

It turns out that the static type assertion for this ctor causes this error. Here is the ctor declaration from libstdc++ 7.1.1 (include/c++/7.1.1/bits/shared_ptr.h):

      template<typename _Yp, typename _Del,
               typename = _Constructible<unique_ptr<_Yp, _Del>>>
        shared_ptr(unique_ptr<_Yp, _Del>&& __r)
        : __shared_ptr<_Tp>(std::move(__r)) { }

_Constructible is just a thin wrapper around std::is_constructible, which checks whether __shared_ptr can be constructed from __r.

I think it makes sense that the ctor fails here. Going back to the ctor of detail::buffer that is constructed from a unique_ptr (line 180 in include/CL/sycl/buffer/detail/buffer.hpp), it takes a unique_ptr<int[], std::default_delete<int>> as its argument, which causes the _Yp template parameter in the above shared_ptr ctor to be resolved to int[]. However, the shared pointer member in the detail::buffer<int> class, input_shared_pointer, is declared as shared_ptr<int>.

My guess is that prior to version 7.x, GCC or libstdc++ considered this situation _Constructible, whereas later versions do not.

Here is a quick solution. Change line 180 in include/CL/sycl/buffer/detail/buffer.hpp:

input_shared_pointer{ host_data.release(), host_data.get_deleter() },

That is, use the T* ctor instead of unique_ptr ctor for constructing input_shared_pointer

It took me nearly a weekend to verify most of the assumptions in this paragraph, so I think this issue is non-trivial and thus worth some discussions before sending a PR.

Also, shouldn't unique_ptr_class, declared in include/CL/sycl/detail/default_classes.hpp, be like

template <class T, class D = std::default_delete<T[]>>
using unique_ptr_class = std::unique_ptr<T[], D>;

rather than

template <class T, class D = std::default_delete<T>>
using unique_ptr_class = std::unique_ptr<T[], D>;

The former one would leverage the proper delete[] where the latter wouldn't. (Or should I put this problem into separated issue?)
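
The quick solution and the deleter question can be tried outside of triSYCL. The following is a minimal sketch assuming only the standard library; share_array and demo_sum are hypothetical helper names used for illustration, and the unique_ptr here uses the standard array deleter std::default_delete<int[]>.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// The default deleter of an array unique_ptr is the array form,
// which is the one calling delete[]
static_assert(std::is_same<std::unique_ptr<int[]>::deleter_type,
                           std::default_delete<int[]>>::value,
              "unique_ptr<int[]> uses default_delete<int[]>");

// Hypothetical helper: hand an array unique_ptr over to a shared_ptr<T>
// through the (pointer, deleter) constructor instead of the
// unique_ptr-taking constructor
template <typename T, typename D>
std::shared_ptr<T> share_array(std::unique_ptr<T[], D> host_data) {
  // Copy the deleter first, then give up ownership of the array
  D deleter = host_data.get_deleter();
  return std::shared_ptr<T> { host_data.release(), std::move(deleter) };
}

// Fill an array of n integers with 0..n-1, share it and sum it
inline int demo_sum(std::size_t n) {
  std::unique_ptr<int[]> init { new int[n] };
  for (std::size_t i = 0; i < n; ++i)
    init[i] = static_cast<int>(i);
  std::shared_ptr<int> shared = share_array(std::move(init));
  int sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += shared.get()[i];
  return sum;
}
```

Because the deleter travels with the pointer, the array is still released with delete[] when the last shared_ptr<int> goes away.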

CC @keryell
Thanks for your patient reading

Make asynchronous kernel launch mode the default one

In SYCL the kernels are launched as an asynchronous task graph.
For now this happens only when compiled with -DTRISYCL_ASYNC=1.
Since this should be the default mode, this variable should be renamed TRISYCL_NO_ASYNC or something like that.
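
A minimal sketch of such an opt-out macro, assuming the TRISYCL_NO_ASYNC name suggested in this issue; trisycl_async_enabled is a hypothetical constant used for illustration and not an existing triSYCL symbol.

```cpp
#include <cassert>

// Hypothetical opt-out switch: asynchronous execution is enabled
// unless the user compiles with -DTRISYCL_NO_ASYNC
#ifdef TRISYCL_NO_ASYNC
constexpr bool trisycl_async_enabled = false;
#else
constexpr bool trisycl_async_enabled = true;
#endif

// Helper reporting the mode selected at compile time
inline bool kernels_run_asynchronously() {
  return trisycl_async_enabled;
}
```

With this shape, code compiled without any flag gets the asynchronous behavior, and only users who need the synchronous mode have to define a macro.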

Kernels not running on GPU: cl::sycl::gpu_selector does not work

Hi,

I successfully installed triSYCL on my PC with:

  • an NVIDIA GeForce GT 745M GPU hosted in an Intel i5.
  • Ubuntu 14.04.
  • CUDA Driver / Runtime 8.0
  • gcc / g++ 4.8 and 6.0
  • Boost 1.63.0

All test programs (located in triSYCL/tests) compiled and ran without any error.

But when I did some examples and added try/catch with:
queue myQueue([&](exception_list eL) {
    try {
        for (auto &e : eL) {
            std::rethrow_exception(e);
        }
    } catch (cl::sycl::exception e) {
        std::cout << " An exception has been thrown: " << e.what() << std::endl;
    }
});
It compiled and linked successfully. But I got the following runtime error:

Error: using a non implemented feature

Declaring a queue with queue myQueue; everything runs fine, but myQueue.is_host() returns true, so the kernels are running on the host.

When I tried to use:
cl::sycl::gpu_selector selector;
cl::sycl::queue myQueue(selector);

I got the same runtime error:

Error: using a non implemented feature

So I want to understand why my kernels are running on host instead of my GPU while I linked my programs with NVIDIA OpenCL implementation located in /usr/local/cuda/

Thanks.

Compile error in buffer test

Hi,
I bumped into a compile error on line 20 in tests/buffer/buffer_unique_ptr.cpp. It turns out that in line 17, the managed type for unique_ptr should be int instead of int[].
I've created PR #46. The revised version also passes the test.
CC @keryell

Refactor `parallel_for_workitem()` implementation

In

template <int Dimensions, typename ParallelForFunctor>
void parallel_for_workitem(const group<Dimensions> &g,
                           ParallelForFunctor f)

I have the feeling that the loop on th_id could be just a normal (even metaprogrammed?) 1-3D collapsed loop. No need to have a thread id here anymore with all these division and modulo operations.
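
The two loop shapes can be compared outside of triSYCL. In this sketch, index3, for_each_linear, for_each_nested and same_order are hypothetical names, and the row-major ordering is an assumption made for the illustration rather than a description of the actual parallel_for_workitem() code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using index3 = std::array<std::size_t, 3>;

// Linear-id variant: one flat loop where each index is recovered
// with division and modulo operations
template <typename F>
void for_each_linear(const index3 &r, F f) {
  const std::size_t total = r[0]*r[1]*r[2];
  for (std::size_t th_id = 0; th_id < total; ++th_id) {
    index3 i { th_id/(r[1]*r[2]), (th_id/r[2]) % r[1], th_id % r[2] };
    f(i);
  }
}

// Nested-loop variant: same iteration order, no division or modulo
template <typename F>
void for_each_nested(const index3 &r, F f) {
  for (std::size_t x = 0; x < r[0]; ++x)
    for (std::size_t y = 0; y < r[1]; ++y)
      for (std::size_t z = 0; z < r[2]; ++z)
        f(index3 { x, y, z });
}

// Check that both variants visit the same indices in the same order
inline bool same_order(const index3 &r) {
  std::vector<index3> a, b;
  for_each_linear(r, [&](const index3 &i) { a.push_back(i); });
  for_each_nested(r, [&](const index3 &i) { b.push_back(i); });
  return a == b;
}
```

A 1D or 2D range fits the same code by setting the unused trailing extents to 1, and the nested form is also the shape that an OpenMP collapse clause expects.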

Example of something that is not simply per-element operations?

Hi Ronan,

Question: are there any examples of something that is not just a simple per-element operation? I'm thinking of eg:

  • matrix multiplication?
  • convolution?
  • reduction?

I looked in the examples and found per-element addition, multiplication, etc., but couldn't seem to find anything that wasn't simply per-element.

Finish implementation of vec.hpp

Complete include/CL/sycl/vec.hpp by improving small_array.
Check/improve the tests in tests/vector.

Probably skip all the swizzle for now, to be put in another issue.
