Comments (12)
Sorry it took me so long to respond. My kernel is very large and spans many classes, so I had to cut it down to make it easier to review. This is the piece of code that causes the exception.
If you delete any line of the kernel code, the exception disappears.
public class MyComplexKernel
{
    private static void _MyKernel(Index2 index, ArrayView<byte> out_bitmap)
    {
        var N = new SVector3Df(0.5f, 0.5f, 0.5f);
        out_bitmap[0] = (byte)_CalculatePixelValue(N);
    }

    private static ushort _CalculatePixelValue(SVector3Df N)
    {
        float param1 = 0.5f;
        float param2 = 0.5f;
        float param3 = 0.5f;
        float shininess = 100.0f;
        var H = new SVector3Df(0.5f, 0.5f, 0.5f);
        var spec_coeff = (float)XMath.Pow(XMath.Max(H.DotProduct(N), 0.0f), shininess);
        var color = spec_coeff;
        return (ushort)XMath.Min(255.0f, spec_coeff);
    }

    public static void Main()
    {
        foreach (var acceleratorId in Accelerator.Accelerators)
        {
            using (var GpuContext = new ILGPU.Context())
            using (var GpuAccelerator = Accelerator.Create(GpuContext, acceleratorId))
            using (var out_bitmap = GpuAccelerator.Allocate<byte>(1))
            {
                GpuContext.EnableAlgorithms();
                var kernel = GpuAccelerator.LoadStreamKernel(new Action<Index2, ArrayView<byte>>(_MyKernel));
                kernel(
                    new Index2(1, 1),
                    out_bitmap
                );
                GpuAccelerator.Synchronize();
            }
        }
    }

    public struct SVector3Df
    {
        public float X { get; set; }
        public float Y { get; set; }
        public float Z { get; set; }

        public SVector3Df(float x, float y, float z)
        {
            X = x;
            Y = y;
            Z = z;
        }

        public float DotProduct(SVector3Df vector)
        {
            return X * vector.X + Y * vector.Y + Z * vector.Z;
        }
    }
}
from ilgpu.
hi @nguyenvuduc, @m4rs-mt,
From my quick initial investigation, it looks like the issue is that EnableAlgorithms is called AFTER the Accelerator has been created.
If I call EnableAlgorithms immediately after the ILGPU.Context has been created, then it all works.
FIXED CODE:
public static void Main()
{
    foreach (var acceleratorId in Accelerator.Accelerators)
    {
        using (var GpuContext = new ILGPU.Context())
        {
            GpuContext.EnableAlgorithms();
            using (var GpuAccelerator = Accelerator.Create(GpuContext, acceleratorId))
            using (var out_bitmap = GpuAccelerator.Allocate<byte>(1))
            {
                var kernel = GpuAccelerator.LoadStreamKernel(new Action<Index2, ArrayView<byte>>(_MyKernel));
                kernel(
                    new Index2(1, 1),
                    out_bitmap
                );
                GpuAccelerator.Synchronize();
            }
        }
    }
}
QUESTION: Not sure why @m4rs-mt cannot reproduce the issue. Is it a race condition?
QUESTION: Not sure why calling EnableAlgorithms after the accelerator has been created is a problem. Does the accelerator take a copy of the Intrinsics Manager?
from ilgpu.
@MoFtZ Thank you for examining this issue. You are right: EnableAlgorithms registers all intrinsics within a global (contextual) IntrinsicImplementationManager hosted by an ILGPU Context instance. When an Accelerator instance is instantiated, the corresponding Backend instance is created. This in turn triggers a specialization phase of all registered intrinsics. Consequently, the problem is related to the location of the EnableAlgorithms call: the Accelerator instance does not recognize the intrinsics, since they have not been registered yet.
@MoFtZ @nguyenvuduc The reason why I could not reproduce this issue was a mistake on my part. I accidentally moved the call to EnableAlgorithms because I expected a bug inside the intrinsic specialization phases that take place within each backend (based on the original bug report). Basically I was looking in the wrong place and accidentally "circumvented" the actual issue.
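To illustrate the ordering problem described in this comment, here is a toy model in plain C#. It does not use ILGPU at all: `ToyContext`, `ToyAccelerator`, and `CanCompile` are hypothetical names invented for this sketch, and the idea that specialization happens exactly once, when the accelerator is constructed, is an assumption taken from the explanation above rather than from ILGPU's source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a context that hosts a registry of intrinsics.
public sealed class ToyContext
{
    private readonly HashSet<string> registered = new HashSet<string>();

    // Registers the extra intrinsics (here just "Pow") with the context.
    public void EnableAlgorithms() => registered.Add("Pow");

    public IEnumerable<string> Registered => registered;
}

// Hypothetical stand-in for an accelerator whose backend specializes
// whatever is registered at the moment it is constructed.
public sealed class ToyAccelerator
{
    private readonly HashSet<string> specialized;

    public ToyAccelerator(ToyContext context)
    {
        // Snapshot: intrinsics registered later are never seen.
        specialized = new HashSet<string>(context.Registered);
    }

    public bool CanCompile(string intrinsic) => specialized.Contains(intrinsic);
}
```

In this model, an accelerator built before `EnableAlgorithms()` is called reports `CanCompile("Pow") == false`, while one built afterwards reports `true`, which mirrors the difference between the original and the fixed `Main`.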
from ilgpu.
I found a temporary workaround for this bug: replace XMath.Pow(@base, exp) with XMath.Exp2(exp * XMath.Log2(@base)) and it will work.
However, I think this is still a bug, since I can see the PTX implementation in ILGPU.Algorithms trying to substitute the Pow function with exp and log functions, and that should work.
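The workaround relies on the identity pow(b, e) = exp(e * log(b)), which holds for b > 0 in any logarithm base. A minimal CPU-side sketch of that identity is shown below. It uses `System.Math` with the natural exp/log pair, since `System.Math` has no `Exp2`; `PowWorkaround` and `PowViaExpLog` are hypothetical names for this illustration and are not part of ILGPU.

```csharp
using System;

public static class PowWorkaround
{
    // Computes b^e as exp(e * ln(b)).
    // Valid for b > 0. For b == 0 and e > 0, ln(0) is negative infinity,
    // so the result is exp(-infinity) == 0, matching Math.Pow(0, e).
    // Negative bases are not supported: ln(b) is NaN for b < 0.
    public static double PowViaExpLog(double b, double e)
    {
        return Math.Exp(e * Math.Log(b));
    }
}
```

For the values in the sample kernel (a dot product of 0.75 raised to a shininess of 100), `PowWorkaround.PowViaExpLog(0.75, 100.0)` should agree with `Math.Pow(0.75, 100.0)` up to floating-point rounding. The clamp to zero via `XMath.Max(..., 0.0f)` in the kernel keeps the base non-negative, which is what makes the substitution safe there.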
from ilgpu.
@nguyenvuduc Thank you for your very detailed bug report. It seems to be an internal ILGPU intrinsic-specialization error. I will take a closer look 🔢.
from ilgpu.
I have investigated the potential issue in more detail and tested different configurations. Unfortunately, I cannot reproduce your issue at the moment. For instance, the following code can be compiled without any further problems on all test machines:
static void TestPow(Index index, ArrayView<float> view, float val)
{
    view[index] = XMath.Pow(2.0f, val);
}
Can you provide a minimal example that crashes on your machine?
from ilgpu.
@nguyenvuduc Thank you for your sample program for reproducing the issue. I will take a closer look at it.
from ilgpu.
Unfortunately I have to admit that I am still unable to reproduce the problem on three different NVIDIA GPUs. Can you give me more information about your execution environment? All kernels could be specialized without further problems. I suspect the problem could be related to an invalid intrinsic initialization phase (but this is just a wild guess at this point).
I also recommend using a single Context instance for the whole program. This reduces runtime and processing overhead considerably and enables internal caches. This context instance is compatible with several accelerators at the same time.
from ilgpu.
Hi, I think you are right. The exception doesn't occur when I change the code I sent you so that it uses a single Context instance for the whole program.
Here is my execution environment:
- Windows 10 Pro
- Processor: Intel Core i7-6700 CPU @ 3.4GHz
- RAM: 16GB
- Display cards:
  - Intel(R) HD Graphics 530
  - NVIDIA GeForce GTX 750 Ti
- List of accelerators discovered by ILGPU (in the order they appear at runtime):
  - CPUAccelerator [WarpSize: 1, MaxNumThreadsPerGroup: 8, MemorySize: 9223372036854775807]
  - GeForce GTX 750 Ti [WarpSize: 32, MaxNumThreadsPerGroup: 1024, MemorySize: 2147483648]
  - Intel(R) HD Graphics 530 [WarpSize: 16, MaxNumThreadsPerGroup: 256, MemorySize: 6831112192]
  - Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz [WarpSize: 128, MaxNumThreadsPerGroup: 8192, MemorySize: 17077784576]
from ilgpu.
hi @m4rs-mt, I am able to reproduce this issue. I'll try and make some time to look into this later today or tomorrow.
Environment:
- Windows 10 Pro (1809)
- GeForce GTX 1070
- CUDA Runtime v10.2
- Graphics Driver 441.22
from ilgpu.
hi @nguyenvuduc, have you managed to fix your code by moving the call to EnableAlgorithms earlier?
Can this issue be closed now?
from ilgpu.
hi @MoFtZ
Yes, I did two things as you and @m4rs-mt suggested: (1) call EnableAlgorithms right after the Context is created, before creating any Accelerator, and (2) use a single Context instance for the whole program. This has fixed the issue.
I will close this issue.
from ilgpu.