Comments (12)

nguyenvuduc commented on May 21, 2024

Sorry it took me so long to respond. My kernel is very large and spans many classes, so I had to cut it down to give you something easier to read. The code below is the piece that triggers the exception.
If you delete any line of the kernel code, the exception disappears.

public class MyComplexKernel
{
    private static void _MyKernel(Index2 index, ArrayView<byte> out_bitmap)
    {
        var N = new SVector3Df(0.5f, 0.5f, 0.5f);
        out_bitmap[0] = (byte)_CalculatePixelValue(N);
    }

    private static ushort _CalculatePixelValue(SVector3Df N)
    {
        float param1 = 0.5f;
        float param2 = 0.5f;
        float param3 = 0.5f;
        float shininess = 100.0f;
        var H = new SVector3Df(0.5f, 0.5f, 0.5f);
        var spec_coeff = (float)XMath.Pow(XMath.Max(H.DotProduct(N), 0.0f), shininess);

        var color = spec_coeff;
        return (ushort)XMath.Min(255.0f, spec_coeff);
    }

    public static void Main()
    {
        foreach (var acceleratorId in Accelerator.Accelerators)
        {
            using (var GpuContext = new ILGPU.Context())
            using (var GpuAccelerator = Accelerator.Create(GpuContext, acceleratorId))
            using (var out_bitmap = GpuAccelerator.Allocate<byte>(1))
            {
                GpuContext.EnableAlgorithms();
                var kernel = GpuAccelerator.LoadStreamKernel(new Action<Index2, ArrayView<byte>>(_MyKernel));

                kernel(
                    new Index2(1, 1),
                    out_bitmap
                );

                GpuAccelerator.Synchronize();
            }
        }
    }


    public struct SVector3Df
    {
        public float X { get; set; }
        public float Y { get; set; }
        public float Z { get; set; }

        public SVector3Df(float x, float y, float z)
        {
            X = x;
            Y = y;
            Z = z;
        }

        public float DotProduct(SVector3Df vector)
        {
            return X * vector.X + Y * vector.Y + Z * vector.Z;
        }
    }
}

from ilgpu.

MoFtZ commented on May 21, 2024

hi @nguyenvuduc, @m4rs-mt,

From my quick initial investigation, the issue appears to occur because EnableAlgorithms is called AFTER the Accelerator has been created.

If I call EnableAlgorithms immediately after the ILGPU.Context has been created, and before the Accelerator is created, then it all works.

FIXED CODE:


public static void Main()
{
    foreach (var acceleratorId in Accelerator.Accelerators)
    {
        using (var GpuContext = new ILGPU.Context())
        {
            GpuContext.EnableAlgorithms();
            
            using (var GpuAccelerator = Accelerator.Create(GpuContext, acceleratorId))  
            using (var out_bitmap = GpuAccelerator.Allocate<byte>(1))
            {
                var kernel = GpuAccelerator.LoadStreamKernel(new Action<Index2, ArrayView<byte>>(_MyKernel));

                kernel(
                    new Index2(1, 1),
                    out_bitmap
                );

                GpuAccelerator.Synchronize();
            }
        }
    }
}

QUESTION: I am not sure why @m4rs-mt cannot reproduce the issue. Could it be a race condition?

QUESTION: I am not sure why calling EnableAlgorithms after the accelerator has been created is a problem. Does the accelerator take a copy of the Intrinsics Manager?

m4rs-mt commented on May 21, 2024

@MoFtZ Thank you for examining this issue. You are right: EnableAlgorithms registers all intrinsics within a global (contextual) IntrinsicImplementationManager hosted by an ILGPU Context instance. When an Accelerator instance is instantiated, the corresponding Backend instance is created. This in turn triggers a specialization phase of all registered intrinsics. Consequently, the problem is related to the location of the EnableAlgorithms call: if it comes after the Accelerator has been created, the Accelerator instance does not recognize the intrinsics, since they had not been registered yet when the specialization phase ran.
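
The ordering dependency described above can be illustrated with a toy model in plain C#. This is only a sketch and not ILGPU's implementation: `ToyContext`, `ToyAccelerator`, `EnableToyAlgorithms`, and `OrderingDemo` are hypothetical names, and the model assumes (as the explanation suggests) that the accelerator resolves the registered intrinsics once, at construction time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a context that hosts a registry of intrinsics.
public sealed class ToyContext
{
    private readonly HashSet<string> registered = new HashSet<string>();

    // Stand-in for an EnableAlgorithms-style call: registers an extra intrinsic.
    public void EnableToyAlgorithms() => registered.Add("Pow");

    public IReadOnlyCollection<string> Registered => registered;
}

// Hypothetical stand-in for an accelerator whose backend specializes
// whatever intrinsics are registered at the moment it is constructed.
public sealed class ToyAccelerator
{
    private readonly HashSet<string> specialized;

    public ToyAccelerator(ToyContext context)
    {
        // Snapshot: anything registered later is invisible to this instance.
        specialized = new HashSet<string>(context.Registered);
    }

    public bool CanCompile(string intrinsic) => specialized.Contains(intrinsic);
}

public static class OrderingDemo
{
    // Register first, then create the accelerator: the intrinsic is known.
    public static bool RegisterThenCreate()
    {
        var context = new ToyContext();
        context.EnableToyAlgorithms();
        return new ToyAccelerator(context).CanCompile("Pow");
    }

    // Create the accelerator first, then register: the snapshot misses it.
    public static bool CreateThenRegister()
    {
        var context = new ToyContext();
        var accelerator = new ToyAccelerator(context);
        context.EnableToyAlgorithms();
        return accelerator.CanCompile("Pow");
    }
}
```

In this model `RegisterThenCreate` finds the intrinsic and `CreateThenRegister` does not, which mirrors the difference between the fixed code and the original report.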

@MoFtZ @nguyenvuduc The reason why I could not reproduce this issue was a mistake on my part. I accidentally moved the call to EnableAlgorithms because I expected a bug inside the intrinsic specialization phases that take place within each backend (based on the original bug report). Basically I was looking in the wrong place and accidentally "circumvented" the actual issue.

nguyenvuduc commented on May 21, 2024

I found a temporary workaround for this bug. Just replace XMath.Pow(@base, exp) with XMath.Exp2(exp * XMath.Log2(@base)) and it will work.

However, I think this is still a bug: I can see that the PTX implementation in ILGPU.Algorithms tries to substitute the Pow function with exp and log functions, so the original call should work.
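
The workaround relies on the identity pow(b, e) = 2^(e * log2(b)), which holds for positive bases. The sketch below checks that identity on the host with `System.Math` only; it does not use ILGPU, and `PowIdentity`, `PowViaExp2Log2`, and `RelativeError` are hypothetical helper names introduced for illustration.

```csharp
using System;

public static class PowIdentity
{
    // Rewrites pow(b, e) as 2^(e * log2(b)); intended for b > 0.
    // Math.Log(b, 2.0) computes the base-2 logarithm of b.
    public static double PowViaExp2Log2(double b, double e)
        => Math.Pow(2.0, e * Math.Log(b, 2.0));

    // Relative difference between the rewritten form and Math.Pow.
    public static double RelativeError(double b, double e)
    {
        double expected = Math.Pow(b, e);
        double actual = PowViaExp2Log2(b, e);
        return Math.Abs(actual - expected) / Math.Abs(expected);
    }
}
```

For the values in the sample kernel (a dot product of 0.75 raised to a shininess of 100), `PowIdentity.RelativeError(0.75, 100.0)` should be close to floating-point rounding error. The kernel clamps the base with XMath.Max(..., 0.0f), so a base of exactly zero is possible there, and the rewritten form may behave differently from Pow in edge cases such as a zero base with a zero exponent.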

m4rs-mt commented on May 21, 2024

@nguyenvuduc Thank you for your very detailed bug report. It seems to be an internal ILGPU intrinsic-specialization error. I will take a closer look 🔢.

m4rs-mt commented on May 21, 2024

I have investigated the potential issue in more detail and tested different configurations. Unfortunately, I cannot reproduce your issue at the moment. For instance, the following code can be compiled without any further problems on all test machines:

static void TestPow(Index index, ArrayView<float> view, float val)
{
    view[index] = XMath.Pow(2.0f, val);
}

Can you provide a minimal example that crashes on your machine?

m4rs-mt commented on May 21, 2024

@nguyenvuduc Thank you for your sample program for reproducing the issue. I will take a closer look at it.

m4rs-mt commented on May 21, 2024

Unfortunately I have to admit that I am still unable to reproduce the problem on three different NVIDIA GPUs. Can you give me more information about your execution environment? All kernels could be specialized without further problems. I suspect the problem could be related to an invalid intrinsic initialization phase (but this is just a wild guess at this point).

I also recommend using a single Context instance for the whole program. This reduces runtime and processing overhead considerably and enables internal caches. This context instance is compatible with several accelerators at the same time.

nguyenvuduc commented on May 21, 2024

Hi, I think you are right. The exception no longer occurs after I changed the code I sent you so that it uses a single Context instance for the whole program.

Here is my execution environment:

  • Windows 10 Pro
  • Processor: Intel Core i7-6700 CPU @ 3.4GHz
  • RAM: 16 GB
  • Display cards:
    • Intel(R) HD Graphics 530
    • NVIDIA GeForce GTX 750 Ti
  • Accelerators discovered by ILGPU (in the order they appear at runtime):
    • CPUAccelerator [WarpSize: 1, MaxNumThreadsPerGroup: 8, MemorySize: 9223372036854775807]
    • GeForce GTX 750 Ti [WarpSize: 32, MaxNumThreadsPerGroup: 1024, MemorySize: 2147483648]
    • Intel(R) HD Graphics 530 [WarpSize: 16, MaxNumThreadsPerGroup: 256, MemorySize: 6831112192]
    • Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz [WarpSize: 128, MaxNumThreadsPerGroup: 8192, MemorySize: 17077784576]

MoFtZ commented on May 21, 2024

hi @m4rs-mt, I am able to reproduce this issue. I'll try and make some time to look into this later today or tomorrow.

Environment:

  • Windows 10 Pro (1809)
  • GeForce GTX 1070
  • CUDA Runtime v10.2
  • Graphics Driver 441.22

MoFtZ commented on May 21, 2024

hi @nguyenvuduc, have you managed to fix your code by moving the call to EnableAlgorithms earlier, before the Accelerator is created?

Can this issue be closed now?

nguyenvuduc commented on May 21, 2024

hi @MoFtZ
Yes, I did two things as you and @m4rs-mt suggested: (1) call EnableAlgorithms right after the Context is created, before any Accelerator is created, and (2) use a single Context instance for the whole program. Together these fixed the issue.

I will close this issue.
