Comments (21)

jrfonseca commented on June 7, 2024

IIUC, the issue is tied to gl*Pointer with GL_HALF_FLOAT.

I just tried apitrace with https://gitlab.freedesktop.org/mesa/demos/-/blob/main/src/trivial/vp-array-hf.c and it worked alright.

I can't download either of the mega.nz links. Can you share this or any other trace that shows the issue?

That or modify vp-array-hf to exercise the problem.

from apitrace.

zmike commented on June 7, 2024

Strange, the mega links still work for me.

I've re-uploaded one of the traces to https://gitlab.freedesktop.org/zmike/dumps/-/blob/master/nep-fantasy-zone-compressed.trace so it should be accessible now.

jrfonseca commented on June 7, 2024

Thanks. Maybe my ISP is blocking mega or their local servers are broken, as it just shows me a cloud and nothing clickable. I got it from your link now.

jrfonseca commented on June 7, 2024

I don't think the issue is related to GL_HALF_FLOAT.

The issue is that the glDrawElements at call 2191125 is preceded by the fake gl*Pointer calls, but the glDrawElements at call 2191160 somehow is not:

2191117 glScissor(x = 0, y = 0, width = 1920, height = 1088)
2191118 glClientActiveTexture(texture = GL_TEXTURE0) // fake
2191119 glTexCoordPointer(size = 2, type = GL_HALF_FLOAT, stride = 32, pointer = blob(259044)) // fake
2191120 glClientActiveTexture(texture = GL_TEXTURE4) // fake
2191121 glColorPointer(size = 4, type = GL_UNSIGNED_BYTE, stride = 32, pointer = blob(259044)) // fake
2191122 glNormalPointer(type = GL_HALF_FLOAT, stride = 32, pointer = blob(259046)) // fake
2191123 glVertexPointer(size = 3, type = GL_FLOAT, stride = 32, pointer = blob(259052)) // fake
2191124 glVertexAttribPointer(index = 14, size = 3, type = GL_HALF_FLOAT, normalized = GL_FALSE, stride = 32, pointer = blob(259046)) // fake
2191125 glDrawElements(mode = GL_TRIANGLES, count = 1044, type = GL_UNSIGNED_INT, indices = blob(4176))
2191126 glMatrixMode(mode = GL_PROJECTION)
2191127 glLoadIdentity()
2191128 glMatrixMode(mode = GL_MODELVIEW)
2191129 glLoadIdentity()
2191130 glFrontFace(mode = GL_CCW)
2191131 glPolygonMode(face = GL_FRONT_AND_BACK, mode = GL_FILL)
2191132 glDisable(cap = GL_LIGHTING)
2191133 glEnable(cap = GL_TEXTURE_2D)
2191134 glClientActiveTexture(texture = GL_TEXTURE0)
2191135 glActiveTexture(texture = GL_TEXTURE0)
2191136 glBindTexture(target = GL_TEXTURE_2D, texture = 263)
2191137 glBindMultiTextureEXT(texunit = GL_TEXTURE0, target = GL_TEXTURE_2D, texture = 263)
2191138 glEnable(cap = GL_TEXTURE_2D)
2191139 glClientActiveTexture(texture = GL_TEXTURE1)
2191140 glActiveTexture(texture = GL_TEXTURE1)
2191141 glBindTexture(target = GL_TEXTURE_2D, texture = 264)
2191142 glBindMultiTextureEXT(texunit = GL_TEXTURE1, target = GL_TEXTURE_2D, texture = 264)
2191143 glEnable(cap = GL_TEXTURE_2D)
2191144 glClientActiveTexture(texture = GL_TEXTURE2)
2191145 glActiveTexture(texture = GL_TEXTURE2)
2191146 glBindTexture(target = GL_TEXTURE_2D, texture = 265)
2191147 glBindMultiTextureEXT(texunit = GL_TEXTURE2, target = GL_TEXTURE_2D, texture = 265)
2191148 glEnable(cap = GL_TEXTURE_2D)
2191149 glClientActiveTexture(texture = GL_TEXTURE3)
2191150 glActiveTexture(texture = GL_TEXTURE3)
2191151 glBindTexture(target = GL_TEXTURE_2D, texture = 10)
2191152 glBindMultiTextureEXT(texunit = GL_TEXTURE3, target = GL_TEXTURE_2D, texture = 10)
2191153 glEnable(cap = GL_TEXTURE_2D)
2191154 glClientActiveTexture(texture = GL_TEXTURE4)
2191155 glActiveTexture(texture = GL_TEXTURE4)
2191156 glBindTexture(target = GL_TEXTURE_2D, texture = 262)
2191157 glBindMultiTextureEXT(texunit = GL_TEXTURE7, target = GL_TEXTURE_2D, texture = 262)
2191158 glViewport(x = 0, y = 0, width = 1920, height = 1088)
2191159 glScissor(x = 0, y = 0, width = 1920, height = 1088)
2191160 glDrawElements(mode = GL_TRIANGLES, count = 3912, type = GL_UNSIGNED_INT, indices = blob(15648))

Probably 2191160's indices have a larger maximum than 2191125's, so it ends up triggering a buffer overflow. Either way, the glDrawElements call should always be preceded by fake gl*Pointer calls, as data might change between calls.

So the real question here is why didn't apitrace emit those fake calls? apitrace's _need_user_arrays() must have returned false somehow...

There are two possibilities here:

  1. one is that apitrace has a bug
  2. the other is that one of Mesa's glGet* calls is returning bogus data, misleading apitrace into thinking no user arrays are being used.

Possibility 2 happened a few times before...
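
One way to look for other draws in the same state is to scan the text dump for draw calls that have no fake gl*Pointer call since the previous draw. The following is a rough Python sketch, assuming the dump line format shown above (call number, call, optional `// fake` suffix). The function name `draws_without_fake_pointers` is made up for illustration, and a hit is only a candidate: a draw that sources everything from buffer objects legitimately has no fake calls.

```python
import re

# Matches "<call number> <function name>(" at the start of a dump line.
CALL_RE = re.compile(r'^(\d+)\s+(\w+)\(')

# Draw entry points considered here; glDrawBuffer(s) is deliberately excluded.
DRAW_PREFIXES = ('glDrawArrays', 'glDrawElements', 'glDrawRangeElements')


def draws_without_fake_pointers(dump_lines):
    """Return call numbers of draw calls that had no fake gl*Pointer
    call since the previous draw call."""
    suspicious = []
    saw_fake_pointer = False
    for line in dump_lines:
        match = CALL_RE.match(line.strip())
        if not match:
            continue
        call_no, name = int(match.group(1)), match.group(2)
        if name.endswith('Pointer') and line.rstrip().endswith('// fake'):
            saw_fake_pointer = True
        elif name.startswith(DRAW_PREFIXES):
            if not saw_fake_pointer:
                suspicious.append(call_no)
            saw_fake_pointer = False
    return suspicious


# A few lines taken from the dump above.
sample = [
    "2191123 glVertexPointer(size = 3, type = GL_FLOAT, stride = 32, pointer = blob(259052)) // fake",
    "2191125 glDrawElements(mode = GL_TRIANGLES, count = 1044, type = GL_UNSIGNED_INT, indices = blob(4176))",
    "2191159 glScissor(x = 0, y = 0, width = 1920, height = 1088)",
    "2191160 glDrawElements(mode = GL_TRIANGLES, count = 3912, type = GL_UNSIGNED_INT, indices = blob(15648))",
]
print(draws_without_fake_pointers(sample))  # [2191160]
```

Feeding it the full text dump line by line would list every draw worth a closer look.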

zmike commented on June 7, 2024

Hm interesting, and thanks for checking.

If you can point me to the potentially problematic glGet* calls I can try looking, though this part of Mesa is not exactly my wheelhouse.

jrfonseca commented on June 7, 2024

If I retrace the trace on NVIDIA, I see the fake calls:

$ apitrace trim --calls 0-2191160 --output nep-fantasy-zone-trimmed.trace nep-fantasy-zone-compressed.trace
$ apitrace trace -o glretrace.trace glretrace nep-fantasy-zone-trimmed.trace
$ apitrace dump -v glretrace.trace | tail -25
[...]
3805318 glViewport(x = 0, y = 0, width = 1920, height = 1088)
3805320 glScissor(x = 0, y = 0, width = 1920, height = 1088)
3805322 glClientActiveTexture(texture = GL_TEXTURE0) // fake
3805323 glTexCoordPointer(size = 2, type = GL_HALF_FLOAT, stride = 32, pointer = blob(516324)) // fake
3805324 glClientActiveTexture(texture = GL_TEXTURE4) // fake
3805325 glColorPointer(size = 4, type = GL_UNSIGNED_BYTE, stride = 32, pointer = blob(516324)) // fake
3805326 glNormalPointer(type = GL_HALF_FLOAT, stride = 32, pointer = blob(516326)) // fake
3805327 glVertexPointer(size = 3, type = GL_FLOAT, stride = 32, pointer = blob(516332)) // fake
3805328 glVertexAttribPointer(index = 14, size = 3, type = GL_HALF_FLOAT, normalized = GL_FALSE, stride = 32, pointer = blob(516326)) // fake
3805329 glDrawElements(mode = GL_TRIANGLES, count = 3912, type = GL_UNSIGNED_INT, indices = blob(15648))
3805331 glFinish()

But I repeated with Mesa llvmpipe, and it also worked fine. Not sure if the issue is specific to zink, or random.

Perhaps you can repeat the experiment locally with zink, and see if those are not traced?
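
The blob sizes in the two dumps are consistent with the earlier guess that the second draw reaches further into the arrays. The sketch below assumes each fake pointer blob covers `max_index * stride + element_bytes` bytes. That formula is inferred from the numbers in the dumps, not taken from apitrace's source, and the helper names are hypothetical.

```python
def user_array_bytes(max_index, stride, element_bytes):
    """Bytes needed to cover elements 0..max_index of an interleaved array
    (assumed formula, inferred from the blob sizes in the dumps)."""
    return max_index * stride + element_bytes


def max_index_from_blob(blob_bytes, stride, element_bytes):
    """Invert user_array_bytes to recover the largest index a blob covers."""
    return (blob_bytes - element_bytes) // stride


STRIDE = 32
VERTEX_BYTES = 3 * 4  # glVertexPointer(size = 3, type = GL_FLOAT)

first = max_index_from_blob(259052, STRIDE, VERTEX_BYTES)   # before call 2191125
second = max_index_from_blob(516332, STRIDE, VERTEX_BYTES)  # before call 3805329
print(first, second)  # 8095 16135
```

Under that assumption the first draw touches indices up to 8095 and the second up to 16135, so replaying the second draw against arrays captured for the first would read well past the end of the captured data.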

zmike commented on June 7, 2024
3804400 glGetError() = GL_NO_ERROR
3804401 glActiveTexture(texture = GL_TEXTURE3)
3804402 glGetError() = GL_NO_ERROR
3804403 glBindTexture(target = GL_TEXTURE_2D, texture = 10)
3804404 glGetError() = GL_NO_ERROR
3804405 glBindMultiTextureEXT(texunit = GL_TEXTURE3, target = GL_TEXTURE_2D, texture = 10)
3804406 glGetError() = GL_NO_ERROR
3804407 glEnable(cap = GL_TEXTURE_2D)
3804408 glGetError() = GL_NO_ERROR
3804409 glClientActiveTexture(texture = GL_TEXTURE4)
3804410 glGetError() = GL_NO_ERROR
3804411 glActiveTexture(texture = GL_TEXTURE4)
3804412 glGetError() = GL_NO_ERROR
3804413 glBindTexture(target = GL_TEXTURE_2D, texture = 262)
3804414 glGetError() = GL_NO_ERROR
3804415 glBindMultiTextureEXT(texunit = GL_TEXTURE7, target = GL_TEXTURE_2D, texture = 262)
3804416 glGetError() = GL_NO_ERROR
3804417 glGetIntegerv(pname = GL_DRAW_FRAMEBUFFER_BINDING, params = &4)
3804418 glViewport(x = 0, y = 0, width = 1920, height = 1088)
3804419 glGetError() = GL_NO_ERROR
3804420 glScissor(x = 0, y = 0, width = 1920, height = 1088)
3804421 glGetError() = GL_NO_ERROR
3804422 glDrawElements(mode = GL_TRIANGLES, count = 3912, type = GL_UNSIGNED_INT, indices = blob(15648))
3804423 glGetError() = GL_NO_ERROR
3804424 glFinish()

zmike commented on June 7, 2024

Captured again with -b to strip out the glGetError calls

1919775 glEnable(cap = GL_TEXTURE_2D)                          
1919776 glClientActiveTexture(texture = GL_TEXTURE1)
1919777 glActiveTexture(texture = GL_TEXTURE1)
1919778 glBindTexture(target = GL_TEXTURE_2D, texture = 264)
1919779 glBindMultiTextureEXT(texunit = GL_TEXTURE1, target = GL_TEXTURE_2D, texture = 264)
1919780 glEnable(cap = GL_TEXTURE_2D)
1919781 glClientActiveTexture(texture = GL_TEXTURE2)
1919782 glActiveTexture(texture = GL_TEXTURE2)
1919783 glBindTexture(target = GL_TEXTURE_2D, texture = 265)
1919784 glBindMultiTextureEXT(texunit = GL_TEXTURE2, target = GL_TEXTURE_2D, texture = 265)
1919785 glEnable(cap = GL_TEXTURE_2D)
1919786 glClientActiveTexture(texture = GL_TEXTURE3)
1919787 glActiveTexture(texture = GL_TEXTURE3)
1919788 glBindTexture(target = GL_TEXTURE_2D, texture = 10)
1919789 glBindMultiTextureEXT(texunit = GL_TEXTURE3, target = GL_TEXTURE_2D, texture = 10)
1919790 glEnable(cap = GL_TEXTURE_2D)
1919791 glClientActiveTexture(texture = GL_TEXTURE4)
1919792 glActiveTexture(texture = GL_TEXTURE4)
1919793 glBindTexture(target = GL_TEXTURE_2D, texture = 262)
1919794 glBindMultiTextureEXT(texunit = GL_TEXTURE7, target = GL_TEXTURE_2D, texture = 262)
1919795 glGetIntegerv(pname = GL_DRAW_FRAMEBUFFER_BINDING, params = &4)
1919796 glViewport(x = 0, y = 0, width = 1920, height = 1088)
1919797 glScissor(x = 0, y = 0, width = 1920, height = 1088)
1919798 glDrawElements(mode = GL_TRIANGLES, count = 3912, type = GL_UNSIGNED_INT, indices = blob(15648))
1919799 glFinish()

Does indeed seem like all those faked calls are missing.

jrfonseca commented on June 7, 2024

Thanks. So you're reliably reproducing the issue.

Please build apitrace from source applying the following patch:

diff --git a/wrappers/gltrace.py b/wrappers/gltrace.py
index f48d7a04..248db80f 100644
--- a/wrappers/gltrace.py
+++ b/wrappers/gltrace.py
@@ -483,6 +483,8 @@ class GlTracer(Tracer):
 
             print('    GLMemoryShadow::commitAllWrites(_ctx, trace::fakeMemcpy);')
 
+            if function.name == 'glDrawElements':
+                print(r'   if(count == 3912) __asm("int3");')
             print('    if (_need_user_arrays(_ctx)) {')
             if 'Indirect' in function.name:
                 print(r'        os::log("apitrace: warning: %s: indirect user arrays not supported\n");' % (function.name,))

This will trigger a breakpoint when tracing the troublesome draw call. Then trace under gdb by running:

$ apitrace trace --debug -o glretrace.trace glretrace nep-fantasy-zone-trimmed.trace
[...]
(gdb) run
[...]
Program received signal SIGTRAP, Trace/breakpoint trap.
glDrawElements (mode=4, count=3912, type=5125, indices=0x555576c897b0) at /home/jfonseca/projects/apitrace/wrappers/glxtrace.cpp:26426
26426	    if (_need_user_arrays(_ctx)) {

Then step through from here. _need_user_arrays should return true. For me it returns true after the _glClientActiveTexture call.

zmike commented on June 7, 2024

Hm which line are you saying it returns from exactly?

jrfonseca commented on June 7, 2024

It's in wrappers/glxtrace.cpp, which is generated at build time, so line numbers might vary. For me it returned true here:

    // glTexCoordPointer
    if ((profile.desktop() || es1)) {
        GLint max_units = 0;
        if (_ctx->profile.desktop())
            _glGetIntegerv(GL_MAX_TEXTURE_COORDS, &max_units);
        else
            _glGetIntegerv(GL_MAX_TEXTURE_UNITS, &max_units);
        GLint client_active_texture = GL_TEXTURE0;
        if (max_units > 0) {
            _glGetIntegerv(GL_CLIENT_ACTIVE_TEXTURE, &client_active_texture);
        }
        GLint unit = 0;
        do {
            GLint texture = GL_TEXTURE0 + unit;
            if (max_units > 0) {
                _glClientActiveTexture(texture);
            }
            if (_glIsEnabled(GL_TEXTURE_COORD_ARRAY) &&
                _glGetInteger(GL_TEXTURE_COORD_ARRAY_BUFFER_BINDING) == 0) {
                if (max_units > 0) {
                    _glClientActiveTexture(client_active_texture);
                }
                return true;   <==============================================
            }
        } while (++unit < max_units);
        if (max_units > 0) {
            _glClientActiveTexture(client_active_texture);
        }
    }

But even if that hadn't returned true early, it should also return true at any of the following points:

    // glColorPointer
    if ((profile.desktop() || es1)) {
        if (_glIsEnabled(GL_COLOR_ARRAY) &&
            _glGetInteger(GL_COLOR_ARRAY_BUFFER_BINDING) == 0) {
            return true;  <==========================================
        }
    }

    // glNormalPointer
    if ((profile.desktop() || es1)) {
        if (_glIsEnabled(GL_NORMAL_ARRAY) &&
            _glGetInteger(GL_NORMAL_ARRAY_BUFFER_BINDING) == 0) {
            return true; <=================================================
        }
    }

    // glVertexPointer
    if ((profile.desktop() || es1)) {
        if (_glIsEnabled(GL_VERTEX_ARRAY) &&
            _glGetInteger(GL_VERTEX_ARRAY_BUFFER_BINDING) == 0) {
            return true;   <==============================================
        }
    }

    // ES1 does not support generic vertex attributes
    if (es1)
        return false;

    // glVertexAttribPointer
    GLint _max_vertex_attribs = _glGetInteger(GL_MAX_VERTEX_ATTRIBS);
    for (GLint index = 0; index < _max_vertex_attribs; ++index) {
        if (_glGetVertexAttribi(index, GL_VERTEX_ATTRIB_ARRAY_ENABLED) &&
            _glGetVertexAttribi(index, GL_VERTEX_ATTRIB_ARRAY_BUFFER_BINDING) == 0) {
            return true;  <=================
        }
    }

That is, it should never reach the return false at the end of the function.

zmike commented on June 7, 2024

Alright, I'm closing in on something. _glGetInteger(GL_TEXTURE_COORD_ARRAY_BUFFER_BINDING) is returning -1, but only sometimes. Other times it (correctly) returns 0.

jrfonseca commented on June 7, 2024

Ah, -1 is a weird number though. It can't be a true buffer binding.

Mesa often changes the dispatch tables to do things. That in turn could be causing the glGet to return wrong data. Perhaps something like this is happening here.

I'm surprised I don't see the bug with Xlib state tracker + llvmpipe though. AFAICT, Zink is just another gallium driver, so it's unclear why it should be affected any more than for example llvmpipe.

zmike commented on June 7, 2024

It seems to be related to glthread? Or at least I'm sometimes getting different values running with mesa_glthread=false since zink enables it by default. Maybe if you use mesa_glthread=true with llvmpipe you'll see the issue there?

zmike commented on June 7, 2024

2180847 Seems to be the first broken draw call, or at least it's the first one where checking GL_TEXTURE_COORD_ARRAY_BUFFER_BINDING returns -1.

Following this, I think I've found the issue: glthread does internal buffer object allocation from user data, and all its buffers get allocated with id = -1. Thus, binding one of these buffers will return -1 as the ARRAY_BUFFER_BINDING value. That confuses the replayer, which expects 0.
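
The effect on the tracer can be modelled in a few lines. This is a simplified Python stand-in for the generated `_need_user_arrays` check quoted earlier in the thread, not the real code: each client array is reduced to an `(enabled, buffer_binding)` pair, and the `-1` id is the value reported in this thread.

```python
def need_user_arrays(arrays):
    """Simplified model: a user (client-memory) array is in use when an
    array is enabled and its buffer binding query returns 0."""
    return any(enabled and binding == 0 for enabled, binding in arrays)


GLTHREAD_INTERNAL_ID = -1  # id reported for glthread's internal buffers

# Application's view: the array is enabled and no buffer object is bound.
print(need_user_arrays([(True, 0)]))                     # True

# What the tracer sees when the query leaks the internal buffer's id.
print(need_user_arrays([(True, GLTHREAD_INTERNAL_ID)]))  # False
```

Because the check is an exact comparison with 0, any other value makes the tracer treat the array as backed by a buffer object and skip the fake gl*Pointer calls.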

zmike commented on June 7, 2024

That's in the debug replay environment, at least. Changing the id value to 0 (glthread_bufferobj.c:36) causes _need_user_arrays to return true there.

This doesn't seem to help replaying the trace normally (still crashes), and setting mesa_glthread=false doesn't affect it either.

jrfonseca commented on June 7, 2024

> This doesn't seem to help replaying the trace normally (still crashes), and setting mesa_glthread=false doesn't affect it either.

Right. That ship sailed -- the trace we got is hopelessly broken. But once we fix the glGet weirdness, then future traces (of the real app) should replay fine.

zmike commented on June 7, 2024

Ahh, okay. So then I think I know enough about what the issue is to try submitting a fix, and then I can ask people to re-create traces. If that fails somehow, then maybe we can revisit.

Thanks again for the help!

jrfonseca commented on June 7, 2024

> Following this, I think I've found the issue: glthread does internal buffer object allocation from user data, and all its buffers get allocated with id = -1. Thus, binding one of these buffers will return -1 as the ARRAY_BUFFER_BINDING value.

I see. Yes, that sounds like a bad thing to do. glthread should not have side effects visible to the application, other than performance.

> That confuses the replayer, which expects 0.

Above all, it confuses the LD_PRELOAD tracer.

jrfonseca commented on June 7, 2024

> Thanks again for the help!

No prob! Thanks for narrowing it down.

jrfonseca commented on June 7, 2024

I saw you fixed this in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22293 . Just making a note here for future reference. Thanks
