Comments (14)

jazzyb commented on September 23, 2024

Can you point to a gist or repo with the tests that are causing the assertion failures?

from esqlite.

obmarg commented on September 23, 2024

@jazzyb I'm just running mix test in a clean copy of https://github.com/mmmries/sqlitex at commit 68343859.

It doesn't happen every time, but it does happen frequently - roughly 10% of the time when I run mix test, I get this assertion failure.

jazzyb commented on September 23, 2024

Strange. I haven't seen that error on Ubuntu or FreeBSD. Do you know which of the OTP 17 releases you are using? Are you on Yosemite (10.10.3 or 10.10.4)?

obmarg commented on September 23, 2024

Yeah, it is a strange one - figured it wouldn't happen for everyone or it'd likely have been found or fixed by now.

Yosemite 10.10.3.
Erlang & Elixir are both from homebrew. Homebrew says it's Erlang 17.4.
Erlang just says 17:

erlang:system_info(otp_release).
"17"

I did wonder if this could be something to do with the destruct_esqlite_connection function calling enif_thread_join - presumably this blocks and could cause the "strange behaviour" mentioned in the NIF documentation:

A native function that do lengthy work before returning will degrade responsiveness of the VM, and may cause miscellaneous strange behaviors. Such strange behaviors include, but are not limited to, extreme memory usage, and bad load balancing between schedulers. Strange behaviors that might occur due to lengthy work may also vary between OTP releases.

jazzyb commented on September 23, 2024

It certainly could be the scheduler; however, I understand the library was designed to work with the BEAM scheduler. Furthermore, unless you're performing queries across thousands of rows -- which the Sqlitex tests do not do -- I wouldn't expect execution to even stay in the NIF long enough to throw off the scheduler.

I'll start investigating the issue this evening. Thanks for bringing this up.

jazzyb commented on September 23, 2024

One final question: Was there any pattern to which Sqlitex test triggered the assertion, or did it seem random?

obmarg commented on September 23, 2024

@jazzyb Thanks, would be great to get to the bottom of this.

It's pretty much always in SqlitexTest, but other than that it seems pretty random precisely which test fails. So far I have seen it fail in "it inserts Erlang datetime tuples", "it inserts nil", "decimal types with scale and precision" and "that it returns an error for a bad query". Probably others as well, but I haven't been running the tests with --trace most of the time.

Around the scheduler: the majority of the work does run on a background thread, but (as far as I understand) destruct_esqlite_connection (which calls the queue_send function with the assert) will be called on an Erlang thread by the GC. That function calls enif_thread_join to wait for the background thread to finish. That's kind of where I imagined the issue would arise: if the background thread didn't stop quickly.
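
To make the shape of that concern concrete, here is a minimal sketch in C of a destructor that tells a worker thread to stop and then blocks on a join. It is only an illustration under stated assumptions: it uses plain pthreads rather than the enif_* thread API, a one-slot command flag instead of a real queue, and hypothetical names (connection_open, connection_destruct, worker_loop). None of it is taken from the esqlite source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the esqlite source. */
enum command { CMD_NONE, CMD_STOP };

struct connection {
    pthread_t       worker;
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    enum command    pending;          /* one-slot "queue" for brevity */
    int             worker_finished;  /* set by the worker just before it exits */
};

/* Background thread: sleeps until a command arrives, exits on CMD_STOP. */
static void *worker_loop(void *arg)
{
    struct connection *conn = arg;
    pthread_mutex_lock(&conn->lock);
    while (conn->pending != CMD_STOP)
        pthread_cond_wait(&conn->ready, &conn->lock);
    conn->worker_finished = 1;
    pthread_mutex_unlock(&conn->lock);
    return NULL;
}

/* Allocates a connection and starts its worker; returns NULL on failure. */
struct connection *connection_open(void)
{
    struct connection *conn = calloc(1, sizeof *conn);
    if (conn == NULL)
        return NULL;
    pthread_mutex_init(&conn->lock, NULL);
    pthread_cond_init(&conn->ready, NULL);
    if (pthread_create(&conn->worker, NULL, worker_loop, conn) != 0) {
        pthread_cond_destroy(&conn->ready);
        pthread_mutex_destroy(&conn->lock);
        free(conn);
        return NULL;
    }
    return conn;
}

/* Destructor: asks the worker to stop, then blocks until it has exited.
 * Returns the worker_finished flag observed after the join. */
int connection_destruct(struct connection *conn)
{
    assert(conn != NULL);

    pthread_mutex_lock(&conn->lock);
    conn->pending = CMD_STOP;
    pthread_cond_signal(&conn->ready);
    pthread_mutex_unlock(&conn->lock);

    /* The blocking step: the caller waits here for as long as the worker needs. */
    pthread_join(conn->worker, NULL);

    int finished = conn->worker_finished;
    pthread_cond_destroy(&conn->ready);
    pthread_mutex_destroy(&conn->lock);
    free(conn);
    return finished;
}
```

In this sketch, if the worker were busy and slow to notice the stop command, connection_destruct would hold up its calling thread for just as long, which is the situation I was worried about.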

If you look at my printf output in the report you'll notice some "Thread Finished" messages that I added as well - the second Destruct occurs before both "Thread Finished" and "Thread Destructed", which implies that the first destruction has stopped executing somewhere before the end of the function.

But anyway, this is just a guess - I definitely don't know enough about Erlang's garbage collection and/or scheduler to say more than that.

jazzyb commented on September 23, 2024

Just a quick update: I've just got a VM set up, and I have verified that I see the same error on OS X 10.10 with Erlang 17.5 and Elixir 1.0.5.

jazzyb commented on September 23, 2024

I think I've isolated the Sqlitex test that is causing the crash and have a temporary work-around for the tests.

This test is causing the crash although I don't yet know why:

  test "server query times out" do
    {:ok, conn} = Sqlitex.Server.start_link(":memory:")
    assert match?({:timeout, _},
      catch_exit(Sqlitex.Server.query(conn, "SELECT * FROM sqlite_master", timeout: 0)))
  end

What this test is doing is making sure that the server will time out on long queries. The thing about timeouts is that the caller gives up waiting on a response, but the server will continue to process the request and eventually try to send a response, which will just sit in the caller's mailbox because the caller is no longer waiting on it. I think the crash is happening somewhere at this point because the NIF is still trying to process the request when the process is killed by the end of the test. I don't know why that should be so, but it looks like that is what is happening. I'll need to spend more time later looking through the esqlite code to determine the actual cause.
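
The timeout behaviour can be modelled outside Erlang. The following is a rough sketch in C with pthreads, not Erlang message passing and not Sqlitex code: a "server" thread takes 50 ms to answer, the caller waits with a timeout of 0 and gives up, and the reply is delivered to the mailbox anyway. The names (mailbox, serve, receive, timed_out_reply_still_arrives) and the 50 ms figure are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for a process mailbox. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  arrived;
    int             replies;   /* replies delivered but not yet read */
};

struct request {
    struct mailbox *reply_to;
    long            work_ms;   /* how long the "query" takes */
};

/* Server side: does the work, then replies whether or not anyone is waiting. */
static void *serve(void *arg)
{
    struct request *req = arg;
    struct timespec work = { req->work_ms / 1000, (req->work_ms % 1000) * 1000000L };
    nanosleep(&work, NULL);
    pthread_mutex_lock(&req->reply_to->lock);
    req->reply_to->replies++;
    pthread_cond_broadcast(&req->reply_to->arrived);
    pthread_mutex_unlock(&req->reply_to->lock);
    return NULL;
}

/* Caller side: waits up to timeout_ms for a reply.
 * Returns 1 if a reply was read, 0 if the wait timed out. */
int receive(struct mailbox *box, long timeout_ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    int got = 0;
    pthread_mutex_lock(&box->lock);
    while (box->replies == 0) {
        /* Any non-zero result (ETIMEDOUT included) ends the wait. */
        if (pthread_cond_timedwait(&box->arrived, &box->lock, &deadline) != 0)
            break;
    }
    if (box->replies > 0) {
        box->replies--;
        got = 1;
    }
    pthread_mutex_unlock(&box->lock);
    return got;
}

/* Returns 1 if the first wait timed out and the late reply still arrived. */
int timed_out_reply_still_arrives(void)
{
    struct mailbox box = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 };
    struct request req = { &box, 50 };
    pthread_t server;
    if (pthread_create(&server, NULL, serve, &req) != 0)
        return 0;

    int first = receive(&box, 0);     /* like timeout: 0, gives up at once */
    int late  = receive(&box, 5000);  /* waits for the stale reply */
    pthread_join(server, NULL);
    return first == 0 && late == 1;
}
```

The second receive call plays the role of waiting for the timed-out response before the test ends, so nothing is still in flight when the caller goes away.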

In the meantime I may have a work-around to at least get the Sqlitex tests to pass. @obmarg, can you pull my assertion-error branch to your system and see if it fixes the issue for you? It appears to fix the issue on my end. The change simply waits for the timed-out response before ending the test. If that does fix the issue, then the good news is that you probably won't run into this error in your own code unless you're using the Sqlitex.Server and your query times out.

One other thing about the addresses you saw being repeated in your original comment: I saw many connection addresses being reused for runs that didn't crash. I don't think that is necessarily an issue; the esqlite code looks like it's just reusing memory for different database connections.

obmarg commented on September 23, 2024

Thanks for taking the time to investigate this so quickly, I really appreciate it.

I'll pull down that branch and try it later on; I just wanted to quickly respond to your last paragraph. It wasn't addresses being re-used that I thought was a problem. It was addresses being destructed twice without being constructed in between. Might be clearer if I cut down my sample output above to just the relevant address:

Constructed 0x162000D8
Destructing 0x162000D8
Thread ended 0x162000D8
Destructed 0x162000D8
Constructed 0x162000D8
Destructing 0x162000D8
Destructing 0x162000D8

The first 4 lines are fine - address is constructed, destructing starts, the thread ends and then destructing finishes. It's the next three lines that worry me - construction, destruction starts, destruction starts again. The assertion failure happens immediately after that last line, so I'd be seriously surprised if this was not at least related to the problem.
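
The check I'm doing by eye on that output can be written down as a small state machine. This is just a sketch in C with a made-up helper name (first_double_destruct); it assumes trace lines of the form "<Event> <address>" as in my printf output and is not part of esqlite.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Lifecycle of one connection address as seen in the trace. */
enum conn_state { FREE, LIVE, DESTRUCTING };

/* Hypothetical helper: returns the 1-based number of the first trace line
 * where `addr` starts destruction while a destruction is already in
 * progress, or 0 if that never happens. */
int first_double_destruct(const char *lines[], int count, const char *addr)
{
    assert(lines != NULL && addr != NULL);

    enum conn_state state = FREE;
    for (int i = 0; i < count; i++) {
        char event[32], seen[32];

        /* Only two-token lines for this address are considered; a line such
         * as "Thread ended 0x..." parses as ("Thread", "ended") and is
         * skipped by the address comparison. */
        if (sscanf(lines[i], "%31s %31s", event, seen) != 2)
            continue;
        if (strcmp(seen, addr) != 0)
            continue;

        if (strcmp(event, "Constructed") == 0) {
            state = LIVE;
        } else if (strcmp(event, "Destructing") == 0) {
            if (state == DESTRUCTING)
                return i + 1;   /* second destructor entry without a finish */
            state = DESTRUCTING;
        } else if (strcmp(event, "Destructed") == 0) {
            state = FREE;
        }
    }
    return 0;
}
```

Fed the seven lines above, it flags line 7; fed only the first four, it flags nothing.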

jazzyb commented on September 23, 2024

Ah. I see now. I'll keep my eye on that when I investigate further. Thanks.

obmarg commented on September 23, 2024

Tried that assertion-error branch; it does seem to have stopped mix test from crashing, at least.

mmzeeman commented on September 23, 2024

Sorry, I was on a cycling/camping trip with my son without a computer.

Thanks for making such a detailed description, I'll try to find out what is going on.

jazzyb commented on September 23, 2024

@mmzeeman I do not believe this fixes the underlying issue. Your most recent fix now causes assertion failures for "Attempting to destroy a non-empty queue." I'll probably get some spare cycles this week to look into it.
