Giter Site home page Giter Site logo

Segfault with subleveldown about subleveldown HOT 27 CLOSED

level avatar level commented on May 24, 2024
Segfault with subleveldown

from subleveldown.

Comments (27)

ralphtheninja avatar ralphtheninja commented on May 24, 2024 1

Are you sure the following commands are correct?

git clone https://github.com/Level/subleveldown
cd subleveldown
git checkout -b sub
git pull origin sub
npm ci
node test/read.js

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024 1

@christianbundy Do you have enough information to continue?

from subleveldown.

christianbundy avatar christianbundy commented on May 24, 2024 1

Sorry, I mean that I've also been unable to write a minimal test case. You can still use these instructions but I haven't had the time to whittle it down any further.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024 1

OK. FYI, it might take some time for us to find a proper fix for the subleveldown bug. Also, don't use that #open-close-hack branch for anything other than tests 😉

from subleveldown.

christianbundy avatar christianbundy commented on May 24, 2024

Oof, sorry about that. I've double-checked that this is 100% correct:

git clone https://github.com/flumedb/flumeview-level
cd flumeview-level
git checkout -b sub
git pull origin sub
npm ci
node test/read.js

from subleveldown.

ralphtheninja avatar ralphtheninja commented on May 24, 2024

I personally get a SIGSEGV:

screenshot from 2019-02-02 00-48-47

It's probably the same issue but blows up somewhere else due to timing etc. It's difficult to see directly where this goes wrong. My gut feeling tells me something is trying to read data after the database has been closed. This is (unfortunately) a known problem with leveldown.

from subleveldown.

ralphtheninja avatar ralphtheninja commented on May 24, 2024

Btw, is this test code new or has it been around for a while? Wondering if there's something that has changed in leveldown that all of a sudden broke the tests.

from subleveldown.

ralphtheninja avatar ralphtheninja commented on May 24, 2024

@vweevers Any ideas?

from subleveldown.

christianbundy avatar christianbundy commented on May 24, 2024

Test code is super new. I'm on mobile so this will be brief, but the gist is that on Scuttlebutt we have many instances of leveldown running and I wanted to experiment with a single instance shared by subleveldown. As far as I can tell this passes all previous tests except "retest", which tries to close and re-open to database to ensure persistence.

Is my intuition correct that sharing an instance of leveldown via subleveldown may be more performant, or is this all just a wild goose chase?

Thanks for your help! I'm really appreciative at how quickly you've triaged this issue.

from subleveldown.

ralphtheninja avatar ralphtheninja commented on May 24, 2024

Test code is super new.

This is good to know.

Is my intuition correct that sharing an instance of leveldown via subleveldown may be more performant, or is this all just a wild goose chase?

It might be more performant. You'd have to write some benchmarks to verify this. Also depends on what type of performance you're after. Maybe lower memory usage at the expense of more key processing? Just guessing here.

Anyway. I think you should take a closer look at the code dealing with closing and re-opening the database. Take care to close the database once you know all iterators are done (read streams etc), then re-open.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

On Windows I get a nullptr in a get() operation:

image

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

@christianbundy I think the issue (or at least one issue) is that the database isn't actually closed. Because flumedb.close() does not close the flumelog-level instance or its underlying db.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

You can check this by adding the following lines to flumelog-level:

db.on('closing', () => console.error('closing'))
db.on('closed', () => console.error('closed'))

Both don't happen, until I make the following patch to flumedb:

index ca81863..e77e925 100644
--- a/index.js
+++ b/index.js
@@ -179,8 +179,11 @@ module.exports = function (log, isReady, mapper) {
           if(sv.close) sv.close(cb)
           else cb()
         }
-      })) (cb)
-
+      }))(function (err) {
+        log.db.close(function (err2) {
+          cb(err || err2)
+        })
+      })
     }
   }
   return flume

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

Side note: mkdirp() before opening should not be needed, LevelDB does that too.

from subleveldown.

ralphtheninja avatar ralphtheninja commented on May 24, 2024

On Windows I get a nullptr in a get() operation:

This leads me to think it hasn't been opened since db is NULL in the Database class until we open it.

https://github.com/Level/leveldown/blob/82b9675a927c9af5482d1acbc7d342b0229a9273/src/database.cc#L26

https://github.com/Level/leveldown/blob/82b9675a927c9af5482d1acbc7d342b0229a9273/src/database.cc#L43

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

@ralphtheninja with the close() patch above, I no longer get the null pointer. So the issue is not that the test doesn't open the db, but that it opens a db still in use by a previous instance. The "empty close X" tests still fail, but that's a separate issue.

from subleveldown.

christianbundy avatar christianbundy commented on May 24, 2024

Yeah, it sounds like I'm trying to open an already-open database, but I'm not clear on why that's segfaulting rather than just throwing an error. I've tried to write a minimal reproducible test-case, but I'm not sure exactly why this segfaults.

I'm happy to just use the workaround of not trying to open an already-open database, but I was surprised by the segfault.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

True, the segfault does warrant further investigation. I'll have a look.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

@christianbundy I'm struggling to write a test case (I can't seem to produce a segfault unless I create an iterator, but that's not the issue here AFAICT and already slated to be fixed by #597).

Could you share yours?

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

It seems ralphtheninja was on the right track in https://github.com/Level/leveldown/issues/590#issuecomment-459952479.

The chain of db's is not closed properly somewhere, leading to a broken state. Then when reopened, which wraps some of the existing db's in new db's, the underlying LevelDOWN instance is not opened, which then leads to a null pointer on the first get() (or other operation).

Initially, the state is:

image

Once opened, all those statuses change to open, like they should. But before and after closing, the state is:

image

Note that EncodingDOWN stays "open" for some reason. Could be an issue in subleveldown (this is really hard to track).

Then, in the # reload test, the db is re-wrapped, inheriting some of that bad state:

image

Because EncodingDOWN believes it's open, it does not open its underlying LevelDOWN instance. So after open, the state is:

image

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

So subleveldown, because it unwraps the db chain to get to leveldown and then takes control of it, closes that leveldown instance. It bypasses encoding-down because it doesn't want that particular layer; it rewraps leveldown with a new encoding-down instance. But, if you then reuse the original encoding-down instance (directly or via levelup) and it was already opened, you get the above situation.

Hard to explain, but I think I have enough info now to write a test.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

Also, the reason that the close patch above (https://github.com/Level/leveldown/issues/590#issuecomment-459950304) fixes the situation, is because in that case, the original encoding-down gets closed and therefore is in a good state to be reused.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

Long story short, the bug is in subleveldown. It opens the levelup instance that you pass in, but then ignores (what that does to the state of) encoding-down. On close, it only closes leveldown, not the originally passed in levelup or by extension encoding-down. To fix, we should either:

  1. Unwrap the levelup instance in the subleveldown constructor, discarding both the levelup and encoding-down layers. Then subleveldown only has to open the leveldown instance.
  2. Keep that part as-is, but also take control of the levelup instance and close it.

I prefer 1, but feel like I'm missing something because the way it currently works suddenly seems needlessly complicated. AFAICT subleveldown does not use the levelup instance for any other purpose than to reopen the underlying store.

Moving this issue to the subleveldown repo.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

but feel like I'm missing something

Ah yes. I was missing the fact that the levelup instance that is passed in, may already have subleveldown in its chain. Gah. Also, we lack a mechanism in abstract-leveldown to hook into an opening db (there are no events).

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

@christianbundy With npm i Level/subleveldown#open-close-hack on top of your test setup I can avoid the segfault, but one of the tests fails on incorrect data.

Could that be because flumelog-level does not use subleveldown? E.g. it reads+writes unprefixed data, conflicting with the keyspace of flumeview-level. I'm not familiar with the design of flume so I can't really investigate that part. Did these tests pass before you introduced subleveldown?

(The "empty close" tests also still fail but to reiterate, that's a separate problem and I want to put that aside for now)

from subleveldown.

christianbundy avatar christianbundy commented on May 24, 2024

No, these tests never passed -- this was an experiment with adding subleveldown, and when I got a segfault I just reported with my test-case. It's very possible that it's writing unprefixed data, I don't think that I understood that a prefix was necessary so that's probably been implemented incorrectly by me.

from subleveldown.

vweevers avatar vweevers commented on May 24, 2024

@christianbundy Why did you close this? Did you find a fix on your end?

from subleveldown.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.