Comments (9)

Pumpapa commented on August 25, 2024

For completeness: I've just reinstalled node and all modules from scratch and am currently on v0.12.7 (v4.0.0 obviously doesn't work yet). My machine runs Mac OS X Yosemite 10.10.5.

from node-lmdb.

Venemo commented on August 25, 2024

This is interesting. The output file does contain 26 characters (the correct length of the original string), but somehow every second character is a ^@ (a NUL byte). The interesting part is that console.log can properly output the second string, so at the moment I don't see a reason why fs.write can't.

But here is where it gets totally weird: if I create a Buffer from both s1 and s2 then console.log tells me that the two buffers contain different bytes.

I also tried closing the transaction only after the file has been written, with the same result.


Venemo commented on August 25, 2024

Oh, note: I use node v0.12.2 on Fedora 22.


Venemo commented on August 25, 2024

I checked: the bytes returned from LMDB (and that are wrapped by the CustomExternalStringResource class) are exactly what they should be. My current guess is that either V8 or node does something somewhere internally that messes with external string resources.

Even if I return a dummy string from CustomExternalStringResource::data() node fails to put it into a buffer or write it into a file.


Venemo commented on August 25, 2024

Okay, I've found the problem:
https://github.com/nodejs/node/blob/v0.12.7/src/string_bytes.cc#L331

In JavaScript, strings are UTF-16, so node-lmdb stores them in UTF-16 as well. The above implementation simply uses memcpy to copy bytes. However, copying N bytes is not the same as copying N UTF-16 characters: each UTF-16 code unit occupies two bytes, and for ASCII text the second byte of every pair is zero.
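To make the byte counts concrete, here is a small sketch that only simulates the symptom with plain Buffers; it does not reproduce node's internal code path. It uses `Buffer.from`, which exists in current Node.js releases (on 0.12 the equivalent would be `new Buffer(s, 'utf16le')`), and the variable names are mine.

```javascript
// Illustration only: the same ASCII text in UTF-8 and in UTF-16LE.
var s = "This is an ordinary string";

var utf8 = Buffer.from(s, 'utf8');      // 1 byte per ASCII character
var utf16 = Buffer.from(s, 'utf16le');  // 2 bytes per character, the second one 0x00

// Simulate copying "s.length" bytes out of UTF-16 data, as a byte-wise
// copy with a character count would do.
var truncated = utf16.subarray(0, s.length);

console.log(utf8.length, utf16.length, truncated.length);
console.log(truncated); // every second byte is 00, which an editor shows as ^@
```

The truncated buffer has the expected length in bytes but holds only the first half of the text, interleaved with NUL bytes, which matches the file contents described earlier in this thread.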

It seems that they've fixed it, at least the madness is not present in the master branch:
https://github.com/nodejs/node/blob/master/src/string_bytes.cc#L382

I'm not sure if they still maintain the 0.x branches, but if they do, it might be worth filing a bug report with nodejs.


Pumpapa commented on August 25, 2024

As a workaround I force a copy, which results in the proper string, e.g. by doing ('*'+s).slice(1), as below. I'm unsure why this works while using lmdb's result directly doesn't, given that both strings are represented internally as UTF-16, but it does work.

    var fs = require("fs"),
        lmdb = require('node-lmdb');

    var env = new lmdb.Env();
    env.open({
        path: "./lmdb",
        mapSize: 2*1024*1024*1024, // maximum database size
        maxDbs: 2
    });
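    var dbi = env.openDbi({
        name: "moindb",
        create: true
    });

    var nm = 'name',
        s1 = "This is an ordinary string";
    // Reference output: written directly from the original string
    fs.writeFileSync('1st', s1, {encoding:'utf8'});
    // Store the string in LMDB
    var txnw = env.beginTxn();
    txnw.putString(dbi, nm, s1);
    txnw.commit();
    // Read it back in a read-only transaction
    var txnr = env.beginTxn({ readOnly: true }),
        s2 = txnr.getString(dbi, nm);
    txnr.commit();
    // Workaround: ('*'+s2).slice(1) forces a copy before writing
    fs.writeFileSync('2nd', ('*'+s2).slice(1), {encoding:'utf8'});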
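
If the workaround is needed in several places, it could be wrapped in a small helper. This is only a sketch: `forceCopy` is a name I made up, and whether the engine really produces a fresh heap string here is an implementation detail of V8, not something the language guarantees.

```javascript
// Hypothetical helper wrapping the workaround above.
// Prepending a character and slicing it off again yields a string with
// the same contents; in this thread that was enough to avoid the bug.
function forceCopy(s) {
    return ('*' + s).slice(1);
}

// Usage with the snippet above (assumes s2 came from txnr.getString):
// fs.writeFileSync('2nd', forceCopy(s2), {encoding: 'utf8'});
console.log(forceCopy("This is an ordinary string"));
```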


Venemo commented on August 25, 2024

@Pumpapa Take a look at the link I gave. It has working code for strings on the V8 heap (which uses WriteUtf8) and a buggy special case for external string resources (which uses a simple memcpy).


Pumpapa commented on August 25, 2024

Yeah. I looked at the link but didn't look deeper into the context. I agree that it's a bug on their side which has been resolved in the latest version and which will automatically flow into node-lmdb as soon as that is on par with v4.0.0. I added my work-around only for people who, like me, need something today. Thanks for looking into it though. Like I said, I'm very happy with node-lmdb, which serves me much better than my previous solution (mongodb, which is buggy, frankly).


Venemo commented on August 25, 2024

@Pumpapa I'm glad to hear that. Stay tuned for node 4.0 support! :)

