node-word2vec's Issues

Had to remove arr.pop() on line 373 in model.js to get this library working

This line removed the last item from each vector, which left the vectors with mismatched lengths and caused a whole bunch of chaos down the line (multiplying by undefined).

I have no idea why it's there, but I did notice that it works for the example vector.txt in this project. Maybe it has something to do with \r and \n?
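
One way to avoid depending on whether each line ends with a trailing space is to trim the line before splitting it. The sketch below is a hypothetical helper (`parseVectorLine` is not part of node-word2vec) and assumes the text format is a word followed by whitespace-separated numbers. The trailing-whitespace explanation is a guess, not something confirmed in model.js.

```javascript
// Hypothetical helper, not part of node-word2vec: split one line of a
// text-format vector file into a word and its numeric components.
function parseVectorLine(line) {
  // trim() drops a trailing space, "\r", or "\n", so no unconditional
  // pop() is needed to discard an empty last token.
  const tokens = line.trim().split(/\s+/);
  const word = tokens[0];
  const values = tokens.slice(1).map(Number);
  return { word: word, values: values };
}
```

With this approach a line yields the same number of values whether or not it ends in whitespace.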

Also I added this:

if(isNaN(words) || isNaN(size)) {
  throw new Error("First line of input text file should be <number of words> <length of vector>. See example data 'vectors.txt' in repo");
}

I added it after this line, since a missing header caused me a lot of trouble (I don't think the header requirement is mentioned anywhere in the README).
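
A self-contained version of that check might look like the sketch below. The function name `parseHeader` is hypothetical, and the only assumption is the header format quoted in the error message above: two integers separated by whitespace.

```javascript
// Hypothetical helper: validate the first line of a text-format vector file.
function parseHeader(firstLine) {
  const parts = firstLine.trim().split(/\s+/);
  const words = parseInt(parts[0], 10);
  const size = parseInt(parts[1], 10);
  // Reject anything that is not exactly "<integer> <integer>".
  if (parts.length !== 2 || isNaN(words) || isNaN(size)) {
    throw new Error(
      'First line of input text file should be ' +
      '<number of words> <length of vector>, got: ' + JSON.stringify(firstLine)
    );
  }
  return { words: words, size: size };
}
```

Failing early here gives a readable message instead of NaN values surfacing later in the computation.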

Thanks for the awesome library!

Child process exited with code null

var w2v = require( 'word2vec' );

w2v.word2vec( __dirname + '/input.txt', __dirname + '/output.txt', {
	cbow: 1,
	size: 200,
	window: 8,
	negative: 25,
	hs: 0,
	sample: 1e-4,
	threads: 20,
	iter: 15,
	minCount: 2
});

The example doesn't seem to work. It only prints "Child process exited with code null", and output.txt is empty.
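
In Node's child_process API, an exit code of null means the child did not exit on its own but was terminated by a signal, and the signal name is passed as the second argument of the 'exit' event. The sketch below shows how that could be reported. `describeExit` and `runAndReport` are hypothetical names, and whether node-word2vec exposes the signal is not known from this report.

```javascript
// Turn the (code, signal) pair from a child process 'exit' event into text.
function describeExit(code, signal) {
  if (code === null) {
    // The process was killed by a signal (for example SIGSEGV or SIGKILL).
    return 'terminated by signal ' + signal;
  }
  return code === 0 ? 'exited normally' : 'exited with error code ' + code;
}

// Usage sketch (defined but not called here): run any command and report
// how it ended, including failures to start such as ENOENT.
function runAndReport(command, args, report) {
  const spawn = require('child_process').spawn;
  const child = spawn(command, args);
  child.on('error', function (err) { report('failed to start: ' + err.message); });
  child.on('exit', function (code, signal) { report(describeExit(code, signal)); });
  return child;
}
```

Knowing the signal name helps narrow down whether the process crashed or was stopped from outside.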

npm Test problem

Hello,
It seems there are some files missing for the "npm test":

  1. word2phrase can be called successfully:
    Uncaught AssertionError: expected false to be true

Is this project still being maintained?

Thanks,
Thomas

.mostSimilar() returns array of undefined words for an out-of-dictionary word

As the title says, .mostSimilar() returns array of objects with word = undefined and dist = -1 for an out-of-dictionary word. The array is as long as the number of entries requested in the second argument of mostSimilar. I would expect it to return an empty array or null if the word is not found in the dictionary.
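
Until the library changes, a thin wrapper can give the expected behaviour. This is a sketch that assumes `mostSimilar(word, n)` behaves exactly as described above (entries with `word` undefined for an unknown word); `safeMostSimilar` is a hypothetical name, not a library function.

```javascript
// Hypothetical wrapper: drop placeholder entries so that an
// out-of-dictionary word yields an empty array.
function safeMostSimilar(model, word, n) {
  const results = model.mostSimilar(word, n) || [];
  return results.filter(function (entry) {
    return entry && entry.word !== undefined;
  });
}
```

Callers can then test `result.length === 0` instead of inspecting each entry.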

Cheers

Code 126

Hi there, I am running a very basic example, but something does not seem to work. I get code 126 every time and I am not sure what is happening. Here is my code:


var w2v = require('word2vec');

w2v.word2phrase( 'in.txt', 'out.txt', {
    threshold:100,
    debug:2,
    minCount: 5
}, done);

function done(data)
{
  console.log(data);
}

'make' is not recognized as an internal or external command, in windows 64 bit

make --directory=src

'make' is not recognized as an internal or external command,
operable program or batch file.
npm ERR! Windows_NT 10.0.14393
npm ERR! argv "F:\studies\node\node.exe" "F:\studies\node\node_modules\npm\bin\npm-cli.js" "install" "word2vec"
npm ERR! node v6.11.0
npm ERR! npm v3.10.10
npm ERR! code ELIFECYCLE

npm ERR! [email protected] postinstall: make --directory=src
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] postinstall script 'make --directory=src'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the word2vec package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR! make --directory=src
npm ERR! You can get information on how to open an issue for this project with:
npm ERR! npm bugs word2vec
npm ERR! Or if that isn't available, you can get their info via:
npm ERR! npm owner ls word2vec
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR! F:\git repository\npm-debug.log

Why is 1GB of ram required to load a 160mb word vector file?

Are there some pre-computations done which cause this lib to eat up RAM? I'm using a 160 MB text file of word vectors and the node process is taking up more than 900 MB of RAM.

Just wondering whether there is a good reason for this, or whether I should dig about looking for some inefficiencies somewhere.
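
One possible source of overhead, assuming the vectors are kept as ordinary JavaScript arrays of numbers (this has not been checked against the library), is that each number is a 64-bit double and each array is a separate object. A sketch of a more compact layout, with the hypothetical helper `packVectors`, stores everything in one Float32Array at 4 bytes per component:

```javascript
// Hypothetical helper: pack equal-length rows into one contiguous Float32Array.
function packVectors(rows) {
  const size = rows.length > 0 ? rows[0].length : 0;
  const data = new Float32Array(rows.length * size);
  rows.forEach(function (row, i) {
    data.set(row, i * size); // copy row i to its slot
  });
  return {
    count: rows.length,
    size: size,
    data: data,
    // subarray() returns a view on the same memory, not a copy.
    vectorAt: function (i) {
      return data.subarray(i * size, (i + 1) * size);
    }
  };
}
```

The packed payload is exactly `count * size * 4` bytes, at the cost of single precision.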

Thanks

Pass text and get vectors as output without the use of files?

Is it possible to call the word2vec method with raw training data and get the vectors back as raw data as well?

The documentation seems to suggest you can only pass paths to the input and output data files.

Any suggestions to overcome this?
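
A possible workaround, sketched under the assumption that the API only accepts file paths, is to write the raw text to a temporary file and read the result back afterwards. `makeTempCorpus` is a hypothetical helper and the file names are arbitrary.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: put raw text into a fresh temporary directory and
// return the paths to hand to a file-based API.
function makeTempCorpus(text) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'w2v-'));
  const inputPath = path.join(dir, 'input.txt');
  const outputPath = path.join(dir, 'output.txt');
  fs.writeFileSync(inputPath, text, 'utf8');
  return {
    inputPath: inputPath,
    outputPath: outputPath,
    cleanup: function () {
      fs.rmSync(dir, { recursive: true, force: true });
    }
  };
}

// Usage sketch (not run here; requires the word2vec package):
// const tmp = makeTempCorpus(rawText);
// w2v.word2vec(tmp.inputPath, tmp.outputPath, { size: 100 }, function () {
//   const vectorsText = fs.readFileSync(tmp.outputPath, 'utf8');
//   tmp.cleanup();
// });
```

The files still exist briefly on disk, so this hides the paths from the caller rather than removing file I/O.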

LoadModel memory overflow

Hi there, I trained a model on the Google News corpus and was able to create an output file successfully. However, when I call the loadModel function on that output, Node runs out of memory.

I'm running Node v18 on Linux.

I also tried passing the memory allocation flag in the npm run script. It runs for longer before overflowing, but it still doesn't complete with up to 12 GB allocated.

The machine has 16 GB of RAM, but I was wondering if I'm missing an optimization step. The word embeddings file is only 3 GB.

If there's anything that can be done to use memory better, I feel like it should work, as I was able to train this model on the same machine.

Any help would be much appreciated! Thanks.
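
If the loader reads the whole file into memory before parsing (an assumption, not verified against the library), one alternative is to stream a text-format file line by line and keep only the parsed numbers. The sketch below uses Node's built-in `readline`; `parseLine` and `streamVectors` are hypothetical names, and error handling is left out for brevity.

```javascript
const fs = require('fs');
const readline = require('readline');

// Parse "word v1 v2 ... vN"; return null if the line does not have N values.
function parseLine(line, size) {
  const tokens = line.trim().split(/\s+/);
  if (tokens.length !== size + 1) return null;
  return { word: tokens[0], vector: Float32Array.from(tokens.slice(1), Number) };
}

// Stream a text-format vector file, calling onVector(word, vector) per entry.
function streamVectors(filePath, onVector, onDone) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath),
    crlfDelay: Infinity // treat "\r\n" as a single line break
  });
  let size = null;
  rl.on('line', function (line) {
    if (size === null) {
      // First line is the header: "<number of words> <length of vector>".
      size = parseInt(line.trim().split(/\s+/)[1], 10);
      return;
    }
    const entry = parseLine(line, size);
    if (entry) onVector(entry.word, entry.vector);
  });
  rl.on('close', onDone);
}
```

Only one line of text is held at a time, so peak memory is dominated by the stored vectors rather than by the raw file contents.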

clang: error: the clang compiler does not support '-march=native' (Apple M1)

Thanks for your effort, but when I run "npm install word2vec", I get the error mentioned in the title of this issue:
"clang: error: the clang compiler does not support '-march=native'". My computer is a MacBook Air (M1, 2020) with an Apple M1 chip. I tried to figure it out, but it turned out not to be straightforward for me. Could you please give me a hint to help me solve this? Thanks!

Error: spawn ./word2vec ENOENT

Hello, when I start Node, I get this error:

Error: spawn ./word2vec ENOENT
at _errnoException (util.js:1024:11)
at Process.ChildProcess._handle.onexit (internal/child_process.js:190:19)
at onErrorNT (internal/child_process.js:372:16)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)
at Function.Module.runMain (module.js:678:11)
at startup (bootstrap_node.js:187:16)
at bootstrap_node.js:608:3

I really do not know how to fix it.
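
ENOENT from spawn generally means the executable (or the working directory) could not be found, which would be consistent with the postinstall build not having produced the binary. A quick diagnostic sketch follows; `checkBinary` is a hypothetical helper, and the `src` directory inside the installed package is an assumption based on the `make --directory=src` postinstall step.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical diagnostic: report whether a binary exists and can be executed.
function checkBinary(dir, name) {
  const file = path.join(dir, name);
  if (!fs.existsSync(file)) return 'missing';
  try {
    // On Windows, X_OK only checks that the file exists.
    fs.accessSync(file, fs.constants.X_OK);
    return 'ok';
  } catch (err) {
    return 'not executable';
  }
}

// Usage sketch (path is an assumption):
// console.log(checkBinary('node_modules/word2vec/src', 'word2vec'));
```

A result of 'missing' suggests re-running the build and checking its output for compiler errors.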

`mostSimilar` outputs numbers when using Fasttext word vectors

Hi,

First of all, thanks for the awesome work!

I am trying to import the pre-trained files from the fasttext repo: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md

The model loads without a problem; however, when I try mostSimilar, the most similar words appear to be numbers:

loadedModel.mostSimilar('hi')

> [ { word: '73301', dist: 0.4461598818767161 },
  { word: '266', dist: 0.44462500361860946 },
  { word: '399', dist: 0.44260747560473973 },
  { word: '-0.13061', dist: 0.4250619904094889 },
  { word: '745', dist: 0.4089746546859616 },
  { word: '7', dist: 0.39388342200258686 },
  { word: '233', dist: 0.38675386429631425 },
  { word: '.33347', dist: 0.38672456155896373 },
  { word: '999', dist: 0.3798941950492955 },
  { word: '.5158', dist: 0.3761412428047805 },
  { word: '4785', dist: 0.3756878374324986 },
  { word: '', dist: 0.3753017613199615 },
  { word: '4091', dist: 0.3728785618174816 },
  { word: '0.18393', dist: 0.3702285209309231 },
  { word: '5', dist: 0.3694416515730196 },
  { word: '', dist: 0.3682340927295216 },
  { word: '2', dist: 0.3682152969462404 },
  { word: '68', dist: 0.36721353813091373 },
  { word: '10285', dist: 0.36564681449501635 },
  { word: '', dist: 0.36526450978156066 },
  { word: '014575', dist: 0.36389461240841203 },
  { word: '468', dist: 0.36371019302454455 },
  { word: '-0.00046764', dist: 0.3637013226972051 },
  { word: '.012665', dist: 0.36367885124101007 },
  { word: '142', dist: 0.3636392745394945 },
  { word: '574', dist: 0.36060934864973193 },
  { word: '0.6865', dist: 0.3602319353978014 },
  { word: '91', dist: 0.357913584485305 },
  { word: '53', dist: 0.35790250493633724 },
  { word: '925', dist: 0.3576282053138198 },
  { word: '1942', dist: 0.35588944804722655 },
  { word: '', dist: 0.3558833583782604 },
  { word: '3', dist: 0.3546257354328858 },
  { word: '-0.059739', dist: 0.3546232535404894 },
  { word: '', dist: 0.35400407472165496 },
  { word: '08', dist: 0.3536348589615367 },
  { word: '093', dist: 0.35353088901048624 },
  { word: '0.11736', dist: 0.3529077373455495 },
  { word: '.12359', dist: 0.3511316591255266 },
  { word: '10224', dist: 0.35079793819829935 } ]

I also tried 'hello', but it reports that the word is out of the dictionary. How can I import the fastText files so that this won't happen?
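
Number-like "words" suggest that vector components are being read as words, that is, the parser may have lost alignment with the file. This is a guess, not a confirmed diagnosis. A small check, sketched here with the hypothetical helper `findMalformedLines`, reports lines whose token count differs from the vector length declared in the header plus one for the word:

```javascript
// Hypothetical diagnostic: list lines that do not contain a word plus
// exactly `size` values. Line numbers are 1-based within `lines`.
function findMalformedLines(lines, size, limit) {
  const problems = [];
  for (let i = 0; i < lines.length && problems.length < limit; i++) {
    const tokens = lines[i].trim().split(/\s+/);
    if (tokens.length !== size + 1) {
      problems.push({ line: i + 1, tokens: tokens.length, word: tokens[0] });
    }
  }
  return problems;
}
```

Running this over the first few thousand lines after the header would show whether the file matches the layout the loader expects.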

npm install error

Tried installing and got this error

distance.c:18:10: fatal error: 'malloc.h' file not found
#include <malloc.h>
         ^
1 error generated.
make: *** [distance] Error 1
npm ERR! Darwin 14.3.0
npm ERR! argv "node" "/usr/local/bin/npm" "install" "word2vec"
npm ERR! node v0.12.2
npm ERR! npm  v2.7.4
npm ERR! code ELIFECYCLE

npm ERR! [email protected] postinstall: `make --directory=src`
npm ERR! Exit status 2
npm ERR!
npm ERR! Failed at the [email protected] postinstall script 'make --directory=src'.
npm ERR! This is most likely a problem with the word2vec package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     make --directory=src
npm ERR! You can get their info via:
npm ERR!     npm owner ls word2vec
npm ERR! There is likely additional logging output above.

Vec2Word

Given the example provided:
vector('king') - vector('man') + vector('woman')

It would be nice to be able to pass in the resulting raw vector and get the closest word back.
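
The requested lookup can be sketched independently of the library as a cosine-similarity search over all known vectors. The function names and the plain-object vocabulary below are illustrative assumptions, and the toy vectors are made up for the example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns NaN if either vector is all zeros.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the n words whose vectors are most similar to `query`.
function nearestWords(vectors, query, n) {
  return Object.keys(vectors)
    .map(function (word) {
      return { word: word, dist: cosine(vectors[word], query) };
    })
    .sort(function (x, y) { return y.dist - x.dist; })
    .slice(0, n);
}

// Toy vocabulary with made-up dimensions: [royal, male, female].
const toy = {
  king: [1, 1, 0], queen: [1, 0, 1], man: [0, 1, 0], woman: [0, 0, 1]
};
// king - man + woman, computed component by component.
const query = toy.king.map(function (v, i) { return v - toy.man[i] + toy.woman[i]; });
const top = nearestWords(toy, query, 2);
```

This is a linear scan, so the cost grows with vocabulary size times vector length.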

"Child process exited with code 126" error

When I ran:
const w2v = require( 'word2vec' );
const corpusFilePath = 'cleared_words.txt';

w2v.word2vec(corpusFilePath, "vectors.txt", { size: 300 }, () => {
  console.log("DONE");
});

It returns:
Child process exited with code 126
DONE

and leaves vectors.txt unchanged. Is this because I'm running this on macOS?

In Readme

The function is written as .word2phrases() in the README, but it is actually .word2phrase().

Post-installation script fails on Windows

I tried to install this package on Windows and it failed with the following error:

npm ERR! [email protected] postinstall: `make --directory=src`
npm ERR! Exit status 2
npm ERR!
npm ERR! Failed at the [email protected] postinstall script 'make --directory=src'.

npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the word2vec package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     make --directory=src
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs word2vec
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls word2vec
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     C:\rbws\learning\nlp\projects\nlpnode\npm-debug.log

How to successfully load the GoogleNews-vectors-negative300 model?

Hi Philipp,
I downloaded from https://code.google.com/p/word2vec/ the file GoogleNews-vectors-negative300.bin.gz

w2v = require('word2vec');
{ word2vec: [Function: word2vec],
word2phrase: [Function: word2phrase],
loadModel: [Function: loadModel],
WordVector: [Function: WordVector] }

w2v.loadModel("/home/marco/crawlscrape/bashUtilitiesDir/GoogleNews-vectors-negative300.bin", function(err, model) {
... console.log(model);
... });
undefined
TypeError: Cannot read property 'length' of undefined
at /home/marco/node_modules/word2vec/lib/model.js:408:30
at FSReqWrap.wrapper as oncomplete

w2v.loadModel("/home/marco/crawlscrape/bashUtilitiesDir/GoogleNews-vectors-negative300.bin", function(err, model) {
... console.log(model);
... });
undefined
TypeError: undefined is not a function
at readOne (/home/marco/node_modules/word2vec/lib/model.js:433:55)
at FSReqWrap.wrapper as oncomplete

What do I have to do in order to successfully load the GoogleNews-vectors-negative300 model?
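
For reference, the binary format written by the original word2vec tool is an ASCII header line ("<words> <size>"), followed for each entry by the word, a space, and `size` little-endian 32-bit floats, usually with a newline between entries. The sketch below parses that layout from a Buffer. `parseBinaryVectors` is a hypothetical function, not the library's loader, and a file of several gigabytes may exceed the maximum Buffer size in some Node versions, so a real loader would read in chunks.

```javascript
// Hypothetical parser for the word2vec binary layout described above.
function parseBinaryVectors(buf) {
  let pos = buf.indexOf(0x0a); // end of the ASCII header line
  const header = buf.toString('ascii', 0, pos).trim().split(/\s+/);
  const words = parseInt(header[0], 10);
  const size = parseInt(header[1], 10);
  pos += 1;
  const vectors = new Map();
  for (let w = 0; w < words; w++) {
    // Skip the newline that separates entries, if present.
    while (pos < buf.length && buf[pos] === 0x0a) pos++;
    const end = buf.indexOf(0x20, pos); // the word is terminated by a space
    const word = buf.toString('utf8', pos, end);
    pos = end + 1;
    const vec = new Float32Array(size);
    for (let i = 0; i < size; i++) {
      vec[i] = buf.readFloatLE(pos + 4 * i);
    }
    pos += 4 * size;
    vectors.set(word, vec);
  }
  return { words: words, size: size, vectors: vectors };
}
```

Passing a binary file to a parser that expects the text format would produce errors of the kind shown above, since the float bytes cannot be split into lines and tokens.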

Looking forward to your kind help.
Marco

'make' is not recognized as an internal or external command,

warning "@tensorflow/tfjs > @tensorflow/[email protected]" has unmet peer dependency "seedrandom@^3.0.5".
[4/4] Building fresh packages...
error C:\Users\lenov\Desktop\apps\test\s\a\node_modules\word2vec: Command failed.
Exit code: 1
Command: make --directory=src
Arguments:
Directory: C:\Users\lenov\Desktop\apps\test\s\a\node_modules\word2vec
Output:
'make' is not recognized as an internal or external command,
operable program or batch file.
info Visit https://yarnpkg.com/en/docs/cli/add for documentation about this command.

Make error postinstall

Hello dev guys! :)

I would appreciate your help with an install failure. I have GnuWin32 installed and its path is set in both the system and user PATH settings, but I still get this error.

Output (from npm):

C:\Users\kolyk\WebstormProjects\whislabackend>npm install word2vec

> [email protected] postinstall C:\Users\kolyk\WebstormProjects\whislabackend\node_modules\word2vec
> make --directory=src

make: Entering directory `C:/Users/kolyk/WebstormProjects/whislabackend/node_modules/word2vec/src'
gcc word2vec.c -o word2vec -lm -pthread -O3 -march=native -Wall -funroll-loops -Wno-unused-result -fno-stack-protector
process_begin: CreateProcess(NULL, gcc word2vec.c -o word2vec -lm -pthread -O3 -march=native -Wall -funroll-loops -Wno-unused-result -fno-stack-protector, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [word2vec] Error 2
make: Leaving directory `C:/Users/kolyk/WebstormProjects/whislabackend/node_modules/word2vec/src'
npm ERR! code ELIFECYCLE
npm ERR! errno 2
npm ERR! [email protected] postinstall: `make --directory=src`
npm ERR! Exit status 2
npm ERR!
npm ERR! Failed at the [email protected] postinstall script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     C:\Users\kolyk\AppData\Roaming\npm-cache\_logs\2017-08-20T14_11_30_352Z-debug.log

C:\Users\kolyk\WebstormProjects\whislabackend>

Any ideas?

Thanks
