
Comments (8)

rxy1212 commented on May 27, 2024

Did you change the ngram value? Changing it from the default of 1 to anything else makes the model nearly 1000 times larger.

from fasttext4j.

jimichan commented on May 27, 2024

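MAX_VOCAB_SIZE is not directly related to your out-of-memory error; @rxy1212 may be right.
The memory each fastText model consumes depends on the size of its model file, so loading several of them can exhaust memory.
If this is a test environment, you can convert the C-language model to fasttext4j's own Java format and then load it through memory mapping. The only drawback is that lookups will be slower.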
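
To illustrate the memory-mapping idea in general terms, here is a minimal sketch using plain `java.nio`. It is not fasttext4j's loader: the `MappedMatrix` class and the file layout (raw little-endian floats, `dim` values per row) are assumptions made up for this example. The point is only that rows are read from the mapped file on demand instead of being copied onto the Java heap.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of fasttext4j.
class MappedMatrix {
    private final FloatBuffer data; // view over the mapped file, not a heap copy
    private final int dim;

    private MappedMatrix(FloatBuffer data, int dim) {
        this.data = data;
        this.dim = dim;
    }

    // Maps a file of raw little-endian floats (assumed layout: rows * dim values).
    // A single mapping is limited to 2 GB, which is enough for a sketch.
    static MappedMatrix open(Path file, int dim) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            ByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return new MappedMatrix(buf.order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer(), dim);
        }
    }

    // Copies one row out of the mapping; the OS pages the data in as needed.
    float[] row(int index) {
        float[] out = new float[dim];
        for (int j = 0; j < dim; j++) {
            out[j] = data.get(index * dim + j); // absolute read, buffer position untouched
        }
        return out;
    }
}
```

Usage would look like `MappedMatrix.open(Path.of("vectors.bin"), 100).row(42)`, where the file name and dimension are placeholders. Each lookup may hit the disk, which matches the slower query speed mentioned above.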


Alan000 commented on May 27, 2024

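The ngram value is 1; I have not changed it. Loading one model, or a small number of models, does not raise an error. The failure occurs when I load many (more than 100) already-trained fastText bin model files sequentially in one go. After some of them have loaded, an out-of-memory error is thrown, and the error location points to MAX_VOCAB_SIZE.
Each individual model is small; all the model files together add up to 32 MB.
I am not very familiar with Kotlin. What I mean is: could it be that every model allocates that much memory, so that loading many models runs out of memory?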
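
A back-of-the-envelope check of this hypothesis, as a sketch only. It assumes that each loaded model allocates one 32-bit integer slot per hash bucket and that MAX_VOCAB_SIZE is 30,000,000, which is the value used in the reference C++ fastText dictionary. Whether fasttext4j uses the same value and layout is an assumption here, not something confirmed in this thread. The `HashTableFootprint` class is hypothetical.

```java
// Hypothetical estimate; the constants are assumptions, see the note above.
class HashTableFootprint {
    // Assumed: MAX_VOCAB_SIZE from the reference C++ fastText; fasttext4j may differ.
    static final long ASSUMED_MAX_VOCAB_SIZE = 30_000_000L;
    // Assumed: one 32-bit int per hash slot.
    static final long BYTES_PER_SLOT = Integer.BYTES;

    // Bytes taken by the word-to-id hash table of a single model.
    static long bytesPerModel() {
        return ASSUMED_MAX_VOCAB_SIZE * BYTES_PER_SLOT;
    }

    // Bytes taken when every model allocates its own table.
    static long bytesForModels(int modelCount) {
        return modelCount * bytesPerModel();
    }

    public static void main(String[] args) {
        System.out.println("per model:  " + bytesPerModel() + " bytes");
        System.out.println("100 models: " + bytesForModels(100) + " bytes");
    }
}
```

Under these assumptions the table alone costs 120,000,000 bytes (about 120 MB) per model and about 12 GB for 100 models, regardless of how small the model files are. That would be consistent with 32 MB of files overflowing the heap.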


jimichan commented on May 27, 2024

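MAX_VOCAB_SIZE cannot be set to a smaller value, because it is used as a hash bucket.
Also, is this fastText model used for classification? If so, you could try compressing it with product quantization.
Your scenario is really unusual. More than 100 models?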
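
For readers unfamiliar with product quantization, here is a minimal sketch of the encode/decode step. It is not fasttext4j's or fastText's implementation: the `ProductQuantizer` class is hypothetical, and the codebooks are assumed to be given (in practice they are usually trained with k-means, which is omitted here). Each vector is split into equal sub-vectors, and each sub-vector is replaced by the index of its nearest centroid, so a row of floats shrinks to one byte per sub-vector.

```java
// Hypothetical sketch of product quantization with precomputed codebooks.
class ProductQuantizer {
    // codebooks[s][c] is centroid c of subspace s; at most 256 centroids per subspace.
    private final float[][][] codebooks;
    private final int subDim;

    ProductQuantizer(float[][][] codebooks) {
        this.codebooks = codebooks;
        this.subDim = codebooks[0][0].length;
    }

    // Encodes a vector as one centroid index per sub-vector.
    byte[] encode(float[] v) {
        byte[] codes = new byte[codebooks.length];
        for (int s = 0; s < codebooks.length; s++) {
            int best = 0;
            float bestDist = Float.MAX_VALUE;
            for (int c = 0; c < codebooks[s].length; c++) {
                float dist = 0f;
                for (int j = 0; j < subDim; j++) {
                    float diff = v[s * subDim + j] - codebooks[s][c][j];
                    dist += diff * diff; // squared Euclidean distance
                }
                if (dist < bestDist) {
                    bestDist = dist;
                    best = c;
                }
            }
            codes[s] = (byte) best;
        }
        return codes;
    }

    // Reconstructs an approximation by concatenating the chosen centroids.
    float[] decode(byte[] codes) {
        float[] v = new float[codes.length * subDim];
        for (int s = 0; s < codes.length; s++) {
            float[] centroid = codebooks[s][codes[s] & 0xFF]; // treat byte as unsigned
            System.arraycopy(centroid, 0, v, s * subDim, subDim);
        }
        return v;
    }
}
```

Note that this compresses the weight matrices, not the hash table discussed above, so it would help only if the matrices are what dominates memory.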


rxy1212 commented on May 27, 2024


jimichan commented on May 27, 2024

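Is the model in the Java format or the C format? I am considering whether the dictionary part could be shared across models to reduce memory usage.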


rxy1212 commented on May 27, 2024


jimichan commented on May 27, 2024
<dependency>
  <groupId>com.mayabot.mynlp</groupId>
  <artifactId>fastText4j</artifactId>
  <version>3.1.0</version>
</dependency>

This issue has been fixed, and you can now also specify this parameter.

