Comments (8)
Did you change the ngram value? Changing it from the default of 1 to any other value can make the model close to 1000 times larger.
from fasttext4j.
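MAX_VOCAB_SIZE is not directly related to your out-of-memory error; @rxy1212 may be right.
The memory each fastText model consumes depends on the size of its model file, so loading many models can exhaust memory.
If this is a test environment, you can convert the C model to fasttext4j's own Java format and then load the model through memory mapping. The only cost is slower lookups.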
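As an illustration of the memory-mapping idea mentioned above, here is a minimal sketch using only the JDK. It is not fasttext4j's actual loader or file format: the class `MappedMatrixSketch`, its methods, and the flat little-endian float layout are all hypothetical, chosen only to show how a row can be read from disk on demand instead of holding the whole matrix on the heap.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: NOT the fasttext4j API or its on-disk format.
final class MappedMatrixSketch {

    // Writes a dense float matrix row by row as little-endian floats.
    static void writeMatrix(Path file, float[][] rows) throws IOException {
        int dim = rows[0].length;
        ByteBuffer buf = ByteBuffer.allocate(rows.length * dim * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float[] row : rows) {
            for (float v : row) {
                buf.putFloat(v);
            }
        }
        Files.write(file, buf.array());
    }

    // Maps only the bytes of one row and copies them into a float array.
    static float[] readRow(Path file, int dim, int rowIndex) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long offset = (long) rowIndex * dim * Float.BYTES;
            MappedByteBuffer mapped =
                    ch.map(FileChannel.MapMode.READ_ONLY, offset, (long) dim * Float.BYTES);
            mapped.order(ByteOrder.LITTLE_ENDIAN);
            float[] row = new float[dim];
            mapped.asFloatBuffer().get(row);
            return row;
        }
    }

    // Round trip on a temporary 2x2 matrix; returns the second row.
    static float[] demo() {
        try {
            Path file = Files.createTempFile("matrix", ".bin");
            file.toFile().deleteOnExit();
            writeMatrix(file, new float[][] {{1f, 2f}, {3f, 4f}});
            return readRow(file, 2, 1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

Each lookup goes through the mapped file rather than a heap array, which is consistent with the trade-off described above: lower heap usage in exchange for slower queries.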
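The ngram value is 1; I have not changed it. Loading one model or a handful of models does not raise an error. My failure case is loading many (more than 100) already-trained fastText bin model files sequentially in a single run. After some of them have loaded, an out-of-memory error is thrown, and the error is located at MAX_VOCAB_SIZE.
Each individual model is small; all the model files together total 32 MB.
I am not very familiar with Kotlin. What I mean is: could it be that every model allocates this much memory, so that loading many models runs out of memory?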
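A rough back-of-the-envelope check of that hypothesis can be sketched as follows. It assumes, and the thread does not confirm this, that each loaded model allocates one `int` slot per MAX_VOCAB_SIZE entry and that the constant is 30,000,000 as in the reference C++ fastText dictionary. The class name `VocabTableEstimate` and its methods are hypothetical.

```java
// Hypothetical estimate; the slot count is an assumption taken from the
// reference C++ fastText (MAX_VOCAB_SIZE = 30000000), not from this thread.
final class VocabTableEstimate {

    static final long ASSUMED_MAX_VOCAB_SIZE = 30_000_000L;

    // Bytes used by one int-per-slot hash table.
    static long tableBytes(long slots) {
        return slots * Integer.BYTES;
    }

    // Bytes used when every loaded model owns its own table.
    static long totalBytes(long slots, int modelCount) {
        return tableBytes(slots) * modelCount;
    }

    public static void main(String[] args) {
        long perModel = tableBytes(ASSUMED_MAX_VOCAB_SIZE);
        long all = totalBytes(ASSUMED_MAX_VOCAB_SIZE, 100);
        System.out.println("per model: " + perModel + " bytes"); // 120000000
        System.out.println("100 models: " + all + " bytes");     // 12000000000
    }
}
```

Under these assumptions the table alone costs about 120 MB per model, or roughly 12 GB for 100 models, regardless of how small the model files are on disk.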
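MAX_VOCAB_SIZE cannot be set to a smaller value, because it is the size of a hash bucket table.
Also, is this fastText model used for classification? If it is a classification model, you could try compressing it with product quantization.
Your scenario is really unusual: more than 100 models?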
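For readers unfamiliar with the product quantization mentioned above, here is a minimal sketch of the encode/decode step. It is not fasttext4j's quantizer: the class `ProductQuantizationSketch` is hypothetical, and the codebooks are passed in by hand, whereas a real implementation would learn the centroids (typically with k-means) from the model's vectors.

```java
// Hypothetical sketch of product quantization; NOT the fasttext4j API.
final class ProductQuantizationSketch {

    // codebooks[s][c] is centroid c of sub-space s. The vector is split into
    // codebooks.length equal parts, and each part is replaced by the index
    // of its nearest centroid (squared Euclidean distance).
    static byte[] encode(float[] vector, float[][][] codebooks) {
        int subspaces = codebooks.length;
        int subDim = vector.length / subspaces;
        byte[] codes = new byte[subspaces];
        for (int s = 0; s < subspaces; s++) {
            int best = 0;
            float bestDist = Float.MAX_VALUE;
            for (int c = 0; c < codebooks[s].length; c++) {
                float dist = 0f;
                for (int d = 0; d < subDim; d++) {
                    float diff = vector[s * subDim + d] - codebooks[s][c][d];
                    dist += diff * diff;
                }
                if (dist < bestDist) {
                    bestDist = dist;
                    best = c;
                }
            }
            codes[s] = (byte) best; // supports at most 256 centroids per sub-space
        }
        return codes;
    }

    // Rebuilds an approximate vector by concatenating the chosen centroids.
    static float[] decode(byte[] codes, float[][][] codebooks) {
        int subDim = codebooks[0][0].length;
        float[] out = new float[codes.length * subDim];
        for (int s = 0; s < codes.length; s++) {
            float[] centroid = codebooks[s][codes[s] & 0xFF];
            System.arraycopy(centroid, 0, out, s * subDim, subDim);
        }
        return out;
    }
}
```

The saving comes from storing one byte per sub-vector instead of `subDim` floats, at the cost of an approximation error in the reconstructed vector.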
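Are the models in the Java format or the C format? I am considering whether the dictionary part could be shared across models to reduce memory usage.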
<dependency>
<groupId>com.mayabot.mynlp</groupId>
<artifactId>fastText4j</artifactId>
<version>3.1.0</version>
</dependency>
This issue has been fixed, and this parameter can now also be specified.