Comments (6)
@xiocode You can put the range condition in the ScoringCriteria and return an empty slice when the document falls outside the range. For example:
type RangedScoringCriteria struct {
	MinScore, MaxScore float32
}

func (criteria RangedScoringCriteria) Score(
	doc types.IndexedDocument, fields interface{}) []float32 {
	// Drop the document if fields is not the expected type.
	rsf, ok := fields.(RangedScoringFields)
	if !ok {
		return []float32{}
	}
	// Check whether the score is within the range.
	if rsf.Score < criteria.MinScore || rsf.Score > criteria.MaxScore {
		return []float32{}
	}
	// Continue with the rest of the scoring here; returning the raw score is only a placeholder.
	return []float32{rsf.Score}
}
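The snippet above depends on wukong's types package and on a RangedScoringFields type that the comment does not define. As a minimal sketch that compiles on its own, the version below swaps in a stand-in IndexedDocument and a hypothetical RangedScoringFields with a single Score field; both are assumptions made for illustration, not wukong's actual definitions.

```go
package main

import "fmt"

// IndexedDocument is a stand-in for wukong's types.IndexedDocument so that
// this sketch compiles without the library; the real type has more fields.
type IndexedDocument struct {
	DocId uint64
}

// RangedScoringFields is a hypothetical per-document payload carrying the
// value that the range check is applied to.
type RangedScoringFields struct {
	Score float32
}

// RangedScoringCriteria keeps only documents whose Score lies in
// [MinScore, MaxScore].
type RangedScoringCriteria struct {
	MinScore, MaxScore float32
}

// Score returns an empty slice to drop a document, or a one-element slice
// holding the document's score to keep it.
func (c RangedScoringCriteria) Score(doc IndexedDocument, fields interface{}) []float32 {
	rsf, ok := fields.(RangedScoringFields)
	if !ok {
		// Unexpected payload type: treat the document as filtered out.
		return []float32{}
	}
	if rsf.Score < c.MinScore || rsf.Score > c.MaxScore {
		return []float32{}
	}
	return []float32{rsf.Score}
}

func main() {
	criteria := RangedScoringCriteria{MinScore: 10, MaxScore: 100}
	inputs := []interface{}{
		RangedScoringFields{Score: 5},  // below the range
		RangedScoringFields{Score: 50}, // inside the range
		"wrong type",                   // not a RangedScoringFields
	}
	for _, f := range inputs {
		scores := criteria.Score(IndexedDocument{DocId: 1}, f)
		fmt.Printf("fields=%v kept=%t\n", f, len(scores) > 0)
	}
}
```

The comma-ok type assertion does the same job as comparing reflect.TypeOf values, without needing the reflect import.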
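The wukong engine does not have index persistence yet; it will be implemented in a later version. Even then, the time needed to load the index table from disk into memory after a restart is probably unavoidable, so if you need zero downtime you should consider duplication across multiple servers.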
from wukong.
@huichen OK, thanks.
Reading the index from disk should still be better than rebuilding it from scratch, and it is more practical.
Hello, I have one more question about the index's memory footprint: for 100 GB of text, roughly how much memory does the index take?
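I just ran a test: 68 MB of text content in leveldb, about 640,000 Weibo posts, uses 1.25 GB of memory, with 417 MB for the dictionary and over 800 MB for the index. Isn't that excessive?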
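@xiocode I measured the dictionary at roughly 100 MB of memory; your 417 MB may have been measured before GC ran. As for the index, you are probably using LocationsIndex, which uses a lot of memory because it has to store the positions of every token: each docid-token pair needs about 24 bytes. If you do not need token positions, consider using another index type; see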
https://github.com/huichen/wukong/blob/master/types/indexer_init_options.go#L4
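The two points in this reply can be checked with a few lines of Go. The sketch below is not part of wukong: it takes the approximate 24 bytes per docid-token pair quoted above as a rough constant, and it uses the standard runtime package to read the live heap only after forcing a collection, so that uncollected garbage is not counted. The helper names are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// bytesPerPair is the approximate per-pair cost quoted in the thread for
// LocationsIndex; treat it as a rough figure, not a measured constant.
const bytesPerPair = 24

// estimateIndexBytes returns the approximate index size for a given number
// of docid-token pairs.
func estimateIndexBytes(pairs uint64) uint64 {
	return pairs * bytesPerPair
}

// pairsForBytes inverts the estimate: how many pairs fit in a byte budget.
func pairsForBytes(budget uint64) uint64 {
	return budget / bytesPerPair
}

// heapAllocAfterGC forces a garbage collection and then reports the bytes
// of allocated heap objects, so garbage awaiting collection is excluded.
func heapAllocAfterGC() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

func main() {
	const mb = 1 << 20
	const posts = 640000 // post count reported in the thread

	// The thread reports a bit over 800 MB of index for these posts.
	pairs := pairsForBytes(800 * mb)
	fmt.Printf("pairs in 800 MB: %d (%.1f per post)\n",
		pairs, float64(pairs)/posts)

	fmt.Printf("live heap after GC: %d bytes\n", heapAllocAfterGC())
}
```

With these figures, 800 MB corresponds to roughly 35 million docid-token pairs, which is in a plausible range for 640,000 short posts if every token position is stored.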
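1. If I only want to query all IDs by attribute value, with no keywords, how does wukong support that kind of query? From reading the source code, scoring cannot happen without keywords. Would there need to be a special table that stores every docID, so that a search with no keywords is scored by attributes alone?
2. If I first find all docIDs by scoring on attribute values, and then treat whether a keyword appears as just one scoring criterion, can wukong do this?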