yywlo / sphinx-for-chinese
Automatically exported from code.google.com/p/sphinx-for-chinese
What steps will reproduce the problem?
1. Installed mysql-5.5.1 via cmake / make / make install
2. sphinx-VERSION/configure --prefix=/usr/local/sphinx
--with-mysql-includes=/user/local/mysql
3. An error is reported
What is the expected output? What do you see instead?
checking whether to compile with MySQL support... yes
configure: error: invalid MySQL root directory
'includes=/usr/local/mysql/include'; neither bin/mysql_config, nor include/ and
lib/ were found there
What version of the product are you using? On what operating system?
rhel4
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 22 Nov 2011 at 8:57
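For reference, configure rejects a MySQL root directory when it finds neither bin/mysql_config nor the include/ and lib/ subdirectories inside it. Note that the reported command passes --with-mysql-includes=/user/local/mysql ("/user" rather than "/usr", and the includes flag pointed at the MySQL root rather than its include directory); --with-mysql=/usr/local/mysql is the usual form. The layout check can be sketched as follows (an approximation of configure's behavior, not its exact code; the directory is a temporary stand-in, not a real MySQL install):

```python
# Sketch of the layout ./configure accepts for a MySQL root directory:
# it looks for bin/mysql_config, or for both include/ and lib/.
import os
import tempfile

def valid_mysql_root(root):
    """Approximate the check behind 'invalid MySQL root directory'."""
    has_config = os.access(os.path.join(root, "bin", "mysql_config"), os.X_OK)
    has_dirs = (os.path.isdir(os.path.join(root, "include"))
                and os.path.isdir(os.path.join(root, "lib")))
    return has_config or has_dirs

# Stand-in for /usr/local/mysql, built in a temp dir:
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "include"))
os.makedirs(os.path.join(root, "lib"))
print(valid_mysql_root(root))                      # True
print(valid_mysql_root(os.path.join(root, "x")))   # False
```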
# pwd
/usr/local/sphinxcn
# ./bin/mkdict /root/Downloads/sphinx/xdict_1.1.txt etc/xdict
# vi etc/sphinx.conf
source src1
{
......
......
chinese_dictionary = /usr/local/sphinxcn/etc/xdict
}
# ./bin/indexer --all
sphinx-for-chinese 2.1.0-dev (r3361)
Copyright (c) 2008-2012, sphinx-search.com
using config file '/usr/local/sphinxcn/etc/sphinx.conf'...
ERROR: unknown key name 'chinese_dictionary' in
/usr/local/sphinxcn/etc/sphinx.conf line 24 col 20.
FATAL: failed to parse config file '/usr/local/sphinxcn/etc/sphinx.conf'
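Two likely causes, both guesses: either this indexer binary was built from stock Sphinx rather than the sphinx-for-chinese patch (a stock binary would not know the key at all), or chinese_dictionary was placed in the wrong section. In sphinx-for-chinese the key is an index-level option, not a source-level one, roughly:

```
index test1
{
    source             = src1
    # chinese_dictionary is an index-section option in sphinx-for-chinese;
    # the index name above is a placeholder
    chinese_dictionary = /usr/local/sphinxcn/etc/xdict
}
```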
Original issue reported on code.google.com by [email protected]
on 2 Nov 2012 at 9:20
I'll download it and try it out. Thanks, blueflycn
Original issue reported on code.google.com by [email protected]
on 29 Sep 2009 at 6:20
An example: a user has the nickname abc. Before the nickname is changed, indexing works fine. If the nickname is changed to def, then after merging the indexes a search for either abc or def finds this user. How can this problem be avoided?
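The usual Sphinx mechanism for this is a kill-list: the newer (delta) index carries the IDs of documents it supersedes, and those IDs are suppressed when the indexes are merged. A sketch only; the table, column, and variable names below are placeholders, not from this report:

```
# In the delta source: list document IDs whose old copies should be hidden
sql_query_killlist = SELECT id FROM documents WHERE updated_at > @last_index_time

# Merge the delta into the main index, applying the delta's kill-list:
#   indexer --merge main delta --merge-killlists --rotate
```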
Original issue reported on code.google.com by [email protected]
on 20 Apr 2010 at 1:56
Here is our current situation. We used the version from http://www.coreseek.cn and found that results with the word dictionary were not very good, so we simply switched to unigram (per-character) indexing. The results were acceptable, but search performance dropped sharply. So we would like to use bigram segmentation; however, sphinx apparently does not implement it, since its n-gram support only covers 0 and 1. Can this version support bigram segmentation?
A bigram segmentation example:
中华人民共和国 => 中华 华人 人民 民共 共和 和国
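The requested behavior is easy to state precisely. A minimal sketch of character-bigram segmentation in Python, only to illustrate the transform above (this is not part of sphinx):

```python
# Character-bigram (2-gram) segmentation: every overlapping pair of characters
# becomes a token, as in the 中华人民共和国 example.
def bigrams(text):
    """Split a string into overlapping two-character tokens."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

print(" ".join(bigrams("中华人民共和国")))
# → 中华 华人 人民 民共 共和 和国
```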
Original issue reported on code.google.com by [email protected]
on 21 Sep 2009 at 8:08
http://www.coreseek.com/products/ft_down/
What is the difference between your project and theirs?
Original issue reported on code.google.com by [email protected]
on 10 Jun 2009 at 1:33
I tested this. On the command line, querying love -kitty directly with search correctly returns results that contain love but not kitty.
With the PHP client, however, the results behave as if the syntax were unsupported: both love & kitty and love -kitty return the results for the two words love and kitty.
Maybe it is a sphinx problem.
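One possible cause, a guess rather than a confirmed diagnosis: the PHP API's default match mode does not interpret operators like -kitty, while the search CLI does. Switching the client to extended matching before querying would be the thing to try:

```
// PHP API sketch: enable extended matching so -kitty is treated as an operator
$cl = new SphinxClient();
$cl->SetServer("localhost", 9312);
$cl->SetMatchMode(SPH_MATCH_EXTENDED2);
$res = $cl->Query("love -kitty", "*");
```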
Original issue reported on code.google.com by [email protected]
on 4 Jul 2010 at 9:16
[root@localhost ~]# indexer news --rotate
sphinx-for-chinese 2.1.0-dev (r3006)
Copyright (c) 2008-2011, sphinx-search.com
using config file '/usr/local/etc/sphinx.conf'...
indexing index 'news'...
collected 81208 docs, 228.0 MB
WARNING: sort_hits: merge_block_size=140 kb too low, increasing mem_limit may
improve performance
ERROR: index 'news': /srv/data/sphinx/news.tmp.spp: write error: 479232 of
524288 bytes written.
*** Oops, indexer crashed! Please send the following report to developers.
Sphinx 2.1.0-dev (r3006)
-------------- report begins here ---------------
Current document: docid=81208, hits=654017
Current batch: minid=0, maxid=0
Hit pool start: docid=81107, hit=2623452
-------------- backtrace begins here ---------------
Program compiled with gcc 4.1.2
Host OS is Linux localhost.localdomain 2.6.18-238.12.1.el5 #1 SMP Tue May 31
13:23:01 EDT 2011 i686 i686 i386 GNU/Linux
Stack bottom = 0x0, thread stack size = 0x14000
begin of system backtrace:
begin of system symbols:
indexer(_Z12sphBacktraceib+0x262)[0x819f102]
indexer(_Z7sigsegvi+0x104)[0x808b124]
[0x35b420]
indexer(_ZN10CSphString10SetSprintfEPKcz+0xa8)[0x80992d8]
indexer(_Z17sphWriteThrottlediPKvxPKcR10CSphString+0x350)[0x80a7230]
indexer(_ZN10CSphWriter5FlushEv+0x70)[0x80a7f00]
indexer(_ZN10CSphWriterD1Ev+0x25)[0x80c6575]
indexer(_ZN17CSphDictCRCTraitsD2Ev+0xd6)[0x80c69a6]
indexer(_ZN11CSphDictCRCILb1EED0Ev+0x18)[0x81198c8]
indexer(_ZN9CSphIndexD2Ev+0x40)[0x80b8a80]
indexer(_ZN13CSphIndex_VLND0Ev+0x25d)[0x80ec64d]
indexer(_Z7DoIndexRK17CSphConfigSectionPKcRK17SmallStringHash_TIS_EbP8_IO_FILE+0
x217d)[0x80943ad]
indexer(main+0x2a09)[0x80977d9]
/lib/libc.so.6(__libc_start_main+0xdc)[0x755e9c]
indexer(__gxx_personality_v0+0x205)[0x808a9f1]
Backtrace looks OK. Now you have to do following steps:
1. Run the command over the crashed binary (for example, 'indexer'):
nm -n indexer > indexer.sym
2. Attach the binary, generated .sym and the text of backtrace (see above) to the bug report.
Also you can read the section about resolving backtraces in the documentation.
-------------- backtrace ends here ---------------
Original issue reported on code.google.com by [email protected]
on 20 Mar 2012 at 5:24
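A partial write (479232 of 524288 bytes) during index flush usually points at the filesystem, most often a full disk or quota on the index volume, rather than a genuine indexer bug; the crash itself happens later, while destructors try to flush again. A quick way to check free space before indexing (the path in the comment is the one from the log; "/" is used here only so the sketch runs anywhere):

```python
# Report free space on the index volume before running indexer.
import shutil

usage = shutil.disk_usage("/")  # on the reporter's box: /srv/data/sphinx
print(f"free: {usage.free / 2**20:.1f} MiB")
```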
What word-segmentation algorithm does sphinx-for-chinese use, and how does it differ from coreseek?
Original issue reported on code.google.com by [email protected]
on 2 Jun 2010 at 2:18
I have created a QQ group, 102348532. Everyone is welcome to join!
Original issue reported on code.google.com by [email protected]
on 16 Jan 2010 at 11:23
See:
http://www.sphinxsearch.com/forum/view.html?id=3439
This bug (delta/incremental indexing does not work) also exists in the sfc0.0.3_alpha release. How did you solve it? Or is there a newer version?
Original issue reported on code.google.com by [email protected]
on 18 Sep 2009 at 2:43
After fumbling my way through the installation, all I get is mojibake... frustrating. Could you write an installation guide?
Original issue reported on code.google.com by [email protected]
on 17 Nov 2009 at 10:36
For example, when searching by gender (m, f), a search for m occasionally returns f results.
Original issue reported on code.google.com by [email protected]
on 23 Nov 2009 at 8:18
Following the example in the book "Sphinx Search Beginner Guide", I tried distributed search. A table with over 1.8 million records was split in half, and one index was built for each half (items and items-2), using unigram (per-character) segmentation for Chinese. Two searchd instances were started on the same machine (ports 9312 and 9313), then tested with a Java client:
java test -p 9312 -i master -e <query term>
Searching for an English word works fine; the results match those of the earlier non-distributed search. But when searching for a Chinese word, results are missing. For example, searching for 病 gives:
'病' found 915 times in 906 documents
while the earlier non-distributed search gave:
'病' found 7837 times in 7825 documents
Distributed search works for English words but loses results for Chinese words. What could be the cause?
The two conf files are as follows:
# dis-1.conf
source items
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass = test
sql_db = data_monitor
sql_query_pre = SELECT @total := count(sql_id) FROM sql_log_table
sql_query_pre = SET @sql = CONCAT('SELECT * FROM sql_log_table limit 0,', CEIL(@total/2))
sql_query_pre = PREPARE stmt FROM @sql
sql_query = EXECUTE stmt
}
index items
{
source = items
path = d:/data/items-distributed
morphology = none
min_word_len = 1
charset_type = utf-8
min_prefix_len = 0
html_strip = 1
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
ngram_len = 1
ngram_chars = U+3000..U+2FA1F
}
indexer
{
mem_limit = 128M
}
index master
{
type = distributed
charset_type = utf-8
# Local index to be searched
local = items
# agent (index) to be searched
agent = localhost:9313:items-2
}
searchd
{
listen = 9312
log = d:/log/searchd-distributed.log
query_log = d:/log/query-distributed.log
max_children = 30
max_matches = 10000000
seamless_rotate = 1
preopen_indexes = 1
unlink_old = 1
compat_sphinxql_magics = 0
pid_file = d:/log/searchd-distributed.pid
binlog_path =
}
# dis-2.conf
source items
{
type = mysql
# we will use remote host (first server)
sql_host = localhost
sql_user = root
sql_pass = test
sql_db = data_monitor
sql_query_pre = SET NAMES utf8
sql_query_pre = SELECT @total := count(sql_id) FROM sql_log_table
sql_query_pre = SET @sql = CONCAT('SELECT * FROM sql_log_table limit ', CEIL(@total/2), ',', CEIL(@total/2))
# Prepare the sql statement
sql_query_pre = PREPARE stmt FROM @sql
# Execute the prepared statement. This will return rows
sql_query = EXECUTE stmt
# Once documents are fetched, drop the prepared statement
sql_query_post = DROP PREPARE stmt
}
index items-2
{
source = items
path = D:/data/items-2-distributed
morphology = none
min_word_len = 1
charset_type = utf-8
min_prefix_len = 0
html_strip = 1
charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
ngram_len = 1
ngram_chars = U+3000..U+2FA1F
}
indexer
{
mem_limit = 128M
}
searchd
{
listen = 9313
log = d:/log/searchd-distributed-2.log
query_log = d:/log/query-distributed-2.log
max_children = 30
max_matches = 10000000
seamless_rotate = 1
preopen_indexes = 1
unlink_old = 1
compat_sphinxql_magics = 0
pid_file = d:/log/searchd-distributed-2.pid
binlog_path =
}
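One difference between the two files stands out: the source in dis-1.conf has no `SET NAMES utf8` pre-query, while dis-2.conf's does (dis-1.conf also lacks the `DROP PREPARE stmt` post-query, though that is harmless here). If the first indexer's MySQL connection defaults to a non-UTF-8 charset, the Chinese text it receives would be mangled before indexing, which would explain missing Chinese matches while ASCII terms still work. This is a guess, not a confirmed diagnosis; the fix would be to make the pre-queries identical:

```
# In dis-1.conf, source items: add the same charset pre-query used in dis-2.conf
sql_query_pre = SET NAMES utf8
sql_query_pre = SELECT @total := count(sql_id) FROM sql_log_table
```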
Original issue reported on code.google.com by [email protected]
on 8 Jan 2013 at 11:51
When using the PHP API, it shows:
bool(false) string(71) "connection to localhost:9312 failed (errno=111,
msg=Connection refused)"
Original issue reported on code.google.com by [email protected]
on 19 Mar 2011 at 8:06
Is there a detailed tutorial for installing, configuring, and using sfc on Linux?
Original issue reported on code.google.com by [email protected]
on 29 Mar 2010 at 3:54
Entered the command:
D:\wamp\www\sphinx\bin>indexer.exe --all
Output:
using config file './sphinx.conf'...
indexing index 'idx_venues'...
ERROR: index 'idx_venues': column number 6 has no name.
total 0 docs, 0 bytes
total 0.090 sec, 0 bytes/sec, 0.00 docs/sec
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 0 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
The error is as follows:
ERROR: index 'idx_venues': column number 6 has no name.
What does "column number" refer to in an index?
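"Column number 6" refers to the sixth column of the result set returned by sql_query. Sphinx requires every selected column to have a name, so an expression without an alias at that position would trigger exactly this error. A sketch with placeholder column names, not the reporter's actual query:

```
# Fails: the sixth column is an expression with no name
# sql_query = SELECT id, title, city, capacity, rating, price * 2 FROM venues

# Works: give the expression an alias
# sql_query = SELECT id, title, city, capacity, rating, price * 2 AS price2 FROM venues
```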
Original issue reported on code.google.com by [email protected]
on 9 Jun 2010 at 4:20