pouchdb-community / pouchdb-quick-search
Full-text search engine on top of PouchDB
License: Apache License 2.0
Hi,
I'm using this module with lunr.jp.js, but it doesn't work.
The problem is that lunr.jp.js replaces the original lunr.tokenizer with a Japanese tokenizer.
pouchdb-quick-search does not seem to use global.lunr.tokenizer, because it uses an internally initialized lunr that doesn't have the Japanese tokenizer.
I want to replace the tokenizer in pouchdb-quick-search with the Japanese one, but the plugin doesn't expose lunr.
It would be nice to add an option that accepts an arbitrary lunr instance.
I'm building a Magic: The Gathering search engine that will work both online and offline.
I have a fairly large dataset for client-side search, about 16.9 MB of JSON. It would be awesome if I could somehow build the index on the server and send the built version to the client.
Do you think this would be possible?
Given a document like:
{
  "some_field": "foo",
  "meta": [
    {
      "nested": "First meta"
    },
    {
      "nested": "Second meta"
    }
  ]
}
How can I index the "First meta" and "Second meta" values? I tried meta.nested and meta[].nested but they do not seem to work.
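One possible workaround, sketched here under the assumption that the field paths tried above do not resolve in the installed version: flatten the nested values into a top-level field before the document is stored, then list that field in `fields`. The helper `collectNested` and the field name `meta_text` are hypothetical and not part of pouchdb-quick-search.

```javascript
// Collect every value found at a dotted path, descending into arrays on the way.
// Hypothetical helper for illustration; not part of pouchdb-quick-search.
function collectNested(doc, path) {
  var values = [doc];
  path.split('.').forEach(function (key) {
    var next = [];
    values.forEach(function (value) {
      // Arrays are expanded so that each element is visited.
      var items = Array.isArray(value) ? value : [value];
      items.forEach(function (item) {
        if (item !== null && typeof item === 'object' && key in item) {
          next.push(item[key]);
        }
      });
    });
    values = next;
  });
  // A path that ends on an array yields its elements, not the array itself.
  return values.reduce(function (flat, value) {
    return flat.concat(value);
  }, []);
}

var doc = {
  some_field: 'foo',
  meta: [{ nested: 'First meta' }, { nested: 'Second meta' }]
};

// Store the flattened text in a top-level field (assumed name: meta_text).
doc.meta_text = collectNested(doc, 'meta.nested').join(' ');
console.log(doc.meta_text);
```

A search could then use fields: ['meta_text'], at the cost of storing the text twice.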
I want to query for
class == "book" && bookname == "pouch"
but there is no example of this in the docs.
Below is the example. When I change the query to "me" or "it", the result is empty. Why?
Thanks
(function () {
  var pouch;
  var doc = {_id: 'mydoc', title: "Guess who?", text: "It's-a me, Mario!"};
  function log(str) {
    document.getElementById('display').innerHTML += str + '\n';
  }
  // destroy the db so you can see the document being put
  // each time you load the page. obviously you wouldn't want
  // to do this in production.
  PouchDB.destroy('mydb').then(function () {
    pouch = new PouchDB('mydb');
  }).then(function () {
    log('putting doc: ' + JSON.stringify(doc));
    return pouch.put(doc);
  }).then(function () {
    var query = {
      query: 'it',
      // query: 'me'
      // query: 'guess' (this one works)
      fields: ['title', 'text'],
      include_docs: true,
      highlighting: true
    };
    log('searching with query: ' + JSON.stringify(query));
    return pouch.search(query);
  }).then(function (res) {
    log('result: ' + JSON.stringify(res));
  });
})();
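A likely explanation, offered as an assumption rather than a confirmed diagnosis: lunr's default pipeline drops English stop words, and "it" and "me" are on that list while "guess" is not. The sketch below imitates the effect with a deliberately tiny stop word list and a simplified tokenizer; neither matches lunr's real implementation.

```javascript
// A small, illustrative subset of English stop words; lunr's real list is longer.
var STOP_WORDS = ['a', 'i', 'it', 'me', 'the', 'who'];

// Simplified tokenizer (assumption): lowercase, split on anything that is
// not a letter or digit, then drop stop words.
function indexableTerms(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(function (term) {
      return term.length > 0 && STOP_WORDS.indexOf(term) === -1;
    });
}

console.log(indexableTerms('Guess who?'));
console.log(indexableTerms('it'));
```

If a query reduces to an empty term list like this, there is nothing left to look up, which would match the empty result seen above.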
Full disclosure: I am a total noob when it comes to FTS and have never used lunr before.
From what I understand of the source code, the actual searching of terms from the query is done here:
https://github.com/nolanlawson/pouchdb-quick-search/blob/master/index.js#L174-L176
Wouldn't it be possible, if queryTerms.length is 1, to do a startKey/endKey search instead? This way querying for "somethi" would get us docs containing the term "something".
I imagine some adaptations are needed in the rest of the search algorithm. If you think it might work and would merge a PR about this, I will work on it over the weekend.
Thanks
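The startKey/endKey idea can be illustrated without the plugin. This is a minimal sketch over a plain sorted array, not the plugin's index; `prefixRange` is a hypothetical name, and the example ignores stemming, which may change what the indexed terms actually look like.

```javascript
// Return every term in a sorted list that starts with `prefix`, using the
// same inclusive bounds a startkey/endkey range query would use.
function prefixRange(sortedTerms, prefix) {
  var startkey = prefix;
  var endkey = prefix + '\uffff'; // high sentinel closes the range
  return sortedTerms.filter(function (term) {
    return term >= startkey && term <= endkey;
  });
}

var terms = ['some', 'something', 'somewhere', 'sonic'].sort();
console.log(prefixRange(terms, 'somethi'));
```

A real index would seek to the start key instead of filtering the whole list; the filter only keeps the sketch short.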
How can I search with an empty filter in order to get all records?
If I don't care about tokenization or stemming, is it still possible to make search work for languages like Korean or Chinese? I thought some kind of simple string match would take place if tokenization and stemming are not used, but that doesn't seem to be the case: when I search for Korean or Chinese text, I don't get any results back.
Is it just a matter of adding a new language in lunr-languages?
As the docs say, to set up this library in the browser (as opposed to Node.js) we include two script tags, one for pouchdb and one for pouchdb-quick-search, in that order. The first script tag attaches PouchDB to the window object. The second script tag looks for window.PouchDB to register the plugin: window.PouchDB.plugin(exports);
RequireJS does not attach PouchDB to the window object. Consequently, pouchdb-quick-search cannot be registered as a plugin. Any suggestions?
Hi, I want to create a book reader using PouchDB.
The db is static because the book will not change in the future, so filling it is a one-time operation.
The idea is to fill the db automatically by reading the text from a plain CSV file, where each line is a record and the fields are separated by a comma or a semicolon.
Can I connect to the db from a Linux shell?
I'm thinking of creating a bash script with something like:
# PouchDBinsertCommand is a placeholder for an insert command
while IFS= read -r Line; do
  PouchDBinsertCommand "$Line"
done < book.csv
Does someone have a solution for that?
Regards
MaX
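Since PouchDB is a JavaScript library, one option is a small script rather than bash. The sketch below only covers the conversion step: it turns CSV text into an array of documents. It assumes that fields never contain quoted separators, and the function name, field names and _id scheme are all hypothetical.

```javascript
// Turn CSV text into an array of documents, one per non-empty line.
// Assumption: no quoted fields, so splitting on "," or ";" is safe.
function csvToDocs(csvText, fieldNames) {
  return csvText
    .split(/\r?\n/)
    .filter(function (line) { return line.trim().length > 0; })
    .map(function (line, i) {
      // Zero-padded ids keep the records in file order when sorted by _id.
      var doc = { _id: 'line-' + String(i + 1).padStart(6, '0') };
      line.split(/[,;]/).forEach(function (value, j) {
        doc[fieldNames[j] || 'field' + j] = value.trim();
      });
      return doc;
    });
}

var docs = csvToDocs('1,Chapter one\n2;Chapter two\n', ['number', 'text']);
console.log(docs);
```

With PouchDB loaded, the resulting array could be written in one call with db.bulkDocs(docs).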
To include this library as a plugin for PouchDB, the doc said:
var PouchDB = require('pouchdb');
PouchDB.plugin(require('pouchdb-quick-search'));
My question is: where is this require() method coming from? Is it related to Node.js? (I don't know anything about Node.js.)
I am using RequireJS + Angular and this is my attempt.
define(
  [
    'angular'
    ,'pouchdb'
    ,'pouchdb_quick_search'
  ]
  ,function
  (
    angular
    ,PouchDB
    ,pouchdb_quick_search
  )
  {
    var mod = angular.module('service/db',[]);
    mod.factory('service/db/get',function(){
      return function(db_name){
        PouchDB.plugin(pouchdb_quick_search);
        return new PouchDB(db_name);
      }
    });
  })
The PouchDB instance returned from 'service/db/get' has an undefined search method.
Thank you in advance.
I would like to remove stemming and possibly trimming because I search for non-language words with greek letters in them. Is it possible to do it?
I've tried to pass a regex (with 'igm' flags) without much success... I want to be able to get "Mario" results even if I look for "ario"
Any plan to support it?
I'm mainly searching across strings of numbers, rather than English words, and it seems like the tf-idf algorithm isn't really suited to this. For example, if there is an entry with id = 123456 and I search for 3456 then it doesn't show up as a result, despite there being no other document with a 3456 string in it.
How would one go about changing the search algorithm to something else?
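One technique that fits token-based indexes is suffix expansion: index every suffix of a token, so that a substring query becomes a prefix match on some suffix. The sketch below is an assumption about how this could be approached, not something the plugin offers; `suffixes` and `matchesSubstring` are hypothetical names.

```javascript
// List all suffixes of a token that are at least `minLength` characters long.
function suffixes(token, minLength) {
  var out = [];
  for (var i = 0; i + minLength <= token.length; i++) {
    out.push(token.slice(i));
  }
  return out;
}

// A query occurs inside the token exactly when some suffix starts with it.
// Queries shorter than `minLength` are not guaranteed to be found.
function matchesSubstring(token, query, minLength) {
  return suffixes(token, minLength).some(function (suffix) {
    return suffix.indexOf(query) === 0;
  });
}

console.log(suffixes('123456', 3));
console.log(matchesSubstring('123456', '3456', 3));
```

The trade-off is index size: a token of length n produces up to n entries.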
It seems that there is an index object identified by the search parameters. What do you think about making it explicit?
var index = pouch.searchIndex({ fields: ..., filter: ... });
index.search({ query: ..., include_docs: ... });
This way it would be easier to maintain multiple indexes, build them in advance, and reason about them.
Hi, I'm running into an odd bug where the filter option for searching seems to be cached (at least I think that's what's happening).
I've managed to reproduce the issue with this code: http://bl.ocks.org/Darkle/9bcf54994859b53dc3da
The second search should return doc7, but it seems to have cached the first filter, even though the domainToSearchFor is changed.
Also, I found that if I ran the code in the browser's dev console and did the searches manually, and then changed the code in the filter for the second search (even just adding a blank new line), it works as expected and updates the filter.
I know I said in the readme that you don't need this for prefix search, but I realize now that you do, because we have the benefit of being case-insensitive.
I want to get all docs by passing an empty query "". Is it possible?
Hello!
I'm having problems searching in a nested array with this doc structure:
{
  "_id": "1077",
  "_rev": "1-805dcb10756e24f3cf31b5eaf826a430",
  "updated": "0",
  "name": "Swegon",
  "persons": [
    {
      "firstname": "Henrik",
      "person_id": "2005",
      "lastname": "Bork",
      "updated": "1319029129"
    },
    {
      "firstname": "Ernst Børge",
      "person_id": "2006",
      "lastname": "Johansen",
      "updated": "0"
    },
    {
      "firstname": "Leif",
      "person_id": "2463",
      "lastname": "Hamrebø",
      "updated": "0"
    }
  ]
}
Search works with the first array item, like firstname, but does not work with the others, like person_id or lastname.
Query:
var opts = {
  fields: ['persons[].lastname'],
  query: 'Johansen'
};
I'm a little bit confused with lunr implementation in this plugin.
Let me explain...
In the past, I've tried to add multi-language support without success (I'm not using RequireJS and would like to avoid it if possible; I am using Angular 2, though), and I don't really understand how to use pipelines (I would like to remove accents when indexing). The problem is that I don't know how to follow the lunr tutorials, because this plugin wraps lunr with its own implementation and that is confusing.
It would be awesome if someone could give me an example of implementing a pipeline that modifies the index (such as removing accents) and of adding multi-language support (without using RequireJS).
We could even add these examples to the documentation.
Apparently this is fixed on the lunr-languages side, so we don't need to require this anymore: MihaiValentin/lunr-languages#2.
Is it possible?
I'm a little bit confused about how to enable language support (in a web browser).
I've tried to add the following:
var idx = lunr(function () {
  this.use(lunr.multiLanguage('en', 'fr'));
});
But I get the following warnings:
Overwriting existing registered function: lunr-multi-trimmer-en-fr
Function is not registered with pipeline. This may cause problems when serialising the index.
I tried different things, like adding lunr.Index.load(idx); but then I get another warning (version mismatch: current 0.7.1 importing undefined) and I can't even build the index...
Any suggestions would be really appreciated.
Somehow I cannot get search to find anything except if the query is work or pattern. Other terms like computer, lion and dog do not return any results. Test code is below. None of these terms are stopwords, so I don't understand why no results are found.
var assert = require('assert');
var uuid = require('node-uuid');
var PouchDB = require('pouchdb');
PouchDB.plugin(require('pouchdb-quick-search'));
var text = 'lion';
var db = new PouchDB('test-db');
var obj = {
  _id: uuid.v4(),
  text: text
};
function build(cb) {
  cb();
}
db.put(obj, function (err, res) {
  if (err) console.log(err, res);
  db.get(obj._id, function (err, o) {
    assert(!err && o._id === obj._id);
    var options = {
      query: text,
      fields: ['text'],
    };
    db.search(options, function (err, result) {
      assert(!err);
      assert(result.total_rows > 0);
      build(function (err) {
        assert(!err);
        db.search(options, function (err, result) {
          assert(!err);
          assert(result.total_rows > 0);
        });
      });
    });
  });
});
Is it possible? My database contains multiple languages. It would be nice to be able to search not just English.
Search is working for me very well on Chrome and Firefox. However as soon as I call search on Safari (Mac OS X or iOS), I get this dialog box:
Allow this website to use space on your disk? The website “http://localhost:3000” is requesting 5 MB of disk space to store “_pouch_glossary-search-41aedf802eabccbf360b0ea56f611df1” as a database on your disk. Currently, this website is allowed to use 5 MB of disk space.
When I click "Allow" the dialog box goes away and then comes back up again and again. This happens about 20 times, and when the dialog box finally goes away, the search index has not been created.
Here's the relevant code:
return db.search({
  fields: ['name', 'tags'],
  build: true
});
Hi.
I'm having problems with search. I hope somebody can help me!
Example:
{
  query: "nguyễn",
  fields: ["display_name"],
  highlighting: true,
  include_docs: true
}
However, I have docs whose display_name is nguyen or nguyễn, and I want to list all of them.
I would also like something similar to a LIKE query in MySQL, for example:
{
  query: "%guye%",
  fields: ["display_name"],
  highlighting: true,
  include_docs: true
}
Here I want to show all docs whose display_name contains guye anywhere,
for example: nguyễn, nguyen, nguye, guyen ...
Thanks for watching!
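Both requests can be approximated outside the index. The sketch below folds accents with Unicode normalization and then does a LIKE-style "contains" scan over documents already in memory. It is a plain-JavaScript illustration, not a feature of the plugin; `foldAccents` and `likeSearch` are hypothetical names, and letters with no decomposition (such as đ) are left untouched.

```javascript
// Strip combining diacritical marks after canonical decomposition,
// so that "nguyễn" and "nguyen" compare as equal.
function foldAccents(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();
}

// Rough equivalent of SQL "LIKE '%fragment%'", ignoring case and accents.
function likeSearch(docs, field, fragment) {
  var needle = foldAccents(fragment);
  return docs.filter(function (doc) {
    return typeof doc[field] === 'string' &&
      foldAccents(doc[field]).indexOf(needle) !== -1;
  });
}

var people = [
  { _id: '1', display_name: 'nguyễn' },
  { _id: '2', display_name: 'nguyen' },
  { _id: '3', display_name: 'nguye' },
  { _id: '4', display_name: 'guyen' },
  { _id: '5', display_name: 'trần' }
];
console.log(likeSearch(people, 'display_name', 'guye'));
```

A scan like this reads every document, so it suits small datasets; storing a folded copy of the field and indexing that copy is a possible alternative.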
Is it possible to create the index with a filter so we can index before we query (for a faster "first" query)?
When I create it, I get the ok response, but the first query still takes a lot of time.
As discussed in the issue:
ionic-team/ionic-framework#8356
This library has a compatibility issue with Rollup because of a 'strict error' in the "md5-jkmyers" library, which seems to be no longer maintained.
Knowing that, I would like to know if you could switch to the library used in pouchdb, js-spark-md5, or else fork the "md5-jkmyers" repository and apply the bug fixes to it.
Thank you.
Hi,
I am using PouchDB on Node.js with this plugin, but I am not able to search because of the issue mentioned above...
var PouchDB = require('pouchdb');
PouchDB.plugin(require('pouchdb-quick-search'));
var db = new PouchDB('http://127.0.0.1:5984/users');
db.search({
  query: 'maria',
  fields: ['name']
}).then(function(result) {
  console.log(result,"========")
  res.send(200,result);
  // handle results
}).catch(function(err) {
  // handle error
  console.log(err,"========")
  res.send(500,err);
});
The output of the above code is:
{
  "code": "ESOCKETTIMEDOUT",
  "connect": false,
  "status": 500
}
Then I checked the PouchDB server logs and found:
Please double-check your map/reduce function.
ReferenceError: isFiltered is not defined
at evalmachine.:3:9
at /usr/local/lib/node_modules/pouchdb-server/node_modules/pouchdb-mapreduce/lib/index.js:147:14
at tryMap (/usr/local/lib/node_modules/pouchdb-server/node_modules/pouchdb-abstract-mapreduce/lib/index.js:166:7)
at createDocIdsToChangesAndEmits (/usr/local/lib/node_modules/pouchdb-server/node_modules/pouchdb-abstract-mapreduce/lib/index.js:589:13)
at processBatch (/usr/local/lib/node_modules/pouchdb-server/node_modules/pouchdb-abstract-mapreduce/lib/index.js:572:37)
at process._tickCallback (internal/process/next_tick.js:103:7)
I want to know what I am doing wrong. I have more than 30k users in the database.
Running this in Chrome with PouchDB 5.4.4, using the latest search dist.
Like below:
db.search({
  query: 'foo',
  fields: ['_id'],
  include_docs: true
});
Hi,
I have been looking for a full-text index that works both in the browser (offline) and on CouchDB for a while. My current setup uses a PostgreSQL cache with a full-text index on the server and a primitive search using a simple map function in Pouch.
Your solution using Lunr is much better. I would use it on the server too, but I don't want to duplicate my data from CouchDB to PouchDB (it can be very large, and that is why I want to dump the PG index).
I did some tests and managed to create a CouchDB map/reduce view using CommonJS (in my fork repo https://github.com/jfgirard/pouchdb-quick-search). I added the required libs (lunr + stemmerSupport, lunr-LANG if needed) with a tweaked version of your map function. All 35 tests pass (with the added missing stale option pouchdb/mapreduce#197) using TEST_DB=http://localhost:5984/quick-search.
Is it something you want to add to your code?
If yes, I can make a PR... But I had to make some changes to hook in the code that adds and removes the design documents in CouchDB. With your help, it could be done better. Also, it works only with Pouch in Node.js and reads the libs from the couchdb_libs folder.
Jeff
Great plugin. Just curious if partial word matching is available?
As in, user searches for:
mega
and the document title (not _id) is:
megaman
I know I can handle this with a secondary index; I'm just curious whether this plugin supports it.
Alternatively, the whole peer dependency could be removed.
npm ERR! peerinvalid Peer [email protected] wants pouchdb@>= 2.2.0
I'm having trouble: the promise always stays pending.
var promise = db.search({query:'java',fields: [ 'value.volumeInfo.title', 'value.volumeInfo.subtitle', 'value.volumeInfo.publisher', 'value.volumeInfo.authors', 'value.volumeInfo.publishedDate' ]})
and I get the following value for promise:
Promise {cancel: function, [[PromiseStatus]]: "pending", [[PromiseValue]]: undefined}
Any idea?
PouchDB Version 5.4.4
When developing and testing on my desktop computer with IE 11 on Windows 10 (sorry, I'm a Visual Studio guy and I find it convenient to build and test in IE; it's a work thing), the PouchDB database fails to persist data whenever the app shuts down. Upon a restart of the app the data is gone. Also, PouchDB will frequently throw an error message saying the database has been corrupted. When that happens it is no longer possible to use the database, and you must destroy it and create a new one to recover.
When I type "test" it should search from
2 and 4 work fine with the following code, but 1 and 3 don't. Could anyone please help?
function searchPages(searchTerm) {
  var deferred = $q.defer();
  pouchdb.query(searchMap, {
    startkey : searchTerm,
    endkey : searchTerm + '\uFFFF',
    include_docs : true
  }).then(function (result) {
    console.log(result);
    var results = result.rows.map(function(r) {
      return r.doc;
    });
    deferred.resolve(results);
  }).catch(function (err) {
    // handle errors
    deferred.reject(err);
  });
  function searchMap(doc) {
    if (doc.type === 'page') {
      emit(doc.content);
    }
  }
  return deferred.promise;
}
P.S. I've already gone through the comments at #8
I then decided to use map/reduce, so this issue might not exactly be related to this plugin, still wanted to give it a try :)
I'm asking because you said we could ask for support for other languages.
There is info on this here: olivernn/lunr.js#16
And implementations for lunr here: https://github.com/MihaiValentin/lunr-languages
But you probably know that.
Hi,
Does this plugin support Arabic search?
Also, can I use it instead of the find plugin? What is the main difference between them?
I'm using a map/reduce fork for performance reasons, but I should update because it's had some changes recently.
I noticed a difference in the way words are tokenized compared to PostgreSQL.
select to_tsvector('Pseudo-Mercator');
"'mercat':3 'pseudo':2 'pseudo-merc':1"
Basically, PG indexes both "Pseudo-Mercator" and the sub-words "Pseudo" and "Mercator".
Searching for "mercator" gives me a result with PG.
But because Lunr only tokenizes on whitespace, a search for "mercator" won't work.
I could create an afterTokenizer function to split each token and add the parts to the list.
function afterTokenizer(tokens) {
  var split;
  tokens.forEach(function (token) {
    split = token.split(/-/g);
    if (split.length > 1) {
      tokens = tokens.concat(split);
    }
  });
  return tokens;
}
So, index.pipeline.run(lunr.tokenizer(text)); would become index.pipeline.run(afterTokenizer(lunr.tokenizer(text)));
Is this the best way to achieve the same behavior?
I need a way to filter the documents to include in the index. For example, documents with the property trashed set to true would be ignored. Another example is to index only docs of a specific type (when multiple doc types share the same property, such as name or title).
I want to avoid applying the filter to the results, since that makes queries with limit and skip more complicated and less efficient.
It would also be faster to build the index if it contains only the docs I want to search for.
With Couchdb-Lucene, the "fulltext" function defined in the design document allows me to filter which docs to index.
This is my attempt to achieve it: jfgirard@cd1a926
It relies on an "evil" new Function call, though.
Is this something you want?
J
I have the search in my AngularJS controller.
I do get the results fine.
However, the results are not passed on to my Angular template.
I am updating the scope variable in the .then() function.
Is there another event I have to use?
Is it possible?
I am trying to use this plugin with webpack.
I'm sure that pouchdb.js and this plugin are loaded successfully.
Following your instructions:
var PouchDB = require('pouchdb');
PouchDB.plugin(require('pouchdb-quick-search'));
var pouch = new PouchDB('mydb');
var doc = {_id: 'mydoc', title: "Guess who?", text: "It's-a me, Mario!"};
pouch.search({
  query: 'your query here',
  fields: ['title', 'text']
}).then(function (res) {
  // handle results
}).catch(function (err) {
  // handle error
});
But I get the error “TypeError: pouch.search is not a function”.
Is anyone using this plugin successfully with webpack?
Hi there!
In my database the content of the property "content" is wrapped in HTML, e.g.
Test Page 1
this is page 1 to test any search function
The doc structure is very easy: _id, displayName and content.
The search is based on field "content".
The search will not return any result. If I remove all HTML tags the search function will work.
Do you have any ideas how to make my database searchable?
Regards
Carsten
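One way to approach this is to keep the HTML for display and add a plain-text copy of it for searching. The sketch below uses a naive regular expression, which is an assumption that only holds for simple markup (it does not decode entities in general or handle script and style blocks). `stripHtml`, the field name `content_text` and the sample markup are hypothetical.

```javascript
// Naive tag stripper: replace tags with spaces, then collapse whitespace.
function stripHtml(html) {
  return html
    .replace(/<[^>]*>/g, ' ')  // drop anything that looks like a tag
    .replace(/&nbsp;/g, ' ')   // the one entity handled in this sketch
    .replace(/\s+/g, ' ')
    .trim();
}

var page = {
  _id: 'page1',
  displayName: 'Test Page 1',
  // Sample markup made up for this example.
  content: '<h1>Test Page 1</h1><p>this is page 1 to test any search function</p>'
};

// Plain-text copy intended for the search index.
page.content_text = stripHtml(page.content);
console.log(page.content_text);
```

The search would then list content_text in fields instead of content.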
I came across this website and was wondering if it could be used instead of Lunr. It brings more features (such as being able to search specific fields).
Can we get total_rows in the response, similar to allDocs?
I tried to do a simple query, but I only get
Object {error: "ReferenceError", name: "ReferenceError", reason: "isFiltered is not defined", message: "isFiltered is not defined", status: 500}
Any ideas?