
norch's Introduction

Join the chat at https://gitter.im/fergiemcdowall/search-index

      ___           ___           ___           ___           ___      
     /\__\         /\  \         /\  \         /\  \         /\__\     
    /::|  |       /::\  \       /::\  \       /::\  \       /:/  /     
   /:|:|  |      /:/\:\  \     /:/\:\  \     /:/\:\  \     /:/__/      
  /:/|:|  |__   /:/  \:\  \   /::\~\:\  \   /:/  \:\  \   /::\  \ ___  
 /:/ |:| /\__\ /:/__/ \:\__\ /:/\:\ \:\__\ /:/__/ \:\__\ /:/\:\  /\__\ 
 \/__|:|/:/  / \:\  \ /:/  / \/_|::\/:/  / \:\  \  \/__/ \/__\:\/:/  / 
     |:/:/  /   \:\  /:/  /     |:|::/  /   \:\  \            \::/  /  
     |::/  /     \:\/:/  /      |:|\/__/     \:\  \           /:/  /   
     /:/  /       \::/  /       |:|  |        \:\__\         /:/  /    
     \/__/         \/__/         \|__|         \/__/         \/__/     

Deploy

npm install norch and then start with norch

or programmatically:

require('norch')(options, function(err, norch) {
  // Norch server started on http://localhost:3030 (or the specified host/port)
})

Put stuff in

curl -X POST -d @myData.json http://localhost:3030/add (where myData.json is a newline separated file of JSON objects)
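
As a rough sketch of how such a newline-separated file could be produced, the snippet below serializes an array of documents one per line. The function name `toNewlineDelimitedJson` and the sample documents are illustrative assumptions, not part of Norch.

```javascript
// Serialize an array of documents into newline-separated JSON,
// the shape described above for myData.json.
// NOTE: toNewlineDelimitedJson is a hypothetical helper, not a Norch API.
function toNewlineDelimitedJson(docs) {
  // JSON.stringify without an indent argument never emits raw newlines,
  // so each document stays on exactly one line.
  return docs.map((doc) => JSON.stringify(doc)).join('\n') + '\n';
}

// Sample documents (made up for illustration).
const sampleDocs = [
  { id: '1', title: 'A really interesting document' },
  { id: '2', title: 'Another interesting document' }
];

const payload = toNewlineDelimitedJson(sampleDocs);
console.log(payload);
```

The resulting string can be written to a file and posted with the curl command above.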

Search for hits (uses search-index's API)

http://localhost:3030/search?q={"query":[{"AND":{"*":["usa"]}}]}

(http://localhost:3030/search returns everything)
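
Because the query is a JSON object passed in the `q` parameter, it usually needs URL-encoding when sent from code. The sketch below builds such a URL with Node's built-in `URL` class; `buildSearchUrl` is a hypothetical helper name, and the host and port are the defaults mentioned above.

```javascript
// Build a /search URL whose q parameter carries a JSON-encoded query.
// NOTE: buildSearchUrl is a hypothetical helper, not a Norch API.
function buildSearchUrl(baseUrl, query) {
  const url = new URL('/search', baseUrl);
  if (query !== undefined) {
    // searchParams.set percent-encodes the JSON string for us.
    url.searchParams.set('q', JSON.stringify(query));
  }
  return url.toString();
}

// Same query as in the example above.
const searchUrl = buildSearchUrl('http://localhost:3030', {
  query: [{ AND: { '*': ['usa'] } }]
});
console.log(searchUrl);

// Without a query the URL points at the "return everything" form.
console.log(buildSearchUrl('http://localhost:3030'));
```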

Make autosuggest

http://localhost:3030/matcher?q=usa

Export, import, and replicate an index

# create a snapshot on the server (available under /latestSnapshot)
curl -X POST http://localhost:3030/snapshot
# get latest snapshot
curl -X GET http://anotherIndex:3030/latestSnapshot > export.json
# replicate an export file into a new index on another server
curl -X POST -d @export.json http://someOtherServer:3030/import

API

Endpoint          Method  Response     Typical Use Case
/add              POST    status code  Add documents to the index
/availableFields  GET     stream       Discover the names of the fields that can be searched
/buckets          GET     stream       Aggregate documents on ranges of metadata
/categorize       GET     stream       Aggregate documents on single metadata values
/concurrentAdd    POST    status code  Add documents when more than one source is writing to the index at the same time
/createSnapshot   POST    status code  Create a snapshot of the index
/delete           DELETE  status code  Remove documents from the index
/docCount         GET     object       Count the total number of documents in the index
/flush            DELETE  status code  Remove all documents from the index
/get              GET     stream       Get documents by ID
/import           POST    file         Import/merge an existing index into this one
/latestSnapshot   GET     file         Download the latest index snapshot
/listSnapshots    GET     file         See the list of snapshots
/match            GET     stream       Match by linguistic similarity: autosuggest, autocomplete
/search           GET     stream       Search in the index
/totalHits        GET     object       Show the number of hits that a given query returns

The Norch API docs are here. Norch is essentially an HTTP wrapper around search-index.

About Norch

Norch.js is an experimental search engine built with Node.js and search-index. It features full text search, stopword removal, aggregation, matching (autosuggest), phrase search, fielded search, field weighting, relevance weighting (tf-idf), and paging (offset and result set length).
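
To give a feel for what tf-idf relevance weighting means, here is a minimal textbook-style sketch. It is not the scoring code used by search-index; the tokenizer, the function names, and the sample documents are all assumptions made for illustration.

```javascript
// Split text into lowercase alphanumeric tokens (a deliberately naive tokenizer).
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Score each document for a single term using
// tf = term count / document length, idf = ln(N / documents containing the term).
// NOTE: illustrative only; search-index may weight results differently.
function tfIdfScores(docs, term) {
  const tokenized = docs.map(tokenize);
  const docsWithTerm = tokenized.filter((tokens) => tokens.includes(term)).length;
  if (docsWithTerm === 0) {
    return docs.map(() => 0);
  }
  const idf = Math.log(docs.length / docsWithTerm);
  return tokenized.map((tokens) => {
    if (tokens.length === 0) {
      return 0;
    }
    const tf = tokens.filter((token) => token === term).length / tokens.length;
    return tf * idf;
  });
}

// Made-up sample documents.
const docs = ['red potato', 'yellow potato', 'red red apple'];
const scores = tfIdfScores(docs, 'red');
console.log(scores);
```

Documents that mention the term more densely score higher, and terms that occur in fewer documents carry more weight.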

Logging

On Linux and OSX: install bunyan, tail the log file, and pipe the output to bunyan.

Install bunyan:

npm install -g bunyan

Tail log-file:

tail -f log-info.log |bunyan
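
The pipe is useful because bunyan writes one JSON record per line, which is hard to read raw. The sketch below shows roughly what a formatter does with such a line. It assumes the log file holds standard bunyan records with `level`, `time`, and `msg` fields; `formatLogLine` and the sample record are hypothetical.

```javascript
// Bunyan's numeric log levels and their conventional names.
const LEVEL_NAMES = {
  10: 'TRACE', 20: 'DEBUG', 30: 'INFO', 40: 'WARN', 50: 'ERROR', 60: 'FATAL'
};

// Turn one JSON log line into a short human-readable line.
// NOTE: formatLogLine is an illustrative helper, not part of bunyan or Norch.
function formatLogLine(line) {
  let record;
  try {
    record = JSON.parse(line);
  } catch (err) {
    return line; // not JSON: pass it through unchanged
  }
  if (record === null || typeof record !== 'object') {
    return line;
  }
  const level = LEVEL_NAMES[record.level] || String(record.level);
  return `[${record.time}] ${level}: ${record.msg}`;
}

// A made-up record in bunyan's format.
const sampleLine = JSON.stringify({
  name: 'norch', level: 30, msg: 'server started',
  time: '2016-01-01T00:00:00.000Z', v: 0
});
console.log(formatLogLine(sampleLine));
```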

Mailing list: [email protected] - subscribe by sending an email to [email protected]

License

MIT, Copyright (c) 2013-16 Fergus McDowall

norch's People

Contributors

christopherpoole, eivindee, eklem, eskaufel, fergiemcdowall, ilblog, jstclair, mewwts, mikeknoop, nhhagen, rtucker88, vierbergenlars

norch's Issues

Logo!

We need a new logo since Forage is dead and Norch is back. Dawn of the dead!

Can't get Vagrant to run on OSX

Vagrant won't play nice on OSX; see the error below:

bash-3.2$ vagrant up
Bringing machine 'default' up with 'virtualbox' provider...
[default] Setting the name of the VM...
[default] Clearing any previously set forwarded ports...
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Preparing network interfaces based on configuration...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] -- 3000 => 3000 (adapter 1)
[default] Running 'pre-boot' VM customizations...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
[default] VM booted and ready for use!
[default] Mounting shared folders...
[default] -- /vagrant
[default] -- /tmp/vagrant-puppet/manifests
[default] -- /tmp/vagrant-puppet/modules-0
[default] Running provisioner: puppet...
Running Puppet with default.pp...
stdin: is not a tty
Could not find class apt for precise64.lyse.net at /tmp/vagrant-puppet/manifests/default.pp:10 on node precise64.lyse.net
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

puppet apply --modulepath '/etc/puppet/modules:/tmp/vagrant-puppet/modules-0' --detailed-exitcodes /tmp/vagrant-puppet/manifests/default.pp || [ $? -eq 2 ]

Stdout from the command:

Stderr from the command:

stdin: is not a tty
Could not find class apt for precise64.lyse.net at /tmp/vagrant-puppet/manifests/default.pp:10 on node precise64.lyse.net

bash-3.2$

Autogeneration of document ids

Could we implement some way of indexing a document without specifying the id?
That is, could we index this:
{
'title':'A really interesting document',
'body':'This is a really interesting document',
'metadata':['red', 'potato']
},

and autoassign it some id?
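
One possible way to do this, sketched under assumptions: derive an id from a hash of the document's content when none is supplied. The helper name `withAutoId` and the choice of a truncated SHA-256 digest are illustrative and not how Norch or search-index actually assigns ids.

```javascript
const crypto = require('crypto');

// Return the document unchanged if it already has an id; otherwise add one
// derived from a hash of its content.
// NOTE: hypothetical helper. The id depends on key order in the document,
// because JSON.stringify serializes keys in insertion order.
function withAutoId(doc) {
  if (doc.id !== undefined) {
    return doc;
  }
  const digest = crypto
    .createHash('sha256')
    .update(JSON.stringify(doc))
    .digest('hex');
  return { id: digest.slice(0, 16), ...doc };
}

const indexed = withAutoId({
  title: 'A really interesting document',
  body: 'This is a really interesting document',
  metadata: ['red', 'potato']
});
console.log(indexed.id);
```

A content-derived id means re-adding the same document yields the same id; a random id (for example from `crypto.randomUUID()`) would be the alternative when duplicates should be kept apart.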

Content (document storage) (fields).

Fergie, is there a specific reason (possibly performance on fetch) that fields are replicated across each 'field' key storage element?
I just created a 20,000 doc index (20 fields/doc), 100mb total, and have a data space of 9gb.
I'm still getting my head around the index code, but it seems this could be optimized away using a field cache within leveldb. The keyspace obviously cannot change, but the valuespace could be tweaked at the expense of a second-level field fetch as results are returned (possibly even streamed via a transform containing keys).

Indexing Memory Problem

This looks to be most prevalent in the natural module, but other closures also seem to be left hanging when indexing.

Will hopefully have at least a partial solution shortly for pull.

Make documentation match example data

This is a big job, but it would be really nice to have documentation that matches at least one sample data set. That would make it much easier to test all the functionality and figure out how things work.

Singlequotes not correct JSON?

A small error in the documentation? Got error on singlequotes in JSON, it should be double-quotes, shouldn't it?

[
  {
    'id':'1',
    'title':'A really interesting document',
    'body':'This is a really interesting document',
    'metadata':['red', 'potato']
  },
  {
    'id':'2',
    'title':'Another interesting document',
    'body':'This is another really interesting document that is a bit different',
    'metadata':['yellow', 'potato']
  }
]
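
A quick way to confirm this is to run both forms through `JSON.parse`, which follows the JSON grammar and only accepts double-quoted strings. The helper name `isValidJson` and the shortened sample strings are illustrative.

```javascript
// Report whether a string parses as JSON.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

// Shortened versions of the document above, in both quoting styles.
const singleQuoted = "{'id':'1','title':'A really interesting document'}";
const doubleQuoted = '{"id":"1","title":"A really interesting document"}';

console.log(isValidJson(singleQuoted)); // false
console.log(isValidJson(doubleQuoted)); // true
```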

Partial queries

First of all, great work on Forage!

Second - it would be great if it supported partial queries, since this would ensure search-as-you-type is possible.
What do you think?

What about this - fergiemcdowall/search-index#3 ? Does "closed" mean that it's implemented?

Documentation: Search index and Norch from webtorrents

A similar idea to the search engine as a bookmarklet (#32). Since the bookmarklet is unlikely to be the first thing to happen, it's good to describe some of the ideas that are more likely to happen.

If the "search engine running in your browser" works, how about distributing the index, and maybe the search engine itself, as torrents/webtorrents? Then anyone making a web app or web service would need almost no server/hardware power at all. You would only need a seeder of the app/search engine and the search index.

extract out adapters from forage-document-processor

It would be nice if the jQuery part of forage-document-processor was extracted out so that it could be easily edited and made interchangeable with other drop-in jQuery files. (booming voice) These files shall be known as "Adapters"

Looks interesting ...

Hi,

I am working on an alpha version of a website and we want to add a simple site search. A little googling led me to forage and it looks like it might work very well for us!

What I would like to do is to index our content (this is currently stored as a structured directory of markdown files). How might I go about indexing this into forage? Is it as simple as using the forage-document-processor command with a custom adapter to do the conversion and then running the forage-indexer on the resulting files?

Any guidance would be greatly appreciated.

David

Delete document function

Make it possible to delete a single document from the index along with all associated references

getDocByID(docID)

Utility function to get a specific document directly via ID without using search().
Useful for launching say a display.html page.

Will submit pull shortly

Forage running inside browser, added with bookmarklet

Got an idea last week to use the browser as a virtual machine for Forage. Forage could then be added to any page with a bookmarklet. With some simple UI you could define the Forage Document Processor Adapter, set up rules for Forage Crawler, crawl (and process), and then possibly test-search using HTML5 local storage instead of LevelDB. When satisfied, the user could download the JSON file with the processed items, plus scripts for adding a search box, search results, and navigators to a page.

The real benefit would be that the user would not need any server to test Forage and actually crawl a site. Once the page is crawled, the user can download the JSON and setup files or add them to a cloud service.

foraging-01-page
The user finds a page to crawl ...

foraging-02-bookmarklet
... clicks the bookmarklet ...

foraging-03-bookmarklet
... that adds Forage JavaScript-stuff to the page ...

foraging-04-foragin-pristine
... much like a browser plugin ...

foraging-05-jquery-selector-test
... tests a jQuery selector statement ...

foraging-06-jquery-selector-added
... and adds the field to the item when satisfied. This process is repeated until a full item is defined.

More to come on the crawl rule setup, test-search and downloading of data + adapter + crawl-rules

Bookmark search tool

I've always wanted a better bookmark search tool. I have hundreds of them and I hate labeling them with searchable tags. I want to search their metadata!

This looks like the perfect project to fork into a simple command line app.

Basically I just want to index my bookmarks (given some format: Safari, Chrome, Firefox, JSON, etc.) and then be able to search them. It would also be nice to save a snapshot. It will also be super badass to do this all through the command line too!

Anyways, I'm curious if anyone here wants to help me with this since I'm not terribly familiar with all the work that has been done here. I hope someone will agree that this is a pretty awesome tool. This is a very simple extension of what has already been implemented here.

I'd appreciate any help or any suggestions on how to get started.

Thanks

Error when trying to install

While running
sudo npm install -g norch
on ubuntu trusty

[email protected] install /usr/local/lib/node_modules/norch/node_modules/search-index/node_modules/level/node_modules/leveldown
node-gyp rebuild

npm http 304 https://registry.npmjs.org/apparatus
/bin/sh: 1: node: not found
gyp: Call to 'node -e "require('nan')"' returned exit status 127. while trying to load binding.gyp

forage version of the Pecita font

Fergus,
I built a version of
the Pecita font including a "forage" ligature.
foragepreview
This ligature can easily be modified by you using fontforge.
If you are interested send me a private email at pecita2014 at gmail dot com so I'll be able to reply with a zip attached.
Regards,
Philippe

relative paths

When running forage using path/to/forage.js, directories are not treated as relative to the script (i.e. path.join(__dirname, 'xxxx')).
This means that running forage via, say, forever (init.d) needs cd commands in the script, etc.

I have corrected code, and will submit pull shortly.

Great work !

Hello,
I'm very interested in your project. Really good work. Thank you.
A question: I don't see documents, only filters and facets. What's wrong?

Facets could be more "visual": chart, pie, map, tree, ...
JSON documents could be stored in a NoSQL database before indexing. It would be nice to index CouchDB rather than a directory.
Crawling could be triggered directly from the browser that is viewing the page you want to index (a kind of "bookmark").

Philppe.

cannot install norch on windows 64bit

cannot install norch
C:\projects\sideLoad>node --version
v0.10.29

C:\projects\sideLoad>npm --version
1.4.14

npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\request\node_
modules\hawk\node_modules\cryptiles\Makefile
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\reque
st\node_modules\hawk\node_modules\cryptiles\Makefile
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\request\node_modules\hawk\node_modules\boom\images\boom.png'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\request\node_
modules\hawk\node_modules\boom\images\boom.png
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\reque
st\node_modules\hawk\node_modules\boom\images\boom.png
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\request\node_modules\hawk\node_modules\hoek\images\hoek.png'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\request\node_
modules\hawk\node_modules\hoek\images\hoek.png
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\reque
st\node_modules\hawk\node_modules\hoek\images\hoek.png
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! error rolling back Error: EPERM, unlink 'C:\projects\sideLoad\node_modu
les\norch\node_modules\cheerio\node_modules\cheerio-select\node_modules\CSSselec
t\test\tools\bench.js'
npm ERR! error rolling back [email protected] { [Error: EPERM, unlink 'C:\projects\si
deLoad\node_modules\norch\node_modules\cheerio\node_modules\cheerio-select\node_
modules\CSSselect\test\tools\bench.js']
npm ERR! error rolling back errno: 50,
npm ERR! error rolling back code: 'EPERM',
npm ERR! error rolling back path: 'C:\projects\sideLoad\node_modules\norch
\node_modules\cheerio\node_modules\cheerio-select\node_modules\CSSselect
test\tools\bench.js' }
npm ERR! Error: ENOENT, open 'C:\projects\sideLoad\node_modules\norch\node_modul
es\request\node_modules\form-data\node_modules\combined-stream\node_modules\dela
yed-stream\package.json'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\request\node_
modules\form-data\node_modules\combined-stream\node_modules\delayed-stream\packa
ge.json
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\multer\node_modules\busboy\node_modules\readable-stream\node_modules\string_
decoder\README.md'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\multer\node_m
odules\busboy\node_modules\readable-stream\node_modules\string_decoder\README.md

npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\multe
r\node_modules\busboy\node_modules\readable-stream\node_modules\string_decoder\R
EADME.md
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\cheerio\node_modules\htmlparser2\node_modules\readable-stream\lib_stream_wr
itable.js'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\cheerio\node_
modules\htmlparser2\node_modules\readable-stream\lib_stream_writable.js
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\cheer
io\node_modules\htmlparser2\node_modules\readable-stream\lib_stream_writable.js

npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\multer\node_modules\busboy\node_modules\readable-stream\node_modules\inherit
s\LICENSE'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\multer\node_m
odules\busboy\node_modules\readable-stream\node_modules\inherits\LICENSE
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\multe
r\node_modules\busboy\node_modules\readable-stream\node_modules\inherits\LICENSE

npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\cheerio\node_modules\cheerio-select\node_modules\CSSselect\test\tools\bench.
js'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\cheerio\node_
modules\cheerio-select\node_modules\CSSselect\test\tools\bench.js
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\cheer
io\node_modules\cheerio-select\node_modules\CSSselect\test\tools\bench.js
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\cheerio\node_modules\htmlparser2\node_modules\domhandler\tests\13-comment_in
_text.json'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\cheerio\node_
modules\htmlparser2\node_modules\domhandler\tests\13-comment_in_text.json
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\cheer
io\node_modules\htmlparser2\node_modules\domhandler\tests\13-comment_in_text.jso
n
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\multer\node_modules\busboy\node_modules\readable-stream\node_modules\core-ut
il-is\README.md'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\multer\node_m
odules\busboy\node_modules\readable-stream\node_modules\core-util-is\README.md
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\multe
r\node_modules\busboy\node_modules\readable-stream\node_modules\core-util-is\REA
DME.md
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\multer\node_modules\busboy\node_modules\dicer\test\fixtures\many-noend\part7
.header'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\multer\node_m
odules\busboy\node_modules\dicer\test\fixtures\many-noend\part7.header
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\multe
r\node_modules\busboy\node_modules\dicer\test\fixtures\many-noend\part7.header
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
npm ERR! Error: ENOENT, lstat 'C:\projects\sideLoad\node_modules\norch\node_modu
les\search-index\node_modules\natural\lib\natural\phonetics\phonetic.js'
npm ERR! If you need help, you may report this entire log,
npm ERR! including the npm and node versions, at:
npm ERR! http://github.com/npm/npm/issues

npm ERR! System Windows_NT 6.1.7601
npm ERR! command "C:\Program Files\nodejs\node.exe" "C:\Program Files\nod
ejs\node_modules\npm\bin\npm-cli.js" "install" "norch"
npm ERR! cwd C:\projects\sideLoad
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! path C:\projects\sideLoad\node_modules\norch\node_modules\search-index
node_modules\natural\lib\natural\phonetics\phonetic.js
npm ERR! fstream_path C:\projects\sideLoad\node_modules\norch\node_modules\searc
h-index\node_modules\natural\lib\natural\phonetics\phonetic.js
npm ERR! fstream_type File
npm ERR! fstream_class FileWriter
npm ERR! code ENOENT
npm ERR! errno 34
npm ERR! fstream_stack C:\Program Files\nodejs\node_modules\npm\node_modules\fst
ream\lib\writer.js:284:26
npm ERR! fstream_stack Object.oncomplete (fs.js:107:15)
|

[email protected] install C:\projects\sideLoad\node_modules\norch\node_modules
search-index\node_modules\level\node_modules\leveldown
node-gyp rebuild
C:\projects\sideLoad\node_modules\norch\node_modules\search-index\node_modules\l
evel\node_modules\leveldown>node "C:\Program Files\nodejs\node_modules\npm\bin\n
ode-gyp-bin....\node_modules\node-gyp\bin\node-gyp.js" rebuild
gyp ERR! configure error
gyp ERR! stack Error: Python executable "python" is v3.4.2, which is not support
ed by gyp.
gyp ERR! stack You can pass the --python switch to point to Python >= v2.5.0 & <
3.0.0.
gyp ERR! stack at failPythonVersion (C:\Program Files\nodejs\node_modules\np
m\node_modules\node-gyp\lib\configure.js:108:14)
gyp ERR! stack at C:\Program Files\nodejs\node_modules\npm\node_modules\node
-gyp\lib\configure.js:97:9
gyp ERR! stack at ChildProcess.exithandler (child_process.js:645:7)
gyp ERR! stack at ChildProcess.emit (events.js:98:17)
gyp ERR! stack at maybeClose (child_process.js:755:16)
gyp ERR! stack at Socket. (child_process.js:968:11)
gyp ERR! stack at Socket.emit (events.js:95:17)
gyp ERR! stack at Pipe.close (net.js:465:12)
gyp ERR! System Windows_NT 6.1.7601
gyp ERR! command "node" "C:\Program Files\nodejs\node_modules\npm\node_modu
les\node-gyp\bin\node-gyp.js" "rebuild"
gyp ERR! cwd C:\projects\sideLoad\node_modules\norch\node_modules\search-index\n
ode_modules\level\node_modules\leveldown
gyp ERR! node -v v0.10.29
gyp ERR! node-gyp -v v0.13.1
gyp ERR! not ok
npm ERR!
npm ERR! Additional logging details can be found in:
npm ERR! C:\projects\sideLoad\npm-debug.log
npm ERR! not ok code 0

Ideas for forage-fetch

Stuff I think forage-fetch needs to handle:

  • Patterns of URLs to follow
  • Patterns of URLs to not follow
  • Patterns of URLs to download
  • A --follow-robotstxt flag yes|no
  • A way of setting how aggressively you should fetch/download
  • ID for the crawler
  • URL to a page that explains why this crawler visited your site/server

These are thoughts based on earlier use of Fast ESP and Scrapy, so the way they solve it may not be the correct answer, but the outcome of it is needed, I think.
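
The first two items on the list could be sketched as a pair of pattern lists, where a "do not follow" match takes precedence over a "follow" match. Everything here (the `shouldFollow` name, the shape of the rules object, the example.com patterns) is an assumption for illustration; forage-fetch does not necessarily work this way.

```javascript
// Decide whether a crawler should follow a URL.
// rules.follow and rules.exclude are arrays of regular expressions.
// NOTE: hypothetical design sketch, not forage-fetch's implementation.
function shouldFollow(url, rules) {
  const matchesAny = (patterns) => patterns.some((pattern) => pattern.test(url));
  if (matchesAny(rules.exclude)) {
    return false; // exclusion wins over inclusion
  }
  return matchesAny(rules.follow);
}

// Made-up rules: stay on one site, skip anything under /private/.
const rules = {
  follow: [/^https?:\/\/example\.com\//],
  exclude: [/\/private\//]
};

console.log(shouldFollow('http://example.com/docs/a', rules));
console.log(shouldFollow('http://example.com/private/a', rules));
console.log(shouldFollow('http://other.org/', rules));
```

Letting exclusions win keeps the rules predictable: a broad follow pattern can be narrowed without rewriting it.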
