Dropbox Search

Index and search the contents of documents stored in your Dropbox. Requires node.js and Solr. Supports many document types and keyword shortcuts, and updates the index as you add or edit files. Bundled with a simple web front end that shows result snippets and loads results via Ajax.

Example Searches

recipe                 Search for text (case and stemming aware)
"cake recipe"          Phrase search
jam recipe in:Files    Match documents within a folder
when:yesterday         All documents modified yesterday
recipe when:2012       Matches from year 2012
by:Mike                Match the given author
dogs type:image        Return only images
where:40"47'           Match lat/long in image metadata

Installation

  1. Get a Dropbox API key.
  2. Set up and run your Solr instance.
     • Use the included solr/schema.xml file, and note the lines marked EDIT dropbox-search.
  3. Edit environment variables as below.
  4. Run npm install to download dependencies (solr, dbox, express, dateformat).
  5. Run node indexer.js to index your documents.
  6. Run node server.js to launch the web app.
  7. Browse to http://localhost:8888/search

Environment Variables

DROPBOX_APP_KEY =
DROPBOX_APP_SECRET =
DROPBOX_UID =
DROPBOX_OAUTH_TOKEN =
DROPBOX_OAUTH_SECRET =
SOLR_HOST = 127.0.0.1
SOLR_PORT = 8983
ROOT_PATH = /

The code doesn't yet implement the OAuth flow, so for now you must complete it manually and supply the resulting token and secret as DROPBOX_OAUTH_TOKEN and DROPBOX_OAUTH_SECRET.
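
As a rough illustration of how these variables might be read at startup, here is a minimal sketch. The function name loadConfig, the shape of the returned object, and the choice to treat the Solr and ROOT_PATH values listed above as fallbacks are all assumptions made for this example, not the project's actual code.

```javascript
// Hypothetical helper: collects the settings listed above from an
// environment object (normally process.env) and fails fast when a
// Dropbox credential is missing.
function loadConfig(env) {
  const required = [
    "DROPBOX_APP_KEY",
    "DROPBOX_APP_SECRET",
    "DROPBOX_UID",
    "DROPBOX_OAUTH_TOKEN",
    "DROPBOX_OAUTH_SECRET",
  ];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  return {
    dropbox: {
      appKey: env.DROPBOX_APP_KEY,
      appSecret: env.DROPBOX_APP_SECRET,
      uid: env.DROPBOX_UID,
      oauthToken: env.DROPBOX_OAUTH_TOKEN,
      oauthSecret: env.DROPBOX_OAUTH_SECRET,
    },
    // Assumption: fall back to the values shown in the list above.
    solr: {
      host: env.SOLR_HOST || "127.0.0.1",
      port: Number(env.SOLR_PORT || 8983),
    },
    rootPath: env.ROOT_PATH || "/",
  };
}
```

A caller would typically run `const config = loadConfig(process.env);` once, before starting the indexer or the web app.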

Indexing

Dropbox-search uses Solr's ExtractingRequestHandler to index many file types, including pdf, doc and docx (Word), xls (Excel), ppt, odt, csv, html, rtf, and txt. In addition to text content, it extracts metadata such as author and date. For image files, it extracts EXIF metadata such as gps_latitude.
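
To make the ExtractingRequestHandler step more concrete, the sketch below builds the URL a file could be posted to. The /solr/update/extract path is the handler's location in Solr's stock example configuration and may differ in your setup; using the Dropbox path as the document id, and the function name buildExtractUrl, are assumptions for illustration only.

```javascript
// Hypothetical helper: builds the URL for posting one file to Solr's
// ExtractingRequestHandler. literal.id sets the id field of the
// resulting document; commit=true makes it searchable right away.
function buildExtractUrl(host, port, docId) {
  const params = new URLSearchParams({
    "literal.id": docId, // assumption: the Dropbox path serves as the id
    commit: "true",
  });
  // Assumption: the handler is mounted at /solr/update/extract.
  return `http://${host}:${port}/solr/update/extract?${params.toString()}`;
}
```

The file's bytes would then be sent as the body of a POST to that URL. Committing on every document is simple but slow, so a bulk indexing run may prefer to commit once at the end.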

I also define some useful shortcuts like:

  • when : matches a date (e.g. today, yesterday, year-mm-dd, year-mm, or year)
  • type : matches a file type (e.g. image or rtf)
  • in : matches files within the given folder or path fragment (e.g. MyFiles)
  • by : same as author
  • where : matches gps_latitude or gps_longitude
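
One way such shortcuts could be separated from plain search terms is sketched below. This is not the project's parser: the names parseQuery and whenToRange, the returned object shapes, and the use of UTC day boundaries are assumptions chosen to keep the example small.

```javascript
// Shortcut keys described in the list above.
const SHORTCUTS = new Set(["when", "type", "in", "by", "where"]);

// Splits a query into plain terms (including quoted phrases) and
// key:value shortcuts. Unknown keys are kept as ordinary terms.
function parseQuery(query) {
  const result = { terms: [], filters: {} };
  const tokenPattern = /(\w+):(\S+)|"([^"]*)"|(\S+)/g;
  let match;
  while ((match = tokenPattern.exec(query)) !== null) {
    const [whole, key, value, phrase] = match;
    if (key !== undefined && SHORTCUTS.has(key)) {
      result.filters[key] = value;
    } else if (phrase !== undefined) {
      result.terms.push(`"${phrase}"`);
    } else {
      result.terms.push(whole);
    }
  }
  return result;
}

// Turns a when: value into a half-open UTC range [start, end).
// Returns null for values it does not recognise.
function whenToRange(value, now) {
  const dayMs = 24 * 60 * 60 * 1000;
  const midnight = Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()
  );
  let start;
  let end;
  let m;
  if (value === "today") {
    start = midnight;
    end = midnight + dayMs;
  } else if (value === "yesterday") {
    start = midnight - dayMs;
    end = midnight;
  } else if ((m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value))) {
    start = Date.UTC(+m[1], +m[2] - 1, +m[3]);
    end = start + dayMs;
  } else if ((m = /^(\d{4})-(\d{2})$/.exec(value))) {
    start = Date.UTC(+m[1], +m[2] - 1, 1);
    end = Date.UTC(+m[1], +m[2], 1); // month overflow rolls into next year
  } else if ((m = /^(\d{4})$/.exec(value))) {
    start = Date.UTC(+m[1], 0, 1);
    end = Date.UTC(+m[1] + 1, 0, 1);
  } else {
    return null;
  }
  return {
    start: new Date(start).toISOString(),
    end: new Date(end).toISOString(),
  };
}
```

For example, `parseQuery('recipe when:2012')` separates the term recipe from the when shortcut, and `whenToRange("2012", new Date())` yields a range covering that whole year, which could then be turned into a date-range filter on whichever Solr field stores the modification time.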

The indexer listens for Dropbox API delta events to fetch documents that need to be added or removed from the index.
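
The delta handling can be pictured roughly as follows. The sketch assumes each delta entry is a [path, metadata] pair in which a null metadata value means the file was deleted and is_dir marks folders, which is how I understand the Dropbox delta call to report changes; the function name planIndexUpdates and the returned shape are hypothetical.

```javascript
// Hypothetical helper: sorts delta entries into files to (re)index and
// paths to remove from the index. Folders are skipped because only file
// contents are indexed.
function planIndexUpdates(entries) {
  const toIndex = [];
  const toRemove = [];
  for (const [path, metadata] of entries) {
    if (metadata === null) {
      toRemove.push(path); // deleted in Dropbox, so drop it from the index
    } else if (!metadata.is_dir) {
      toIndex.push(path); // added or changed file, so fetch and index it
    }
  }
  return { toIndex, toRemove };
}
```

Keeping this step free of network calls makes it easy to test separately from the code that fetches files and talks to Solr.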

Note: Dropbox may rate-limit excessive file fetches by returning 503 errors. I try to handle this by queueing file fetches so that at most one starts per second.
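
A queue of this kind might look like the sketch below. It is an illustration rather than the indexer's actual code: createFetchQueue, its parameters, and the injectable schedule function (which defaults to setTimeout) are assumptions, and retrying after a 503 response is left out.

```javascript
// Hypothetical helper: calls worker(path) for each queued path, starting
// at most one call per intervalMs. The schedule parameter exists so the
// timing can be replaced in tests.
function createFetchQueue(worker, intervalMs = 1000, schedule = setTimeout) {
  const pending = [];
  let running = false;

  function runNext() {
    if (pending.length === 0) {
      running = false; // nothing left, so go idle until the next push
      return;
    }
    running = true;
    worker(pending.shift());
    schedule(runNext, intervalMs); // wait before starting the next fetch
  }

  return {
    push(path) {
      pending.push(path);
      if (!running) runNext();
    },
    size() {
      return pending.length;
    },
  };
}
```

The first pushed path is fetched immediately, and later ones wait their turn. A caller could create the queue with `createFetchQueue(fetchAndIndex)`, where fetchAndIndex stands for whatever function downloads a file and sends it to Solr.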

File type icons © Dropbox Icon Library.
