simonw / datasette-ripgrep

Web interface for searching your code using ripgrep, built as a Datasette plugin

Home Page: https://ripgrep.datasette.io

License: Apache License 2.0

Languages: Python 80.50%, HTML 19.50%
Topics: datasette, datasette-plugin, ripgrep, codesearch, datasette-io

datasette-ripgrep's Introduction

datasette-ripgrep

Web interface for searching your code using ripgrep, built as a Datasette plugin

For background on this project see datasette-ripgrep: deploy a regular expression search engine for your source code.

Demo

Try this plugin out at https://ripgrep.datasette.io/-/ripgrep - where you can run regular expression searches across the source code of Datasette and all of the datasette-* plugins belonging to the simonw GitHub user.

Installation

Install this plugin in the same environment as Datasette.

$ datasette install datasette-ripgrep

The rg executable needs to be installed such that it can be run by this tool.

Usage

This plugin requires configuration: it needs a path setting so that it knows where to run searches.

Create a metadata.json file that looks like this:

{
    "plugins": {
        "datasette-ripgrep": {
            "path": "/path/to/your/files"
        }
    }
}

Now run Datasette using datasette -m metadata.json. The plugin will add an interface at /-/ripgrep for running searches.
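
If you would rather generate the file than write it by hand, here is a minimal sketch. The write_metadata helper is hypothetical (it is not part of the plugin), and resolving the directory to an absolute path is an assumption made so that the setting does not depend on where Datasette is started from.

```python
import json
from pathlib import Path


def write_metadata(search_path, metadata_file="metadata.json"):
    """Write a metadata file that points datasette-ripgrep at search_path.

    Hypothetical helper for illustration; the plugin only needs the
    resulting JSON, however it was produced.
    """
    config = {
        "plugins": {
            "datasette-ripgrep": {
                # Assumption: store an absolute path so the setting does not
                # depend on the directory Datasette is launched from.
                "path": str(Path(search_path).resolve()),
            }
        }
    }
    Path(metadata_file).write_text(json.dumps(config, indent=4) + "\n")
    return config
```

After calling write_metadata("/path/to/your/files"), start Datasette with datasette -m metadata.json as described above.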

Plugin configuration

The "path" configuration is required. Optional extra configuration options are:

  • time_limit - floating point number. The rg process will be terminated if it takes longer than this limit. The default is one second, 1.0.
  • max_lines - integer. The rg process will be terminated if it returns more than this number of lines. The default is 2000.
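
To make the two limits concrete, here is a rough sketch of how a time limit and a line limit can be enforced on a subprocess with asyncio. It is not the plugin's actual implementation: run_limited and its return shape are hypothetical, and only the default values (1.0 seconds and 2000 lines) come from the options above.

```python
import asyncio


async def run_limited(args, time_limit=1.0, max_lines=2000):
    """Run a command, stopping it early on a time or line-count limit.

    Illustrative only. Returns (lines, time_limit_hit, max_lines_hit).
    """
    proc = await asyncio.create_subprocess_exec(
        *args, stdout=asyncio.subprocess.PIPE
    )
    lines = []
    time_limit_hit = False
    max_lines_hit = False

    async def read_lines():
        nonlocal max_lines_hit
        async for line in proc.stdout:
            if len(lines) >= max_lines:
                # One line too many: record the overflow and stop reading.
                max_lines_hit = True
                break
            lines.append(line.decode("utf-8", errors="replace"))

    try:
        # Only the reading loop is subject to the time limit.
        await asyncio.wait_for(read_lines(), timeout=time_limit)
    except asyncio.TimeoutError:
        time_limit_hit = True

    if time_limit_hit or max_lines_hit:
        try:
            proc.kill()
        except ProcessLookupError:
            pass  # the process already exited on its own
    await proc.wait()  # reap the process so it does not linger as a zombie
    return lines, time_limit_hit, max_lines_hit
```

A caller could run this with asyncio.run(run_limited(["rg", "-e", "pattern", "/path/to/your/files"])) and use the two flags to tell the user that the results were cut short.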

Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

cd datasette-ripgrep
python3 -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

datasette-ripgrep's People

Contributors

simonw

datasette-ripgrep's Issues

JSON API

It would be cool if this offered a JSON API.

Demo has gone stale

Just noticed the workflow for the demo got suspended for repo inactivity back in June.

Try basic faceted search using "rg ... -l -c"

The -l -c options return just counts-per-file information. I could try using this to add very simple faceted search by filename.

(datasette-ripgrep) datasette % time rg plugin -l -c

datasette/app.py:36
datasette/templates/patterns.html:1
datasette/static/codemirror-5.57.0-sql.min.js:1
datasette/views/base.py:1
datasette/views/database.py:5
datasette/views/table.py:5
datasette/cli.py:20
datasette/publish/cloudrun.py:9
datasette/publish/common.py:9
datasette/url_builder.py:2
datasette/publish/heroku.py:20
datasette/utils/__init__.py:9
datasette/default_menu_links.py:2
datasette/hookspecs.py:2
datasette/plugins.py:23
test-metadata.json:12
docs/authentication.rst:19
docs/contributing.rst:11
fixtures-plugins/my_plugin_2.py:2
README.md:29
metadata.json:12
v1-of-link-to-blob.diff:1
docs/datasette-publish-cloudrun-help.txt:4
tests/fixtures.py:22
tests/test_config_dir.py:10
tests/test_permissions.py:1
tests/test_cli.py:10
tests/test_internals_urls.py:5
tests/test_cli_serve_get.py:10
tests/test_publish_cloudrun.py:4
tests/test_api.py:4
tests/test_docs.py:11
tests/test_html.py:1
tests/plugins/my_plugin_2.py:2
tests/plugins/my_plugin.py:9
tests/conftest.py:1
tests/test_publish_heroku.py:8
tests/test_plugins.py:62
tests/test_utils.py:3
fixtures-plugins/my_plugin.py:9
docs/datasette-package-help.txt:2
docs/installation.rst:12
docs/pages.rst:1
docs/datasette-publish-heroku-help.txt:4
docs/internals.rst:24
docs/publish.rst:8
docs/sql_queries.rst:2
docs/binary_data.rst:2
docs/ecosystem.rst:9
docs/plugin_hooks.rst:73
docs/introspection.rst:5
docs/datasette-serve-help.txt:1
docs/plugins.rst:36
docs/changelog.rst:113
docs/deploying.rst:2
docs/testing_plugins.rst:13
docs/getting_started.rst:1
docs/index.rst:5
docs/settings.rst:6
metadata.yaml:2
docs/writing_plugins.rst:59
test-plugins/my_plugin_2.py:2
test-plugins/my_plugin.py:9
plugins/apikey.py:2
rg plugin -l -c  0.01s user 0.04s system 160% cpu 0.034 total
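
As a sketch of where this could go, the path:count lines above can be parsed and rolled up into per-directory facets. Both function names are hypothetical, and grouping by top-level directory is just one possible facet (file extension would be another).

```python
from collections import Counter
from pathlib import PurePosixPath


def parse_counts(output):
    """Parse 'path:count' lines, as shown in the output above, into a dict."""
    counts = {}
    for line in output.splitlines():
        # Split on the last colon so paths that contain ':' still parse.
        path, sep, count = line.rpartition(":")
        if sep and count.strip().isdigit():
            counts[path] = int(count)
        # Anything else (shell prompt, timing summary) is ignored.
    return counts


def facet_by_top_level(counts):
    """Sum match counts by top-level directory ('.' for files in the root)."""
    facets = Counter()
    for path, count in counts.items():
        parts = PurePosixPath(path).parts
        facets[parts[0] if len(parts) > 1 else "."] += count
    return facets
```

Calling facet_by_top_level(parse_counts(output)).most_common() would then give the directories ordered by number of matching lines, which is the information a facet sidebar needs.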

Expand config docs

Feel free to close this, but I struggled to figure out how to configure this plugin, what the config file should be called, and where it should live.

I skimmed docs and started with config.json, but I keep seeing "The path plugin configuration is required."

Eventually, I noticed datasette serve --metadata=metadata.json in some examples from your main docs. (I'll submit a patch for a typo I noticed too.)

I'm opening this for anyone else who tries it out and runs into this issue.

Copy results as Markdown feature

I often find myself using this tool to find examples of something that needs to be changed - so being able to copy and paste the results of a search into a GitHub issue as markdown would be really useful.

Example where I could have benefited from this: simonw/datasette#1432 (comment)
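
One possible shape for this, as a sketch only: take the matches as (path, line_number, text) tuples and render them as a nested Markdown list. The results_to_markdown name and the tuple layout are assumptions, not anything the plugin provides today.

```python
from itertools import groupby


def results_to_markdown(matches):
    """Render (path, line_number, text) tuples as a nested Markdown list.

    Assumes matches for the same file are adjacent, which is how rg
    groups its output. Lines containing backticks would need escaping.
    """
    out = []
    for path, group in groupby(matches, key=lambda m: m[0]):
        out.append(f"- `{path}`")
        for _, line_number, text in group:
            out.append(f"  - line {line_number}: `{text.strip()}`")
    return "\n".join(out)
```

The resulting text can be pasted straight into a GitHub issue or comment.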

Optionally use "ripgrep-all"

Dear Simon,

I would like to salute you for conceiving this excellent tool which stretches the usage of Datasette beyond its original use-case for publishing SQLite databases [1].

As @pansen just made me aware of ripgrep-all by @phiresky and contributors (most probably through [2]), I would like to humbly ask if this would also spark your interest on this matter.

So, the current limitation of conveniently searching through code might well be expanded to searching through a number of other document types [3] - maybe without having to change the code base significantly.

With kind regards,
Andreas.

[1] https://simonwillison.net/2020/Nov/28/datasette-ripgrep/
[2] https://news.ycombinator.com/item?id=25277280
[3] https://phiresky.github.io/blog/2019/rga--ripgrep-for-zip-targz-docx-odt-epub-jpg/

Show context around search matches

I can use the -C 2 flag for this. Here's the resulting JSON:

(datasette-ripgrep) datasette-ripgrep % rg pytest -C 2 --json
{"type":"begin","data":{"path":{"text":"README.md"}}}
{"type":"context","data":{"path":{"text":"README.md"},"lines":{"text":"To run the tests:\n"},"line_number":61,"absolute_offset":2099,"submatches":[]}}
{"type":"context","data":{"path":{"text":"README.md"},"lines":{"text":"\n"},"line_number":62,"absolute_offset":2117,"submatches":[]}}
{"type":"match","data":{"path":{"text":"README.md"},"lines":{"text":"    pytest\n"},"line_number":63,"absolute_offset":2118,"submatches":[{"match":{"text":"pytest"},"start":4,"end":10}]}}
{"type":"end","data":{"path":{"text":"README.md"},"binary_offset":null,"stats":{"elapsed":{"secs":0,"nanos":687238,"human":"0.000687s"},"searches":1,"searches_with_match":1,"bytes_searched":2129,"bytes_printed":527,"matched_lines":1,"matches":1}}}
{"type":"begin","data":{"path":{"text":"setup.py"}}}
{"type":"context","data":{"path":{"text":"setup.py"},"lines":{"text":"    package_data={\"datasette_ripgrep\": [\"templates/*.html\"]},\n"},"line_number":31,"absolute_offset":936,"submatches":[]}}
{"type":"context","data":{"path":{"text":"setup.py"},"lines":{"text":"    install_requires=[\"datasette\"],\n"},"line_number":32,"absolute_offset":998,"submatches":[]}}
{"type":"match","data":{"path":{"text":"setup.py"},"lines":{"text":"    extras_require={\"test\": [\"pytest\", \"pytest-asyncio\", \"httpx\"]},\n"},"line_number":33,"absolute_offset":1034,"submatches":[{"match":{"text":"pytest"},"start":30,"end":36},{"match":{"text":"pytest"},"start":40,"end":46}]}}
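
A short sketch of how this output could be consumed: read one JSON event per line and collect the match and context events for each file. The group_json_events name and the tuple layout are hypothetical, and the sketch assumes UTF-8 paths and lines like the ones shown above.

```python
import json


def group_json_events(lines):
    """Group rg --json events into {path: [(line_number, text, is_match)]}."""
    results = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        event = json.loads(raw)
        if event["type"] not in ("match", "context"):
            continue  # skip begin, end and summary events
        data = event["data"]
        # Assumption: paths and lines are valid UTF-8. rg emits a "bytes"
        # key (base64) instead of "text" when they are not.
        path = data["path"]["text"]
        results.setdefault(path, []).append(
            (data["line_number"], data["lines"]["text"], event["type"] == "match")
        )
    return results
```

The is_match flag lets a template highlight the matching line while still showing the surrounding context lines in order.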

Use -e to avoid input being treated as a command-line switch

I noticed that when you search for something that is also a command-line argument to ripgrep, it gets interpreted as that argument instead of being used as the search pattern.

For example searching for '-v' will cause ripgrep to interpret the -v as a command line switch and return the non matching lines instead of searching for the string -v.

Solution might be as easy as adding '-e' option before the search term in run_ripgrep to tell ripgrep to interpret the next argument as a pattern.
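
A minimal sketch of that fix, assuming the command is built as an argument list. The build_rg_args helper and its parameters are hypothetical, since the real signature of run_ripgrep is not shown here; -e, --json and -C are existing rg flags.

```python
def build_rg_args(pattern, path, context=0):
    """Build an rg argument list that always treats `pattern` as a pattern."""
    args = ["rg", "--json"]
    if context:
        args += ["-C", str(context)]
    # -e marks the next argument as the pattern, even if it starts with "-",
    # so a search for "-v" is no longer read as the invert-match switch.
    args += ["-e", pattern, path]
    return args
```

With this, build_rg_args("-v", "/path/to/your/files") passes -v to rg as the text to search for.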
