findus23 / pylanguagetool

Python Library and CLI for the LanguageTool JSON API

Home Page: https://pylanguagetool.lw1.at/

License: MIT License

Python 100.00%
cli languagetool grammar spell-check spellcheck

pylanguagetool's Introduction

pyLanguagetool

A Python library and CLI for the LanguageTool JSON API.

LanguageTool is an open source spellchecking platform. It supports a large variety of languages and has advanced grammar support.

Installation

pyLanguagetool can be installed with pip/pipenv:

pip install pylanguagetool
# or via pipenv
pipenv install pylanguagetool

Basic Usage

# pipe text to pylanguagetool
echo "This is a example" | pylanguagetool

# read text from a file
pylanguagetool textfile.txt

# read text from stdin
pylanguagetool < textfile.txt

# read text from the system clipboard
pylanguagetool -c

All samples above will return a list of detected errors and possible replacements.

# Use "an" instead of 'a' if the following word starts with a vowel sound, e.g. 'an article', 'an hour'
#   ✗ This is a example
#             ^
#   ✓ This is an example
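
The output above can be reproduced from a single match in the LanguageTool JSON response, which carries a `message`, an `offset`, a `length` and a list of `replacements`. The following is a minimal sketch of that mapping, not pyLanguagetool's actual rendering code: `render_match` is a hypothetical helper, and it assumes a single-line input text.

```python
def render_match(text, match):
    """Format one LanguageTool match as a message, the marked text, and a fix."""
    offset, length = match["offset"], match["length"]
    lines = ["# " + match["message"]]
    lines.append("#   ✗ " + text)
    # The prefix has the same width as "#   ✗ ", so the carets line up
    # with the flagged span in the line above.
    lines.append("#     " + " " * offset + "^" * length)
    if match["replacements"]:
        # Apply the first suggested replacement to show the corrected text.
        best = match["replacements"][0]["value"]
        lines.append("#   ✓ " + text[:offset] + best + text[offset + length:])
    return "\n".join(lines)


# Hand-written match in the shape of one entry of the response's "matches" list.
example_match = {
    "message": ("Use \"an\" instead of 'a' if the following word starts with "
                "a vowel sound, e.g. 'an article', 'an hour'"),
    "offset": 8,
    "length": 1,
    "replacements": [{"value": "an"}],
}
rendered = render_match("This is a example", example_match)
print(rendered)
```

Multi-line input would need the offset translated to a position within the affected line first.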

Configuration

All LanguageTool API parameters can be set via command line arguments, environment variables or a configuration file (~/.config/pyLanguagetool.conf). For more information about the configuration file syntax, read the ConfigArgParse documentation.
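
The help text under Parameters states the precedence between these sources: command line values override environment variables, which override config file values, which override defaults. The sketch below illustrates that ordering with the standard library only. It is not how ConfigArgParse implements it; `resolve_settings` and all the example values are hypothetical.

```python
from collections import ChainMap


def resolve_settings(cli, env, config_file, defaults):
    """Merge settings so that earlier sources win over later ones."""
    # Drop unset (None) values so they do not shadow lower-priority sources.
    layers = [
        {key: value for key, value in layer.items() if value is not None}
        for layer in (cli, env, config_file, defaults)
    ]
    # ChainMap looks keys up in the first mapping that contains them.
    return dict(ChainMap(*layers))


settings = resolve_settings(
    cli={"lang": "en-GB", "api_url": None},            # e.g. from -l en-GB
    env={"api_url": "http://localhost:8081/v2/"},      # e.g. from API_URL
    config_file={"lang": "de-DE", "verbose": True},    # e.g. from the .conf file
    defaults={"lang": "auto", "api_url": "https://example.invalid/v2/",
              "verbose": False},
)
print(settings["lang"], settings["api_url"], settings["verbose"])
```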

Parameters

$ pylanguagetool --help
usage: pylanguagetool [-h] [-v] [-a API_URL] [--no-color] [-c] [-s]
                      [-t {txt,html,md,rst,ipynb}] [-l LANG]
                      [-m MOTHER_TONGUE] [-p PREFERRED_VARIANTS]
                      [-e ENABLED_RULES] [-d DISABLED_RULES]
                      [--enabled-categories ENABLED_CATEGORIES]
                      [--disabled-categories DISABLED_CATEGORIES]
                      [--enabled-only] [--pwl PWL]
                      [input file]

Args that start with '--' (eg. -v) can also be set in a config file
(~/.config/pyLanguagetool.conf). Config file syntax allows: key=value,
flag=true, stuff=[a,b,c] (for details, see syntax at
https://pypi.org/project/ConfigArgParse/). If an arg is specified in more than
one place, then commandline values override environment variables which
override config file values which override defaults.

positional arguments:
  input file            input file

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         [env var: VERBOSE]
  -a API_URL, --api-url API_URL
                        the URL of the v2 languagetool API, should end with
                        '/v2/' [env var: API_URL]
  --no-color            don't color output [env var: NO_COLOR]
  -c, --clipboard       get text from system clipboard [env var: CLIPBOARD]
  -s, --single-line     check every line on its own [env var: SINGLE_LINE]
  -t {txt,html,md,rst,ipynb}, --input-type {txt,html,md,rst,ipynb}
                        if not plaintext [env var: CLIPBOARD]
  -r, --rules           show the matching rules [env var: RULES]
  --rule-categories     show the the categories of the matching rules [env
                        var: RULE_CATEGORIES]
  -l LANG, --lang LANG  A language code like en or en-US, or auto to guess the
                        language automatically (see preferredVariants below).
                        For languages with variants (English, German,
                        Portuguese) spell checking will only be activated when
                        you specify the variant, e.g. en-GB instead of just
                        en. [env var: TEXTLANG]
  -m MOTHER_TONGUE, --mother-tongue MOTHER_TONGUE
                        A language code of the user's native language,
                        enabling false friends checks for some language pairs.
                        [env var: MOTHER__TONGUE]
  -p PREFERRED_VARIANTS, --preferred-variants PREFERRED_VARIANTS
                        Comma-separated list of preferred language variants.
                        The language detector used with language=auto can
                        detect e.g. English, but it cannot decide whether
                        British English or American English is used. Thus this
                        parameter can be used to specify the preferred
                        variants like en-GB and de-AT. Only available with
                        language=auto. [env var: PREFERRED_VARIANTS]
  -e ENABLED_RULES, --enabled-rules ENABLED_RULES
                        IDs of rules to be enabled, comma-separated [env var:
                        ENABLED_RULES]
  -d DISABLED_RULES, --disabled-rules DISABLED_RULES
                        IDs of rules to be disabled, comma-separated [env var:
                        DISABLED_RULES]
  --enabled-categories ENABLED_CATEGORIES
                        IDs of categories to be enabled, comma-separated [env
                        var: ENABLED_CATEGORIES]
  --disabled-categories DISABLED_CATEGORIES
                        IDs of categories to be disabled, comma-separated [env
                        var: DISABLED_CATEGORIES]
  --enabled-only        enable only the rules and categories whose IDs are
                        specified with --enabled-rules or --enabled-categories

  --pwl PWL, --personal-word-list PWL
                        File name of personal dictionary. A private dictionary
                        can be used to add special words that would otherwise
                        be marked as spelling errors. [env var:
                        PERSONAL_WORD_LIST]

Privacy

By default, pyLanguagetool sends all text via HTTPS to the LanguageTool server (see their privacy policy). You can also set up your own server and use it by changing the --api-url attribute.
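
As a rough illustration of what pointing the client at a self-hosted server means at the HTTP level, the sketch below builds a form-encoded POST request for the `check` endpoint under a `/v2/` base URL. It only constructs the request and sends nothing. The address `http://localhost:8081/v2` is an assumption about a local setup, and `build_check_request` is a hypothetical helper, not part of pyLanguagetool.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_check_request(api_url, text, lang="en-US"):
    """Build (but do not send) a POST request for the v2 check endpoint."""
    # The CLI help says the API URL should end with '/v2/'; tolerate a
    # missing trailing slash.
    if not api_url.endswith("/"):
        api_url += "/"
    body = urlencode({"text": text, "language": lang}).encode("utf-8")
    return Request(api_url + "check", data=body, method="POST")


# Assumed address of a self-hosted LanguageTool server.
req = build_check_request("http://localhost:8081/v2", "This is a example")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send the text to that server instead of the public one.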

pylanguagetool's People

Contributors

codingjoe, findus23, fkarg, matze-dd, pyup-bot, scheijan

pylanguagetool's Issues

Field 'enabledOnly' in HTML request

As LT expects 'true' or 'yes', the field should be set in api.py with

if enabled_only:
    post_parameters["enabledOnly"] = 'true'

See line 285 in LT's TextChecker.java:

    boolean useEnabledOnly = "yes".equals(parameters.get("enabledOnly")) || "true".equals(parameters.get("enabledOnly"));
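
To show why the literal string matters: a Python `True` that is form-encoded turns into `True` with a capital T, which the case-sensitive Java comparison quoted above would not accept. The sketch below uses a hypothetical `build_post_parameters` helper, not the actual contents of api.py.

```python
from urllib.parse import urlencode


def build_post_parameters(text, lang, enabled_rules=None, enabled_only=False):
    """Collect form fields for a check request (illustrative subset only)."""
    params = {"text": text, "language": lang}
    if enabled_rules:
        params["enabledRules"] = ",".join(enabled_rules)
    if enabled_only:
        # Send the lowercase string, not the Python boolean.
        params["enabledOnly"] = "true"
    return params


good = urlencode(build_post_parameters("Some text", "en-US",
                                       enabled_rules=["EN_A_VS_AN"],
                                       enabled_only=True))
# For comparison: encoding the boolean directly yields a capitalised value.
bad = urlencode({"enabledOnly": True})
print(good)
print(bad)
```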

verbosity

organize errors, warnings, notices (with colors)

output could be like the jar file

I mean, a lot of other tools use the output of the LanguageTool CLI.

java -jar languagetool-commandline.jar -l xx <filename>

If it could output the same format, that would be awesome.
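
Any line-oriented output format needs a line and column for each match, whereas the JSON API reports a character offset into the submitted text. The sketch below shows one way to do that conversion. `offset_to_line_col` is a hypothetical helper, and the exact layout of the jar's output is deliberately not reproduced here.

```python
def offset_to_line_col(text, offset):
    """Convert a 0-based character offset into a 1-based (line, column) pair."""
    # Lines before the offset are counted by their terminating newlines.
    line = text.count("\n", 0, offset) + 1
    # rfind returns -1 when the offset is on the first line, which makes
    # the column arithmetic come out 1-based in both cases.
    last_newline = text.rfind("\n", 0, offset)
    return line, offset - last_newline


sample = "All good here.\nThis is a example"
position = offset_to_line_col(sample, sample.index("a example"))
print(position)
```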

Split large text bodies into multiple API requests

Hi there,

I am a longtime LanguageTool user and I just stumbled across your library (thanks btw!).
While trying to incorporate this tool into my workflow, I came across the following error:

ValueError: Error: Your text exceeds the limit of 20000 characters (it's 25869 characters). Please submit a shorter text.

Now, I'm not familiar with the internal workings of the LanguageTool API, but would it be possible to split large text bodies into smaller chunks and to send multiple API requests?
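
A possible shape for such chunking, as a sketch only: cut the text into pieces below the limit, preferring paragraph breaks, then line breaks, then spaces, and remember each chunk's starting offset so that match offsets can be shifted back to positions in the full text. `split_text` is a hypothetical helper, and the limit of 20000 is taken from the error message above. Rules that depend on context across a cut may behave differently than on the unsplit text.

```python
def split_text(text, limit=20000):
    """Yield (offset, chunk) pairs with each chunk at most `limit` characters."""
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Prefer cutting after a paragraph break, then a line break,
            # then a space; fall back to a hard cut at the limit.
            for sep in ("\n\n", "\n", " "):
                cut = text.rfind(sep, start, end)
                if cut > start:
                    end = cut + len(sep)
                    break
        yield start, text[start:end]
        start = end


long_text = "word " * 10000  # 50000 characters, above the limit
chunks = list(split_text(long_text))
print([(offset, len(chunk)) for offset, chunk in chunks])
```

Each chunk would be sent as its own request, and the offset of every returned match increased by the chunk's starting offset.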

Support premium access

Client should support premium access.

It should be possible by adding the following lines into cli.py

p.add_argument("-U", "--username", env_var="USERNAME",
               help="username for the languagetool premium API")
p.add_argument("-P", "--api-key", env_var="API_KEY",
               help="apiKey for the languagetool premium API")
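
The parsed values would still have to reach the request. The LanguageTool HTTP API takes them as the `username` and `apiKey` form fields. The sketch below uses plain `argparse` with environment-variable defaults as a stand-in for ConfigArgParse's `env_var`; `build_parser` and `add_credentials` are hypothetical helpers, not existing pyLanguagetool functions.

```python
import argparse
import os


def build_parser(environ=os.environ):
    """Parser with the two proposed options; env vars act as defaults."""
    p = argparse.ArgumentParser(prog="pylanguagetool")
    p.add_argument("-U", "--username", default=environ.get("USERNAME"),
                   help="username for the languagetool premium API")
    p.add_argument("-P", "--api-key", default=environ.get("API_KEY"),
                   help="apiKey for the languagetool premium API")
    return p


def add_credentials(post_parameters, username=None, api_key=None):
    """Return a copy with premium credentials, only when both are given."""
    if username and api_key:
        return dict(post_parameters, username=username, apiKey=api_key)
    return dict(post_parameters)


args = build_parser(environ={}).parse_args(["-U", "me@example.com", "-P", "secret"])
params = add_credentials({"text": "Some text", "language": "en-US"},
                         username=args.username, api_key=args.api_key)
print(sorted(params))
```

One thing to weigh: `USERNAME` is already set to the login name on Windows, so a more specific variable name might avoid sending an unintended value.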

Add API documentation

Hi,

I noticed that only the CLI portion of this package is documented. In my humble opinion the Python API deserves some documentation too, especially since people are looking for an alternative to pyenchant.

Mind if I give it a go? I am fairly familiar with Sphinx and readthedocs setup as well as writing documentation.

I might throw in a little Sphinx extension, while I am at it ;)

Best
-Joe

Initial Update

Hi 👊

This is my first visit to this fine repo, but it seems you have been working hard to keep all dependencies updated so far.

Once you have closed this issue, I'll create separate pull requests for every update as soon as I find one.

That's it for now!

Happy merging! 🤖

Consider editing repository settings to remove "Packages" section

"Packages No packages published" is displayed right now, fortunately this pointless section can be removed.

Edit repo page config to remove it (cog next to the description).

I am not making a PR as it is defined in proprietary GitHub settings, not in a git repository, and I have no rights to modify repo settings.
