psawa / gecko-app

A web application that interfaces two GEC systems. [web instance is down]

Home Page: https://gecko-app.azurewebsites.net/

License: Other

Languages: Python 77.39%, JavaScript 3.63%, SCSS 11.50%, HTML 6.83%, Dockerfile 0.65%
Topics: grammatical error corrector, sentence reordering, discourse analysis, language assisting

gecko-app's Introduction

GECko+

More than a Grammatical Error Corrector

GECko+ is an English language assistance tool that corrects mistakes of various types in written texts. While many well-established tools of its kind correct mistakes only at the grammatical level (orthography and syntax), our novel pipeline allows the tool to perform corrections at both the grammatical and the discourse level.

Use case examples

| Original text | Corrected text |
| --- | --- |
| Whoever is happy wil make other persons happy to. | Whoever is happy will make other people happy too. |
| The weather was so nice! Yesterday I go to beach. | Yesterday I went to the beach. The weather was so nice! |
| The wood are lovely, dark,, and deep. And miles to go before I sleep. But I have promises to keep. | The woods are lovely, dark, and deep. But I have promises to keep. And miles to go before I sleep. |
| Fool me twice, shame on me. Fool me once, shame on you. | Fool me once, shame on you. Fool me twice, shame on me. |
| Secondy, prepare the pan using oil and butter. Then, put onions and carrots together with salt an pepper, inside the pan. Lastly, let them cooked for 15 minutes, and remove off the food fom the pan. First of all, cut some onions and carrots. | First of all, cut some onions and carrots. Secondly, prepare the pan using oil and butter. Then, put the onions and carrots together with salt and pepper, inside the pan. Lastly, let them cook for 15 minutes, and remove the food from the pan. |

Running locally and developing

Option 1 - Using a virtual environment

  1. Clone the repository.

  2. Create a virtual environment with Python 3.7.

  3. Install the requirements: pip install -r requirements.txt. If that doesn't work, try this workaround: python3.7 -m pip install -r requirements.txt

  4. Download the models by executing each of the following commands from the repository root:

    • mkdir -p application/models/gector/data/model_files && cd application/models/gector/data/model_files && curl -O https://grammarly-nlp-data-public.s3.amazonaws.com/gector/xlnet_0_gector.th
    • mkdir -p application/models/sentence_reorder && cd application/models/sentence_reorder/ && curl -O http://tts.speech.cs.cmu.edu/sentence_order/nips_bert.tar && tar -xf nips_bert.tar && rm nips_bert.tar && mv nips_bert/ model/
  5. Run the web app locally by executing run.py. The default development URL is http://127.0.0.1:5000/.
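
Before starting the app, it can help to confirm that the files from step 4 ended up where the download commands put them. The following is a minimal sketch, not part of the repository: the helper name missing_model_paths is hypothetical, and the two paths are simply the ones that appear in the commands of step 4.

```python
from pathlib import Path
from typing import List

# Locations produced by the download commands in step 4 (relative to the repo root).
EXPECTED_MODEL_PATHS = [
    Path("application/models/gector/data/model_files/xlnet_0_gector.th"),
    Path("application/models/sentence_reorder/model"),
]


def missing_model_paths(repo_root: str = ".") -> List[str]:
    """Return the expected model paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p.as_posix() for p in EXPECTED_MODEL_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_model_paths()
    if missing:
        print("Missing model files or folders:")
        for path in missing:
            print("  " + path)
    else:
        print("All model files are in place.")
```

Run it from the repository root; an empty result means both downloads are in place.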

Option 2 - Using Docker

  1. Build the Docker image: docker build -t gecko-app . (the trailing dot is the build context and is part of the command).
  2. Run the image: docker run -d -p 3000:80 gecko-app. The URL will be http://0.0.0.0:3000/.

Acknowledgments

Our tool implements the following two models, which tackle grammatical and discourse errors, respectively:

  • Kostiantyn Omelianchuk, Vitaliy Atrasevych, Artem Chernodub and Oleksandr Skurzhanskyi "GECToR -- Grammatical Error Correction: Tag, Not Rewrite". In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications. [arXiv]
  • Prabhumoye, Shrimai, Ruslan Salakhutdinov, and Alan W. Black. "Topological Sort for Sentence Ordering." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. [arXiv]

Citation

The paper describing GECko+ is finally out! If you find this tool useful in your research, please consider citing it:

@inproceedings{calo-etal-2021-gecko,
    title = "{GEC}ko+: a Grammatical and Discourse Error Correction Tool",
    author = "Cal{\`o}, Eduardo  and
      Jacqmin, L{\'e}o  and
      Rosemplatt, Thibo  and
      Amblard, Maxime  and
      Couceiro, Miguel  and
      Kulkarni, Ajinkya",
    booktitle = "Actes de la 28e Conf{\'e}rence sur le Traitement Automatique des Langues Naturelles. Volume 3 : D{\'e}monstrations",
    month = "6",
    year = "2021",
    address = "Lille, France",
    publisher = "ATALA",
    url = "https://aclanthology.org/2021.jeptalnrecital-demo.3",
    pages = "8--11",
    abstract = "We introduce GECko+, a web-based writing assistance tool for English that corrects errors both at the sentence and at the discourse level. It is based on two state-of-the-art models for grammar error correction and sentence ordering. GECko+ is available online as a web application that implements a pipeline combining the two models.",
}

Contribute

We accept contributions, whether they intend to fix an issue or to add new functionalities. Don't hesitate to submit a pull request!

https://github.com/psawa/gecko-app/issues

License

The software is distributed under the Creative Commons BY-NC-SA 4.0 license.

License: CC BY-NC-SA 4.0

gecko-app's People

Contributors

dodo-s95, jacqle, justsubh01, psawa

gecko-app's Issues

Words being truncated during correction

Hi, we are using your code for GEC, but the sentence below gets modified even though it is correct:

'Inpatients and Outpatients are allowed' was modified to 'patients and patients..'. Could you please look into this?

SEO description

When the website appears in Google, the description provided is the description of Eduardo's profile. It should show the description of the application itself (the About section).

Detokenization doesn't work as expected

Current behavior: the TreebankWordDetokenizer fails to group back together hyphenated words.
Expected behavior: the output text does not contain any additional spaces compared to the input text.
E.g.: An out-of-the-box feature -> An out - of - the - box feature
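
One possible workaround is a post-processing step applied to the detokenized text. The sketch below is an assumption, not the fix used in the repository: the function name rejoin_hyphens is hypothetical, and it only collapses a spaced-out hyphen chain when the compact form already occurs in the input text, so that a genuine " - " dash between words is left alone.

```python
import re

# A run of word characters joined by " - ", e.g. "out - of - the - box".
_SPACED_HYPHENS = re.compile(r"\w+(?: - \w+)+")


def rejoin_hyphens(detokenized: str, original: str) -> str:
    """Collapse 'a - b' back to 'a-b' when 'a-b' appears in the original text."""

    def fix(match: "re.Match[str]") -> str:
        candidate = match.group(0).replace(" - ", "-")
        # Only rejoin if the input really contained the hyphenated word.
        return candidate if candidate in original else match.group(0)

    return _SPACED_HYPHENS.sub(fix, detokenized)


if __name__ == "__main__":
    print(rejoin_hyphens("An out - of - the - box feature", "An out-of-the-box feature"))
```

A limitation of this approach: if the corrector changes a word inside the hyphenated compound, the compact form no longer matches the input and the spaces stay.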

Can't find "pysdb" "0.3.4"

During the docker build there was an error saying that "pysdb==0.3.4" was not found.
I got it running by changing the version to "pysdb==0.0.3".

The docker build broke on:
/bin/sh: 1: cd: can't cd to application/models/gector/data/model_files

Please fix the issues.
Even better, please upload the docker file to hub.docker.com

Process the input text sentence by sentence

Current behaviour: I don't know exactly why, but if you want to correct several sentences, in some cases punctuation will be removed after correction.

Expected behaviour:
The input text should be processed sentence by sentence, and punctuation should remain when needed.
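
The expected behaviour could be sketched roughly as follows. This is an illustration under assumptions: correct_sentence is a hypothetical callable standing in for whatever function runs the grammatical model on one sentence, and the regex splitter is deliberately naive (it breaks after ".", "!" or "?" followed by whitespace, so abbreviations such as "e.g." would be split incorrectly).

```python
import re
from typing import Callable, List

# Split after sentence-final punctuation that is followed by whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Naively split text into sentences, keeping the final punctuation."""
    return [s for s in _SENTENCE_BOUNDARY.split(text.strip()) if s]


def correct_text(text: str, correct_sentence: Callable[[str], str]) -> str:
    """Apply correct_sentence to each sentence and join the results."""
    return " ".join(correct_sentence(s) for s in split_sentences(text))


if __name__ == "__main__":
    # str.strip stands in for the real (hypothetical) per-sentence corrector.
    print(correct_text("The weather was so nice! Yesterday I go to beach.", str.strip))
```

Because each sentence reaches the corrector with its own final punctuation, a dropped mark can be detected and restored per sentence.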

General UX improvements

We have various UX improvements to make, suggested by some students in cognitive science:

  • Set key bindings to trigger correction by pressing Enter, and to insert a line break by pressing Shift+Enter

  • Check the quality of our profile descriptions. (Edu: "at Samsung", Leo: ", and", Thi: "which, I am convinced, can")

  • Update the About section by adding a description of the discourse feature and citing the other model we'll implement

Add readme

We should add a readme.md which should address installation and some more general info

Modification in the title of the tab

Currently, the title displayed in the tab of the browser is GECko+ - Gobbles up your mistakes.
This + - is kind of odd, so I suggest using a pipe instead of the hyphen.

GECko+ | Gobbles up your mistakes

Pasting text in non-empty input box

Not sure what is causing this, but I've noticed two related bugs:

  • When selecting a text to paste something over it, the selected text remains and the pasted text is added at the end.
  • When selecting a place in the text to paste, the pasted text is added at the end too.

Delete cache

Delete all the predicted files related to the query in the paragraph folder.

Add DEMO BUTTON

Add a demo button that loads the recipe example to show all the functionalities.

Visualize swapped sentences

Currently there is no visual indicator that the text has been reordered. I don't know exactly in what form it could be indicated, but I think it is very important to let users know what transformations have been applied to their text.

Display inconsistencies on various laptops

I noticed that on some computers, the website is displayed very "big". I mean that all the content seems to be very zoomed-in. Among the undesired effects:

  • The demo-area is pushed below the visible part of the screen
  • The about container is displayed vertically (The team is above the project, instead of having them side-by-side)

Potential lead: I develop on Ubuntu, and I noticed that the system is scaled out compared with Windows. On Ubuntu the website displays well, even if I zoom to 150%.

My screen resolution: 1920x1080

Underline the differences between input and output text

In the output box, token-wise differences from the input text should be highlighted.

The CSS classes for the highlighting styles already exist; they are called delta-* in the CSS file.

I believe all we need is either to compare the input and the final output token-wise, or to keep track of the changes during the correction process, and then add the CSS classes where needed.
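
The first option (comparing input and final output token-wise) could look roughly like the sketch below, which uses difflib from the Python standard library. It is an illustration only: the function name highlight_changes is hypothetical, and the class names delta-replace and delta-insert are assumptions, since the issue only says the existing classes are called delta-*.

```python
import difflib
import html


def highlight_changes(source: str, corrected: str) -> str:
    """Wrap tokens of `corrected` that differ from `source` in <span> tags."""
    src_tokens = source.split()
    out_tokens = corrected.split()
    matcher = difflib.SequenceMatcher(a=src_tokens, b=out_tokens, autojunk=False)

    parts = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        tokens = [html.escape(t) for t in out_tokens[j1:j2]]
        if tag == "equal":
            parts.extend(tokens)
        elif tag in ("replace", "insert"):
            # Hypothetical class names following the delta-* pattern.
            parts.extend(f'<span class="delta-{tag}">{t}</span>' for t in tokens)
        # "delete" opcodes contribute no output tokens, so nothing is emitted.
    return " ".join(parts)


if __name__ == "__main__":
    print(highlight_changes("Yesterday I go to beach.", "Yesterday I went to the beach."))
```

Splitting on whitespace keeps punctuation attached to words; a real tokenizer would give finer-grained highlights. Note that sentences moved by the reordering step would show up as large replaced spans, so this comparison is best applied before reordering or per sentence.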

Mobile layout

Improve the CSS rules so that the website displays nicely on mobile devices.

Sentence reordering seems to be messed up when app deployed

I don't remember exactly how it was, but when the app was still working online, output sentences were super weird, i.e. mixed up one with the other.

I suspect that this is due to the fact that there is only 1 tsv file which is overwritten at each query. So if more than 1 user tries to predict, there will be a conflict.
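
If the shared file is indeed the cause, one way to avoid the conflict is to give every query its own uniquely named file and delete it once the prediction is done. The sketch below only illustrates that idea: the function name write_query_file, the one-sentence-per-line layout, and the work directory are all assumptions, since the actual TSV format used by the sentence ordering model is not described here.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def write_query_file(sentences: Iterable[str], work_dir: str) -> Path:
    """Write one query's sentences to a uniquely named .tsv file and return its path."""
    Path(work_dir).mkdir(parents=True, exist_ok=True)
    # mkstemp creates a new file with a unique name, so concurrent queries never collide.
    fd, name = tempfile.mkstemp(suffix=".tsv", dir=work_dir, text=True)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for sentence in sentences:
            # Assumed layout: one sentence per line, tabs replaced to keep the file valid.
            handle.write(sentence.replace("\t", " ") + "\n")
    return Path(name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_query_file(["First sentence.", "Second sentence."], tmp)
        try:
            print(path.read_text(encoding="utf-8"))
        finally:
            path.unlink()  # remove the per-query file once it is no longer needed
```

Deleting the file in a finally block also keeps the working folder from filling up with stale predictions.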
