ru-syntax's People

Contributors

mnvx, sadov-m, tiefling-cat

ru-syntax's Issues

Tag description

I think the README needs a link to a description of the output tags. It would also be great if you provided the original source of the TreeTagger model.

issue with processing.py while trying to run ru-syntax on Windows 10

There's an issue with running ru-syntax on Windows 10: processing.py uses shlex.split (as far as I understand), which is probably fine for "writing lexical analyzers for simple syntaxes resembling that of the Unix shell", but not for the Windows command line.

So if anyone runs into the same issue, replace line 51 of processing.py, "call(shlex.split(malt_call_line.format(malt_name, model_name, raw_fname, ofname)))", with "call(malt_call_line.format(malt_name, model_name, raw_fname, ofname))". That should do the trick.
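
A minimal sketch of why the split goes wrong: in its default POSIX mode, shlex.split treats backslashes as escape characters, so an unquoted Windows path loses its separators. The command line and path below are hypothetical and only stand in for the real malt_call_line from processing.py.

```python
import shlex

# Hypothetical command line with a Windows-style path (not the real malt_call_line).
cmd = r'java -jar C:\tools\maltparser-1.9.0.jar -c model -i in.conll -o out.conll -m parse'

# POSIX mode (the default) treats each backslash as an escape and drops it.
posix_args = shlex.split(cmd)
# Non-POSIX mode leaves backslashes untouched.
windows_args = shlex.split(cmd, posix=False)

print(posix_args[2])    # C:toolsmaltparser-1.9.0.jar
print(windows_args[2])  # C:\tools\maltparser-1.9.0.jar
```

Passing the whole string to call, as suggested above, sidesteps the parsing on Windows. Building the argument list by hand would be another option that works on both platforms.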

Problems with processing.py

Traceback (most recent call last):
  File "ru-syntax.py", line 97, in <module>
    *tmp_fnames)
  File "C:\Users\Maria\OneDrive\HSE\Projects\Sketches\ru-syntax-master\processing.py", line 31, in process
    call(mystem_call_list)
  File "C:\Users\Maria\Anaconda3\lib\subprocess.py", line 267, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\Users\Maria\Anaconda3\lib\subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "C:\Users\Maria\Anaconda3\lib\subprocess.py", line 990, in _execute_child
    startupinfo)
PermissionError: [WinError 5] Отказано в доступе (Access is denied)
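
The traceback alone does not show the cause. One possibility worth ruling out (an assumption, not something confirmed in this report) is that the configured Mystem path points at a folder or a non-executable file rather than the binary itself. The helper below is a hypothetical diagnostic, not part of ru-syntax.

```python
import os

def check_binary(path):
    """Return a short diagnosis for a configured binary path (hypothetical helper)."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "is a directory, expected a file"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"

# The current working directory always exists and is a directory,
# so it demonstrates the "pointed at a folder" case.
print(check_binary(os.getcwd()))
```

Running the same check on the value of MYSTEM_PATH from the config would show whether the path itself is the problem.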

Is it possible to use this system with Russian-SynTagRus corpus?

I am trying to use this system with the Russian-SynTagRus corpus downloaded from http://universaldependencies.org (direct link: https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-1827/ud-treebanks-v1.4.tgz).

After training MaltParser on this corpus, I try to build a syntax tree:

echo "МИД опроверг закрытие американской школы в Москве в ответ на санкции" > example.txt
python3 ru-syntax.py example.txt

I get the following output in out/example.conll:

1       МИД     МИД     S       S n inan nom sg S n inan nom sg 2       nsubj   _       _
2       опроверг        ОПРОВЕРГАТЬ     V       V pf praet sg indic m - V pf praet sg indic m - 0       root    _       _
3       закрытие        ЗАКРЫТИЕ        S       S n inan acc sg S n inan acc sg 2       punct   _       _
4       американской    АМЕРИКАНСКИЙ    A       A - gen sg plen f -     A - gen sg plen f -     5       amod    _       _
5       школы   ШКОЛА   S       S f inan gen sg S f inan gen sg 3       nmod    _       _
6       в       В       PR      PR      PR      7       case    _       _
7       Москве  МОСКВА  S       S f inan loc sg S f inan loc sg 5       nmod    _       _
8       в       В       PR      PR      PR      9       case    _       _
9       ответ   ОТВЕТ   S       S m inan acc sg S m inan acc sg 11      punct   _       _
10      на      НА      PR      PR      PR      11      case    _       _
11      санкции САНКЦИЯ S       S f inan gen sg S f inan gen sg 0       ROOT    _       _
12      .       .       SENT    SENT    SENT    0       ROOT    _       _

As far as I can see, there are three roots here (so it is not a tree), and some parts of speech are detected incorrectly. Can you comment on what is wrong and how to build the syntax tree correctly?
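
The "more than one root" observation can be checked mechanically. The sketch below assumes tab-separated CoNLL-X output for a single sentence, with the HEAD value in the seventh column; the function name and the sample rows are made up for illustration.

```python
def count_roots(conll_text, head_col=6):
    """Count tokens whose HEAD column is 0 in tab-separated CoNLL-X text."""
    roots = 0
    for line in conll_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) > head_col and cols[head_col] == "0":
            roots += 1
    return roots

# Made-up three-token sentence in which two tokens are attached to the root.
sample = "\n".join([
    "1\tA\tA\tS\tS\t_\t2\tnsubj\t_\t_",
    "2\tB\tB\tV\tV\t_\t0\troot\t_\t_",
    "3\t.\t.\tSENT\tSENT\t_\t0\tROOT\t_\t_",
])
print(count_roots(sample))  # 2
```

A well-formed dependency tree for one sentence should give a count of 1.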

FileNotFoundError: [Errno 2] No such file or directory: ''

I get an error when using Python 3.5:

ru-syntax$ python3 ru-syntax.py example.txt 
Traceback (most recent call last):
  File "ru-syntax.py", line 58, in <module>
    os.chdir(os.path.dirname(__file__))
FileNotFoundError: [Errno 2] No such file or directory: ''
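
A minimal sketch of what happens here: on Python versions before 3.9, `__file__` for a script started as `python3 ru-syntax.py` is the relative name given on the command line, so `os.path.dirname` returns an empty string and `os.chdir('')` fails. The `script` variable below is a stand-in for `__file__`, and resolving it with `os.path.abspath` is a suggested workaround rather than the project's own fix.

```python
import os

# Stand-in for __file__ when the script is started as `python3 ru-syntax.py`
# on Python versions before 3.9.
script = "ru-syntax.py"

# A bare file name has no directory part, so dirname is empty.
print(repr(os.path.dirname(script)))  # ''

# Resolving to an absolute path first always yields a usable directory.
script_dir = os.path.dirname(os.path.abspath(script))
print(os.path.isabs(script_dir))  # True
```

With this approach, line 58 of ru-syntax.py would read `os.chdir(os.path.dirname(os.path.abspath(__file__)))`.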

More info:

$ python3 --version
Python 3.5.2

My config:

[DEFAULT]
# full path to the folder containing ru-syntax.py
APP_ROOT = /home/ubuntu/soft/ru-syntax
# path to the folder containing Mystem, TreeTagger, and MaltParser
BIN_PATH = %(APP_ROOT)s/bin
# path for output folder
OUT_PATH = %(APP_ROOT)s/out
# path for temporary files folder
TMP_PATH = %(APP_ROOT)s/tmp

[mystem]
# path to mystem binary
MYSTEM_PATH = /home/ubuntu/soft/mystem/mystem

[malt]
# full path to the folder containing MaltParser
MALT_ROOT = /home/ubuntu/soft/maltparser-1.9.0
# name of MaltParser binary file
MALT_NAME = maltparser-1.9.0.jar
# name of MaltParser model
MODEL_NAME = test

[dicts]
# path to composites dictionary file
COMP_DICT_PATH = %(APP_ROOT)s/dictionaries/composites.csv

[treetagger]
# path to TreeTagger folder
TREETAGGER_BIN = /home/ubuntu/soft/tree-tagger/bin/tree-tagger
# path to Treetagger model
TREETAGGER_PAR = %(APP_ROOT)s/tree_alltags_model.par
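
The `%(APP_ROOT)s` placeholders in this config follow the interpolation syntax of Python's `configparser`. Assuming ru-syntax reads the file with that module (the syntax suggests so, but it is not confirmed here), the sketch below shows how a shortened copy of the config resolves.

```python
import configparser

# Shortened copy of the config above, inlined so the example is self-contained.
config_text = """
[DEFAULT]
APP_ROOT = /home/ubuntu/soft/ru-syntax
BIN_PATH = %(APP_ROOT)s/bin

[dicts]
COMP_DICT_PATH = %(APP_ROOT)s/dictionaries/composites.csv
"""

config = configparser.ConfigParser()
config.read_string(config_text)

# Values from [DEFAULT] are visible in every section and are substituted on access.
print(config["DEFAULT"]["BIN_PATH"])      # /home/ubuntu/soft/ru-syntax/bin
print(config["dicts"]["COMP_DICT_PATH"])  # /home/ubuntu/soft/ru-syntax/dictionaries/composites.csv
```

Printing the resolved values this way is a quick check that every path in the config points where it is expected to.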
