
monolingual-word-aligner's People

Contributors

ma-sultan


monolingual-word-aligner's Issues

Python ValueError in coreNlpUtil.py

In coreNlpUtil.py, lines 190-200.

Update 0: fixed now with the following change:

        try:
            # construct and append entry for 'left';
            # rindex raises ValueError when the label contains no "-"
            left = item[1][0:item[1].rindex("-")]
            wordNumber = item[1][item[1].rindex("-")+1:]
            # skip entries whose suffix is not a plain word number
            if not wordNumber.isdigit():
                continue
            left += '{' + words[int(wordNumber)-1][1]['CharacterOffsetBegin'] + ' ' + words[int(wordNumber)-1][1]['CharacterOffsetEnd'] + ' ' + wordNumber + '}'
            newItem.append(left)

            # construct and append entry for 'right'
            right = item[2][0:item[2].rindex("-")]
            wordNumber = item[2][item[2].rindex("-")+1:]
            if not wordNumber.isdigit():
                continue
            right += '{' + words[int(wordNumber)-1][1]['CharacterOffsetBegin'] + ' ' + words[int(wordNumber)-1][1]['CharacterOffsetEnd'] + ' ' + wordNumber + '}'
            newItem.append(right)
            result.append(newItem)
        except ValueError as e:
            # report the malformed label instead of crashing
            print(e)

Some errors

This is the report:

Traceback (most recent call last):
  File "/home/jianhong/Final_Year_Project/DLSbased_method.py", line 59, in <module>
    extract_and_produce_data(load_dir, save_dir)
  File "/home/jianhong/Final_Year_Project/DLSbased_method.py", line 29, in extract_and_produce_data
    alignments = align(sentence_1, sentence_2)
  File "/home/jianhong/Final_Year_Project/aligner.py", line 1542, in align
    sentence2ParseResult = parseText(sentence2)
  File "/home/jianhong/Final_Year_Project/coreNlpUtil.py", line 24, in parseText
    parseResult = nlp.parse(sentences)
  File "/home/jianhong/Final_Year_Project/coreNlpUtil.py", line 18, in parse
    return json.loads(self.server.parse(text))
  File "/home/jianhong/Final_Year_Project/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/jianhong/Final_Year_Project/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/home/jianhong/Final_Year_Project/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

Redundant sourceWordsBeingConsidered

In aligner.py, lines 1267 and 1268, each source/target word may be appended many times to the sourceWordsBeingConsidered/targetWordsBeingConsidered lists, which makes these lists much larger than necessary. I do not see the point of including the same word indices many times, as this makes the next loop (line 1285) very time-consuming.
To speed up execution, I converted the sourceWordsBeingConsidered and targetWordsBeingConsidered lists to sets to remove duplicates. It is far faster now and I get the same alignment in testAlign.py. However, I want to be sure that this does not degrade the alignment quality in other cases. Can you please confirm that removing the redundancy is safe?
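
One caveat with a plain set conversion is that a Python set does not keep the original order of the indices. Whether the loop at line 1285 depends on that order is not something this report establishes, so a cautious variant is to drop duplicates while keeping first-seen order. The sketch below is only an illustration: the helper name and the sample list are hypothetical and not part of aligner.py.

```python
def dedupe_preserving_order(indices):
    """Return the indices with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for index in indices:
        if index not in seen:
            seen.add(index)
            unique.append(index)
    return unique


# Hypothetical stand-in for sourceWordsBeingConsidered after the appends.
source_words_being_considered = [3, 1, 3, 2, 1, 3]
print(dedupe_preserving_order(source_words_being_considered))  # prints [3, 1, 2]
```

If the later loop turns out to be order-independent, the plain set conversion described above should behave the same and is simpler.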

Errors while running testAlign.py

Any clue on how to fix this?
I did a fresh install following the steps given in the Readme and got the following error when I ran testAlign.py:

Traceback (most recent call last):
  File "testAlign.py", line 7, in <module>
    alignments = align(sentence1, sentence2)
  File "/home/saketh/stanford-corenlp-python/monolingual-word-aligner/aligner.py", line 1567, in align
    myWordAlignments = alignWords(sentence1LemmasAndPosTags, sentence2LemmasAndPosTags, sentence1ParseResult, sentence2ParseResult)
  File "/home/saketh/stanford-corenlp-python/monolingual-word-aligner/aligner.py", line 1198, in alignWords
    sourceDParse = dependencyParseAndPutOffsets(sourceParseResult)
  File "/home/saketh/stanford-corenlp-python/monolingual-word-aligner/coreNlpUtil.py", line 191, in dependencyParseAndPutOffsets
    left = item[1][0:item[1].rindex("-")]
ValueError: substring not found
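
The traceback shows `rindex("-")` raising because a dependency label contains no "-" at all. One way to guard against that, sketched here under the assumption that labels normally look like `word-N` with a numeric suffix, is a small helper that returns None for anything else so the caller can skip the entry. The helper name is hypothetical and is not part of coreNlpUtil.py.

```python
def split_token_and_index(label):
    """Split a 'word-N' label into (word, N).

    Returns None when the label has no '-' or the suffix is not a number,
    so the caller can skip the entry instead of hitting ValueError.
    """
    # rpartition never raises; the separator is '' when '-' is absent.
    word, separator, number = label.rpartition("-")
    if not separator or not number.isdigit():
        return None
    return word, int(number)


print(split_token_and_index("well-known-5"))  # prints ('well-known', 5)
print(split_token_and_index("nohyphen"))      # prints None
```

Splitting on the last "-" keeps hyphenated words such as "well-known" intact, which is the same behavior the original `rindex` call relies on.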
