ma-sultan / monolingual-word-aligner

Python 100.00%

monolingual-word-aligner's Issues

python find Value error

In coreNlpUtil.py, lines 190 to 200.

Update 0: fixed now with the change below:

        try:
            # construct and append entry for 'left' (the part before the last "-" plus offsets)
            left = item[1][0:item[1].rindex("-")]
            wordNumber = item[1][item[1].rindex("-")+1:]
            if not wordNumber.isdigit():
                continue
            left += '{' + words[int(wordNumber)-1][1]['CharacterOffsetBegin'] + ' ' + words[int(wordNumber)-1][1]['CharacterOffsetEnd'] + ' ' + wordNumber + '}'
            newItem.append(left)

            # construct and append entry for 'right'
            right = item[2][0:item[2].rindex("-")]
            wordNumber = item[2][item[2].rindex("-")+1:]
            if not wordNumber.isdigit():
                continue
            right += '{' + words[int(wordNumber)-1][1]['CharacterOffsetBegin'] + ' ' + words[int(wordNumber)-1][1]['CharacterOffsetEnd'] + ' ' + wordNumber + '}'
            newItem.append(right)
            result.append(newItem)
        except ValueError as e:
            # rindex raises ValueError when the token contains no "-"
            print(e)

Traceback (most recent call last):
  File "/home/jianhong/Final_Year_Project/DLSbased_method.py", line 59, in <module>
    extract_and_produce_data(load_dir, save_dir)
  File "/home/jianhong/Final_Year_Project/DLSbased_method.py", line 29, in extract_and_produce_data
    alignments = align(sentence_1, sentence_2)
  File "/home/jianhong/Final_Year_Project/aligner.py", line 1542, in align
    sentence2ParseResult = parseText(sentence2)
  File "/home/jianhong/Final_Year_Project/coreNlpUtil.py", line 24, in parseText
    parseResult = nlp.parse(sentences)
  File "/home/jianhong/Final_Year_Project/coreNlpUtil.py", line 18, in parse
    return json.loads(self.server.parse(text))
  File "/home/jianhong/Final_Year_Project/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/jianhong/Final_Year_Project/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/home/jianhong/Final_Year_Project/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

Redundant sourceWordsBeingConsidered

In aligner.py, lines 1267 and 1268, each source/target word may be appended many times to the sourceWordsBeingConsidered/targetWordsBeingConsidered lists, which makes these lists very large because of redundant elements. I do not see the point of including the same word indices many times, as it makes the next loop (line 1285) very time-consuming.
To speed up execution, I converted the sourceWordsBeingConsidered and targetWordsBeingConsidered lists to sets to remove duplicates. It is far faster now and I get the same alignment in testAlign.py. However, I want to be sure that this does not degrade alignment quality in other cases. Can you please confirm that removing the redundancy is safe?
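
A minimal sketch of the deduplication described above, assuming the two lists hold hashable word indices. Converting to a set discards insertion order; whether the later loop in aligner.py depends on that order is not confirmed here, so this variant keeps the first occurrence of each index in place. The helper name `dedupe_preserving_order` and the sample data are hypothetical.

```python
def dedupe_preserving_order(items):
    """Return items with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Hypothetical sample: the same word indices appended repeatedly.
source_words_being_considered = [3, 1, 3, 2, 1, 3]
deduped = dedupe_preserving_order(source_words_being_considered)
print(deduped)
```

If the order turns out not to matter, a plain `set(...)` gives the same membership with less code.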

Errors while running testAlign.py

Any clue on how to fix this?
I did a fresh install following the steps given in the Readme and got the following error when I ran testAlign.py:

Traceback (most recent call last):
  File "testAlign.py", line 7, in <module>
    alignments = align(sentence1, sentence2)
  File "/home/saketh/stanford-corenlp-python/monolingual-word-aligner/aligner.py", line 1567, in align
    myWordAlignments = alignWords(sentence1LemmasAndPosTags, sentence2LemmasAndPosTags, sentence1ParseResult, sentence2ParseResult)
  File "/home/saketh/stanford-corenlp-python/monolingual-word-aligner/aligner.py", line 1198, in alignWords
    sourceDParse = dependencyParseAndPutOffsets(sourceParseResult)
  File "/home/saketh/stanford-corenlp-python/monolingual-word-aligner/coreNlpUtil.py", line 191, in dependencyParseAndPutOffsets
    left = item[1][0:item[1].rindex("-")]
ValueError: substring not found
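
The ValueError in this traceback comes from `str.rindex`, which raises when the string contains no "-". A hedged sketch of a defensive parse follows. It assumes dependency tokens normally look like `word-N` (for example `dog-3`); the helper name `split_token` and the sample tokens are hypothetical and not part of coreNlpUtil.py.

```python
def split_token(token):
    """Split 'word-N' into (word, N); return None if the token is malformed."""
    # rpartition never raises: with no "-" it returns ('', '', token).
    word, sep, number = token.rpartition("-")
    if not sep or not number.isdigit():
        return None
    return word, int(number)


# Hypothetical tokens: two well-formed, two malformed.
for token in ["dog-3", "well-known-7", "ROOT", "x-"]:
    print(token, split_token(token))
```

A caller could skip any dependency whose left or right token yields `None` instead of letting the exception abort the alignment.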
