m4t1ss / softalignments

Neural machine translation soft alignment visualisations for web and command line

Home Page: http://attention.lielakeda.lv/

License: MIT License

Python 33.45% JavaScript 35.77% PHP 23.70% CSS 2.54% Shell 4.54%

mt machine-translation nmt neural-machine-translation nematus alignment neural-monkey attention-mechanism attention-alignment-visualization python

softalignments's Issues

Missed closing parenthesis

https://github.com/M4t1ss/SoftAlignments/blob/master/web/functions.php#L56

I think the closing tag "?>" is missing.

Visualize two translations simultaneously

I cannot get the comparison of two translations to work by following the README.

Running the script

input=a.nematus 
input2=b.nematus
output_type=web


python process_alignments.py \
    -i $input \
    -o $output_type \
    -f Nematus \
    -v Nematus \
    -w $input2 \

error log

process_alignments.py -i <input_file> [-o <output_type>] [-f <from_system>] [-s <source_sentence_file>] [-t <target_sentence_file>]
input_file is the file with alignment weights (required)
source_sentence_file and target_sentence_file are required only for NeuralMonkey
output_type can be web (default), block, block2 or color
from_system can be Nematus, Marian, Sockeye, OpenNMT or NeuralMonkey (default)

Remember which navigation bars are open and which are closed

Clicking on "Translation", "Confidence", "CDP", "APout", or "APin" shows the navigation bars. But clicking on an item in a navigation bar shows the corresponding sentence and hides all navigation bars (they should stay visible).

Option to combine subword units and their attentions

Probably add this to process_alignments.py as an optional parameter. If set, only save full words and combined attention weights in the web output files, or show a combined attention matrix in the command line output.
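
A minimal sketch of how this could work, not the tool's actual implementation. It assumes BPE-style subwords marked with a trailing "@@" and an attention matrix with one row per target token and one column per source token. The function names merge_subwords and combine_attention are hypothetical. Weights are summed over the pieces of a source word and averaged over the pieces of a target word, so each row still sums to roughly 1; other combination rules are possible.

```python
import numpy as np

def merge_subwords(tokens, marker="@@"):
    """Join subword pieces into words; also return the piece indices per word."""
    words, groups, current, prefix = [], [], [], ""
    for i, tok in enumerate(tokens):
        current.append(i)
        if tok.endswith(marker):
            # Piece continues into the next token: strip the marker and keep going.
            prefix += tok[:-len(marker)]
        else:
            words.append(prefix + tok)
            groups.append(current)
            current, prefix = [], ""
    if current:
        # Sentence ended on a continuation piece; keep what was collected.
        words.append(prefix)
        groups.append(current)
    return words, groups

def combine_attention(att, src_tokens, tgt_tokens, marker="@@"):
    """Collapse a (target x source) subword attention matrix to word level."""
    att = np.asarray(att, dtype=float)
    src_words, src_groups = merge_subwords(src_tokens, marker)
    tgt_words, tgt_groups = merge_subwords(tgt_tokens, marker)
    # Sum columns: the mass on a source word is the total mass on its pieces.
    by_src = np.stack([att[:, g].sum(axis=1) for g in src_groups], axis=1)
    # Average rows: a target word's distribution is the mean over its pieces.
    combined = np.stack([by_src[g, :].mean(axis=0) for g in tgt_groups], axis=0)
    return src_words, tgt_words, combined
```

For example, combine_attention(att, ["trans@@", "lation", "</s>"], ["un@@", "break@@", "able", "</s>"]) with a 4 x 3 matrix returns a 2 x 2 word-level matrix.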

Add some filters

For instance, only show sentences whose length in characters falls within a given range.
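
A rough sketch of such a filter, written in Python for illustration only (the web version would need the same logic on the JavaScript or PHP side). The name filter_by_length and its parameters are hypothetical. It keeps the original sentence index so that sentence numbers in the visualization still point to the same sentence.

```python
def filter_by_length(sentences, min_chars=0, max_chars=None):
    """Yield (index, sentence) pairs with min_chars <= len(sentence) <= max_chars."""
    for index, sentence in enumerate(sentences):
        length = len(sentence)
        # max_chars=None means there is no upper bound.
        if length >= min_chars and (max_chars is None or length <= max_chars):
            yield index, sentence
```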

Fix matrix view when switching to shorter sentences

Highlight words and attention alignments on mouse hover

Highlight words and attention alignments on mouse hover in the web version. D3.js probably has ways to get this done

Single attention-matrix format for easy integration

If the tool accepts one single attention matrix format (which should be specified in the docs), it will be easy to integrate the tool with different NMT frameworks.

NMT framework contributors will only have to save attention weights in the right format to use your tool.

Add a loading indicator

Document-level scores for each one of the metrics

This could be visualized in another table, where e.g. the confidence of the system across different documents could be compared and contrasted.
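
One possible way to compute such scores, as a sketch under assumptions: each sentence already has a score per metric (for example Confidence, CDP, APout, APin), every sentence carries the same set of metrics, and the document-level score is the plain mean over the document's sentences. The function name document_scores and the input layout are hypothetical.

```python
from collections import defaultdict

def document_scores(sentence_scores, doc_ids):
    """Average per-sentence metric values for each document.

    sentence_scores: list of dicts mapping metric name -> value.
    doc_ids: list of document identifiers, parallel to sentence_scores.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for scores, doc in zip(sentence_scores, doc_ids):
        counts[doc] += 1
        for metric, value in scores.items():
            sums[doc][metric] += value
    # Divide each accumulated sum by the number of sentences in that document.
    return {
        doc: {metric: total / counts[doc] for metric, total in metrics.items()}
        for doc, metrics in sums.items()
    }
```

The resulting dictionary has one row per document and one column per metric, which maps directly onto a comparison table.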

Fix URL injections

http://attention.lielakeda.lv/?s=79&directory=

Unclear which system is which in the web visualization

Which one is on top and which is the bottom one? Well, probably the systems are in the order that they were given to process_alignments.py, but in the web view it's unclear...

Penalize non-translated sentences

Maximum execution time exceeded

Hi Matiss

For a very large alignments file (150 MB) I get the following error:

Fatal error: Maximum execution time of 30 seconds exceeded in /Users/mathiasmuller/Desktop/alignments/SoftAlignments/web/functions.php on line 54

Is there anything I can do? It would be nice if the tool could load bigger files on demand, e.g. for the web version.

Thanks and regards!
Mathias

Load data asynchronously

Fails if source string empty

processAlignments currently fails if the source is empty and the only attention weights are 1.0s on the </s> (=<EOS>) token.

$ cat test.a
0 ||| a test ||| 8.20928 ||| ||| 0 2
1.0
1.0
1.0

$ python ~/SoftAlignments/process_alignments.py -i test.a -f Nematus -o color

Traceback (most recent call last):
  File "/home/user/mmueller/SoftAlignments/process_alignments.py", line 270, in <module>
    main(sys.argv[1:])
  File "/home/user/mmueller/SoftAlignments/process_alignments.py", line 252, in main
    functions.processAlignments(data, folder, inputfile, outputType, num, refs)
  File "/home/user/mmueller/SoftAlignments/functions.py", line 227, in processAlignments
    ali = [l[:len(list(filter(None, tgt)))] for l in rawAli[:len(src)]]
IndexError: invalid index to scalar variable.

Is this a bug / unaccounted-for edge case, or am I doing something wrong?

Having an empty source seems like a strange requirement, but I am working with multiple sources, so that's a real use case.

Thanks!
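
The message "invalid index to scalar variable" is what NumPy raises when a scalar is sliced, which suggests that the weights ended up in a one-dimensional array when every row holds a single value. A possible workaround, as a sketch: it assumes the weights are read from text lines with one row per line, and the helper names parse_weight_rows and trim are hypothetical, not functions from the repository.

```python
import numpy as np

def parse_weight_rows(lines):
    """Parse whitespace-separated weights, one row per line, into a 2-D array."""
    rows = [[float(x) for x in line.split()] for line in lines if line.strip()]
    # A list of equal-length lists always yields a 2-D array,
    # even when each row contains just one weight.
    return np.array(rows, dtype=float)

def trim(ali, n_rows, n_cols):
    """Keep the first n_rows rows and the first n_cols weights of each row."""
    return [row[:n_cols] for row in ali[:n_rows]]

# The three weight lines from test.a: each row has only the </s> weight.
ali = parse_weight_rows(["1.0", "1.0", "1.0"])
trimmed = trim(ali, 3, 1)
```

With the rows kept two-dimensional, each row is still an array and can be sliced, so the single-column case no longer raises.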