clarin-pl / liner2
Generic framework for information extraction tasks, including recognition of named entities, temporal expressions, spatial expressions and events.
When running my program with g419-liner2-cli-2.7.jar on OpenJDK 11.0.18, I get the following warnings while loading chunkers:
-> Setting up chunker: chunker_timex_norm_duration
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.gson.internal.bind.ReflectiveTypeAdapterFactory (file:/app/jars/g419-liner2-cli-2.7.jar) to field java.util.regex.Pattern.pattern
WARNING: Please consider reporting this to the maintainers of com.google.gson.internal.bind.ReflectiveTypeAdapterFactory
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
As the last warning says, this kind of access will be denied in a future release, so it blocks moving to OpenJDK 17.
I noticed these warnings because my other dependencies cannot run on JDK 8:
NameError: cannot link Java class cz.cuni.mff.ufal.morphodita.Tagger cz/cuni/mff/ufal/morphodita/Tagger has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
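The class file versions in this error map directly to Java releases: 52.0 is Java 8 and 55.0 is Java 11. As a rough sketch for checking which release a given .class file targets, the header can be read as below. The function name is hypothetical and not part of liner2, and the sample header is built by hand for illustration.

```python
import struct


def class_file_java_release(header: bytes) -> int:
    """Return the Java release a .class file targets, given its first 8 bytes."""
    # A class file starts with: magic (u4), minor version (u2), major version (u2),
    # all big-endian.
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # Major version 45 belongs to JDK 1.1; each later release adds one.
    return major - 44


# Hand-built header with major version 55, the version named in the error above.
sample_header = struct.pack(">IHH", 0xCAFEBABE, 0, 55)
print(class_file_java_release(sample_header))
```

For a real file, the first 8 bytes can be read with open(path, "rb").read(8) and passed to the function.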
Running the example for the Docker service:
$ python3 stuff/python/liner2rmq.py -t "Pani Ala Nowak mieszkw w Zielonej Górze"
I get the following error:
[INFO] Temp route: route-DYOWAE
[INFO] Temp input file: /tmp/ngm6jgy_
[INFO] Sent msg 'route-DYOWAE /tmp/ngm6jgy_' to liner2-input
Traceback (most recent call last):
File "stuff/python/liner2rmq.py", line 101, in <module>
process_text(text, args.o)
File "stuff/python/liner2rmq.py", line 82, in process_text
xml = liner2.process(text)
File "stuff/python/liner2rmq.py", line 61, in process
output_path = self.receive(route)
File "stuff/python/liner2rmq.py", line 29, in receive
result = channel.queue_declare(exclusive=True)
TypeError: queue_declare() missing 1 required positional argument: 'queue'
Line 29 in file "stuff/python/liner2rmq.py" looks like this:
result = channel.queue_declare(exclusive=True)
It should be changed to:
result = channel.queue_declare(queue=self.output_queue, exclusive=True)
After this change the code works fine.
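The TypeError is consistent with pika 1.0 and later, where queue is a required argument of queue_declare. This assumes the script uses the pika client. The sketch below wraps the call in a hypothetical helper (declare_reply_queue is not part of liner2rmq.py) and uses a stand-in channel so it runs without a broker. Passing an empty name asks the broker to generate one, which is an alternative to passing self.output_queue. Whether that suits the rest of the script is not verified here.

```python
from types import SimpleNamespace


def declare_reply_queue(channel, name=""):
    """Declare an exclusive queue and return the name the broker reports.

    Assumes `channel` behaves like a pika >= 1.0 channel, where `queue`
    must be passed explicitly; an empty name requests a generated one.
    """
    result = channel.queue_declare(queue=name, exclusive=True)
    return result.method.queue


class FakeChannel:
    """Stand-in for a real channel (illustration only, not a pika class)."""

    def queue_declare(self, queue, exclusive=False):
        # A real broker would generate a unique name for an empty queue name.
        generated = queue or "amq.gen-example"
        return SimpleNamespace(method=SimpleNamespace(queue=generated))


print(declare_reply_queue(FakeChannel(), name="liner2-output"))
```

With a real connection, the channel object returned by the client library would be passed in place of FakeChannel().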
I haven't been able to run the app on macOS 10.15. Running docker-compose build results in the following errors:
#75 86.97 c++: internal compiler error: Killed (program cc1plus)
#75 86.97 Please submit a full bug report,
#75 86.97 with preprocessed source if appropriate.
#75 86.97 See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
#75 87.26 libwccl/CMakeFiles/wccl.dir/build.make:2070: recipe for target 'libwccl/CMakeFiles/wccl.dir/variables.cpp.o' failed
#75 87.26 make[2]: *** [libwccl/CMakeFiles/wccl.dir/variables.cpp.o] Error 4
This was probably caused by running out of memory (Docker on Mac runs inside a Linux VM, and make -j without a number places no limit on parallel compile jobs). Assigning 16 GB of RAM to Docker did not solve the problem; I managed to build the app after changing make -j to make -j1 in the Dockerfile.
There was another problem: running python3 stuff/python/liner2rmq.py -t "Pani Ala Nowak mieszkw w Zielonej Górze" results in:
[INFO] Temp route: route-XAYR69
[INFO] Temp input file: /var/folders/dz/7xrvhr2s43j71_k_f2zch2fw0000gn/T/hl833qwm
[INFO] Sent msg 'route-XAYR69 /var/folders/dz/7xrvhr2s43j71_k_f2zch2fw0000gn/T/hl833qwm' to liner2-input
[INFO] Temp output file: b'ERROR'
Logs on the server side:
liner2_1 | INFO [pool-1-thread-4] (RabbitMqWorker.java:105) - Received path: '/var/folders/dz/7xrvhr2s43j71_k_f2zch2fw0000gn/T/c5tl6vhc'
liner2_1 | ERROR [pool-1-thread-4] (RabbitMqWorker.java:81) - An exception occured
liner2_1 | java.io.FileNotFoundException: /var/folders/dz/7xrvhr2s43j71_k_f2zch2fw0000gn/T/c5tl6vhc (No such file or directory)
stuff/python/liner2rmq.py uses the OS default temp dir, which on macOS is not /tmp. In my case, running echo $TMPDIR yields /var/folders/dz/7xrvhr2s43j71_k_f2zch2fw0000gn/T/.
I changed tempfile._get_default_tempdir() in liner2rmq.py (line 37) to '/tmp', which solved the problem.
Mounting the default temp directory to a volume also worked.
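As a rough sketch of the first workaround, the temp input file can be created in an explicitly chosen directory that the container can also see, instead of relying on the private tempfile._get_default_tempdir(). The helper name write_temp_input and the shared_dir parameter are hypothetical, and /tmp as the shared location is an assumption based on the report above.

```python
import tempfile


def write_temp_input(text: str, shared_dir: str = "/tmp") -> str:
    """Write text to a new temp file inside shared_dir and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so another process can read it by path.
    with tempfile.NamedTemporaryFile(
        mode="w", encoding="utf-8", dir=shared_dir, delete=False
    ) as handle:
        handle.write(text)
        return handle.name


# Uses a throwaway directory here; in practice this would be the shared one.
demo_dir = tempfile.mkdtemp()
print(write_temp_input("Pani Ala Nowak", demo_dir))
```

The caller is responsible for removing the file once the server has processed it.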
Converting the simple IOB file from the test resources (g419-corpus/src/test/resources) creates an invalid CCL file that is a concatenation of several CCL documents. This happens because the source IOB contains multiple -DOCSTART FILE file1 lines. The operation should probably produce a multi-chunk CCL document instead. The output CCL file does not validate against the DTD and may cause errors in chained tools when the same conditions are met.
To reproduce the issue:
./liner2-cli convert -f g419-corpus/src/test/resources/Simple.iob -t ./iob.ccl -i iob -o ccl
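To check whether a given IOB input is affected, one option is to count its document markers before converting. The sketch below groups lines by marker. The marker format is assumed from the description above, the function name is hypothetical, and the sample lines are placeholders rather than real IOB content.

```python
def split_iob_documents(lines, marker="-DOCSTART FILE"):
    """Group IOB lines into (document_name, body_lines) pairs, one per marker line."""
    documents = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(marker):
            # Start a new document named by the text after the marker.
            documents.append((line[len(marker):].strip(), []))
        elif documents:
            documents[-1][1].append(line)
        # Lines before the first marker are ignored.
    return documents


# Placeholder input with two markers, mimicking the situation described above.
sample = [
    "-DOCSTART FILE file1",
    "placeholder-token-line-a",
    "-DOCSTART FILE file1",
    "placeholder-token-line-b",
]
docs = split_iob_documents(sample)
print(len(docs))
```

More than one document in the result suggests the converted CCL output will contain concatenated documents.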