Comments (5)
Well, the FrogAPI::Frogtostring method creates a FoLiA Document, and then returns the serialized version.
Is it your intention to get a pointer to the underlying Document? Normally that is unwise.
What is your use case? Maybe @proycon could add a Python method. But then you would still need the libfolia C++ API to process the document.
from frog.
python-frog's Frog.process seems to be written to return a Python version of the document. The line
    return PynlplFoliaDocument(string=self.process_raw(text))
expects string to be XML, but FrogAPI::Frogtostring, which is eventually called via the process_raw call, can only return the results via FrogAPI::showResults:
    stringstream ss;
    FrogDoc( *doc, true );
    showResults( ss, *doc );
    delete doc;
    return ss.str();
If the FrogAPI::Frogtostring method could return XML as a string instead of the tab-delimited output when doXMLout is true, I think the code that is already present in the Python wrapper will work. The code to switch between XML and tabs is already present in FrogAPI::Frogserver:
    if ( options.doXMLout ){
      doc.save( outputstream, options.doKanon );
    }
    else {
      showResults( outputstream, doc );
    }
A simple example of what I tried to do:
    import frog
    from pynlpl.formats import folia

    doc = folia.Document(file="in_folia.xml")
    handle = frog.Frog(frog.FrogOptions(
        xmlin=True, xmlout=True,
        parser=False, morph=False, lemma=False, ner=False,
    ))
    output = handle.process(doc)
I expect output to be a folia.Document, but currently this throws an exception because the PynlplFoliaDocument (created in process) expects XML but receives tab-delimited output instead.
(I was trying to set up a build environment to make a pull request, because it seems really simple to implement 😄)
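One way to confirm which format actually comes back is to check whether the raw string parses as XML before handing it to the FoLiA parser. Below is a minimal sketch using only the Python standard library. The helper name looks_like_xml is hypothetical and not part of python-frog, and both sample strings are made up for illustration; they are not real Frog output.

```python
import xml.etree.ElementTree as ET


def looks_like_xml(raw: str) -> bool:
    """Return True if raw parses as well-formed XML, False otherwise.

    Hypothetical diagnostic helper; it only checks well-formedness,
    not whether the XML is valid FoLiA.
    """
    try:
        ET.fromstring(raw)
    except ET.ParseError:
        return False
    return True


# Made-up samples: a tiny XML fragment and one tab-delimited line.
xml_sample = "<FoLiA><text/></FoLiA>"
tab_sample = "1\tDit\tdit\tVNW"

print(looks_like_xml(xml_sample))
print(looks_like_xml(tab_sample))
```

Applied to the string returned by process_raw, a False result would point to the tab-delimited format described above rather than serialized XML.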
from frog.
Ah, OK. Now I see. Indeed, what is returned is not a serialized Document but a tabbed version. :{
Hmm. I leave it to @proycon to solve this. :)
from frog.
OK, the code for FrogAPI::Frogtostring has now been changed in git.
The example should work as expected.
from frog.
Thanks!
I've tested it and the example indeed now works as expected.
from frog.