Comments (3)
Thanks for your post!
I think Python 2 compatibility is a very good idea. Sorry that I didn't take care of Python 2 earlier, but pyMorfologik was part of a bigger application written in Python 3.
I have an idea for how to reconcile "list of tuples" with "dictionary". I propose creating an abstract class Parser with an abstract method parse, and then a method get_stem in the Morfologik class that expects two parameters: a list of words and a parser instance. The get_stem method would look like this:
def get_stem(self, words, parser):
    words = self._make_unique(words)
    output = self._run_morfologik(words)
    return parser.parse(output)
Thus you can create a "to list of tuples" parser by extending the abstract parser, and we can move the existing code into a "to dict" parser. Moreover, I would like to create (in the near future) a parser that handles information such as the word's category (verb/adj/...), gender, etc. The natural way to do that will be to extend the abstract parser and write some code.
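To make the proposal concrete, here is a minimal sketch of how such a parser hierarchy could look. It is only an illustration, not the actual pyMorfologik code: the class names ListParser and DictParser are hypothetical, and the raw output is assumed to be one tab-separated "word, stem" pair per line, which may not match the real morfologik output. The abstract class is approximated with a base class that raises NotImplementedError, so the same code runs on both Python 2 and Python 3.

```python
class Parser(object):
    """Base class: turns raw stemmer output into a Python structure."""

    def parse(self, output):
        # Subclasses are expected to override this method.
        raise NotImplementedError("subclasses must implement parse()")


def _pairs(output):
    """Yield (word, stem) from the ASSUMED 'word<TAB>stem' line format."""
    for line in output.splitlines():
        if line.strip():
            word, stem = line.split("\t")[:2]
            yield word, stem


class ListParser(Parser):
    """Hypothetical parser: keeps every pair, in input order."""

    def parse(self, output):
        return list(_pairs(output))


class DictParser(Parser):
    """Hypothetical parser: maps each word to its distinct stems."""

    def parse(self, output):
        result = {}
        for word, stem in _pairs(output):
            stems = result.setdefault(word, [])
            if stem not in stems:
                stems.append(stem)
        return result
```

With this split, get_stem does not need to know which structure the caller wants; it only hands the raw output to whichever parser instance it was given.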
What do you think about it?
Anyway, I think a pull request is a good idea.
Damian
from pymorfologik.
Hi!
(I realized we could discuss things in Polish, but I guess it's good to keep things readable for the rest of the world too)
Good idea! I'll make the changes and submit them via a pull request soon.
However, there would be a few minor changes from my side:
- move _make_unique() out of get_stem() in the Morfologik class and into a Parser class;
For my application I need the whole text to be stemmed, including words that repeat (and thus repeated stems), so _make_unique() is not desirable for my purposes. I hope that's okay (it is also in line with how morfologik outputs things anyway). One could still remove repeated words (and hence stems) in a parser if one wishes, or simply reduce (via set()?) the list of words supplied as input.
- let's rename get_stem() to stem(): an even simpler name, and it is also a verb, so it will fit nicely here
- let's add **kwargs to stem(), so that additional parameters can be used if needed
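A rough sketch of what these three changes could look like together, under some assumptions: MorfologikSketch is a stand-in class (not the real Morfologik), its _run_morfologik just echoes each word as its own stem in an assumed tab-separated format, TupleListParser is a hypothetical parser name, and forwarding **kwargs to the parser is only one possible use of the extra parameters.

```python
class TupleListParser(object):
    """Hypothetical parser returning a list of (word, stem) tuples."""

    def parse(self, output, unique=False):
        pairs = [
            tuple(line.split("\t")[:2])
            for line in output.splitlines()
            if line.strip()
        ]
        if unique:
            # Optional de-duplication now lives in the parser,
            # preserving the order of first occurrence.
            seen = set()
            deduped = []
            for pair in pairs:
                if pair not in seen:
                    seen.add(pair)
                    deduped.append(pair)
            pairs = deduped
        return pairs


class MorfologikSketch(object):
    """Stand-in for the Morfologik class, for illustration only."""

    def _run_morfologik(self, words):
        # ASSUMPTION: fake output, each word is returned as its own stem.
        return "\n".join("{0}\t{0}".format(word) for word in words)

    def stem(self, words, parser, **kwargs):
        # No _make_unique() here: repeated words stay in the output.
        output = self._run_morfologik(words)
        # ASSUMPTION: extra keyword arguments are passed to the parser.
        return parser.parse(output, **kwargs)
```

Called as MorfologikSketch().stem(words, TupleListParser()), repeated words keep their repeated stems; passing unique=True, or calling stem(set(words), ...) when input order does not matter, drops the repeats.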
I'll submit my proposal soon.
Regards,
Adrian
Thanks for the merge. Wouldn't this be a good time to release a new pip package?
I would say it would! :)