fuzzyset's People

Contributors

adampg, alpae, axiak, glench, metal3d

fuzzyset's Issues

cFuzzySet usage: all match scores are 0?

Hello,

When I use cFuzzySet instead of FuzzySet, every match returns a list of the same length, with all zero similarity scores. I tried changing the rel_sim_cutoff parameter, but it doesn't appear to do anything. Is this the intended behavior?

FuzzySet returns matches with scores, and the match lists are different lengths depending on the scores, which is what I expected.

Thank you!

Does not Compile Under PyPy

Installing fuzzyset in a virtual environment with PyPy 1.8 and fuzzyset 0.0.7 on Ubuntu 12.04 fails. Compilation error below:

Running fuzzyset-0.0.7/setup.py -q bdist_egg --dist-dir /tmp/easy_install-SqZtNT/fuzzyset-0.0.7/egg-dist-tmp-i5bSee
fuzzyset/cfuzzyset.c: In function ‘__Pyx_GetException’:
fuzzyset/cfuzzyset.c:4998:5: warning: implicit declaration of function ‘PyThreadState_GET’ [-Wimplicit-function-declaration]
fuzzyset/cfuzzyset.c:4998:29: warning: initialization makes pointer from integer without a cast [enabled by default]
fuzzyset/cfuzzyset.c:4999:24: error: ‘PyThreadState’ has no member named ‘curexc_type’
fuzzyset/cfuzzyset.c:5000:25: error: ‘PyThreadState’ has no member named ‘curexc_value’
fuzzyset/cfuzzyset.c:5001:22: error: ‘PyThreadState’ has no member named ‘curexc_traceback’
fuzzyset/cfuzzyset.c:5002:11: error: ‘PyThreadState’ has no member named ‘curexc_type’
fuzzyset/cfuzzyset.c:5003:11: error: ‘PyThreadState’ has no member named ‘curexc_value’
fuzzyset/cfuzzyset.c:5004:11: error: ‘PyThreadState’ has no member named ‘curexc_traceback’
fuzzyset/cfuzzyset.c:5006:9: error: ‘PyThreadState’ has no member named ‘curexc_type’
fuzzyset/cfuzzyset.c:5018:22: error: ‘PyThreadState’ has no member named ‘exc_type’
fuzzyset/cfuzzyset.c:5019:23: error: ‘PyThreadState’ has no member named ‘exc_value’
fuzzyset/cfuzzyset.c:5020:20: error: ‘PyThreadState’ has no member named ‘exc_traceback’
fuzzyset/cfuzzyset.c:5021:11: error: ‘PyThreadState’ has no member named ‘exc_type’
fuzzyset/cfuzzyset.c:5022:11: error: ‘PyThreadState’ has no member named ‘exc_value’
fuzzyset/cfuzzyset.c:5023:11: error: ‘PyThreadState’ has no member named ‘exc_traceback’
fuzzyset/cfuzzyset.c: In function ‘__Pyx_ExceptionSave’:
fuzzyset/cfuzzyset.c:5041:29: warning: initialization makes pointer from integer without a cast [enabled by default]
fuzzyset/cfuzzyset.c:5042:19: error: ‘PyThreadState’ has no member named ‘exc_type’
fuzzyset/cfuzzyset.c:5043:20: error: ‘PyThreadState’ has no member named ‘exc_value’
fuzzyset/cfuzzyset.c:5044:17: error: ‘PyThreadState’ has no member named ‘exc_traceback’
fuzzyset/cfuzzyset.c: In function ‘__Pyx_ExceptionReset’:
fuzzyset/cfuzzyset.c:5051:29: warning: initialization makes pointer from integer without a cast [enabled by default]
fuzzyset/cfuzzyset.c:5052:22: error: ‘PyThreadState’ has no member named ‘exc_type’
fuzzyset/cfuzzyset.c:5053:23: error: ‘PyThreadState’ has no member named ‘exc_value’
fuzzyset/cfuzzyset.c:5054:20: error: ‘PyThreadState’ has no member named ‘exc_traceback’
fuzzyset/cfuzzyset.c:5055:11: error: ‘PyThreadState’ has no member named ‘exc_type’
fuzzyset/cfuzzyset.c:5056:11: error: ‘PyThreadState’ has no member named ‘exc_value’
fuzzyset/cfuzzyset.c:5057:11: error: ‘PyThreadState’ has no member named ‘exc_traceback’
fuzzyset/cfuzzyset.c: In function ‘__Pyx_check_binary_version’:
fuzzyset/cfuzzyset.c:5554:5: warning: implicit declaration of function ‘Py_GetVersion’ [-Wimplicit-function-declaration]
fuzzyset/cfuzzyset.c: In function ‘__Pyx_AddTraceback’:
fuzzyset/cfuzzyset.c:5740:5: warning: passing argument 1 of ‘PyFrame_New’ makes pointer from integer without a cast [enabled by default]
/home/dave/.virtualenvs/pypytrunk/include/pypy_decl.h:131:29: note: expected ‘struct PyThreadState *’ but argument is of type ‘int’
error: Setup script exited with error: command 'cc' failed with exit status 1

Make FuzzySet Python2.6 compatible

Right now I get this exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/dist-packages/fuzzyset/__init__.py", line 31, in add
    self.__add(value, i)
  File "/usr/local/lib/python2.6/dist-packages/fuzzyset/__init__.py", line 37, in __add
    grams = _gram_counter(value, gram_size)
  File "/usr/local/lib/python2.6/dist-packages/fuzzyset/__init__.py", line 95, in _gram_counter
    return collections.Counter(_iterate_grams(value, gram_size))
AttributeError: 'module' object has no attribute 'Counter'
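
A possible workaround, sketched under assumptions: `collections.Counter` was only added in Python 2.7, so on 2.6 the counting could be done with a plain dict. The names `iterate_grams` and `gram_counter` below are hypothetical stand-ins for the private `_iterate_grams` and `_gram_counter` seen in the traceback; how the library actually splits a value into grams is not shown here, so the simple sliding window is an assumption.

```python
def iterate_grams(value, gram_size=2):
    # Hypothetical stand-in: yield every substring of length gram_size.
    for i in range(len(value) - gram_size + 1):
        yield value[i:i + gram_size]


def gram_counter(value, gram_size=2):
    # Count grams with a plain dict instead of collections.Counter,
    # which does not exist on Python 2.6.
    counts = {}
    for gram in iterate_grams(value, gram_size):
        counts[gram] = counts.get(gram, 0) + 1
    return counts


print(gram_counter("banana", 2))
```

The returned dict supports the same `counts[gram]` lookups as a `Counter`, though a missing gram raises `KeyError` instead of returning 0.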

Problem while trying to install it

This is the output when I run "pip3 install fuzzyset". How can I fix it? The same happens with the command given on PyPI and with easy_install.

C:\Users\stefa\AppData\Local\Programs\Python\Python37-32\Scripts>pip3 install fuzzyset
Collecting fuzzyset
  Using cached https://files.pythonhosted.org/packages/2e/78/7509f3efbb6acbcf842d7bdbd9a919ca8c0ed248123bdd8c57f08497e0dd/fuzzyset-0.0.19.tar.gz
Requirement already satisfied: python-levenshtein in c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages (from fuzzyset) (0.12.0)
Requirement already satisfied: texttable in c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages (from fuzzyset) (1.6.2)
google-search-api 1.1.14 has requirement selenium<3.0.0,>=2.44.0, but you'll have selenium 3.141.0 which is incompatible.
Installing collected packages: fuzzyset
  Running setup.py install for fuzzyset ... error
    Complete output from command c:\users\stefa\appdata\local\programs\python\python37-32\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\stefa\\AppData\\Local\\Temp\\pip-install-gku7y1sj\\fuzzyset\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\stefa\AppData\Local\Temp\pip-record-77gvau5_\install-record.txt --single-version-externally-managed --compile:
    running install
    running build
    running build_py
    creating build
    creating build\lib.win32-3.7
    creating build\lib.win32-3.7\fuzzyset
    copying fuzzyset\__init__.py -> build\lib.win32-3.7\fuzzyset
    running build_ext
    building 'cfuzzyset' extension
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\stefa\AppData\Local\Temp\pip-install-gku7y1sj\fuzzyset\setup.py", line 70, in <module>
        **extra_kwargs
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages\setuptools\__init__.py", line 145, in setup
        return distutils.core.setup(**attrs)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\core.py", line 148, in setup
        dist.run_commands()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\dist.py", line 966, in run_commands
        self.run_command(cmd)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\dist.py", line 985, in run_command
        cmd_obj.run()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages\setuptools\command\install.py", line 61, in run
        return orig.install.run(self)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\command\install.py", line 545, in run
        self.run_command('build')
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\dist.py", line 985, in run_command
        cmd_obj.run()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\command\build.py", line 135, in run
        self.run_command(cmd_name)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\dist.py", line 985, in run_command
        cmd_obj.run()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
        _build_ext.run(self)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\command\build_ext.py", line 339, in run
        self.build_extensions()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\command\build_ext.py", line 448, in build_extensions
        self._build_extensions_serial()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
        self.build_extension(ext)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages\setuptools\command\build_ext.py", line 205, in build_extension
        _build_ext.build_extension(self, ext)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\command\build_ext.py", line 533, in build_extension
        depends=ext.depends)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\_msvccompiler.py", line 345, in compile
        self.initialize()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\distutils\_msvccompiler.py", line 238, in initialize
        vc_env = _get_vc_env(plat_spec)
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages\setuptools\msvc.py", line 171, in msvc14_get_vc_env
        return EnvironmentInfo(plat_spec, vc_min_ver=14.0).return_env()
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\site-packages\setuptools\msvc.py", line 1620, in return_env
        if self.vs_ver >= 14 and isfile(self.VCRuntimeRedist):
      File "c:\users\stefa\appdata\local\programs\python\python37-32\lib\genericpath.py", line 30, in isfile
        st = os.stat(path)
    TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

    ----------------------------------------
Command "c:\users\stefa\appdata\local\programs\python\python37-32\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\stefa\\AppData\\Local\\Temp\\pip-install-gku7y1sj\\fuzzyset\\setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record C:\Users\stefa\AppData\Local\Temp\pip-record-77gvau5_\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\stefa\AppData\Local\Temp\pip-install-gku7y1sj\fuzzyset\
You are using pip version 19.0.3, however version 20.0.2 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.

Can't install on Windows

I can't install it on Windows 10. Can anybody help with this?
ERROR:
pip install fuzzyset-0.0.15.tar.gz
Processing c:\users\wei.zhou\downloads\fuzzyset-0.0.15.tar.gz
Requirement already satisfied: python-levenshtein in c:\users\wei.zhou\appdata\local\programs\python\python37-32\lib\site-packages (from fuzzyset==0.0.15) (0.12.0)
Requirement already satisfied: texttable in c:\users\wei.zhou\appdata\local\programs\python\python37-32\lib\site-packages (from fuzzyset==0.0.15) (1.5.0)
Requirement already satisfied: setuptools in c:\users\wei.zhou\appdata\local\programs\python\python37-32\lib\site-packages (from python-levenshtein->fuzzyset==0.0.15) (39.0.1)
Building wheels for collected packages: fuzzyset
Running setup.py bdist_wheel for fuzzyset ... error
Complete output from command C:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\python.exe -u -c "import setuptools, tokenize;file='C:\Users\WEI1.ZHO\AppData\Local\Temp\pip-req-build-r4sla49o\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" bdist_wheel -d C:\Users\WEI1.ZHO\AppData\Local\Temp\pip-wheel-ge4ddfa7 --python-tag cp37:
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win32-3.7
creating build\lib.win32-3.7\fuzzyset
copying fuzzyset\__init__.py -> build\lib.win32-3.7\fuzzyset
running build_ext
building 'cfuzzyset' extension
creating build\temp.win32-3.7
creating build\temp.win32-3.7\Release
creating build\temp.win32-3.7\Release\fuzzyset
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\include -IC:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\include "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tcfuzzyset/cfuzzyset.c /Fobuild\temp.win32-3.7\Release\fuzzyset/cfuzzyset.obj
cfuzzyset.c
fuzzyset/cfuzzyset.c(5903): warning C4020: 'function through pointer': too many actual parameters
fuzzyset/cfuzzyset.c(6174): warning C4020: 'function through pointer': too many actual parameters
fuzzyset/cfuzzyset.c(6311): warning C4020: 'function through pointer': too many actual parameters
fuzzyset/cfuzzyset.c(6729): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6730): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6731): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6738): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6739): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6740): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6741): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6742): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6743): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6798): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6799): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6800): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6801): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6802): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6803): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe' failed with exit status 2


Failed building wheel for fuzzyset
Running setup.py clean for fuzzyset
Failed to build fuzzyset
Installing collected packages: fuzzyset
Running setup.py install for fuzzyset ... error
Complete output from command C:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\python.exe -u -c "import setuptools, tokenize;file='C:\Users\WEI1.ZHO\AppData\Local\Temp\pip-req-build-r4sla49o\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Users\WEI1.ZHO\AppData\Local\Temp\pip-record-g65sfo7i\install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build\lib.win32-3.7
creating build\lib.win32-3.7\fuzzyset
copying fuzzyset\__init__.py -> build\lib.win32-3.7\fuzzyset
running build_ext
building 'cfuzzyset' extension
creating build\temp.win32-3.7
creating build\temp.win32-3.7\Release
creating build\temp.win32-3.7\Release\fuzzyset
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\include -IC:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\include "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tcfuzzyset/cfuzzyset.c /Fobuild\temp.win32-3.7\Release\fuzzyset/cfuzzyset.obj
cfuzzyset.c
fuzzyset/cfuzzyset.c(5903): warning C4020: 'function through pointer': too many actual parameters
fuzzyset/cfuzzyset.c(6174): warning C4020: 'function through pointer': too many actual parameters
fuzzyset/cfuzzyset.c(6311): warning C4020: 'function through pointer': too many actual parameters
fuzzyset/cfuzzyset.c(6729): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6730): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6731): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6738): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6739): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6740): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6741): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6742): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6743): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6798): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6799): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6800): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6801): error C2039: 'exc_type': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6802): error C2039: 'exc_value': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
fuzzyset/cfuzzyset.c(6803): error C2039: 'exc_traceback': is not a member of '_ts'
c:\users\wei.zhou\appdata\local\programs\python\python37-32\include\pystate.h(212): note: see declaration of '_ts'
error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\cl.exe' failed with exit status 2

----------------------------------------

Command "C:\Users\wei.zhou\AppData\Local\Programs\Python\Python37-32\python.exe -u -c "import setuptools, tokenize;file='C:\Users\WEI1.ZHO\AppData\Local\Temp\pip-req-build-r4sla49o\setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record C:\Users\WEI1.ZHO\AppData\Local\Temp\pip-record-g65sfo7i\install-record.txt --single-version-externally-managed --compile" failed with error code 1 in C:\Users\WEI~1.ZHO\AppData\Local\Temp\pip-req-build-r4sla49o\

Odd results with few values?

f = FuzzySet()
f.add(u'Conor Hedley')
print f.get(u'directory manager')
>>> [(0.23529411764705888, u'Conor Hedley')]
f.add(u'man')
print f.get(u'directory manager')
>>> [(0.17647058823529416, u'man')]

How exactly does the first result get a 0.23 match? And how does the score actually get lower with a value that seems closer?
I don't really mind the inconsistency, but the JavaScript version of this library behaves as expected (for example, no odd match with "Conor Hedley").
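
For what it's worth, both reported scores are consistent with a normalized Levenshtein score, that is, 1 minus the edit distance divided by the length of the longer string, computed on lowercased input. The sketch below is a plain-Python illustration of that formula, not the library's code; `levenshtein` and `similarity` are hypothetical helper names, and the lowercasing step is an assumption.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance
    # (insertions, deletions and substitutions all cost 1).
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity(a, b):
    # Assumed scoring: 1 - distance / length of the longer string.
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / float(longest)


print(similarity(u'Conor Hedley', u'directory manager'))  # 1 - 13/17
print(similarity(u'man', u'directory manager'))           # 1 - 14/17
```

Under this formula "man" scores lower because turning it into the 17-character query takes 14 insertions, while "Conor Hedley" needs 13 edits. Why the first entry stops being returned once "man" is added is a separate question that this sketch does not answer.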

Levenshtein exception when mixing unicode and strings

Not sure if there's anything this library should do about this, but when you add a unicode string and then look up a byte string, you get an exception:

>>> f = FuzzySet()
>>> f.add(u'balls')
>>> f['balls']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/glen/.virtualenvs/temp/lib/python2.7/site-packages/fuzzyset/__init__.py", line 49, in __getitem__
    results = self.__get(value, i)
  File "/Users/glen/.virtualenvs/temp/lib/python2.7/site-packages/fuzzyset/__init__.py", line 74, in __get
    for _, matched in results[:50]]
  File "/Users/glen/.virtualenvs/temp/lib/python2.7/site-packages/fuzzyset/__init__.py", line 88, in _distance
    distance = Levenshtein.distance(str1, str2)
TypeError: distance expected two Strings or two Unicodes
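
One way a caller could avoid the mismatch, as a minimal sketch: normalize every value to text before adding or looking it up. `to_text` is a hypothetical helper, not part of the library, and the UTF-8 default is an assumption about how the byte strings were encoded.

```python
def to_text(value, encoding='utf-8'):
    # Decode byte strings to text; leave values that are already text untouched.
    # On Python 2, `bytes` is an alias for `str`, so this covers both versions.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


print(to_text(b'balls') == to_text(u'balls'))
```

With such a helper, both `f.add(to_text(value))` and `f[to_text(query)]` would hand the distance function two values of the same type.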

Transition to a wheel-based distribution

Hi,

First of all, thanks for this amazing library. It's blazing fast and solves a very specific task very well. Also, the implementation is quite clean and readable. Nice work!

I wanted to know if you have spare time to set up wheel builds and uploads to PyPI for all major OSes. It would really simplify the installation process for lots of users if fuzzyset was distributed in wheel format.

I'm willing to implement most of this, but I need to know if you're interested in this kind of help.

Problem with install to Python3.9

pip install fuzzyset
trace
ERROR: Command errored out with exit status 1:
command: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-byd1csue/fuzzyset_7135df109e3b4107a8adf8352d3dea6e/setup.py'"'"'; file='"'"'/tmp/pip-install-byd1csue/fuzzyset_7135df109e3b4107a8adf8352d3dea6e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-1t1vw2c7/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.9/fuzzyset
cwd: /tmp/pip-install-byd1csue/fuzzyset_7135df109e3b4107a8adf8352d3dea6e/
Complete output (137 lines):
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.9
creating build/lib.linux-x86_64-3.9/fuzzyset
copying fuzzyset/__init__.py -> build/lib.linux-x86_64-3.9/fuzzyset
running build_ext
building 'cfuzzyset' extension
creating build/temp.linux-x86_64-3.9
creating build/temp.linux-x86_64-3.9/fuzzyset
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/include/python3.9 -c fuzzyset/cfuzzyset.c -o build/temp.linux-x86_64-3.9/fuzzyset/cfuzzyset.o
fuzzyset/cfuzzyset.c: In function '__Pyx_modinit_type_init_code':
fuzzyset/cfuzzyset.c:6309:35: error: 'PyTypeObject' {aka 'struct _typeobject'} has no member named 'tp_print'; did you mean 'tp_dict'?
__pyx_type_9cfuzzyset_cFuzzySet.tp_print = 0;
^~~~~~~~
tp_dict
fuzzyset/cfuzzyset.c: In function '__Pyx_ParseOptionalKeywords':
fuzzyset/cfuzzyset.c:6778:21: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6778:21: warning: 'PyUnicode_AsUnicode' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:580:45: note: declared here
Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
^~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6778:21: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6778:21: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6778:21: warning: 'PyUnicode_AsUnicode' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:580:45: note: declared here
Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
^~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6778:21: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**name) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6794:25: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6794:25: warning: 'PyUnicode_AsUnicode' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:580:45: note: declared here
Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
^~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6794:25: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6794:25: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6794:25: warning: 'PyUnicode_AsUnicode' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:580:45: note: declared here
Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UNICODE *) PyUnicode_AsUnicode(
^~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c:6794:25: warning: '_PyUnicode_get_wstr_length' is deprecated [-Wdeprecated-declarations]
(PyUnicode_GET_SIZE(**argname) != PyUnicode_GET_SIZE(key)) ? 1 :
^
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:446:26: note: declared here
static inline Py_ssize_t _PyUnicode_get_wstr_length(PyObject *op) {
^~~~~~~~~~~~~~~~~~~~~~~~~~
fuzzyset/cfuzzyset.c: In function '__Pyx_PyUnicode_Substring':
fuzzyset/cfuzzyset.c:8026:9: warning: 'PyUnicode_FromUnicode' is deprecated [-Wdeprecated-declarations]
return PyUnicode_FromUnicode(NULL, 0);
^~~~~~
In file included from /usr/local/include/python3.9/unicodeobject.h:1026,
from /usr/local/include/python3.9/Python.h:97,
from fuzzyset/cfuzzyset.c:4:
/usr/local/include/python3.9/cpython/unicodeobject.h:551:42: note: declared here
Py_DEPRECATED(3.3) PyAPI_FUNC(PyObject *) PyUnicode_FromUnicode(
^~~~~~~~~~~~~~~~~~~~~
error: command '/usr/bin/gcc' failed with exit code 1
----------------------------------------
ERROR: Command errored out with exit status 1: /usr/local/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-byd1csue/fuzzyset_7135df109e3b4107a8adf8352d3dea6e/setup.py'"'"'; __file__='"'"'/tmp/pip-install-byd1csue/fuzzyset_7135df109e3b4107a8adf8352d3dea6e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-1t1vw2c7/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/include/python3.9/fuzzyset Check the logs for full command output.

How to get index of match in original list?

Hello,

Is it possible to get, for each result returned by 'get', its index in the original list that was passed or added to FuzzySet? For example, I am looking for matches in a list of DNA sequences that have genes associated with them. It would be great if I could use the index from the match to look up the gene directly.

Thank you!
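
One possible workaround, sketched under assumptions: 'get' appears to return a list of (score, matched_string) tuples (or None when nothing matches), so the positions can be recovered from a side table built from the original list. The helper names below are hypothetical and are not part of fuzzyset. Keys are lowercased because the library seems to normalize case internally, which is also an assumption worth checking.

```python
from collections import defaultdict


def build_position_index(strings):
    """Map each lowercased string to every position where it occurs."""
    positions = defaultdict(list)
    for i, text in enumerate(strings):
        positions[text.lower()].append(i)
    return positions


def matches_with_positions(matches, positions):
    """Attach original-list positions to (score, text) match tuples.

    `matches` is assumed to have the shape returned by FuzzySet.get:
    a list of (score, matched_string) tuples, or None for no match.
    """
    return [
        (score, text, positions.get(text.lower(), []))
        for score, text in (matches or [])
    ]
```

With a list of sequences `seqs` and a parallel list `genes`, something like `matches_with_positions(fs.get(query), build_position_index(seqs))` would give the indices to use with `genes[i]`. Duplicate sequences yield several positions, which is why each entry carries a list.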

Version on PyPI does not match any commit

The most recent version of fuzzyset on PyPI does not match any commit.

As of March 5 of this year, version 0.0.19 is the most recent version on PyPI. This seems to be one patch version ahead of the most recent commit on master, which bumped the version to 0.0.18.

Running a diff shows that the PyPI version contains a difference in the _distance function of both cFuzzySet and FuzzySet (as well as a large difference in the .c file, but that may simply be the result of re-running Cython).

Version 0.0.19 seems to introduce a performance regression because the FuzzySet implementation now computes the Levenshtein distance twice. Specifically, the _distance function now reads:

 def _distance(str1, str2):
     distance = Levenshtein.distance(str1, str2)
     result = Levenshtein.distance(str1, str2)
     if len(str1) > len(str2):
         return 1 - float(distance) / len(str1)
     else:
         return 1 - float(distance) / len(str2)
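
For comparison, here is a minimal sketch of the same normalization with the distance computed only once. This is not the code from any fuzzyset commit: the function name `similarity`, the pure-Python fallback, and the handling of two empty strings are assumptions added for illustration.

```python
try:
    # python-Levenshtein, used when it is installed.
    from Levenshtein import distance as edit_distance
except ImportError:
    def edit_distance(a, b):
        """Pure-Python Levenshtein distance, used only as a fallback."""
        previous = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            current = [i]
            for j, cb in enumerate(b, start=1):
                cost = 0 if ca == cb else 1
                current.append(min(previous[j] + 1,          # deletion
                                   current[j - 1] + 1,       # insertion
                                   previous[j - 1] + cost))  # substitution
            previous = current
        return previous[-1]


def similarity(str1, str2):
    """Return 1 - distance / len(longer string), with one distance call."""
    longest = max(len(str1), len(str2))
    if longest == 0:
        # Assumption: two empty strings count as identical.
        return 1.0
    return 1 - edit_distance(str1, str2) / longest
```

Since the duplicated call in the quoted function assigns to a variable that is never read, dropping it should not change any score.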

Add typings

I'm not quite sure where best to open this issue, since the fuzzyset2 repository doesn't allow issues.

But it would be desirable for the library to get type hints, since using a type checker is becoming more common and type hints generally make libraries more usable.
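
Until the library ships hints, callers could describe the part of the interface they rely on with a structural type. The sketch below is an assumption about the shapes involved ('add' taking a string, 'get' returning a list of (score, string) tuples or a default); `FuzzySetLike`, `Match`, and `best_match` are hypothetical names, not part of fuzzyset.

```python
from typing import List, Optional, Protocol, Tuple, runtime_checkable

# Assumed shape of one result: (similarity score, matched string).
Match = Tuple[float, str]


@runtime_checkable
class FuzzySetLike(Protocol):
    """Structural type for the two methods this sketch relies on."""

    def add(self, value: str) -> object: ...

    def get(self, value: str,
            default: Optional[List[Match]] = None) -> Optional[List[Match]]: ...


def best_match(index: FuzzySetLike, query: str) -> Optional[Match]:
    """Return the highest-scoring match, or None when there is none."""
    matches = index.get(query)
    return max(matches) if matches else None
```

A type checker can then verify calls such as `best_match(fs, "some query")` without any change to the library itself.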

Memory purge

Thanks for the great code.
After running fuzzyset on a huge set of strings (1 million 20-character strings), subsequent code runs out of memory. Is there a way to free the FuzzySet within a program once I'm done using it?

Thanks!
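
A general Python note rather than a fuzzyset feature: an object is reclaimed once no references to it remain, so dropping every reference to the set (and to anything holding its results) is usually the first thing to try. The sketch below uses a hypothetical stand-in class, `BigIndex`, instead of a real FuzzySet, and a weak reference only to observe that the object is gone.

```python
import gc
import weakref


class BigIndex:
    """Hypothetical stand-in for a large FuzzySet-like object."""

    def __init__(self, n):
        self.items = {str(i): [i] for i in range(n)}


def build_and_release(n=1000):
    """Build the object, drop the only reference, and report if it was freed."""
    index = BigIndex(n)
    probe = weakref.ref(index)  # does not keep the object alive
    del index                   # remove the last strong reference
    gc.collect()                # also clears reference cycles, if any
    return probe() is None
```

Even after the object is freed, the interpreter may not hand all of that memory back to the operating system right away, so running the matching step in a separate process is another option if the footprint stays high.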

more documentation. package still maintained?

Hi,

First of all, congratulations on this amazing package, which is way faster than fuzzymatch when dealing with large datasets of strings.

Do you have more documentation about the matching algorithm that is used here? In particular, I am matching whole sentences (not only single words), such as "this is a sentence", and I wanted to know whether your default settings are appropriate in that case (ngrams=2, for instance).

How can I change them?

Many thanks for your help
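
To make the n-gram question concrete, here is a simplified sketch of how a string can be split into character n-grams with padding on both ends. It is an illustration of the general technique, not the library's exact code, and the function names are hypothetical. As far as I can tell, the gram sizes are chosen through the constructor arguments gram_size_lower and gram_size_upper (defaults 2 and 3), but please check that against the version you have installed.

```python
from collections import Counter


def iterate_grams(value, gram_size=2):
    """Split a string into overlapping character n-grams, padded with '-'."""
    padded = "-" + value.lower() + "-"
    if len(padded) < gram_size:
        padded += "-" * (gram_size - len(padded))
    return [padded[i:i + gram_size]
            for i in range(len(padded) - gram_size + 1)]


def shared_gram_count(a, b, gram_size=2):
    """Count the n-grams two strings have in common (with multiplicity)."""
    common = Counter(iterate_grams(a, gram_size)) & Counter(iterate_grams(b, gram_size))
    return sum(common.values())


print(iterate_grams("this is"))
# ['-t', 'th', 'hi', 'is', 's ', ' i', 'is', 's-']
```

Because spaces are ordinary characters here, a sentence is just a longer string with more grams. Whether bigrams or trigrams discriminate better for sentence-length inputs is something to test on your own data.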

Allow .get to return more than 1 result

Allow get to return more than one result, even if the additional results have lower scores. Right now only the result with the highest score is returned (although more than one result is returned if there are duplicate scores).

TypeError: 'int' object is not subscriptable

Please help, I am getting this error:

Traceback (most recent call last):
  File "C:\Users\tonyc\Documents\tuberculosis\Tuberculosis_Diagnosis_Expert_System_Using_Fuzzy_Logic-master\Diabetes_Diagnosis_Expert_System_Using_Fuzzy_Logic-master\code\template.py", line 176, in <module>
    rules, values = generateRules()
  File "C:\Users\tonyc\Documents\tuberculosis\Tuberculosis_Diagnosis_Expert_System_Using_Fuzzy_Logic-master\Diabetes_Diagnosis_Expert_System_Using_Fuzzy_Logic-master\code\template.py", line 156, in generateRules
    cough = fuzzySets(cough_low, cough_med, cough_high, str(row[1]))
  File "C:\Users\tonyc\Documents\tuberculosis\Tuberculosis_Diagnosis_Expert_System_Using_Fuzzy_Logic-master\Diabetes_Diagnosis_Expert_System_Using_Fuzzy_Logic-master\code\template.py", line 46, in fuzzySets
    if(val < str(med[0])):
TypeError: 'int' object is not subscriptable

Installation error - can't find `fuzzyset/cfuzzyset.c`

I had some installation problems using pip and easy_install. Both were throwing exceptions related to the location of fuzzyset/cfuzzyset.c. I tried installing the package both in a virtualenv and system-wide.

Finally, I was able to install it in my venv by cloning the git repo and running easy_install .

The error log below:

$ pip install fuzzyset
Collecting fuzzyset
  Using cached fuzzyset-0.0.11.tar.gz
Requirement already satisfied: python-levenshtein in /home/sebzur/.envs/django1.8/lib/python2.7/site-packages (from fuzzyset)
Requirement already satisfied: texttable in /home/sebzur/.envs/django1.8/lib/python2.7/site-packages (from fuzzyset)
Requirement already satisfied: setuptools in /home/sebzur/.envs/django1.8/lib/python2.7/site-packages (from python-levenshtein->fuzzyset)
Building wheels for collected packages: fuzzyset
  Running setup.py bdist_wheel for fuzzyset ... error
  Complete output from command /home/sebzur/.envs/django1.8/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-EwiKAI/fuzzyset/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpV7bEb8pip-wheel- --python-tag cp27:
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-2.7
  creating build/lib.linux-x86_64-2.7/fuzzyset
  copying fuzzyset/__init__.py -> build/lib.linux-x86_64-2.7/fuzzyset
  running build_ext
  building 'cfuzzyset' extension
  creating build/temp.linux-x86_64-2.7
  creating build/temp.linux-x86_64-2.7/fuzzyset
  x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fdebug-prefix-map=/build/python2.7-ZZaKJ6/python2.7-2.7.13=. -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/include/python2.7 -c fuzzyset/cfuzzyset.c -o build/temp.linux-x86_64-2.7/fuzzyset/cfuzzyset.o
  x86_64-linux-gnu-gcc: error: fuzzyset/cfuzzyset.c: No such file or directory
  x86_64-linux-gnu-gcc: fatal error: no input files
  compilation terminated.
  error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

I guess there's some path error in setup.py, but I have no time to trace it now.
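
One way to narrow this down, offered as a sketch rather than a diagnosis: check whether the downloaded source archive contains the generated C file at all. If it does not, the problem would be in how the archive was packaged rather than in the local compiler setup. The helper name `sdist_contains` is hypothetical.

```python
import io
import tarfile


def sdist_contains(archive, suffix):
    """Return True if any member path in the tar archive ends with `suffix`.

    `archive` is a binary file object, for example open(path, "rb").
    """
    with tarfile.open(fileobj=archive, mode="r:*") as tar:
        return any(name.endswith(suffix) for name in tar.getnames())
```

For example, after `pip download --no-binary :all: --no-deps fuzzyset==0.0.11`, open the resulting .tar.gz in binary mode and call `sdist_contains(fh, "fuzzyset/cfuzzyset.c")`.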
