
Comments (7)

HYLcool commented on May 18, 2024

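@AnitaSherry Hello, and thanks for your interest in and use of the project!

Because of limitations in the simhash library itself, its original repository can only be installed and used on Python 3.8 and below. There are currently two ways to resolve the installation failure:

  1. Use a Python environment that meets this requirement, as @fuxuelinwudi suggests. Thanks to @fuxuelinwudi for the help!
  2. Install from source, changing the simhash-py install source from the original repository to https://github.com/hylcool/simhash-py . That fork contains several modifications we made on top of the original code so that it is compatible with Python 3.9 and above.

Hope this helps. If you run into further problems, feel free to ask anytime.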
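
The choice between the two routes above depends only on the interpreter version, so it can be sketched as a small helper. This is a minimal illustration, not part of data-juicer: the function name `pick_simhash_requirement` is hypothetical, and the `git+https://` form assumes pip's standard syntax for installing from a Git repository; whether the build then succeeds depends on the local toolchain.

```python
import sys

# Both install targets come from the comment above. The "git+" prefix is
# pip's syntax for installing straight from a Git repository (an assumption
# about how one would point pip at the fork, not a command from this thread).
ORIGINAL_PACKAGE = "simhash-py"
PATCHED_FORK = "git+https://github.com/hylcool/simhash-py"


def pick_simhash_requirement(version_info=sys.version_info):
    """Return the pip requirement string suited to the given Python version."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) <= (3, 8):
        # The original library is stated to install on Python 3.8 and below.
        return ORIGINAL_PACKAGE
    # Python 3.9 and above is stated to need the patched fork.
    return PATCHED_FORK


if __name__ == "__main__":
    print("pip install", pick_simhash_requirement())
```

Running the script prints the `pip install` line matching the current interpreter; passing a tuple such as `(3, 11)` shows what another interpreter would need.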

from data-juicer.

fuxuelinwudi commented on May 18, 2024

Installing python=3.8 in a conda environment solves this.


AnitaSherry commented on May 18, 2024

Installing from the https://github.com/hylcool/simhash-py source also fails.
`
(sakura) kemove@kemove-Super-Server:/data/competition/competition_kit/data-juicer/simhash-py-master$ pip install .
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Processing /data/competition/competition_kit/data-juicer/simhash-py-master
Preparing metadata (setup.py) ... done
Building wheels for collected packages: simhash-py
Building wheel for simhash-py (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [108 lines of output]
Building from Cython
/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/dist.py:723: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
warnings.warn(
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.9
creating build/lib.linux-x86_64-3.9/simhash
copying simhash/__init__.py -> build/lib.linux-x86_64-3.9/simhash
running build_ext
Compiling simhash/simhash.pyx because it changed.
[1/1] Cythonizing simhash/simhash.pyx
/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/Cython/Compiler/Main.py:381: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /data/competition/competition_kit/data-juicer/simhash-py-master/simhash/simhash.pxd
tree = Parsing.p_module(s, pxd, full_module_name)

  Error compiling Cython file:
  ------------------------------------------------------------
  ...
  import hashlib
  import struct
  
  from simhash cimport compute as c_compute
  ^
  ------------------------------------------------------------
  
  simhash/simhash.pyx:4:0: 'simhash/compute.pxd' not found
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
  import hashlib
  import struct
  
  from simhash cimport compute as c_compute
  from simhash cimport find_all as c_find_all
  ^
  ------------------------------------------------------------
  
  simhash/simhash.pyx:5:0: 'simhash/find_all.pxd' not found
  warning: simhash/simhash.pyx:15:0: Overriding cdef method with def method.
  warning: simhash/simhash.pyx:19:0: Overriding cdef method with def method.
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
      # Unpacks the binary bytes in digest into a Python integer
      return struct.unpack('>Q', digest)[0] & 0xFFFFFFFFFFFFFFFF
  
  def compute(hashes):
      '''Compute the simhash of a vector of hashes.'''
      return c_compute(hashes)
             ^
  ------------------------------------------------------------
  
  simhash/simhash.pyx:17:11: 'c_compute' is not a constant, variable or function identifier
  
  Error compiling Cython file:
  ------------------------------------------------------------
  ...
      Find the set of all matches within the provided vector of hashes.
  
      The provided hashes are manipulated in place, but upon completion are
      restored to their original state.
      '''
      cdef matches_t results_set = c_find_all(hashes, number_of_blocks, different_bits)
                                   ^
  ------------------------------------------------------------
  
  simhash/simhash.pyx:26:33: 'c_find_all' is not a constant, variable or function identifier
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/data/competition/competition_kit/data-juicer/simhash-py-master/setup.py", line 37, in <module>
      setup(
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 148, in setup
      return run_commands(dist)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 163, in run_commands
      dist.run_commands()
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 967, in run_commands
      self.run_command(cmd)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
      cmd_obj.run()
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/wheel/bdist_wheel.py", line 364, in run
      self.run_command("build")
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
      cmd_obj.run()
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/command/build.py", line 135, in run
      self.run_command(cmd_name)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 986, in run_command
      cmd_obj.run()
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 339, in run
      self.build_extensions()
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 448, in build_extensions
      self._build_extensions_serial()
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/setuptools/_distutils/command/build_ext.py", line 473, in _build_extensions_serial
      self.build_extension(ext)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/Cython/Distutils/build_ext.py", line 122, in build_extension
      new_ext = cythonize(
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/Cython/Build/Dependencies.py", line 1134, in cythonize
      cythonize_one(*args)
    File "/home/kemove/anaconda3/envs/sakura/lib/python3.9/site-packages/Cython/Build/Dependencies.py", line 1301, in cythonize_one
      raise CompileError(None, pyx_file)
  Cython.Compiler.Errors.CompileError: simhash/simhash.pyx
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for simhash-py
Running setup.py clean for simhash-py
Failed to build simhash-py
ERROR: Could not build wheels for simhash-py, which is required to install pyproject.toml-based projects
WARNING: There was an error checking the latest version of pip.
`


AnitaSherry commented on May 18, 2024

Many projects now need Python 3.11. Why is this still stuck on Python 3.8? That is far too outdated.


AnitaSherry commented on May 18, 2024

In environments/science_requires.txt, delete simhash-py and replace it with simhash; after that the installation succeeded.
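
The requirements edit described above can be sketched as a small function. This is only an illustration under stated assumptions: the helper name `swap_requirement` is hypothetical, and the file is assumed to hold one requirement per line. Note that `simhash` and `simhash-py` are separate distributions, and this thread does not confirm that everything depending on `simhash-py` behaves the same after the swap.

```python
import re


def swap_requirement(lines, old="simhash-py", new="simhash"):
    """Return a copy of `lines` with the requirement `old` replaced by `new`.

    Only the package name is compared, so "simhash-py>=0.4" matches while
    "simhash-pybind" or a comment mentioning simhash-py does not. Any version
    specifier on the old line is dropped, since it belonged to the old package.
    """
    result = []
    for line in lines:
        # Cut the line at the first specifier, extras, marker, comment,
        # or whitespace character to isolate the bare package name.
        name = re.split(r"[=<>!~;#\[\s]", line.strip(), maxsplit=1)[0]
        result.append(new if name.lower() == old.lower() else line)
    return result


def rewrite_requirements_file(path):
    """Apply swap_requirement to a requirements file in place."""
    with open(path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(swap_requirement(lines)) + "\n")
```

Calling `rewrite_requirements_file("environments/science_requires.txt")` from the repository root would apply the change to the file named in the comment above.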


HYLcool commented on May 18, 2024

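> Many projects now need Python 3.11. Why is this still stuck on Python 3.8? That is far too outdated.

There isn't much we can do about this on our side at the moment, since the cause lies in a third-party library whose author no longer maintains it. Going forward, we will consider whether we can replace it with another dependency or implement this computation ourselves.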


liuyukid commented on May 18, 2024

I ran into this problem before as well. You could try lowering the Cython version: after I switched to Cython==0.29.21, simhash-py installed successfully.
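
Before retrying the build, it may help to check whether the installed Cython is newer than the version reported to work here. The sketch below is illustrative only: the helper names are hypothetical, the version parsing is deliberately simplified, and a newer Cython is not proven to fail; `0.29.21` is just the one data point from this comment.

```python
from importlib import metadata

# The Cython version reported as working in the comment above.
KNOWN_GOOD_CYTHON = (0, 29, 21)


def parse_version(text):
    """Turn '0.29.21' into (0, 29, 21), ignoring suffixes such as 'a1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def cython_is_newer_than_known_good(installed):
    """True when `installed` is newer than the version reported to work."""
    return parse_version(installed) > KNOWN_GOOD_CYTHON


def installed_cython_version():
    """Return the installed Cython version string, or None if absent."""
    try:
        return metadata.version("Cython")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_cython_version()
    if version is None:
        print("Cython is not installed")
    elif cython_is_newer_than_known_good(version):
        print(f"Cython {version} found; consider: pip install Cython==0.29.21")
    else:
        print(f"Cython {version} is not newer than 0.29.21")
```

If the check suggests a downgrade, install the pinned Cython first and then rerun the simhash-py installation in the same environment.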

