explosion / murmurhash
💥 Cython bindings for MurmurHash2
License: MIT License
Would be great to have this be callable from Python! I'm not super familiar with Cython, but if it is possible to convert a C array to a tuple, then I think it's just a matter of creating a return variable uint32_t[4] out and sending that to Python land.
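The array-to-tuple conversion described above can be sketched in plain Python. This is only an illustration: `ctypes` stands in for a Cython `uint32_t[4]` buffer, the name `c_array_to_tuple` is hypothetical and not part of murmurhash, and the buffer holds placeholder values rather than real hash output.

```python
import ctypes

# Stand-in for a C `uint32_t[4] out` buffer (assumption: ctypes instead of Cython).
Uint32x4 = ctypes.c_uint32 * 4


def c_array_to_tuple(out):
    """Copy a fixed-size C array into an immutable Python tuple of ints."""
    return tuple(int(word) for word in out)


# Placeholder values for illustration, not real hash output.
out = Uint32x4(1, 2, 3, 0xFFFFFFFF)
result = c_array_to_tuple(out)
print(result)
```

In Cython the same idea would usually be written by indexing the array element by element when building the tuple, so that no pointer to the C buffer is handed back to Python.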
pip3 install --user murmurhash fails because setup.py tries to install /usr/include/murmurhash. If the executable needs to be in the path, please notify the user instead of installing into /usr/include yourself.
I'm using virtualenv as a workaround.
Hello, how do I use this lib?
In [1]: import mrmr
In [2]: dir(mrmr)
Out[2]:
['__builtins__',
 '__doc__',
 '__file__',
 '__name__',
 '__package__',
 '__pyx_capi__',
 '__test__']
Summary
Installing murmurhash on aarch64 with the command "pip3 install murmurhash" makes pip build a wheel from source.
Problem description
murmurhash doesn't have a wheel for aarch64 on PyPI. As a result, when murmurhash is installed via pip on aarch64, pip builds the wheel locally, which makes the installation take longer. Publishing an aarch64 wheel would benefit aarch64 users by cutting murmurhash installation time.
Expected Output
Pip should be able to download a murmurhash wheel from PyPI rather than building it from source.
@murmurhash-team, please let me know if I can help with building the wheel or uploading it to PyPI. I am keen to make a murmurhash wheel available for aarch64, and it would be a great opportunity for me to work with you.
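Whether a published wheel can serve a given machine depends on the platform tag in its filename. The following is a rough sketch of that check, not pip's actual resolution logic (which also compares Python and ABI tags). The helper `wheel_matches_machine` and the example filename are hypothetical and used only for illustration.

```python
import platform


def wheel_matches_machine(wheel_filename, machine=None):
    """Return True if the wheel's platform tag is 'any' or names the machine.

    Simplified assumption: the filename ends in
    "-{python tag}-{abi tag}-{platform tag}.whl".
    """
    machine = machine or platform.machine()
    platform_tag = wheel_filename[: -len(".whl")].split("-")[-1]
    return platform_tag == "any" or platform_tag.endswith("_" + machine)


# Hypothetical filename, not taken from PyPI.
example = "murmurhash-1.0.1-cp36-cp36m-manylinux1_x86_64.whl"
print(wheel_matches_machine(example, "x86_64"))
print(wheel_matches_machine(example, "aarch64"))
```

When no wheel matches, pip falls back to the source distribution and compiles it locally, which is the slow path the report describes.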
https://github.com/explosion/murmurhash/blob/master/setup.py#L140
It builds fine without it.
Hello,
I am seeing that murmurhash produces different results on platforms of different byte order.
This will impact the results in cases where the model was trained on x86 and loaded / used on a big endian platform like s390x.
So far, I've encountered this twice:
In the context of spacy/thinc usage, this produces a result that can be somewhat unpredictable. With the fixes in explosion/thinc#559 , I was still observing a difference in the resulting analysis of certain sentences, while others matched exactly.
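The byte-order dependence described above can be illustrated with a minimal sketch. It is not the MurmurHash code itself: it only shows that the same four input bytes become a different 32-bit block depending on the byte order used to read them, which is where a block-based hash starts to diverge. The helper `read_uint32` is a hypothetical name.

```python
def read_uint32(block, byteorder):
    """Interpret 4 bytes as an unsigned 32-bit integer in the given byte order."""
    return int.from_bytes(block, byteorder)


block = b"\x01\x02\x03\x04"

# What a little-endian host (e.g. x86) sees when it loads the block natively.
as_little = read_uint32(block, "little")
# What a big-endian host (e.g. s390x) sees when it loads the same bytes natively.
as_big = read_uint32(block, "big")

print(hex(as_little), hex(as_big))
```

Reading every block with one fixed byte order, regardless of the host, is one way to get identical results on both kinds of platform.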
I have added fixes to the issues encountered so far in: https://github.com/andrewsi-z/murmurhash
On s390x or other big endian platforms, simply cloning and pip installing https://github.com/andrewsi-z/murmurhash is enough to replace/bypass the install of https://github.com/explosion/murmurhash.
In terms of this issue and our discussions in the related issues below, are you interested in a PR for these and future endian-related changes to the murmurhash implementation, or would you prefer to keep this as a separate repository?
Thank you!
See also:
At present, I am trying to install spacy via pip. However, it fails on this dependency with:
error: murmurhash/mrmr.cpp: No such file or directory
The same happens when attempting to install directly. As far as I can tell, this is due to generate_cython not being run, as setup.py recognises it is running in a package, not from source. However, mrmr.cpp has not been included in the package.
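The failure mode described above can be sketched as a guard that a setup script might run before compiling. This assumes that a packaged release is detected by the presence of a PKG-INFO file, which the report does not confirm; `is_source_release` and `check_generated_sources` are hypothetical helpers, not the project's actual setup.py code.

```python
import os


def is_source_release(root):
    """Assumption: a packaged release ships a PKG-INFO file at its root."""
    return os.path.exists(os.path.join(root, "PKG-INFO"))


def check_generated_sources(root, sources):
    """Fail early if a packaged release lacks its pre-generated C++ files.

    In a packaged release Cython is not run, so files such as
    murmurhash/mrmr.cpp must already be present in the archive.
    """
    if not is_source_release(root):
        return  # Building from a checkout: Cython would generate the files.
    missing = [s for s in sources if not os.path.exists(os.path.join(root, s))]
    if missing:
        raise RuntimeError(
            "Packaged release is missing generated sources: " + ", ".join(missing)
        )
```

A check like this turns the compiler's "No such file or directory" into a message that points at the packaging step rather than at the user's environment.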
I've created a minimal example of installing murmurhash using pip on AppVeyor here: https://github.com/suchow/test-murmurhash. It runs successfully on Windows and Python 3.3, 3.4, and 3.5, but not on 2.7.
The error it gives is:
murmurhash/mrmr.cpp(248) : fatal error C1083: Cannot open include file: 'stdint.h': No such file or directory
error: command 'C:\\Users\\appveyor\\AppData\\Local\\Programs\\Common\\Microsoft\\Visual C++ for Python\\9.0\\VC\\Bin\\cl.exe' failed with exit status 2
Here are the full build logs: https://ci.appveyor.com/project/suchow/test-murmurhash/build/1.0.1/job/r9t9duixfppuaefx#L13.
I'm able to install murmurhash==1.0.1 with no problem on OSX (Mojave). However, in my Docker container, it's a no-go. From the logs, I'd wager this distribution isn't available for this architecture.
The container image is based on python:alpine.
Here are the relevant logs:
No matching distribution found for murmurhash==1.0.1
Exception information:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/pip/_internal/basecommand.py", line 228, in main
status = self.run(options, args)
File "/usr/local/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 291, in run
resolver.resolve(requirement_set)
File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolve.py", line 103, in resolve
self._resolve_one(requirement_set, req)
File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolve.py", line 257, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "/usr/local/lib/python3.6/site-packages/pip/_internal/resolve.py", line 210, in _get_abstract_dist_for
self.require_hashes
File "/usr/local/lib/python3.6/site-packages/pip/_internal/operations/prepare.py", line 245, in prepare_linked_requirement
req.populate_link(finder, upgrade_allowed, require_hashes)
File "/usr/local/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 307, in populate_link
self.link = finder.find_requirement(self, upgrade)
File "/usr/local/lib/python3.6/site-packages/pip/_internal/index.py", line 533, in find_requirement
'No matching distribution found for %s' % req
pip._internal.exceptions.DistributionNotFound: No matching distribution found for murmurhash==1.0.1
Not sure if this should be reported here, but this is the error message:
building 'sense2vec.vectors' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/sense2vec
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/ubuntu/miniconda3/envs/industry-mapper/include/python2.7 -I/home/ubuntu/industry-mapper/sense2vec/include -I/home/ubuntu/miniconda3/envs/industry-mapper/include/python2.7 -I/home/ubuntu/miniconda3/envs/industry-mapper/lib/python2.7/site-packages/numpy/core/include -I/home/ubuntu/miniconda3/envs/industry-mapper/lib/python2.7/site-packages/murmurhash/headers -c sense2vec/vectors.cpp -o build/temp.linux-x86_64-2.7/sense2vec/vectors.o -O3 -Wno-unused-function -fno-stack-protector
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
sense2vec/vectors.cpp:258:36: fatal error: murmurhash/MurmurHash3.h: No such file or directory
#include "murmurhash/MurmurHash3.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1