cvangysel / pyndri
pyndri is a Python interface to the Indri search engine.
Home Page: http://ilps.science.uva.nl
License: MIT License
Good morning,
To run my tests, I need to get all the words of a document (for example, all the words of document 1). I used the code snippet you gave in https://arxiv.org/pdf/1701.00749.pdf, like this:
print([dictionary[token_id] for token_id in index.document(1)[1]])
I got the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <listcomp>
  File "~local/lib/python3.5/site-packages/pyndri/dictionary.py", line 32, in __getitem__
    return self.id2token[token_id]
KeyError: 0
Could you please explain how I can access all the words of a document?
Thanks
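One possible cause, offered as an assumption rather than a confirmed diagnosis: Indri-style indexes commonly reserve token id 0 for positions whose term is not in the vocabulary (for example, stopwords removed at indexing time), so the dictionary has no entry for it. The sketch below uses a plain dict as a stand-in for the pyndri dictionary; `OOV_ID` and `decode_tokens` are hypothetical names, not part of pyndri.

```python
# Assumption: token id 0 marks an out-of-vocabulary position and has no
# dictionary entry. OOV_ID and decode_tokens are illustrative names only.
OOV_ID = 0


def decode_tokens(id2token, token_ids, placeholder=None):
    """Map token ids to strings, skipping (or replacing) the OOV id."""
    words = []
    for token_id in token_ids:
        if token_id == OOV_ID:
            # Either drop the position or mark it with a placeholder.
            if placeholder is not None:
                words.append(placeholder)
            continue
        words.append(id2token[token_id])
    return words


# Stand-in data; with pyndri the ids would come from index.document(1)[1].
id2token = {1: "hello", 2: "world"}
print(decode_tokens(id2token, (1, 0, 2)))
print(decode_tokens(id2token, (1, 0, 2), placeholder="<unk>"))
```

Passing a placeholder keeps token positions aligned with the original document, which matters if positions are used later.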
Traceback (most recent call last):
File "/home/derrer/Documents/UniMaterials/distribSystemsInstall/pyndri/tests/pyndri_tests.py", line 121, in setUp
self.assertEqual(ret, 0)
AssertionError: -11 != 0
Ran 23 tests in 5.854s
FAILED (failures=19)
Is an AssertionError critical? If so, how do I deal with it?
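The -11 in the assertion is probably not an arbitrary number. Assuming `ret` in the test's setUp is the return code of a child process started through Python's subprocess module (the test source is not shown here, so this is an assumption), a negative return code -N means the child was terminated by signal N. Shells use a different convention, 128 + N. The sketch below decodes both; `describe_exit_status` is a hypothetical helper name.

```python
import signal


def describe_exit_status(status):
    """Return the signal name behind an exit status, or None for a plain code."""
    if status < 0:
        number = -status          # subprocess convention: -N = killed by signal N
    elif status > 128:
        number = status - 128     # shell convention: 128 + N = killed by signal N
    else:
        return None
    try:
        return signal.Signals(number).name
    except ValueError:
        # Not a known signal number on this platform.
        return None


print(describe_exit_status(-11))
print(describe_exit_status(0))
```

If the decoded signal is a segmentation fault, the failure is in the native code started by the test rather than in the assertion itself.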
Hi,
When I try to install the package in Cygwin, I get the following error:
g++ -shared -Wl,--enable-auto-image-base build/temp.cygwin-2.10.0-x86_64-3.6/src/pyndri.o -Lusr/local/include/indri -L/usr/lib/python3.6/config -L/usr/lib -lindri -lz -lpthread -lm -lpython3.6m -o build/lib.cygwin-2.10.0-x86_64-3.6/pyndri_ext.cpython-36m-x86_64-cygwin.dll
/usr/lib/gcc/x86_64-pc-cygwin/7.3.0/../../../../x86_64-pc-cygwin/bin/ld: cannot find -lindri
collect2: error: ld returned 1 exit status
error: command 'g++' failed with exit status 1
I tried setting the environment variable 'LD_LIBRARY_PATH' to the local Indri folder, but it didn't help.
Is there any chance you could help me get it installed properly?
Thanks in advance.
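A note on why LD_LIBRARY_PATH may not help here: that variable is read by the dynamic loader at run time, whereas "cannot find -lindri" comes from the linker at build time, which searches the `-L` directories (and, with GCC, `LIBRARY_PATH`). The `-Lusr/local/include/indri` in the log is also a relative path that points at an include directory. As a minimal sketch for checking where the library actually lives, the helper below scans candidate directories; `find_library_dirs`, the candidate paths and the file names are assumptions made for illustration.

```python
import os

# Assumed file names for a static or shared Indri library.
LIBRARY_NAMES = ("libindri.a", "libindri.so", "libindri.dll.a")


def find_library_dirs(candidate_dirs, names=LIBRARY_NAMES):
    """Return the directories that contain one of the library files."""
    hits = []
    for directory in candidate_dirs:
        if any(os.path.isfile(os.path.join(directory, name)) for name in names):
            hits.append(directory)
    return hits


# Hypothetical candidates; replace with the prefix Indri was installed to.
candidates = ["/usr/local/lib", "/usr/lib", os.path.expanduser("~/indri/lib")]
print(find_library_dirs(candidates))
```

Whichever directory is reported is the one the linker needs to be told about.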
Hi,
I started installing pyndri. In the first step (sudo apt install g++ zlib1g-dev python3.5-dev python3-pip), I received these messages:
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
g++ : Depends: cpp (>= 4:7.3.0-3ubuntu2.1) but 4:5.3.1-1ubuntu1 is to be installed
Depends: gcc (>= 4:7.3.0-3ubuntu2.1) but 4:5.3.1-1ubuntu1 is to be installed
Depends: g++-7 (>= 7.3.0-27~) but it is not going to be installed
Depends: gcc-7 (>= 7.3.0-27~) but it is not going to be installed
python3-pip : Depends: python3-distutils but it is not going to be installed
Recommends: python3-dev (>= 3.2) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
What should I do?
Regards
When I run the example code, Python terminates with
terminate called after throwing an instance of 'lemur::api::Exception'
and exit code 134 (interrupted by signal 6: SIGABRT).
After googling for a few hours, I haven't found any solution or explanation for the error.
This might not be a pyndri problem, but have you seen this error before by any chance?
Hi,
Many thanks for your contribution; it could be very useful. I cloned pyndri to my computer and ran 'sudo python setup.py install', which failed.
System: Ubuntu
Command: sudo python setup.py install
Error report:
running install
running build
running build_py
running build_ext
building 'pyndri_ext' extension
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -DP_NEEDS_GNU_CXX_NAMESPACE=1 -UNDEBUG -I/usr/include/python2.7 -c src/pyndri.cpp -o build/temp.linux-x86_64-2.7/src/pyndri.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
src/pyndri.cpp:8:42: fatal error: antlr/NoViableAltException.hpp: No such file or directory
#include <antlr/NoViableAltException.hpp>
^
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
What I have tried: installing dependency packages such as
sudo apt-get install build-essential autoconf libtool pkg-config python-opengl python-imaging python-pyrex python-pyside.qtopengl idle-python2.7 qt4-dev-tools qt4-designer libqtgui4 libqtcore4 libqt4-xml libqt4-test libqt4-script libqt4-network libqt4-dbus python-qt4 python-qt4-gl libgle3 python-dev
sudo easy_install greenlet
sudo easy_install gevent
sudo pip install lxml --upgrade
Hello! I am trying to set up pyndri on Colab, but I can't get past the first step, where you need to install Indri.
I visited the link for the Indri project, but all I could find was two jar files (I couldn't find any file named indri-5.11.tar.gz).
Does anyone have any idea how to set up Indri and pyndri on Colab?
Thank you in advance!
I can't compile the project. It gives the following error:
src/pyndri.cpp:8:42: fatal error: indri/CompressedCollection.hpp: No such file or directory
I have indexed a huge number of documents using IndriBuildIndex. I am able to run queries using IndriRunQuery on the same index, but when I try to open the index in pyndri I get the following error:
IOError: Indri repository contain more than one index.
I ran these commands:
sudo apt install g++ zlib1g-dev python3.5-dev python3-pip
sudo pip3 install setuptools
I also ran:
./configure CXX="g++ -D_GLIBCXX_USE_CXX11_ABI=0"
make
sudo make install
Then I tried to run the example code from pyndri:
import pyndri
index = pyndri.Index('~/indri-5.11/runquery/alQuran/indeks')
for document_id in range(index.document_base(), index.maximum_document()):
    print(index.document(document_id))
and I get this error:
Traceback (most recent call last):
  File "opik.py", line 3, in <module>
    index = pyndri.Index('~/indri-5.11/runquery/alQuran/indeks')
  File "/usr/local/lib/python3.5/site-packages/pyndri/__init__.py", line 46, in __init__
    super(Index, self).__init__(*args, **kwargs)
OSError: ../src/Parameters.cpp(469): Couldn't open parameter file '~/indri-5.11/runquery/alQuran/indeks/manifest' for reading.
What should I do? I am new to Python and pyndri. Thanks for your answer.
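One thing worth checking, offered as a guess: `~` is expanded by the shell, not by Python or by the C++ library underneath, so a path string that starts with `~` is most likely taken literally. A minimal sketch that expands the path and checks for the manifest file before opening the index follows; `resolve_index_path` is a hypothetical helper name.

```python
import os


def resolve_index_path(path):
    """Expand '~' and make the path absolute before handing it to the index."""
    return os.path.abspath(os.path.expanduser(path))


index_path = resolve_index_path("~/indri-5.11/runquery/alQuran/indeks")
manifest = os.path.join(index_path, "manifest")
print(index_path)
print("manifest found" if os.path.isfile(manifest) else "manifest missing: " + manifest)

# With pyndri installed, the resolved path would then be passed on:
# index = pyndri.Index(index_path)
```

If the manifest is still reported missing after expansion, the directory is probably not the root of an Indri repository.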
Hello everyone, I want to index a collection of data that I imported into a MongoDB database.
How can I connect Indri to MongoDB to index the data?
Or how can I use pyndri to connect to MongoDB?
I updated gcc using brew and installed indri.
But when installing pyndri, I get the error "Failed building the wheel for pyndri".
Do you know what the problem is?
Hi,
I'm working with ClueWeb09 Category A, a big index (about 2 TB).
I need to extract the textual content of documents. To do so, I can get the token tuple using index.document(doc_id), but to translate it to text I need the id2token dictionary.
The problem is that, as far as I can see, I can get id2token only through index.get_dictionary(), which loads pretty much everything into memory. Even though my machine has over 100 GB of RAM, the process gets killed.
Can I get only the id2token dictionary? (Hopefully that won't be too big.)
Do you have any other solution for my problem?
Thanks,
Avihay
src/pyndri.cpp:8:42: fatal error: indri/CompressedCollection.hpp: No such file or directory
#include <indri/CompressedCollection.hpp>
^
compilation terminated.
error: command 'x86_64-pc-linux-gnu-g++' failed with exit status 1
When I run the command sudo pip3 install pyndri
I get this error:
The directory '/home/sars/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/sars/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pyndri
Downloading pyndri-0.1.tar.gz
Installing collected packages: pyndri
Running setup.py install for pyndri ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-yc8t90wu/pyndri/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-xtj8tbrx-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.5
creating build/lib.linux-x86_64-3.5/pyndri
copying py/__init__.py -> build/lib.linux-x86_64-3.5/pyndri
copying py/compat.py -> build/lib.linux-x86_64-3.5/pyndri
copying py/dictionary.py -> build/lib.linux-x86_64-3.5/pyndri
running build_ext
building 'pyndri_ext' extension
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/src
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -DP_NEEDS_GNU_CXX_NAMESPACE=1 -UNDEBUG -I/usr/include/python3.5m -c src/pyndri.cpp -o build/temp.linux-x86_64-3.5/src/pyndri.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
src/pyndri.cpp:8:42: fatal error: antlr/NoViableAltException.hpp: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-yc8t90wu/pyndri/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-xtj8tbrx-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-yc8t90wu/pyndri/
You are using pip version 8.1.1, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Hi there,
While running the command "pip3 install pyndri" I've got the following error:
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
src/pyndri.cpp:9:42: fatal error: antlr/NoViableAltException.hpp: No such file or directory
#include <antlr/NoViableAltException.hpp>
^
compilation terminated.
error: command 'gcc' failed with exit status 1
Command ".../python3.5/bin/python3.5 -u -c "import setuptools, tokenize;file='/tmp/pip-build-xqx2sa5u/pyndri/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-0bhj1piv-record/install-record.txt --single-version-externally-managed --compile --install-headers .../python3.5/include/site/python3.5/pyndri" failed with error code 1 in /tmp/pip-build-xqx2sa5u/pyndri/
I've run the "install setuptools" command successfully.
Could you help me, please?
Best regards
Good morning,
Thank you for all the good work you do.
Is it possible to use fields through pyndri?
Thanks,
Ortal
Just a small detail:
The total number of documents reported by index.document_count() is one less than it should be.
For instance, if maximum_document is 2 and document_base is 1, then the total number of documents should be 2 instead of 1.
Hence, I believe the right formula is:
index.document_count() = ( index.maximum_document() - index.document_base() +1 )
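Whether the "+1" belongs in the formula depends on whether maximum_document() is an inclusive or an exclusive upper bound, which is not settled in this report. The example loop quoted elsewhere on this page, range(index.document_base(), index.maximum_document()), treats it as exclusive. The sketch below only spells out the arithmetic for both readings; `count_documents` is a hypothetical helper, not a pyndri function.

```python
def count_documents(document_base, maximum_document, inclusive):
    """Number of ids between the two bounds under either reading."""
    if inclusive:
        # Ids are document_base .. maximum_document.
        return maximum_document - document_base + 1
    # Ids are document_base .. maximum_document - 1, as in range().
    return maximum_document - document_base


print(count_documents(1, 2, inclusive=True))   # 2
print(count_documents(1, 2, inclusive=False))  # 1
```

Comparing either value against the number of documents actually indexed would show which reading applies.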
I am very glad to see your work; I have been looking for a Python interface to Indri since I am not good at C++. I have read your example code carefully, and it seems that there is no function to set the retrieval model and smoothing parameters. Could you add functions for those?
I am trying to get the content (the text of a document that I indexed), but I fail to get it:
import pyndri
index = pyndri.Index('/home/opiq/indri/runquery/alQuran/indeks')
dictionary = pyndri.extract_dictionary(index)
_, int_doc_id = index.document_ids([' QS_1:1 '])
print([dictionary[token_id]
for token_id in index.document(int_doc_id)[1]])
but I get this error:
> Traceback (most recent call last):
> File "contoh-dok.py", line 5, in <module>
> _, int_doc_id = index.document_ids([' QS_1:1 '])
> ValueError: not enough values to unpack (expected 2, got 0)
Thanks for your answer.
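The ValueError says the call returned zero items, so there was nothing to unpack. Assuming (this is not confirmed here) that index.document_ids returns a sequence of (external id, internal id) pairs, one per identifier it finds, an empty result means the identifier was not found; the spaces inside ' QS_1:1 ' are one possible reason. The sketch below uses a stand-in lookup function instead of a real index; `first_internal_id` and `fake_document_ids` are hypothetical names.

```python
def first_internal_id(lookup, external_id):
    """Return the internal id for external_id, or None if nothing matched."""
    pairs = lookup([external_id.strip()])  # drop accidental surrounding spaces
    if not pairs:
        return None
    _, internal_id = pairs[0]
    return internal_id


# Stand-in for index.document_ids, backed by a small dict.
KNOWN_IDS = {"QS_1:1": 1}


def fake_document_ids(external_ids):
    return tuple((ext, KNOWN_IDS[ext]) for ext in external_ids if ext in KNOWN_IDS)


print(first_internal_id(fake_document_ids, " QS_1:1 "))  # 1
print(first_internal_id(fake_document_ids, "missing"))   # None
```

Checking for an empty result first turns the unpacking error into an explicit "not found" case.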
Is retrieval based on pseudo-relevance feedback possible in pyndri?
All the examples I have seen run without query expansion.
Thanks in advance
Hi, I am wondering whether it is possible to use the WSDM with pyndri. Do you have any sample code for this use case? Also, is it possible to do parameter tuning using pyndri?
Good evening,
Is it possible to use pyndri with an index that was created by Indri 5.6?
Thanks in advance,
Ortal
Good morning,
For my experiments I need access to the implementation of the BM25 model, because I have to make some changes to the model. How can I access the BM25 implementation, or call the function that computes BM25?
Thanks
Hi,
Is it possible to query multiple indexes?
Similar to batch querying with retrieval parameter files, where several indexes can be listed.
Hi,
I'm new to pyndri and I was wondering if I could use a little help here.
I want to extract from the index the word vectors of documents that contain a specific word (e.g. the word vectors of all documents that contain the word "Asia"). Is there a convenient way to extract this information with pyndri?
In addition, I want to run an RM3 model. How can I do that?
Thanks a lot!
Hi,
I am trying to install Indri in a local folder on Linux (I have no sudo rights).
When I run the command "CC=gcc CXX=g++ pip3 install pyndri --user", I get this error:
" g++ -pthread -shared build/temp.linux-x86_64-3.5/src/pyndri.o -lindri -lz -lpthread -lm -o build/lib.linux-x86_64-3.5/pyndri_ext.cpython-35m-x86_64-linux-gnu.so
/usr/bin/ld: cannot find -lindri
collect2: ld returned 1 exit status
error: command 'g++' failed with exit status 1
----------------------------------------
Command "/lv_local/home/ortal.as/python/bin/python3.5 -u -c "import setuptools, tokenize;file='/tmp/pip-build-_fastrzt/pyndri/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-qlf7fpqe-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-_fastrzt/pyndri/"
The installation log in the attached file:
Installation Log.txt
I would be grateful if you had an idea how to solve the problem.
Thanks in advance,
Ortal
Good Morning,
I installed the latest pyndri today (5.2.18).
The options from the example stem.py (krovetz_stem, porter_stem) are not available. The only available option is "stem".
Thank you in advance,
Ortal
I have an index that is built on Wikipedia articles. Whenever a query contains a non-ASCII character, something (I can't locate the error exactly, but it's definitely below Index.query()) throws a segmentation fault. This happens for '–' and 'ō', but not 'é' (presumably because the latter is part of ascii-256?).
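On the "ascii-256" guess: ASCII only covers code points 0 to 127. 'é' is not ASCII, but it does fit in Latin-1 (ISO 8859-1), which covers code points up to 255, while '–' and 'ō' do not. The sketch below only documents that difference between the three characters; it does not explain the segmentation fault, and `narrowest_encoding` is a hypothetical helper name.

```python
def narrowest_encoding(char):
    """Return the narrowest of ascii / latin-1 / utf-8 that can encode char."""
    for encoding in ("ascii", "latin-1"):
        try:
            char.encode(encoding)
            return encoding
        except UnicodeEncodeError:
            continue
    return "utf-8"


# U+00E9 (e with acute), U+2013 (en dash), U+014D (o with macron)
for char in ("\u00e9", "\u2013", "\u014d"):
    print(repr(char), narrowest_encoding(char), len(char.encode("utf-8")), "byte(s) in UTF-8")
```

All three characters are multi-byte in UTF-8, so byte length alone does not separate the crashing cases from the working one.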
I built and installed Indri (latest) and installed pyndri for Python 3.5, but when I try to import pyndri I get the following error:
In [1]: import pyndri
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-601147fcfd90> in <module>()
----> 1 import pyndri
/usr/local/lib/python3.5/dist-packages/pyndri-1.0-py3.5-linux-x86_64.egg/pyndri/__init__.py in <module>()
1 from pyndri.dictionary import Dictionary, extract_dictionary
2
----> 3 from pyndri_ext import Index as __IndexBase
4 from pyndri_ext import QueryEnvironment, stem, tokenize
5
ImportError: /usr/local/lib/python3.5/dist-packages/pyndri-1.0-py3.5-linux-x86_64.egg/pyndri_ext.cpython-35m-x86_64-linux-gnu.so: undefined symbol: _ZN5indri3api16QueryEnvironment8addIndexERKSs
I'm using Linux Mint 18 Sarah x64 with Python 3.5.2, and I built Indri and pyndri using the following commands.
For Indri, after tar xvzf:
./configure
make
sudo make install
For pyndri:
sudo python3 setup.py install
The error is produced directly after attempting to import pyndri in python3.
I can't make much of the error message. Any help, please?
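A hint at what the symbol name says: in the C++ name mangling used by GCC, `Ss` is the abbreviation for std::string under the older libstdc++ ABI, while the newer ABI mangles its string type with `__cxx11` in the name. The undefined symbol ends in `ERKSs`, so the extension expects an Indri built with the old ABI. Build logs on this page show pyndri compiled with -D_GLIBCXX_USE_CXX11_ABI=0, whereas a plain ./configure on a recent GCC likely builds Indri with the new ABI; that mismatch is a plausible cause, not a confirmed one. The sketch below is a rough heuristic for reading such symbols; `guess_string_abi` is a hypothetical helper and can misfire on identifiers that happen to contain "Ss".

```python
def guess_string_abi(mangled_symbol):
    """Rough guess at which libstdc++ string ABI a mangled symbol refers to."""
    if "__cxx11" in mangled_symbol:
        return "new"      # std::__cxx11::basic_string
    if "Ss" in mangled_symbol:
        return "old"      # Itanium abbreviation for the pre-C++11 std::string
    return "unknown"      # no std::string in the signature, or not recognised


symbol = "_ZN5indri3api16QueryEnvironment8addIndexERKSs"
print(guess_string_abi(symbol))
```

If that is the cause, rebuilding Indri with the same ABI flag as pyndri, as in the ./configure CXX="g++ -D_GLIBCXX_USE_CXX11_ABI=0" command quoted elsewhere on this page, would make the two agree.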
I saw the example in pyndri/examples/query_environments.py.
It returns a list of documents given a query. However, is there any way I can get the score for a specific <q,d> pair?