
Comments (22)

iyadahmad commented on May 27, 2024

Also, after completing the installation I got only two commands: bitextor-train-document-alignment and bitextor-get-html-text.

from bitextor.

lpla commented on May 27, 2024

hieuhoang commented on May 27, 2024

@iyadahmad I also had problems with Bitextor on Ubuntu 16.04; I now use Ubuntu 18.04. You may want to upgrade, or give a more detailed description so that the issue can be debugged.

mespla commented on May 27, 2024

iyadahmad commented on May 27, 2024

Yes, the master branch. By the way, I've tried with Ubuntu 18.04 and the result is the same. I will attach log files.

iyadahmad commented on May 27, 2024

I followed the instructions in this file: https://github.com/bitextor/bitextor/blob/master/README.md
autogen.log
config.log
makeinstalllog.log
makelog.log

iyadahmad commented on May 27, 2024

CMakeError-kenlm.log
CMakeError-mgiza.log
CMakeError-preprocess.log
CMakeOutput-kenlm.log
CMakeOutput-mgiza.log
CMakeOutput-preprocess.log

lpla commented on May 27, 2024

It looks like some parts of Boost are missing, or something is broken with the libboost or cmake versions; maybe they are too old. These are the most important errors (from the mgiza CMakeError log file):

error: ‘std::tr1’ has not been declared
...
CheckSymbolExists.c:(.text+0x1b): undefined reference to `pthread_create'
...
/usr/bin/ld: cannot find -lpthreads

Check whether libboost-all-dev is installed.
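
One way to run the check suggested above on a Debian or Ubuntu system is to ask dpkg directly. This is only a minimal sketch: it assumes dpkg-query is available, and the helper name pkg_version is made up for illustration. It reports what is installed and does not judge whether a version is new enough, since the thread does not state minimum versions.

```shell
# Hypothetical helper: print the installed version of a Debian package,
# or "missing" when dpkg-query fails (package unknown, or no dpkg at all).
pkg_version() {
    dpkg-query -W -f='${Version}' "$1" 2>/dev/null || echo "missing"
}

# Report the two packages mentioned in this thread.
for pkg in libboost-all-dev cmake; do
    printf '%s: %s\n' "$pkg" "$(pkg_version "$pkg")"
done
```

A result of "missing" for libboost-all-dev would point at the package itself; a version string means the cause of the CMake errors lies elsewhere.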

iyadahmad commented on May 27, 2024

Thanks for your quick reply.

libboost-all-dev is already installed; the version is 1.65.1
cmake version 3.10.2

IT-coach-666 commented on May 27, 2024

Yesterday I tried to install the latest version of bitextor. There was no error message during the installation, but in the end I can't use the "bitextor" command to crawl data.

The installation process was as follows:
run the script 'configure'
./autogen.sh
make
sudo make install

There was no error message, but I can't run the bitextor command. I am sure that there is no PATH problem.

IT-coach-666 commented on May 27, 2024

There was no error message, but I can't run the bitextor command.
I installed to a specific path and found that it contains only the commands shown in the following screenshot:

command
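
To put the list of installed commands into the thread as text rather than a screenshot, the install prefix can be inspected directly. This is a rough sketch: list_bitextor_commands is a hypothetical helper, and the /usr/local default is an assumption about where "sudo make install" puts files when no custom prefix is given.

```shell
# Hypothetical helper: list executables named bitextor* in PREFIX/bin.
# PREFIX defaults to /usr/local (assumed default install location).
list_bitextor_commands() {
    prefix="${1:-/usr/local}"
    for f in "$prefix"/bin/bitextor*; do
        # An unmatched glob stays literal and fails the -x test, so nothing prints.
        [ -x "$f" ] && basename "$f"
    done
    return 0
}

list_bitextor_commands            # default prefix
command -v bitextor || echo "bitextor: not found on PATH"
```

For a custom install location, pass the prefix as the first argument, for example list_bitextor_commands "$HOME/opt" (a made-up path).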

lpla commented on May 27, 2024

lpla commented on May 27, 2024

Forget my previous message. There is an issue in the master branch which is fixed in our development branch. Please use the released Bitextor 6 package while we fix it: https://github.com/bitextor/bitextor/releases/tag/v6

IT-coach-666 commented on May 27, 2024

Thanks for your reply, lpla.
So the link above is Bitextor 6, and the following link is Bitextor 7? I would also like to know when I can formally use it. Thanks.
https://github.com/bitextor/bitextor/

mespla commented on May 27, 2024

lpla commented on May 27, 2024

OK, master is now fixed, as @mespla said.

Some notes. Bitextor 7 is the next stable release of Bitextor, which is now in development. We manage our code much as Debian does:

  • Tags are for stable releases, which are supported until the next stable release arrives. The current stable is the Bitextor 6 series (well, 6.0.1 now, thanks to your issue with the '\s' in sed, so thanks!). That's why I recommended using it until we fixed master.
  • master is our testing branch (the Debian equivalent), in which we integrate all the new features we have finished (or that have reached a milestone) for the next stable release, such as full Python 3 compatibility, WARC format support and more. It can be broken sometimes (today, for example, but now it is working again).
  • The remaining branches are our development playground for new features. They will probably not even be documented in the Wiki or README.md and are, of course, prone to breakage.

So please, @iyadahmad and @JiaYueHuang, test your installation now by pulling the updates for master, and report any other issues. In case you find more issues and we take long to answer, please try the Bitextor 6.x series.
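
Switching an existing clone between the stable tag and master could look like the sketch below. The helper name use_ref is hypothetical. The tag name v6 is taken from the release URL earlier in this thread, and the submodule step mirrors the "git submodule update --init --recursive" command used by a reporter; whether other tags (such as one for 6.0.1) exist under a particular name is not stated here.

```shell
# Hypothetical helper: check out REF (a tag such as v6, or a branch such as
# master) in the existing clone at REPO, then sync its submodules.
use_ref() {
    repo="$1"
    ref="$2"
    git -C "$repo" checkout --quiet "$ref" &&
        git -C "$repo" submodule --quiet update --init --recursive
}
```

For example, use_ref /path/to/bitextor v6 moves the working tree to the stable release, and use_ref /path/to/bitextor master followed by git pull returns to the testing branch. The path is a placeholder, and the build steps (./autogen.sh, make, sudo make install) have to be repeated after switching.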

IT-coach-666 commented on May 27, 2024

Thanks~

iyadahmad commented on May 27, 2024

Hello,
On master, the "bitextor" binary still doesn't exist; only these two are installed: [/usr/local/bin/bitextor-get-html-text
and /usr/local/bin/bitextor-train-document-alignment]
CMakeError-kenlm.log
CMakeError-mgiza.log
CMakeError-preprocess.log
CMakeOutputkenlm.log
CMakeOutput-mgiza.log
CMakeOutput-preprcess.log

iyadahmad commented on May 27, 2024

I will test the v6.x tag and report back.

iyadahmad commented on May 27, 2024

V6.x notes

Running ./autogen.sh fails with: configure: error: this package requires the following libraries for Python:

  • keras: Please, install v1.0.3 or later and try again: sudo pip install keras

But keras is already installed with pip3 [Keras (2.2.4)].

Here are the steps I used:

  1. git submodule update --init --recursive
  2. sudo apt install cmake g++ automake pkg-config openjdk-8-jdk python3 python3-pip python3-magic libboost-all-dev maven libbz2-dev liblzma-dev zlib1g-dev libffi-dev
  3. sudo pip3 install -e git://github.com/mammadori/magic-python.git#egg=Magic_file_extensions
  4. sudo pip3 install --upgrade python-Levenshtein tensorflow keras iso-639 langid nltk regex h5py ftfy warc3-wet
  5. sudo apt install httrack
  6. sudo pip3 install -r bicleaner/requirements.txt
  7. sudo pip3 install --upgrade matplotlib sklearn numpy && sudo apt install python3-tk
  8. ./autogen.sh
  9. make
  10. sudo make install
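
The fix that resolved this is not recorded in the thread. One thing worth ruling out is that configure probes a different Python interpreter than the one pip3 installed into; the error text suggests "pip" rather than "pip3", but treating that as the cause is an assumption and not a confirmed diagnosis. A small sketch for checking which interpreters can import keras, with can_import as a hypothetical helper:

```shell
# Hypothetical helper: succeed if INTERPRETER (default: python3) can import MODULE.
can_import() {
    module="$1"
    interpreter="${2:-python3}"
    "$interpreter" -c "import $module" >/dev/null 2>&1
}

# Compare the interpreters that might be picked up on this machine.
for py in python python3; do
    if can_import keras "$py"; then
        echo "$py: keras importable"
    else
        echo "$py: keras NOT importable (or $py not installed)"
    fi
done
```

If only python3 reports success, the mismatch between interpreters would be consistent with the configure error above.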

lpla commented on May 27, 2024

iyadahmad commented on May 27, 2024

Thanks a lot, it works now.
