Hello, I've been using hadoop and Hibench for 2,5 months and I have

Encountered problems with Hibench and question about concurrency

intel-bigdata commented on September 23, 2024

Encountered problems with Hibench and question about concurrency


Comments (4)

adrian-wang commented on September 23, 2024

Thanks for your feedback!

For Hive concurrency mode, you need to configure properties such as hive.metastore.uris in hive-site.xml before the Hive metastore service is started. You can check the Hive documentation for more details: [Hive document about setting up remote metastore server](https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-RemoteMetastoreServer).
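As a rough illustration of what such an entry looks like, here is a minimal Python sketch that renders a hive.metastore.uris property for hive-site.xml. The host name `metastore-host` and the helper name `metastore_uri_property` are placeholders made up for this example; 9083 is the usual default metastore port, so adjust it if your service listens elsewhere.

```python
import xml.etree.ElementTree as ET


def metastore_uri_property(host: str, port: int = 9083) -> str:
    """Render a hive-site.xml <property> entry that points clients at a remote metastore."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    # The metastore is reached over Thrift, hence the thrift:// scheme.
    ET.SubElement(prop, "value").text = f"thrift://{host}:{port}"
    return ET.tostring(prop, encoding="unicode")


# "metastore-host" is a hypothetical host name, not a value from this thread.
print(metastore_uri_property("metastore-host"))
```

The printed element would go inside the `<configuration>` root of hive-site.xml on every node that runs Hive clients.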

Also, could you file your fixes for HiBench as pull requests? It would be great to see more contributors and a better HiBench.

We will investigate your other issues a little later, since Chinese New Year is coming up and we are about to go on holiday.

Happy New Year and Thank you again!


jforjohn commented on September 23, 2024

Hello again!

Happy New (or maybe Goat) year!

I am coming back now since I didn't get an answer about the other issues, and to let you know how I fixed the issue with hivebench running in parallel.

Actually, making hivebench run in parallel was a bit more difficult than I expected, so I am going to share the links that helped me fix it. FYI, I have a remote metastore database and a local metastore server.

Follow the instructions for a remote metastore database -> https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin (ensure that you have the right version of MySQL)
Follow the instructions at the beginning of this page to set up your new MySQL database properly -> http://www.cloudera.com/content/cloudera/en/documentation/cdh4/v4-2-0/CDH4-Installation-Guide/cdh4ig_topic_18_4.html#topic_18_4_3_unique_1__p_522_unique_1
Make your database accessible from the metastore server host -> http://dev.mysql.com/doc/refman/5.6/en/adding-users.html
You may have to give your new user more permissions, e.g. ALTER, CREATE
Download the MySQL Java connector (Connector/J) and add it to the CLASSPATH -> http://dev.mysql.com/doc/connector-j/en/connector-j-installing-classpath.html
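To double-check the result of the steps above, something like the following Python sketch can report which of the JDO connection properties described in the Hive metastore admin guide are still missing from a hive-site.xml. The function name `missing_metastore_properties` and the sample values (`db-host`, `metastore`) are made up for illustration; it only checks that the properties exist, not that their values are correct.

```python
import xml.etree.ElementTree as ET
from typing import List

# Connection properties the metastore needs for a remote (e.g. MySQL) database.
REQUIRED = (
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword",
)


def missing_metastore_properties(hive_site_xml: str) -> List[str]:
    """Return the required property names that do not appear in the given XML text."""
    root = ET.fromstring(hive_site_xml)
    present = {
        prop.findtext("name", default="").strip()
        for prop in root.iter("property")
    }
    return [name for name in REQUIRED if name not in present]


# Hypothetical, incomplete configuration used only to exercise the check.
SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://db-host/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
</configuration>"""

print(missing_metastore_properties(SAMPLE))
```

For the sample above, the user name and password properties are reported as missing.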

Problems I encountered:

In hive-site.xml, replace ${system:java.io.tmpdir}/${system:user.name} with /tmp/mydir, as described in https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration (source: http://stackoverflow.com/questions/27099898/java-net-urisyntaxexception-when-starting-hive)
If you get the error "ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (111)", the answer is here -> http://stackoverflow.com/questions/1673530/error-2003-hy000-cant-connect-to-mysql-server-on-127-0-0-1-111
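The first fix can also be scripted. Below is a minimal Python sketch of the substitution, assuming you have the file contents as a string; the helper name `pin_scratch_dirs` is made up, and hive.exec.local.scratchdir is shown only as an example of a property that uses the placeholder.

```python
PLACEHOLDER = "${system:java.io.tmpdir}/${system:user.name}"


def pin_scratch_dirs(hive_site_xml: str, local_dir: str = "/tmp/mydir") -> str:
    """Replace every occurrence of the system-variable placeholder with a fixed directory."""
    return hive_site_xml.replace(PLACEHOLDER, local_dir)


# Example property; other entries that contain the placeholder are rewritten the same way.
before = (
    "<property>"
    "<name>hive.exec.local.scratchdir</name>"
    "<value>${system:java.io.tmpdir}/${system:user.name}</value>"
    "</property>"
)
print(pin_scratch_dirs(before))
```

Back up hive-site.xml before writing the result back, and make sure the target directory exists and is writable by the user running Hive.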

I hope this helps others too!

I am still waiting for your answer about running the nutchindexing benchmark in parallel.


adrian-wang commented on September 23, 2024

Sorry for leaving this for so long... I was working on something else these days.

For Mahout versions, as far as I know we are using the same version. You can even ignore the Mahout that HiBench provides and set your own MAHOUT_HOME to benchmark any compatible Mahout, unless it doesn't support the arguments we are using (we didn't test all Mahout versions, but I think most of them would work).

For the nutchindexing problem, it may result from a configuration that was not cleaned up. We switch between different configurations according to your Hadoop deployment, and this could cause some problems.

For dfsioe, it is a good catch. Maybe we need to handle the case where the user gives us an empty configuration.

If I missed anything, feel free to let me know. You are really helping us a lot and we do appreciate everything you did.

Again, we'd like you to file your fixes as pull requests, so that we can review them in detail and hopefully merge them into trunk. It would be great to see more contributors and a better HiBench.

I'll file some bugs as we discussed here separately. Thanks a lot!


adrian-wang commented on September 23, 2024

Oh, and for the nutchindexing temp file, can you specify which temp file you mean?
