arangodb-helper / arangodb
ArangoDB Starter - starts ArangoDB clusters & single servers with ease.
License: Apache License 2.0
Hi, we've been trialing this and are currently on version 0.5.0. We had an issue with the underlying data drive on one of the nodes, which forced us to take that node down. We had replication on all shards set to 2, which we thought would in theory allow us to provision a new node and rebalance the shards without data loss. The rebalance has not succeeded, and we assume this is because the node is still present on the web UI overview page, albeit in an errored state: both the DBServer and the coordinator for that node are listed as SHUTDOWN. Another problem is that when we provisioned the new node on the cluster, a new agent was not provisioned, presumably because the old one was still 'registered'?
Our question is: how do we remove the erroneous node so that we can register an agent etc. from our new one and rebalance our shard replicas across to the new node?
Here is a dump of the arangod.log file from the agent on the lead node:
2017-08-08T11:15:27Z [2872] ERROR {cluster} cannot create connection to server '' at endpoint 'tcp://10.16.1.6:4001'
2017-08-08T11:15:28Z [2872] INFO {supervision} Precondition failed for starting job 12520001
Note that this is continually retrying and has CPU pegged at ~80%
Hoping for some guidance here should this issue arise in future. We intend to start from scratch on a new cluster set up.
Thanks
By default the starter runs arangod as (a series of) local processes.
If it cannot find the arangod executable it should try to start the servers in a docker container.
This way the time from nothing to a running cluster becomes shorter.
I ran the command below to start an ArangoDB cluster:
arangodb --starter.data-dir=/home1/irteam/apps/arangodb/
In this case, however, the arangod.conf files in coordinator8529, dbserver8530 and agent8531 contain 'authentication = false'.
Is there an option or another way to start an ArangoDB cluster with authentication enabled?
[server]
authentication = false
endpoint = tcp://[::]:8529
threads = 16
[log]
level = INFO
[javascript]
v8-contexts = 4
Scenario:
arangodb --starter.port=8528
arangodb --starter.port=7000 --starter.join=machineA:8528
arangodb --starter.port=3000 --starter.join=machineA:8528
Keep every other behavior the same.
Hi,
I used arangodb with Docker and noticed that the containers do not restart by default. I updated them using docker update --restart=always {mycontainersids}, but could you explain why a restart strategy is not set by default?
I want to cluster hosts A (master), B and C, each on a different machine.
To run the cluster on the 3 machines as daemons, I executed the 3 commands below.
On host A
nohup arangodb --starter.data-dir=/mypath/arangodb/ &
On host B
nohup arangodb --starter.join A(ip address for A) --starter.data-dir=/mypath/arangodb/ &
On host C
nohup arangodb --starter.join A(ip address for A) --starter.data-dir=/mypath/arangodb/ &
I got 3 coordinators (port 8529), 3 DB servers (port 8530) and 3 agents (port 8531), and the cluster was set up successfully.
I use the Java client for basic CRUD against the cluster:
ArangoDB arangoDB = new ArangoDB.Builder().host(hostA, PORT).build();
hostA: host of A (master), PORT: 8529
I have a question about this.
If host A (master) is dead, any request to host A is blocked and the connection is refused.
Is there any way to redirect requests automatically to another live node in the cluster (B or C in this situation) when host A is dead?
I also found that 4 files/directories (agent8531, coordinator8529, dbserver8530, setup.json) are created in the current path when I execute the clustering commands (e.g. arangodb --server.join A).
How can I define the path of those files/directories explicitly?
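One client-side way to approach the failover question above is to try each coordinator in turn and use the first one that accepts a connection. The following is only a minimal sketch in Python, not the Java driver used above; the function name and the endpoint list are hypothetical, and it checks plain TCP reachability only, not whether the coordinator is actually healthy.

```python
import socket
from typing import Iterable, Optional, Tuple

Endpoint = Tuple[str, int]


def first_reachable(endpoints: Iterable[Endpoint],
                    timeout: float = 1.0) -> Optional[Endpoint]:
    """Return the first (host, port) that accepts a TCP connection, else None."""
    for host, port in endpoints:
        try:
            # create_connection raises OSError on refusal, timeout or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            continue  # this coordinator is down, try the next one
    return None
```

With the three hosts above, a client could call first_reachable([(hostA, 8529), (hostB, 8529), (hostC, 8529)]) before building its connection, and repeat the call whenever a request fails.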
Tests give unexpected (and false) random timeouts
Right now running a single instance and then trying to start a local cluster will screw up the cluster.
We need a procedure to run a cluster with authentication, possibly by documenting how the user has to edit the configuration files. I will investigate how to do this in the simplest way.
Our last log message is: 2017/04/13 14:50:30 coordinator up and running.
- Please add information here on how the user can then connect to the coordinator with a browser or client, e.g.
2017/04/13 14:50:30 coordinator up and running. You can connect to it with your browser on http://...:8530/ or using arangosh --server.endpoint http+tcp://...
To reduce the possibility of user errors, each window should print 3 blank lines, followed by this message.
"make install" for arangodb placed the arangod binary in /usr/local/sbin. arangodb does not seem to be aware:
2017/10/22 12:46:58 Starting arangodb version 0.10.0, build 71b7af9
2017/10/22 12:46:58 Cannot find arangod (expected at /usr/sbin/arangod).
2017/10/22 12:46:58 Please install ArangoDB locally or run the ArangoDB starter in docker (see README for details).
My expectation is that arangodb should check /usr/local/sbin too.
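The lookup being asked for could be as simple as walking a list of candidate directories. A minimal sketch in Python (the starter itself is not written in Python); the function name is hypothetical and the directory list contains only the two locations mentioned in this report:

```python
import os
from typing import Iterable, Optional

# Only these two locations come from the report above; a real list
# would probably be longer and platform dependent.
CANDIDATE_DIRS = ("/usr/sbin", "/usr/local/sbin")


def find_arangod(dirs: Iterable[str] = CANDIDATE_DIRS,
                 name: str = "arangod") -> Optional[str]:
    """Return the first executable file called `name` found in `dirs`, else None."""
    for directory in dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

Returning None rather than raising leaves the caller free to fall back to docker or print the existing "Cannot find arangod" message.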
I have no docker installed on my computer, and I cannot launch the new version after make succeeds. The new version looks better than before, but I was not able to launch the cluster, whether the agencySize is 3 or 5.
I tried to reinstall arangodb3 and rebuild, but no luck. Please see the following information:
A :
2017/02/11 12:37:51 Starting arangodb version dev, build dev
2017/02/11 12:37:51 Relaunching service with id '08c244af' on :4000...
2017/02/11 12:37:51 Starting agent on port 4001
2017/02/11 12:37:51 Listening on 0.0.0.0:4000 (:4000)
2017/02/11 12:37:52 Starting dbserver on port 4003
2017/02/11 12:37:53 Starting coordinator on port 4002
2017/02/11 12:37:54 agent up and running.
2017/02/11 12:40:22 dbserver not ready after 5min!
2017/02/11 12:40:23 coordinator not ready after 5min!
B:
2017/02/11 12:38:52 Starting arangodb version dev, build dev
2017/02/11 12:38:52 Relaunching service with id '3f43897c' on :4000...
2017/02/11 12:38:52 Starting agent on port 4006
2017/02/11 12:38:52 Listening on 0.0.0.0:4005 (:4005)
2017/02/11 12:38:53 Starting dbserver on port 4008
2017/02/11 12:38:54 Starting coordinator on port 4007
2017/02/11 12:41:22 agent not ready after 5min!
2017/02/11 12:41:23 dbserver not ready after 5min!
2017/02/11 12:41:24 coordinator not ready after 5min!
C:
2017/02/11 12:39:02 Starting arangodb version dev, build dev
2017/02/11 12:39:02 Relaunching service with id '4fe2b113' on :4000...
2017/02/11 12:39:02 Listening on 0.0.0.0:4010 (:4010)
2017/02/11 12:39:02 Starting agent on port 4011
2017/02/11 12:39:03 Starting dbserver on port 4013
2017/02/11 12:39:04 Starting coordinator on port 4012
2017/02/11 12:41:32 agent not ready after 5min!
2017/02/11 12:41:33 dbserver not ready after 5min!
2017/02/11 12:41:34 coordinator not ready after 5min!
Stopping the starter with Control-C in a cmd.exe leads to a proper shutdown. But killing the starter in the task manager seems to terminate it immediately and the subprocesses arangod.exe keep running. Maybe we can handle further signals or events on Windows.
If the starter crashed (e.g. with kill -9 on Linux) and is restarted, it gets into an endless loop because the subprocesses are still running. We should probably detect at startup that there are already arangod processes running on the planned ports, check whether they are alive, and adopt them in that case. Furthermore, if startup of the subprocesses fails very quickly a certain number of times, the starter should give up or at least use an exponential backoff strategy.
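The backoff part of this proposal can be sketched as a generator of wait times. This is an illustration in Python only; the function name and all default values are assumptions, not anything the starter currently implements.

```python
from typing import Iterator


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, max_attempts: int = 5) -> Iterator[float]:
    """Yield one wait time (seconds) per restart attempt, then stop.

    Each delay is `factor` times the previous one, limited to `cap`.
    Exhausting the generator means "give up".
    """
    delay = base
    for _ in range(max_attempts):
        yield min(delay, cap)
        delay *= factor
```

A supervisor loop would sleep for each yielded delay before restarting a subprocess that terminated quickly, and stop restarting once the generator is exhausted.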
Currently, when the starter gets a new feature (e.g. the recent SSL addition) that results in additional/alternate settings in the server config files, you have 2 options:
I propose to add an option/procedure to reconfigure an existing cluster, without the need for manual config file editing.
We add a trigger to the starter, upon which the starter:
The trigger can be any of:
POST /reconfigure
--reconfigure
When a starter has reconfigured its servers, there may be a short time before all other starters in the cluster have done their reconfiguration (something the user would have to do manually). It is possible that servers cannot reach each other in that period. This is acceptable as long as nothing is permanently broken.
Trying to set up a Docker cluster in Marathon
While starting the first docker container I am getting the following error
2017/10/05 18:34:14 Starting arangodb version 0.9.3, build c185373
2017/10/05 18:34:14 Cannot find docker container name. Please specify using --dockerContainer=...
Requirements:
.git folder
Should:
The starter currently exposes an API over HTTP.
It should be possible to expose this API over HTTPS.
Support starting from a configuration file, which would be more convenient for operations.
ssl.keyfile
ssl.cafile
server.rr
server.arangod
auth.jwt-secret
either document or translate relative paths to absolute
(tested on macos and ubuntu)
I have installed arangodb locally and then tried to use --docker.image neunhoef/arangodb:3.3.candidate to make it use a docker image for the ArangoDB cluster. This did not work: it does not use the docker image but rather tries to launch a local executable. Is this intended or could this be fixed?
The starter should have an option --ssl.cafile followed by a file name. If this is given, it should make sure that all arangod instances have access to the file (for Docker we might copy it into the directory of the instance or mount it into the Docker container) and get the option --ssl.cafile on their command line. Finally, all endpoint arguments on all command lines need to be changed from tcp://... to ssl://... .
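The endpoint rewrite described in the last step might look like the sketch below. It is written in Python for illustration, the function name is hypothetical, and it deliberately leaves anything that does not start with tcp:// (for example the http+tcp:// form seen in some logs) untouched, since the request above only mentions tcp://.

```python
def to_ssl_endpoint(endpoint: str) -> str:
    """Rewrite a tcp:// endpoint to ssl://, leaving other schemes unchanged."""
    prefix = "tcp://"
    if endpoint.startswith(prefix):
        return "ssl://" + endpoint[len(prefix):]
    return endpoint
```

For example, to_ssl_endpoint("tcp://[::]:8529") gives "ssl://[::]:8529", while an endpoint that is already ssl:// is returned as-is.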
To ease user access in conjunction with the documentation, the ArangoDB starter should try to make the coordinator bind a "standard" port.
Single servers usually bind 8529, clusters 8530.
To be better in line with documentation & examples it should try to bind the coordinator to one of these ports.
Before arangod will fail because it cannot bind a port.
@neunhoef Please confirm this is safe in 3.1 & 3.2
Values mmfiles (default) or rocksdb.
Only when value rocksdb is given does this need to go into the config of every server.
(Setting this field always will break 3.1 servers.)
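The rule above (emit the setting only for rocksdb) can be sketched as follows. This is a Python illustration with a hypothetical function name; the config key storage-engine is an assumption based on the arangod option --server.storage-engine and should be checked against the server version in use.

```python
from typing import List

VALID_ENGINES = ("mmfiles", "rocksdb")


def storage_engine_config_lines(engine: str = "mmfiles") -> List[str]:
    """Return the config lines to add to a server's [server] section.

    Nothing is emitted for the default engine, so that servers which do
    not know the option (3.1) never see it.
    """
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown storage engine: {engine!r}")
    if engine == "rocksdb":
        # Assumed key name, derived from --server.storage-engine.
        return ["storage-engine = rocksdb"]
    return []
```

Treating mmfiles as "write nothing" rather than "write the default explicitly" is what keeps the generated config valid for 3.1 servers.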
A customer wanted to use the starter on Windows (irrelevant here) to setup an ArangoDB cluster on multiple machines. They would prefer it if one could make it so that the same ports are used for starter, agent, coordinator (in particular) and dbserver on all machines. Maybe we can detect this automatically, and if this does not work reliably, there should at least be an option in the starter to override the automatic increment of the portOffset and force it to be zero.
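To make the request concrete: a starter log further down in this listing shows the starter on 8528 with coordinator 8529, dbserver 8530 and agent 8531, which suggests fixed per-role offsets from the starter port. The sketch below is written in Python and assumes exactly that layout; the names ROLE_OFFSETS and ports_for are hypothetical. Forcing port_offset to zero then gives identical ports on every machine.

```python
from typing import Dict

# Offsets inferred from one log in this listing (starter 8528 ->
# coordinator 8529, dbserver 8530, agent 8531); other starter versions
# may use a different layout.
ROLE_OFFSETS = {"coordinator": 1, "dbserver": 2, "agent": 3}


def ports_for(starter_port: int, port_offset: int = 0) -> Dict[str, int]:
    """Compute the port of the starter and of each server role."""
    base = starter_port + port_offset
    ports = {"starter": base}
    for role, offset in ROLE_OFFSETS.items():
        ports[role] = base + offset
    return ports
```

Calling ports_for(8528) on every machine yields the same four ports, which is the behaviour the customer asked for.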
The starter waits until three machines have registered (including the master) and then starts all processes in quick succession. This creates a race for the names "DBServer0001" and "Coordinator0001". It would be better for users if the starter would delay the startup on machines 2 and 3 slightly, such that the numbering would always be consistent. This is only necessary on the first startup, restarts are fine since the names are taken then.
Is it possible to run the script and launch the servers as a daemon?
If I close my terminal, it will kill my script and my servers, right? Or will it keep the servers running?
When I start a cluster in docker as described in the README, it constantly uses ~100% of all the CPU cores assigned to docker. The CPU usage starts to rise as soon as the coordinator starts (version 3.2.0). The only difference from the described setup is that I use host directories to mount the volumes.
Compared to the cluster version, the single-node ArangoDB server from DockerHub uses hardly any CPU (when it is idle).
All 3 nodes run on the same host and their console log looks "OK" to me.
2017/07/29 11:09:42 Starting arangodb version 0.8.1, build f78ddde
2017/07/29 11:09:42 Relaunching service with id '4343101e' on 192.168.1.25:8528...
2017/07/29 11:09:42 Listening on 0.0.0.0:8528 (192.168.1.25:8528)
2017/07/29 11:09:42 Looking for a running instance of agent on port 8531
2017/07/29 11:09:42 Starting agent on port 8531
2017/07/29 11:09:43 Looking for a running instance of dbserver on port 8530
2017/07/29 11:09:43 Starting dbserver on port 8530
2017/07/29 11:09:44 Looking for a running instance of coordinator on port 8529
2017/07/29 11:09:44 Starting coordinator on port 8529
2017/07/29 11:09:46 agent up and running (version 3.2.0).
2017/07/29 11:09:51 dbserver up and running (version 3.2.0).
2017/07/29 11:09:59 coordinator up and running (version 3.2.0).
2017/07/29 11:09:59 Your cluster can now be accessed with a browser at `http://192.168.1.25:8529` or
2017/07/29 11:09:59 using `arangosh --server.endpoint tcp://192.168.1.25:8529`.
Is this the "normal" and "expected" behaviour?
Running the starter with only IPv6 addresses seems to cause issues.
(Missing [...] around address)
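The missing brackets can be handled when the endpoint string is built. A minimal sketch in Python using the standard ipaddress module; the function name and the default scheme are assumptions made for illustration.

```python
import ipaddress


def format_endpoint(host: str, port: int, scheme: str = "tcp") -> str:
    """Build scheme://host:port, wrapping literal IPv6 addresses in [...]."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        # Not an IP literal (a hostname, or an address that is already
        # bracketed), so it is used unchanged.
        is_ipv6 = False
    if is_ipv6:
        host = f"[{host}]"
    return f"{scheme}://{host}:{port}"
```

IPv4 addresses and hostnames pass through unchanged, so the same helper can be used for every endpoint.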
See also #2
Properly expand home directory in paths.
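As an illustration of what "properly expand" could mean, the sketch below expands a leading ~ and then makes the result absolute. It is Python, the function name is hypothetical, and resolving relative paths against the current working directory is an assumption about the desired behaviour.

```python
import os


def normalize_path(path: str) -> str:
    """Expand a leading ~ and return an absolute, normalized path."""
    return os.path.abspath(os.path.expanduser(path))
```

A value such as ~/arangodb-data given for a data directory option would then end up under the user's home directory instead of in a directory literally named "~".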
If one combines the options --cluster.agency-size 1 and --starter.local one gets a cluster with one coordinator, one dbserver and one agent. I would have expected to get a cluster with 3 coordinators, 3 dbservers and one agent. If I want the former, I can simply use --cluster.agency-size 1 alone.
So is this a fork of arangodb or a specialized version of arangodb or a helper? It has the same name as arangodb, namely arangodb, so I am confused...
Investigation needed: At some stage I got this message on shutdown. I could not easily find out where it was coming from. The servers seem to have been shut down, I do not know whether soft or hard.
If the user runs, e.g.:
arangodb --local
that should give them a cluster with no extra work.
The final log output during the startup should make clear that this cluster doesn't offer any safety and is for demo purposes only.
When I try to cluster 3 nodes, clustering fails.
I execute
arangodb --starter.data-dir=/home1/irteam/apps/arangodb/
arangodb --starter.join myIPadress --starter.data-dir=/home1/irteam/apps/arangodb/
arangodb --starter.join myIPadress --starter.data-dir=/home1/irteam/apps/arangodb/
and it shows failure logs such as
[6332] INFO {agencycomm} Flaky agency communication to http+tcp://myIPadress:8531. Unsuccessful consecutive tries: 42 (56.72s). Network checks advised.
I think the reason is that I upgraded the arangodb version and am still using the 'old' (used with 3.1) starter.data-dir path /home1/irteam/apps/arangodb/.
When I set a new '--starter.data-dir' path, it works well!
What is the problem here, and how can I use the 'old' (3.1) data dir with the new arangodb version (3.2)?
P.S.
I upgraded arangodb from version 3.1 to 3.2 by following the doc below:
https://docs.arangodb.com/3.2/Manual/Administration/Upgrading/
I set up an ArangoDB cluster on DC/OS with the arangodb-mesos-framework. Is that supported? If so, how do I set it up?
I have run 5 instances:
Machine A: ./arangodb --dataDir /home/arango/data/ --agencySize 5
Machine B: ./arangodb --join 192.168.130.201 --dataDir /home/arango/data/
Machine C: ./arangodb --join 192.168.130.201 --dataDir /home/arango/data/
Machine C: ./arangodb --startDBserver false --startCoordinator false --join 192.168.130.201 --dataDir /home/arango/data1/
Machine C: ./arangodb --startDBserver false --startCoordinator false --join 192.168.130.201 --dataDir /home/arango/data2/
I saw that 5 coordinators and DBServers are serving
Please alter:
https://github.com/arangodb/arangodb/blob/devel/utils/generateExitCodesFiles.py
to also generate a Go file.
Use this Go file to get more information about the exit codes of the arangodb process.
Files involved in this script so far are:
lib/Basics/exitcodes.dat
lib/Basics/application-exit.h
lib/Basics/exitcodes.cpp
lib/Basics/application-exit.cpp
lib/Basics/exitcodes.h
Installation/Windows/Plugins/exitcodes.nsh
Use log messages like:
"Serving as a Starter master"
"Serving as a Starter slave"
"Just became master of the Starter Cluster"
"Tried to become master of the Starter Cluster but failed: Status 412"
To make the distinction between ArangoDB & ArangoStarter as clear as possible
In order to cluster 3 hosts
On host A
arangodb
and on host B
arangodb --starter.join {ip address of host A }
and on host C
arangodb --starter.join {ip address of host A }
But I cannot cluster the 3 hosts and get the error log below.
2017/07/10 15:56:47 ## Start of agent log
2017-07-10T06:56:45Z [21730] WARNING {communication} failed to open endpoint 'http+tcp://[::]:8531' with error: open: Address family not supported by protocol
2017-07-10T06:56:45Z [21730] FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
2017-07-10T06:56:45Z [21780] INFO ArangoDB 3.1.24 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1e-fips 11 Feb 2013
2017-07-10T06:56:45Z [21780] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-07-10T06:56:45Z [21780] INFO Starting up with role AGENT
2017-07-10T06:56:45Z [21780] INFO file-descriptors (nofiles) hard limit is 81920, soft limit is 81920
2017-07-10T06:56:45Z [21780] INFO JavaScript using startup '/usr/share/arangodb3/js', application '/home1/irteam/apps/arangodb/agent8531/apps'
2017-07-10T06:56:46Z [21780] INFO using endpoint 'http+tcp://[::]:8531' for non-encrypted requests
2017-07-10T06:56:46Z [21780] INFO {agency} 26ed22b8-d014-4e12-a6fd-169dd93603fb rebuilt key-value stores - serving.
2017-07-10T06:56:46Z [21780] WARNING {communication} failed to open endpoint 'http+tcp://[::]:8531' with error: open: Address family not supported by protocol
2017-07-10T06:56:46Z [21780] FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
2017-07-10T06:56:46Z [21830] INFO ArangoDB 3.1.24 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1e-fips 11 Feb 2013
2017-07-10T06:56:46Z [21830] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-07-10T06:56:46Z [21830] INFO Starting up with role AGENT
2017-07-10T06:56:46Z [21830] INFO file-descriptors (nofiles) hard limit is 81920, soft limit is 81920
2017-07-10T06:56:46Z [21830] INFO JavaScript using startup '/usr/share/arangodb3/js', application '/home1/irteam/apps/arangodb/agent8531/apps'
2017-07-10T06:56:46Z [21830] INFO using endpoint 'http+tcp://[::]:8531' for non-encrypted requests
2017-07-10T06:56:46Z [21830] INFO {agency} 26ed22b8-d014-4e12-a6fd-169dd93603fb rebuilt key-value stores - serving.
2017-07-10T06:56:46Z [21830] WARNING {communication} failed to open endpoint 'http+tcp://[::]:8531' with error: open: Address family not supported by protocol
2017-07-10T06:56:46Z [21830] FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
2017/07/10 15:56:47 ## End of agent log
2017/07/10 15:56:47 restarting agent
2017/07/10 15:56:47 Looking for a running instance of agent on port 8531
2017/07/10 15:56:47 Starting agent on port 8531
2017/07/10 15:56:47 agent has terminated, quickly, in 648.747928ms (recent failures: 100)
Hello,
I appear to be starting an arangodb cluster correctly with ArangoDBStarter, yet collections I create on one server are not available on the other two servers. Data should be available on all cluster members, right?
I just downloaded the compiled version and moved it to /usr/bin/arangodb (after a chmod +x).
When I start it, here are my logs:
$ arangodb
2017/05/15 10:06:37 Starting arangodb version 0.7.0, build 90aebe6
2017/05/15 10:06:37 Relaunching service with id '969a8efc' on :8528...
2017/05/15 10:06:37 Listening on 0.0.0.0:8528 (:8528)
2017/05/15 10:06:37 Looking for a running instance of agent on port 8531
2017/05/15 10:06:37 Starting agent on port 8531
2017/05/15 10:06:37 Error while starting agent: fork/exec /usr/sbin/arangod: no such file or directory
2017/05/15 10:06:38 Looking for a running instance of dbserver on port 8530
2017/05/15 10:06:38 Starting dbserver on port 8530
2017/05/15 10:06:38 Error while starting dbserver: fork/exec /usr/sbin/arangod: no such file or directory
2017/05/15 10:06:39 Looking for a running instance of coordinator on port 8529
2017/05/15 10:06:39 Starting coordinator on port 8529
2017/05/15 10:06:39 Error while starting coordinator: fork/exec /usr/sbin/arangod: no such file or directory
I use a fresh install of CentOS 7.
What's wrong?
Customer wanted to install ArangoDB under E:\arangodb. Obviously, the starter would not find the installation there. Overriding with --arangod and --jsDir worked. Maybe we should at least look into all drives?