arangodb-helper / arangodb

ArangoDB Starter - starts ArangoDB clusters & single servers with ease.

License: Apache License 2.0

Makefile 1.26% Go 98.20% Shell 0.49% Dockerfile 0.05%

arangodb's Issues

How to remove a node from the cluster?

Hi, we've been trialing this and are currently on version 0.5.0. We had an issue with the underlying data drive on one of the nodes, which forced us to take that node down. We had replication on all shards set to 2, which we thought would in theory allow us to provision a new node and rebalance the shards without data loss. The rebalance has not succeeded, and we assume this is because the node is still present on the web UI overview page, albeit in an errored state: both the DBServer and the coordinator for that node are listed as SHUTDOWN. Another problem is that when we provisioned the new node on the cluster, no new agent was provisioned, presumably because the old one was still 'registered'.

Our question is: how do we remove the erroneous node so that we can register an agent etc. from our new one and rebalance our shard replicas across to the new node?

Here is a dump of the arangod.log file from the agent on the lead node:

2017-08-08T11:15:27Z [2872] ERROR {cluster} cannot create connection to server '' at endpoint 'tcp://10.16.1.6:4001'
2017-08-08T11:15:28Z [2872] INFO {supervision} Precondition failed for starting job 12520001

Note that this is continually retrying and has CPU pegged at ~80%

Hoping for some guidance here should this issue arise in future. We intend to start from scratch on a new cluster set up.

Thanks

Fallback to docker image when local `arangod` cannot be found

By default the starter runs arangod as (a series of) local processes.
If it cannot find the arangod executable it should try to start the servers in a docker container.

This way the time from nothing to a running cluster becomes shorter.

Can I start an arangodb cluster with authentication enabled?

I used the command below to start the arangodb cluster:
arangodb --starter.data-dir=/home1/irteam/apps/arangodb/
In this case, however, the arangod.conf files in coordinator8529, dbserver8530 and agent8531 contain 'authentication = false'.

Is there any option or way to start the arangodb cluster with authentication enabled?

[server]
authentication = false
endpoint = tcp://[::]:8529
threads = 16
[log]
level = INFO
[javascript]
v8-contexts = 4

Allow for different master ports on different machines

Scenario:

Machine A: arangodb --starter.port=8528
Machine B: arangodb --starter.port=7000 --starter.join=machineA:8528
Machine C: arangodb --starter.port=3000 --starter.join=machineA:8528
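
The scenario above implies that the value given to --starter.join has to carry its own port, because the joining starter can no longer assume the master listens on the same port as itself. A minimal sketch of that parsing follows. It is written in Python purely for illustration (the starter itself is written in Go), the function name is hypothetical, and the default of 8528 is an assumption taken from the log lines elsewhere on this page.

```python
# Hypothetical helper, not the starter's actual code.
DEFAULT_STARTER_PORT = 8528  # assumed default, as seen in the logs on this page


def parse_join_address(value, default_port=DEFAULT_STARTER_PORT):
    """Split 'host' or 'host:port' into a (host, port) tuple.

    Unbracketed IPv6 literals are not handled in this sketch.
    """
    host, sep, port = value.rpartition(":")
    if not sep:
        # No colon at all: the whole value is the host name.
        return value, default_port
    return host, int(port)


if __name__ == "__main__":
    print(parse_join_address("machineA:8528"))
    print(parse_join_address("machineA"))
```

With this shape, machines B and C can listen on 7000 and 3000 while still joining the master on 8528.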

Add option to start single server

Keep every other behavior the same.

Can a Docker cluster be restarted?

Hi,

I used arangodb with Docker and noticed that the containers do not restart by default. I updated them using docker update --restart=always {mycontainersids}, but could you explain why a restart strategy is not set by default?

Question about ArangoDB cluster

I want to cluster hosts A (master), B and C, each on a different machine.

To run the three machines as daemons, I executed the three commands below.

On host A
nohup arangodb --starter.data-dir=/mypath/arangodb/ &

On host B
nohup arangodb --starter.join A(ip address of A) --starter.data-dir=/mypath/arangodb/ &

On host C
nohup arangodb --starter.join A(ip address of A) --starter.data-dir=/mypath/arangodb/ &

This gave me 3 coordinators (port 8529), 3 DB servers (port 8530) and 3 agents (port 8531), and the cluster formed successfully.

I use the Java client for basic CRUD against the cluster:
ArangoDB arangoDB = new ArangoDB.Builder().host(hostA, PORT).build();
hostA: host of A (master), PORT: 8529

I have a question about this.

If host A (master) is dead, any request to host A is blocked and the connection is refused.

Is there any way to redirect requests automatically to another live node in the cluster (B or C in this situation) when host A is dead?
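
One generic way to approach the question above is client-side failover: since every host runs a coordinator on port 8529, a client can keep a list of coordinators and use the first one that accepts a connection. The sketch below shows only that idea in Python. It is not the Java driver's API, and the function name and the injectable `connect` parameter are hypothetical.

```python
import socket


def first_reachable(endpoints, timeout=2.0, connect=socket.create_connection):
    """Return the first (host, port) that accepts a TCP connection, or None.

    `connect` is injectable so the selection logic can be exercised
    without a real network.
    """
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            # Connection refused, timed out or unresolvable: try the next one.
            continue
        conn.close()
        return host, port
    return None


if __name__ == "__main__":
    # Placeholder host names; replace with the addresses of hosts A, B and C.
    coordinators = [("hostA", 8529), ("hostB", 8529), ("hostC", 8529)]
    print(first_reachable(coordinators, timeout=0.5))
```

A TCP connect only proves the port is open, so a real client would still need to retry failed requests against the next coordinator.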

How to define the path of agent8531, coordinator8529, dbserver8530, setup.json files when clustering?

I found that four files/directories (agent8531, coordinator8529, dbserver8530, setup.json) are created in the current directory when I execute the clustering commands (e.g. arangodb --server.join A).

How can I define the path of those files/directories explicitly?

`make run-tests` is unstable

Tests give unexpected (and false) random timeouts

Check role of existing instance, not only that the port is open

Right now running a single instance and then trying to start a local cluster will screw up the cluster.

Switch on authentication

We need a procedure to run a cluster with authentication, possibly by documenting how the user has to edit the configuration files. I will investigate how to do this in the simplest way.

Show the user how to connect her browser

Our last log message is: 2017/04/13 14:50:30 coordinator up and running. Please add information here on how the user can then connect to the coordinator with her browser or clients, e.g.:

2017/04/13 14:50:30 coordinator up and running. You can connect it with your browser on http://...:8530/ or using arangosh --server.endpoint http+tcp://...

To reduce the possibility of user errors, each window should print 3 blank lines, followed by this message.

Check other places for arangod

"make install" for arangodb placed the arangod binary in /usr/local/sbin. arangodb does not seem to be aware:

2017/10/22 12:46:58 Starting arangodb version 0.10.0, build 71b7af9
2017/10/22 12:46:58 Cannot find arangod (expected at /usr/sbin/arangod).
2017/10/22 12:46:58 Please install ArangoDB locally or run the ArangoDB starter in docker (see README for details).

My expectation is that arangodb should check /usr/local/sbin too.
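
The lookup requested here amounts to trying a list of candidate locations in order instead of a single fixed path. A minimal sketch in Python follows. The two paths come from this issue, while the function name and the idea of a fixed candidate list are assumptions for illustration and not the starter's actual lookup code.

```python
import os

# Paths named in this issue; a real list would likely be longer.
CANDIDATE_PATHS = ["/usr/sbin/arangod", "/usr/local/sbin/arangod"]


def find_arangod(candidates=CANDIDATE_PATHS):
    """Return the first candidate that is an executable file, or None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


if __name__ == "__main__":
    found = find_arangod()
    print(found if found else "Cannot find arangod in any known location")
```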

Doesn't work again

I have no Docker installed on my computer, and I cannot launch the new version after make succeeds. The new version looks better than before, but I was not able to launch the cluster whether the agencySize is 3 or 5.

I tried to reinstall arangodb3 and rebuild, but no luck. Please see the following information:

A :
2017/02/11 12:37:51 Starting arangodb version dev, build dev
2017/02/11 12:37:51 Relaunching service with id '08c244af' on :4000...
2017/02/11 12:37:51 Starting agent on port 4001
2017/02/11 12:37:51 Listening on 0.0.0.0:4000 (:4000)
2017/02/11 12:37:52 Starting dbserver on port 4003
2017/02/11 12:37:53 Starting coordinator on port 4002
2017/02/11 12:37:54 agent up and running.
2017/02/11 12:40:22 dbserver not ready after 5min!
2017/02/11 12:40:23 coordinator not ready after 5min!



B:

2017/02/11 12:38:52 Starting arangodb version dev, build dev
2017/02/11 12:38:52 Relaunching service with id '3f43897c' on :4000...
2017/02/11 12:38:52 Starting agent on port 4006
2017/02/11 12:38:52 Listening on 0.0.0.0:4005 (:4005)
2017/02/11 12:38:53 Starting dbserver on port 4008
2017/02/11 12:38:54 Starting coordinator on port 4007
2017/02/11 12:41:22 agent not ready after 5min!
2017/02/11 12:41:23 dbserver not ready after 5min!
2017/02/11 12:41:24 coordinator not ready after 5min!

C: 

2017/02/11 12:39:02 Starting arangodb version dev, build dev
2017/02/11 12:39:02 Relaunching service with id '4fe2b113' on :4000...
2017/02/11 12:39:02 Listening on 0.0.0.0:4010 (:4010)
2017/02/11 12:39:02 Starting agent on port 4011
2017/02/11 12:39:03 Starting dbserver on port 4013
2017/02/11 12:39:04 Starting coordinator on port 4012
2017/02/11 12:41:32 agent not ready after 5min!
2017/02/11 12:41:33 dbserver not ready after 5min!
2017/02/11 12:41:34 coordinator not ready after 5min!

Task manager "stop task" in Windows seems to kill ArangoDBStarter

Stopping the starter with Control-C in a cmd.exe leads to a proper shutdown. But killing the starter in the task manager seems to terminate it immediately and the subprocesses arangod.exe keep running. Maybe we can handle further signals or events on Windows.

Starter should find already running subprocesses

If the starter crashed (e.g. after kill -9 on Linux) and is restarted, it gets into an endless loop because the subprocesses are still running. We should probably detect at startup that there are already arangod processes running on the planned ports, check whether they are alive and adopt them in this case. Furthermore, if startup of the subprocesses fails very quickly a certain number of times, the starter should give up or at least apply an exponential backoff strategy.
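
The exponential backoff mentioned above can be sketched as a bounded sequence of restart delays: each failed start doubles the wait up to a cap, and the caller gives up when the sequence is exhausted. The Python below is an illustration only, and all parameter names and default values are assumptions.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, max_attempts=8):
    """Yield the delay in seconds to wait before each restart attempt.

    The delay grows by `factor` per attempt and never exceeds `cap`.
    When the generator is exhausted the caller should give up.
    """
    delay = base
    for _ in range(max_attempts):
        yield min(delay, cap)
        delay *= factor


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(max_attempts=5), start=1):
        print(f"attempt {attempt}: wait {delay:.0f}s before restarting")
```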

Reconfigure cluster

Problem

Currently, when the starter gets a new feature (e.g. the recent SSL addition) that results in additional/alternate settings in the server config files, you have 2 options:

Start all over
Manually edit server config files.

I propose to add an option/procedure to reconfigure an existing cluster, without the need for manual config file editing.

Proposal

We add a trigger to starter, upon which, the starter:

stops its servers
re-generates config files for all servers
restarts the servers.

The trigger can be any of:

A signal (e.g. USR1, USR2)
An HTTP request (e.g. POST /reconfigure)
A command line option (--reconfigure).

Effect

When a starter has reconfigured its servers, there may be a short time before all other starters in the cluster have done their reconfiguration (something the user would have to do manually). It is possible that servers cannot reach each other in that period. This is acceptable as long as nothing is permanently broken.

Possible enhancements / extensions

Send a reconfigure trigger only to the master starter and let that broadcast it to all starters.
Have an additional trigger (or flag on the trigger) to restart the starter. This enables you to update the starter (binary) and then call this additional trigger to ensure that the starter is using the latest configuration features.

Free port detection fails when using `--starter.address=127.0.0.1` with `--starter.local`

Static docker image name fails with custom build for other architectures

See : neunhoef/ArangoDBStarter#13

Error while running Docker in Marathon

I am trying to set up a Docker cluster in Marathon.

While starting the first Docker container I get the following error:

2017/10/05 18:34:14 Starting arangodb version 0.9.3, build c185373
2017/10/05 18:34:14 Cannot find docker container name. Please specify using --dockerContainer=...

Allow for homebrew/other packing system build

Requirements:

Sources in directory A
All build artifacts & temp files in directory B
Sources directory does not need a .git folder
No docker requirement
Include info from VERSION file (excluded the git build)

Should:

Fail if go version is too old

SSL for own HTTP API

The starter currently exposes an API over HTTP.

It should be possible to expose this API over HTTPS.

Can I start arangodb based on a configuration file?

Starting from a configuration file would be more convenient for operations.

Option name change

ssl.keyfile
ssl.cafile
server.rr
server.arangod
auth.jwt-secret

sslKeyFile path must be absolute

Either document this or translate relative paths to absolute ones.

(tested on macos and ubuntu)
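
The second option, translating relative paths, could look like the sketch below: resolve the path against the starter's working directory before it is passed on to a server that may run from a different directory. This is a Python illustration under assumptions. The function name is hypothetical and the example paths are invented.

```python
import os


def absolute_keyfile(path, cwd=None):
    """Resolve a possibly relative key file path against `cwd`.

    Absolute paths pass through unchanged, because os.path.join
    discards `base` when `path` is already absolute.
    """
    base = cwd if cwd is not None else os.getcwd()
    return os.path.normpath(os.path.join(base, path))


if __name__ == "__main__":
    # Invented example path, resolved against the current directory.
    print(absolute_keyfile("certs/server.pem"))
```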

Local `arangodb` does not seem to be able to use docker image

I have installed arangodb locally and then tried to use --docker.image neunhoef/arangodb:3.3.candidate to make it use a docker image for the ArangoDB cluster. This did not work, it does not use the docker image but rather tries to launch a local executable. Is this intended or could this be fixed?

Add SSL support

The starter should have an option --ssl.cafile followed by a file name. If this is given, it should make sure that all arangod instances have access to the file (for Docker we might copy it into the directory of the instance or mount it into the Docker container) and get themselves the option --ssl.cafile on their command line. Finally, all endpoint arguments on all command lines need to be changed from tcp://... to ssl://....
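
The last step described above, switching every endpoint from tcp://... to ssl://..., is a plain scheme rewrite. A minimal Python sketch, with a hypothetical function name, might be:

```python
def to_ssl_endpoint(endpoint):
    """Rewrite a tcp:// endpoint to ssl://, leaving other schemes untouched."""
    prefix = "tcp://"
    if endpoint.startswith(prefix):
        return "ssl://" + endpoint[len(prefix):]
    return endpoint


if __name__ == "__main__":
    print(to_ssl_endpoint("tcp://[::]:8529"))
```

Only the scheme changes, so bracketed IPv6 hosts and port numbers are carried over as they are.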

strive to bind coordinator to a more standard port

To ease the user's access in conjunction with the documentation, arangodbstarter should try to bind the coordinator to a "standard" port.
Single servers usually bind 8529, clusters 8530.
To be better in line with documentation & examples it should try to bind the coordinator to one of these ports.

Check availability of ports

Check that the ports are free before starting arangod, so that it does not fail later because it cannot bind a port.
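
A common way to test whether a port is available is to try binding a socket to it and release it again immediately. The Python sketch below shows that technique. It is an assumption about how such a check could work rather than the starter's implementation, and it only covers IPv4.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permission error.
            return False
    return True


if __name__ == "__main__":
    # Ports taken from the log output elsewhere on this page.
    for port in (8529, 8530, 8531):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

Such a check is inherently racy, since another process can take the port between the probe and the real server start, so the server's own bind error still has to be handled.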

Remove thread & v8-context settings from generated arangod.conf

server.threads
javascript.v8-contexts

@neunhoef Please confirm this is safe in 3.1 & 3.2

Add `server.storage-engine` option

Values mmfiles (default) or rocksdb

Only when value rocksdb is given does this need to go into the config of every server.
(setting this field always, will break 3.1 servers)
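
The rule above, emitting the setting only for rocksdb so that 3.1 servers never see it, can be sketched as follows. This is a Python illustration with a hypothetical function name, and it renders the setting as a command-line argument although the issue speaks of the generated config.

```python
VALID_ENGINES = ("mmfiles", "rocksdb")


def storage_engine_args(engine="mmfiles"):
    """Return the extra server arguments needed for the chosen engine.

    Nothing is emitted for the default engine, so servers that do not
    know the option (3.1, per this issue) are never given it.
    """
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown storage engine: {engine!r}")
    if engine == "rocksdb":
        return ["--server.storage-engine=rocksdb"]
    return []


if __name__ == "__main__":
    print(storage_engine_args("rocksdb"))
    print(storage_engine_args())
```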

PortOffset 0 on multiple servers

A customer wanted to use the starter on Windows (irrelevant here) to setup an ArangoDB cluster on multiple machines. They would prefer it if one could make it so that the same ports are used for starter, agent, coordinator (in particular) and dbserver on all machines. Maybe we can detect this automatically, and if this does not work reliably, there should at least be an option in the starter to override the automatic increment of the portOffset and force it to be zero.

Delay in startup of the first three machines

The starter waits until three machines have registered (including the master) and then starts all processes in quick succession. This creates a race for the names "DBServer0001" and "Coordinator0001". It would be better for users if the starter would delay the startup on machines 2 and 3 slightly, such that the numbering would always be consistent. This is only necessary on the first startup, restarts are fine since the names are taken then.

Run in background (daemon mode)

Is it possible to run the script and launch the servers as a daemon?

If I close my terminal, it will kill my script and my servers, right? Or will it keep the servers running?

High CPU Usage

When I start a cluster in docker, like it is described in the README, it uses up all the CPU-Cores assigned to docker constantly with ~100%. The CPU-Usage starts to rise as soon as the coordinator starts (version 3.2.0). The only difference to the described setup is, that I use host directories to mount the volumes.

Compared to the cluster version, the single node arangodb server from DockerHub uses hardly any CPU (when it is idle).

All 3 nodes run on the same host and their console-log looks "OK" for me.

2017/07/29 11:09:42 Starting arangodb version 0.8.1, build f78ddde
2017/07/29 11:09:42 Relaunching service with id '4343101e' on 192.168.1.25:8528...
2017/07/29 11:09:42 Listening on 0.0.0.0:8528 (192.168.1.25:8528)
2017/07/29 11:09:42 Looking for a running instance of agent on port 8531
2017/07/29 11:09:42 Starting agent on port 8531
2017/07/29 11:09:43 Looking for a running instance of dbserver on port 8530
2017/07/29 11:09:43 Starting dbserver on port 8530
2017/07/29 11:09:44 Looking for a running instance of coordinator on port 8529
2017/07/29 11:09:44 Starting coordinator on port 8529
2017/07/29 11:09:46 agent up and running (version 3.2.0).
2017/07/29 11:09:51 dbserver up and running (version 3.2.0).
2017/07/29 11:09:59 coordinator up and running (version 3.2.0).
2017/07/29 11:09:59 Your cluster can now be accessed with a browser at `http://192.168.1.25:8529` or
2017/07/29 11:09:59 using `arangosh --server.endpoint tcp://192.168.1.25:8529`.

Is this the "normal" and "expected" behaviour?

Check & fix IPv6 behavior

Running the starter with only IPv6 addresses seems to cause issues.
(Missing [...] around address)
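
The missing brackets matter because an IPv6 literal contains colons itself, so "host:port" is ambiguous without them. A minimal Python sketch of a bracket-aware join follows. The function name is hypothetical and the check is a simple heuristic, not a full address parser.

```python
def host_port(host, port):
    """Join host and port, wrapping bare IPv6 literals in [ ]."""
    if ":" in host and not host.startswith("["):
        host = "[" + host + "]"
    return f"{host}:{port}"


if __name__ == "__main__":
    print("tcp://" + host_port("::1", 8529))
    print("tcp://" + host_port("192.168.1.25", 8529))
```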

Support `~` in directories (expand homedir)

Properly expand home directory in paths.
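
A program has to do this expansion itself whenever the shell does not, for example when the value is quoted or follows an equals sign on the command line. The Python sketch below shows the idea using the standard library. The function name is hypothetical and the example path is invented.

```python
import os


def expand_home(path):
    """Expand a leading ~ (or ~user) to the corresponding home directory."""
    return os.path.expanduser(path)


if __name__ == "__main__":
    # Invented example path.
    print(expand_home("~/arangodb-data"))
```

Paths without a leading ~ are returned unchanged.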

Agency size 1 and local mode

If one combines the options --cluster.agency-size 1 and --starter.local one gets a cluster with one coordinator, one dbserver and one agent. I would have expected to get a cluster with 3 coordinators, 3 dbservers and one agent. If I want the former, I can simply use --cluster.agency-size 1 alone.

arangodb vs. arangodb

So is this a fork of arangodb or a specialized version of arangodb or a helper? It has the same name as arangodb, namely arangodb so I am confused...

"shutdown on Windows not supported"

Investigation needed: At some stage I got this message on shutdown. I could not easily find out where it was coming from. The servers seem to have been shut down, I do not know whether soft or hard.

Add option to instantly start single host test cluster

If the user runs, for example:

arangodb --local

that should give him a cluster with no extra work.
The final log output during the startup should make clear that this cluster doesn't offer any safety and is for demo purposes only.

[6332] INFO {agencycomm} Flaky agency communication to http+tcp://myIPadress:8531. Unsuccessful consecutive tries: 42 (56.72s). Network checks advised.

When I cluster 3 nodes, clustering fails.

I execute
arangodb --starter.data-dir=/home1/irteam/apps/arangodb/
arangodb --starter.join myIPadress --starter.data-dir=/home1/irteam/apps/arangodb/
arangodb --starter.join myIPadress --starter.data-dir=/home1/irteam/apps/arangodb/
and it shows failure logs such as

[6332] INFO {agencycomm} Flaky agency communication to http+tcp://myIPadress:8531. Unsuccessful consecutive tries: 42 (56.72s). Network checks advised.

I think the reason is that I upgraded the arangodb version but kept using the 'old' (used with 3.1) starter.data-dir path /home1/irteam/apps/arangodb/

When I set a new '--starter.data-dir' path, it works well!

What is the problem here, and how can I use the 'old' (3.1) data dir with the new arangodb version (3.2)?

P.S.
I upgraded arangodb from 3.1 to 3.2 by following the doc below:
https://docs.arangodb.com/3.2/Manual/Administration/Upgrading/

Does it support cluster operation in DCOS?

I set up an arangodb cluster on DCOS with the arangodb-mesos-framework. Does it support that? If so, how do I set it up?

'--startCoordinator false --startDBserver false' does not work

I have run 5 instances:
Machine A: ./arangodb --dataDir /home/arango/data/ --agencySize 5
Machine B: ./arangodb --join 192.168.130.201 --dataDir /home/arango/data/
Machine C: ./arangodb --join 192.168.130.201 --dataDir /home/arango/data/
Machine C: ./arangodb --startDBserver false --startCoordinator false --join 192.168.130.201 --dataDir /home/arango/data1/
Machine C: ./arangodb --startDBserver false --startCoordinator false --join 192.168.130.201 --dataDir /home/arango/data2/

I see that 5 coordinators and DBservers are serving.

Support for the ArangoDB Exitcode mappings to messages

Please alter:
https://github.com/arangodb/arangodb/blob/devel/utils/generateExitCodesFiles.py
to generate a go file also.
Use this Go file to get more information about the exit codes of the arangodb process.

Files involved using this script so far are:

lib/Basics/exitcodes.dat
lib/Basics/application-exit.h
lib/Basics/exitcodes.cpp
lib/Basics/application-exit.cpp
lib/Basics/exitcodes.h
Installation/Windows/Plugins/exitcodes.nsh

Use less confusing vocabulary

Use log messages like:

"Serving as a Starter master"
"Serving as a Starter slave"
"Just became master of the Starter Cluster"
"Tried to become master of the Starter Cluster but failed: Status 412"

To make the distinction between ArangoDB & ArangoStarter as clear as possible

FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.

In order to cluster 3 hosts

On host A
arangodb

and on host B
arangodb --starter.join {ip address of host A }

and on host C
arangodb --starter.join {ip address of host A }

But I cannot cluster with 3 hosts and have error log below.

2017/07/10 15:56:47 ## Start of agent log
2017-07-10T06:56:45Z [21730] WARNING {communication} failed to open endpoint 'http+tcp://[::]:8531' with error: open: Address family not supported by protocol
2017-07-10T06:56:45Z [21730] FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
2017-07-10T06:56:45Z [21780] INFO ArangoDB 3.1.24 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1e-fips 11 Feb 2013
2017-07-10T06:56:45Z [21780] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-07-10T06:56:45Z [21780] INFO Starting up with role AGENT
2017-07-10T06:56:45Z [21780] INFO file-descriptors (nofiles) hard limit is 81920, soft limit is 81920
2017-07-10T06:56:45Z [21780] INFO JavaScript using startup '/usr/share/arangodb3/js', application '/home1/irteam/apps/arangodb/agent8531/apps'
2017-07-10T06:56:46Z [21780] INFO using endpoint 'http+tcp://[::]:8531' for non-encrypted requests
2017-07-10T06:56:46Z [21780] INFO {agency} 26ed22b8-d014-4e12-a6fd-169dd93603fb rebuilt key-value stores - serving.
2017-07-10T06:56:46Z [21780] WARNING {communication} failed to open endpoint 'http+tcp://[::]:8531' with error: open: Address family not supported by protocol
2017-07-10T06:56:46Z [21780] FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
2017-07-10T06:56:46Z [21830] INFO ArangoDB 3.1.24 [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1e-fips 11 Feb 2013
2017-07-10T06:56:46Z [21830] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2017-07-10T06:56:46Z [21830] INFO Starting up with role AGENT
2017-07-10T06:56:46Z [21830] INFO file-descriptors (nofiles) hard limit is 81920, soft limit is 81920
2017-07-10T06:56:46Z [21830] INFO JavaScript using startup '/usr/share/arangodb3/js', application '/home1/irteam/apps/arangodb/agent8531/apps'
2017-07-10T06:56:46Z [21830] INFO using endpoint 'http+tcp://[::]:8531' for non-encrypted requests
2017-07-10T06:56:46Z [21830] INFO {agency} 26ed22b8-d014-4e12-a6fd-169dd93603fb rebuilt key-value stores - serving.
2017-07-10T06:56:46Z [21830] WARNING {communication} failed to open endpoint 'http+tcp://[::]:8531' with error: open: Address family not supported by protocol
2017-07-10T06:56:46Z [21830] FATAL failed to bind to endpoint 'http+tcp://[::]:8531'. Please check whether another instance is already running using this endpoint and review your endpoints configuration.
2017/07/10 15:56:47 ## End of agent log
2017/07/10 15:56:47 restarting agent
2017/07/10 15:56:47 Looking for a running instance of agent on port 8531
2017/07/10 15:56:47 Starting agent on port 8531
2017/07/10 15:56:47 agent has terminated, quickly, in 648.747928ms (recent failures: 100)

Data not shared in cluster?

Hello,
I appear to be starting an arangodb cluster correctly with ArangoDBStarter, yet collections I create on one server are not available on the other two servers. Data should be available on all cluster members, right?

Error while starting coordinator: fork/exec /usr/sbin/arangod: no such file or directory

I just downloaded the compiled version and moved it to /usr/bin/arangodb (after a chmod +x).
When I start it, here are my logs:
$ arangodb

2017/05/15 10:06:37 Starting arangodb version 0.7.0, build 90aebe6
2017/05/15 10:06:37 Relaunching service with id '969a8efc' on :8528...
2017/05/15 10:06:37 Listening on 0.0.0.0:8528 (:8528)
2017/05/15 10:06:37 Looking for a running instance of agent on port 8531
2017/05/15 10:06:37 Starting agent on port 8531
2017/05/15 10:06:37 Error while starting agent: fork/exec /usr/sbin/arangod: no such file or directory
2017/05/15 10:06:38 Looking for a running instance of dbserver on port 8530
2017/05/15 10:06:38 Starting dbserver on port 8530
2017/05/15 10:06:38 Error while starting dbserver: fork/exec /usr/sbin/arangod: no such file or directory
2017/05/15 10:06:39 Looking for a running instance of coordinator on port 8529
2017/05/15 10:06:39 Starting coordinator on port 8529
2017/05/15 10:06:39 Error while starting coordinator: fork/exec /usr/sbin/arangod: no such file or directory

I use a fresh install of CentOS 7.
What's wrong?

Search ArangoDB installations on all drives in Windows

Customer wanted to install ArangoDB under E:\arangodb. Obviously, the starter would not find the installation there. Override with --arangod and --jsDir worked. Maybe we should at least look into all drives?
