10gen / mongo-orchestration
License: Apache License 2.0
depends on #31
Preset replica set configuration with two data members and one arbiter is needed for testing read preferences SECONDARY and SECONDARY_PREFERRED
If we don't then all the MCI hosts / Jenkins hosts will need deploy keys or we will need to modify the base images (all of which are going to waste a fair amount of time for people)
We should document somewhere that the JSON parser is strict (if we don't already). Things like requiring double quotes were a gotcha for me recently. If we already document this, feel free to close as a duplicate.
Since this will likely be deployed on systems like RHEL/CentOS 6.x which ship with python 2.6 we should probably support that version.
One problem that comes to mind is using argparse for command line argument parsing. We could either require the argparse package from pypi if the python version is 2.6, or just replace argparse with getopt.
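A minimal sketch of the getopt route, in case it helps the discussion. The option names here (`--port`, `-p`, `--no-fork`) and the positional command are assumptions for illustration, not necessarily the exact set server.py accepts.

```python
import getopt


def parse_cli(argv):
    """Parse a small option set without argparse (absent from the 2.6 stdlib)."""
    options = {"command": None, "port": 8889, "no_fork": False}
    # gnu_getopt lets options and positional arguments be intermixed,
    # so "start --no-fork" and "--no-fork start" both work.
    opts, positional = getopt.gnu_getopt(argv, "p:", ["port=", "no-fork"])
    for flag, value in opts:
        if flag in ("-p", "--port"):
            options["port"] = int(value)
        elif flag == "--no-fork":
            options["no_fork"] = True
    if positional:
        options["command"] = positional[0]
    return options
```

The trade-off is that getopt gives no generated help text or type checking, so requiring the argparse package from PyPI on 2.6 may still be the smaller change.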
Traceback (most recent call last):
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/apps/rs.py", line 32, in wrap
return f(*arg, **kwd)
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/apps/rs.py", line 57, in rs_create
rs_id = RS().create(data)
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/lib/rs.py", line 452, in create
repl = ReplicaSet(rs_params)
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/lib/rs.py", line 48, in __init__
if not self.repl_init(config):
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/lib/rs.py", line 112, in repl_init
return self.waiting_config_state()
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/lib/rs.py", line 388, in waiting_config_state
while not self.check_config_state():
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/lib/rs.py", line 397, in check_config_state
if len(filter(lambda item: item['state'] in (3, 4, 5, 6, 9), self.run_command("rs.status()", is_eval=True)['members'])) > 0:
File "/mnt/jenkins/workspace/mongo-java-driver-test-3.0.x/jdk/HotSpot6/label/linux64/mongodb_configuration/replica_set/mongodb_option/auth/mongodb_server/27-nightly/mongodb_ssl/nossl/mongo-orchestration/lib/rs.py", line 171, in run_command
result = getattr(self.connection(hostname=hostname).admin, mode)(command, arg)
File "build/bdist.linux-x86_64/egg/pymongo/database.py", line 962, in eval
args=args)
File "build/bdist.linux-x86_64/egg/pymongo/database.py", line 445, in command
uuid_subtype, compile_re, **kwargs)[0]
File "build/bdist.linux-x86_64/egg/pymongo/database.py", line 351, in _command
msg, allowable_errors)
File "build/bdist.linux-x86_64/egg/pymongo/helpers.py", line 178, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
OperationFailure: command SON([('$eval', Code('rs.status()', {})), ('args', (None,))]) failed: not authorized on admin to execute command { $eval: rs.status(), args: [ null ] }
Given PUT /hosts/standalone with the configuration in the request content, followed by POST /hosts/standalone with request content {action: 'start'}, the registry entry (and with it the uniqueness) of server id "standalone" is lost. Subsequent GET /hosts/standalone requests fail to find the previous instance, and a second PUT /hosts/standalone starts up a duplicate server.
This isn't RESTful:
PUT /hosts/host-id/start
PUT /hosts/host-id/stop
PUT /hosts/host-id/restart
Why not encode the action in the PUT payload?
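A rough sketch of what payload-encoded actions could look like on the server side. `apply_action` and `FakeServer` are hypothetical names made up for this example; the real host objects will differ.

```python
def apply_action(server, payload):
    """Dispatch the action named in a request payload to a server object."""
    actions = {
        "start": server.start,
        "stop": server.stop,
        "restart": server.restart,
    }
    name = payload.get("action")
    if name not in actions:
        # Unknown or missing action: report a client error instead of raising.
        return 400, {"error": "unknown action: %r" % (name,)}
    actions[name]()
    return 200, {"action": name}


class FakeServer(object):
    """Stand-in for a managed mongod process, used only for illustration."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def restart(self):
        self.stop()
        self.start()
```

With this shape a single route per resource is enough, and adding a new action means adding one dictionary entry rather than a new URI.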
Right now if I spin up a huge cluster on a virtualized node with a slow disk, then my request blocks harder than Lego.
All requests that involve one or more subprocesses should return immediately and require the client to poll for status / state.
Without this it will be hard to test complex configurations in the cloud due to timeouts.
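One possible shape for the return-immediately-then-poll flow, sketched with plain threads. `JobRegistry` and its state names ("running", "done", "failed") are assumptions for illustration and not part of the current code base.

```python
import threading
import uuid


class JobRegistry(object):
    """Run slow operations in background threads and expose their state."""

    def __init__(self):
        self._lock = threading.Lock()
        self._jobs = {}

    def submit(self, func, *args):
        """Start func in a thread and return a job id immediately."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {"state": "running", "result": None}
        thread = threading.Thread(target=self._run, args=(job_id, func, args))
        thread.daemon = True
        thread.start()
        return job_id

    def _run(self, job_id, func, args):
        try:
            outcome = {"state": "done", "result": func(*args)}
        except Exception as exc:
            outcome = {"state": "failed", "result": str(exc)}
        with self._lock:
            self._jobs[job_id] = outcome

    def status(self, job_id):
        """Return a copy of the job record, or None for an unknown id."""
        with self._lock:
            job = self._jobs.get(job_id)
            return dict(job) if job else None
```

A create request would call `submit` and hand the job id back to the client, which then polls a status resource until the state is no longer "running".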
The "uri" key for resources representing mongodb hosts/clusters has a value of the form host:port.
{
"procInfo": {
"pid": 12734,
"optfile": "/tmp/mongo-FpbGGy",
"params": {
"smallfiles": true,
"oplogSize": 10,
"port": 1025,
"noprealloc": true,
"dbpath": "/tmp/mongo-7O_mON"
},
"name": "mongod",
"alive": true
},
"uri": "localhost:1025",
"serverInfo": {
...
},
"orchestration": "hosts",
"id": "2bc465a2-4edd-4b39-aa37-d53cb836a02a",
"statuses": {
"locked": false,
"primary": true,
"mongos": false
}
}
Maybe we should add a "mongodb://" so it is actually a mongodb uri?
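For what it's worth, the change could be as small as the sketch below. `to_mongodb_uri` is a hypothetical helper name; it only adds the scheme and does not validate the host list.

```python
def to_mongodb_uri(host_port):
    """Prefix a bare host:port (or seed list) with the mongodb:// scheme."""
    if host_port.startswith("mongodb://"):
        return host_port
    return "mongodb://" + host_port
```

For the response above, `to_mongodb_uri("localhost:1025")` would give `mongodb://localhost:1025`.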
This is my mongo-orchestration config:
{
"releases": {
"default": "/home/tbrock/Code/mongo/mongod"
},
"last_updated": "2012-08-28 17:45:00.000000"
}
I'm running:
python server.py start
Then in another terminal:
./# configurations/hosts/clean.json
./# configurations/hosts/clean.json
DBPATH:
LOGPATH:/home/tbrock/tmp/orchestration
Posting a request from configurations/hosts/clean.json to http://localhost:8889...
["Traceback (most recent call last):\n", " File \"/home/tbrock/Code/mongo-orchestration/apps/hosts.py\", line 34, in wrap\n return f(*arg, **kwd)\n", " File \"/home/tbrock/Code/mongo-orchestration/apps/hosts.py\", line 74, in host_create\n data.get('id', None))\n", " File \"/home/tbrock/Code/mongo-orchestration/lib/hosts.py\", line 267, in create\n raise OSError\n", "OSError\n"]
HTTP/1.0 200 OK
Date: Fri, 08 Aug 2014 15:29:56 GMT
Server: WSGIServer/0.1 Python/2.7.8
Content-Length: 2
Content-Type: application/json
[]
At present we have to set up and tear down a cluster for each test, and this overhead is a minimum of 25 seconds for each replica set test. We need a reset action that brings a cluster back to a working state with minimal overhead, i.e., start only nodes that are not running, ensure replica set state if the cluster is a replica set, etc. For the Ruby driver, this is embodied by the following.
ClusterManager#start
https://github.com/mongodb/mongo-ruby-driver/blob/1.x-stable/test/tools/mongo_config.rb#L643
ClusterManager#repl_set_startup
https://github.com/mongodb/mongo-ruby-driver/blob/1.x-stable/test/tools/mongo_config.rb#L438
Request body of the form {"action": "reset"} for all of the following.
POST servers/{server-id}
POST replica_sets/{repl-id}
POST sharded_clusters/{shard-id}
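A sketch of the "start only nodes that are not running" part of reset. The `is_running` and `start` callables are placeholders for whatever the server objects expose; ensuring replica set state afterwards is not covered here.

```python
def reset_cluster(members, is_running, start):
    """Start only the members that are not running; return the ids started."""
    started = []
    for member_id in members:
        if not is_running(member_id):
            start(member_id)
            started.append(member_id)
    return started
```

Members that are already up are left alone, which is where the time saving over a full teardown and setup would come from.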
localhost:8889/v1/<rest_of_api>
localhost:8889/v2/<rest_of_api>
...
scripts/mo only starts clusters from a config file. It should also support status and stop commands.
With --no-fork, preset configurations work fine, but without --no-fork, the preset configurations fail, probably because running as daemon changes CWD.
Loading a preset doesn't work despite the file actually existing:
$ http --json POST localhost:8889/replica_sets preset="basic.json"
HTTP/1.1 500 Internal Server Error
Content-Length: 532
Content-Type: application/json
Date: Thu, 11 Sep 2014 16:18:24 GMT
Server: bigbrock
[
"Traceback (most recent call last):\n",
" File \"/home/tbrock/Code/mongo-orchestration/apps/__init__.py\", line 62, in wrap\n return f(*arg, **kwd)\n",
" File \"/home/tbrock/Code/mongo-orchestration/apps/replica_sets.py\", line 52, in rs_create\n data = preset_merge(data, 'replica_sets')\n",
" File \"/home/tbrock/Code/mongo-orchestration/lib/common.py\", line 38, in preset_merge\n with open(path, \"r\") as preset_file:\n",
"IOError: [Errno 2] No such file or directory: u'configurations/replica_sets/basic.json'\n"
]
$ ls configurations/replica_sets/basic.json
configurations/replica_sets/basic.json
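If the cause really is the daemon changing its working directory, one way around it is to resolve presets against a fixed base directory instead of the CWD. This is a sketch under that assumption; `preset_path` is a hypothetical helper, and `base_dir` would have to be captured before daemonizing (for example from the location of the package on disk).

```python
import os


def preset_path(base_dir, kind, name):
    """Build an absolute preset path that does not depend on the CWD."""
    return os.path.join(os.path.abspath(base_dir), "configurations", kind, name)
```

Because the result is absolute, a later `os.chdir` in the daemon code no longer affects which file gets opened.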
See the linking section on our MMS API documentation: http://mms.mongodb.com/help/core/api/#linking. This is basically the principle of HATEOAS.
Each resource should provide links to the available further actions. For instance, the root resource of GET /{version} should provide urls to all the items that are possible as well as enough documentation for how to interact with it. For instance, I've added a "template" parameter to the rel: "add-server" below which could contain a sample body for the post and maybe the required/optional parameters and their types. But, the template stuff is certainly a lot of work and may be something we don't want to tackle right now.
GET /v1
=> {
links: [
{ rel: "get-versions", uri: "full-hostname/v1/versions", verb: "GET" },
{ rel: "get-servers", uri: "full-hostname/v1/servers", verb: "GET" },
// etc...
]
}
Then, following the servers link:
GET /v1/servers
=> {
servers: [ {
id: "1",
hostname: "localhost",
port: 28934,
links: [
{ rel: "get-server", uri: "full-hostname/v1/servers/1", verb: "GET" },
{ rel: "shutdown-server", uri: "full-hostname/v1/servers/1", verb: "DELETE" },
// etc...
]
} ],
links: [
{ rel: "add-server", uri: "full-hostname/v1/servers", verb: "POST", template: { } },
// etc...
]
}
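On the client side, following links by "rel" could look roughly like this. `find_link` is a hypothetical helper, and the resource shape is the one proposed above, not something the server returns today.

```python
def find_link(resource, rel):
    """Return the first link in a resource whose rel matches, else None."""
    for link in resource.get("links", []):
        if link.get("rel") == rel:
            return link
    return None
```

A client would fetch GET /v1, call `find_link(root, "get-servers")`, and issue the request described by the returned "verb" and "uri" instead of hard-coding paths.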
Right now you only get member_id and a uri which doesn't seem to be a uri...
$ http --json POST localhost:8889/replica_sets members:="[{},{},{}]"
{ "id": "rs-babe3974-96fe-4136-b8a2-af665d603b3f", ...}
$ http GET localhost:8889/replica_sets/rs-babe3974-96fe-4136-b8a2-af665d603b3f
{
"auth_key": null,
"id": "rs-babe3974-96fe-4136-b8a2-af665d603b3f",
"members": [
{
"_id": 0,
"host": "localhost:1034",
"host_id": "dd7f2676-e590-42b0-8ccc-757d44e5a5fc"
},
{
"_id": 1,
"host": "localhost:1035",
"host_id": "efec3229-db29-463f-8f1e-2a60fc253745"
},
{
"_id": 2,
"host": "localhost:1036",
"host_id": "f304a6d9-d0f2-4ccc-8a86-66643831b20f"
}
],
"mongodb_uri": "mongodb://localhost:1034,localhost:1035,localhost:1036/?replicaSet=rs-babe3974-96fe-4136-b8a2-af665d603b3f",
"orchestration": "replica_sets",
"uri": "localhost:1034,localhost:1035,localhost:1036/?replicaSet=rs-babe3974-96fe-4136-b8a2-af665d603b3f"
}
$ http GET localhost:8889/replica_sets/rs-babe3974-96fe-4136-b8a2-af665d603b3f/primary
{
"member_id": 0,
"uri": "/servers/dd7f2676-e590-42b0-8ccc-757d44e5a5fc"
}
I find myself getting lots of 500s with responses like the following:
[
"Traceback (most recent call last):\n",
" File \"/home/tbrock/Code/mongo-orchestration/apps/rs.py\", line 33, in wrap\n return f(*arg, **kwd)\n",
" File \"/home/tbrock/Code/mongo-orchestration/apps/rs.py\", line 57, in rs_create\n data = json.loads(json_data)\n",
" File \"/usr/lib64/python2.7/json/__init__.py\", line 338, in loads\n return _default_decoder.decode(s)\n",
" File \"/usr/lib64/python2.7/json/decoder.py\", line 366, in decode\n obj, end = self.raw_decode(s, idx=_w(s, 0).end())\n",
" File \"/usr/lib64/python2.7/json/decoder.py\", line 382, in raw_decode\n obj, end = self.scan_once(s, idx)\n",
"ValueError: Expecting property name: line 1 column 2 (char 1)\n"
]
It would be nicer to see something like a 400 vs a 500 with: "couldn't parse the JSON sent to server" or "missing process name key [name], etc..."
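Something along these lines would turn parse failures into 400s. It is only a sketch: `parse_request_body` and the default `required` tuple are assumptions, and the error strings are taken from the suggestion above.

```python
import json


def parse_request_body(raw, required=("name",)):
    """Return (status, payload): 200 with parsed data, or 400 with a message."""
    try:
        data = json.loads(raw)
    except ValueError:
        # The json module is strict: single-quoted keys, for example, land here.
        return 400, {"error": "couldn't parse the JSON sent to server"}
    if not isinstance(data, dict):
        return 400, {"error": "request body must be a JSON object"}
    missing = [key for key in required if key not in data]
    if missing:
        return 400, {"error": "missing key(s): " + ", ".join(missing)}
    return 200, data
```

A body such as `{'name': 'mongod'}` (single quotes) is rejected by the strict parser and would come back as a 400 here rather than a 500 with a traceback.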
There's no LICENSE file in the repository, nor any mention of a license in the source code files.
Right now we are using wsgiref (the default), which is single threaded. While we are waiting/deciding to go async we might be able to do better by at least using a multi-threaded backend for bottle by default. It's as simple as providing some params when you call Bottle.run
http://bottlepy.org/docs/dev/deployment.html#switching-the-server-backend
If we are blocking waiting for a replica set or cluster to spin up we can service other requests by using a multi-threaded or multi-process backend.
As an example, consider right now that if you spin up 10 replica sets for 10 test suites the first one spawns and we block waiting for initiation. Then the next one spawns, we block again... With a multithreaded or pre-fork version we could spawn all 10 at once and block (logically) once for the initiation of all of the sets before proceeding to run all the tests.
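As a point of comparison with the Bottle backends in the linked docs, here is a standard-library sketch that makes the wsgiref server handle each request in its own thread. `ThreadingWSGIServer` and `build_server` are names invented for this example; a Bottle application is a WSGI callable, so in principle it could be passed as `app`, but this has not been tried against the orchestration server.

```python
from wsgiref.simple_server import WSGIServer, make_server

try:
    import socketserver  # Python 3
except ImportError:
    import SocketServer as socketserver  # Python 2


class ThreadingWSGIServer(socketserver.ThreadingMixIn, WSGIServer):
    """wsgiref server that handles each request in its own thread."""
    daemon_threads = True


def build_server(app, host="localhost", port=8889):
    """Create (but do not start) a threaded WSGI server for app."""
    return make_server(host, port, app, server_class=ThreadingWSGIServer)
```

Calling `build_server(app).serve_forever()` would then serve requests concurrently, so one blocked replica set initiation no longer holds up the others. Shared state in the handlers would need locking once requests overlap.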
Redirecting stdout and stderr to a pipe causes deadlock:
https://github.com/mongodb/mongo-orchestration/blob/master/lib/process.py#L166-L168
No one reads from the pipe. Once the pipe is full, mongod blocks trying to write to it. mongod writes quite a bit to stdout (or stderr, I don't remember) if started with -vvvvv. We need to write to /dev/null instead. Here's an example in test-tools code:
https://github.com/mongodb/test-tools/commit/1d199ef44b279ff144f63869ef1430f7aa5ceb9d
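A minimal sketch of the /dev/null approach using only the standard library. `spawn_quiet` is a hypothetical name, and this is not the code from the linked commit.

```python
import os
import subprocess


def spawn_quiet(command):
    """Start a process with stdout and stderr discarded."""
    devnull = open(os.devnull, "w")
    try:
        return subprocess.Popen(command, stdout=devnull,
                                stderr=subprocess.STDOUT)
    finally:
        # The child holds its own copy of the descriptor,
        # so the parent can close its handle right away.
        devnull.close()
```

Since nothing is buffered in a pipe, the child can write as much as it likes without blocking. The cost is that the output is lost, so pointing mongod at a log file may be preferable when the logs are needed for debugging.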
Preset server-side configurations would be very convenient and could significantly simplify clients.
Suggested interface - add "preset" field to JSON parameters of POST body:
POST /hosts
POST /rs
POST /sh
{
"preset": "basic.json", // [optional] - configuration file on server
}
The above is a minimal JSON POST body. Suggested semantics - if "preset" is specified, load the configuration file on the server, and then merge in the JSON POST body with parameters that will override any equivalent parameters from the configuration file.
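The suggested semantics could be sketched as follows. `merge_preset` is a hypothetical helper, and the merge is shallow (top-level keys from the POST body replace those from the preset), which is an assumption; nested merging may be what is actually wanted.

```python
import json


def merge_preset(preset_text, body):
    """Load a preset document and let request-body keys override it."""
    config = json.loads(preset_text)
    # The "preset" key only selects the file; it is not part of the result.
    overrides = dict((k, v) for k, v in body.items() if k != "preset")
    config.update(overrides)  # shallow merge: top-level keys are replaced
    return config
```

Here `preset_text` stands for the contents of the configuration file on the server, read by whatever code resolves the "preset" name to a path.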
$python server.py start
Traceback (most recent call last):
File "server.py", line 82, in <module>
args = read_env()
File "server.py", line 38, in read_env
cli_args.release_path = config['releases'][cli_args.env]
KeyError: 'default'
Python version is 2.7.8 / ArchLinux / latest pymongo + requests
Clients should connect to the instantiated cluster via a uri, so supply it in all cases. That way clients don't each need their own code to generate a uri for every kind of cluster.
Related to #40, but branching this out into a separate issue to keep the scope under control.
Rename modules:
lib/rs.py -> lib/replica_sets.py
lib/hosts.py -> lib/servers.py
Rename classes:
Host -> class Server
Shard -> class ShardedCluster
Wherever we return an object with a host field, this should become a server field. shard_id should become cluster_id.
This should finish up the renaming once and for all.
There's some confusion over DBPATH and DATAPATH that needs to be fixed and cleaned up. Also the name '#' for the script is more than inconvenient and confusing.
Arbitrary stops and starts result in confused states with multiple processes and missing config files.
If the process is already running, you can start another (competing) process with the same configuration. I think that this is a bug and should be fixed.
After PUT hosts/{host-id}/stop, status via GET hosts/{host-id} still reports the old procInfo.pid even though the process isn't running. You can configure a "hosts" object, but you can't get an accurate status on whether or not the process is running. You can only make assumptions based on "knowing" the initial state and mentally tracking your state changes and the "expected" state.
Accurate status would help as we could determine whether or not to submit a start request.
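One way the status could be checked at request time is to probe the recorded pid. This is a POSIX-only sketch with a hypothetical helper name, and it has a known caveat: a child that has exited but has not been reaped (a zombie) still counts as existing.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking without signaling
    except OSError as exc:
        # EPERM means the process exists but belongs to another user.
        return exc.errno == errno.EPERM
    return True
```

The "alive" field in procInfo could then be computed from the recorded pid on each GET instead of being remembered from the last state change.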
Selection of mongo server version is needed via RESTful interface. With this feature, driver clients can easily script tests of a feature for function across a specified spectrum of mongo server versions. This is exactly what we need to improve development and maintenance while maintaining old version compatibility.
Without this feature, test scripts would have to add an additional layer of process spawning support and complexity to launch and kill mongo-orchestration processes.
This is not mixed-version support, all members of a cluster would be of the same version.
Suggested config parameter - "version"
Example parameter in JSON body for /hosts POST or /hosts/id PUT
{
"version": "2.6", // [optional] - version for mongo servers
}
To resolve "version" to a bin_path, the "version" value should be a first substring match versus the mongo-orchestration.config data.
The config file should be reordered latest first so that latest releases are matched first.
The file is static, a more dynamic solution may be a future enhancement.
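The matching rule could be sketched like this. `resolve_bin_path` is a hypothetical name, the releases are passed as an ordered list of (name, path) pairs so that "latest first" is explicit, and the paths in the usage note are made up.

```python
def resolve_bin_path(releases, version):
    """Return the path of the first release whose name contains version."""
    for name, path in releases:
        if version in name:
            return path
    raise KeyError("no release matches version %r" % (version,))
```

With releases ordered as `[("2.6.4", "/opt/mongodb-2.6.4/bin"), ("2.6.0", "/opt/mongodb-2.6.0/bin"), ("2.4.10", "/opt/mongodb-2.4.10/bin")]`, asking for "2.6" picks the 2.6.4 path. A plain substring test can also match in the middle of a name, so a prefix match might turn out to be the safer rule.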
call trace for config['releases'] --> bin_path
server.py:97 args = read_env()
server.py:21 def read_env():
server.py:42 config = json.loads(open(cli_args.config, 'r').read())
server.py:47 releases = config['releases']
server.py:52 cli_args.release_path = releases[cli_args.env]
server.py:53 return cli_args
server.py:98 daemon.set_args(args)
server.py:92 def set_args(self, args):
server.py:93 self.args = args
server.py:104 if args.command == 'start' and args.no_fork:
server.py:105 daemon.run()
server.py:85 def run(self):
server.py:87 setup(getattr(self.args, "release_path", ""))
server.py:62 def setup(release_path):
server.py:65 set_bin_path(release_path)
__init__.py:9 def set_bin_path(bin_path=''):
__init__.py:10 Hosts().set_settings(bin_path)
__init__.py:11 RS().set_settings(bin_path)
__init__.py:12 Shards().set_settings(bin_path)
I'm seeing this after pulling on master and receiving the recent changes:
$ killall python
python: no process found
$ python server.py stop
pidfile /home/tbrock/Code/mongo-orchestration/server.pid does not exist. Daemon not running?
$ python server.py start
python server.py start
child process started successfully, parent exiting after 5 seconds
Starting Mongo Orchestration on port 8889...
Nothing running... I do all the stop things again and get the same results, then:
$ python server.py start --no-fork
Starting Mongo Orchestration on port 8889...
Bottle v0.11.rc1 server starting up (using WSGIRefServer())...
Listening on http://localhost:8889/
Hit Ctrl-C to quit.
Traceback (most recent call last):
File "server.py", line 105, in <module>
daemon.run()
File "server.py", line 90, in run
run(get_app(), host='localhost', port=self.args.port, debug=False, reloader=False, quiet=not self.args.no_fork)
File "/home/tbrock/Code/mongo-orchestration/bottle.py", line 2697, in run
server.run(app)
File "/home/tbrock/Code/mongo-orchestration/bottle.py", line 2379, in run
srv = make_server(self.host, self.port, handler, **self.options)
File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 144, in make_server
server = server_class((host, port), handler_class)
File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
self.server_bind()
File "/usr/lib64/python2.7/wsgiref/simple_server.py", line 48, in server_bind
HTTPServer.server_bind(self)
File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
SocketServer.TCPServer.server_bind(self)
File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
self.socket.bind(self.server_address)
File "/usr/lib64/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use
$ lsof -i | grep 8889
returns nothing
The tests all pass and starting with forking seems like it is running but actually doesn't start anything.
"sh" evokes singular shard in my mind.
"sc" or "clusters" or even "sharded_clusters" would be clearer as it should be plural and represent the collection of hosts/nodes.
Also "rs" could become "replica_sets".
What are your thoughts on this?
I think the logic for moving /start /stop /restart actions from the URI and into the payload applies the same way here. This is essentially an action.
The docs currently say we require pymongo 2.5.2, when we really need >=2.7.2 (we should add that requirement in setup.py btw). They also say we don't support Windows, but that support has been added. It's likely that other things are out of date due to recent changes.
Let's update the docs to cover recent changes and make sure we keep them up to date as we change things in the future.
Was there a change that made it so we no longer log requests processed by the orchestration server to STDOUT? If so, can we start logging again?
GET / route for server check, returns JSON that includes version number for dependency check
Some drivers add their own user for this.
Server 2.7.2+ has support for the new SCRAM-SHA-1 authentication mechanism. This should be enabled by default on all servers supporting it in conjunction with MONGODB-CR.
MO uses configsvrs for config but configservers for REST resource path. We should consider making them consistent. Some research on usage in MongoDB sharded cluster status and config may give some input on what to do.
If you create a replica set with MO, and then ask MO to stepdown the primary, the stepdown fails with "no secondaries within 10 seconds of my optime". At present, I'm working around this by using the client to issue {'replSetStepDown': 60, 'force': true} and then reattempting a write operation until it succeeds (takes about 20 attempts).
If MO implements a RESTful way to "await replication", then a following stepdown via MO would work.
Use Python scripts to start MO, send JSON files to create a setup, and teardown MO. This way we won't have to maintain separate scripts for Windows and Unix.