Documentation: https://shakenfist.com/
Source Code: https://github.com/shakenfist/shakenfist
License: Apache License 2.0
Old man shakes fist at cloud
Jun 17 14:43:02 sf-1 sf[20328]: Traceback (most recent call last):
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1950, in full_dispatch_request
Jun 17 14:43:02 sf-1 sf[20328]: rv = self.dispatch_request()
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1936, in dispatch_request
Jun 17 14:43:02 sf-1 sf[20328]: return self.view_functions[rule.endpoint](**req.view_args)
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/local/lib/python3.6/dist-packages/flask_restful/__init__.py", line 472, in wrapper
Jun 17 14:43:02 sf-1 sf[20328]: return self.make_response(data, code, headers=headers)
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/local/lib/python3.6/dist-packages/flask_restful/__init__.py", line 501, in make_response
Jun 17 14:43:02 sf-1 sf[20328]: resp = self.representations[mediatype](data, *args, **kwargs)
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/local/lib/python3.6/dist-packages/flask_restful/representations/json.py", line 21, in output_json
Jun 17 14:43:02 sf-1 sf[20328]: dumped = dumps(data, **settings) + "\n"
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/lib/python3.6/json/__init__.py", line 231, in dumps
Jun 17 14:43:02 sf-1 sf[20328]: return _default_encoder.encode(obj)
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/lib/python3.6/json/encoder.py", line 199, in encode
Jun 17 14:43:02 sf-1 sf[20328]: chunks = self.iterencode(o, _one_shot=True)
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/lib/python3.6/json/encoder.py", line 257, in iterencode
Jun 17 14:43:02 sf-1 sf[20328]: return _iterencode(o, 0)
Jun 17 14:43:02 sf-1 sf[20328]: File "/usr/lib/python3.6/json/encoder.py", line 180, in default
Jun 17 14:43:02 sf-1 sf[20328]: o.__class__.__name__)
Jun 17 14:43:02 sf-1 sf[20328]: TypeError: Object of type 'generator' is not JSON serializable
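The traceback above ends with the JSON encoder rejecting a generator. A minimal sketch of one common fix, assuming the API handler hands back a generator of dicts: expand it into a list before it reaches json.dumps. The list_instances and to_json names and the record fields are hypothetical, not taken from the Shaken Fist source.

```python
import json
import types


def list_instances():
    # Hypothetical stand-in for a handler that lazily yields records.
    for name in ('inst-a', 'inst-b'):
        yield {'name': name}


def to_json(result):
    # json.dumps cannot encode generator objects, so expand them into a
    # list first. Any other value is passed through unchanged.
    if isinstance(result, types.GeneratorType):
        result = list(result)
    return json.dumps(result)


encoded = to_json(list_instances())
```

The same effect could be had by returning list(...) from the handler itself; doing it in one helper just keeps the fix in a single place.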
So that each user's resources are carved up into their own namespace.
# sf-client instance create cirros 1 1 -d 8@cirros
Traceback (most recent call last):
File "/usr/local/bin/sf-client", line 10, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/srv/shakenfist/src/shakenfist/client/main.py", line 405, in instance_create
network, disk, sshkey_content, userdata_content))
File "/srv/shakenfist/src/shakenfist/client/apiclient.py", line 77, in create_instance
'user_data': userdata
File "/srv/shakenfist/src/shakenfist/client/apiclient.py", line 45, in _request_url
'API request failed', method, url, r.status_code, r.text)
shakenfist.client.apiclient.APIException: ('API request failed', 'POST', 'http://localhost:13000/instances', 500, '{"error": "server error", "status": 500, "traceback": "Traceback (most recent call last):\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 56, in wrapper\\n return func(*args, **kwargs)\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 245, in post\\n for network in args[\'network\']:\\nTypeError: \'NoneType\' object is not iterable\\n"}')
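The server-side failure above is a loop over args['network'] when no network was supplied. A minimal sketch of a guard, assuming args behaves like a dict in which an omitted argument is either missing or None. The requested_networks helper is hypothetical.

```python
def requested_networks(args):
    # Treat a missing key and an explicit None the same way: the caller
    # asked for no networks, so return an empty list that is safe to loop over.
    return args.get('network') or []


# With the guard, iterating is safe even when no -n option was passed.
attached = [network for network in requested_networks({'network': None})]
```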
We need to sanitise instance names for the DHCP configuration.
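A rough sketch of what that sanitising could look like, assuming the instance name ends up as a hostname in the DHCP configuration and should therefore be a valid hostname label (letters, digits and hyphens, at most 63 characters, no leading or trailing hyphen). The sanitise_hostname name and the fallback value are hypothetical.

```python
import re


def sanitise_hostname(name, fallback='instance'):
    # Lower-case the name and replace every run of characters that is not
    # a letter, digit or hyphen with a single hyphen.
    label = re.sub(r'[^a-z0-9-]+', '-', name.lower())
    # Truncate to the 63 character label limit, then trim hyphens from both
    # ends (truncation can itself leave a trailing hyphen).
    label = label[:63].strip('-')
    # A name made only of invalid characters collapses to nothing.
    return label or fallback
```

For example, sanitise_hostname('My Instance_01!') gives 'my-instance-01'. Two different instance names can collapse to the same label, so a real implementation would also need to think about uniqueness.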
Return the last N bytes of the console log like OpenStack Nova does.
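A minimal sketch of the tail read, assuming the console log is a plain file on disk. The tail_bytes name is hypothetical, and where the log actually lives is not something this issue states.

```python
import os


def tail_bytes(path, length):
    # Return at most the final `length` bytes of the file at `path`
    # without reading the whole file into memory.
    with open(path, 'rb') as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Clamp at zero so short files are returned in full.
        f.seek(max(0, size - length))
        return f.read()
```

Because the cut is made on a byte offset, the result can start mid-line (or mid-character for multi-byte encodings), which callers need to tolerate.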
Everything else uses the db.py abstraction, which I think has advantages in terms of keeping the format of what is in etcd consistent. We should decide whether we value db.py and, if we do, ensure that all etcd access flows through it.
It's like the 1990s all over again.
I've punted on this while getting JWT working at all and need to circle back and fix it.
Like a lot. I should clean those up.
# sf-client instance snapshot 4dae894a-3f89-4c7d-b8eb-04d890cf0d5d
Traceback (most recent call last):
File "/usr/local/bin/sf-client", line 10, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/srv/shakenfist/src/shakenfist/client/main.py", line 473, in instance_snapshot
uuid = CLIENT.snapshot_instance(instance_uuid, all)
File "/srv/shakenfist/src/shakenfist/client/apiclient.py", line 83, in snapshot_instance
'/snapshot', data={'all': all})
File "/srv/shakenfist/src/shakenfist/client/apiclient.py", line 45, in _request_url
'API request failed', method, url, r.status_code, r.text)
shakenfist.client.apiclient.APIException: ('API request failed', 'POST', 'http://localhost:13000/instances/4dae894a-3f89-4c7d-b8eb-04d890cf0d5d/snapshot', 500, '{"error": "server error", "status": 500, "traceback": "Traceback (most recent call last):\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 56, in wrapper\\n return func(*args, **kwargs)\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 89, in wrapper\\n return func(*args, **kwargs)\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 113, in wrapper\\n return func(*args, **kwargs)\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 319, in post\\n snap_uuid = instance_from_db_virt.snapshot(all=args[\'all\'])\\n File \\"/srv/shakenfist/src/shakenfist/virt.py\\", line 425, in snapshot\\n d[\'path\'], os.path.join(snappath, d[\'device\']))\\n File \\"/srv/shakenfist/src/shakenfist/virt.py\\", line 402, in _snapshot_device\\n images.snapshot(source, destination)\\n File \\"/srv/shakenfist/src/shakenfist/images.py\\", line 212, in snapshot\\n shell=True)\\n File \\"/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py\\", line 424, in execute\\n cmd=sanitized_cmd)\\noslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.\\nCommand: qemu-img convert --force-share -O qcow2 /srv/shakenfist/instances/4dae894a-3f89-4c7d-b8eb-04d890cf0d5d/vdc.qcow2 /srv/shakenfist/snapshots/dc498e79-974a-49f8-a776-d0cc7f5b2f31/vdc\\nExit code: 1\\nStdout: \'\'\\nStderr: \\"qemu-img: Could not open \'/srv/shakenfist/instances/4dae894a-3f89-4c7d-b8eb-04d890cf0d5d/vdc.qcow2\': Could not open \'/srv/shakenfist/instances/4dae894a-3f89-4c7d-b8eb-04d890cf0d5d/vdc.qcow2\': No such file or directory\\\\n\\"\\n"}')
I shouldn't be trying to snapshot CDROMs anyways.
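A small sketch of skipping CD-ROM devices before snapshotting. It assumes each disk is a dict with a 'type' key and that CD-ROMs use the value 'cdrom'. Both the key's presence on this code path and the value are guesses, and the snapshottable name is hypothetical.

```python
def snapshottable(disks):
    # Keep every disk except those marked as CD-ROMs. Disks without a
    # 'type' key are kept, on the assumption that they are ordinary disks.
    return [d for d in disks if d.get('type') != 'cdrom']


disks = [
    {'device': 'vda', 'type': 'disk'},
    {'device': 'vdc', 'type': 'cdrom'},  # assumed marker for a CD-ROM
]
to_snapshot = snapshottable(disks)
```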
Just recording that this needs to be nailed down at some point.
Because you can't boot a 404 page, but it sure is confusing.
Not entirely sure why yet.
$ for uuid in `sf-client --simple instance list | grep -v uuid | cut -f 1 -d "," | head -1`; do sf-client instance delete $uuid; done
Traceback (most recent call last):
File "/usr/local/bin/sf-client", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/main.py", line 533, in instance_delete
CLIENT.delete_instance(instance_uuid)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 186, in delete_instance
'/instances/' + instance_uuid)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 109, in _request_url
'API request failed', method, url, r.status_code, r.text)
shakenfist.client.apiclient.APIException: ('API request failed', 'DELETE', 'http://localhost:13000/instances/23000e33-25cd-47f2-a857-6ca6721cc86f', 500, '{"error": "server error", "status": 500, "traceback": "Traceback (most recent call last):\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 91, in wrapper\\n return func(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/view_decorators.py\\", line 107, in wrapper\\n verify_jwt_in_request()\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/view_decorators.py\\", line 32, in verify_jwt_in_request\\n jwt_data, jwt_header = _decode_jwt_from_request(request_type=\'access\')\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/view_decorators.py\\", line 294, in _decode_jwt_from_request\\n decoded_token = decode_token(encoded_token, csrf_token)\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/utils.py\\", line 118, in decode_token\\n allow_expired=allow_expired\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/tokens.py\\", line 140, in decode_jwt\\n leeway=leeway, options=options, issuer=issuer)\\n File \\"/usr/local/lib/python3.6/dist-packages/jwt/api_jwt.py\\", line 92, in decode\\n jwt, key=key, algorithms=algorithms, options=options, **kwargs\\n File \\"/usr/local/lib/python3.6/dist-packages/jwt/api_jws.py\\", line 156, in decode\\n key, algorithms)\\n File \\"/usr/local/lib/python3.6/dist-packages/jwt/api_jws.py\\", line 223, in _verify_signature\\n raise InvalidSignatureError(\'Signature verification failed\')\\njwt.exceptions.InvalidSignatureError: Signature verification failed\\n"}')
I need to pay more attention to this.
sf-client CLI autocomplete lists the available UUIDs (when appropriate) but does not complete the selected one onto the command line.
All the CI testing is done with short lived clusters, so of course I missed this...
Jul 4 10:14:53 sau-3f41e-or Failed to write /sf/node/sf-3, attempt 2: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.RESOURCE_EXHAUSTED
details = "etcdserver: mvcc: database space exceeded"
debug_error_string = "{"created":"@1593857693.847666930","description":"Error received from peer ipv4:127.0.0.1:2379","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"etcdserver: mvcc: database space exceeded","grpc_status":8}"
>
Jul 4 10:14:53 sau-3f41e-or Failed to collect resource statistics: Cannot write "/sf/node/sf-3"
It seems I need to compact, defragment, and expire old events regularly.
While launching many VMs on lots of networks, I got this error. It's weird because I only got it once out of hundreds of networks. A less-used code path, perhaps?
++ sf-client --simple network create 192.168.0.0/24 cybertaipan-20
++ grep uuid:
++ cut -f 2 -d :
Traceback (most recent call last):
File "/usr/local/bin/sf-client", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/main.py", line 373, in network_create
netblock, dhcp, nat, name, namespace))
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 275, in allocate_network
'namespace': namespace
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 140, in _request_url
return self._actual_request_url(method, url, data=data)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 119, in _actual_request_url
'API request failed', method, url, r.status_code, r.text)
shakenfist.client.apiclient.APIException: ('API request failed', 'POST', 'http://localhost:13000/networks', 500, '{"error": "server error", "status": 500, "traceback": "Traceback (most recent call last):\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 104, in wrapper\\n return func(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/view_decorators.py\\", line 108, in wrapper\\n return fn(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 1074, in post\\n n.create()\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/net.py\\", line 172, in create\\n self.deploy_nat()\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/net.py\\", line 198, in deploy_nat\\n self.persist_floating_gateway()\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/net.py\\", line 97, in persist_floating_gateway\\n db.persist_floating_gateway(self.uuid, self.floating_gateway)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/db.py\\", line 165, in persist_floating_gateway\\n etcd.put(\'network\', None, network_uuid, n)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/etcd.py\\", line 56, in put\\n encoded = json.dumps(data, indent=4, sort_keys=True)\\n File \\"/usr/lib/python3.6/json/__init__.py\\", line 238, in dumps\\n **kw).encode(obj)\\n File \\"/usr/lib/python3.6/json/encoder.py\\", line 201, in encode\\n chunks = list(chunks)\\n File \\"/usr/lib/python3.6/json/encoder.py\\", line 430, in _iterencode\\n yield from _iterencode_dict(o, _current_indent_level)\\n File \\"/usr/lib/python3.6/json/encoder.py\\", line 404, in _iterencode_dict\\n yield from chunks\\n File \\"/usr/lib/python3.6/json/encoder.py\\", line 437, in _iterencode\\n o = _default(o)\\n File \\"/usr/lib/python3.6/json/encoder.py\\", line 180, in default\\n o.__class__.__name__)\\nTypeError: Object of type \'IPv4Address\' is not JSON serializable\\n"}')
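The failure above is json.dumps meeting an IPv4Address it cannot encode. A minimal sketch of one way around it, using the default hook of json.dumps to turn address objects into strings. The encode_default name and the example address are made up; floating_gateway is the attribute named in the traceback, but the shape of the record is an assumption.

```python
import ipaddress
import json


def encode_default(obj):
    # json.dumps calls this only for objects it cannot encode itself.
    if isinstance(obj, (ipaddress.IPv4Address, ipaddress.IPv6Address)):
        return str(obj)
    raise TypeError(
        'Object of type %s is not JSON serializable' % type(obj).__name__)


# Assumed record shape, for illustration only.
record = {'floating_gateway': ipaddress.IPv4Address('192.168.20.5')}
encoded = json.dumps(record, indent=4, sort_keys=True, default=encode_default)
```

The alternative is to convert the address with str() at the point where it is stored on the network object, which keeps the encoder untouched but has to be remembered on every code path.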
$ hyperfine "sf-client network create 192.168.1.0/24 mynet"
Benchmark #1: sf-client network create 192.168.1.0/24 mynet
Time (mean ± σ): 18.615 s ± 0.550 s [User: 361.9 ms, System: 40.3 ms]
Range (min … max): 17.725 s … 19.362 s 10 runs
It's interesting, because the individual events aren't super slow:
$ sf-client network events ee744264-6d0c-423e-9823-dc36f7ca5aa5
+----------------------------+------+------------------------+------------+---------------------+---------+
| timestamp | node | operation | phase | duration | message |
+----------------------------+------+------------------------+------------+---------------------+---------+
| 2020-06-17 19:01:22.428386 | sf-1 | api | create | None | None |
| 2020-06-17 19:01:24.115698 | sf-1 | create vxlan interface | start | None | None |
| 2020-06-17 19:01:24.479220 | sf-1 | create vxlan interface | finish | 0.3634636402130127 | None |
| 2020-06-17 19:01:24.829187 | sf-1 | create vxlan bridge | start | None | None |
| 2020-06-17 19:01:25.263628 | sf-1 | create vxlan bridge | finish | 0.43483662605285645 | None |
| 2020-06-17 19:01:25.598291 | sf-1 | create netns | start | None | None |
| 2020-06-17 19:01:26.039656 | sf-1 | create netns | finish | 0.43940114974975586 | None |
| 2020-06-17 19:01:26.391639 | sf-1 | create router veth | start | None | None |
| 2020-06-17 19:01:26.978238 | sf-1 | create router veth | finish | 0.5865623950958252 | None |
| 2020-06-17 19:01:27.326934 | sf-1 | create physical veth | start | None | None |
| 2020-06-17 19:01:27.770758 | sf-1 | create physical veth | finish | 0.44352030754089355 | None |
| 2020-06-17 19:01:30.158449 | sf-1 | enable virtual routing | start | None | None |
| 2020-06-17 19:01:30.698722 | sf-1 | enable virtual routing | finish | 0.5399155616760254 | None |
| 2020-06-17 19:01:31.046316 | sf-1 | enable nat | start | None | None |
| 2020-06-17 19:01:31.962038 | sf-1 | enable nat | finish | 0.9159204959869385 | None |
| 2020-06-17 19:01:32.296439 | sf-1 | ensure mesh | start | None | None |
| 2020-06-17 19:01:32.952852 | sf-1 | discover mesh | start | None | None |
| 2020-06-17 19:01:33.302200 | sf-1 | discover mesh | finish | 0.34851765632629395 | None |
| 2020-06-17 19:01:33.636606 | sf-1 | ensure mesh | finish | 1.3521108627319336 | None |
| 2020-06-17 19:01:33.944221 | sf-1 | update dhcp | start | None | None |
| 2020-06-17 19:01:36.005281 | sf-1 | update dhcp | finish | 2.0472848415374756 | None |
| 2020-06-17 19:01:36.338392 | sf-1 | ensure mesh | start | None | None |
| 2020-06-17 19:01:37.006036 | sf-1 | discover mesh | start | None | None |
| 2020-06-17 19:01:37.354087 | sf-1 | discover mesh | finish | 0.34702134132385254 | None |
| 2020-06-17 19:01:37.701589 | sf-1 | ensure mesh | finish | 1.3621313571929932 | None |
| 2020-06-17 19:02:24.758767 | sf-1 | api | get events | None | None |
+----------------------------+------+------------------------+------------+---------------------+---------+
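To put a number on that, here is a small sketch that compares the wall-clock span of the event log above with the sum of the reported durations. The values are copied from the table. "discover mesh" is left out because its start and finish fall inside "ensure mesh" in the log, so I am assuming it is nested and would otherwise be counted twice.

```python
from datetime import datetime

FIRST_EVENT = '2020-06-17 19:01:22.428386'  # api create
LAST_EVENT = '2020-06-17 19:01:37.701589'   # final "ensure mesh" finish

# Reported durations of the top-level operations, in log order.
TOP_LEVEL = [
    ('create vxlan interface', 0.3634636402130127),
    ('create vxlan bridge', 0.43483662605285645),
    ('create netns', 0.43940114974975586),
    ('create router veth', 0.5865623950958252),
    ('create physical veth', 0.44352030754089355),
    ('enable virtual routing', 0.5399155616760254),
    ('enable nat', 0.9159204959869385),
    ('ensure mesh', 1.3521108627319336),
    ('update dhcp', 2.0472848415374756),
    ('ensure mesh', 1.3621313571929932),
]

FORMAT = '%Y-%m-%d %H:%M:%S.%f'
elapsed = (datetime.strptime(LAST_EVENT, FORMAT)
           - datetime.strptime(FIRST_EVENT, FORMAT)).total_seconds()
timed = sum(duration for _, duration in TOP_LEVEL)
untimed = elapsed - timed

print('elapsed %.2fs, inside timed operations %.2fs, unaccounted %.2fs'
      % (elapsed, timed, untimed))
```

On these figures a bit under half of the elapsed time falls between the timed operations rather than inside them, which might be worth looking at before optimising any single step. This is one run from the event log, not the hyperfine benchmark, so the totals are not directly comparable.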
Creating an instance randomly returns an empty data structure with a 200 OK.
The instance is subsequently created.
**************************
*** Create an instance ***
**************************
UUID:
Name:
CPUs: 0
Memory (MB): 0
Disks:
SSHKey:
Node:
ConsolePort: 0
VDIPort: 0
UserData:
State:
StateUpdated: 1970-01-01 10:00:00 +1000 AEST
For example, revoking a single user's access.
Is the endpoint API the best design?
Currently:
POST auth/namespace/<namespace>/metadata/ creates a key
Would it be more appropriate:
POST auth/namespace/<namespace>/metadata creates a key
PUT auth/namespace/<namespace>/metadata/<key> creates or updates a key
Doesn't add functionality, only polish.
I don't track what has been handed out at all. That needs to change.
The network UUID does exist, but a request for it produces this error:
Jun 23 08:04:18 andy-200623-fiu9eiqu-sf-1 External API request: <bound method arg_is_network_uuid.<locals>.wrapper of <shakenfist.external_api.app.Network object at 0x7f18763a1da0>> () {'network_uuid': '00997cbc-1e97-4a3d-b55b-3720117fb71a'}
Jun 23 08:04:18 andy-200623-fiu9eiqu-sf-1 Returning API error: 500, server error#012 Traceback (most recent call last):#012 File "/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py", line 91, in wrapper#012 return func(*args, **kwargs)#012 File "/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/view_decorators.py", line 108, in wrapper#012 return fn(*args, **kwargs)#012 File "/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py", line 173, in wrapper#012 return func(*args, **kwargs)#012 File "/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py", line 654, in get#012 if network_from_db is not None:#012 KeyError: 'ipmanager'
virsh start sf:edfbb915-704c-49ce-9ed3-4ad53795c364
error: Failed to start domain sf:edfbb915-704c-49ce-9ed3-4ad53795c364
error: unsupported configuration: Unable to use MAC address starting with reserved value 0xFE - 'fe:94:0c:39:75:c6' -
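A sketch of a MAC generator that avoids the rejected value. It assumes addresses are generated randomly, which this issue does not actually state, and the random_mac name is hypothetical. It draws a locally administered unicast first octet and retries if that octet comes out as 0xFE.

```python
import random


def random_mac(rng=random):
    while True:
        # Set the locally administered bit (0x02) and clear the
        # multicast bit (0x01) in the first octet.
        first = (rng.randrange(256) | 0x02) & 0xfe
        # libvirt refused a first octet of 0xFE above, so draw again.
        if first != 0xfe:
            break
    rest = [rng.randrange(256) for _ in range(5)]
    return ':'.join('%02x' % octet for octet in [first] + rest)
```

The retry matters because 0xFE itself has the locally administered bit set and the multicast bit clear, so the bit masking alone does not exclude it.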
The mirror sometimes refuses to let me download images. I should support a local mirror.
I should fix that.
root@sf-1:/srv/shakenfist# sf-client network delete 2f40b085-cdfc-4a2c-9031-a17f7b48273a
Traceback (most recent call last):
File "/usr/local/bin/sf-client", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/main.py", line 412, in network_delete
CLIENT.delete_network(network_uuid)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 249, in delete_network
'/networks/' + network_uuid)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 131, in _request_url
'API request failed', method, url, r.status_code, r.text)
shakenfist.client.apiclient.APIException: ('API request failed', 'DELETE', 'http://localhost:13000/networks/2f40b085-cdfc-4a2c-9031-a17f7b48273a', 500, '{"error": "server error", "status": 500, "traceback": "Traceback (most recent call last):\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 104, in wrapper\\n return func(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/flask_jwt_extended/view_decorators.py\\", line 108, in wrapper\\n return fn(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 213, in wrapper\\n return func(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 259, in wrapper\\n return func(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 242, in wrapper\\n return func(*args, **kwargs)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/external_api/app.py\\", line 999, in delete\\n n = net.from_db(network_uuid)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/net.py\\", line 37, in from_db\\n namespace=dbnet[\'namespace\'])\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/net.py\\", line 55, in __init__\\n ipm = db.get_ipmanager(self.uuid)\\n File \\"/usr/local/lib/python3.6/dist-packages/shakenfist/db.py\\", line 151, in get_ipmanager\\n raise Exception(\'IP Manager not found for network %s\' % network_uuid)\\nException: IP Manager not found for network 2f40b085-cdfc-4a2c-9031-a17f7b48273a\\n"}')
root@sf-1:/srv/shakenfist#
So that we can implement resource ownership.
It probably shouldn't be a JSON blob in the database. Discuss.
Creating instances is slower than I expected too:
$ hyperfine "sf-client instance create myinst 1 1 -d 8@cirros -n e67e2be9-2736-44c2-a56c-2a0575a882c6"
Benchmark #1: sf-client instance create myinst 1 1 -d 8@cirros -n e67e2be9-2736-44c2-a56c-2a0575a882c6
Time (mean ± σ): 60.729 s ± 11.505 s [User: 366.3 ms, System: 47.4 ms]
Range (min … max): 48.186 s … 79.898 s 10 runs
With these sorts of events:
$ sf-client instance events 416dd877-6604-4130-a5f6-a8a553682ce2
+----------------------------+------+----------------------------+-----------------------------+---------------------+--------------------------+
| timestamp | node | operation | phase | duration | message |
+----------------------------+------+----------------------------+-----------------------------+---------------------+--------------------------+
| 2020-06-18 07:08:11.256338 | sf-1 | uuid allocated | None | None | None |
| 2020-06-18 07:08:13.595873 | sf-1 | schedule | start | None | None |
| 2020-06-18 07:08:15.261052 | sf-1 | schedule | Initial candidates | None | ['sf-1', 'sf-2', 'sf-3'] |
| 2020-06-18 07:08:15.594456 | sf-1 | schedule | Have enough actual CPU | None | ['sf-1', 'sf-2', 'sf-3'] |
| 2020-06-18 07:08:15.927853 | sf-1 | schedule | Have enough idle CPU | None | ['sf-1', 'sf-2', 'sf-3'] |
| 2020-06-18 07:08:16.261326 | sf-1 | schedule | Have enough idle RAM | None | ['sf-1', 'sf-2', 'sf-3'] |
| 2020-06-18 07:08:16.594862 | sf-1 | schedule | Have enough idle disk | None | ['sf-1', 'sf-2', 'sf-3'] |
| 2020-06-18 07:08:19.280864 | sf-1 | schedule | Have most matching networks | None | ['sf-3'] |
| 2020-06-18 07:08:19.947462 | sf-1 | schedule | Have most matching images | None | ['sf-3'] |
| 2020-06-18 07:08:20.280687 | sf-1 | schedule | finish | 6.684841871261597 | None |
| 2020-06-18 07:08:21.280062 | sf-1 | placement | None | None | sf-3 |
| 2020-06-18 07:08:21.956380 | sf-3 | schedule | start | None | None |
| 2020-06-18 07:08:23.084216 | sf-3 | schedule | Forced candidates | None | ['sf-3'] |
| 2020-06-18 07:08:23.276393 | sf-3 | schedule | Initial candidates | None | ['sf-3'] |
| 2020-06-18 07:08:23.468592 | sf-3 | schedule | Have enough actual CPU | None | ['sf-3'] |
| 2020-06-18 07:08:23.718679 | sf-3 | schedule | Have enough idle CPU | None | ['sf-3'] |
| 2020-06-18 07:08:24.026666 | sf-3 | schedule | Have enough idle RAM | None | ['sf-3'] |
| 2020-06-18 07:08:24.207056 | sf-3 | schedule | Have enough idle disk | None | ['sf-3'] |
| 2020-06-18 07:08:25.343608 | sf-3 | schedule | Have most matching networks | None | ['sf-3'] |
| 2020-06-18 07:08:25.902487 | sf-3 | schedule | Have most matching images | None | ['sf-3'] |
| 2020-06-18 07:08:26.152653 | sf-3 | schedule | finish | 4.278729200363159 | None |
| 2020-06-18 07:08:40.168090 | sf-3 | ensure networks exist | start | None | None |
| 2020-06-18 07:08:51.337100 | sf-3 | ensure networks exist | finish | 11.136167287826538 | None |
| 2020-06-18 07:08:51.670927 | sf-3 | instance creation | start | None | None |
| 2020-06-18 07:08:52.004922 | sf-3 | make config drive | start | None | None |
| 2020-06-18 07:08:53.204224 | sf-3 | make config drive | finish | 1.2850000858306885 | None |
| 2020-06-18 07:08:53.366705 | sf-3 | fetch image | start | None | None |
| 2020-06-18 07:08:55.677333 | sf-3 | fetch image | finish | 2.224187135696411 | None |
| 2020-06-18 07:08:56.010746 | sf-3 | transcode image | start | None | None |
| 2020-06-18 07:08:56.344267 | sf-3 | transcode image | finish | 0.33343958854675293 | None |
| 2020-06-18 07:08:56.677983 | sf-3 | resize image | start | None | None |
| 2020-06-18 07:08:57.011498 | sf-3 | resize image | finish | 0.3337991237640381 | None |
| 2020-06-18 07:08:57.200808 | sf-3 | create copy on write layer | start | None | None |
| 2020-06-18 07:08:57.387142 | sf-3 | create copy on write layer | finish | 0.18468737602233887 | None |
| 2020-06-18 07:08:57.874770 | sf-3 | create domain XML | start | None | None |
| 2020-06-18 07:08:59.019109 | sf-3 | create domain XML | finish | 1.0588061809539795 | None |
| 2020-06-18 07:08:59.207091 | sf-3 | create domain | start | None | None |
| 2020-06-18 07:09:01.591925 | sf-3 | create domain | finish | 2.300135374069214 | None |
| 2020-06-18 07:09:02.321802 | sf-3 | instance creation | finish | 10.73669147491455 | None |
| 2020-06-18 07:09:03.871757 | sf-1 | api | get interfaces | None | None |
| 2020-06-18 07:22:55.862299 | sf-1 | api | get events | None | None |
+----------------------------+------+----------------------------+-----------------------------+---------------------+--------------------------+
So... gunicorn has a request timeout. The default is 30 seconds, although I am currently changing that to 300 seconds. Why? Well, fetching a large image for an instance start can take a long time. The file might be hundreds of gigabytes! I think long term we might want to move to a queue system for instance starts, but I see that as a v0.3 thing, not a v0.2 thing.
Windows lacks virtio drivers by default. A user of Shaken Fist should be able to request Windows-supported devices for an instance entirely via the API. Specifically, network cards need to have their model exposed for configuration.
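For reference, a minimal sketch of the timeout change as a gunicorn configuration file. The gunicorn.conf.py file name is an assumption about how the setting might be supplied; the same value can also be given with the --timeout command line option.

```python
# gunicorn.conf.py (assumed file name)

# Seconds a worker may stay silent before gunicorn kills and restarts it.
# The default is 30, which is too short for a large image fetch.
timeout = 300
```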
This is needed for one of our callers.
sf-client should not have unhandled exceptions, especially resource not found.
Traceback (most recent call last):
File "/usr/local/bin/sf-client", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/main.py", line 230, in network_show
_show_network(ctx, CLIENT.get_network(network_uuid))
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 206, in get_network
'/networks/' + network_uuid)
File "/usr/local/lib/python3.6/dist-packages/shakenfist/client/apiclient.py", line 105, in _request_url
'API request failed', method, url, r.status_code, r.text)
shakenfist.client.apiclient.ResourceNotFoundException: ('API request failed', 'GET', 'http://localhost:13000/networks/021adbe5-af42-4223-b6f9-6054f8c63649', 404, '{"error": "network not found", "status": 404}')
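A minimal sketch of the kind of handling I mean: catch the API errors at the top of the CLI, print a one-line message, and return a non-zero exit status instead of a traceback. The two exception classes here are hypothetical stand-ins named after the ones in the traceback (their inheritance is an assumption), and the run helper and the choice of exit codes are made up.

```python
import sys


class APIException(Exception):
    """Hypothetical stand-in for the client's generic API error."""


class ResourceNotFoundException(APIException):
    """Hypothetical stand-in for a 404 from the API."""


def run(command):
    # Call a CLI command and translate API errors into a short message
    # on stderr plus an exit status, rather than an unhandled traceback.
    try:
        command()
    except ResourceNotFoundException as e:
        print('Error: resource not found: %s' % e, file=sys.stderr)
        return 1
    except APIException as e:
        print('Error: API request failed: %s' % e, file=sys.stderr)
        return 2
    return 0
```

A caller would finish with sys.exit(run(command)). The more specific exception has to be caught first, otherwise the generic handler would swallow the not-found case.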
Is the command line client not exit(1)'ing when it sees an unknown exception?
Sound familiar?
Resource endpoints are plurals except for namespace.
Should namespace change for consistency?
New errors with latest commits.
Attempt to create instance returns empty data structure.
Namespace: testspace
Call was to sf-1. Instantiated on sf-3. Success but empty data structure returned.
A subsequent call to /instances returned 404 (no URL).
Jun 28 10:53:24 andy-200628-amai3ieg-sf-1 API request: POST http://localhost:13000/auth
Headers:
('Host', 'localhost:13000')
('User-Agent', 'Go-http-client/1.1')
('Content-Length', '41')
('Authorization', 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE1OTMzNDE2MDEsIm5iZiI6MTU5MzM0MTYwMSwianRpIjoiODQ2ZDRkOWItMjE3MS00ZmNmLTgzZDYtYjk4OTFlMzQwOTZkIiwiZXhwIjoxNTkzMzQyNTAxLCJpZGVudGl0eSI6InRlc3RzcGFjZSIsImZyZXNoIjpmYWxzZSwidHlwZSI6ImFjY2VzcyJ9.BwhPucz5hW7lsAeNPLSk8aEWJYznc9VtQ9wGmxnr_cU')
('Content-Type', 'application/json')
('Accept-Encoding', 'gzip')
Args: ()
KWargs: {'namespace': 'testspace', 'key': 'testkey'}
Jun 28 10:53:24 andy-200628-amai3ieg-sf-1 systemd-networkd[4970]: vxlan-7: Gained IPv6LL
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 systemd-networkd[4970]: phy-7-o: Gained IPv6LL
Jun 28 10:53:25 localhost gunicorn.sf.access: [14349] 127.0.0.1 - - [28/Jun/2020:10:53:25 +0000] "POST /auth HTTP/1.1" 200 302 "-" "Go-http-client/1.1"
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 API request: POST http://localhost:13000/instances
Headers:
('Host', 'localhost:13000')
('User-Agent', 'Go-http-client/1.1')
('Content-Length', '232')
('Authorization', 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE1OTMzNDE2MDUsIm5iZiI6MTU5MzM0MTYwNSwianRpIjoiOGZiZjRhM2UtMjhjNC00ZGE2LTk5NzktNGQ5OTcyNjM2N2JiIiwiZXhwIjoxNTkzMzQyNTA1LCJpZGVudGl0eSI6InRlc3RzcGFjZSIsImZyZXNoIjpmYWxzZSwidHlwZSI6ImFjY2VzcyJ9.sBbNFOEO_XEHJpc07m1I3hlCuxPlgobXF6MSDEAUFpg')
('Content-Type', 'application/json')
('Accept-Encoding', 'gzip')
Args: ()
KWargs: {'name': 'golang', 'cpus': 1, 'memory': 1, 'network': [{'network_uuid': '7a32b742-a7b0-434f-aa86-4e37fe27b5d1', 'address': '', 'macaddress': '', 'model': ''}], 'disk': [{'base': 'cirros', 'size': 8, 'bus': '', 'type': 'disk'}], 'ssh_key': '', 'user_data': ''}
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-1', 'sf-2', 'sf-3'] start as candidates
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-1', 'sf-2', 'sf-3'] have enough actual CPU
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-1', 'sf-2', 'sf-3'] have enough idle CPU
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-1', 'sf-2', 'sf-3'] have enough idle RAM
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-1', 'sf-2', 'sf-3'] have enough idle disk
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-1', 'sf-2', 'sf-3'] have most matching networks
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Scheduling instance(43521ba9-47fb-464c-b445-e9228ffa6421), ['sf-3'] have most matching images
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 instance(43521ba9-47fb-464c-b445-e9228ffa6421): Finish schedule, duration 0.25 seconds
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 systemd-networkd[4970]: br-vxlan-7: Gained IPv6LL
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Fetching testspace auth token from http://sf-3:13000/auth
Jun 28 10:53:25 andy-200628-amai3ieg-sf-3 API request: POST http://sf-3:13000/auth
Headers:
('Host', 'sf-3:13000')
('User-Agent', 'Mozilla/5.0 (Ubuntu; Linux x86_64) Shaken Fist/0.1.2')
('Accept-Encoding', 'gzip, deflate')
('Accept', '*/*')
('Connection', 'keep-alive')
('Content-Type', 'application/json')
('Content-Length', '87')
Args: ()
KWargs: {'namespace': 'testspace', 'key': 'fyhhzcrbvzqbgtgfjbmjzmmhcczkclifwirgckrjykfkrswtkq'}
Jun 28 10:53:25 andy-200628-amai3ieg-sf-3 API request: POST http://sf-3:13000/instances
Headers:
('Host', 'sf-3:13000')
('User-Agent', 'Mozilla/5.0 (Ubuntu; Linux x86_64) Shaken Fist/0.1.2')
('Accept-Encoding', 'gzip, deflate')
('Accept', '*/*')
('Connection', 'keep-alive')
('Authorization', 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE1OTMzNDE2MDUsIm5iZiI6MTU5MzM0MTYwNSwianRpIjoiZDRiM2RjYjItMDIwNC00MWY1LThhMjMtM2JhNGNiMmQ0ODFlIiwiZXhwIjoxNTkzMzQyNTA1LCJpZGVudGl0eSI6InRlc3RzcGFjZSIsImZyZXNoIjpmYWxzZSwidHlwZSI6ImFjY2VzcyJ9.YvYm5jtrlEhglcgTnas-1_e6W-5tNzZib5h4bzSsNfg')
('Content-Length', '363')
Args: ()
KWargs: {'name': 'golang', 'cpus': 1, 'memory': 1, 'network': [{'network_uuid': '7a32b742-a7b0-434f-aa86-4e37fe27b5d1', 'address': '', 'macaddress': '', 'model': ''}], 'disk': [{'base': 'cirros', 'size': 8, 'bus': '', 'type': 'disk'}], 'ssh_key': '', 'user_data': '', 'placed_on': 'sf-3', 'instance_uuid': '43521ba9-47fb-464c-b445-e9228ffa6421', 'namespace': 'testspace'}
Jun 28 10:53:25 andy-200628-amai3ieg-sf-3 Returning API error: 401, only admins can create resources in a different namespace
Jun 28 10:53:25 andy-200628-amai3ieg-sf-1 Returning proxied request: 401, {"error": "only admins can create resources in a different namespace", "status": 401}
Jun 28 10:53:25 localhost gunicorn.sf.access: [14349] 127.0.0.1 - - [28/Jun/2020:10:53:25 +0000] "POST /instances HTTP/1.1" 401 85 "-" "Go-http-client/1.1"
Jun 28 10:53:26 localhost gunicorn.sf.access: [14358] 127.0.0.1 - - [28/Jun/2020:10:53:26 +0000] "GET /instances/ HTTP/1.1" 404 232 "-" "Go-http-client/1.1"```
And report a nice error instead of crashing...
shakenfist.client.apiclient.APIException: ('API request failed', 'POST', 'http://localhost:13000/instances', 500, '{"error": "server error", "status": 500, "traceback": "Traceback (most recent call last):\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 83, in wrapper\\n return func(*args, **kwargs)\\n File \\"/srv/shakenfist/src/shakenfist/daemons/external_api.py\\", line 371, in post\\n instance.create()\\n File \\"/srv/shakenfist/src/shakenfist/virt.py\\", line 172, in create\\n hashed_image_path, str(disk[\'size\']) + \'G\')\\n File \\"/srv/shakenfist/src/shakenfist/images.py\\", line 190, in resize_image\\n shell=True)\\n File \\"/usr/local/lib/python3.6/dist-packages/oslo_concurrency/processutils.py\\", line 424, in execute\\n cmd=sanitized_cmd)\\noslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.\\nCommand: qemu-img resize /srv/shakenfist/image_cache/579708eb995f49304bb123b16851e3e45a7b95d31ddd0e5dbac6451bb28994a5.v001.qcow2.2G 2G\\nExit code: 1\\nStdout: \'\'\\nStderr: \\"qemu-img: warning: Shrinking an image will delete all data beyond the shrunken image\'s end. Before performing such an operation, make sure there is no important data there.\\\\nqemu-img: Use the --shrink option to perform a shrink operation.\\\\n\\"\\n"}')
Imagine an image file which takes a very long time to fetch. We need to ensure that each node downloads it only once at a time.
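A minimal sketch of one possible approach, assuming a per-image lock file on the node's local disk and a POSIX host (it uses `fcntl.flock`). The `fetch_once` helper, the `download` callable and the cache layout are all hypothetical, not Shaken Fist's actual image code.

```python
import fcntl
import hashlib
import os


def fetch_once(url, cache_dir, download):
    """Fetch url into cache_dir, with at most one download at a time per node.

    download is a callable taking (url, destination_path). Other processes
    on the same node block on the lock file and then reuse the cached copy.
    """
    os.makedirs(cache_dir, exist_ok=True)
    name = hashlib.sha256(url.encode('utf-8')).hexdigest()
    image_path = os.path.join(cache_dir, name)
    lock_path = image_path + '.lock'

    with open(lock_path, 'w') as lock_file:
        # Exclusive lock: only one process per node gets past this point.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if not os.path.exists(image_path):
                # Download to a temporary name so a partial file is never
                # mistaken for a complete image.
                partial = image_path + '.partial'
                download(url, partial)
                os.rename(partial, image_path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
    return image_path
```

The existence check happens after the lock is acquired, so a process that waited on the lock sees the finished image and skips the download.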
We're inconsistent at the moment.
Sometimes MySQL goes away. We should have a retry decorator.
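A minimal sketch of what such a decorator could look like. The name `retry` and its parameters are hypothetical, and the exception type to catch would be whatever the database driver raises when the connection drops; that is left to the caller here.

```python
import functools
import time


def retry(exceptions, attempts=3, delay=0.5, backoff=2.0):
    """Retry the wrapped function when it raises one of exceptions.

    Makes up to attempts calls, sleeping delay seconds after the first
    failure and multiplying the wait by backoff after each later one.
    The final failure is re-raised to the caller.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise
                    time.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator
```

A database helper would then be decorated with something like `@retry(SomeDriverError, attempts=5)`, where `SomeDriverError` is a placeholder for the driver's real exception class.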
Debug servers are not for production.