katosys / kato
The magic is underneath.
Home Page: http://kato.one
License: Apache License 2.0
Use the vagrant-hostmanager plugin to manage the host's /etc/hosts file.
core@kato-1 ~ $ sudo rkt trust --prefix=quay.io/kato
pubkey: prefix: "quay.io/kato"
key: "https://quay.io/aci-signing-key"
gpg key fingerprint is: BFF3 13CD AA56 0B16 A898 7B8F 72AB F5F6 799D 33BC
Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Are you sure you want to trust this key (yes/no)?
yes
Trusting "https://quay.io/aci-signing-key" for prefix "quay.io/kato" after fingerprint review.
Added key for prefix "quay.io/kato" at "/etc/rkt/trustedkeys/prefix.d/quay.io/kato/bff313cdaa560b16a8987b8f72abf5f6799d33bc"
core@kato-1 ~ $ find /etc/rkt/
/etc/rkt/
/etc/rkt/trustedkeys
/etc/rkt/trustedkeys/prefix.d
/etc/rkt/trustedkeys/prefix.d/quay.io
/etc/rkt/trustedkeys/prefix.d/quay.io/kato
/etc/rkt/trustedkeys/prefix.d/quay.io/kato/bff313cdaa560b16a8987b8f72abf5f6799d33bc
core@kato-1 ~ $ ls -la /etc/rkt/trustedkeys/prefix.d/quay.io/kato/bff313cdaa560b16a8987b8f72abf5f6799d33bc
-rw-r--r--. 1 root rkt-admin 991 Nov 2 09:15 /etc/rkt/trustedkeys/prefix.d/quay.io/kato/bff313cdaa560b16a8987b8f72abf5f6799d33bc
core@kato-1 ~ $ cat /etc/rkt/trustedkeys/prefix.d/quay.io/kato/bff313cdaa560b16a8987b8f72abf5f6799d33bc
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v2
mQENBFTT6doBCACkVncI+t4HASQdnByRlXCYkwjsPqGOlgTCgenop5I6vgTqFWhQ
PMNhtSaFdFECMt2WKQT4QGVbfVOmIH9CLV+Muqvk4iJIAn3Nh3qp/kfMhwjGaS6m
fWN2ARFCq4RIs9tboCNQOouaD5C26/FsQtIsoqyYcdX+YFaU1a+R1kp0fc2CABDI
k6Iq8oEJO+FOYvqQYIJNfd3c0NHICilMu2jO3yIsw80qzWoFAAblyb0zVq/hudWB
4vdVzPmJe1f4Ymk8l1R413bN65LcbCiOax3hmFWovJoxlkL7WoGTTMfaeb2QmaPL
qcu4Q94v1KG87gyxbkIo5uZdvMLdswQI7yQ7ABEBAAG0RFF1YXkuaW8gQUNJIENv
bnZlcnRlciAoQUNJIGNvbnZlcnNpb24gc2lnbmluZyBrZXkpIDxzdXBwb3J0QHF1
YXkuaW8+iQE5BBMBAgAjBQJU0+naAhsDBwsJCAcDAgEGFQgCCQoLBBYCAwECHgEC
F4AACgkQcqv19nmdM7zKzggAjGFqy7Hcx6TCFXn53/inl5iyKrTu8cuF4K547XuZ
12Dt8b6PgJ+b3z6UnMMTd0wXKGcfOmNeQ2R71xmVnviuo7xB5ZkZIBxHI4M/5uhK
I6GZKr84WJS2ec7ssH2ofFQ5u1l+es9jUwW0KbAoNmES0IcdDy28xfmJpkfOn3oI
P2Bzz4rGlIqJXEjq28Wk+qQu64kJRKYuPNXqiHncPDm+i5jMXUUN1D+pkDukp26x
oLbpol42/jIcM3fe2AFZnflittBCHYLIHjJ51NlpSHJZmf2pQZbdyeKElN2SCNe7
nDcol24zYIC+SX0K23w/LrLzlff4mzbO99ePt1bB9zAiVA==
=SBoV
-----END PGP PUBLIC KEY BLOCK-----
The current build is broken:
github.com/bobtfish/go-nsone-api
github.com/katosys/kato/providers/dns/ns1
# github.com/katosys/kato/providers/dns/ns1
providers/dns/ns1/ns1.go:14: imported and not used: "github.com/bobtfish/go-nsone-api" as ns1
providers/dns/ns1/ns1.go:42: undefined: nsone in nsone.New
providers/dns/ns1/ns1.go:55: undefined: nsone in nsone.NewZone
providers/dns/ns1/ns1.go:93: undefined: nsone in nsone.New
providers/dns/ns1/ns1.go:101: undefined: nsone in nsone.NewRecord
providers/dns/ns1/ns1.go:102: undefined: nsone in nsone.Answer
providers/dns/ns1/ns1.go:103: undefined: nsone in nsone.NewAnswer
The --cluster-state flag takes one argument with two possible values: new and existing (the default). This flag will be used to shape the configuration of new quorum nodes. For instance, if the cluster already exists, then new quorum nodes should be started with zookeeper and etcd2 stopped and properly templated.
... the generated /etc/kato.env will be included in other units like this:
EnvironmentFile=/etc/kato.env
This is due to bad indentation.
By default --round-robin is set to false. But before it can be set to true, we must ensure that --nameservers has the list of all the master IP addresses (and nothing else).
Something like:
Flags:
--debug Enable debug mode.
--help Show help.
Will set:
log.SetLevel(log.DebugLevel)
After reading http://machinezone.github.io/research/networking-solutions-for-kubernetes/#results I have decided to switch to host-gw (because I trust them).
Currently master and worker nodes are in different VPC subnets. This works for vxlan but I think it won't work for host-gw.
INFO[0094] New EC2 elb security group cmd=ec2:setup id=sg-e5ad8182
INFO[0094] New EC2 quorum security group cmd=ec2:setup id=sg-e3ad8184
INFO[0094] New EC2 master security group cmd=ec2:setup id=sg-e2ad8185
INFO[0094] New EC2 worker security group cmd=ec2:setup id=sg-efad8188
ERRO[0095] InvalidGroup.NotFound: The security group 'sg-efad8188' does not exist
status code: 400, request id: 796e7a2e-a323-4c2d-a796-1cf1c7119e80 cmd=ec2:setup file=ec2.go func=ec2.(*Data).tag line=1996
FATA[0095] InvalidGroup.NotFound: The security group 'sg-efad8188' does not exist
status code: 400, request id: 796e7a2e-a323-4c2d-a796-1cf1c7119e80 cmd=ec2:setup file=ec2.go func=ec2.(*Data).setupEC2Firewall line=801
FATA[0095] exit status 1
[0] ~ >> ./kato-ec2
INFO[0000] Setup the EC2 environment cmd=ec2:deploy id=cell-1.dub.xnood.com
INFO[0000] Connecting to region eu-west-1 cmd=ec2:setup
INFO[0000] Latest CoreOS stable AMI located cmd=ec2:deploy id=ami-b7cba3c4
INFO[0000] New EC2 VPC created cmd=ec2:setup id=vpc-c0833aa4
ERRO[0000] InvalidVpcID.NotFound: The vpc ID 'vpc-c0833aa4' does not exist
status code: 400, request id: 78023e94-814c-4ba2-9482-bf8617c7ab23 cmd=ec2:setup file=ec2.go func=ec2.(*Data).tag line=1996
FATA[0000] InvalidVpcID.NotFound: The vpc ID 'vpc-c0833aa4' does not exist
status code: 400, request id: 78023e94-814c-4ba2-9482-bf8617c7ab23 cmd=ec2:setup file=ec2.go func=ec2.(*Data).Setup line=330
FATA[0000] exit status 1 cmd=ec2:deploy file=ec2.go func=ec2.(*Data).setupEC2 line=421
[0] ~ >> ./kato-ec2
INFO[0000] Setup the EC2 environment cmd=ec2:deploy id=cell-1.dub.xnood.com
INFO[0000] Connecting to region eu-west-1 cmd=ec2:setup
INFO[0000] Latest CoreOS stable AMI located cmd=ec2:deploy id=ami-b7cba3c4
INFO[0000] New EC2 VPC created cmd=ec2:setup id=vpc-81833ae5
INFO[0000] Using existing DNS zone cmd=ns1:zone:add id=int.cell-1.dub.xnood.com
INFO[0000] Using existing DNS zone cmd=ns1:zone:add id=ext.cell-1.dub.xnood.com
INFO[0000] New main route table added cmd=ec2:setup id=rtb-a6bfc1c2
INFO[0001] New etcd bootstrap token requested cmd=ec2:deploy id=5813b54775cb6091e71532cd06d4bc79
INFO[0001] New external subnet cmd=ec2:setup id=subnet-a1b82cd7
INFO[0000] Using existing DNS zone cmd=ns1:zone:add id=cell-1.dub.xnood.com
INFO[0001] New internal subnet cmd=ec2:setup id=subnet-aeb82cd8
ERRO[0001] InvalidSubnetID.NotFound: The subnet ID 'subnet-aeb82cd8' does not exist
status code: 400, request id: 685a700e-2415-4ad9-aa36-5617763b7eec cmd=ec2:setup file=ec2.go func=ec2.(*Data).tag line=1996
FATA[0001] InvalidSubnetID.NotFound: The subnet ID 'subnet-aeb82cd8' does not exist
status code: 400, request id: 685a700e-2415-4ad9-aa36-5617763b7eec cmd=ec2:setup file=ec2.go func=ec2.(*Data).setupVPCNetwork line=696
FATA[0001] exit status 1 cmd=ec2:deploy file=ec2.go func=ec2.(*Data).setupEC2 line=421
[0] ~ >> ./kato-ec2
INFO[0000] Setup the EC2 environment cmd=ec2:deploy id=cell-1.dub.xnood.com
INFO[0000] Connecting to region eu-west-1 cmd=ec2:setup
INFO[0000] New EC2 VPC created cmd=ec2:setup id=vpc-a78916c3
ERRO[0000] InvalidVpcID.NotFound: The vpc ID 'vpc-a78916c3' does not exist
status code: 400, request id: b5cc9e97-7217-49a9-bab2-9e7aaf5dac43 cmd=ec2:setup file=ec2.go func=ec2.(*Data).tag line=1978
FATA[0000] InvalidVpcID.NotFound: The vpc ID 'vpc-a78916c3' does not exist
status code: 400, request id: b5cc9e97-7217-49a9-bab2-9e7aaf5dac43 cmd=ec2:setup file=ec2.go func=ec2.(*Data).Setup line=363
FATA[0000] exit status 1 cmd=ec2:deploy file=ec2.go func=ec2.(*Data).setupEC2 line=454
[1] ~ >> INFO[0000] New DNS zone created cmd=ns1:zone:add id=int.cell-1.dub.xnood.com
INFO[0000] New DNS zone created cmd=ns1:zone:add id=ext.cell-1.dub.xnood.com
... because this command is not idempotent:
vb.customize ["storagectl", :id, "--name", "SATA", "--add", "sata"]
This task needs thecodeteam/libstorage#325 to be released. Find below the changes needed:
diff --git a/udata/fragments.go b/udata/fragments.go
index aa033a0..a62b0cf 100644
--- a/udata/fragments.go
+++ b/udata/fragments.go
@@ -104,8 +104,14 @@ write_files:`,
{{- if .RexrayStorageDriver }}
content: |
rexray:
- storageDrivers:
- - {{.RexrayStorageDriver}}
+ logLevel: warn
+ libstorage:
+ embedded: true
+ service: {{.RexrayStorageDriver}}
+ server:
+ services:
+ virtualbox:
+ driver: virtualbox
virtualbox:
endpoint: http://` + d.RexrayEndpointIP + `:18083
volumePath: ` + os.Getenv("HOME") + `/VirtualBox Volumes
@@ -1025,14 +1031,14 @@ coreos:
Restart=always
RestartSec=10
TimeoutStartSec=0
+ KillMode=process
EnvironmentFile=/etc/rexray/rexray.env
- ExecStartPre=-/bin/bash -c '\
- REXRAY_URL=https://emccode.bintray.com/rexray/stable/0.3.3/rexray-Linux-i386-0.3.3.tar.gz; \
- [ -f /opt/bin/rexray ] || { curl -sL $${REXRAY_URL} | tar -xz -C /opt/bin; }; \
- [ -x /opt/bin/rexray ] || { chmod +x /opt/bin/rexray; }'
+ Environment=URL=https://dl.bintray.com/emccode/rexray/stable/0.6.0/rexray-Linux-x86_64-0.6.0.tar.gz
+ ExecStartPre=-/bin/bash -c " \
+ [ -f /opt/bin/rexray ] || { curl -sL ${URL} | tar -xz -C /opt/bin; }; \
+ [ -x /opt/bin/rexray ] || { chmod +x /opt/bin/rexray; }"
ExecStart=/opt/bin/rexray start -f
ExecReload=/bin/kill -HUP $MAINPID
- KillMode=process
[Install]
WantedBy=kato.target`,
Zones must exist before Káto can create DNS records.
... because I use systemd and rkt plays nicely with it.
It will be very useful to have a table with all the services in the cluster and their ports.
If Káto is using --flannel-backend host-gw then the SourceDestCheck instance attribute must be disabled.
Experiment with multiple availability zones in EC2:
3 quorum nodes, 1 zone each.
3 master nodes, 1 zone each.
n workers equally distributed.
With 3 AZs the system should be able to tolerate one failure.
Create a diagram to show how katoctl works internally.
A distributed, highly available service discovery & internal load balancer for distributed systems (microservices and containers). https://github.com/dcos/minuteman
Because masters and edge are nodes too.
In order to get the 'native' experience in OSX: https://github.com/mist64/xhyve
Message: Errno::ENOENT: No such file or directory - vboxwebsrv -H 0.0.0.0 -b
How to solve this problem?
Check this task and comment to understand why this is needed.
This is a major change and requires changing the way container images, networking and storage volumes are managed. It also implies re-thinking every situation where the Docker Unix socket endpoint is used (such as in Jenkins slaves). Lots of work ahead...
vboxwebsrv.exe: error: Unknown option: '-b'
My environment is Windows 10, VirtualBox 5.1.18 and Vagrant 1.9.3, which produces the above error.
Master nodes will persist Prometheus data.
Edge nodes will persist MongoDB data.
Draw a new version using the previous Lucidchart diagram and publish it in DropBox.
How to diagnose real issues and their solutions.
Script to be used as ExecStartPre which will check whether a healthy quorum of ZooKeeper servers is up and running. It will retry a few times before giving up.
Attach worker nodes to an elastic load balancer.
Sometimes more than one HAProxy process is running inside the same marathon-lb container. The older process should die after being drained, but for some reason it doesn't. This prevents zdd from progressing. A healthy system looks like the one below:
core@worker-1 ~ $ loopssh worker "docker exec -i marathon-lb ps auxf | grep 'haproxy -p'"
--[ worker-1.cell-1.dub.xnood.com ]--
root 2367 0.0 0.1 40012 7756 ? Ss 10:18 0:00 haproxy -p /tmp/haproxy.pid -f /marathon-lb/haproxy.cfg -D -sf 2278
--[ worker-2.cell-1.dub.xnood.com ]--
root 2531 0.0 0.1 40012 7708 ? Ss 10:18 0:00 haproxy -p /tmp/haproxy.pid -f /marathon-lb/haproxy.cfg -D -sf 2441
--[ worker-3.cell-1.dub.xnood.com ]--
root 4775 0.0 0.1 40012 7772 ? Ss 10:18 0:00 haproxy -p /tmp/haproxy.pid -f /marathon-lb/haproxy.cfg -D -sf 4676
This task is to update marathon-lb to a patched version whenever mesosphere/marathon-lb#267 is fixed and they cut a new release.
Also check mesosphere/marathon-lb#318
The quadruplet custom parser is defined but has no parsing implementation.
The aim of this task is to implement the parsing logic.
Obviously these two services collide when you try to deploy an all-in-one.
The -alertmanager.url flag accepts a comma-separated list of URLs and/or can be set multiple times.
The current set of instructions is outdated and won't work.
Amend systemd unit fragments whenever https://issues.apache.org/jira/browse/MESOS-6212 is fixed.
I am not sure about masters and edge nodes.
The awscli will be containerised and wrapped into a shell script.
CoreOS stable (1122.2.0)
Failed Units: 1
getcerts.service
core@worker-1 ~ $ systemctl status getcerts.service
● getcerts.service - Get certificates from private S3 bucket
Loaded: loaded (/etc/systemd/system/getcerts.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2016-09-20 09:20:54 UTC; 13min ago
Process: 3976 ExecStart=/opt/bin/getcerts (code=exited, status=125)
Main PID: 3976 (code=exited, status=125)
Sep 20 09:20:36 worker-1.cell-1.dub.xnood.com getcerts[3976]: e1c8150b89d0: Retrying in 2 seconds
Sep 20 09:20:37 worker-1.cell-1.dub.xnood.com getcerts[3976]: e1c8150b89d0: Retrying in 1 seconds
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com getcerts[3976]: e1c8150b89d0: Downloading
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com getcerts[3976]: e1c8150b89d0: Downloading
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com getcerts[3976]: docker: dial tcp: lookup auth.docker.io: Temporary failure in name resoluti
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com getcerts[3976]: See 'docker run --help'.
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com systemd[1]: getcerts.service: Main process exited, code=exited, status=125/n/a
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com systemd[1]: Failed to start Get certificates from private S3 bucket.
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com systemd[1]: getcerts.service: Unit entered failed state.
Sep 20 09:20:54 worker-1.cell-1.dub.xnood.com systemd[1]: getcerts.service: Failed with result 'exit-code'.
Use an Alpine Linux base image to package zookeeper.
In order to test the Zero Downtime Deployment
Wait for Docker 1.12.0 (moby/moby/pull/21361) and use:
docker volume rm $(docker volume ls -q -f driver=local -f dangling=true)
By doing so, the garbage collector won't delete REX-Ray volumes. Only local volumes will be deleted.
Every sub-command must validate its input.
If --ami-id is set, use it. Otherwise, run the discovery procedure based on Channel and Region.