alibaba / hybridnet
Make underlay and overlay network can coexist, communicate, even be transformed purposefully.
License: Apache License 2.0
Type: feature request
Use charts to deploy hybridnet, including
Type: feature request
Currently the rama webhook takes full control of the webhook configurations, including their creation and update.
However, all certificates (root CA, TLS cert, TLS key) still have to be generated and configured through a Secret, which is troublesome for users and is not the usual Kubernetes way.
cert-manager should therefore be introduced to automate deployment. Installing cert-manager will become a required step before installing rama.
Type: feature request
Reducing the number of direct (uncached) requests to the apiserver during the manager's pod reconciliation can significantly increase IP allocation performance.
Currently one pod reconciliation issues four such requests. It is very likely that this can be reduced to one, which would give a theoretical fourfold speed-up of IP allocation.
What we might need mostly:
Type: feature request
When hybridnet runs in dual-stack mode, some fixed default configurations may become inconvenient for users.
We should identify all of them and make them configurable.
Type: feature request
Type: feature request
Multi-tenancy is a common topic in Kubernetes, and some container networking solutions already offer related capabilities.
If this is done, the following advantages are gained,
Type: bug report
An underlay Network (which still has enough IP addresses to allocate) has 17 nodes; 16 of them carry the "networking.alibaba.com/address-quota: empty" label, while one carries the "networking.alibaba.com/address-quota: noempty" label.
The "noempty" one is the last item of the network.status.nodeList.
All of the nodes have the "networking.alibaba.com/address-quota: noempty" label.
Type: feature request
When creating a kata pod, the kata runtime fails to add an IPv4 route to the kata agent:
"Failed to add IP v4 route (src: , dst: , gtw: 169.254.1.1,Err: Network unreachable (os error 101))"
Can you remove the onlink flag from routes in the pod netns, so that kata containers can be created?
Type: feature request
Currently hybridnet chooses the preferred host interface through two flags, which are not flexible enough.
We need a new CRD that can be attached to subnets.
Type: feature request
Allow specifying a subnet without a network, so that only the subnet-specify annotation is needed.
Type: feature request
Currently, if the webhook is down, all pods are blocked from creation. But most pods can actually be handled by the controller even when the webhook is inactive.
Type: bug report
Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
There are two types of nodes: some use eth1 as the node's container networking NIC, and others use eth0.
I want to use "--prefer-interfaces=eth0,eth1" with a single rama-daemon DaemonSet to support all nodes, but it fails: pods are stuck in ContainerCreating with the error "ensure vlan interface error: Link not found".
Pods can be created correctly on all nodes.
Deploy a rama-daemon DaemonSet with "--prefer-interfaces=eth0,eth1" in an environment where NIC names differ between nodes.
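The behaviour the reporter expects from "--prefer-interfaces=eth0,eth1" can be sketched as follows. This is only an illustration of "first preferred name that exists on this node wins"; the function name pickPreferred and the hard-coded link lists are hypothetical and are not taken from the rama-daemon source.

```go
package main

import (
	"fmt"
	"strings"
)

// pickPreferred returns the first name in the comma-separated preferred list
// that is present in existing, so every node resolves its own NIC.
// Hypothetical helper for illustration only.
func pickPreferred(preferred string, existing []string) (string, bool) {
	present := make(map[string]bool, len(existing))
	for _, name := range existing {
		present[name] = true
	}
	for _, name := range strings.Split(preferred, ",") {
		name = strings.TrimSpace(name)
		if name != "" && present[name] {
			return name, true
		}
	}
	return "", false
}

func main() {
	// Assumed node inventories; real code would list the links on the host.
	nodeA := []string{"lo", "eth0"}
	nodeB := []string{"lo", "eth1"}
	for _, links := range [][]string{nodeA, nodeB} {
		name, ok := pickPreferred("eth0,eth1", links)
		fmt.Println(name, ok)
	}
}
```

With this rule a node that only has eth1 skips the missing eth0 instead of failing with a "Link not found" style error.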
NONE
OS (cat /etc/os-release): CentOS
Kernel (uname -a): 4.19
Type: feature request
If one network has several subnets, some IPv4 and some IPv6, dual-stack allocation loops to find an available IPv4 subnet and an available IPv6 subnet, so the subnet index (the last allocated subnet) increases quickly and is not continuous.
Type: feature request
Hybridnet currently supports specifying network-type/network/subnet for a single pod. It may also be necessary to support specifying network-type/network/subnet for a batch of pods in the same namespace.
Type: feature request
Type: feature request
Add a "can-reach" parameter so the daemon can choose the host NIC, as Calico does.
Type: bug report
Type: feature request
Type: bug report
For an IPv4/IPv6 dual-stack pod, subnets cannot be specified with the "networking.alibaba.com/specified-subnet"="<v4subnet>/<v6subnet>" annotation, because the webhook returns a "specified subnet not found" error.
It seems the webhook treats the two subnets as one.
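A minimal sketch of the parsing the report implies: split the annotation value on "/" and look up each subnet name separately rather than treating the whole string as one name. The function splitSpecifiedSubnets and the sample subnet names are hypothetical; the actual webhook code may be organised differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSpecifiedSubnets splits a "<v4subnet>/<v6subnet>" annotation value into
// individual subnet names. A value without "/" yields a single name.
// Hypothetical helper for illustration only.
func splitSpecifiedSubnets(value string) ([]string, error) {
	parts := strings.Split(value, "/")
	if len(parts) > 2 {
		return nil, fmt.Errorf("expected at most two subnet names, got %d", len(parts))
	}
	for i, p := range parts {
		parts[i] = strings.TrimSpace(p)
		if parts[i] == "" {
			return nil, fmt.Errorf("empty subnet name in %q", value)
		}
	}
	return parts, nil
}

func main() {
	// Assumed subnet names, chosen only for the example.
	names, err := splitSpecifiedSubnets("v4-subnet/v6-subnet")
	fmt.Println(names, err)
}
```

Each returned name can then be validated on its own, which avoids a lookup for a subnet literally named "v4-subnet/v6-subnet".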
Pod can be created successfully.
Like above.
Type: bug report
hybrid-daemon policy container reporting "can not access kubernetes service, exiting" error
hybrid-daemon running normally
I just ran the command:
helm install hybridnet hybridnet/hybridnet -n kube-system
and the error occurred
OS (cat /etc/os-release): Ubuntu 20.04 LTS
Kernel (uname -a): 5.4.0-105-generic
Type: feature request
Currently one BGP-mode hybridnet Network has to be created for each ToR. This cannot serve Pods that need both static IP addresses and the ability to be scheduled cluster-wide.
This is achievable with BGP; all that needs to change is the way routes are advertised.
Type: feature request
Creating a /32 or /128 Subnet should be rejected by webhook validation, because such subnets might cause unexpected errors.
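The proposed check can be sketched with the Go standard library alone. The function name validateSubnetCIDR is hypothetical, and how the real webhook wires such a check into Subnet validation is not shown here.

```go
package main

import (
	"fmt"
	"net"
)

// validateSubnetCIDR rejects CIDRs whose prefix covers every address bit,
// that is /32 for IPv4 and /128 for IPv6.
// Hypothetical helper for illustration only.
func validateSubnetCIDR(cidr string) error {
	_, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return fmt.Errorf("invalid CIDR %q: %v", cidr, err)
	}
	ones, bits := ipNet.Mask.Size()
	if ones == bits {
		return fmt.Errorf("subnet %s uses a /%d prefix, which is not allowed", cidr, ones)
	}
	return nil
}

func main() {
	for _, cidr := range []string{"192.168.1.0/24", "192.168.1.1/32", "fd00::1/128"} {
		fmt.Println(cidr, validateSubnetCIDR(cidr))
	}
}
```

Comparing the prefix length with the total bit count handles both address families with one condition.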
Type: feature request
BGP is nowadays the most popular choice for large-scale clusters. It offers an almost hands-free way to initialize the container network and is also much more convenient to maintain.
Type: feature request
Currently a subnet change triggers an address range validation that is not precise enough; ReservedIPs and ExcludedIPs should be taken into account, too.
NONE
Type: feature request
Type: bug report
An overlay pod accesses an underlay pod through a masquerade path rather than a VXLAN tunnel when the underlay pod's Network does not include the node that the overlay pod is on.
Overlay pods should access underlay pods through the VXLAN tunnel, without masquerade.
Creating two underlay Networks and one overlay Network reproduces it.
Type: feature request
This is not actually an issue or a bug.
Because hybridnet uses ".vxlan4" as the fixed suffix of the generated vxlan interface name, we should state the requirement clearly: the name of the vxlan parent interface must be shorter than 8 characters.
Type: feature request
In current hybridnet, the context is not passed down the call chain, so a controller-runtime client usually uses context.TODO() or context.Background() instead. This is non-standard; we should keep context awareness everywhere.
After that, when the manager is shutting down, all client-related goroutines will receive a context signal and quit immediately.
Type: bug report
The enhanced address is used when pinging local pods from the node, so the ICMP reply never comes back.
Type: bug report
This might be a compatibility problem.
I set up a k8s cluster on CentOS 8 (which links iptables to nftables) and ran pods on Overlay and Underlay networks. On the host machines, lsmod | grep ip_tables shows that ip_tables is used by iptable_nat, iptable_mangle, and iptable_filter.
After checking the logs of the daemon pods, I believe the iptables rules are written without error. But on the host machine, no rama-related iptables rules are shown by either iptables-save or nft rule list.
Notably, iptables-save warns that there are more rules in iptables-legacy.
I should observe rama-related rules in iptables-save.
Set up a k8s cluster with CentOS 8 nodes. Install rama and run several Overlay/Underlay pods.
CentOS 8 removes iptables from packages and links it to nftables.
Kube-proxy works perfectly.
OS (cat /etc/os-release): CentOS 8
Kernel (uname -a): Linux 4.18.0-305.7.1.el8_4.x86_6
Type: feature request
Type: bug report
node1: 11.166.83.9
node2: 11.166.83.20
svc: nodePort 30991, only have one pod backend, pod ip: 100.88.253.95 on node2
network: vlanid 701
master interface: bond0
test network access: from node1 to node2:30991
#curl 11.166.83.20:30991
404 page not found
tcpdump on interface bond0.701 and got unexpected traffic
#tcpdump -nv -i bond0.701 host 11.166.83.9 and host 11.166.83.20 and port 30991
tcpdump: listening on bond0.701, link-type EN10MB (Ethernet), capture size 262144 bytes
16:09:03.511036 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
11.166.83.20.30991 > 11.166.83.9.37957: Flags [S.], cksum 0xbd97 (incorrect -> 0xcde4), seq 2388492280, ack 4182406673, win 28960, options [mss 1460,sackOK,TS val 522611790 ecr 2192028254,nop,wscale 7], length 0
16:09:03.511207 IP (tos 0x0, ttl 63, id 5047, offset 0, flags [DF], proto TCP (6), length 52)
11.166.83.20.30991 > 11.166.83.9.37957: Flags [.], cksum 0xbd8f (incorrect -> 0x6c9c), ack 83, win 227, options [nop,nop,TS val 522611790 ecr 2192028254], length 0
16:09:03.511374 IP (tos 0x0, ttl 63, id 5048, offset 0, flags [DF], proto TCP (6), length 228)
11.166.83.20.30991 > 11.166.83.9.37957: Flags [P.], cksum 0xbe3f (incorrect -> 0x4865), seq 1:177, ack 83, win 227, options [nop,nop,TS val 522611790 ecr 2192028254], length 176
16:09:03.511649 IP (tos 0x0, ttl 63, id 5049, offset 0, flags [DF], proto TCP (6), length 52)
11.166.83.20.30991 > 11.166.83.9.37957: Flags [F.], cksum 0xbd8f (incorrect -> 0x6be9), seq 177, ack 84, win 227, options [nop,nop,TS val 522611790 ecr 2192028255], length 0
The reply packets above all went out through bond0.701 after routing.
But packets between nodes are expected to go through bond0.
All reply packets go through bond0, not bond0.701.
Access a nodePort across nodes in underlay network mode.
Sometimes the traffic takes this unexpected path but still works, because the container network router receives the packets and forwards them to the nodes, provided that rp_filter and source address detection are both disabled.
Type: feature request
Currently RemoteClusterStatusChecker is a custom runnable, which means it does not support concurrency as the other reconcilers do.
This PR tries to support concurrency in RemoteClusterStatusChecker.
Type: feature request
Code-generator is too old to maintain and makes it difficult to introduce new objects.
Controller-runtime is the present and the future.
NONE
Type: feature request
try link to rama slack
Type: feature request
Type: bug report
An iterator value cannot be used directly in a closure.
Every closure should have its own input value.
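The fix described in this report usually looks like the sketch below: the loop value is passed to the closure as an argument, so each goroutine works on its own copy. Before Go 1.22 a range variable was shared by all iterations, which is what made capturing it directly unsafe. The function collect is hypothetical and not taken from the hybridnet source.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect starts one goroutine per item and hands the item over as a
// parameter, so no closure depends on the shared loop variable.
// Hypothetical function for illustration only.
func collect(items []string) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)
	for _, item := range items {
		wg.Add(1)
		go func(v string) { // v is this closure's own input value
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			out = append(out, v)
		}(item)
	}
	wg.Wait()
	sort.Strings(out) // goroutine order is not deterministic
	return out
}

func main() {
	fmt.Println(collect([]string{"c", "a", "b"}))
}
```

Passing the value explicitly behaves the same on every Go version, so it stays correct regardless of the toolchain used to build the project.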
Type: feature request
Type: bug report
The policy container of the daemon pod exits with an ip6tables-legacy-save error:
2022-05-10 08:11:59.242 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="raw"
2022-05-10 08:11:59.249 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.249 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="nat"
2022-05-10 08:11:59.249 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.249 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="filter"
2022-05-10 08:11:59.444 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.444 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="raw"
2022-05-10 08:11:59.444 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.444 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="mangle"
2022-05-10 08:11:59.451 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.451 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="nat"
2022-05-10 08:11:59.451 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.451 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="filter"
2022-05-10 08:11:59.846 [WARNING][24856] felix/table.go 814: iptables save failed error=exit status 1
2022-05-10 08:11:59.846 [WARNING][24856] felix/table.go 763: ip6tables-legacy-save command failed error=exit status 1 ipVersion=0x6 stderr="" table="mangle"
2022-05-10 08:11:59.846 [PANIC][24856] felix/table.go 769: ip6tables-legacy-save command failed after retries ipVersion=0x6 table="mangle"
panic: (*logrus.Entry) 0xc000992640
goroutine 205 [running]:
github.com/sirupsen/logrus.Entry.log(0xc00007e180, 0xc000269bc0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7f6900000000, ...)
/go/pkg/mod/github.com/projectcalico/[email protected]/entry.go:112 +0x2ff
github.com/sirupsen/logrus.(*Entry).Panic(0xc0005c3310, 0xc00091eb38, 0x1, 0x1)
/go/pkg/mod/github.com/projectcalico/[email protected]/entry.go:182 +0xfa
github.com/sirupsen/logrus.(*Entry).Panicf(0xc0005c3310, 0x27b6396, 0x1f, 0xc00091ebf8, 0x1, 0x1)
/go/pkg/mod/github.com/projectcalico/[email protected]/entry.go:230 +0xc5
github.com/projectcalico/felix/iptables.(*Table).getHashesAndRulesFromDataplane(0xc0008af000, 0xa3d43fb, 0x3caaf60)
/go/src/github.com/projectcalico/felix/iptables/table.go:769 +0x4bc
github.com/projectcalico/felix/iptables.(*Table).loadDataplaneState(0xc0008af000)
/go/src/github.com/projectcalico/felix/iptables/table.go:606 +0x1e5
github.com/projectcalico/felix/iptables.(*Table).Apply(0xc0008af000, 0xc0007d1fb0)
/go/src/github.com/projectcalico/felix/iptables/table.go:990 +0x1025
github.com/projectcalico/felix/dataplane/linux.(*InternalDataplane).apply.func3(0xc000967358, 0xc000967360, 0xc0006e4c00, 0xc000967370, 0xc0008af000)
/go/src/github.com/projectcalico/felix/dataplane/linux/int_dataplane.go:1818 +0x3c
created by github.com/projectcalico/felix/dataplane/linux.(*InternalDataplane).apply
/go/src/github.com/projectcalico/felix/dataplane/linux/int_dataplane.go:1817 +0x6de
OS (cat /etc/os-release): CentOS Linux 7
Kernel (uname -a): Linux iZ0jlbn6dzzahicroo1vwhZ 3.10.0-957.21.3.el7.x86_64
Type: feature request
Currently, if the dualstack feature gate is false, users can still create an IPv6-only environment, which makes some behaviors unpredictable. We should ensure that only an IPv4 environment can be created when the feature gate is false.
cilium/cilium#14436
That is based on IPVLAN, but this repo is based on MACVLAN. Not sure what they want to do.
Type: bug report
Type: bug report
If a pod is evicted or is a completed Job pod, its IPInstance is not released until the pod is deleted manually. It may be better for hybridnet to release such IPInstances itself.
An IPInstance can be released if a pod is "Completed" or "Evicted".
Running a Job or getting a pod evicted reproduces it.
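One possible way to express the release condition is sketched below. It assumes the usual Kubernetes representation, in which a "Completed" pod has phase Succeeded and an evicted pod has phase Failed with reason Evicted. The podStatus type is a local stand-in defined for this example, not the Kubernetes API struct, and ipReleasable is a hypothetical name.

```go
package main

import "fmt"

// podStatus mirrors only the two status fields this check needs.
// Local stand-in type for illustration, not the Kubernetes API struct.
type podStatus struct {
	Phase  string
	Reason string
}

// ipReleasable reports whether a pod's IPInstance could be released without
// waiting for the pod object to be deleted. Hypothetical helper.
func ipReleasable(s podStatus) bool {
	switch s.Phase {
	case "Succeeded": // shown as "Completed" by kubectl
		return true
	case "Failed":
		return s.Reason == "Evicted"
	}
	return false
}

func main() {
	fmt.Println(ipReleasable(podStatus{Phase: "Succeeded"}))
	fmt.Println(ipReleasable(podStatus{Phase: "Failed", Reason: "Evicted"}))
	fmt.Println(ipReleasable(podStatus{Phase: "Running"}))
}
```

Whether other failed pods should also release their addresses is a policy decision this sketch leaves out.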
Type: feature request
Currently, to deploy a dual-stack hybridnet cluster, the DualStack feature gate has to be enabled through manager/webhook parameters.
But "an IPv4-only cluster with the DualStack feature gate disabled" and "a dual-stack cluster (DualStack feature gate enabled) with only IPv4 subnets" no longer seem to differ at all. Maybe it is time to remove the DualStack feature gate and make it a built-in feature.
Type: bug report
Pods stay in "ContainerCreating" for a long time, and a "failed to set link to host netns: file exists" error appears in the output of kubectl describe po.
It happens randomly.
OS (cat /etc/os-release): CentOS 8
Type: bug report
The network path from an overlay pod to the gateway of an underlay subnet is bad: when pinging the underlay gateway from an overlay pod, an unexpected neighbor event comes from the vxlan interface.
We expect the overlay pod to reach the underlay gateway over an underlay+SNAT network path.
Ping the underlay gateway from an overlay pod.
Besides the underlay gateway, if only part of an underlay subnet is used, the remaining IPs have the same problem as the gateway.