For my homelab I’m running an over-engineered one-node Kubernetes “cluster” using Cilium as the Container Network Interface (CNI). Up until recently I used MetalLB for LoadBalancer IP Address Management (LB-IPAM) and L2 announcements for Address Resolution Protocol (ARP) requests over the local network, but Cilium has now replaced this functionality.
Personally I really like Cilium because of its use of eBPF 🐝, and frankly also for its logo, since hexagons are bestagons.
Cilium 1.13 introduced LB-IPAM support and 1.14 added L2 announcement capabilities, making MetalLB superfluous for my setup. Cilium also has support for Border Gateway Protocol (BGP) if you need that.
LB-IPAM#
LB-IPAM functionality is enabled by default, and you only need to create an IP-pool to allow Cilium to start assigning IPs to LoadBalancer services:
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
spec:
  blocks:
    - cidr: 192.168.1.128/25
If you need more control over which services are assigned IPs from which IP-pool you can use e.g. label matching, as described in the Cilium LB-IPAM documentation; a sketch of what that could look like follows below.
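For illustration, a pool restricted with a serviceSelector could look something like this (the pool name, label and CIDR here are made up for the example); only services carrying the matching label would get IPs from this pool:

apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: example-pool # hypothetical name
spec:
  # Only hand out IPs to services with this (made-up) label
  serviceSelector:
    matchLabels:
      ip-pool: example
  blocks:
    - cidr: 192.168.1.240/28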
To avoid having to change my network settings I wanted Cilium to assign the same IPs to some of my services as MetalLB did. This was easily done by changing the Service annotations from
kind: Service
apiVersion: v1
metadata:
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.1.153
    metallb.universe.tf/allow-shared-ip: pi-hole
to
kind: Service
apiVersion: v1
metadata:
  annotations:
    io.cilium/lb-ipam-ips: 192.168.1.153
    io.cilium/lb-ipam-sharing-key: pi-hole
Note that the annotation is used instead of the .spec.loadBalancerIP field, which has been deprecated since Kubernetes v1.24 and will be removed in a future release.

Here I've also included an annotation to allow two services to share the same IP. This allowed me to circumvent an issue with mixed-protocol services, which has now been fixed and stable since Kubernetes v1.26.
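As a sketch of what the sharing key makes possible (the service names, selector and web port here are hypothetical, not my actual manifests), two services can be bound to the same address as long as their ports don't clash:

apiVersion: v1
kind: Service
metadata:
  name: pi-hole-dns # hypothetical name
  annotations:
    io.cilium/lb-ipam-ips: 192.168.1.153
    io.cilium/lb-ipam-sharing-key: pi-hole
spec:
  type: LoadBalancer
  selector:
    app: pi-hole # hypothetical selector
  ports:
    - name: dns-tcp
      port: 53
      protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: pi-hole-web # hypothetical name
  annotations:
    io.cilium/lb-ipam-ips: 192.168.1.153
    io.cilium/lb-ipam-sharing-key: pi-hole
spec:
  type: LoadBalancer
  selector:
    app: pi-hole
  ports:
    - name: http
      port: 80
      protocol: TCP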
Limitations#
Using MetalLB I was able to expose both UDP and TCP connections on the same port, but unfortunately this isn’t supported by Cilium (yet) according to a GitHub issue.
Running both UDP and TCP on port 53 is the standard for DNS. Since I run a Pi-hole DNS instance in the cluster, I relied on both UDP and TCP being exposed on the same port. Doing some research, however, reveals that even though DNS mostly uses UDP, it falls back to TCP if either the packet size is too big or UDP isn't available (RFC1034). With this in mind I decided it's acceptable to take the (negligible) overhead hit of running DNS only over TCP.
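To sanity-check that a TCP-only service still answers DNS queries, you can force dig to use TCP (substitute your own LoadBalancer IP and a domain of your choice):

dig +tcp @192.168.1.153 example.com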
L2 Announcements#
For Cilium to perform L2 announcements and reply to ARP requests it must run in kube-proxy replacement mode. We also need to explicitly enable the feature in the Helm chart values.yaml file:
kubeProxyReplacement: true
l2announcements:
  enabled: true
externalIPs:
  enabled: true
The above config enables Cilium to announce IPs from services' .status.loadBalancer.ingress field, i.e. the IPs assigned by LB-IPAM, and external IPs from a service's manually assigned .spec.externalIPs field.
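I apply these values through ArgoCD (see the summary below), but if you manage Cilium directly with Helm, something like this should work:

helm repo add cilium https://helm.cilium.io
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --version 1.14.4 \
  -f values.yaml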
Next we also need to define a CiliumL2AnnouncementPolicy to actually announce the IPs:
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  interfaces:
    - enp0s25
  externalIPs: true
  loadBalancerIPs: true
Here I’ve explicitly set it to only announce over the enp0s25 interface, which is the physical network card. You can list all network interfaces by running ip link. Leaving .spec.interfaces empty announces over all interfaces. For more details read the documentation here.
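To verify that everything is wired up, you can check that the services got IPs assigned and that Cilium created a lease for each announced service (the leases follow the cilium-l2announce-<namespace>-<service> naming pattern):

# The EXTERNAL-IP column should show the LB-IPAM address
kubectl get svc -A
# One lease per announced service
kubectl get leases -n kube-system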
Caveats#
Enabling L2 announcements can significantly increase API traffic depending on your configuration, as explained in the Cilium L2 announcements documentation.
I already have quite a few services running in my homelab and skimmed over this important detail to begin with, which resulted in connectivity/discovery issues.
Looking at the logs of the cilium-agent container by running

kubectl logs -l app.kubernetes.io/name=cilium-agent -c cilium-agent -n kube-system --tail 128

I saw multiple error- and info-level messages indicating I had hit some kind of rate limit:
level=error msg="error retrieving resource lock kube-system/cilium-l2announce-...: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline" subsys=klog
level=error msg="error retrieving resource lock kube-system/cilium-l2announce-...: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline" subsys=klog
level=info msg="failed to renew lease kube-system/cilium-l2announce-...: timed out waiting for the condition" subsys=klog
level=info msg="failed to renew lease kube-system/cilium-l2announce-...: timed out waiting for the condition" subsys=klog
...
level=info msg="Waited for 7.593379216s due to client-side throttling, not priority and fairness, request: PUT:https://...:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cilium-l2announce-kube-system-hubble-peer" subsys=klog
This was easily remedied by increasing the k8sClientRateLimit tenfold:
k8sClientRateLimit:
  qps: 50
  burst: 100
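If raising the client rate limit isn't enough, the same documentation also describes making the lease renewals themselves less frequent. In the Helm values that would look roughly like this (the durations are illustrative examples, not what I run):

l2announcements:
  enabled: true
  # Longer lease lifetimes mean fewer renewal requests to the API server,
  # at the cost of slower failover between nodes.
  leaseDuration: 300s
  leaseRenewDeadline: 60s
  leaseRetryPeriod: 10s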
Summary#
My current Cilium configuration can be found on GitHub, but for posterity the below configuration should enable Cilium with LB-IPAM and L2 announcements using ArgoCD with Kustomize + Helm:
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ip-pool.yaml
  - announce.yaml
helmCharts:
  - name: cilium
    repo: https://helm.cilium.io
    version: 1.14.4
    releaseName: "cilium"
    namespace: kube-system
    valuesFile: values.yaml
# values.yaml
k8sServiceHost: <-- HOST IP -->
k8sServicePort: 6443
kubeProxyReplacement: true
# Roll out cilium agent pods automatically when ConfigMap is updated.
rollOutCiliumPods: true
# Increase rate limit when doing L2 announcements
k8sClientRateLimit:
  qps: 50
  burst: 100
# Announce IPs from services' `.status.loadBalancer.ingress` field (automatically assigned by LB-IPAM).
l2announcements:
  enabled: true
# Announce manually assigned IPs from services' `.spec.externalIPs` field
externalIPs:
  enabled: true
operator:
  # Can't have more replicas than nodes
  replicas: 1
# ip-pool.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
  namespace: kube-system
spec:
  blocks:
    - cidr: <-- VALID CIDR RANGE FOR YOUR NETWORK -->
    - start: <-- START IP -->
      stop: <-- END IP -->
# announce.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  externalIPs: true
  loadBalancerIPs: true
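If you're not using ArgoCD, the same Kustomization can be rendered and applied by hand; note that the helmCharts field requires Helm support to be enabled in kustomize:

kustomize build --enable-helm . | kubectl apply -f -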
Hands-on lab#
If you’re interested in more details, Isovalent, the company behind Cilium, has created multiple hands-on labs for getting started with Cilium, including one about implementing LB-IPAM and L2 service announcements, which is helpful for better understanding what’s happening under the hood.