Migrating from MetalLB to Cilium

Vegard S. Hagen · 998 words · 5 mins

For my homelab I’m running an over-engineered one-node Kubernetes “cluster” using Cilium as the Container Network Interface (CNI). Up until recently I used MetalLB for LoadBalancer IP Address Management (LB-IPAM) and L2 announcements for Address Resolution Protocol (ARP) requests over the local network, but Cilium has now replaced this functionality.

Personally I really like Cilium because of its use of eBPF 🐝 and frankly also for its logo, since hexagons are bestagons.

Cilium 1.13 introduced LB-IPAM support and 1.14 added L2 announcement capabilities, making MetalLB superfluous for my setup. Cilium also has support for Border Gateway Protocol (BGP) if you need that.

LB-IPAM
#

LB-IPAM functionality is enabled by default, so you only need to create an IP pool to let Cilium start assigning IPs to LoadBalancer services:

apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
spec:
  blocks:
    - cidr: 192.168.1.128/25
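
To sanity-check the pool you can, assuming kubectl access to the cluster, list the pools and watch services pick up addresses; the commands below are just one way of doing so.

# List the configured pools and their status
kubectl get ciliumloadbalancerippools.cilium.io

# LoadBalancer services should now show an address in the EXTERNAL-IP column
kubectl get svc --all-namespaces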

If you need more control over which services are assigned addresses from which IP pool, you can use e.g. label matching, as described in the Cilium LB-IPAM documentation.
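
As a rough sketch, a pool can be limited to services carrying a specific label via a serviceSelector; the pool name, CIDR, and label below are made up for illustration.

apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: dns-pool               # hypothetical pool name
spec:
  blocks:
    - cidr: 192.168.1.240/28   # illustrative CIDR
  # Only hand out IPs to services labelled app=pi-hole
  serviceSelector:
    matchLabels:
      app: pi-hole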

To avoid having to change my network settings I wanted Cilium to assign the same IPs as MetalLB did to some of my services. This was easily done by changing the Service annotations from

kind: Service
apiVersion: v1
metadata:
  annotations:
    metallb.universe.tf/loadBalancerIPs: 192.168.1.153
    metallb.universe.tf/allow-shared-ip: pi-hole

to

kind: Service
apiVersion: v1
metadata:
  annotations:
    io.cilium/lb-ipam-ips: 192.168.1.153
    io.cilium/lb-ipam-sharing-key: pi-hole
Avoid using .spec.loadBalancerIP, which has been deprecated since Kubernetes v1.24 and will be removed in a future release.

Here I’ve also included an annotation that allows two services to share the same IP. This let me work around an issue with mixed-protocol services, which has since been fixed and is stable as of Kubernetes v1.26.
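
For illustration, the sketch below shows how the sharing key lets two LoadBalancer services end up on the same address; the service names, selector, and web port are made up and not taken from my actual manifests.

kind: Service
apiVersion: v1
metadata:
  name: pi-hole-dns-tcp            # hypothetical service name
  annotations:
    io.cilium/lb-ipam-ips: 192.168.1.153
    io.cilium/lb-ipam-sharing-key: pi-hole
spec:
  type: LoadBalancer
  selector:
    app: pi-hole                   # assumes pods labelled app=pi-hole
  ports:
    - name: dns-tcp
      protocol: TCP
      port: 53
---
kind: Service
apiVersion: v1
metadata:
  name: pi-hole-web                # hypothetical service name
  annotations:
    io.cilium/lb-ipam-ips: 192.168.1.153
    io.cilium/lb-ipam-sharing-key: pi-hole
spec:
  type: LoadBalancer
  selector:
    app: pi-hole
  ports:
    - name: http
      protocol: TCP
      port: 80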

Limitations
#

Update 2024-03-09: This is no longer an issue using Cilium 1.15.1 and Kubernetes 1.29.1. Thanks to Reddit user u/spooge_mcnubbins for informing me!

Using MetalLB I was able to expose both UDP and TCP connections on the same port, but unfortunately this isn’t supported by Cilium (yet) according to a GitHub issue.

Running both UDP and TCP on port 53 is the standard for DNS. Since I run a Pi-hole DNS instance in the cluster, I relied on both UDP and TCP being exposed on the same port. Doing some research, however, reveals that even though DNS mostly uses UDP, it falls back to TCP if either the packet size is too big or UDP isn’t available (RFC 1034). With this in mind I decided it’s acceptable to take the (negligible) overhead hit of running DNS only over TCP.

L2 Announcements
#

For Cilium to perform L2 announcements and reply to ARP requests it must run in kube-proxy replacement mode. We also need to explicitly enable the feature in the Helm chart values.yaml file:

kubeProxyReplacement: true

l2announcements:
  enabled: true

externalIPs:
  enabled: true

The above config enables Cilium to announce IPs from services’ .status.loadBalancer.ingress field, i.e. the IPs assigned by LB-IPAM, and external IPs from a service’s manually assigned .spec.externalIPs field.
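
Before relying on L2 announcements it can be worth verifying that kube-proxy replacement is actually active; one way, assuming the default cilium DaemonSet name and the agent CLI bundled with this chart version, is to query the agent status.

# Ask the Cilium agent whether kube-proxy replacement is enabled
kubectl -n kube-system exec ds/cilium -- cilium status | grep KubeProxyReplacement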

Next we also need to define a CiliumL2AnnouncementPolicy to actually announce the IPs:

apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  interfaces:
    - enp0s25
  externalIPs: true
  loadBalancerIPs: true

Here I’ve explicitly set it to only announce over the enp0s25 interface, which is the physical network card. You can list all network interfaces by running:

ip link

Leaving .spec.interfaces empty announces over all interfaces. For more details read the documentation here.
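
Once the policy is applied, Cilium coordinates announcements through Kubernetes leases, which also gives you a quick way to see which services are actually being announced (the cilium-l2announce- prefix shows up again in the agent logs further down).

# Each announced service gets a lease named cilium-l2announce-<namespace>-<service>
kubectl get leases -n kube-system | grep cilium-l2announce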

Caveats
#

Enabling L2 announcements can significantly increase API traffic depending on your configuration, as explained in the Cilium L2 announcements documentation.

I already have quite a few services running in my homelab and skimmed over this important detail to begin with, which resulted in connectivity/discovery issues.

Looking at the logs of the cilium-agent container by running

kubectl logs -l app.kubernetes.io/name=cilium-agent -c cilium-agent -n kube-system --tail 128

I saw multiple error- and info-level messages indicating that I had hit some kind of rate limit:

error retrieving resource lock kube-system/cilium-l2announce-...: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
level=error msg="error retrieving resource lock kube-system/cilium-l2announce-...: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline" subsys=klog
error retrieving resource lock kube-system/cilium-l2announce-...: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
level=error msg="error retrieving resource lock kube-system/cilium-l2announce-...: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline" subsys=klog
level=info msg="failed to renew lease kube-system/cilium-l2announce-...: timed out waiting for the condition" subsys=klog
level=info msg="failed to renew lease kube-system/cilium-l2announce-...: timed out waiting for the condition" subsys=klog
...
level=info msg="Waited for 7.593379216s due to client-side throttling, not priority and fairness, request: PUT:https://...:6443/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/cilium-l2announce-kube-system-hubble-peer" subsys=klog

This was easily remedied by increasing the k8sClientRateLimit tenfold:

k8sClientRateLimit:
  qps: 50
  burst: 100
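
If you want to confirm the new limits were actually picked up, the rendered settings end up in the cilium-config ConfigMap (exact key names may vary between chart versions); rollOutCiliumPods, used in the summary below, takes care of restarting the agents when the ConfigMap changes.

# The applied client rate limit should show up in the cilium-config ConfigMap
kubectl get configmap cilium-config -n kube-system -o yaml | grep -i k8s-client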

Summary
#

My current Cilium configuration can be found on GitHub, but for posterity, the configuration below should enable Cilium with LB-IPAM and L2 announcements using ArgoCD with Kustomize + Helm:

# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ip-pool.yaml
  - announce.yaml

helmCharts:
  - name: cilium
    repo: https://helm.cilium.io
    version: 1.14.4
    releaseName: "cilium"
    namespace: kube-system
    valuesFile: values.yaml
# values.yaml
k8sServiceHost: <-- HOST IP -->
k8sServicePort: 6443

kubeProxyReplacement: true

# Roll out cilium agent pods automatically when ConfigMap is updated.
rollOutCiliumPods: true

# Increase rate limit when doing L2 announcements
k8sClientRateLimit:
  qps: 50
  burst: 100

# Announce IPs from services' `.status.loadBalancer.ingress` field (automatically assigned by LB-IPAM).
l2announcements:
  enabled: true

# Announce manually assigned IPs from services' `.spec.externalIPs` field
externalIPs:
  enabled: true

operator:
  # Can't have more replicas than nodes
  replicas: 1
# ip-pool.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
  namespace: kube-system
spec:
  blocks:
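    # A block can be given either as a CIDR or as an explicit start/stop range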
    - cidr: <-- VALID CIDR RANGE FOR YOUR NETWORK -->
    - start: <-- START IP -->
      stop: <-- END IP -->
# announce.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  externalIPs: true
  loadBalancerIPs: true
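
Although I apply this through ArgoCD, the same directory can be rendered and applied by hand for a quick test, assuming a kustomize CLI new enough to support --enable-helm.

# Render the Helm chart together with the extra manifests and apply everything
kustomize build --enable-helm . | kubectl apply -f -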

Hands-on lab
#

If you’re interested in more details, Isovalent, the company behind Cilium, has created multiple hands-on labs for getting started with Cilium, including one on implementing LB-IPAM and L2 service announcements, which is helpful for better understanding what’s happening under the hood.