Gateway API with Cilium and Cert-manager

Vegard S. Hagen

In the Gateway API SIG’s own words,

If you’re familiar with the older Ingress API, you can think of the Gateway API as analogous to a more-expressive next-generation version of that API.

In this article we’ll quickly review the role-oriented architecture of the Gateway API before we implement it using Cilium and Cert-manager. Other Gateway API implementations are listed on the Gateway SIG site.

We’ll mainly take a look at replacing Ingress resources for traffic from clients outside the cluster to services inside the cluster (north/south traffic). Although the Gateway API also supports so-called east/west traffic between workloads within a cluster (through the GAMMA-initiative), this is outside the scope of this article.

Before reading this article you might want to try a hands-on lab on Cilium Gateway API by Isovalent, the company behind Cilium.

Edit 2024.08.10: This article has been updated to work with Cilium v1.16.0 and Gateway API v1.1.0.

Overview

In the role-oriented design of the Gateway API, the infrastructure provider provisions a GatewayClass which the cluster operators can use to create different Gateway resources. Application developers can then connect to this Gateway using HTTPRoutes connecting to plain old Services.

---
title: Overview of the Gateway API
---
flowchart TB
  subgraph API[Gateway API]
    GC[GatewayClass] --- G[Gateway]
    G --- HR1[HTTPRoute]
    G --- HR2[HTTPRoute]
  end
  HR1 --- S1[Service]
  HR1 --- S2[Service]
  HR2 --- S3[Service]

Comparing this with the Ingress API we see that the Ingress resource has been split into the Gateway and HTTPRoute objects with different responsibilities.
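
To make the split concrete, here is a rough sketch of a classic Ingress, annotated with where each part would roughly live in the Gateway API. The names my-ingress, app.example.com and app-tls are made up for illustration; only my-service appears later in this article.

```yaml
# Illustrative Ingress; comments point to the approximate Gateway API counterpart
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress               # hypothetical name
  namespace: default
spec:
  ingressClassName: cilium       # ~ Gateway.spec.gatewayClassName (cluster operator)
  tls:                           # ~ Gateway listener tls.certificateRefs (cluster operator)
    - hosts:
        - "app.example.com"      # hypothetical hostname
      secretName: app-tls        # hypothetical Secret
  rules:                         # ~ HTTPRoute hostnames and rules (application developer)
    - host: "app.example.com"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80
```

With the Gateway API, the class and TLS parts move to the Gateway owned by the cluster operator, while the host and path rules move to HTTPRoutes owned by application developers.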

Gateway API

Kubernetes 1.30 doesn’t ship with the Gateway API CRDs (Custom Resource Definitions), so we need to add them ourselves. The Gateway API is split up into several components, and the SIG maintains releases with both standard and experimental installs that combine several of these components and should work with all compliant Gateway implementations.

You are, however, free to choose which components you want, and Cilium recommends applying the following Gateway CRDs

kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_gateways.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml

We want to use the GatewayInfrastructure field to set the Gateway LoadBalancer Service IP using an annotation. The standard Gateway CRD doesn’t include that field, so we apply the experimental Gateway CRD instead. Cilium supports this feature.

Alternatively, you can use the full experimental Gateway CRD install which should also work

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.1.0/experimental-install.yaml

Cilium

Cilium v1.16 supports Gateway API v1.1. If you’re stuck with Cilium v1.15 you should use Gateway API v1.0.

To get started, Cilium has a page on Migrating from Ingress to Gateway.

Following the documentation of Cilium we can enable Gateway support in one of two ways, either with the Cilium-CLI (≥ v0.15)

cilium install --version 1.16.0 \
    --set kubeProxyReplacement=true \
    --set gatewayAPI.enabled=true \
    --set envoy.securityContext.capabilities.keepCapNetBindService=true

or using the Helm Chart as described in the summary section.

Note that with the dedicated L7 Envoy Proxy DaemonSet enabled by default in Cilium v1.16.0, you also have to set envoy.securityContext.capabilities.keepCapNetBindService to true.

If you’re in an environment where you can’t use LoadBalancer type Services it’s now also possible to run in host network mode by adding either --set gatewayAPI.hostNetwork.enabled=true to the cilium install command above, or

gatewayAPI:
  enabled: true
  hostNetwork:
    enabled: true

in the Helm values.

If you plan to use port numbers below 1024 in host network mode, e.g. 443 for HTTPS traffic, you also need to add the NET_BIND_SERVICE Linux capability to the Envoy securityContext. The Helm values in the summary include this as a commented-out entry.

Note that the Envoy proxy should also have the NET_ADMIN and SYS_ADMIN capabilities enabled. If you’re running with Linux Kernel ≥ 5.8, and CRI-O ≥ 1.22.0 or containerd ≥ 1.5.0, you can replace SYS_ADMIN with the BPF and PERFMON capabilities as noted in the comments of the Helm Chart values.yaml file.
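
As a sketch, the host network setting and the capabilities discussed above could be combined in the Helm values roughly like this. It mirrors the values.yaml in the summary with NET_BIND_SERVICE uncommented; treat the exact capability list as something to verify against the Cilium Helm Chart for your version.

```yaml
# Sketch of Cilium Helm values for host network mode (assumes the v1.16 chart layout)
gatewayAPI:
  enabled: true
  hostNetwork:
    enabled: true
envoy:
  securityContext:
    capabilities:
      keepCapNetBindService: true
      envoy:
        - NET_ADMIN
        # Use SYS_ADMIN instead of PERFMON and BPF on older kernels or container runtimes
        - PERFMON
        - BPF
        # Needed to bind ports below 1024, e.g. 443
        - NET_BIND_SERVICE
```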

See the Cilium Gateway API documentation for more information.

Cert-manager

Gateway API support is a beta feature in the latest stable release of Cert-manager at the time of writing (v1.15.2). To enable this support we add the --enable-gateway-api flag on startup of Cert-manager. This is done by setting it as an extra argument when installing Cert-manager using its Helm Chart

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --version v1.15.2 \
    --namespace cert-manager --set crds.enabled=true --create-namespace \
    --set "extraArgs={--enable-gateway-api}"

If you don’t want to enable Gateway API support in Cert-manager you can instead manually create a Certificate resource and reference the TLS-secrets generated by that in the Gateway resource.

Configuration

Once we have the Gateway API CRDs available and enabled support for it in Cilium and Cert-manager we can finally start to create the resources we need to utilise it.

Infrastructure Provider

If a cluster-wide GatewayClass resource referencing Cilium is not already present (kubectl get gatewayclasses) we need to create one ourselves [1]

# gateway/gateway-class.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: cilium
spec:
  controllerName: io.cilium/gateway-controller

Take note of the GatewayClass name (metadata.name, here cilium) and make sure the controllerName is exactly io.cilium/gateway-controller.

If the GatewayClass is created successfully you should be able to view the supported features by running

kubectl describe gatewayclass cilium

Cluster Operator

For convenience, we’ll group the cluster operator related resources in the gateway namespace. This allows us an easy overview of our gateways and connected resources as cluster operators.

kubectl create ns gateway

TLS certificates (Cloudflare)

To automatically provision TLS certificates attached to our Gateway we can create a Cert-manager Issuer resource. This section is optional if you don’t want certificates, though it’s highly recommended!

I’ve summarised how to automatically provision wildcard certificates using Cert-manager and Let’s Encrypt in a previous article on Traefik Wildcard Certificates, so I’ll allow myself to be brief here.

Obtain a Cloudflare API token (or one from your supported DNS provider of choice) as mentioned in the above article and create a Secret for it

# gateway/cloudflare-api-token.yaml
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token
  namespace: gateway
type: Opaque
stringData:
  api-token: "<--CLOUDFLARE API TOKEN-->"

We can then reference this Secret in an Issuer resource (under apiTokenSecretRef), which enables us to complete a DNS-01 challenge and thereby issue wildcard certificates for the proven domain. Remember to provide the domain owner e-mail in the email field.

# gateway/cloudflare-issuer.yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cloudflare-issuer
  namespace: gateway
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: "<--YOUR EMAIL-->"
    privateKeySecretRef:
      name: cloudflare-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
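
While testing, it can be worth pointing a second Issuer at the Let’s Encrypt staging endpoint to avoid hitting the production rate limits. The sketch below is a copy of the Issuer above with a different server; the names cloudflare-staging-issuer and cloudflare-staging-key are placeholders chosen for this example. Certificates issued by staging are not trusted by browsers.

```yaml
# Sketch: staging variant of the Issuer above (resource names are placeholders)
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cloudflare-staging-issuer
  namespace: gateway
spec:
  acme:
    # Let's Encrypt staging endpoint, intended for testing
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: "<--YOUR EMAIL-->"
    privateKeySecretRef:
      name: cloudflare-staging-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
```

To use it, reference cloudflare-staging-issuer wherever cloudflare-issuer would otherwise be referenced, and switch back once issuance works.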

Gateway

Next we create a Gateway resource that references the cilium GatewayClass (gatewayClassName).

# gateway/gateway.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cilium-gateway
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.<--YOUR DOMAIN-->"
      tls:
        certificateRefs:
          - kind: Secret
            name: cloudflare-cert
      allowedRoutes:
        namespaces:
          from: All

The cert-manager.io/issuer annotation is picked up by Cert-manager to automatically create a Certificate resource similar to the one below

# gateway/certificate.yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: certificate
  namespace: gateway
spec:
  dnsNames:
    - "*.<--YOUR DOMAIN-->"
  issuerRef:
    group: cert-manager.io
    kind: Issuer
    name: cloudflare-issuer
  usages:
    - digital signature
    - key encipherment
  secretName: cloudflare-cert

If you didn’t enable Gateway API support in Cert-manager, you can instead create this resource manually.

The Certificate uses the cloudflare-issuer Issuer (issuerRef) and creates a TLS Secret whose name (secretName) corresponds to the one requested by the Gateway resource (the name under certificateRefs).

We need to create at least one listener per Gateway, which routes such as HTTPRoute resources can then attach to if they match. In our case we’ve created an HTTPS listener on port 443 that matches all subdomains. The tls field of the listener is picked up by Cert-manager, which will create a TLS Secret with the given name when an HTTPRoute is attached. We allow eligible HTTPRoutes from all namespaces to connect through this Gateway, though we can also create a selector for more fine-grained control.
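
As a sketch of the selector approach mentioned above, the listener can restrict attachment to namespaces carrying a given label. The label gateway-access is a hypothetical choice for this example and not something Cilium or the Gateway API defines.

```yaml
# Listener fragment: only namespaces with the chosen label may attach routes
listeners:
  - protocol: HTTPS
    port: 443
    name: https-gateway
    hostname: "*.<--YOUR DOMAIN-->"
    tls:
      certificateRefs:
        - kind: Secret
          name: cloudflare-cert
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            # Hypothetical label; add it to every namespace that should be allowed
            gateway-access: "true"
```

A namespace would then opt in with something like kubectl label namespace default gateway-access=true.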

Gateway Service

When the Gateway API is not run on the host network, a LoadBalancer type Service is created when the Gateway is picked up by the Cilium controller. Out of the box, this Service is assigned the next available IP from e.g. Cilium LB-IPAM.

For a deterministic and idempotent configuration we need to set a pre-determined specific IP, something we can do by annotating the spawned Service with io.cilium/lb-ipam-ips: "<--IP-->".

As of Cilium 1.16.0, the only way I’ve found to manipulate the spawned Service from the Gateway is to add

spec:
  infrastructure:
    annotations:
      io.cilium/lb-ipam-ips: "<--IP-->"

to the Gateway which requires the experimental Gateway CRD we applied earlier.

This approach was introduced to me by u/h_hover in a Reddit comment.

In another thread on Reddit, u/nuskovg points to the GatewayAddress field used by the GatewaySpec as a possible solution. Support for this field is at the Extended level, and u/TheGarbInC mentions an open GitHub issue for Cilium support which is still undecided.
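
For reference, this is roughly what requesting an address through the Gateway spec looks like in the Gateway API itself. It is only a sketch of the field's shape; whether Cilium honours it depends on the outcome of the issue mentioned above.

```yaml
# Sketch only: request a specific address via spec.addresses
# (Cilium support was undecided at the time of writing)
spec:
  gatewayClassName: cilium
  addresses:
    - type: IPAddress
      value: "<--IP-->"
```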

DNS
#

Now that you’ve got your Gateway and attached LoadBalancer Service set up, you want to point your DNS to the Service IP. This IP address should be the same as what you set in the Gateway infrastructure annotation field (io.cilium/lb-ipam-ips), but to make sure you can run

kubectl get svc -A | grep LoadBalancer

and find the External IP of the Service named <GatewayClass.name>-gateway-<Gateway.name>, in our case cilium-gateway-cilium-gateway [2].

Open up port 443 in your router and/or firewall to the Service IP and point your DNS to your public IP. If you don’t have your public IP at hand you can find it by running

dig +short myip.opendns.com @resolver1.opendns.com

If you can’t open ports, e.g. because you’re behind CGNAT, you can try using a tunnel like cloudflared.

If everything is set up correctly, an external web request should roughly take the following path:

---
title: An external request routed to a Gateway
---
flowchart LR
  Web --> DNS --> Router --> Service --> Gateway

First the hostname is looked up in a DNS. The DNS should respond with the IP you set up, and the request is relayed to your router. Next the router port forwards the request to the Service IP connected to the Gateway. The Gateway then presents the attached certificate and the journey continues.

---
title: Gateway routing to an eligible HTTPRoute
---
flowchart LR
  Gateway --> HTTPRouteA[HTTPRoute ɑ] --> Service --> Pod/Application
  Gateway -.-> HTTPRouteB[HTTPRoute β]
  Gateway -.-> HTTPRouteC[HTTPRoute γ]

From the Gateway the request is channeled to the correct HTTPRoute (route ɑ in this case) based on the rules you’ve set up, e.g. hostname or header matching. Next, the HTTPRoute directs the request to its attached Service, which then finally delivers the request to its destination. Hopefully the application responds with something nice.

If you don’t want to expose your public IP, you can instead use a service like cloudflared to tunnel traffic directly to your cluster. If you want to go this route you can find an example configuration here.

Application developer

Now that both the infrastructure provider and cluster operator have done their job (kudos to you!), we can let the application developers (also you) take the centre stage.

Given a Service named my-service we can create a simple HTTPRoute referencing our Gateway (under parentRefs) and the Service (under backendRefs) to expose the Service

# gateway/http-route.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-http-route
  namespace: default
spec:
  parentRefs:
    - name: cilium-gateway
      namespace: gateway
  hostnames:
    - "gateway.<--YOUR DOMAIN-->"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: my-service
          port: 80
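
The route above lives in the same namespace as my-service. If the Service were in a different namespace, the Gateway API would additionally require a ReferenceGrant in the Service’s namespace; its CRD is among those applied earlier. A minimal sketch, assuming a hypothetical namespace called backend and a made-up grant name:

```yaml
# Sketch: allow HTTPRoutes in "default" to reference Services in "backend"
# (namespace "backend" and the grant name are hypothetical)
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-routes-from-default
  namespace: backend
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      namespace: default
  to:
    # The empty group denotes the core API group that Services belong to
    - group: ""
      kind: Service
```

The HTTPRoute’s backendRefs entry would then also need a namespace: backend field.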

HTTPRoutes also allow developers to easily do header-based routing for canary deployments, or traffic splitting for blue-green testing.
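
As an illustration of both ideas, the following sketch sends requests carrying a chosen header to a canary Service and splits all other traffic 90/10 by weight. The Service my-service-canary, the route name and the header x-canary are hypothetical and only serve the example.

```yaml
# Sketch: header-based canary routing plus weighted traffic splitting
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-canary-route          # hypothetical name
  namespace: default
spec:
  parentRefs:
    - name: cilium-gateway
      namespace: gateway
  hostnames:
    - "gateway.<--YOUR DOMAIN-->"
  rules:
    # Requests carrying the header go straight to the canary
    - matches:
        - headers:
            - name: x-canary     # hypothetical header
              value: "true"
      backendRefs:
        - name: my-service-canary
          port: 80
    # Everything else is split by weight
    - backendRefs:
        - name: my-service
          port: 80
          weight: 90
        - name: my-service-canary
          port: 80
          weight: 10
```

Weights are relative, so 90 and 10 send roughly nine in ten requests to my-service.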

I strongly encourage you to take a look at all the capabilities in the Gateway API user guide for more ideas.

Summary

I’m running Argo CD with Kustomize + Helm in an attempt to follow GitOps best practices. This summary assumes a similar setup together with Sealed Secrets. My full homelab configuration as of the writing of this article can be found on GitHub as a reference. All the resources below can also be found in the GitLab repository backing this site here.

Gateway API

We gather all resources related to the Gateway in one namespace. This includes the Cert-manager Issuer.

# gateway/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_gateways.yaml
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.1.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml
  - gateway-class.yaml
  - ns.yaml
  - sealed-cloudflare-api-token.yaml
  - cloudflare-issuer.yaml
  - gateway-with-infrastructure.yaml

# gateway/gateway-class.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: cilium
spec:
  controllerName: io.cilium/gateway-controller

# gateway/ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: gateway

# gateway/sealed-cloudflare-api-token.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: cloudflare-api-token
  namespace: gateway
spec:
  encryptedData:
    api-token: <--Sealed Cloudflare API Token-->
  template:
    metadata:
      name: cloudflare-api-token
      namespace: gateway
    type: Opaque

# gateway/cloudflare-issuer.yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cloudflare-issuer
  namespace: gateway
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: "<--YOUR EMAIL-->"
    privateKeySecretRef:
      name: cloudflare-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token

# gateway/gateway-with-infrastructure.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cilium-gateway
  namespace: gateway
  annotations:
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  infrastructure:
    annotations:
      io.cilium/lb-ipam-ips: "<--IP-->"
  listeners:
    - protocol: HTTPS
      port: 443
      name: https-gateway
      hostname: "*.<--YOUR DOMAIN-->"
      tls:
        certificateRefs:
          - kind: Secret
            name: cloudflare-cert
      allowedRoutes:
        namespaces:
          from: All
    - protocol: HTTPS
      port: 443
      name: https-domain-gateway
      hostname: "<--YOUR DOMAIN-->"
      tls:
        certificateRefs:
          - kind: Secret
            name: cloudflare-domain-cert
      allowedRoutes:
        namespaces:
          from: All

Cilium

Cilium is configured to use v1.16.0, which supports the Gateway Service annotation that works in conjunction with Cilium LB-IPAM and L2 announcements.

# cilium/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - announce.yaml
  - ip-pool.yaml

helmCharts:
  - name: cilium
    repo: https://helm.cilium.io
    version: 1.16.0
    releaseName: "cilium"
    includeCRDs: true
    namespace: kube-system
    valuesFile: values.yaml

# cilium/values.yaml
kubeProxyReplacement: true

gatewayAPI:
  enabled: true
## Uncomment to run on the host network, e.g. when LoadBalancer Services are not available
#  hostNetwork:
#    enabled: true
envoy:
  securityContext:
    capabilities:
      keepCapNetBindService: true
      envoy:
        - NET_ADMIN
        - PERFMON
        - BPF
  ## Enable SYS_ADMIN capability instead of PERFMON and BPF if running on Linux Kernel < 5.8 and Cri-O < 1.22.0 or containerd < 1.5.0
  #       - SYS_ADMIN
  ## Enable NET_BIND_SERVICE capability to use port numbers < 1024, e.g. 80 or 443
  #       - NET_BIND_SERVICE

# Roll out cilium agent and operator pods automatically when ConfigMap is updated.
rollOutCiliumPods: true

operator:
  rollOutPods: true

# Increase rate limit when doing L2 announcements
k8sClientRateLimit:
  qps: 100
  burst: 200

l2announcements:
  enabled: true

# cilium/announce.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  externalIPs: true
  loadBalancerIPs: true

# cilium/ip-pool.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: first-pool
spec:
  blocks:
    - start: <--IP-->
      stop: <--IP-->

Cert-manager

# cert-manager/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ns.yaml

helmCharts:
  - name: cert-manager
    repo: https://charts.jetstack.io
    version: 1.15.2
    releaseName: cert-manager
    namespace: cert-manager
    valuesInline:
      # Helm values are nested maps; a dotted key is not expanded here
      crds:
        enabled: true
      extraArgs:
        - "--enable-gateway-api"

# cert-manager/ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager

Cloudflared

For completeness’s sake this is the relevant cloudflared config I’m currently running.

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

configMapGenerator:
  - name: config
    namespace: cloudflared
    files:
      - config.yaml

resources:
  - ns.yaml
  - credentials.yaml
  - daemon-set.yaml

# ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cloudflared

# config.yaml
tunnel: talos-tunnel
credentials-file: /etc/cloudflared/credentials/credentials.json
metrics: 0.0.0.0:2000
no-autoupdate: true

warp-routing:
  enabled: true

ingress:
  - hostname: hello.stonegarden.dev
    service: hello_world
  - hostname: "*.stonegarden.dev"
    service: https://cilium-gateway-stonegarden.gateway.svc.cluster.local:443
    originRequest:
      originServerName: "*.stonegarden.dev"
  - hostname: stonegarden.dev
    service: https://cilium-gateway-stonegarden.gateway.svc.cluster.local:443
    originRequest:
      originServerName: stonegarden.dev
  - service: http_status:404

# credentials.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: credentials
  namespace: cloudflared
spec:
  template:
    metadata:
      name: credentials
      namespace: cloudflared
  encryptedData:
    credentials.json: "<--CREDENTIALS-->"

# daemon-set.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: cloudflared
  name: cloudflared
  namespace: cloudflared
spec:
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:2024.8.2
          imagePullPolicy: IfNotPresent
          args:
            - tunnel
            - --config
            - /etc/cloudflared/config/config.yaml
            - run
          livenessProbe:
            httpGet:
              path: /ready
              port: 2000
            initialDelaySeconds: 60
            failureThreshold: 5
            periodSeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 64Mi
            limits:
              memory: 512Mi
          volumeMounts:
            - name: config
              mountPath: /etc/cloudflared/config/config.yaml
              subPath: config.yaml
            - name: credentials
              mountPath: /etc/cloudflared/credentials
              readOnly: true
      restartPolicy: Always
      volumes:
        - name: config
          configMap:
            name: config
        - name: credentials
          secret:
            secretName: credentials

  1. In the Isovalent Cilium Gateway API lab this GatewayClass is already created for you.

  2. Knowing about this scheme, I hope you pick better names for your Gateway and GatewayClass resources.