In the Gateway API SIG’s own words:
“If you’re familiar with the older Ingress API, you can think of the Gateway API as analogous to a more-expressive next-generation version of that API.”
In this article we’ll quickly review the role-oriented architecture of the Gateway API before we implement it using Cilium and cert-manager. Other Gateway API implementations are listed on the Gateway Special Interest Group (SIG) site.
We’ll mainly take a look at replacing Ingress resources for traffic from clients outside the cluster to services inside the cluster (north/south traffic). Although the Gateway API also supports so-called east/west traffic between workloads within a cluster (through the GAMMA initiative), this is outside the scope of this article.
Before reading this article, you might want to try a hands-on lab on Cilium Gateway API by Isovalent, the company behind Cilium.
Overview#
In the role-oriented design of the Gateway API, the infrastructure provider provisions a GatewayClass, which cluster operators can use to create different Gateway resources. Application developers can then attach HTTPRoutes to such a Gateway and have them route traffic to plain old Services.
Comparing this with the Ingress API, we see that the Ingress resource has been split into the Gateway and different Route objects with different responsibilities.
With the Kubernetes Ingress API feature-frozen and the Ingress NGINX project entering maintenance mode in favour of InGate, Gateway API is the next logical step.
Gateway API#
Kubernetes 1.33 doesn’t ship with the Gateway API Custom Resource Definitions (CRDs), so we need to add them ourselves. The Gateway API spec is split into several components, and the SIG maintains releases for both standard and experimental installs. Each install bundles several of these components and should work with conformant Gateway API implementations.
The Cilium documentation on Gateway API support recommends Gateway API v1.2.0, though they’ve also uploaded conformance reports that indicate support for v1.3.0. We will therefore apply the standard v1.3.0 CRDs along with the experimental TLSRoute, which Cilium also supports:
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/standard-install.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.3.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml
In previous iterations of this article we had to rely on the experimental GatewayInfrastructure field to set the Gateway LoadBalancer Service IP, but we can now use the addresses field instead. This effectively changes the Gateway from
spec:
  infrastructure:
    annotations:
      io.cilium/lb-ipam-ips: <--IP-->
to
spec:
  addresses:
    - type: IPAddress
      value: <--IP-->
Cilium#
Cilium has a page on Migrating from Ingress to Gateway.
Following the documentation of Cilium we can enable Gateway support in one of two ways, either with the Cilium-CLI (≥ v0.15)
cilium install --version 1.17.6 \
--set kubeProxyReplacement=true \
--set gatewayAPI.enabled=true \
--set envoy.securityContext.capabilities.keepCapNetBindService=true
or using the Helm Chart as described in the summary section.
Note that with the dedicated L7 Envoy Proxy DaemonSet enabled by default, you also have to set envoy.securityContext.capabilities.keepCapNetBindService to true.
If you’re in an environment where you can’t use LoadBalancer type Services, it’s now also possible to run in host network mode by adding either --set gatewayAPI.hostNetwork.enabled=true to the cilium install command above, or
gatewayAPI:
  enabled: true
  hostNetwork:
    enabled: true
in the Helm values.
If you plan to use port numbers lower than 1024 in host network mode, e.g. 443 for HTTPS traffic, you also need to add the NET_BIND_SERVICE Linux capability to the Envoy securityContext. The Helm values in the summary include it as a commented-out entry.
Note that the Envoy proxy should also have the NET_ADMIN and SYS_ADMIN capabilities enabled. If you’re running a newer Linux kernel (≥ 5.8) and container runtime (CRI-O ≥ 1.22.0 or containerd ≥ 1.5.0), you can replace SYS_ADMIN with the BPF and PERFMON capabilities, as noted in the comments of the Helm chart’s values.yaml file, to constrain the necessary privileges.
See the Cilium Gateway API documentation for more information.
cert-manager#
Gateway API support in cert-manager has been a beta feature since v1.15, though it appears to graduate soon™.
To enable Gateway API support in cert-manager, we have to add the --enable-gateway-api flag on startup. This is done by setting it as an extra argument when installing cert-manager using its Helm Chart:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --version v1.18.2 \
--namespace cert-manager --set crds.enabled=true --create-namespace \
--set "extraArgs={--enable-gateway-api}"
If you don’t want to enable Gateway API support in cert-manager, you can instead manually create a Certificate resource and reference the TLS-secrets generated by that in the Gateway resource.
Configuration#
Once we have the Gateway API CRDs available and enabled support for it in Cilium and cert-manager, we can start creating resources to take advantage of it.
Infrastructure Provider#
If a cluster-wide GatewayClass resource referencing Cilium is not already present (kubectl get gatewayclasses), we need to create one ourselves.[1]
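The manifest is listed as gateway/gateway-class.yaml in the summary. Take note of the GatewayClass name (metadata.name) and double-check the controllerName (spec.controllerName), which should be io.cilium/gateway-controller.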
If the GatewayClass is created successfully, you should be able to view the supported features by running
kubectl describe gatewayclass cilium
Cluster Operator#
For convenience, we’ll group the cluster operator related resources in the gateway namespace. As cluster operators, this gives us an easy overview of our gateways and their connected resources.
kubectl create ns gateway
TLS certificates (Cloudflare)#
To automatically provision TLS certificates attached to our Gateway, we can create a cert-manager Issuer resource. This section is optional if you don’t want certificates, though it’s highly recommended!
For details on how to automatically provision wildcard certificates using cert-manager and Let’s Encrypt, I’ve summarised the process in a previous article on Traefik Wildcard Certificates, so I’ll allow myself to be brief here.
Obtain a Cloudflare API token (or a token from your DNS provider of choice) as mentioned in the above article and create a Secret containing it.
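As a minimal sketch, a plain Secret for the token could look like the one below. The name, namespace and key mirror the SealedSecret and Issuer shown in the summary; using an unsealed Secret with stringData is an assumption made here for illustration, so don't commit it to Git as-is.

```yaml
# Illustrative plain Secret (the summary uses a SealedSecret instead).
# Name, namespace and key match what the Issuer's apiTokenSecretRef expects.
apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-api-token
  namespace: gateway
type: Opaque
stringData:
  api-token: <--Cloudflare API Token-->
```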
We can then reference this Secret in an Issuer resource (under apiTokenSecretRef), which enables us to complete a DNS-01 challenge and thereby issue wildcard certificates for the proven domain. Remember to provide the domain owner’s e-mail in the acme.email field. The full manifest is listed as gateway/cloudflare-issuer.yaml in the summary.
Gateway#
Next we create a Gateway resource that references the cilium GatewayClass through its gatewayClassName field.
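A Gateway along these lines might look roughly like the sketch below. The Gateway name is inferred from the Service name mentioned in the DNS section, while the hostname and the TLS Secret name are hypothetical placeholders, so swap in your own domain. The cert-manager.io/issuer annotation is the one cert-manager documents for namespaced Issuers.

```yaml
# Sketch of a Gateway with a single wildcard HTTPS listener.
# "*.example.com" and "wildcard-example-com-tls" are placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cilium-gateway
  namespace: gateway
  annotations:
    # Tells cert-manager which Issuer to use for the listener certificates
    cert-manager.io/issuer: cloudflare-issuer
spec:
  gatewayClassName: cilium
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: "*.example.com"
      tls:
        mode: Terminate
        certificateRefs:
          - kind: Secret
            name: wildcard-example-com-tls
      allowedRoutes:
        namespaces:
          from: All
```

To pin the LoadBalancer IP, the spec.addresses field described in the Gateway Service section can be added under spec.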
cert-manager picks up the issuer annotation on the Gateway and automatically creates a matching Certificate resource.
If you didn’t enable Gateway API support in cert-manager, you can instead create this resource manually.
The Certificate references the cloudflare-issuer Issuer (issuerRef) and creates a TLS Secret whose name (secretName) corresponds to the one requested by the Gateway listener (certificateRefs).
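If you create the Certificate by hand, a minimal sketch could look like this. The DNS name and Secret name are hypothetical and only need to match whatever your Gateway listener requests; the Certificate that cert-manager generates for you may carry additional fields.

```yaml
# Sketch of a manually created Certificate for the Gateway listener.
# "*.example.com" and "wildcard-example-com-tls" are placeholders.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example-com-tls
  namespace: gateway
spec:
  dnsNames:
    - "*.example.com"
  issuerRef:
    group: cert-manager.io
    kind: Issuer
    name: cloudflare-issuer
  # Must equal the Secret name in the listener's tls.certificateRefs
  secretName: wildcard-example-com-tls
```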
We need to create at least one listener per Gateway that listens for e.g. HTTPRoute resources that match. In our case we’ve created an HTTPS listener on port 443 that matches all subdomains. The tls field of the listener is picked up by cert-manager, which will create a TLS Secret with the given name. We allow eligible HTTPRoutes from all namespaces to attach to this Gateway, though we can also use a selector for more fine-grained control.
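For the more fine-grained variant, a listener can restrict attachment to namespaces carrying a given label. The fragment below is only a sketch, and the label key and value are made up for illustration.

```yaml
# Fragment of a Gateway spec: only namespaces labelled
# gateway-access=true (hypothetical label) may attach routes.
listeners:
  - name: https
    protocol: HTTPS
    port: 443
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            gateway-access: "true"
```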
Gateway Service#
When the Gateway API is not run on the host network, a LoadBalancer type Service is created when the Gateway is picked up by the Cilium controller. Out of the box, this Service is assigned the next available IP from e.g. Cilium LB-IPAM.
For a deterministic and idempotent configuration, we can set this IP using the spec.addresses field, e.g.
spec:
  addresses:
    - type: IPAddress
      value: <--IP-->
DNS#
Now that you’ve got your Gateway and its attached LoadBalancer Service set up, you’ll want to point your DNS to the Service IP. This IP address should be the same as what you set in the Gateway spec.addresses field, but to make sure you can run
kubectl get svc -A | grep LoadBalancer
and find the External IP of the Service named <GatewayClass.name>-gateway-<Gateway.name>, in our case cilium-gateway-cilium-gateway.[2]
Open up port 443 in your router and/or firewall to the Service IP and point your DNS to your public IP. If you don’t have your public IP at hand, you can find it by running
dig +short myip.opendns.com @resolver1.opendns.com
In case you don’t have the possibility to open ports, e.g. behind a CGNAT, you can try using a tunnel like cloudflared.
If everything is set up correctly, an external web request should roughly take the following path:
First, the hostname is looked up in DNS. The DNS should respond with the IP you set up, and the request is relayed to your router. Next, the router port forwards the request to the Service IP connected to the Gateway. The Gateway then presents the attached certificate and the journey continues.
From the Gateway the request is channelled to the correct HTTPRoute (route ɑ in this case) based on the rules you’ve set up, e.g. hostname or header matching. Then the HTTPRoute directs the request to its attached Service, which finally delivers the request to its destination. Hopefully, the application responds with something nice.
If you don’t want to expose your public IP, you can instead use a service like cloudflared to tunnel traffic directly to your cluster. If you want to go this route, you can find an example configuration in the Cloudflared section of the summary.
Application developer#
Now that both the infrastructure provider and cluster operator have done their job (kudos to you!), we can let the application developers (also you) take the centre stage.
Given a Service named my-service, we can create a simple HTTPRoute that references our Gateway (parentRefs) and the Service (backendRefs) to expose the Service under a hostname.
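A minimal sketch of such an HTTPRoute follows. The route name, namespace, hostname and Service port are assumptions for illustration; the Gateway name is inferred from the Service name mentioned in the DNS section.

```yaml
# Sketch of an HTTPRoute exposing my-service through the Gateway.
# Route name, namespace, hostname and port are placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-service
  namespace: default
spec:
  parentRefs:
    - name: cilium-gateway
      namespace: gateway
  hostnames:
    - "my-service.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: my-service
          port: 80
```

Because the route lives in a different namespace than the Gateway, it only attaches if the listener's allowedRoutes permits that namespace.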
HTTPRoutes also allow developers to easily do header-based routing for canary deployments, or traffic splitting for blue-green testing.
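To give an idea of what that could look like, here is a hedged sketch combining both techniques in one route. The my-service-canary Service, the x-canary header, the hostname and the 90/10 weights are all hypothetical.

```yaml
# Sketch: header-based routing plus weighted traffic splitting.
# Service names, header, hostname and weights are placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-service-canary
  namespace: default
spec:
  parentRefs:
    - name: cilium-gateway
      namespace: gateway
  hostnames:
    - "my-service.example.com"
  rules:
    # Requests carrying the header always hit the canary
    - matches:
        - headers:
            - name: x-canary
              value: "true"
      backendRefs:
        - name: my-service-canary
          port: 80
    # All other requests are split by weight between the two Services
    - backendRefs:
        - name: my-service
          port: 80
          weight: 90
        - name: my-service-canary
          port: 80
          weight: 10
```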
I strongly encourage you to take a look at all the capabilities on the Gateway API user guide for more ideas.
Summary#
I’m running Argo CD with Kustomize + Helm in an attempt to follow GitOps best practices. This summary assumes a similar setup together with Sealed Secrets. My full homelab configuration as of the writing of this article can be found on GitHub as a reference. All the resources below can also be found in the GitLab repository backing this site here.
Gateway API#
We gather all resources related to the Gateway in one namespace. This includes the cert-manager Issuer.
# gateway/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/standard-install.yaml
  - https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.3.0/config/crd/experimental/gateway.networking.k8s.io_tlsroutes.yaml
  - gateway-class.yaml
  - ns.yaml
  - sealed-cloudflare-api-token.yaml
  - cloudflare-issuer.yaml
# gateway/gateway-class.yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: cilium
spec:
  controllerName: io.cilium/gateway-controller
# gateway/ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: gateway
# gateway/sealed-cloudflare-api-token.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: cloudflare-api-token
  namespace: gateway
spec:
  encryptedData:
    api-token: <--Sealed Cloudflare API Token-->
  template:
    metadata:
      name: cloudflare-api-token
      namespace: gateway
    type: Opaque
# gateway/cloudflare-issuer.yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cloudflare-issuer
  namespace: gateway
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: "<--YOUR EMAIL-->"
    privateKeySecretRef:
      name: cloudflare-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token
Cilium#
# cilium/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - announce.yaml
  - ip-pool.yaml
helmCharts:
  - name: cilium
    repo: https://helm.cilium.io
    version: 1.17.6
    releaseName: "cilium"
    includeCRDs: true
    namespace: kube-system
    valuesFile: values.yaml
# cilium/values.yaml
kubeProxyReplacement: true
gatewayAPI:
  enabled: true
  # Enable Application-Layer Protocol Negotiation (ALPN) which will attempt HTTP/2, then HTTP 1.1.
  # Services that wish to use HTTP/2 must indicate that via their appProtocol (GEP-1911).
  enableAlpn: true
  gatewayClass:
    # Always create a GatewayClass for Cilium
    create: true
  ## Uncomment to run on the host network, e.g. when LoadBalancer Services are not available
  # hostNetwork:
  #   enabled: true
envoy:
  securityContext:
    capabilities:
      keepCapNetBindService: true
      envoy:
        - NET_ADMIN
        - PERFMON
        - BPF
        ## Enable SYS_ADMIN capability instead of PERFMON and BPF if running on Linux Kernel < 5.8 and CRI-O < 1.22.0 or containerd < 1.5.0
        # - SYS_ADMIN
        ## Enable NET_BIND_SERVICE capability to use port numbers < 1024, e.g. 80 or 443
        # - NET_BIND_SERVICE
# Roll out cilium agent and operator pods automatically when ConfigMap is updated.
rollOutCiliumPods: true
operator:
  rollOutPods: true
# Increase rate limit when doing L2 announcements
k8sClientRateLimit:
  qps: 100
  burst: 200
l2announcements:
  enabled: true
# cilium/announce.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2-announcement-policy
  namespace: kube-system
spec:
  externalIPs: true
  loadBalancerIPs: true
# cilium/ip-pool.yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: first-pool
spec:
  blocks:
    - start: <--IP-->
      stop: <--IP-->
cert-manager#
# cert-manager/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ns.yaml
helmCharts:
  - name: cert-manager
    repo: https://charts.jetstack.io
    version: 1.18.2
    releaseName: cert-manager
    namespace: cert-manager
    valuesInline:
      # Helm values are nested maps; a dotted key like "crds.enabled" is not expanded
      crds:
        enabled: true
      extraArgs:
        - "--enable-gateway-api"
# cert-manager/ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
Cloudflared#
For completeness’s sake, this is the relevant cloudflared config I’m currently running.
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
configMapGenerator:
  - name: config
    namespace: cloudflared
    files:
      - config.yaml
resources:
  - ns.yaml
  - credentials.yaml
  - daemon-set.yaml
# ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cloudflared
# config.yaml
tunnel: talos-tunnel
credentials-file: /etc/cloudflared/credentials/credentials.json
metrics: 0.0.0.0:2000
no-autoupdate: true
warp-routing:
  enabled: true
ingress:
  - hostname: hello.stonegarden.dev
    service: hello_world
  - hostname: "*.stonegarden.dev"
    service: https://cilium-gateway-stonegarden.gateway.svc.cluster.local:443
    originRequest:
      originServerName: "*.stonegarden.dev"
  - hostname: stonegarden.dev
    service: https://cilium-gateway-stonegarden.gateway.svc.cluster.local:443
    originRequest:
      originServerName: stonegarden.dev
  - service: http_status:404
# credentials.yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: credentials
  namespace: cloudflared
spec:
  template:
    metadata:
      name: credentials
      namespace: cloudflared
  encryptedData:
    credentials.json: "<--CREDENTIALS-->"
# daemon-set.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: cloudflared
  name: cloudflared
  namespace: cloudflared
spec:
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:2024.8.2
          imagePullPolicy: IfNotPresent
          args:
            - tunnel
            - --config
            - /etc/cloudflared/config/config.yaml
            - run
          livenessProbe:
            httpGet:
              path: /ready
              port: 2000
            initialDelaySeconds: 60
            failureThreshold: 5
            periodSeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 64Mi
            limits:
              memory: 512Mi
          volumeMounts:
            - name: config
              mountPath: /etc/cloudflared/config/config.yaml
              subPath: config.yaml
            - name: credentials
              mountPath: /etc/cloudflared/credentials
              readOnly: true
      restartPolicy: Always
      volumes:
        - name: config
          configMap:
            name: config
        - name: credentials
          secret:
            secretName: credentials
[1] In the Isovalent Cilium Gateway API lab, a GatewayClass is already created for you.
[2] Knowing about this scheme, I hope you pick better names for your Gateway and GatewayClass resources.