Cilium Host Policies
At Puzzle, we are constantly looking for new bleeding-edge technologies that benefit our customers’ system and software architectures. This was also one of the reasons why Puzzle partnered with Isovalent a few months ago: they are the main company driving Cilium development and a major contributor to eBPF. We think the impact of these technologies will be massive in the near future, especially in the container platform ecosystem.
To strengthen our knowledge about Cilium even further, we recently gathered as a team of five system engineers for a full week of deep diving into specific Cilium features. During this week, Isovalent arranged multiple tech sessions with some of their Cilium engineers, giving us even more insight into the architecture and implementation of the following features:
- Cilium Host Policies
- Cilium Cluster Mesh
- Kubernetes without Kube-Proxy
- Cilium LoadBalancer
More Cilium-related content will follow in the future, but in this blog post we’ll stick with the first topic, a very handy feature of Cilium: Host Policies.
In the following guide, we would like to present a complete Cilium host policy example for an RKE2-based Kubernetes cluster. With this feature, you can run and at the same time protect your RKE2 nodes without any firewall daemon running and without iptables/nftables installed. Because the host firewalling is performed directly in eBPF, this approach provides high performance and visibility right away. On top of that, you can even manage host policies directly inside Kubernetes with CiliumClusterwideNetworkPolicy custom resources.
Prerequisites
To benefit from all the mentioned features, some prerequisites need to be fulfilled:
- We recommend running the RKE2 cluster without any kube-proxy instance. It can be disabled with the disable-kube-proxy flag inside the RKE2 server config (/etc/rancher/rke2/config.yaml).
- Enable Cilium’s "kube-proxy free" mode by configuring kubeProxyReplacement=strict, k8sServiceHost=REPLACE_WITH_API_SERVER_IP and k8sServicePort=REPLACE_WITH_API_SERVER_PORT. (More details in the Cilium documentation.)
- Supply appropriate labels to the RKE2 worker nodes, as RKE2 does not do so by default. Some host policies will be assigned to workers based on this label (a configuration sketch for these prerequisites follows after the list):

  kubectl label node <node-name> node-role.kubernetes.io/worker=true
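As an illustration, here is a rough sketch of how these prerequisites could be put in place on an RKE2 server node. The file paths follow common RKE2 conventions, and the HelmChartConfig override for the bundled rke2-cilium chart is an assumption based on the RKE2 and Cilium documentation, as is the hostFirewall value (Cilium’s host firewall feature must be enabled for host policies to take effect); adapt everything to your setup.

# Sketch only: disable kube-proxy in the RKE2 server config (assumed default path)
cat <<'EOF' >> /etc/rancher/rke2/config.yaml
disable-kube-proxy: true
EOF

# Sketch only: override the values of the bundled rke2-cilium Helm chart
# (chart name, namespace and value keys are assumptions; verify them against
# your RKE2 and Cilium versions)
cat <<'EOF' > /var/lib/rancher/rke2/server/manifests/rke2-cilium-config.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    kubeProxyReplacement: strict
    k8sServiceHost: REPLACE_WITH_API_SERVER_IP
    k8sServicePort: REPLACE_WITH_API_SERVER_PORT
    hostFirewall:
      enabled: true
EOF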
Configuration
Audit Mode Safeguard
First, set the Cilium policy enforcement mode for the host endpoints to audit. This is a crucial step before applying any host policy custom resources, because it’s easy to lock yourself out of your Kubernetes cluster/nodes by missing just a single port within your policies!
CILIUM_NAMESPACE=kube-system

for NODE_NAME in $(kubectl get nodes --no-headers=true | awk '{print $1}')
do
    CILIUM_POD_NAME=$(kubectl -n $CILIUM_NAMESPACE get pods -l "k8s-app=cilium" -o jsonpath="{.items[?(@.spec.nodeName=='$NODE_NAME')].metadata.name}")
    HOST_EP_ID=$(kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint list -o jsonpath='{[?(@.status.identity.id==1)].id}')
    kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint config $HOST_EP_ID PolicyAuditMode=Enabled
    kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint config $HOST_EP_ID | grep PolicyAuditMode
done
To see the currently activated policy mode, use the following commands:
for NODE_NAME in $(kubectl get nodes --no-headers=true | awk '{print $1}')
do
    CILIUM_POD_NAME=$(kubectl -n $CILIUM_NAMESPACE get pods -l "k8s-app=cilium" -o jsonpath="{.items[?(@.spec.nodeName=='$NODE_NAME')].metadata.name}")
    kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint list
done
The output should look something like this (search for the endpoints with the label reserved:host):
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                       IPv6   IPv4   STATUS
           ENFORCEMENT        ENFORCEMENT
...
2152       Disabled (Audit)   Disabled          1          k8s:node-role.kubernetes.io/control-plane=true                  ready
                                                           k8s:node-role.kubernetes.io/etcd=true
                                                           k8s:node-role.kubernetes.io/master=true
                                                           k8s:node-role.kubernetes.io/worker=true
                                                           k8s:node.kubernetes.io/instance-type=rke2
                                                           reserved:host
Applying the Host Policies
Next, we recommend applying a suitable host policy for the master nodes first. As soon as any other host policy is in place that does not explicitly allow the Kubernetes API (6443/TCP), you could theoretically lock yourself out of the cluster (unless you are still running in policy audit mode).
apiVersion: "cilium.io/v2" kind: CiliumClusterwideNetworkPolicy metadata: name: "rke2-master-host-rule-set" spec: description: "Cilium host policy set for RKE2 masters" nodeSelector: matchLabels: node-role.kubernetes.io/master: 'true' node.kubernetes.io/instance-type: rke2 ingress: - fromEntities: - all toPorts: - ports: # Kubernetes API - port: "6443" protocol: TCP # RKE2 API for nodes to register - port: "9345" protocol: TCP
With the following set of base host policy rules, you regain SSH access and also permit access to some additional ports which are required by RKE2 and Cilium:
apiVersion: "cilium.io/v2" kind: CiliumClusterwideNetworkPolicy metadata: name: "rke2-base-host-rule-set" spec: description: "Cilium default host policy set for RKE2 nodes" nodeSelector: matchLabels: node.kubernetes.io/instance-type: rke2 ingress: - fromEntities: - remote-node toPorts: - ports: # Cilium WireGuard - port: "51871" protocol: UDP # Cilium VXLAN - port: "8472" protocol: UDP - fromEntities: - health - remote-node toPorts: - ports: # Inter-node Cilium cluster health checks (please also have a look at the appendix for further details) - port: "4240" protocol: TCP # Cilium cilium-agent health status API - port: "9876" protocol: TCP - fromEntities: - cluster toPorts: - ports: # RKE2 Kubelet - port: "10250" protocol: TCP # Cilium cilium-agent Prometheus metrics - port: "9090" protocol: TCP - fromEntities: - all toPorts: - ports: # SSH - port: "22" protocol: TCP
Now it’s time for the etcd rules:
apiVersion: "cilium.io/v2" kind: CiliumClusterwideNetworkPolicy metadata: name: "rke2-etcd-host-rule-set" spec: description: "Cilium host policy set for RKE2 etcd nodes" nodeSelector: matchLabels: node-role.kubernetes.io/etcd: 'true' node.kubernetes.io/instance-type: rke2 ingress: - fromEntities: - host - remote-node toPorts: - ports: # RKE2 etcd client port - port: "2379" protocol: TCP # RKE2 etcd peer communication port - port: "2380" protocol: TCP
And finally, the configuration can be completed by applying the rules which are required for worker nodes:
apiVersion: "cilium.io/v2" kind: CiliumClusterwideNetworkPolicy metadata: name: "rke2-worker-host-rule-set" spec: description: "Cilium host policy set for RKE2 workers" nodeSelector: matchLabels: node-role.kubernetes.io/worker: 'true' node.kubernetes.io/instance-type: rke2 ingress: - fromEntities: - cluster toPorts: - ports: # Rancher monitoring Cilium operator metrics - port: "6942" protocol: TCP # Cilium Hubble relay - port: "4244" protocol: TCP - fromEntities: - all toPorts: - ports: # Ingress HTTP - port: "80" protocol: TCP # Ingress HTTPS - port: "443" protocol: TCP
Policy Verification
Before you deactivate the policy audit mode, have a look at the packets which would have been dropped if the rules were already enforced:
# Set Cilium namespace
CILIUM_NAMESPACE=kube-system

# Print the Cilium pod names to the console:
kubectl -n $CILIUM_NAMESPACE get pods -l "k8s-app=cilium" -o jsonpath="{.items[*].metadata.name}"

# Find the Cilium endpoint ID of the host itself (again, search for the endpoint with the label "reserved:host"):
kubectl -n $CILIUM_NAMESPACE exec <CILIUM_POD_NAME> -c cilium-agent -- cilium endpoint list

# Output all connections (allowed & audited) related to the host endpoint:
kubectl -n $CILIUM_NAMESPACE exec <CILIUM_POD_NAME> -c cilium-agent -- cilium monitor -t policy-verdict --related-to <HOST_EP_ID>
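If you prefer not to look up the pod names and endpoint IDs by hand, the same jsonpath queries used earlier in this guide can be combined into a small loop. This is only a convenience sketch; the 30-second timeout per node is arbitrary and merely samples the policy verdicts for a short while.

CILIUM_NAMESPACE=kube-system

for NODE_NAME in $(kubectl get nodes --no-headers=true | awk '{print $1}')
do
    # Cilium agent pod and host endpoint ID on this node (same queries as above)
    CILIUM_POD_NAME=$(kubectl -n $CILIUM_NAMESPACE get pods -l "k8s-app=cilium" -o jsonpath="{.items[?(@.spec.nodeName=='$NODE_NAME')].metadata.name}")
    HOST_EP_ID=$(kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint list -o jsonpath='{[?(@.status.identity.id==1)].id}')

    # Sample the policy verdicts related to the host endpoint for 30 seconds
    echo "--- policy verdicts on $NODE_NAME ---"
    timeout 30 kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium monitor -t policy-verdict --related-to $HOST_EP_ID
done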
Activating the Host Policies
Once you’re confident that the rule sets are fine and no essential connections get blocked, set the policy mode back to enforcing:
for NODE_NAME in $(kubectl get nodes --no-headers=true | awk '{print $1}')
do
    CILIUM_POD_NAME=$(kubectl -n $CILIUM_NAMESPACE get pods -l "k8s-app=cilium" -o jsonpath="{.items[?(@.spec.nodeName=='$NODE_NAME')].metadata.name}")
    HOST_EP_ID=$(kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint list -o jsonpath='{[?(@.status.identity.id==1)].id}')
    kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint config $HOST_EP_ID PolicyAuditMode=Disabled
done
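As a quick sanity check, you can re-run the queries from the beginning of this guide: PolicyAuditMode should now be reported as Disabled, and with the policies above applied, the ingress enforcement of the reserved:host endpoints should show as Enabled in the endpoint list. A short sketch reusing the same commands:

for NODE_NAME in $(kubectl get nodes --no-headers=true | awk '{print $1}')
do
    CILIUM_POD_NAME=$(kubectl -n $CILIUM_NAMESPACE get pods -l "k8s-app=cilium" -o jsonpath="{.items[?(@.spec.nodeName=='$NODE_NAME')].metadata.name}")
    HOST_EP_ID=$(kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint list -o jsonpath='{[?(@.status.identity.id==1)].id}')
    # PolicyAuditMode should now be reported as "Disabled" for every host endpoint
    echo "--- $NODE_NAME ---"
    kubectl -n $CILIUM_NAMESPACE exec $CILIUM_POD_NAME -c cilium-agent -- cilium endpoint config $HOST_EP_ID | grep PolicyAuditMode
done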
That’s basically it! Your nodes are now firewalled via Cilium with eBPF in the background, while you can manage the required rules in the same easy way as any other “traditional” Kubernetes NetworkPolicy: via Kubernetes (custom) resources.
Appendix
Here are some more topics to consider before trying to work with Cilium host policies:
- Cilium requires ICMP type 0/8 (or 4240/TCP as an alternative) for its internal health checks. As host policies do not yet support ICMP filtering without a feature flag (more details here), you will see dropped ICMP packets between the nodes inside Hubble and the cilium monitor output. Nevertheless, that’s nothing to worry about, because 4240/TCP is allowed in the manifests shown above and the health checks therefore still pass.
- Keep in mind that these rules only allow TCP/UDP filtering. ICMP (and especially ICMPv6) is not supported yet (without a feature flag), hence IPv6 communication will not work: NeighborSolicitation (ICMPv6 type 135), NeighborAdvertisement (ICMPv6 type 136), RouterSolicitation (ICMPv6 type 133) and RouterAdvertisement (ICMPv6 type 134) packets will be dropped.
- Finally, we would like to highlight that the latest Cilium release, 1.11.0, brought “ICMP and ICMPv6 support for CNP and CCNP policies with a feature flag”. Nevertheless, as the main issue regarding this topic is still open and some final work needs to be done, we think it is probably better to hold off on activating this feature on production clusters until it is fully supported, marked as stable and activated by default in a future Cilium release. (A purely illustrative sketch follows after this list.)
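For reference only, and under the assumption that the corresponding feature flag is enabled (see the Cilium 1.11 documentation for the exact flag name), an ICMP ingress rule in a CiliumClusterwideNetworkPolicy could look roughly like the sketch below. The policy name is made up, and since we recommend waiting, treat this purely as an outlook rather than as part of the rule sets above.

# Outlook sketch only: allow ICMP echo requests (type 8) from other nodes,
# which requires the Cilium 1.11 ICMP feature flag to be enabled
cat <<'EOF' | kubectl apply -f -
apiVersion: "cilium.io/v2"
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: "rke2-icmp-host-rule-set"
spec:
  description: "Sketch: allow ICMP echo requests towards RKE2 nodes"
  nodeSelector:
    matchLabels:
      node.kubernetes.io/instance-type: rke2
  ingress:
    - fromEntities:
        - remote-node
      icmps:
        - fields:
            - type: 8
              family: IPv4
EOF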
Sources: