It was DNS

By Connor Taffe | Published .

I use kubeadm for my home k8s cluster that I run my blog on. Yesterday, I refactored my blog server to push static resources to Google Cloud Storage and then pull those resources down in an init container, instead of packaging them in the container image. Originally, I planned to redirect to a signed GCS URL from my server application and have GCS handle serving static resources to the browser, but I ran into issues with page load time even with Cache-Control set to 48 hours. Eventually, I'd like to use a k8s storage class and push resources to NFS or similar served by a redundant local deployment.
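
The push side is a one-liner with gsutil rsync; roughly (the local resources/ path here is illustrative, the gs://connor.zip bucket is the one the init container pulls from below):

; gsutil -m rsync -r resources gs://connor.zip/resources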

The previous flow:

[Diagram: in the repo, the Static Resources and the Server/Templates/Posts code build into the Container, which is pushed to GCR; the Deployment YAML is applied to Kubernetes, which pulls the Container from GCR.]

The updated flow:

[Diagram: the Static Resources are rsynced from the repo to GCS; the Server/Templates/Posts code builds into the Container, which is pushed to GCR; the Deployment YAML is applied to Kubernetes, which pulls the Container from GCR and rsyncs the resources from GCS.]

I added the following init container to my deployment to pull resources into a new emptyDir volume:

...
apiVersion: apps/v1
kind: Deployment
...
spec:
  ...
  template:
    ...
    spec:
      containers:
      - ...
        volumeMounts:
        - name: resources
          readOnly: true
          mountPath: /usr/src/blog/resources
      initContainers:
      - name: init
        image: gcr.io/google.com/cloudsdktool/google-cloud-cli:alpine
        command: ["/bin/sh"]
        args: ["-c", "gcloud auth activate-service-account --key-file=/var/gcp-creds/creds.json; gsutil -m rsync -r gs://connor.zip/resources /var/resources"]
        volumeMounts:
        - name: gcp-creds
          readOnly: true
          mountPath: /var/gcp-creds
        - name: resources
          mountPath: /var/resources
      volumes:
      - name: gcp-creds
        secret:
          secretName: gcp-creds
      - name: resources
        emptyDir: {}

I used the google-cloud-cli container for access to the gsutil command. gsutil doesn't look at the GOOGLE_APPLICATION_CREDENTIALS environment variable like the SDK clients do, so we instead need to call:

; gcloud auth activate-service-account --key-file=/var/gcp-creds/creds.json
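
For comparison, an SDK-based client would pick up the same key automatically if the container set the standard env var; a sketch of the env entry that would go on the container spec (not part of my actual deployment):

        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/gcp-creds/creds.json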

Once the init container was scheduled, it wouldn't finish rsync-ing files, and the pod sat at:

; kubectl get pods
NAME                        READY   STATUS             RESTARTS          AGE
blog-67f956dd6-26q85        0/1     Init:0/1           0                 7s

The logs for the pod showed a connectivity issue:

; kubectl logs pods/blog-67f956dd6-26q85 -c init
INFO 0223 06:29:43.857160 retry_util.py] Retrying request, attempt #2...

So, why can't gsutil complete the request? Let's check the pod's connectivity:

; kubectl exec -it blog-b85849cd-8j4mf -c init -- /bin/sh
/ # ping google.com
ping: bad address 'google.com'
/ # curl google.com
curl: (6) Could not resolve host: google.com
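
To rule out a general connectivity problem, we can hit an IP directly and query an upstream resolver by address (a sketch, assuming busybox's ping and nslookup are available in the alpine image):

/ # ping -c 1 1.1.1.1
/ # nslookup google.com 1.1.1.1

If those work while the bare names fail, the problem is name resolution, not routing.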

Now that we've identified the issue as DNS, we can follow the Kubernetes docs' instructions for Debugging DNS Resolution. The resolution flow in my network is shown below:

[Diagram: Pod → kube-dns (via /etc/resolv.conf) → pfSense (via config) → 1.1.1.1 (via config)]

DNS resolution within a pod is managed by k8s: the kubelet writes each pod's /etc/resolv.conf file. Our pod has one like:

search default.svc.cluster.local svc.cluster.local cluster.local home.arpa
nameserver 10.96.0.10
options ndots:5
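
The file can be read straight out of the stuck pod:

; kubectl exec -it blog-b85849cd-8j4mf -c init -- cat /etc/resolv.conf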

We should ensure our kube-dns service has the correct cluster IP:

; kubectl get svc/kube-dns -n kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.3.0.10    <none>        53/UDP,53/TCP,9153/TCP   466d

There is a mismatch! For me, the reason the kube-dns IP is mismatched is that the worker's config didn't reflect the updated IP I chose for kube-dns. Previously, I had manually updated the /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf file on the node. When I updated the node, the ConfigMap stored in k8s overwrote this file. The section of the k8s docs on Updating the KubeletConfiguration instructs us to edit:

; kubectl edit cm -n kube-system kubelet-config
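
To just view it without opening an editor:

; kubectl get cm -n kube-system kubelet-config -o yaml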

In my cluster, the config contained:

clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local

which is outdated; it needed to be updated to:

clusterDNS:
- 10.3.0.10
clusterDomain: k8s.home.arpa

This is because my kube-dns service exists within the 10.3.x.x IP range, and because I want k8s DNS addresses to be placed under my home network's home.arpa DNS zone. The pfSense firewall is configured to delegate DNS for the 10.3.x.x and 10.2.x.x service and pod networks, as well as k8s.home.arpa, to Kubernetes so that I can easily reach k8s resources from the rest of the network. For instance, this blog is accessible on my local network via http://blog.default.svc.k8s.home.arpa/.
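
To see what a node's kubelet actually ended up with, you can check the generated config file (the path appears in the kubeadm output below):

; sudo grep -A 2 clusterDNS /var/lib/kubelet/config.yaml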

[Image: Kubernetes Domain Overrides]

Once the ConfigMap is updated, each node needs to pull down the new config:

; sudo kubeadm upgrade node phase kubelet-config
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.
; sudo systemctl restart kubelet.service

This updates the /etc/resolv.conf the kubelet writes into new pods to reflect our new DNS and domain configuration. Existing pods keep their old file, so we may need to roll them.
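
Rolling the blog's pods is a single command, assuming the Deployment is named blog:

; kubectl rollout restart deployment/blog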

The CoreDNS config can be edited with:

; kubectl -n kube-system edit configmap coredns

which contains:

.:53 {
    log
    errors
    health {
        lameduck 5s
    }
    ready
    kubernetes k8s.home.arpa in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
        ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
        max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}

The log line was added to help with debugging. The k8s.home.arpa mention on the kubernetes line is meant to match our clusterDomain configuration in the kubelet, and forward . /etc/resolv.conf tells CoreDNS to follow the system resolver configuration when forwarding. We can inspect that configuration by running a dnsutils pod:

apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
    command:
      - sleep
      - "infinity"
    imagePullPolicy: IfNotPresent
  dnsPolicy: Default
  restartPolicy: Always
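
Apply it and read the inherited file (dnsutils.yaml is just the name I'd save the manifest under):

; kubectl apply -f dnsutils.yaml
; kubectl exec -it dnsutils -- cat /etc/resolv.conf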

This pod will receive the same /etc/resolv.conf that CoreDNS will since it has dnsPolicy: Default. On my system, the kubelet is configured to pull /etc/resolv.conf from /run/systemd/resolve/resolv.conf, which contains:

...
nameserver 10.0.0.1
nameserver 2600:1700:f08:111f:20c:29ff:fe6f:1149
nameserver 10.0.0.1
# Too many DNS servers configured, the following entries may be ignored.
nameserver 2600:1700:f08:111f:20c:29ff:fe6f:1149
search home.arpa

In the CoreDNS logs, I noticed errors resolving the GCS hostname:

; kubectl logs --namespace=kube-system -l k8s-app=kube-dns -f
...
[INFO] 10.2.110.43:50804 - 1443 "A IN storage.googleapis.com.home.arpa. udp 50 false 512" SERVFAIL qr,aa,rd,ra 50 0.000196806s

The search home.arpa line tells resolvers to try names under .home.arpa as well as the bare name, and those .home.arpa lookups are failing. It turns out that pfSense's unbound can get overloaded when set to recursive mode. Since the /etc/resolv.conf search line contains home.arpa, every domain also gets tried with that suffix, e.g. google.com.home.arpa.
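
The failing suffixed lookup can be reproduced explicitly through CoreDNS from the dnsutils pod (assuming dig is present in the jessie-dnsutils image):

; kubectl exec -i -t dnsutils -- dig @10.3.0.10 storage.googleapis.com.home.arpa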

Under Services, DNS Resolver in pfSense, I could see “System Domain Local Zone Type” configured as Transparent, which asks upstream when x.home.arpa isn’t present locally. Swapping this to Static avoids that lookup and returns NXDOMAIN when no override is present, since pfSense should be the owner of .home.arpa. I also toggled on forwarding mode so that unbound hits the upstream DNS server for zones it doesn’t manage instead of acting as a recursive resolver itself.

[Image: pfSense DNS Resolver]

These changes solved my issue, and the init container was able to pull resources from GCS. I later realized that setting the "System Domain Local Zone Type" to Static caused the Domain Overrides for k8s.home.arpa to fail, so I switched it back to Transparent. So far, DNS within Kubernetes is working properly:

; kubectl exec -i -t dnsutils -- nslookup google.com
Server:		10.3.0.10
Address:	10.3.0.10#53

Non-authoritative answer:
Name:	google.com
Address: 142.250.113.101
Name:	google.com
Address: 142.250.113.100
Name:	google.com
Address: 142.250.113.138
Name:	google.com
Address: 142.250.113.102
Name:	google.com
Address: 142.250.113.113
Name:	google.com
Address: 142.250.113.139