What happens when you deliberately cause a failure in Kubernetes?

Posted on 2018-02-25

#kubernetes

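As I start using Kubernetes in earnest, I have been studying how it works behind the scenes. Running something that has been abstracted into a black box with only a vague understanding of it is scary. Whether you really understand the mechanics shows clearly during an incident.

So I tried deliberately causing a failure in Kubernetes to see what would happen. This time I assumed an unusual Node failure: all of the iptables rules, which play an important role in Kubernetes networking, have disappeared.

The Dockerfile and Kubernetes manifest files used in this experiment are published on GitHub. The Docker image is also in a public repository on Docker Hub, so you can try this right away as long as you have a Kubernetes cluster.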

manabusakai/k8s-hello-world

Test environment

I tested on a Kubernetes cluster built on AWS with kops. The Kubernetes version is 1.8.6.

The state of the cluster's Nodes, Pods, and Services is as follows.

$ kubectl get no
NAME                                               STATUS    ROLES     AGE       VERSION
ip-172-20-46-130.ap-northeast-1.compute.internal   Ready     master    17h       v1.8.6
ip-172-20-64-88.ap-northeast-1.compute.internal    Ready     node      18h       v1.8.6

$ kubectl get po
NAME                               READY     STATUS    RESTARTS   AGE
k8s-hello-world-55f48f8c94-7shq5   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-9w5tj   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-cdc64   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-lkdvj   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-npkn6   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-ppsqk   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-sc9pf   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-tjg4n   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-vrkr9   1/1       Running   0          1m
k8s-hello-world-55f48f8c94-xzvlc   1/1       Running   0          1m

$ kubectl get svc
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
k8s-hello-world   NodePort    100.69.211.31   <none>        8080:30000/TCP   3h
kubernetes        ClusterIP   100.64.0.1      <none>        443/TCP          18h

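There is one Node, with 10 Pods running on it. The Service's NodePort is reachable from the internet through an ALB.

Deleting the iptables rules

Let's log in to the Node server and delete all of the iptables rules with iptables -F. This should cut off communication between the Service and the Pods.

Meanwhile, from another terminal, I run curl once a second to check whether there is a moment when requests fail. Timestamps are printed with date +%s so that the two terminals can be compared.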

$ date +%s && sudo iptables -F
1519526806

$ while sleep 1; do date +%s; curl -sS http://k8s-hello-world.manabusakai.com/ | grep ^Hello; done
(snip)
1519526803
Hello world! via k8s-hello-world-55f48f8c94-vrkr9
1519526804
Hello world! via k8s-hello-world-55f48f8c94-tjg4n
1519526805
Hello world! via k8s-hello-world-55f48f8c94-vrkr9
1519526806
Hello world! via k8s-hello-world-55f48f8c94-tjg4n
1519526807
Hello world! via k8s-hello-world-55f48f8c94-tjg4n
1519526834
Hello world! via k8s-hello-world-55f48f8c94-vrkr9
1519526835
Hello world! via k8s-hello-world-55f48f8c94-vrkr9
^C

Looking at the results, after 1519526807 there is a gap of 27 seconds before the next response comes back at 1519526834. During that time the curl command was waiting for a response.
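The gap does not have to be read off by eye. The following is a minimal sketch that finds it from the saved loop output; max_gap is a hypothetical helper name introduced here, and it assumes every timestamp sits on a line of its own, as in the output above.

```shell
# Hypothetical helper: read the loop output on stdin and print
# "<largest gap in seconds> <timestamp before> <timestamp after>".
# Lines that are not purely digits (the Hello lines, "^C") are ignored.
max_gap() {
  awk '
    /^[0-9]+$/ {
      if (seen && $1 - prev > gap) { gap = $1 - prev; from = prev; to = $1 }
      prev = $1; seen = 1
    }
    END { print gap + 0, from + 0, to + 0 }
  '
}
```

Feeding it the output shown above prints 27 1519526807 1519526834.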

Running iptables -L on the Node server again shows that the rules I deleted have been restored.

$ sudo iptables -L KUBE-FORWARD
Chain KUBE-FORWARD (1 references)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             /* kubernetes forwarding rules */ mark match 0x4000/0x4000
ACCEPT     all  --  100.96.0.0/11        anywhere             /* kubernetes forwarding conntrack pod source rule */ ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             100.96.0.0/11        /* kubernetes forwarding conntrack pod destination rule */ ctstate RELATED,ESTABLISHED

No matter how many times I try it, the rules come back in a little over 20 seconds. The reason for this 20-second figure becomes clear later.
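To measure the recovery time on the Node itself rather than inferring it from curl, a small polling helper might look like the following. This is only a sketch: wait_for and rules_present are hypothetical names introduced here, and treating "KUBE-FORWARD has at least one rule again" as the sign of recovery is an assumption based on the chain shown above.

```shell
# Hypothetical helper: run the given command once a second until it
# succeeds, then print how many seconds that took.
wait_for() (
  start=$(date +%s)
  until "$@" >/dev/null 2>&1; do
    sleep 1
  done
  echo $(( $(date +%s) - start ))
)

# Assumption: the rules count as restored once the KUBE-FORWARD chain
# contains at least one rule again.
rules_present() {
  sudo iptables -S KUBE-FORWARD | grep -q -- '^-A KUBE-FORWARD'
}
```

On the Node, running sudo iptables -F && wait_for rules_present would then print the number of seconds until the chain is populated again.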

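Who restored the iptables rules?

So deleting the iptables rules does not last: they are soon restored. The question is who is restoring them.

kube-proxy is the component that uses iptables to proxy traffic to Pods, but following /var/log/daemon.log, I found that the following log entry was written after the rules were deleted with iptables -F (it is actually a single line).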

Feb 25 03:17:33 ip-172-20-64-88 kubelet[1229]: I0225 03:17:33.070531
1229 qos_container_manager_linux.go:320] [ContainerManager]: Updated QoS cgroup configuration

Tracing the source code, this log is emitted in kubernetes/pkg/kubelet/cm/qos_container_manager_linux.go.

kubernetes/qos_container_manager_linux.go at release-1.8 · kubernetes/kubernetes

The kubelet documentation contains the following description.

The kubelet takes a set of PodSpecs that are provided through various mechanisms (primarily through the apiserver) and ensures that the containers described in those PodSpecs are running and healthy.

File: Path passed as a flag on the command line. Files under this path will be monitored periodically for updates. The monitoring period is 20s by default and is configurable via a flag.

Checking the kubelet process, it is indeed started with the argument --pod-manifest-path=/etc/kubernetes/manifests. Looking in that directory...

$ ls -l /etc/kubernetes/manifests
total 4
-rw-r--r-- 1 root root 1398 Feb 24 08:08 kube-proxy.manifest

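There was a file called kube-proxy.manifest! So the kubelet checks the state every 20 seconds and, if it differs, restores it to what the manifest file specifies. That also fits with the rules coming back in a little over 20 seconds during the test.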
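The kubelet flag check described above can be done by pulling the value out of the process's command line. A minimal sketch, assuming a Linux ps from procps; flag_value is a hypothetical helper introduced here. The flag name --file-check-frequency, which I believe controls the 20s monitoring period quoted above, is an assumption to confirm against kubelet --help for your version.

```shell
# Hypothetical helper: read a command line on stdin and print the value
# of the --NAME=VALUE flag whose name is given as $1.
# Prints nothing if the flag is not present.
flag_value() {
  tr ' ' '\n' | sed -n "s/^--$1=//p"
}
```

For example, ps -o args= -C kubelet | flag_value pod-manifest-path prints the manifest directory. If the same pipeline with file-check-frequency prints nothing, the flag was not set explicitly and the default applies.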

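Summary

I found that Kubernetes' fault tolerance is achieved through well-designed mechanisms.

It is important not to take what the documentation says at face value, but to get hands-on and read the source code yourself. As an engineer, I want to keep this basic habit in mind.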


はったりエンジニアの備忘録