Debugging k8s Locally

Posted on 2020-02-04 | Category: k8s

0. Local Debugging

Communication between the k8s components is encrypted, so if you simply set each component's startup flags they will not be able to talk to each other. This section describes how to set up the configuration so that a k8s cluster can be debugged locally.

0.1 Running and Debugging Locally

Kubernetes ships with hack/local-up-cluster.sh, which brings up a local k8s cluster, and the certificates and kubeconfigs it generates can also be reused when debugging in an IDE (GoLand/IDEA). The following change was made here: the commands that start each component were turned into echo commands that print the component's command-line arguments instead; see attachment [^1].
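As a rough illustration of how the echoed output can be used (this is not part of the original script, and the flag values below are placeholders; use whatever the modified script prints for your environment), the printed flags can be fed to a locally built component running under Delve or an IDE run configuration:

```bash
# Run the modified copy of hack/local-up-cluster.sh from attachment [^1]: it still generates
# the CA, serving/client certificates and kubeconfigs, but only echoes each component's
# command line instead of executing it.
sudo ./hack/local-up-cluster.sh -O

# Copy the echoed flags for the component you want to step through, e.g. kube-scheduler,
# and start it under dlv (or paste the same flags into a GoLand/IDEA run configuration).
cd cmd/kube-scheduler
dlv debug -- \
  --kubeconfig=/var/run/kubernetes/scheduler.kubeconfig \
  --leader-elect=false \
  --master=https://localhost:6443
```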

0.2 Remote Debugging

Another approach is to run a k8s cluster on another machine and debug it remotely with dlv (Delve); see this article: 搭建k8s的开发调试环境 (setting up a k8s development and debugging environment).
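A minimal sketch of that remote setup (host names, ports, paths, and flags below are illustrative assumptions, not taken from the referenced article):

```bash
# On the remote machine: build the component with optimizations and inlining disabled so that
# breakpoints and variables behave predictably, then run it under a headless dlv server.
make WHAT=cmd/kube-apiserver GOGCFLAGS="all=-N -l"
dlv exec _output/local/bin/linux/amd64/kube-apiserver \
  --headless --listen=:2345 --api-version=2 --accept-multiclient -- \
  <the kube-apiserver flags echoed by the script>

# On the local machine: attach from the dlv CLI, or point a "Go Remote" run configuration
# in GoLand/IDEA at the same host and port.
dlv connect <remote-host>:2345
```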

Attachment

[^1]: Script that generates the keys locally and prints each component's command-line arguments

#!/usr/bin/env bash

# Copyright 2014 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

KUBE_ROOT=$(dirname "${BASH_SOURCE[0]}")/..

# This command builds and runs a local kubernetes cluster.
# You may need to run this as root to allow kubelet to open docker's socket,
# and to write the test CA in /var/run/kubernetes.
DOCKER_OPTS=${DOCKER_OPTS:-""}
export DOCKER=(docker "${DOCKER_OPTS[@]}")
DOCKER_ROOT=${DOCKER_ROOT:-""}
ALLOW_PRIVILEGED=${ALLOW_PRIVILEGED:-""}
DENY_SECURITY_CONTEXT_ADMISSION=${DENY_SECURITY_CONTEXT_ADMISSION:-""}
PSP_ADMISSION=${PSP_ADMISSION:-""}
NODE_ADMISSION=${NODE_ADMISSION:-""}
RUNTIME_CONFIG=${RUNTIME_CONFIG:-""}
KUBELET_AUTHORIZATION_WEBHOOK=${KUBELET_AUTHORIZATION_WEBHOOK:-""}
KUBELET_AUTHENTICATION_WEBHOOK=${KUBELET_AUTHENTICATION_WEBHOOK:-""}
POD_MANIFEST_PATH=${POD_MANIFEST_PATH:-"/var/run/kubernetes/static-pods"}
KUBELET_FLAGS=${KUBELET_FLAGS:-""}
KUBELET_IMAGE=${KUBELET_IMAGE:-""}
# many dev environments run with swap on, so we don't fail in this env
FAIL_SWAP_ON=${FAIL_SWAP_ON:-"false"}
# Name of the network plugin, eg: "kubenet"
NET_PLUGIN=${NET_PLUGIN:-""}
# Place the config files and binaries required by NET_PLUGIN in these directory,
# eg: "/etc/cni/net.d" for config files, and "/opt/cni/bin" for binaries.
CNI_CONF_DIR=${CNI_CONF_DIR:-""}
CNI_BIN_DIR=${CNI_BIN_DIR:-""}
SERVICE_CLUSTER_IP_RANGE=${SERVICE_CLUSTER_IP_RANGE:-10.0.0.0/24}
FIRST_SERVICE_CLUSTER_IP=${FIRST_SERVICE_CLUSTER_IP:-10.0.0.1}
# if enabled, must set CGROUP_ROOT
CGROUPS_PER_QOS=${CGROUPS_PER_QOS:-true}
# name of the cgroup driver, i.e. cgroupfs or systemd
CGROUP_DRIVER=${CGROUP_DRIVER:-""}
# if cgroups per qos is enabled, optionally change cgroup root
CGROUP_ROOT=${CGROUP_ROOT:-""}
# owner of client certs, default to current user if not specified
USER=${USER:-$(whoami)}

# enables testing eviction scenarios locally.
EVICTION_HARD=${EVICTION_HARD:-"memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%"}
EVICTION_SOFT=${EVICTION_SOFT:-""}
EVICTION_PRESSURE_TRANSITION_PERIOD=${EVICTION_PRESSURE_TRANSITION_PERIOD:-"1m"}

# This script uses docker0 (or whatever container bridge docker is currently using)
# and we don't know the IP of the DNS pod to pass in as --cluster-dns.
# To set this up by hand, set this flag and change DNS_SERVER_IP.
# Note also that you need API_HOST (defined above) for correct DNS.
KUBE_PROXY_MODE=${KUBE_PROXY_MODE:-""}
ENABLE_CLUSTER_DNS=${KUBE_ENABLE_CLUSTER_DNS:-true}
ENABLE_NODELOCAL_DNS=${KUBE_ENABLE_NODELOCAL_DNS:-false}
DNS_SERVER_IP=${KUBE_DNS_SERVER_IP:-10.0.0.10}
LOCAL_DNS_IP=${KUBE_LOCAL_DNS_IP:-169.254.20.10}
DNS_MEMORY_LIMIT=${KUBE_DNS_MEMORY_LIMIT:-170Mi}
DNS_DOMAIN=${KUBE_DNS_NAME:-"cluster.local"}
KUBECTL=${KUBECTL:-"${KUBE_ROOT}/cluster/kubectl.sh"}
WAIT_FOR_URL_API_SERVER=${WAIT_FOR_URL_API_SERVER:-60}
MAX_TIME_FOR_URL_API_SERVER=${MAX_TIME_FOR_URL_API_SERVER:-1}
ENABLE_DAEMON=${ENABLE_DAEMON:-false}
HOSTNAME_OVERRIDE=${HOSTNAME_OVERRIDE:-"127.0.0.1"}
EXTERNAL_CLOUD_PROVIDER=${EXTERNAL_CLOUD_PROVIDER:-false}
EXTERNAL_CLOUD_PROVIDER_BINARY=${EXTERNAL_CLOUD_PROVIDER_BINARY:-""}
CLOUD_PROVIDER=${CLOUD_PROVIDER:-""}
CLOUD_CONFIG=${CLOUD_CONFIG:-""}
FEATURE_GATES=${FEATURE_GATES:-"AllAlpha=false"}
STORAGE_BACKEND=${STORAGE_BACKEND:-"etcd3"}
STORAGE_MEDIA_TYPE=${STORAGE_MEDIA_TYPE:-""}
# preserve etcd data. you also need to set ETCD_DIR.
PRESERVE_ETCD="${PRESERVE_ETCD:-false}"

# enable kubernetes dashboard
ENABLE_CLUSTER_DASHBOARD=${KUBE_ENABLE_CLUSTER_DASHBOARD:-false}

# RBAC Mode options
AUTHORIZATION_MODE=${AUTHORIZATION_MODE:-"Node,RBAC"}
KUBECONFIG_TOKEN=${KUBECONFIG_TOKEN:-""}
AUTH_ARGS=${AUTH_ARGS:-""}

# WebHook Authentication and Authorization
AUTHORIZATION_WEBHOOK_CONFIG_FILE=${AUTHORIZATION_WEBHOOK_CONFIG_FILE:-""}
AUTHENTICATION_WEBHOOK_CONFIG_FILE=${AUTHENTICATION_WEBHOOK_CONFIG_FILE:-""}

# Install a default storage class (enabled by default)
DEFAULT_STORAGE_CLASS=${KUBE_DEFAULT_STORAGE_CLASS:-true}

# Do not run the mutation detector by default on a local cluster.
# It is intended for a specific type of testing and inherently leaks memory.
KUBE_CACHE_MUTATION_DETECTOR="${KUBE_CACHE_MUTATION_DETECTOR:-false}"
export KUBE_CACHE_MUTATION_DETECTOR

# panic the server on watch decode errors since they are considered coder mistakes
KUBE_PANIC_WATCH_DECODE_ERROR="${KUBE_PANIC_WATCH_DECODE_ERROR:-true}"
export KUBE_PANIC_WATCH_DECODE_ERROR

# Default list of admission Controllers to invoke prior to persisting objects in cluster
# The order defined here does not matter.
ENABLE_ADMISSION_PLUGINS=${ENABLE_ADMISSION_PLUGINS:-"NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,Priority,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota"}
DISABLE_ADMISSION_PLUGINS=${DISABLE_ADMISSION_PLUGINS:-""}
ADMISSION_CONTROL_CONFIG_FILE=${ADMISSION_CONTROL_CONFIG_FILE:-""}

# START_MODE can be 'all', 'kubeletonly', 'nokubelet', or 'nokubeproxy'
START_MODE=${START_MODE:-"all"}

# A list of controllers to enable
KUBE_CONTROLLERS="${KUBE_CONTROLLERS:-"*"}"

# Audit policy
AUDIT_POLICY_FILE=${AUDIT_POLICY_FILE:-""}

# sanity check for OpenStack provider
if [ "${CLOUD_PROVIDER}" == "openstack" ]; then
if [ "${CLOUD_CONFIG}" == "" ]; then
echo "Missing CLOUD_CONFIG env for OpenStack provider!"
exit 1
fi
if [ ! -f "${CLOUD_CONFIG}" ]; then
echo "Cloud config ${CLOUD_CONFIG} doesn't exist"
exit 1
fi
fi

# warn if users are running with swap allowed
if [ "${FAIL_SWAP_ON}" == "false" ]; then
echo "WARNING : The kubelet is configured to not fail even if swap is enabled; production deployments should disable swap."
fi

if [ "$(id -u)" != "0" ]; then
echo "WARNING : This script MAY be run as root for docker socket / iptables functionality; if failures occur, retry as root." 2>&1
fi

# Stop right away if the build fails
set -e

source "${KUBE_ROOT}/hack/lib/init.sh"
kube::util::ensure-gnu-sed

function usage {
echo "This script starts a local kube cluster. "
echo "Example 0: hack/local-up-cluster.sh -h (this 'help' usage description)"
echo "Example 1: hack/local-up-cluster.sh -o _output/dockerized/bin/linux/amd64/ (run from docker output)"
echo "Example 2: hack/local-up-cluster.sh -O (auto-guess the bin path for your platform)"
echo "Example 3: hack/local-up-cluster.sh (build a local copy of the source)"
}

# This function guesses where the existing cached binary build is for the `-O`
# flag
function guess_built_binary_path {
local apiserver_path
apiserver_path=$(kube::util::find-binary "kube-apiserver")
if [[ -z "${apiserver_path}" ]]; then
return
fi
echo -n "$(dirname "${apiserver_path}")"
}

### Allow user to supply the source directory.
GO_OUT=${GO_OUT:-}
while getopts "ho:O" OPTION
do
case ${OPTION} in
o)
echo "skipping build"
GO_OUT="${OPTARG}"
echo "using source ${GO_OUT}"
;;
O)
GO_OUT=$(guess_built_binary_path)
if [ "${GO_OUT}" == "" ]; then
echo "Could not guess the correct output directory to use."
exit 1
fi
;;
h)
usage
exit
;;
?)
usage
exit
;;
esac
done

if [ "x${GO_OUT}" == "x" ]; then
make -C "${KUBE_ROOT}" WHAT="cmd/kubectl cmd/kube-apiserver cmd/kube-controller-manager cmd/cloud-controller-manager cmd/kubelet cmd/kube-proxy cmd/kube-scheduler"
else
echo "skipped the build."
fi

# Shut down anyway if there's an error.
set +e

API_PORT=${API_PORT:-8080}
API_SECURE_PORT=${API_SECURE_PORT:-6443}

# WARNING: For DNS to work on most setups you should export API_HOST as the docker0 ip address,
API_HOST=${API_HOST:-localhost}
API_HOST_IP=${API_HOST_IP:-"127.0.0.1"}
ADVERTISE_ADDRESS=${ADVERTISE_ADDRESS:-""}
NODE_PORT_RANGE=${NODE_PORT_RANGE:-""}
API_BIND_ADDR=${API_BIND_ADDR:-"0.0.0.0"}
EXTERNAL_HOSTNAME=${EXTERNAL_HOSTNAME:-localhost}

KUBELET_HOST=${KUBELET_HOST:-"127.0.0.1"}
# By default only allow CORS for requests on localhost
API_CORS_ALLOWED_ORIGINS=${API_CORS_ALLOWED_ORIGINS:-/127.0.0.1(:[0-9]+)?$,/localhost(:[0-9]+)?$}
KUBELET_PORT=${KUBELET_PORT:-10250}
LOG_LEVEL=${LOG_LEVEL:-3}
# Use to increase verbosity on particular files, e.g. LOG_SPEC=token_controller*=5,other_controller*=4
LOG_SPEC=${LOG_SPEC:-""}
LOG_DIR=${LOG_DIR:-"/tmp"}
CONTAINER_RUNTIME=${CONTAINER_RUNTIME:-"docker"}
CONTAINER_RUNTIME_ENDPOINT=${CONTAINER_RUNTIME_ENDPOINT:-""}
RUNTIME_REQUEST_TIMEOUT=${RUNTIME_REQUEST_TIMEOUT:-"2m"}
IMAGE_SERVICE_ENDPOINT=${IMAGE_SERVICE_ENDPOINT:-""}
CHAOS_CHANCE=${CHAOS_CHANCE:-0.0}
CPU_CFS_QUOTA=${CPU_CFS_QUOTA:-true}
ENABLE_HOSTPATH_PROVISIONER=${ENABLE_HOSTPATH_PROVISIONER:-"false"}
CLAIM_BINDER_SYNC_PERIOD=${CLAIM_BINDER_SYNC_PERIOD:-"15s"} # current k8s default
ENABLE_CONTROLLER_ATTACH_DETACH=${ENABLE_CONTROLLER_ATTACH_DETACH:-"true"} # current default
# This is the default dir and filename where the apiserver will generate a self-signed cert
# which should be able to be used as the CA to verify itself
CERT_DIR=${CERT_DIR:-"/etc/run/kubernetes"}
ROOT_CA_FILE=${CERT_DIR}/server-ca.crt
ROOT_CA_KEY=${CERT_DIR}/server-ca.key
CLUSTER_SIGNING_CERT_FILE=${CLUSTER_SIGNING_CERT_FILE:-"${ROOT_CA_FILE}"}
CLUSTER_SIGNING_KEY_FILE=${CLUSTER_SIGNING_KEY_FILE:-"${ROOT_CA_KEY}"}
# Reuse certs will skip generate new ca/cert files under CERT_DIR
# it's useful with PRESERVE_ETCD=true because new ca will make existed service account secrets invalided
REUSE_CERTS=${REUSE_CERTS:-false}

# name of the cgroup driver, i.e. cgroupfs or systemd
if [[ ${CONTAINER_RUNTIME} == "docker" ]]; then
# default cgroup driver to match what is reported by docker to simplify local development
if [[ -z ${CGROUP_DRIVER} ]]; then
# match driver with docker runtime reported value (they must match)
CGROUP_DRIVER=$(docker info | grep "Cgroup Driver:" | sed -e 's/^[[:space:]]*//'|cut -f3- -d' ')
echo "Kubelet cgroup driver defaulted to use: ${CGROUP_DRIVER}"
fi
if [[ -f /var/log/docker.log && ! -f "${LOG_DIR}/docker.log" ]]; then
ln -s /var/log/docker.log "${LOG_DIR}/docker.log"
fi
fi



# Ensure CERT_DIR is created for auto-generated crt/key and kubeconfig
mkdir -p "${CERT_DIR}" &>/dev/null || sudo mkdir -p "${CERT_DIR}"
CONTROLPLANE_SUDO=$(test -w "${CERT_DIR}" || echo "sudo -E")

function test_apiserver_off {
# For the common local scenario, fail fast if server is already running.
# this can happen if you run local-up-cluster.sh twice and kill etcd in between.
if [[ "${API_PORT}" -gt "0" ]]; then
if ! curl --silent -g "${API_HOST}:${API_PORT}" ; then
echo "API SERVER insecure port is free, proceeding..."
else
echo "ERROR starting API SERVER, exiting. Some process on ${API_HOST} is serving already on ${API_PORT}"
exit 1
fi
fi

if ! curl --silent -k -g "${API_HOST}:${API_SECURE_PORT}" ; then
echo "API SERVER secure port is free, proceeding..."
else
echo "ERROR starting API SERVER, exiting. Some process on ${API_HOST} is serving already on ${API_SECURE_PORT}"
exit 1
fi
}

function detect_binary {
# Detect the OS name/arch so that we can find our binary
case "$(uname -s)" in
Darwin)
host_os=darwin
;;
Linux)
host_os=linux
;;
*)
echo "Unsupported host OS. Must be Linux or Mac OS X." >&2
exit 1
;;
esac

case "$(uname -m)" in
x86_64*)
host_arch=amd64
;;
i?86_64*)
host_arch=amd64
;;
amd64*)
host_arch=amd64
;;
aarch64*)
host_arch=arm64
;;
arm64*)
host_arch=arm64
;;
arm*)
host_arch=arm
;;
i?86*)
host_arch=x86
;;
s390x*)
host_arch=s390x
;;
ppc64le*)
host_arch=ppc64le
;;
*)
echo "Unsupported host arch. Must be x86_64, 386, arm, arm64, s390x or ppc64le." >&2
exit 1
;;
esac

GO_OUT="${KUBE_ROOT}/_output/local/bin/${host_os}/${host_arch}"
}

cleanup()
{
echo "Cleaning up..."
# delete running images
# if [[ "${ENABLE_CLUSTER_DNS}" == true ]]; then
# Still need to figure why this commands throw an error: Error from server: client: etcd cluster is unavailable or misconfigured
# ${KUBECTL} --namespace=kube-system delete service kube-dns
# And this one hang forever:
# ${KUBECTL} --namespace=kube-system delete rc kube-dns-v10
# fi

# Check if the API server is still running
[[ -n "${APISERVER_PID-}" ]] && kube::util::read-array APISERVER_PIDS < <(pgrep -P "${APISERVER_PID}" ; ps -o pid= -p "${APISERVER_PID}")
[[ -n "${APISERVER_PIDS-}" ]] && sudo kill "${APISERVER_PIDS[@]}" 2>/dev/null

# Check if the controller-manager is still running
[[ -n "${CTLRMGR_PID-}" ]] && kube::util::read-array CTLRMGR_PIDS < <(pgrep -P "${CTLRMGR_PID}" ; ps -o pid= -p "${CTLRMGR_PID}")
[[ -n "${CTLRMGR_PIDS-}" ]] && sudo kill "${CTLRMGR_PIDS[@]}" 2>/dev/null

# Check if the kubelet is still running
[[ -n "${KUBELET_PID-}" ]] && kube::util::read-array KUBELET_PIDS < <(pgrep -P "${KUBELET_PID}" ; ps -o pid= -p "${KUBELET_PID}")
[[ -n "${KUBELET_PIDS-}" ]] && sudo kill "${KUBELET_PIDS[@]}" 2>/dev/null

# Check if the proxy is still running
[[ -n "${PROXY_PID-}" ]] && kube::util::read-array PROXY_PIDS < <(pgrep -P "${PROXY_PID}" ; ps -o pid= -p "${PROXY_PID}")
[[ -n "${PROXY_PIDS-}" ]] && sudo kill "${PROXY_PIDS[@]}" 2>/dev/null

# Check if the scheduler is still running
[[ -n "${SCHEDULER_PID-}" ]] && kube::util::read-array SCHEDULER_PIDS < <(pgrep -P "${SCHEDULER_PID}" ; ps -o pid= -p "${SCHEDULER_PID}")
[[ -n "${SCHEDULER_PIDS-}" ]] && sudo kill "${SCHEDULER_PIDS[@]}" 2>/dev/null

# Check if the etcd is still running
[[ -n "${ETCD_PID-}" ]] && kube::etcd::stop
if [[ "${PRESERVE_ETCD}" == "false" ]]; then
[[ -n "${ETCD_DIR-}" ]] && kube::etcd::clean_etcd_dir
fi
exit 0
}

# Check if all processes are still running. Prints a warning once each time
# a process dies unexpectedly.
function healthcheck {
if [[ -n "${APISERVER_PID-}" ]] && ! sudo kill -0 "${APISERVER_PID}" 2>/dev/null; then
warning_log "API server terminated unexpectedly, see ${APISERVER_LOG}"
APISERVER_PID=
fi

if [[ -n "${CTLRMGR_PID-}" ]] && ! sudo kill -0 "${CTLRMGR_PID}" 2>/dev/null; then
warning_log "kube-controller-manager terminated unexpectedly, see ${CTLRMGR_LOG}"
CTLRMGR_PID=
fi

if [[ -n "${KUBELET_PID-}" ]] && ! sudo kill -0 "${KUBELET_PID}" 2>/dev/null; then
warning_log "kubelet terminated unexpectedly, see ${KUBELET_LOG}"
KUBELET_PID=
fi

if [[ -n "${PROXY_PID-}" ]] && ! sudo kill -0 "${PROXY_PID}" 2>/dev/null; then
warning_log "kube-proxy terminated unexpectedly, see ${PROXY_LOG}"
PROXY_PID=
fi

if [[ -n "${SCHEDULER_PID-}" ]] && ! sudo kill -0 "${SCHEDULER_PID}" 2>/dev/null; then
warning_log "scheduler terminated unexpectedly, see ${SCHEDULER_LOG}"
SCHEDULER_PID=
fi

if [[ -n "${ETCD_PID-}" ]] && ! sudo kill -0 "${ETCD_PID}" 2>/dev/null; then
warning_log "etcd terminated unexpectedly"
ETCD_PID=
fi
}

function print_color {
message=$1
prefix=${2:+$2: } # add colon only if defined
color=${3:-1} # default is red
echo -n "$(tput bold)$(tput setaf "${color}")"
echo "${prefix}${message}"
echo -n "$(tput sgr0)"
}

function warning_log {
print_color "$1" "W$(date "+%m%d %H:%M:%S")]" 1
}

function start_etcd {
echo "Starting etcd"
#export ETCD_LOGFILE=${LOG_DIR}/etcd.log
#kube::etcd::start
}

function set_service_accounts {
SERVICE_ACCOUNT_LOOKUP=${SERVICE_ACCOUNT_LOOKUP:-true}
SERVICE_ACCOUNT_KEY=${SERVICE_ACCOUNT_KEY:-/tmp/kube-serviceaccount.key}
# Generate ServiceAccount key if needed
if [[ ! -f "${SERVICE_ACCOUNT_KEY}" ]]; then
mkdir -p "$(dirname "${SERVICE_ACCOUNT_KEY}")"
openssl genrsa -out "${SERVICE_ACCOUNT_KEY}" 2048 2>/dev/null
fi
}

function generate_certs {
# Create CA signers
if [[ "${ENABLE_SINGLE_CA_SIGNER:-}" = true ]]; then
kube::util::create_signing_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" server '"client auth","server auth"'
sudo cp "${CERT_DIR}/server-ca.key" "${CERT_DIR}/client-ca.key"
sudo cp "${CERT_DIR}/server-ca.crt" "${CERT_DIR}/client-ca.crt"
sudo cp "${CERT_DIR}/server-ca-config.json" "${CERT_DIR}/client-ca-config.json"
else
kube::util::create_signing_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" server '"server auth"'
kube::util::create_signing_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" client '"client auth"'
fi

# Create auth proxy client ca
kube::util::create_signing_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" request-header '"client auth"'

# serving cert for kube-apiserver
kube::util::create_serving_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "server-ca" kube-apiserver kubernetes.default kubernetes.default.svc "localhost" "${API_HOST_IP}" "${API_HOST}" "${FIRST_SERVICE_CLUSTER_IP}"

# Create client certs signed with client-ca, given id, given CN and a number of groups
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' controller system:kube-controller-manager
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' scheduler system:kube-scheduler
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' admin system:admin system:masters
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' kube-apiserver kube-apiserver

# Create matching certificates for kube-aggregator
kube::util::create_serving_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "server-ca" kube-aggregator api.kube-public.svc "localhost" "${API_HOST_IP}"
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" request-header-ca auth-proxy system:auth-proxy

# TODO remove masters and add rolebinding
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' kube-aggregator system:kube-aggregator system:masters
kube::util::write_client_kubeconfig "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "${ROOT_CA_FILE}" "${API_HOST}" "${API_SECURE_PORT}" kube-aggregator
}

function generate_kubeproxy_certs {
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' kube-proxy system:kube-proxy system:nodes
kube::util::write_client_kubeconfig "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "${ROOT_CA_FILE}" "${API_HOST}" "${API_SECURE_PORT}" kube-proxy
}

function generate_kubelet_certs {
kube::util::create_client_certkey "${CONTROLPLANE_SUDO}" "${CERT_DIR}" 'client-ca' kubelet "system:node:${HOSTNAME_OVERRIDE}" system:nodes
kube::util::write_client_kubeconfig "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "${ROOT_CA_FILE}" "${API_HOST}" "${API_SECURE_PORT}" kubelet
}

function start_apiserver {
security_admission=""
if [[ -n "${DENY_SECURITY_CONTEXT_ADMISSION}" ]]; then
security_admission=",SecurityContextDeny"
fi
if [[ -n "${PSP_ADMISSION}" ]]; then
security_admission=",PodSecurityPolicy"
fi
if [[ -n "${NODE_ADMISSION}" ]]; then
security_admission=",NodeRestriction"
fi

# Append security_admission plugin
ENABLE_ADMISSION_PLUGINS="${ENABLE_ADMISSION_PLUGINS}${security_admission}"

authorizer_arg=""
if [[ -n "${AUTHORIZATION_MODE}" ]]; then
authorizer_arg="--authorization-mode=${AUTHORIZATION_MODE}"
fi
priv_arg=""
if [[ -n "${ALLOW_PRIVILEGED}" ]]; then
priv_arg="--allow-privileged=${ALLOW_PRIVILEGED}"
fi

runtime_config=""
if [[ -n "${RUNTIME_CONFIG}" ]]; then
runtime_config="--runtime-config=${RUNTIME_CONFIG}"
fi

# Let the API server pick a default address when API_HOST_IP
# is set to 127.0.0.1
advertise_address=""
if [[ "${API_HOST_IP}" != "127.0.0.1" ]]; then
advertise_address="--advertise-address=${API_HOST_IP}"
fi
if [[ "${ADVERTISE_ADDRESS}" != "" ]] ; then
advertise_address="--advertise-address=${ADVERTISE_ADDRESS}"
fi
node_port_range=""
if [[ "${NODE_PORT_RANGE}" != "" ]] ; then
node_port_range="--service-node-port-range=${NODE_PORT_RANGE}"
fi

if [[ "${REUSE_CERTS}" != true ]]; then
# Create Certs
generate_certs
fi

cloud_config_arg="--cloud-provider=${CLOUD_PROVIDER} --cloud-config=${CLOUD_CONFIG}"
if [[ "${EXTERNAL_CLOUD_PROVIDER:-}" == "true" ]]; then
cloud_config_arg="--cloud-provider=external"
fi

if [[ -z "${AUDIT_POLICY_FILE}" ]]; then
cat <<EOF > /tmp/kube-audit-policy-file
# Log all requests at the Metadata level.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
EOF
AUDIT_POLICY_FILE="/tmp/kube-audit-policy-file"
fi

APISERVER_LOG=${LOG_DIR}/kube-apiserver.log
# shellcheck disable=SC2086
echo "go build apiserver.go ${authorizer_arg} ${priv_arg} ${runtime_config} ${cloud_config_arg} ${advertise_address} ${node_port_range} --v=${LOG_LEVEL} --vmodule=${LOG_SPEC} --audit-policy-file=${AUDIT_POLICY_FILE} --audit-log-path=${LOG_DIR}/kube-apiserver-audit.log --authorization-webhook-config-file=${AUTHORIZATION_WEBHOOK_CONFIG_FILE} --authentication-token-webhook-config-file=${AUTHENTICATION_WEBHOOK_CONFIG_FILE} --cert-dir=${CERT_DIR} --client-ca-file=${CERT_DIR}/client-ca.crt --kubelet-client-certificate=${CERT_DIR}/client-kube-apiserver.crt --kubelet-client-key=${CERT_DIR}/client-kube-apiserver.key --service-account-key-file=${SERVICE_ACCOUNT_KEY} --service-account-lookup=${SERVICE_ACCOUNT_LOOKUP} --enable-admission-plugins=${ENABLE_ADMISSION_PLUGINS} --disable-admission-plugins=${DISABLE_ADMISSION_PLUGINS} --admission-control-config-file=${ADMISSION_CONTROL_CONFIG_FILE} --bind-address=${API_BIND_ADDR} --secure-port=${API_SECURE_PORT} --tls-cert-file=${CERT_DIR}/serving-kube-apiserver.crt --tls-private-key-file=${CERT_DIR}/serving-kube-apiserver.key --insecure-bind-address=${API_HOST_IP} --insecure-port=${API_PORT} --storage-backend=${STORAGE_BACKEND} --storage-media-type=${STORAGE_MEDIA_TYPE} --etcd-servers=http://${ETCD_HOST}:${ETCD_PORT} --service-cluster-ip-range=${SERVICE_CLUSTER_IP_RANGE} --feature-gates=${FEATURE_GATES} --external-hostname=${EXTERNAL_HOSTNAME} --requestheader-username-headers=X-Remote-User --requestheader-group-headers=X-Remote-Group --requestheader-extra-headers-prefix=X-Remote-Extra- --requestheader-client-ca-file=${CERT_DIR}/request-header-ca.crt --requestheader-allowed-names=system:auth-proxy --proxy-client-cert-file=${CERT_DIR}/client-auth-proxy.crt --proxy-client-key-file=${CERT_DIR}/client-auth-proxy.key --cors-allowed-origins=${API_CORS_ALLOWED_ORIGINS} >${APISERVER_LOG} 2>&1 & "

# Wait for kube-apiserver to come up before launching the rest of the components.
echo "Waiting for apiserver to come up"
#kube::util::wait_for_url "https://${API_HOST_IP}:${API_SECURE_PORT}/healthz" "apiserver: " 1 "${WAIT_FOR_URL_API_SERVER}" "${MAX_TIME_FOR_URL_API_SERVER}" \
# || { echo "check apiserver logs: ${APISERVER_LOG}" ; exit 1 ; }

# Create kubeconfigs for all components, using client certs
kube::util::write_client_kubeconfig "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "${ROOT_CA_FILE}" "${API_HOST}" "${API_SECURE_PORT}" admin
${CONTROLPLANE_SUDO} chown "${USER}" "${CERT_DIR}/client-admin.key" # make readable for kubectl
kube::util::write_client_kubeconfig "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "${ROOT_CA_FILE}" "${API_HOST}" "${API_SECURE_PORT}" controller
kube::util::write_client_kubeconfig "${CONTROLPLANE_SUDO}" "${CERT_DIR}" "${ROOT_CA_FILE}" "${API_HOST}" "${API_SECURE_PORT}" scheduler

if [[ -z "${AUTH_ARGS}" ]]; then
AUTH_ARGS="--client-key=${CERT_DIR}/client-admin.key --client-certificate=${CERT_DIR}/client-admin.crt"
fi

# Grant apiserver permission to speak to the kubelet
${KUBECTL} --kubeconfig "${CERT_DIR}/admin.kubeconfig" create clusterrolebinding kube-apiserver-kubelet-admin --clusterrole=system:kubelet-api-admin --user=kube-apiserver

${CONTROLPLANE_SUDO} cp "${CERT_DIR}/admin.kubeconfig" "${CERT_DIR}/admin-kube-aggregator.kubeconfig"
${CONTROLPLANE_SUDO} chown "$(whoami)" "${CERT_DIR}/admin-kube-aggregator.kubeconfig"
${KUBECTL} config set-cluster local-up-cluster --kubeconfig="${CERT_DIR}/admin-kube-aggregator.kubeconfig" --server="https://${API_HOST_IP}:31090"
echo "use 'kubectl --kubeconfig=${CERT_DIR}/admin-kube-aggregator.kubeconfig' to use the aggregated API server"

}

function start_controller_manager {
node_cidr_args=()
if [[ "${NET_PLUGIN}" == "kubenet" ]]; then
node_cidr_args=("--allocate-node-cidrs=true" "--cluster-cidr=10.1.0.0/16")
fi

cloud_config_arg=("--cloud-provider=${CLOUD_PROVIDER}" "--cloud-config=${CLOUD_CONFIG}")
if [[ "${EXTERNAL_CLOUD_PROVIDER:-}" == "true" ]]; then
cloud_config_arg=("--cloud-provider=external")
cloud_config_arg+=("--external-cloud-volume-plugin=${CLOUD_PROVIDER}")
cloud_config_arg+=("--cloud-config=${CLOUD_CONFIG}")
fi

CTLRMGR_LOG=${LOG_DIR}/kube-controller-manager.log

echo "go build controller-manager.go ${GO_OUT}/kube-controller-manager --v=${LOG_LEVEL} --vmodule=${LOG_SPEC} --service-account-private-key-file=${SERVICE_ACCOUNT_KEY} --root-ca-file=${ROOT_CA_FILE} --cluster-signing-cert-file=${CLUSTER_SIGNING_CERT_FILE} --cluster-signing-key-file=${CLUSTER_SIGNING_KEY_FILE} --enable-hostpath-provisioner=${ENABLE_HOSTPATH_PROVISIONER} ${node_cidr_args[@]+${node_cidr_args[@]}} --pvclaimbinder-sync-period=${CLAIM_BINDER_SYNC_PERIOD} --feature-gates=${FEATURE_GATES} ${cloud_config_arg[@]} --kubeconfig ${CERT_DIR}/controller.kubeconfig --use-service-account-credentials --controllers=${KUBE_CONTROLLERS} --leader-elect=false --cert-dir=${CERT_DIR} --master=https://${API_HOST}:${API_SECURE_PORT} >${CTLRMGR_LOG} 2>&1 &"
# CTLRMGR_PID=$!
}

function start_cloud_controller_manager {
if [ -z "${CLOUD_CONFIG}" ]; then
echo "CLOUD_CONFIG cannot be empty!"
exit 1
fi
if [ ! -f "${CLOUD_CONFIG}" ]; then
echo "Cloud config ${CLOUD_CONFIG} doesn't exist"
exit 1
fi

node_cidr_args=()
if [[ "${NET_PLUGIN}" == "kubenet" ]]; then
node_cidr_args=("--allocate-node-cidrs=true" "--cluster-cidr=10.1.0.0/16")
fi

CLOUD_CTLRMGR_LOG=${LOG_DIR}/cloud-controller-manager.log
${CONTROLPLANE_SUDO} "${EXTERNAL_CLOUD_PROVIDER_BINARY:-"${GO_OUT}/cloud-controller-manager"}" \
--v="${LOG_LEVEL}" \
--vmodule="${LOG_SPEC}" \
"${node_cidr_args[@]:-}" \
--feature-gates="${FEATURE_GATES}" \
--cloud-provider="${CLOUD_PROVIDER}" \
--cloud-config="${CLOUD_CONFIG}" \
--kubeconfig "${CERT_DIR}"/controller.kubeconfig \
--use-service-account-credentials \
--leader-elect=false \
--master="https://${API_HOST}:${API_SECURE_PORT}" >"${CLOUD_CTLRMGR_LOG}" 2>&1 &
export CLOUD_CTLRMGR_PID=$!
}

function wait_node_ready(){
# check the nodes information after kubelet daemon start
local nodes_stats="${KUBECTL} --kubeconfig '${CERT_DIR}/admin.kubeconfig' get nodes"
local node_name=$HOSTNAME_OVERRIDE
local system_node_wait_time=30
local interval_time=2
kube::util::wait_for_success "$system_node_wait_time" "$interval_time" "$nodes_stats | grep $node_name"
if [ $? == "1" ]; then
echo "time out on waiting $node_name info"
exit 1
fi
}

function start_kubelet {
KUBELET_LOG=${LOG_DIR}/kubelet.log
mkdir -p "${POD_MANIFEST_PATH}" &>/dev/null || sudo mkdir -p "${POD_MANIFEST_PATH}"

cloud_config_arg=("--cloud-provider=${CLOUD_PROVIDER}" "--cloud-config=${CLOUD_CONFIG}")
if [[ "${EXTERNAL_CLOUD_PROVIDER:-}" == "true" ]]; then
cloud_config_arg=("--cloud-provider=external")
cloud_config_arg+=("--provider-id=$(hostname)")
fi

mkdir -p "/var/lib/kubelet" &>/dev/null || sudo mkdir -p "/var/lib/kubelet"
# Enable dns
if [[ "${ENABLE_CLUSTER_DNS}" = true ]]; then
if [[ "${ENABLE_NODELOCAL_DNS:-}" == "true" ]]; then
dns_args=("--cluster-dns=${LOCAL_DNS_IP}" "--cluster-domain=${DNS_DOMAIN}")
else
dns_args=("--cluster-dns=${DNS_SERVER_IP}" "--cluster-domain=${DNS_DOMAIN}")
fi
else
# To start a private DNS server set ENABLE_CLUSTER_DNS and
# DNS_SERVER_IP/DOMAIN. This will at least provide a working
# DNS server for real world hostnames.
dns_args=("--cluster-dns=8.8.8.8")
fi
net_plugin_args=()
if [[ -n "${NET_PLUGIN}" ]]; then
net_plugin_args=("--network-plugin=${NET_PLUGIN}")
fi

auth_args=()
if [[ "${KUBELET_AUTHORIZATION_WEBHOOK:-}" != "false" ]]; then
auth_args+=("--authorization-mode=Webhook")
fi
if [[ "${KUBELET_AUTHENTICATION_WEBHOOK:-}" != "false" ]]; then
auth_args+=("--authentication-token-webhook")
fi
if [[ -n "${CLIENT_CA_FILE:-}" ]]; then
auth_args+=("--client-ca-file=${CLIENT_CA_FILE}")
else
auth_args+=("--client-ca-file=${CERT_DIR}/client-ca.crt")
fi

cni_conf_dir_args=()
if [[ -n "${CNI_CONF_DIR}" ]]; then
cni_conf_dir_args=("--cni-conf-dir=${CNI_CONF_DIR}")
fi

cni_bin_dir_args=()
if [[ -n "${CNI_BIN_DIR}" ]]; then
cni_bin_dir_args=("--cni-bin-dir=${CNI_BIN_DIR}")
fi

container_runtime_endpoint_args=()
if [[ -n "${CONTAINER_RUNTIME_ENDPOINT}" ]]; then
container_runtime_endpoint_args=("--container-runtime-endpoint=${CONTAINER_RUNTIME_ENDPOINT}")
fi

image_service_endpoint_args=()
if [[ -n "${IMAGE_SERVICE_ENDPOINT}" ]]; then
image_service_endpoint_args=("--image-service-endpoint=${IMAGE_SERVICE_ENDPOINT}")
fi

# shellcheck disable=SC2206
all_kubelet_flags=(
"--v=${LOG_LEVEL}"
"--vmodule=${LOG_SPEC}"
"--chaos-chance=${CHAOS_CHANCE}"
"--container-runtime=${CONTAINER_RUNTIME}"
"--hostname-override=${HOSTNAME_OVERRIDE}"
"${cloud_config_arg[@]}"
"--address=${KUBELET_HOST}"
--kubeconfig "${CERT_DIR}"/kubelet.kubeconfig
"--feature-gates=${FEATURE_GATES}"
"--cpu-cfs-quota=${CPU_CFS_QUOTA}"
"--enable-controller-attach-detach=${ENABLE_CONTROLLER_ATTACH_DETACH}"
"--cgroups-per-qos=${CGROUPS_PER_QOS}"
"--cgroup-driver=${CGROUP_DRIVER}"
"--cgroup-root=${CGROUP_ROOT}"
"--eviction-hard=${EVICTION_HARD}"
"--eviction-soft=${EVICTION_SOFT}"
"--eviction-pressure-transition-period=${EVICTION_PRESSURE_TRANSITION_PERIOD}"
"--pod-manifest-path=${POD_MANIFEST_PATH}"
"--fail-swap-on=${FAIL_SWAP_ON}"
${auth_args[@]+"${auth_args[@]}"}
${dns_args[@]+"${dns_args[@]}"}
${cni_conf_dir_args[@]+"${cni_conf_dir_args[@]}"}
${cni_bin_dir_args[@]+"${cni_bin_dir_args[@]}"}
${net_plugin_args[@]+"${net_plugin_args[@]}"}
${container_runtime_endpoint_args[@]+"${container_runtime_endpoint_args[@]}"}
${image_service_endpoint_args[@]+"${image_service_endpoint_args[@]}"}
"--runtime-request-timeout=${RUNTIME_REQUEST_TIMEOUT}"
"--port=${KUBELET_PORT}"
${KUBELET_FLAGS}
)

if [[ "${REUSE_CERTS}" != true ]]; then
generate_kubelet_certs
fi

# shellcheck disable=SC2024
}

function start_kubeproxy {
PROXY_LOG=${LOG_DIR}/kube-proxy.log

# wait for kubelet collect node information
echo "wait kubelet ready"
#wait_node_ready

cat <<EOF > /tmp/kube-proxy.yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
clientConnection:
kubeconfig: ${CERT_DIR}/kube-proxy.kubeconfig
hostnameOverride: ${HOSTNAME_OVERRIDE}
mode: ${KUBE_PROXY_MODE}
EOF
if [[ -n ${FEATURE_GATES} ]]; then
echo "featureGates:"
# Convert from foo=true,bar=false to
# foo: true
# bar: false
for gate in $(echo "${FEATURE_GATES}" | tr ',' ' '); do
echo "${gate}" | ${SED} -e 's/\(.*\)=\(.*\)/ \1: \2/'
done
fi >>/tmp/kube-proxy.yaml

if [[ "${REUSE_CERTS}" != true ]]; then
generate_kubeproxy_certs
fi

# shellcheck disable=SC2024
echo "go build proxy.go --v=${LOG_LEVEL} --config=/tmp/kube-proxy.yaml --master=https://${API_HOST}:${API_SECURE_PORT} >${PROXY_LOG} 2>&1 &"
}

function start_kubescheduler {

SCHEDULER_LOG=${LOG_DIR}/kube-scheduler.log
echo "go build scheduler.go --v=${LOG_LEVEL} --leader-elect=false --kubeconfig ${CERT_DIR}/scheduler.kubeconfig --feature-gates=${FEATURE_GATES} --master=https://${API_HOST}:${API_SECURE_PORT} >${SCHEDULER_LOG} 2>&1 &"
}

function start_kubedns {
if [[ "${ENABLE_CLUSTER_DNS}" = true ]]; then
cp "${KUBE_ROOT}/cluster/addons/dns/kube-dns/kube-dns.yaml.in" kube-dns.yaml
${SED} -i -e "s/{{ pillar\['dns_domain'\] }}/${DNS_DOMAIN}/g" kube-dns.yaml
${SED} -i -e "s/{{ pillar\['dns_server'\] }}/${DNS_SERVER_IP}/g" kube-dns.yaml
${SED} -i -e "s/{{ pillar\['dns_memory_limit'\] }}/${DNS_MEMORY_LIMIT}/g" kube-dns.yaml
# TODO update to dns role once we have one.
# use kubectl to create kubedns addon
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" --namespace=kube-system create -f kube-dns.yaml
echo "Kube-dns addon successfully deployed."
rm kube-dns.yaml
fi
}

function start_nodelocaldns {
cp "${KUBE_ROOT}/cluster/addons/dns/nodelocaldns/nodelocaldns.yaml" nodelocaldns.yaml
${SED} -i -e "s/__PILLAR__DNS__DOMAIN__/${DNS_DOMAIN}/g" nodelocaldns.yaml
${SED} -i -e "s/__PILLAR__DNS__SERVER__/${DNS_SERVER_IP}/g" nodelocaldns.yaml
${SED} -i -e "s/__PILLAR__LOCAL__DNS__/${LOCAL_DNS_IP}/g" nodelocaldns.yaml
# use kubectl to create nodelocaldns addon
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" --namespace=kube-system create -f nodelocaldns.yaml
echo "NodeLocalDNS addon successfully deployed."
rm nodelocaldns.yaml
}

function start_kubedashboard {
if [[ "${ENABLE_CLUSTER_DASHBOARD}" = true ]]; then
echo "Creating kubernetes-dashboard"
# use kubectl to create the dashboard
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" apply -f "${KUBE_ROOT}/cluster/addons/dashboard/dashboard-secret.yaml"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" apply -f "${KUBE_ROOT}/cluster/addons/dashboard/dashboard-configmap.yaml"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" apply -f "${KUBE_ROOT}/cluster/addons/dashboard/dashboard-rbac.yaml"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" apply -f "${KUBE_ROOT}/cluster/addons/dashboard/dashboard-deployment.yaml"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" apply -f "${KUBE_ROOT}/cluster/addons/dashboard/dashboard-service.yaml"
echo "kubernetes-dashboard deployment and service successfully deployed."
fi
}

function create_psp_policy {
echo "Create podsecuritypolicy policies for RBAC."
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" create -f "${KUBE_ROOT}/examples/podsecuritypolicy/rbac/policies.yaml"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" create -f "${KUBE_ROOT}/examples/podsecuritypolicy/rbac/roles.yaml"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" create -f "${KUBE_ROOT}/examples/podsecuritypolicy/rbac/bindings.yaml"
}

function create_storage_class {
if [ -z "${CLOUD_PROVIDER}" ]; then
CLASS_FILE=${KUBE_ROOT}/cluster/addons/storage-class/local/default.yaml
else
CLASS_FILE=${KUBE_ROOT}/cluster/addons/storage-class/${CLOUD_PROVIDER}/default.yaml
fi

if [ -e "${CLASS_FILE}" ]; then
echo "Create default storage class for ${CLOUD_PROVIDER}"
${KUBECTL} --kubeconfig="${CERT_DIR}/admin.kubeconfig" create -f "${CLASS_FILE}"
else
echo "No storage class available for ${CLOUD_PROVIDER}."
fi
}

function print_success {
if [[ "${START_MODE}" != "kubeletonly" ]]; then
if [[ "${ENABLE_DAEMON}" = false ]]; then
echo "Local Kubernetes cluster is running. Press Ctrl-C to shut it down."
else
echo "Local Kubernetes cluster is running."
fi
cat <<EOF

Logs:
${APISERVER_LOG:-}
${CTLRMGR_LOG:-}
${CLOUD_CTLRMGR_LOG:-}
${PROXY_LOG:-}
${SCHEDULER_LOG:-}
EOF
fi

if [[ "${START_MODE}" == "all" ]]; then
echo " ${KUBELET_LOG}"
elif [[ "${START_MODE}" == "nokubelet" ]]; then
echo
echo "No kubelet was started because you set START_MODE=nokubelet"
echo "Run this script again with START_MODE=kubeletonly to run a kubelet"
fi

if [[ "${START_MODE}" != "kubeletonly" ]]; then
echo
if [[ "${ENABLE_DAEMON}" = false ]]; then
echo "To start using your cluster, you can open up another terminal/tab and run:"
else
echo "To start using your cluster, run:"
fi
cat <<EOF

export KUBECONFIG=${CERT_DIR}/admin.kubeconfig
cluster/kubectl.sh

Alternatively, you can write to the default kubeconfig:

export KUBERNETES_PROVIDER=local

cluster/kubectl.sh config set-cluster local --server=https://${API_HOST}:${API_SECURE_PORT} --certificate-authority=${ROOT_CA_FILE}
cluster/kubectl.sh config set-credentials myself ${AUTH_ARGS}
cluster/kubectl.sh config set-context local --cluster=local --user=myself
cluster/kubectl.sh config use-context local
cluster/kubectl.sh
EOF
else
cat <<EOF
The kubelet was started.

Logs:
${KUBELET_LOG}
EOF
fi
}


if [ "${CONTAINER_RUNTIME}" == "docker" ] && ! kube::util::ensure_docker_daemon_connectivity; then
exit 1
fi

if [[ "${START_MODE}" != "kubeletonly" ]]; then
test_apiserver_off
fi

kube::util::test_openssl_installed
kube::util::ensure-cfssl

### IF the user didn't supply an output/ for the build... Then we detect.
if [ "${GO_OUT}" == "" ]; then
detect_binary
fi
echo "Detected host and ready to start services. Doing some housekeeping first..."
echo "Using GO_OUT ${GO_OUT}"
export KUBELET_CIDFILE=/tmp/kubelet.cid
if [[ "${ENABLE_DAEMON}" = false ]]; then
trap cleanup EXIT
fi

echo "Starting services now!"
if [[ "${START_MODE}" != "kubeletonly" ]]; then
start_etcd
set_service_accounts
start_apiserver
start_controller_manager
start_kubescheduler
#start_kubedashboard
fi

#if [[ "${START_MODE}" != "nokubelet" ]]; then
# ## TODO remove this check if/when kubelet is supported on darwin
# # Detect the OS name/arch and display appropriate error.
# case "$(uname -s)" in
# Darwin)
# print_color "kubelet is not currently supported in darwin, kubelet aborted."
# KUBELET_LOG=""
# ;;
# Linux)
# start_kubelet
# ;;
# *)
# print_color "Unsupported host OS. Must be Linux or Mac OS X, kubelet aborted."
# ;;
# esac
#fi

if [[ "${START_MODE}" != "kubeletonly" ]]; then
if [[ "${START_MODE}" != "nokubeproxy" ]]; then
start_kubeproxy
fi
fi



#print_success
#
#if [[ "${ENABLE_DAEMON}" = false ]]; then
# while true; do sleep 1; healthcheck; done
#fi
#
#if [[ "${KUBETEST_IN_DOCKER:-}" == "true" ]]; then
# cluster/kubectl.sh config set-cluster local --server=https://localhost:6443 --certificate-authority=/var/run/kubernetes/server-ca.crt
# cluster/kubectl.sh config set-credentials myself --client-key=/var/run/kubernetes/client-admin.key --client-certificate=/var/run/kubernetes/client-admin.crt
# cluster/kubectl.sh config set-context local --cluster=local --user=myself
# cluster/kubectl.sh config use-context local
#fi

Introduction to k8s

Posted on 2020-02-03 | Category: k8s

Introduction to k8s

k8s (Kubernetes) is a container cluster management system open-sourced by Google. Its main features include:

  1. Container-based application deployment, maintenance, and rolling upgrades
  2. Load balancing and service discovery
  3. Cluster scheduling across machines and regions
  4. Autoscaling
  5. Stateless and stateful services
  6. Broad Volume support
  7. A plugin mechanism for extensibility

Today k8s is evolving very quickly and has become the leader in container orchestration. It is made up of the following key components (a short end-to-end example follows the list):

  1. ETCD: stores the state of the entire Kubernetes cluster. Besides state storage it also provides event watching/subscription and leader election. "Event watching and subscription" means that the other components do not talk to each other by calling each other's APIs; instead, a component writes state into ETCD (like publishing a message), other components watch ETCD for changes to that state (like subscribing to the message), do their follow-up work, and then write the updated data back into ETCD. "Leader election" means that components such as the Scheduler, in order to be highly available, use ETCD to elect one Master out of several (usually 3) instances, with the rest on standby.
  2. API Server: as just described, ETCD is the core of the whole system and all inter-component communication goes through it. In practice the components do not access ETCD directly, but go through a proxy that wraps the ETCD calls behind a standard RESTful API and adds extra features such as authentication and caching. That proxy is the API Server.
  3. Controller Manager: implements task scheduling (see the earlier article on task scheduling). Simply put, anything you ask Kubernetes to run directly is a task, such as a Deployment, DaemonSet, or Job. Every such request sent to Kubernetes is handled by the Controller Manager, and each task type has its own controller: a Deployment is handled by the Deployment Controller, a DaemonSet by the DaemonSet Controller, and so on.
  4. Scheduler: handles resource scheduling. The Controller Manager writes the task's resource requirements, i.e. Pods, into ETCD; when the Scheduler sees new resources that need scheduling (new Pods), it assigns each Pod to a concrete node based on the state of the whole cluster.
  5. Kubelet: an agent that runs on every node. It watches the Pod information in ETCD, and when it finds Pods assigned to its node, it runs them there and writes their status back to ETCD.
  6. Kubectl: a command-line tool that calls the API Server to write state into ETCD or to query ETCD's state.
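To see this division of labor end to end, here is a minimal sketch (assuming a running cluster and a configured kubectl) that creates a Deployment and then looks at what each component produced:

```bash
# kubectl -> API Server -> ETCD: the Deployment object is validated and persisted
kubectl create deployment nginx --image=nginx

# Controller Manager: the Deployment controller creates a ReplicaSet, which creates the Pods
kubectl get replicaset,pods -l app=nginx

# Scheduler: each Pod is bound to a node (the NODE column changes from <none> to a node name)
kubectl get pods -l app=nginx -o wide

# Kubelet: starts the containers on its node and reports status back through the API Server
kubectl describe pods -l app=nginx
```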

The source tree is laid out as follows:

| Directory | Description |
| ----------- | ---------------------------------------- |
| api | Generated API documentation |
| build | Build scripts |
| cluster | Cluster bring-up scripts for the different IaaS clouds, e.g. Amazon AWS, Microsoft Azure, Google GCE |
| cmd | Entry-point code for every binary, e.g. apiserver/scheduler/kubelet |
| contrib | Project contributors |
| docs | Documentation: user docs, admin docs, designs, and proposals for new features |
| example | Usage examples |
| Godeps | Third-party Go packages the project depends on, e.g. the docker client SDK, rest, etc. |
| hack | Toolbox: the build, test, and verification scripts all live here |
| hooks | Scripts triggered before/after git commits |
| pkg | The main source directory; cmd holds only the entry points, the actual implementations live here |
| plugin | Plugins; k8s treats the scheduler as a plugin, so the scheduler code lives here |
| release | Presumably used by Google for cutting releases? |
| test | Test-related tools |
| third_party | Some third-party tools, presumably not hard dependencies? |
| www | The UI, which has already been moved to a separate project |

As you can see, the key implementation code all lives under the pkg directory. For a component with as broad a reach as the apiserver, the only effective way to read the code is probably to (a small navigation sketch follows this list):

  1. Walk through every directory under pkg to get a rough idea of what each one is for
  2. Start from the entry point under cmd and work from the surface inward to get a feel for the overall apiserver implementation
  3. Go feature by feature and study how each major feature is implemented, e.g. security, or the integration with etcd storage
  4. Along the way, read other people's code-reading notes; this saves a lot of time
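For example, a quick way to orient yourself from the shell (run at the root of the kubernetes repository; the grep pattern is just a placeholder for whatever symbol you happen to be tracing):

```bash
cat cmd/kube-apiserver/apiserver.go       # a tiny main() that only wires up and runs the app
ls cmd/kube-apiserver/app/                # flag/option parsing and server construction
ls pkg/                                   # the bulk of the implementation lives here
grep -rln "SomeTypeOrFunc" cmd/ pkg/ | head   # substitute the symbol of the feature you are tracing
```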

This series will cover the k8s components in five parts:

  1. Local debugging
  2. How the apiserver works
  3. How Informers work
  4. How kube-controller-manager works
  5. How kube-scheduler works

Zookeeper & Etcd: Summary of Key Differences

Posted on 2019-12-01 | Category: etcd

Zookeeper & Etcd: Summary of Key Differences

1. Leader Election

1.1 The Etcd Election Process

  1. Each node starts in the Follower role. After an ElectionTimeout (generated randomly at startup, 150ms~300ms by default) it switches to the Candidate state and sends a MsgVote message to the other nodes, carrying its own Term, its latest log Index, and the LogTerm of that latest entry;

  2. When the other nodes receive the vote request, they make a two-step decision:

    1. Pre-check:
      1. r.Vote == m.From (the node has already voted for the sender)
      2. the current node has not voted yet and knows of no leader;
      3. for a pre-vote (PreVote), the message's Term is greater than the current node's;
    2. If any one of these conditions holds, the formal check is entered:
      1. If the log Term in the message is greater than the current node's, the vote goes to the sender; if the log Terms are equal and the Index is at least as large as the current node's, the vote also goes to the sender: r.raftLog.isUpToDate(m.Index, m.LogTerm)
    3. If the checks above fail, the node replies with a rejection
  3. Once the candidate has received votes from more than half of the nodes, it switches its role with becomeLeader (raft.go/stepCandidate) and broadcasts an empty message to all other nodes.

    • Candidate –> Leader
    • Candidate –> Follower

    broadcast time << election timeout << mean time between failures

1.2 The Zookeeper Election Process

Heartbeats go both ways: the Leader starts an election if it stops receiving heartbeats from more than half of the Follower nodes, and a Follower starts an election if it stops receiving the Leader's heartbeat; while electing, a node switches its state to LOOKING.

  1. Voting node: first switches its state to LOOKING;

  2. Voting node: broadcasts its own vote as a (SID, ZXID) pair to the other nodes;

  3. Other nodes: receive the election votes from the other nodes;

  4. Other nodes: if the received election epoch is greater than the local election round, update the local round, compare the received vote with the local one, update the local vote accordingly, and broadcast it again; otherwise continue with step 5;

  5. Other nodes: decide whether the external vote beats the local one; if so, update the local proposal and broadcast it again. The comparison is:

    1. compare the epochs (the epoch of the latest log record, not the current election epoch);

    2. if the epochs are equal, compare the Zxid;

    3. if the Zxids are equal, compare the Sid;

      return ((newEpoch > curEpoch) ||
              ((newEpoch == curEpoch) &&
               ((newZxid > curZxid) || ((newZxid == curZxid) && (newId > curId)))));
  6. Voting node: collects and tallies the vote proposals; if some node has gathered more than half of the votes, that node is taken to be the Leader;

  7. Voting node: if the elected node is itself, it switches its server state to LEADING; otherwise it switches to FOLLOWING | OBSERVING.

  • LOOKING –> LEADING
  • LOOKING –> FOLLOWING

2. Data Synchronization

2.1 Zookeeper Data Synchronization


ETCD Overview

Posted on 2019-12-01 | Category: etcd

0. Introduction to ETCD

This article analyzes ETCD (v3.3.12) from the following angles.

  1. Overall architecture
  2. Startup process
  3. Data storage
  4. Communication
  5. How TTL is implemented
  6. How Lease is implemented
  7. The flow of a single transaction
  8. The linearizable read flow
  9. The Watch mechanism

Before going through all of these, let's first look at ETCD's overall architecture and the related terminology.

- Node: a Raft state machine node;
- Proxy: an etcd mode that provides a reverse proxy for an etcd cluster;
- Member: a node in an etcd cluster; it interacts with the other nodes and serves clients;
- Cluster: an etcd cluster made up of multiple Members;
- Peer: what a node calls the other nodes in the same cluster;
- Client: a requesting client;
- Candidate: a node campaigning to become leader
- Leader: the leader
- Follower: a follower
- Term: the election term, incremented by 1 after each election
- Index: the index number; Raft locates data by Term and Index.
- Vote: (the ID recorded for) an election vote
- Commit: a commit
- Propose: a proposal
- WAL: write-ahead log
- SoftState: soft state; volatile state data that does not need to be saved in the WAL, including the cluster leader and the node's current state
- HardState: hard state; the opposite of soft state, it must be written to persistent storage and includes the node's current Term, Vote, and Commit
- ReadStates: data used for read consistency, described in detail later
- Entries: log data that must be written to persistent storage before messages are sent to the other nodes in the cluster
- Snapshot: snapshot data that needs to be written to persistent storage
- CommittedEntries: data to be applied to the state machine; it has already been saved to persistent storage
- Messages: data to be sent out once the entries have been written to persistent storage

Message types used for communication between peers:

- MsgHup            // not sent between nodes; delivered to the local node to trigger an election
- MsgBeat // heartbeat tick
- MsgProp // the raft library user proposes (propose) data
- MsgApp // used by the leader to replicate data to the other nodes in the cluster
- MsgAppResp // replication response
- MsgVote // vote request
- MsgVoteResp // vote response
- MsgSnap // snapshot message used by the leader to sync data to a follower
- MsgHeartbeat // heartbeat message
- MsgHeartbeatResp // heartbeat response
- MsgTransferLeader // leadership transfer
- MsgReadIndex // used for linearizable reads
- MsgReadIndexResp
- MsgPreVote // pre-vote request
- MsgPreVoteResp

The overall Etcd architecture is shown below:

A brief walkthrough:

  1. etcd exposes HTTP and gRPC services to clients and to peer nodes; the watch mechanism, for example, is implemented on top of gRPC streaming (see the small example after this list);
  2. EtcdServer is etcd's top-level struct; it serves external requests and implements the application layer, e.g. operating the application-level store and managing the lessor and watches;
  3. raftNode bridges the upper layer and the raft layer: it passes application requests into raft for processing (via the Step function), saves messages to the WAL before they are sent to other nodes, and calls the transport to send them;
  4. raft is the implementation of the Raft protocol itself;
  5. raftLog stores the state machine information: MemoryStorage holds the stable records, while unstable holds the records that are not yet stable.
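For example, the stream-based watch can be observed directly with etcdctl, which keeps a single gRPC stream open to the server (a minimal sketch, assuming a local etcd reachable on the default client port and the v3 API):

```bash
# Terminal 1: open a watch; etcdctl holds a gRPC stream and prints every change to the key.
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 watch /demo/key

# Terminal 2: each put is committed through raft, applied to the store, and then pushed to
# the watcher above over the same stream.
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 put /demo/key v1
ETCDCTL_API=3 etcdctl --endpoints=127.0.0.1:2379 put /demo/key v2
```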

1. The Etcd Initialization Flow


ETCD Cluster Management (coming soon)

Posted on 2019-11-25 | Category: etcd

7. Cluster Management

7.1 Data Synchronization Within the Cluster

7.2 Changing Cluster Membership

7.2.1 Adding a Node

7.2.2 Removing a Node

7.3 Node Failures

7.3.1 Leader Failure

7.3.2 Follower Failure

7.4 Node Migration and Replacement
