Introduction
By default, Pods are scheduled fully automatically by the default scheduler (kube-scheduler). In practice this is often not enough, because you cannot know in advance which Node a Pod will land on. Kubernetes therefore provides several mechanisms for choosing which Node a Pod is scheduled to: nodeName (direct binding), NodeSelector (targeted scheduling), NodeAffinity (node affinity), PodAffinity (Pod affinity), and Taints & Tolerations.
NodeSelector uses a simple algorithm: it only schedules a Pod onto a Node that carries a specific label. If no Node satisfies the condition, the Pod will not run, even if other Nodes in the cluster are available. This limits its usefulness, and it has largely been superseded by NodeAffinity. NodeAffinity extends NodeSelector and makes scheduling more flexible: besides hard conditions that a Node must satisfy, you can also express soft preferences. The usage of NodeAffinity is shown below.
1. nodeName
Schedule a Pod to a specified node by hostname. When spec.nodeName is set, the Pod bypasses the scheduler entirely and is bound directly to that node.
[root@k8s-master 02]# vim manual-schedule.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod-manual-schedule
  name: pod-manual-schedule
  namespace: default
spec:
  nodeName: k8s-node2
  containers:
  - name: my-pod
    image: nginx
    imagePullPolicy: IfNotPresent
[root@k8s-master 02]# kubectl apply -f manual-schedule.yaml
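A quick check that the Pod landed on k8s-node2 (node names as in the example above); the NODE column of the wide output shows the placement:

[root@k8s-master 02]# kubectl get pod pod-manual-schedule -o wide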
2. nodeSelector
Kubernetes commonly uses labels to manage cluster resources, and nodeSelector uses labels to schedule a Pod onto a specified node.
Example: use nodeSelector to schedule a Pod onto the k8s-node2 node.
Step1: label k8s-node2
kubectl label nodes k8s-node2 test=nginx            # set the label
kubectl get nodes --show-labels | grep test=nginx   # verify the label was set
Step2: set the matching label under nodeSelector
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod-selector-schedule
  name: pod-selector-schedule
  namespace: default
spec:
  containers:
  - name: my-pod
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    test: nginx
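Apply the manifest and verify the placement. The file name is not given in the original; selector-schedule.yaml below is an assumption:

[root@k8s-master 02]# kubectl apply -f selector-schedule.yaml
[root@k8s-master 02]# kubectl get pod pod-selector-schedule -o wide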
3. nodeAffinity
As the name suggests, nodeAffinity means node affinity (and anti-affinity is its opposite). Node affinity controls whether a Pod is scheduled onto specific nodes. Compared with nodeSelector it is more flexible and supports simple logical combinations of match expressions.
preferredDuringSchedulingIgnoredDuringExecution    # soft rule: satisfied if possible
requiredDuringSchedulingIgnoredDuringExecution     # hard rule: must be satisfied
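The two rule types can also be combined: the hard rule filters candidate nodes first, then the soft rule ranks whatever remains. A minimal sketch of such an affinity block, reusing the test=nginx label from above (the ssd key is a hypothetical example, not from this tutorial):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:   # must match, or the Pod stays Pending
      nodeSelectorTerms:
      - matchExpressions:
        - key: test
          operator: In
          values:
          - nginx
    preferredDuringSchedulingIgnoredDuringExecution:  # preferred among the nodes that passed
    - weight: 50
      preference:
        matchExpressions:
        - key: ssd                                    # hypothetical label
          operator: Exists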
Scenario 1: the Pod must be scheduled onto a node with the label test=nginx (k8s-node2)
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: node-affinity
  name: node-affinity
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: node-affinity
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: test
            operator: In
            values:
            - nginx
Scenario 2: the Pod should preferably be scheduled onto a node with the label test=nginx (k8s-node2)
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: node-affinity
  name: node-affinity
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: node-affinity
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1                # 1-100; a higher weight means a stronger preference
        preference:
          matchExpressions:
          - key: test
            operator: In
            values:
            - nginx
Scenario 3: the Pod must not be scheduled onto k8s-node2
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: node-affinity
  name: node-affinity
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: node-affinity
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: NotIn
            values:
            - k8s-node2
The operator field in Kubernetes supports the following match conditions (an example follows the list):
- In: the label's value is in the given list
- NotIn: the label's value is not in the given list
- Gt: the label's value is greater than the given value
- Lt: the label's value is less than the given value
- Exists: the label exists
- DoesNotExist: the label does not exist
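For illustration, a sketch of a hard rule combining Exists and Gt; the disk-size key is a hypothetical label, not one set earlier in this tutorial, and Gt/Lt compare values as integers:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: test              # the node must carry the test label, any value
          operator: Exists
        - key: disk-size         # hypothetical label; Gt compares values as integers
          operator: Gt
          values:
          - "100"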
4. podAffinity
nodeSelector and nodeAffinity both control how Pods are scheduled onto nodes. In real deployments, however, we often want to schedule based on the relationships between services, that is, between Pods: for example, the frontend and backend of the same project should run on the same node. podAffinity in Kubernetes implements exactly this scenario. Like nodeAffinity, podAffinity supports two rule types:
requiredDuringSchedulingIgnoredDuringExecution
preferredDuringSchedulingIgnoredDuringExecution
Scenario: the newly created pod-affinity Pod should run on the same node as node-affinity
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod-affinity
  name: pod-affinity
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-affinity
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: run
            operator: In
            values:
            - "node-affinity"
        topologyKey: kubernetes.io/hostname
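topologyKey defines the scope of "together": with kubernetes.io/hostname the rule co-locates Pods on a single node, while a zone label would co-locate them at zone granularity. A quick check that both Pods ended up on the same node:

kubectl get pods -l 'run in (node-affinity,pod-affinity)' -o wide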
5. podAntiAffinity
Pod anti-affinity: the opposite of podAffinity, it keeps Pods that match the selector on different nodes.
Scenario: the newly created pod-anti-affinity Pod should run on a different node from pod-affinity
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod-anti-affinity
  name: pod-anti-affinity
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-anti-affinity
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: run
            operator: In
            values:
            - "pod-affinity"
        topologyKey: kubernetes.io/hostname
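Note that with a required anti-affinity rule the Pod stays Pending if every eligible node already runs a matching Pod. The spread can be checked with:

kubectl get pods -o wide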
6. Taints & tolerations
Taints: marks placed on a node that repel Pods.
Tolerations: a Pod's declaration that it can tolerate (be scheduled despite) matching taints.
In real projects you sometimes do not want certain services scheduled onto particular nodes. For example, nodes with GPUs should only run components that need a GPU, and services without GPU requirements should not land on them. For this scenario, simply set a taint on the GPU node.
Scenario 1: k8s-node1 has no GPU, k8s-node2 has a GPU; the Pod should be scheduled onto the GPU node
Step1: taint k8s-node1
[root@k8s-master 02]# kubectl taint node k8s-node1 gpu=no:NoSchedule
node/k8s-node1 tainted
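To confirm the taint took effect, check the node's Taints field:

[root@k8s-master 02]# kubectl describe node k8s-node1 | grep Taints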
Step2: when creating the Pod, give it a toleration for gpu=yes. Note that a toleration does not attract a Pod to a node; it only permits scheduling despite a matching taint. Here the Pod does not tolerate gpu=no, so it cannot land on k8s-node1 and ends up on k8s-node2.
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: pod-tolerations
  name: pod-tolerations
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: pod-tolerations
  tolerations:
  - key: gpu
    operator: Equal
    value: "yes"
    effect: NoSchedule
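Assuming the manifest is saved as pod-tolerations.yaml (the original does not name the file), apply it and confirm the Pod avoided the tainted node:

[root@k8s-master 02]# kubectl apply -f pod-tolerations.yaml
[root@k8s-master 02]# kubectl get pod pod-tolerations -o wide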
Step3: remove the taint and re-apply it with gpu=yes
[root@k8s-master 02]# kubectl taint node k8s-node1 gpu:NoSchedule-
node/k8s-node1 untainted
[root@k8s-master 02]# kubectl taint node k8s-node1 gpu=yes:NoSchedule
node/k8s-node1 tainted
How are Pods scheduled once taints are set?
If taints are set but you still want certain Pods to be schedulable on the node, add a toleration for that taint to the Pod, as shown below.
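For example, a toleration with operator: Exists matches any value of the gpu key, so this Pod would tolerate both gpu=no:NoSchedule and gpu=yes:NoSchedule. A minimal sketch:

tolerations:
- key: gpu
  operator: Exists       # matches the gpu taint regardless of its value
  effect: NoSchedule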