Taint: Extended Technical Detail
What is a Taint in Simple Terms?
A taint is like putting a "No Entry" sign on a node. By default, no pod will be scheduled on a tainted node. Only pods that explicitly say "I am allowed to go here" (via a Toleration) will be scheduled there.

    +------------------------------------------+
    | Without Taint                            |
    |                                          |
    | gpu-node-1 <- any pod can land here      |  <- Regular API pods, batch jobs,
    | gpu-node-2 <- any pod can land here      |     and GPU workloads all compete
    |                                          |     for expensive GPU nodes
    +------------------------------------------+

    +------------------------------------------+
    | With Taint: workload=gpu:NoSchedule      |
    |                                          |
    | gpu-node-1 [TAINTED]                     |  <- Only pods with matching
    | gpu-node-2 [TAINTED]                     |     Toleration can land here.
    |                                          |     API pods are repelled.
    +------------------------------------------+

Real Use Case
At Hotstar, GPU nodes for video transcoding are expensive (10x the cost of regular nodes). Tainting them ensures that only transcoding pods run there, so regular API pods cannot accidentally consume GPU capacity during a traffic spike.
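
When a whole pool of GPU nodes needs the same taint, a label selector avoids tainting them one by one. The sketch below is a hypothetical helper, not part of the original setup: it assumes the GPU nodes already carry a `workload=gpu` node label, and the `KUBECTL` override exists only so the command can be previewed without touching a cluster.

```shell
# Taint every node that matches a label selector in a single call.
# Set KUBECTL=echo to print the command instead of running it.
taint_nodes_by_label() {
  selector="$1"   # node label selector, e.g. workload=gpu (assumed label)
  taint="$2"      # taint in key=value:effect form
  # --overwrite makes the call repeatable on nodes that already have the taint
  ${KUBECTL:-kubectl} taint nodes -l "$selector" "$taint" --overwrite
}
```

Running `KUBECTL=echo taint_nodes_by_label workload=gpu workload=gpu:NoSchedule` prints the `kubectl` arguments that would be used, which is a cheap way to check the selector before applying it.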
Taint Effects: The Three Modes

    +------------------------+ +------------------------+ +------------------------+
    | NoSchedule             | | PreferNoSchedule       | | NoExecute              |
    |                        | |                        | |                        |
    | New pods without       | | Kubernetes will TRY    | | New pods repelled AND  |
    | Toleration will NOT    | | to avoid this node     | | existing pods without  |
    | be scheduled here      | | but may use it if no   | | Toleration are EVICTED |
    |                        | | other node is free     | |                        |
    | Existing pods: safe    | | Existing pods: safe    | | Use for: maintenance   |
    | Use for: isolation     | | Use for: soft hints    | | and node draining      |
    +------------------------+ +------------------------+ +------------------------+

How to Add a Taint to a Node

    # Taint a node: only GPU workloads allowed
    kubectl taint nodes gpu-node-1 workload=gpu:NoSchedule
    # Taint format: key=value:effect

    # Taint for node maintenance: evicts non-tolerating pods immediately
    kubectl taint nodes mumbai-worker-3 maintenance=patching:NoExecute

    # Taint without a value (key-only taint)
    kubectl taint nodes gpu-node-1 dedicated:NoSchedule

    # Verify the taint was applied
    kubectl describe node gpu-node-1 | grep -A 5 Taints
    # Taints: workload=gpu:NoSchedule

How to Remove a Taint

    # Append a minus sign to remove the taint (the key and effect must match the existing taint)
    kubectl taint nodes gpu-node-1 workload=gpu:NoSchedule-

    # Remove a key-only taint
    kubectl taint nodes gpu-node-1 dedicated:NoSchedule-

    # Verify taint is gone
    kubectl describe node gpu-node-1 | grep Taints
    # Taints: <none>

The Matching Toleration in Pod Spec
A taint only repels pods. For the intended workload to run on the tainted node, its pod spec needs a matching Toleration:

    # deployment.yaml: GPU transcoding pod that tolerates the taint
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: video-transcoder
      namespace: streaming-prod
    spec:
      selector:
        matchLabels:
          app: video-transcoder      # assumed label; a Deployment requires a selector
      template:
        metadata:
          labels:
            app: video-transcoder    # must match the selector above
        spec:
          tolerations:
            - key: "workload"
              operator: "Equal"
              value: "gpu"
              effect: "NoSchedule"   # Must match the taint effect exactly
          nodeSelector:
            workload: gpu            # REQUIRES a node label workload=gpu (labels are separate from taints)
          containers:
            - name: transcoder
              image: registry.hotstar.in/transcoder:v4.2.1
              resources:
                limits:
                  nvidia.com/gpu: "1"   # Request 1 GPU

How to View Taints on All Nodes

    # Custom column output: quick overview
    kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

    # Full taint details including key, value, effect
    kubectl get nodes -o json | \
      jq '.items[] | {name: .metadata.name, taints: .spec.taints}'

    # Check a specific node
    kubectl describe node mumbai-worker-3 | grep -A 10 Taints

Taint + Toleration + NodeAffinity: The Complete Pattern
For hard node dedication (only GPU pods on GPU nodes, and GPU pods ONLY on GPU nodes), combine all three:

    spec:
      tolerations:
        - key: "workload"
          operator: "Equal"
          value: "gpu"
          effect: "NoSchedule"
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: workload
                    operator: In
                    values: ["gpu"]

    +------------------------------------------+
    | Taint on node                            |  <- Repels non-GPU pods
    +------------------------------------------+
    | Toleration on pod                        |  <- Allows GPU pod past the repel
    +------------------------------------------+
    | NodeAffinity on pod                      |  <- Ensures GPU pod goes ONLY to
    |                                          |     GPU nodes (not just tolerates)
    +------------------------------------------+

Troubleshooting Common Taint Problems

| Problem | Symptom | Fix |
|---|---|---|
| Pod stuck in Pending with taint | `0/5 nodes are available: 5 node(s) had taint` | Pod is missing a Toleration: add a matching `tolerations` block to the pod spec |
| Toleration set but pod still Pending | Pod has Toleration but won't schedule | Toleration effect doesn't match taint effect: they must be identical |
| Non-GPU pods landing on GPU nodes | Cost spike on GPU instance billing | Taint is set but pods have a catch-all Toleration (`operator: Exists`): scope the Toleration to a specific key and value |
| NoExecute evicting critical pods | Running pods evicted unexpectedly | Wrong node tainted: verify the node names with `kubectl get nodes` before applying NoExecute |
| Taint not removed after maintenance | Node still repelling pods after patch | Forgot the trailing `-` in the `kubectl taint` remove command: re-run with the `-` suffix |
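
For the first two rows of the table, it usually helps to put the scheduler's complaint next to the tolerations the pod actually has. The helper below is a rough sketch under assumptions: the function name and the pod name in the usage note are hypothetical, and `KUBECTL` can be set to `echo` to preview the commands.

```shell
# Show why a pod is Pending and which tolerations it declares.
# Set KUBECTL=echo to print the commands instead of running them.
pending_pod_report() {
  pod="$1"              # pod name
  ns="${2:-default}"    # namespace, defaults to "default"
  # Scheduler events for this pod (look for "had taint" in the message)
  ${KUBECTL:-kubectl} get events -n "$ns" \
    --field-selector "involvedObject.name=$pod,reason=FailedScheduling"
  # Tolerations as the API server sees them
  ${KUBECTL:-kubectl} get pod "$pod" -n "$ns" -o "jsonpath={.spec.tolerations}"
}
```

A call such as `pending_pod_report video-transcoder-abc streaming-prod` can then be compared against the node taints to spot a mismatched key, value, or effect.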

Remember: Taints and Tolerations work together. A taint repels every pod that lacks a matching Toleration, including the pods you meant to run there. If such a pod is also pinned to the tainted nodes (through a nodeSelector or NodeAffinity) and the Toleration is missing, it stays in `Pending` indefinitely, and `kubectl describe pod` shows `node(s) had taint` in the Events section.
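
Tip: Use the `NoExecute` taint effect during node maintenance. It evicts all non-tolerating pods automatically, so you can clear the node for patching without running `kubectl drain` manually. Unlike `kubectl drain`, taint-based eviction does not honor PodDisruptionBudgets, so prefer `drain` where those matter. System DaemonSets (kube-proxy, CNI) usually ship with broad Tolerations and stay in place, but a custom maintenance taint is not a system taint, so confirm their Tolerations before relying on this.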
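
The maintenance flow from the tip can be wrapped so that the add and remove commands always use the same taint string, which guards against the forgotten trailing `-`. This is only a sketch: the function names are hypothetical, the taint `maintenance=patching:NoExecute` is the one used earlier in this guide, and `KUBECTL=echo` previews the commands.

```shell
# Taint used for maintenance windows (same string for add and remove).
MAINT_TAINT="maintenance=patching:NoExecute"

# Start maintenance: evicts pods that do not tolerate the taint.
maintenance_start() {
  ${KUBECTL:-kubectl} taint nodes "$1" "$MAINT_TAINT"
}

# List what is still running on the node (typically DaemonSet pods).
pods_on_node() {
  ${KUBECTL:-kubectl} get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=$1"
}

# End maintenance: the trailing minus removes the taint.
maintenance_end() {
  ${KUBECTL:-kubectl} taint nodes "$1" "${MAINT_TAINT}-"
}
```

A typical sequence would be `maintenance_start mumbai-worker-3`, then `pods_on_node mumbai-worker-3` to confirm the node is clear, then `maintenance_end mumbai-worker-3` once patching is done.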

Security: On PCI-DSS workloads like Razorpay's card processing pods, use NoSchedule taints on dedicated nodes so that no other workload can co-reside on those nodes. Co-tenancy on the same node opens the door to side-channel attacks via shared CPU caches, which makes taint-based isolation a critical compliance control.

Common Mistake: Using `operator: Exists` without a `key` in a Toleration. This creates a wildcard Toleration that matches ALL taints on ALL nodes, effectively defeating every taint in the cluster for that pod. Always scope Tolerations to a specific `key`.
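
To see why the keyless `Exists` Toleration is so broad, the matching rules can be modeled in a few lines of shell. This is a simplified illustration of the `Equal` and `Exists` operators only, not the scheduler's actual code; the function name and argument order are made up for the example, and empty strings stand for fields left unset in the pod spec.

```shell
# tolerates TAINT_KEY TAINT_VALUE TAINT_EFFECT TOL_KEY TOL_OPERATOR TOL_VALUE TOL_EFFECT
# Returns 0 when the toleration matches the taint, 1 otherwise.
tolerates() {
  taint_key="$1"; taint_value="$2"; taint_effect="$3"
  tol_key="$4"; tol_op="${5:-Equal}"; tol_value="$6"; tol_effect="$7"

  # An unset toleration effect matches every taint effect.
  if [ -n "$tol_effect" ] && [ "$tol_effect" != "$taint_effect" ]; then
    return 1
  fi

  # An unset key is only meaningful with Exists and matches every key.
  if [ -z "$tol_key" ]; then
    [ "$tol_op" = "Exists" ]
    return
  fi

  # Otherwise the keys must be identical.
  [ "$tol_key" = "$taint_key" ] || return 1

  case "$tol_op" in
    Exists) return 0 ;;                          # value is ignored
    Equal)  [ "$tol_value" = "$taint_value" ] ;; # value must match
    *)      return 1 ;;                          # operators not modeled here
  esac
}
```

With this model, `tolerates maintenance patching NoExecute "" Exists "" ""` succeeds even though the Toleration never mentions the `maintenance` key, while a scoped Toleration such as `workload Equal gpu NoSchedule` matches only the `workload=gpu:NoSchedule` taint.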