OpenID Connect and Kubernetes Pod Scheduling
OpenID Connect (OIDC) serves as an authentication layer for the Kubernetes API server: users authenticate through an identity provider such as Google, Azure AD, or Keycloak and receive permissions according to predefined RBAC policies, including permissions to manage pod scheduling.
Pod scheduling is the process by which the Kubernetes scheduler selects a suitable node to run a pod, considering resource requests, node selectors, affinity rules, taints/tolerations, and topology constraints. Combining OIDC with scheduling lets you control who can schedule pods onto which nodes.
Key use cases include: multi-tenant clusters where each team has dedicated nodes; compliance requirements that demand workload isolation; cost management that controls who may use GPU nodes; and security policies that restrict scheduling on sensitive nodes.
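The identity the API server sees arrives as an OIDC ID token (a JWT) whose claims carry the username and groups. As a minimal sketch of what is inside such a token, the snippet below decodes the claims segment of a hypothetical, hand-built token; real clients must verify the signature against the issuer's keys, which is omitted here.

```python
# Minimal sketch: reading the claims segment of an OIDC ID token (JWT).
# The token below is a hand-built, hypothetical example, not one issued
# by a real provider, and the signature is NOT verified.
import base64
import json

def decode_jwt_claims(token: str) -> dict:
    """Decode the payload (claims) segment of a JWT -- illustration only."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWT encoding strips
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a hypothetical token carrying the claims the article configures
claims = {"email": "user@example.com", "groups": ["devops"]}
payload = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
token = f"header.{payload}.signature"

print(decode_jwt_claims(token))
```

The `email` and `groups` claims here correspond to the `--oidc-username-claim` and `--oidc-groups-claim` API server flags configured below.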
Setting Up OIDC Authentication for Kubernetes
Configure OIDC with the Kubernetes API server
# === OIDC Authentication for Kubernetes ===
# 1. Configure API Server with OIDC
# Add flags to kube-apiserver:
cat > /etc/kubernetes/manifests/kube-apiserver.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # ... existing flags ...
    - --oidc-issuer-url=https://keycloak.example.com/realms/k8s
    - --oidc-client-id=kubernetes
    - --oidc-username-claim=email
    - --oidc-groups-claim=groups
    - --oidc-username-prefix=oidc:
    - --oidc-groups-prefix=oidc:
    - --oidc-ca-file=/etc/kubernetes/pki/oidc-ca.pem
EOF
# 2. Configure kubectl with OIDC
# Install kubelogin (OIDC helper)
# kubectl krew install oidc-login
# Configure kubeconfig
# (assumes the kubeconfig has no existing top-level users/contexts keys;
#  otherwise merge these entries manually)
cat >> ~/.kube/config << 'EOF'
users:
- name: oidc-user
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: kubectl
      args:
      - oidc-login
      - get-token
      - --oidc-issuer-url=https://keycloak.example.com/realms/k8s
      - --oidc-client-id=kubernetes
      - --oidc-client-secret=kubernetes-secret
contexts:
- name: oidc-context
  context:
    cluster: my-cluster
    user: oidc-user
    namespace: default
EOF
kubectl config use-context oidc-context
# 3. Keycloak Realm Setup for Kubernetes
# Create realm: k8s
# Create client: kubernetes
# - Access Type: confidential
# - Valid Redirect URIs: http://localhost:8000
# Create groups: devops, developers, data-science
# Map groups to token claims:
# Client Scopes > groups > Mappers > Add
# Name: groups
# Mapper Type: Group Membership
# Token Claim Name: groups
# 4. Test Authentication
kubectl auth whoami
# Output: oidc:user@example.com
# Groups: oidc:devops
echo "OIDC authentication configured"
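The prefixed identity shown by `kubectl auth whoami` comes from the `--oidc-username-prefix` and `--oidc-groups-prefix` flags. A simplified sketch of that mapping (the real API server logic has more edge cases, e.g. claim validation):

```python
# Sketch of how --oidc-username-prefix / --oidc-groups-prefix turn token
# claims into Kubernetes identities (simplified; assumption, not the
# actual apiserver source).
def map_oidc_identity(claims: dict, username_claim: str = "email",
                      groups_claim: str = "groups", prefix: str = "oidc:") -> dict:
    return {
        "username": prefix + claims[username_claim],
        "groups": [prefix + g for g in claims.get(groups_claim, [])],
    }

identity = map_oidc_identity({"email": "user@example.com", "groups": ["devops"]})
print(identity)
```

The prefix prevents an OIDC username or group from colliding with built-in identities such as `system:masters`.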
Pod Scheduling Strategies
Strategies for scheduling pods onto nodes
# === Pod Scheduling Strategies ===
# 1. Node Selector (Simple)
cat > scheduling/node-selector.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  nodeSelector:
    gpu: "true"
    team: data-science
  containers:
  - name: training
    image: tensorflow/tensorflow:latest-gpu
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
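nodeSelector semantics are simple exact matching: the pod fits a node only if every selector key/value pair appears in the node's labels. A minimal sketch with hypothetical node labels:

```python
# Sketch of nodeSelector semantics: every selector key/value must appear
# in the node's labels (exact match). Node labels below are hypothetical.
def matches_node_selector(node_labels: dict, node_selector: dict) -> bool:
    return all(node_labels.get(k) == v for k, v in node_selector.items())

selector = {"gpu": "true", "team": "data-science"}
gpu_node = {"gpu": "true", "team": "data-science", "zone": "ap-southeast-1a"}
cpu_node = {"team": "backend"}
print(matches_node_selector(gpu_node, selector))  # True
print(matches_node_selector(cpu_node, selector))  # False
```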
# 2. Node Affinity (Advanced)
cat > scheduling/node-affinity.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: team
                operator: In
                values: ["backend", "shared"]
              - key: environment
                operator: In
                values: ["production"]
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 80
            preference:
              matchExpressions:
              - key: zone
                operator: In
                values: ["ap-southeast-1a"]
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values: ["api-service"]
            topologyKey: kubernetes.io/hostname
      containers:
      - name: api
        image: api-service:v1
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 1Gi
EOF
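The `required` node affinity above evaluates as OR across `nodeSelectorTerms` and AND within a term's `matchExpressions`. A simplified sketch of that evaluation, covering only the In/NotIn operators used in this article:

```python
# Sketch of requiredDuringSchedulingIgnoredDuringExecution evaluation:
# a node qualifies if ANY nodeSelectorTerm matches, and a term matches
# only if ALL of its matchExpressions do. Only In/NotIn are modeled here;
# Exists, Gt, Lt etc. are omitted (simplifying assumption).
def expr_matches(labels: dict, expr: dict) -> bool:
    value = labels.get(expr["key"])
    if expr["operator"] == "In":
        return value in expr["values"]
    if expr["operator"] == "NotIn":
        return value not in expr["values"]
    raise ValueError(f"unsupported operator: {expr['operator']}")

def node_matches(labels: dict, node_selector_terms: list) -> bool:
    return any(
        all(expr_matches(labels, e) for e in term["matchExpressions"])
        for term in node_selector_terms
    )

# The same terms as the manifest above
terms = [{"matchExpressions": [
    {"key": "team", "operator": "In", "values": ["backend", "shared"]},
    {"key": "environment", "operator": "In", "values": ["production"]},
]}]
print(node_matches({"team": "backend", "environment": "production"}, terms))  # True
print(node_matches({"team": "backend", "environment": "staging"}, terms))    # False
```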
# 3. Taints and Tolerations
# Taint GPU nodes:
kubectl taint nodes gpu-node-01 gpu=true:NoSchedule
kubectl taint nodes gpu-node-02 gpu=true:NoSchedule
# Only GPU workloads can schedule on GPU nodes:
cat > scheduling/gpu-toleration.yaml << 'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  nodeSelector:
    gpu: "true"
  containers:
  - name: training
    image: pytorch/pytorch:latest
    resources:
      limits:
        nvidia.com/gpu: 2
EOF
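A pod may land on a tainted node only if every NoSchedule taint is tolerated. A simplified sketch of that matching (the full rules also cover NoExecute timeouts and other edge cases, which are omitted here):

```python
# Sketch of taint/toleration matching (simplified assumption): a pod may
# schedule onto a node only if every NoSchedule taint is tolerated.
def tolerates(tol: dict, taint: dict) -> bool:
    if tol.get("effect") and tol["effect"] != taint["effect"]:
        return False
    if tol.get("operator", "Equal") == "Exists":
        # Exists with an empty key tolerates every taint
        return not tol.get("key") or tol["key"] == taint["key"]
    return tol.get("key") == taint["key"] and tol.get("value") == taint["value"]

def can_schedule(taints: list, tolerations: list) -> bool:
    return all(
        any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints if taint["effect"] == "NoSchedule"
    )

# The taint and toleration from the commands/manifest above
gpu_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
ml_tolerations = [{"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}]
print(can_schedule([gpu_taint], ml_tolerations))  # True
print(can_schedule([gpu_taint], []))              # False
```

Note that a toleration only permits scheduling; the `nodeSelector` in the manifest is what actually steers the pod onto the GPU nodes.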
# 4. Topology Spread Constraints
cat > scheduling/topology-spread.yaml << 'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 6
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: web-app
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: web-app
      containers:
      - name: web
        image: web-app:v1
EOF
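Skew is the difference between the topology domain with the most matching pods and the one with the fewest. A sketch of the maxSkew check against hypothetical existing placements (simplified: the real scheduler only counts domains with eligible nodes):

```python
# Sketch of the maxSkew computation behind topologySpreadConstraints.
# Zones and existing pod placements below are hypothetical.
from collections import Counter

def skew_after_placement(pod_zones: list, candidate_zone: str, all_zones: list) -> int:
    counts = Counter({z: 0 for z in all_zones})
    counts.update(pod_zones)       # count existing matching pods per zone
    counts[candidate_zone] += 1    # tentatively place the new pod
    return max(counts.values()) - min(counts.values())

zones = ["ap-southeast-1a", "ap-southeast-1b", "ap-southeast-1c"]
existing = ["ap-southeast-1a", "ap-southeast-1a", "ap-southeast-1b"]
# Placing in 1c keeps skew within maxSkew=1; placing in 1a violates it
print(skew_after_placement(existing, "ap-southeast-1c", zones))  # 1
print(skew_after_placement(existing, "ap-southeast-1a", zones))  # 3
```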
echo "Scheduling strategies configured"
RBAC and OIDC Integration
Configure RBAC based on OIDC groups
#!/usr/bin/env python3
# rbac_oidc.py — RBAC with OIDC Integration
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("rbac")


class RBACManager:
    def __init__(self):
        self.roles = {}
        self.bindings = {}

    def generate_rbac_manifests(self):
        """Generate RBAC manifests for OIDC groups"""
        return {
            "devops_role": {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "ClusterRole",
                "metadata": {"name": "devops-scheduling"},
                "rules": [
                    {"apiGroups": [""], "resources": ["pods", "nodes"], "verbs": ["get", "list", "watch", "create", "update", "delete"]},
                    {"apiGroups": ["apps"], "resources": ["deployments", "statefulsets"], "verbs": ["*"]},
                    {"apiGroups": [""], "resources": ["namespaces"], "verbs": ["get", "list"]},
                ],
            },
            "devops_binding": {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "ClusterRoleBinding",
                "metadata": {"name": "devops-scheduling-binding"},
                "subjects": [{"kind": "Group", "name": "oidc:devops", "apiGroup": "rbac.authorization.k8s.io"}],
                "roleRef": {"kind": "ClusterRole", "name": "devops-scheduling", "apiGroup": "rbac.authorization.k8s.io"},
            },
            "developer_role": {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "Role",
                "metadata": {"name": "developer-pods", "namespace": "dev"},
                "rules": [
                    {"apiGroups": [""], "resources": ["pods", "pods/log"], "verbs": ["get", "list", "watch", "create", "delete"]},
                    {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": ["get", "list", "create", "update"]},
                ],
            },
            "developer_binding": {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "RoleBinding",
                "metadata": {"name": "developer-pods-binding", "namespace": "dev"},
                "subjects": [{"kind": "Group", "name": "oidc:developers", "apiGroup": "rbac.authorization.k8s.io"}],
                "roleRef": {"kind": "Role", "name": "developer-pods", "apiGroup": "rbac.authorization.k8s.io"},
            },
            "datascience_role": {
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "Role",
                "metadata": {"name": "datascience-gpu", "namespace": "ml"},
                "rules": [
                    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "create", "delete"]},
                    {"apiGroups": ["batch"], "resources": ["jobs"], "verbs": ["*"]},
                ],
            },
        }

    def audit_permissions(self, user, groups):
        """Audit what a user can do"""
        permissions = {
            "user": user,
            "groups": groups,
            "can_schedule_gpu": "oidc:data-science" in groups or "oidc:devops" in groups,
            "can_schedule_production": "oidc:devops" in groups,
            "can_view_all_namespaces": "oidc:devops" in groups,
            "namespaces_allowed": [],
        }
        if "oidc:devops" in groups:
            permissions["namespaces_allowed"] = ["*"]
        elif "oidc:developers" in groups:
            permissions["namespaces_allowed"] = ["dev", "staging"]
        elif "oidc:data-science" in groups:
            permissions["namespaces_allowed"] = ["ml", "notebooks"]
        return permissions


manager = RBACManager()
manifests = manager.generate_rbac_manifests()
print("DevOps Role:", json.dumps(manifests["devops_role"]["rules"], indent=2))
audit = manager.audit_permissions("user@example.com", ["oidc:developers"])
print("\nAudit:", json.dumps(audit, indent=2))
Advanced Scheduling Techniques
Advanced techniques for pod scheduling
# === Advanced Scheduling ===
# 1. Priority Classes
cat > scheduling/priority.yaml << 'EOF'
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-production
value: 1000000
globalDefault: false
description: "Critical production workloads"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: batch-processing
value: 100
globalDefault: false
preemptionPolicy: Never
description: "Batch jobs that can be preempted"
EOF
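When the cluster is full, a pending high-priority pod may evict lower-priority pods; `preemptionPolicy: Never` opts a class out of preempting others (it can still be preempted itself). A heavily simplified sketch of victim selection, with hypothetical pod names (the real scheduler also weighs PDBs, affinity, and node fit):

```python
# Simplified sketch of preemption victim selection (assumption: real
# scheduler logic is far more involved). Pod names are hypothetical.
def preemption_candidates(pending_priority: int, running_pods: list) -> list:
    """Running pods with lower priority than the pending pod, lowest first."""
    victims = [p for p in running_pods if p["priority"] < pending_priority]
    return sorted(victims, key=lambda p: p["priority"])

running = [
    {"name": "batch-job", "priority": 100},
    {"name": "api-server", "priority": 1000000},
]
print(preemption_candidates(1000000, running))  # only batch-job is evictable
print(preemption_candidates(100, running))      # a batch pod can evict nothing
```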
# 2. Resource Quotas per Namespace (team)
cat > scheduling/quota.yaml << 'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-team-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "50"
    requests.nvidia.com/gpu: "0"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ml-team-quota
  namespace: ml
spec:
  hard:
    requests.cpu: "40"
    requests.memory: 80Gi
    limits.cpu: "80"
    limits.memory: 160Gi
    pods: "30"
    requests.nvidia.com/gpu: "8"
EOF
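A ResourceQuota is enforced at admission: the namespace's current usage plus the new pod's requests must stay within the `hard` limits. A minimal sketch of that check, using the ml quota above with a hypothetical usage snapshot (resources expressed as plain numbers for simplicity):

```python
# Sketch of a ResourceQuota admission check. Usage numbers are
# hypothetical; real quantities carry units (cores, Gi) -- plain numbers
# are a simplifying assumption here.
def within_quota(used: dict, requested: dict, hard: dict) -> bool:
    return all(used.get(k, 0) + v <= hard[k] for k, v in requested.items())

ml_hard = {"requests.cpu": 40, "requests.nvidia.com/gpu": 8}
ml_used = {"requests.cpu": 36, "requests.nvidia.com/gpu": 6}
print(within_quota(ml_used, {"requests.cpu": 2, "requests.nvidia.com/gpu": 2}, ml_hard))  # True
print(within_quota(ml_used, {"requests.cpu": 2, "requests.nvidia.com/gpu": 4}, ml_hard))  # False
```

Note the dev quota sets `requests.nvidia.com/gpu: "0"`, which is an effective way to forbid GPU pods in that namespace entirely.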
# 3. Limit Ranges
cat > scheduling/limits.yaml << 'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev
spec:
  limits:
  - default:
      cpu: 500m
      memory: 512Mi
    defaultRequest:
      cpu: 100m
      memory: 128Mi
    max:
      cpu: 2000m
      memory: 4Gi
    min:
      cpu: 50m
      memory: 64Mi
    type: Container
EOF
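A LimitRange both fills in defaults for containers that declare nothing and rejects values outside min/max. A sketch mirroring the manifest above, with quantities simplified to millicores and Mi as plain numbers (an assumption for readability):

```python
# Sketch of LimitRange defaulting and validation for one container spec.
# CPU is in millicores and memory in Mi, as plain numbers (assumption).
def apply_limit_range(container: dict, lr: dict) -> dict:
    # Explicit container values win over the LimitRange defaults
    requests = {**lr["defaultRequest"], **container.get("requests", {})}
    limits = {**lr["default"], **container.get("limits", {})}
    for res, lim in limits.items():
        if not lr["min"][res] <= lim <= lr["max"][res]:
            raise ValueError(f"{res} limit {lim} outside allowed range")
    return {"requests": requests, "limits": limits}

limit_range = {
    "default": {"cpu": 500, "memory": 512},
    "defaultRequest": {"cpu": 100, "memory": 128},
    "max": {"cpu": 2000, "memory": 4096},
    "min": {"cpu": 50, "memory": 64},
}
# A container that declares nothing receives the defaults
print(apply_limit_range({}, limit_range))
```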
# 4. Pod Disruption Budget
cat > scheduling/pdb.yaml << 'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api-service
EOF
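The PDB math is simple: voluntary disruptions (node drains, evictions) are allowed only while healthy pods stay at or above `minAvailable`. A minimal sketch:

```python
# Sketch of PodDisruptionBudget arithmetic for a minAvailable budget.
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    return max(0, healthy_pods - min_available)

print(allowed_disruptions(3, 2))  # 1: one pod may be drained
print(allowed_disruptions(2, 2))  # 0: drains are blocked until a pod recovers
```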
kubectl apply -f scheduling/
echo "Advanced scheduling configured"
Monitoring and Troubleshooting
Monitor scheduling decisions
#!/usr/bin/env python3
# scheduling_monitor.py — Pod Scheduling Monitor
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("schedule")


class SchedulingMonitor:
    def __init__(self):
        self.events = []

    def cluster_status(self):
        return {
            "nodes": {
                "total": 10,
                "ready": 10,
                "by_team": {"backend": 4, "frontend": 2, "ml": 2, "shared": 2},
                "by_zone": {"ap-southeast-1a": 4, "ap-southeast-1b": 3, "ap-southeast-1c": 3},
            },
            "pods": {
                "running": 145,
                "pending": 3,
                "failed": 0,
                "by_namespace": {"default": 20, "dev": 45, "staging": 30, "ml": 25, "monitoring": 25},
            },
            "resources": {
                "cpu_requests_pct": 65,
                "memory_requests_pct": 72,
                "gpu_allocated": "6/8",
            },
        }

    def scheduling_issues(self):
        return {
            "pending_pods": [
                {
                    "name": "ml-training-xyz",
                    "namespace": "ml",
                    "reason": "Insufficient nvidia.com/gpu",
                    "message": "0/10 nodes available: 8 insufficient gpu, 2 node tainted",
                    "pending_since": "5 minutes",
                    "fix": "Wait for GPU to free up or add GPU nodes",
                },
                {
                    "name": "api-deploy-abc",
                    "namespace": "dev",
                    "reason": "NodeAffinity",
                    "message": "0/10 nodes match node affinity rules",
                    "pending_since": "2 minutes",
                    "fix": "Check node labels match affinity requirements",
                },
            ],
            "common_fixes": {
                "insufficient_cpu": "Increase node pool or reduce resource requests",
                "insufficient_memory": "Increase node pool or optimize memory usage",
                "node_affinity": "Verify node labels: kubectl get nodes --show-labels",
                "taint_no_schedule": "Add toleration to pod or remove taint from node",
                "quota_exceeded": "Request quota increase or reduce usage",
            },
        }


monitor = SchedulingMonitor()
status = monitor.cluster_status()
print("Cluster:", json.dumps(status["nodes"], indent=2))
issues = monitor.scheduling_issues()
print("\nPending:", json.dumps(issues["pending_pods"][0], indent=2))
FAQ: Frequently Asked Questions
Q: Why use OIDC with Kubernetes?
A: Kubernetes supports several authentication methods, but OIDC is the best fit for multi-user environments: it reuses an existing identity provider (Google, Azure AD, Keycloak) instead of building new user management; short-lived tokens are safer than static kubeconfig credentials; group-based access control maps OIDC groups onto Kubernetes RBAC; an audit trail records who did what and when; and SSO gives a single login across every cluster. The alternatives have drawbacks: client certificates are hard to revoke, and service account tokens are not intended for human users.
Q: A pod is stuck Pending because of scheduling — what should I do?
A: Check the events first: kubectl describe pod pod-name and read the Events section, which states the cause. Common causes and fixes: insufficient cpu/memory (add nodes or reduce requests); node affinity not matching (verify the labels on your nodes); a taint without a matching toleration (add a toleration or remove the taint); a PVC not available (check the storage class); quota exceeded (request a quota increase). For quick debugging, kubectl get events --sort-by='.lastTimestamp' shows the most recent events.
Q: How should nodes be divided in a multi-tenant cluster?
A: There are several approaches. Namespace isolation: one namespace per team, with ResourceQuota capping resources. Node pools: separate node groups per team, with taints plus nodeSelector enforcing placement. Mixed approach: shared nodes for general workloads plus dedicated nodes for sensitive or GPU workloads. The mixed approach is recommended because it gives the best utilization: use taints for specialized nodes (GPU, high-memory), namespace quotas for general resources, and RBAC based on OIDC groups to restrict access.
Q: When should Topology Spread Constraints be used?
A: Use them when you need to distribute pods across failure domains: spreading pods across availability zones for HA, across nodes so no single node gets overloaded, or across racks on-premises. Set maxSkew=1 for an even distribution; use DoNotSchedule for a strict requirement and ScheduleAnyway for best effort; combine with podAntiAffinity to spread across nodes.
