What Is Kubernetes Resource Management? A Guide to CPU and Memory Requests and Limits (2026)

In Kubernetes, managing resources (CPU and memory) is one of the most important tasks, yet it is often overlooked. Incorrect Requests/Limits can leave Pods OOMKilled or CPU-throttled, cause scheduling failures, or inflate your cloud bill unnecessarily.

Why Does Resource Management Matter?

Scheduling: Kubernetes uses Requests to decide which Node a Pod runs on. Without Requests, the Scheduler has no information about how much the Pod needs.
Stability: If a container uses more Memory than its Limit, it is OOMKilled and restarted, which can take the Service down.
Cost: Requests set too high = more resources reserved than needed = cloud money wasted.
Fairness: If one Pod consumes all of a Node's resources, the other Pods on that Node cannot do their work.

Requests vs Limits

Property	Requests	Limits
Meaning	The minimum amount of resources the Pod needs (reserved for it)	The maximum amount of resources the Pod may use
Scheduling	Used to decide which Node the Pod runs on	Not used for scheduling
If exceeded	The Pod can use more than its Requests (if the Node has spare resources)	CPU: throttled, Memory: OOMKilled
Recommendation	Always set!	Always set a Memory Limit; a CPU Limit depends on the case

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:v1
    resources:
      requests:          # minimum needed (used for scheduling)
        cpu: "250m"      # 250 millicores = 0.25 CPU core
        memory: "256Mi"  # 256 Mebibytes
      limits:            # maximum allowed
        cpu: "500m"      # 500 millicores = 0.5 CPU core
        memory: "512Mi"  # 512 Mebibytes
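To make the scheduling rule concrete, here is a minimal Python sketch of the fit check: a Pod fits on a Node only if its Requests, added to the Requests already placed there, stay within what the Node can allocate. The function name `fits_on_node` and all the numbers are hypothetical, and the real Scheduler applies many more filters and scoring rules than this.

```python
def fits_on_node(allocatable: dict, already_requested: dict, pod_requests: dict) -> bool:
    """Return True if the Pod's Requests fit on the Node.

    Each resource uses a single unit: CPU in millicores, memory in MiB.
    Limits are deliberately not an input: scheduling looks at Requests only.
    """
    return all(
        already_requested.get(resource, 0) + amount <= allocatable.get(resource, 0)
        for resource, amount in pod_requests.items()
    )


# Hypothetical Node with 2 cores and 4 GiB allocatable
node = {"cpu": 2000, "memory": 4096}
placed = {"cpu": 1800, "memory": 2048}   # sum of Requests already on the Node
my_app = {"cpu": 250, "memory": 256}     # Requests from the manifest above

# CPU would reach 2050m, which is more than the 2000m the Node can allocate
print(fits_on_node(node, placed, my_app))
```

Note that the Pod is rejected here even if the Node's actual CPU usage is low, because the check is based on reserved Requests rather than live usage.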

CPU Units — Millicores

# CPU units in Kubernetes
# 1 CPU = 1 vCPU = 1 core = 1000m (millicores)
#
# Examples:
# "100m" = 0.1 CPU = 10% of 1 core
# "250m" = 0.25 CPU = 25% of 1 core
# "500m" = 0.5 CPU = 50% of 1 core
# "1" = 1 CPU = 100% of 1 core
# "2" = 2 CPU = 200% (uses 2 cores)
#
# ⚡ CPU is a compressible resource:
# If a Pod uses more CPU than its Limit → throttled (slower, but not killed)
# If a Pod uses more CPU than its Requests but less than its Limit → allowed if the Node has spare CPU
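As a small illustration of the unit arithmetic above, the following Python sketch converts a CPU quantity string into cores. The function name `cpu_to_cores` is hypothetical, and it handles only plain numbers and the `m` suffix used in this article.

```python
from decimal import Decimal


def cpu_to_cores(quantity: str) -> Decimal:
    """Convert a CPU quantity such as "250m" or "2" into a number of cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # "m" means millicores: 1000m = 1 core
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


for q in ["100m", "250m", "500m", "1", "2"]:
    print(f"{q} = {cpu_to_cores(q)} cores")
```

`Decimal` is used instead of `float` so that values like 0.1 stay exact when they are summed across many containers.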

Memory Units — Mi/Gi

# Memory units in Kubernetes
# Ki = kibibyte (1024 bytes)
# Mi = mebibyte (1024 Ki = 1,048,576 bytes)
# Gi = gibibyte (1024 Mi = 1,073,741,824 bytes)
#
# Examples (rough sizing guides, not fixed rules):
# "128Mi" = 128 MiB (about 134 MB) = suits a sidecar/init container
# "256Mi" = 256 MiB (about 268 MB) = small app
# "512Mi" = 512 MiB (about 537 MB) = medium app
# "1Gi" = 1 GiB (about 1.07 GB) = large app
# "4Gi" = 4 GiB (about 4.29 GB) = database/cache
#
# ⚠️ Memory is an incompressible resource:
# If a container uses more Memory than its Limit → OOMKilled!
# Memory cannot be throttled
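The binary suffixes (`Mi`, `Gi`) are easy to confuse with the decimal ones (`M`, `G`), which Kubernetes also accepts. The Python sketch below, with a hypothetical helper named `memory_to_bytes`, shows the difference. It covers only the integer quantities and suffixes listed in the two tables inside the code, not the full quantity syntax.

```python
# Power-of-two suffixes and power-of-ten suffixes for memory quantities
BINARY = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}
DECIMAL = {"k": 1000, "M": 1000**2, "G": 1000**3}


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "128Mi" or "128M" into bytes."""
    quantity = quantity.strip()
    # Binary suffixes end in "i", so they never collide with the decimal ones
    for suffix, factor in {**BINARY, **DECIMAL}.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is already in bytes


mi = memory_to_bytes("128Mi")
m = memory_to_bytes("128M")
print(f"128Mi = {mi:,} bytes, 128M = {m:,} bytes, gap = {mi - m:,} bytes")
```

The gap is roughly 5% at the mega scale and grows to about 7% at the giga scale, which matters when a Limit is sized tightly.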

QoS Classes: Quality of Service Levels

Kubernetes assigns a QoS Class automatically based on the Requests/Limits you set:

QoS Class	Condition	Eviction order	Best for
Guaranteed	Every container: Requests = Limits (for both CPU and Memory)	Last (safest)	Production-critical apps
Burstable	At least one container has a Request or Limit, but the Pod does not meet the Guaranteed condition	Middle	General apps
BestEffort	No Requests or Limits at all	First (evicted first)	Should not be used in production

# Guaranteed (safest)
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"       # equal to Requests
    memory: "512Mi"   # equal to Requests

# Burstable (flexible)
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"       # higher than Requests
    memory: "512Mi"   # higher than Requests

# BestEffort (not recommended)
# No resources set at all
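The classification rules can be sketched in a few lines of Python. This is an illustrative approximation, not the kubelet's implementation: the function name `qos_class` is hypothetical, and it compares quantity strings literally, so `"500m"` and `"0.5"` would be treated as different values.

```python
def qos_class(containers: list) -> str:
    """Classify a Pod from its containers' resources.

    Each container is a dict like {"requests": {...}, "limits": {...}}.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # A missing Request defaults to the Limit when a Limit is set
        requests = {**limits, **container.get("requests", {})}
        for resource in ("cpu", "memory"):
            if resource in requests or resource in limits:
                any_set = True
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


guaranteed_pod = [{"requests": {"cpu": "500m", "memory": "512Mi"},
                   "limits": {"cpu": "500m", "memory": "512Mi"}}]
burstable_pod = [{"requests": {"cpu": "250m", "memory": "256Mi"},
                  "limits": {"cpu": "500m", "memory": "512Mi"}}]
best_effort_pod = [{}]

for pod in (guaranteed_pod, burstable_pod, best_effort_pod):
    print(qos_class(pod))
```

One consequence worth noticing: a Pod with several containers is Guaranteed only if every one of them qualifies, so a single sidecar without Limits makes the whole Pod Burstable.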

LimitRange — Default per Namespace

apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
  - type: Container
    default:          # default Limits (applied when a container sets none)
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:   # default Requests (applied when a container sets none)
      cpu: "100m"
      memory: "128Mi"
    max:              # highest Limit a container may set
      cpu: "2"
      memory: "4Gi"
    min:              # lowest Request a container may set
      cpu: "50m"
      memory: "64Mi"
  - type: Pod
    max:
      cpu: "4"
      memory: "8Gi"
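How the defaults are filled in can be sketched as follows. This is a simplified model based on the behaviour described in the Kubernetes documentation, under the assumption that a container which sets only a Limit gets a Request equal to that Limit rather than the `defaultRequest`. The function name `apply_defaults` is hypothetical, and the min/max validation is left out.

```python
def apply_defaults(resources: dict, default: dict, default_request: dict) -> dict:
    """Fill in missing Requests/Limits the way a Container LimitRange would."""
    limits = dict(resources.get("limits", {}))
    requests = dict(resources.get("requests", {}))
    for resource in ("cpu", "memory"):
        had_limit = resource in limits
        if not had_limit and resource in default:
            limits[resource] = default[resource]
        if resource not in requests:
            if had_limit:
                # Assumption: an explicit Limit without a Request sets Request = Limit
                requests[resource] = limits[resource]
            elif resource in default_request:
                requests[resource] = default_request[resource]
    return {"requests": requests, "limits": limits}


# Values taken from the LimitRange above
DEFAULT = {"cpu": "500m", "memory": "512Mi"}
DEFAULT_REQUEST = {"cpu": "100m", "memory": "128Mi"}

print(apply_defaults({}, DEFAULT, DEFAULT_REQUEST))
print(apply_defaults({"limits": {"memory": "1Gi"}}, DEFAULT, DEFAULT_REQUEST))
```

Running `kubectl get pod <name> -o yaml` after deployment is the reliable way to confirm which values were actually applied in your cluster.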

ResourceQuota — Total per Namespace

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: production
spec:
  hard:
    requests.cpu: "10"         # total CPU Requests in the Namespace: at most 10 cores
    requests.memory: "20Gi"    # total Memory Requests: at most 20Gi
    limits.cpu: "20"           # total CPU Limits: at most 20 cores
    limits.memory: "40Gi"      # total Memory Limits: at most 40Gi
    pods: "50"                 # at most 50 Pods

# Check quota usage
# kubectl describe resourcequota compute-quota -n production
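The admission check behind a ResourceQuota is essentially a sum and a comparison. The Python sketch below illustrates it with a hypothetical function `exceeded_quota` and made-up usage numbers. CPU is expressed in millicores so that the arithmetic stays in integers.

```python
def exceeded_quota(hard: dict, used: dict, new_pod: dict) -> list:
    """Return the quota keys that admitting new_pod would push over the hard cap."""
    return [
        key
        for key, cap in hard.items()
        if used.get(key, 0) + new_pod.get(key, 0) > cap
    ]


# Caps from the ResourceQuota above (CPU in millicores)
hard = {"requests.cpu": 10_000, "limits.cpu": 20_000, "pods": 50}
# Hypothetical current usage in the Namespace
used = {"requests.cpu": 9_800, "limits.cpu": 12_000, "pods": 40}
# A new Pod requesting 250m with a 500m Limit
new_pod = {"requests.cpu": 250, "limits.cpu": 500, "pods": 1}

print(exceeded_quota(hard, used, new_pod))
```

In this example the Pod would be rejected on `requests.cpu` alone, even though the Limits and Pod-count caps still have room.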

VPA — Vertical Pod Autoscaler

VPA adjusts a Pod's Requests/Limits automatically based on actual usage, which helps with right-sizing so you do not have to guess.

# Install VPA
# kubectl apply -f https://github.com/kubernetes/autoscaler/releases/latest/.../vpa.yaml

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"    # Auto = apply automatically (evicts Pods), Off = recommend only
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      minAllowed:
        cpu: "50m"
        memory: "64Mi"
      maxAllowed:
        cpu: "2"
        memory: "4Gi"
      controlledResources: ["cpu", "memory"]

# View recommendations
# kubectl describe vpa my-app-vpa

Goldilocks — Right-Sizing Recommendations

# Goldilocks = VPA + dashboard
# Install with Helm:
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace

# Enable it in the Namespace you want
kubectl label namespace production goldilocks.fairwinds.com/enabled=true

# Open the dashboard
kubectl port-forward svc/goldilocks-dashboard -n goldilocks 8080:80
# → http://localhost:8080
# Shows recommended Requests/Limits for every Deployment

Monitoring Resource Usage

# kubectl top: view current resource usage (requires Metrics Server)
kubectl top nodes                          # resource usage of Nodes
kubectl top pods -n production             # resource usage of Pods
kubectl top pods --sort-by=memory          # sort by memory usage
kubectl top pods --sort-by=cpu             # sort by CPU usage

# View the configured resources
kubectl describe pod my-app-pod | grep -A 5 "Requests\|Limits"

# View Node resources
kubectl describe node worker-1 | grep -A 10 "Allocated resources"

# Prometheus queries (PromQL)
# CPU usage vs Requests (the requests metric comes from kube-state-metrics):
# sum by (namespace, pod, container) (rate(container_cpu_usage_seconds_total[5m]))
#   / sum by (namespace, pod, container) (kube_pod_container_resource_requests{resource="cpu"})
#
# Memory usage vs Limits:
# container_memory_working_set_bytes / container_spec_memory_limit_bytes
#
# OOMKilled containers:
# kube_pod_container_status_last_terminated_reason{reason="OOMKilled"}

OOMKilled Troubleshooting

Cause	Symptom	Fix
Memory Limit too low	Pod restarts frequently, Exit Code 137	Raise the Memory Limit (based on monitoring data)
Memory leak in the app	Memory climbs gradually until OOMKilled	Fix the memory leak in the app code
Java heap size	JVM uses more memory than the container Limit	Set -Xmx 20-30% below the container Limit
No Limit set	Pod uses memory until the Node runs out (Node OOM)	Set a Memory Limit on every Pod
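For the Java row, the heap ceiling can be derived from the container Limit with simple arithmetic. The sketch below uses a hypothetical helper `suggested_xmx_mib` and assumes 25% headroom, the midpoint of the 20-30% range in the table. The right headroom depends on how much non-heap memory (metaspace, threads, direct buffers) the app needs.

```python
def suggested_xmx_mib(container_limit_mib: int, headroom_percent: int = 25) -> int:
    """Return a heap size in MiB that leaves headroom below the container Limit."""
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be between 0 and 99")
    # Integer arithmetic avoids floating-point rounding surprises
    return container_limit_mib * (100 - headroom_percent) // 100


limit_mib = 512  # matches the "512Mi" Limit used earlier in this article
print(f"-Xmx{suggested_xmx_mib(limit_mib)}m")
```

The remaining quarter of the Limit is what keeps the JVM's non-heap memory from pushing the container over its Limit.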

# Check for OOMKilled
kubectl get pods | grep -i oomkill
kubectl describe pod my-app-pod | grep -i oom
kubectl get events --field-selector reason=OOMKilling

# Check the exit code
kubectl get pod my-app-pod -o jsonpath='{.status.containerStatuses[0].lastState}'
# Exit Code 137 = 128 + 9 (SIGKILL); together with reason OOMKilled it means the container exceeded its Memory Limit
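Exit codes above 128 follow the shell convention of 128 plus the signal number, which is why an OOM kill shows up as 137. A short Python sketch (the function name `describe_exit_code` is hypothetical) decodes them. Keep in mind that 137 only says the process received SIGKILL, so the `reason` field is what confirms an OOM kill.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container exit code using the 128 + signal convention."""
    if code == 0:
        return "exited normally"
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            # Not a signal number known on this platform
            return f"killed by signal {code - 128}"
    return f"application error {code}"


for code in (0, 1, 137, 143):
    print(code, "->", describe_exit_code(code))
```

Exit code 143 (SIGTERM) is worth recognizing too, since it usually indicates a normal shutdown request rather than a resource problem.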

CPU Throttling Detection

# CPU throttling happens when a Pod uses more CPU than its Limit
# Symptoms: the app slows down and latency rises, but the Pod does not restart

# Detect it with cAdvisor metrics (Prometheus):
# container_cpu_cfs_throttled_seconds_total
# rate(container_cpu_cfs_throttled_periods_total[5m]) / rate(container_cpu_cfs_periods_total[5m])

# Rule of thumb: a throttle rate above 25% indicates a problem
# Fixes:
# 1. Raise the CPU Limit
# 2. Remove the CPU Limit (use Requests only), as some best-practice guides recommend
# 3. Optimize the app to use less CPU
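Because both throttling metrics are ever-increasing counters, the throttle rate has to be computed from the change between two samples rather than from the raw values. The Python sketch below shows the arithmetic with a hypothetical function `throttle_ratio` and made-up sample values.

```python
def throttle_ratio(periods_start: int, periods_end: int,
                   throttled_start: int, throttled_end: int) -> float:
    """Fraction of CFS periods in the sample window that were throttled."""
    periods = periods_end - periods_start
    if periods <= 0:
        return 0.0  # no scheduling periods elapsed, or the counter was reset
    return (throttled_end - throttled_start) / periods


# Hypothetical counter samples taken one minute apart
ratio = throttle_ratio(periods_start=1000, periods_end=1600,
                       throttled_start=100, throttled_end=280)
print(f"throttled in {ratio:.0%} of periods")
print("investigate" if ratio > 0.25 else "ok")
```

With these numbers 180 of 600 periods were throttled, which crosses the 25% threshold mentioned above.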

Resource Management Best Practices

Practice	Why
Set Requests on every container	The Scheduler needs this information; without it, scheduling is inefficient
Set Memory Limits on every container	Prevents a memory leak from causing Node OOM
CPU Limits: decide case by case	Some workloads do better without a CPU Limit (avoids throttling)
Use LimitRange	Ensures no Pod is deployed without resources set, by applying defaults
Use ResourceQuota	Caps total resources per Namespace so that one Namespace cannot consume everything
Use VPA or Goldilocks	Automated right-sizing, no guessing
Monitor continuously	kubectl top, Prometheus + Grafana dashboards
Requests = usage + 20-30% buffer	Leaves room for spikes without over-reserving
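The last row of the table is a simple formula. As a minimal sketch, assuming you already have an observed usage figure from monitoring, the hypothetical helper `recommended_request` below applies the buffer and rounds up.

```python
import math


def recommended_request(observed_usage: int, buffer_percent: int = 25) -> int:
    """Return observed usage plus a percentage buffer, rounded up.

    The unit is whatever the input uses, for example millicores or MiB.
    """
    if buffer_percent < 0:
        raise ValueError("buffer_percent must not be negative")
    return math.ceil(observed_usage * (100 + buffer_percent) / 100)


# Hypothetical observed usage: 200m CPU and 500Mi memory
print(f'cpu: "{recommended_request(200)}m"')
print(f'memory: "{recommended_request(500)}Mi"')
```

Which usage figure to feed in (average, peak, or a high percentile) is a judgment call that depends on how spiky the workload is.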

Cost Optimization: Cutting Cloud Costs with Right-Sizing

# A common problem:
# Requests higher than real usage → wasted reservations → higher cloud bill
#
# Example:
# Deployment: 10 Pods, Requests per Pod: CPU 1000m, Memory 2Gi
# Reserved: 10 CPU + 20Gi Memory
#
# Actual usage (from monitoring):
# Average per Pod: CPU 200m, Memory 500Mi
# Used: 10 x 200m = 2 CPU + 10 x 500Mi ≈ 5Gi
#
# Wasted: 8 CPU + 15Gi Memory!
# If the cloud price is $0.05/CPU/hr + $0.007/GB/hr:
# Waste per month = (8 x 0.05 + 15 x 0.007) x 720 = $363.6/month!
#
# Right-sizing:
# Requests: CPU 300m, Memory 750Mi (usage + 50% buffer)
# Saves about 70% of reserved CPU and about 63% of reserved Memory!
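The calculation above can be reproduced with a short Python sketch so you can plug in your own numbers. The function name `monthly_waste` is hypothetical, the prices are the example prices from this article rather than real cloud rates, and memory is treated as GB with 500Mi rounded to 0.5, as in the example.

```python
def monthly_waste(pods, req_cpu, req_mem_gb, used_cpu, used_mem_gb,
                  cpu_price_hr, mem_price_hr, hours=720):
    """Return (wasted CPU cores, wasted memory in GB, wasted cost per month)."""
    wasted_cpu = pods * (req_cpu - used_cpu)
    wasted_mem = pods * (req_mem_gb - used_mem_gb)
    cost = (wasted_cpu * cpu_price_hr + wasted_mem * mem_price_hr) * hours
    return wasted_cpu, wasted_mem, cost


cpu, mem, cost = monthly_waste(
    pods=10, req_cpu=1.0, req_mem_gb=2.0,   # what each Pod reserves
    used_cpu=0.2, used_mem_gb=0.5,          # what each Pod actually uses
    cpu_price_hr=0.05, mem_price_hr=0.007,  # example prices from the article
)
print(f"wasted: {cpu:.0f} CPU, {mem:.0f} GB, ${cost:.2f}/month")

# Share of reserved CPU freed by moving from 1000m to 300m per Pod
print(f"CPU reservation saved: {1 - 300 / 1000:.0%}")
```

The 720 hours correspond to a 30-day month, so adjust `hours` if your billing period differs.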

Tip: Use Goldilocks to view the recommended Requests/Limits for every Deployment, then right-size the whole cluster. Depending on how over-provisioned the workloads are, this can cut cloud costs by 30-70%.

Summary: Resource Management Checklist

Step	Task
1	Set Requests + Limits on every container (at least a Memory Limit)
2	Create a LimitRange in every Namespace (default resources)
3	Create a ResourceQuota for the production Namespace
4	Install Metrics Server + Prometheus + Grafana
5	Monitor resource usage regularly
6	Install VPA or Goldilocks for right-sizing
7	Check for OOMKilled + CPU throttling events
8	Right-size every quarter for cost optimization

Kubernetes Resource Management is not complicated, but it has to be done seriously. Set Requests/Limits correctly, monitor continuously, and right-size regularly, and your cluster will be stable, cost-efficient, and able to scale effectively.