OPA Gatekeeper High Availability HA Setup

2026-02-09 · อ. บอม, SiamCafe.net · 8,331 words

OPA Gatekeeper HA

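This article covers running OPA Gatekeeper, a Kubernetes admission controller, in a high availability setup: policies written in Rego as ConstraintTemplates, the admission webhook, the audit controller, and monitoring with Prometheus.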

| Component | Replicas | Purpose | HA Config |
|---|---|---|---|
| Controller Manager | 3 | Webhook that checks admission requests | AntiAffinity + PDB |
| Audit Controller | 2 | Checks existing resources | AntiAffinity + PDB |
| Webhook Config | - | How Kubernetes calls Gatekeeper | failurePolicy + timeout |
| Constraint Templates | - | Policy template (Rego) | Version control in Git |
| Constraints | - | Policy instance + parameters | dryrun → enforce |

HA Configuration

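# === OPA Gatekeeper HA Setup ===

# Helm install with HA settings
# Note: annotation values must be strings, so they are passed with --set-string.
# Note: upstream Gatekeeper runs audit as a single replica; confirm that your
#       chart version actually exposes audit.replicas before relying on it.
# Note: a podAffinityTerm needs a labelSelector to match any pods. The label
#       below is assumed from the chart's default pod labels; verify it with
#       `kubectl get pods -n gatekeeper-system --show-labels`.
# helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
# helm install gatekeeper gatekeeper/gatekeeper \
#   --namespace gatekeeper-system --create-namespace \
#   --set replicas=3 \
#   --set audit.replicas=2 \
#   --set-string podAnnotations."prometheus\.io/scrape"=true \
#   --set-string podAnnotations."prometheus\.io/port"=8888 \
#   --set pdb.controllerManager.minAvailable=2 \
#   --set controllerManager.resources.requests.cpu=500m \
#   --set controllerManager.resources.requests.memory=512Mi \
#   --set controllerManager.resources.limits.cpu=1000m \
#   --set controllerManager.resources.limits.memory=1Gi \
#   --set controllerManager.priorityClassName=system-cluster-critical \
#   --set controllerManager.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0].weight=100 \
#   --set controllerManager.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0].podAffinityTerm.labelSelector.matchLabels."gatekeeper\.sh/operation"=webhook \
#   --set controllerManager.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution[0].podAffinityTerm.topologyKey=kubernetes.io/hostname

# Webhook Configuration (fragment showing only the HA-relevant fields)
# apiVersion: admissionregistration.k8s.io/v1
# kind: ValidatingWebhookConfiguration
# webhooks:
#   - name: validation.gatekeeper.sh
#     failurePolicy: Ignore  # Allow requests if Gatekeeper is down (favours availability)
#     timeoutSeconds: 5
#     matchPolicy: Exact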

from dataclasses import dataclass

@dataclass
class HAConfig:
    setting: str
    value: str
    purpose: str
    risk_if_missing: str

ha_configs = [
    HAConfig("replicas",
        "3 (Controller Manager)",
        "Webhook HA: if one pod dies, two remain to serve requests",
        "Single point of failure: if the pod dies, the webhook stops working"),
    HAConfig("audit.replicas",
        "2",
        "Audit HA: keeps checking existing resources continuously",
        "Existing violations go undetected"),
    HAConfig("podAntiAffinity",
        "preferredDuringScheduling hostname",
        "Spreads pods across different nodes",
        "All pods land on one node; losing that node takes them all down"),
    HAConfig("PodDisruptionBudget",
        "minAvailable: 2",
        "Keeps at least 2 webhook pods running during voluntary disruptions such as node drains",
        "kubectl drain may evict every pod at once"),
    HAConfig("failurePolicy",
        "Ignore (recommended) or Fail (strict)",
        "Ignore: allow requests when Gatekeeper does not respond; Fail: reject them",
        "Fail + Gatekeeper down = cluster lock: matching create/update requests are all rejected"),
    HAConfig("priorityClassName",
        "system-cluster-critical",
        "Gatekeeper pods are scheduled ahead of normal workloads",
        "When nodes are full, Gatekeeper may be preempted or evicted"),
]

print("=== HA Configuration ===")
for h in ha_configs:
    print(f"  [{h.setting}] = {h.value}")
    print(f"    Purpose: {h.purpose}")
    print(f"    Risk: {h.risk_if_missing}")
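The trade-off between failurePolicy Ignore and Fail is easier to see when the API server's handling of a webhook call is modelled directly. The following is a minimal sketch, not Kubernetes code: `admit`, `WebhookResult`, and `gatekeeper_down` are hypothetical names used only for illustration, and the real API server logic has more cases (for example, partial failures across several webhooks).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class WebhookResult:
    allowed: bool
    reason: str


def admit(call_webhook: Callable[[], bool], failure_policy: str) -> WebhookResult:
    """Model one admission decision (hypothetical helper, not a real API).

    call_webhook returns the policy verdict (True = allow), or raises
    TimeoutError/ConnectionError when no Gatekeeper pod answers in time.
    """
    if failure_policy not in ("Ignore", "Fail"):
        raise ValueError(f"unknown failurePolicy: {failure_policy}")
    try:
        verdict = call_webhook()
    except (TimeoutError, ConnectionError) as exc:
        if failure_policy == "Ignore":
            # Availability first: the request passes without a policy check.
            return WebhookResult(True, f"webhook unavailable ({exc}); allowed unchecked")
        # Safety first: nothing gets through while Gatekeeper is down.
        return WebhookResult(False, f"webhook unavailable ({exc}); rejected")
    return WebhookResult(verdict, "policy verdict from Gatekeeper")


def gatekeeper_down() -> bool:
    """Simulates every webhook replica being unreachable."""
    raise TimeoutError("no response within 5s")


if __name__ == "__main__":
    for policy in ("Ignore", "Fail"):
        result = admit(gatekeeper_down, policy)
        print(f"failurePolicy={policy}: allowed={result.allowed} ({result.reason})")
```

With Ignore, an outage silently lets unchecked resources in, which is why the audit controller and the monitoring described later still matter; with Fail, the replica count, anti-affinity, and PDB carry the whole burden of keeping the cluster usable.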

ConstraintTemplate

# === ConstraintTemplate Examples ===

# apiVersion: templates.gatekeeper.sh/v1
# kind: ConstraintTemplate
# metadata:
#   name: k8srequiredlabels
# spec:
#   crd:
#     spec:
#       names:
#         kind: K8sRequiredLabels
#       validation:
#         openAPIV3Schema:
#           type: object
#           properties:
#             labels:
#               type: array
#               items: { type: string }
#   targets:
#     - target: admission.k8s.gatekeeper.sh
#       rego: |
#         package k8srequiredlabels
#         violation[{"msg": msg}] {
#           provided := {label | input.review.object.metadata.labels[label]}
#           required := {label | label := input.parameters.labels[_]}
#           missing := required - provided
#           count(missing) > 0
#           msg := sprintf("Missing labels: %v", [missing])
#         }

# Constraint Instance
# apiVersion: constraints.gatekeeper.sh/v1beta1
# kind: K8sRequiredLabels
# metadata:
#   name: require-team-label
# spec:
#   enforcementAction: deny  # deny | dryrun | warn
#   match:
#     kinds:
#       - apiGroups: [""]
#         kinds: ["Pod"]
#       - apiGroups: ["apps"]
#         kinds: ["Deployment"]
#     excludedNamespaces: ["kube-system", "gatekeeper-system"]
#   parameters:
#     labels: ["app", "team", "env"]

@dataclass
class PolicyExample:
    name: str
    purpose: str
    rego_logic: str
    parameters: str

policies = [
    PolicyExample("K8sRequiredLabels",
        "Every resource must carry the required labels",
        "Checks that metadata.labels contains every required label",
        "labels: ['app', 'team', 'env']"),
    PolicyExample("K8sContainerLimits",
        "Every container must set resource limits",
        "Checks containers[].resources.limits.cpu/memory",
        "cpu: '2', memory: '4Gi'"),
    PolicyExample("K8sAllowedRepos",
        "Images must come from trusted registries only",
        "Checks that containers[].image starts with an allowed repo",
        "repos: ['gcr.io/my-project/', 'registry.example.com/']"),
    PolicyExample("K8sBlockPrivileged",
        "Privileged containers are forbidden",
        "Checks securityContext.privileged != true",
        "None (blocks all)"),
    PolicyExample("K8sDisallowLatest",
        "The :latest image tag is forbidden",
        "Checks that the image tag is not 'latest' and that a tag is present",
        "None (blocks all)"),
]

print("=== Policy Examples ===")
for p in policies:
    print(f"  [{p.name}] {p.purpose}")
    print(f"    Rego: {p.rego_logic}")
    print(f"    Params: {p.parameters}")
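The set arithmetic in the K8sRequiredLabels rule (required labels minus provided labels) can be mirrored in plain Python to test expectations before writing Rego. This is only a sketch of the same logic under the assumption that the object is a plain dict shaped like a Kubernetes manifest; `missing_labels` and `violations` are hypothetical helper names, not part of Gatekeeper.

```python
from typing import Dict, List, Set


def missing_labels(obj: Dict, required: List[str]) -> Set[str]:
    """Return the required label keys that the object does not carry."""
    # metadata.labels may be absent or null on a real object.
    provided = set((obj.get("metadata", {}).get("labels")) or {})
    return set(required) - provided


def violations(obj: Dict, required: List[str]) -> List[str]:
    """Mirror the Rego rule: one message when any label is missing."""
    missing = missing_labels(obj, required)
    if missing:
        return [f"Missing labels: {sorted(missing)}"]
    return []


if __name__ == "__main__":
    pod = {"kind": "Pod", "metadata": {"name": "web", "labels": {"app": "web"}}}
    for msg in violations(pod, ["app", "team", "env"]):
        print(msg)
```

Only label keys are compared, as in the Rego rule above; label values are not inspected.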

Monitoring & Alerting

# === Gatekeeper Monitoring ===

@dataclass
class GKMetric:
    metric: str
    type: str
    alert_condition: str
    action: str

metrics = [
    GKMetric("gatekeeper_violations",
        "Gauge (per enforcement_action)",
        "Rises by more than 10 within 5 minutes",
        "Find which constraint's violations grew (check constraint status) and notify the owning team"),
    # Older Gatekeeper releases expose this as gatekeeper_request_duration_seconds.
    GKMetric("gatekeeper_validation_request_duration_seconds",
        "Histogram",
        "p99 > 3 seconds",
        "Look for overly complex Rego policies and optimize them"),
    GKMetric("gatekeeper_audit_duration_seconds",
        "Histogram",
        "> 30 seconds",
        "Cluster is too large for the current audit schedule: audit less often (raise the audit interval)"),
    GKMetric("gatekeeper_constraint_templates",
        "Gauge (per status)",
        "status != active",
        "A ConstraintTemplate has an error: check the Rego syntax"),
    GKMetric("up{job='gatekeeper'}",
        "Gauge",
        "== 0 (Pod down)",
        "Alert on pod restarts; check for OOM kills and resource limits"),
]

print("=== Monitoring Metrics ===")
for m in metrics:
    print(f"  [{m.metric}] ({m.type})")
    print(f"    Alert: {m.alert_condition}")
    print(f"    Action: {m.action}")
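The "more than 10 within 5 minutes" condition on gatekeeper_violations amounts to comparing the newest and oldest gauge samples inside a sliding window. The sketch below shows that check on raw samples; it is roughly what a PromQL `delta()` over 5 minutes computes, minus Prometheus's extrapolation. The function names and the sample data are made up for illustration.

```python
from typing import List, Tuple

# (unix timestamp in seconds, gauge value), assumed sorted by timestamp
Sample = Tuple[float, float]


def change_over_window(samples: List[Sample], now: float, window_s: float = 300.0) -> float:
    """Difference between the last and first sample inside the window."""
    in_window = [value for ts, value in samples if now - window_s <= ts <= now]
    if len(in_window) < 2:
        return 0.0  # not enough data to measure a change
    return in_window[-1] - in_window[0]


def should_alert(samples: List[Sample], now: float,
                 threshold: float = 10.0, window_s: float = 300.0) -> bool:
    """True when the gauge rose by more than `threshold` within the window."""
    return change_over_window(samples, now, window_s) > threshold


if __name__ == "__main__":
    # Hypothetical scrape results: violations jump from 4 to 16 in four minutes.
    scraped = [(0.0, 4.0), (60.0, 5.0), (240.0, 16.0)]
    print("alert:", should_alert(scraped, now=300.0))
```

In a real deployment this logic would live in a Prometheus alerting rule rather than application code; the sketch is only meant to make the threshold semantics explicit.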

Tips

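What is OPA Gatekeeper?

OPA Gatekeeper is a policy engine that runs as a Kubernetes admission controller. Policies are written in Rego and packaged as ConstraintTemplates. The webhook rejects requests that violate them (for example privileged containers, untrusted registries, missing labels, or missing resource limits), and the audit controller checks resources that already exist.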

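How do you set up HA?

Run 3 controller manager replicas with pod anti-affinity and a PDB with minAvailable 2. Set failurePolicy to Ignore with a 5s timeout, use the system-cluster-critical priority class, define resource requests and limits, and run 2 audit replicas.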
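The interaction between the replica count and a PDB with minAvailable is simple arithmetic, sketched here as a hedged illustration: the number of pods that may be voluntarily evicted is the number of healthy pods minus minAvailable, never below zero. `disruptions_allowed` is a hypothetical helper name, not a Kubernetes API.

```python
def disruptions_allowed(healthy_pods: int, min_available: int) -> int:
    """Pods that can be evicted voluntarily without breaking the budget."""
    return max(0, healthy_pods - min_available)


if __name__ == "__main__":
    # 3 replicas, minAvailable 2: a node drain may evict one pod at a time.
    print(disruptions_allowed(healthy_pods=3, min_available=2))
    # One replica already unhealthy: further evictions are blocked.
    print(disruptions_allowed(healthy_pods=2, min_available=2))
```

This also shows why minAvailable should stay below the replica count: with 2 replicas and minAvailable 2, no eviction is ever permitted and node drains stall.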

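How do you write a ConstraintTemplate?

Write the rule in Rego: a violation rule produces a msg and reads the object under review from input.review.object. The template's CRD section validates the parameters. A Constraint then sets enforcementAction (deny, dryrun, or warn) and scopes the policy with match kinds and excludedNamespaces.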
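How match kinds and excludedNamespaces scope a Constraint can be illustrated with a small matcher. This is a deliberately simplified sketch: it assumes exact namespace names and treats a missing kinds list as "match everything", while Gatekeeper's real matcher supports more fields (such as namespaces, labelSelector, and scope). `constraint_applies` is a hypothetical function name.

```python
from typing import Dict


def constraint_applies(match: Dict, api_group: str, kind: str, namespace: str) -> bool:
    """Decide whether a constraint's match block covers the given object."""
    if namespace in match.get("excludedNamespaces", []):
        return False
    kind_entries = match.get("kinds")
    if not kind_entries:
        return True  # assumption: no kinds listed means every kind matches
    for entry in kind_entries:
        groups = entry.get("apiGroups", [])
        kinds = entry.get("kinds", [])
        group_ok = api_group in groups or "*" in groups
        kind_ok = kind in kinds or "*" in kinds
        if group_ok and kind_ok:
            return True
    return False


if __name__ == "__main__":
    match = {
        "kinds": [
            {"apiGroups": [""], "kinds": ["Pod"]},
            {"apiGroups": ["apps"], "kinds": ["Deployment"]},
        ],
        "excludedNamespaces": ["kube-system", "gatekeeper-system"],
    }
    print(constraint_applies(match, "", "Pod", "default"))
    print(constraint_applies(match, "apps", "Deployment", "kube-system"))
```

Excluding gatekeeper-system and kube-system, as in the match block here, keeps a strict policy from blocking the components that enforce it.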

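How do you monitor it?

Scrape Gatekeeper's Prometheus metrics on port 8888 and track gatekeeper_violations, request duration, audit duration, and gatekeeper_constraint_templates. Put them on a Grafana dashboard, send alerts to Slack, and keep an eye on the webhook certificate.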

Summary

Running OPA Gatekeeper in production as a Kubernetes admission controller combines Rego policies in ConstraintTemplates with HA settings (multiple replicas, anti-affinity, a PDB, and a deliberate failurePolicy) and Prometheus monitoring.
