What Is OPA Gatekeeper?
OPA (Open Policy Agent) Gatekeeper is a policy engine for Kubernetes. It runs as an admission controller that checks resources before they are created in the cluster; any resource that violates a policy is rejected.
For cost optimization, Gatekeeper enforces policies that keep spending under control, for example: requiring resource requests/limits on every pod, preventing pods from requesting more resources than they need, requiring spot instances, setting namespace quotas, and restricting resources that drive up cost.
Applied consistently, these policies can cut cloud costs by 20-40% without hurting performance, because they stop resource waste at the source: every deployment has to pass a policy check before it is admitted.
Installing OPA Gatekeeper on Kubernetes
Setup Gatekeeper for cost management
# === OPA Gatekeeper Installation ===
# 1. Install Gatekeeper via Helm
helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update
helm install gatekeeper gatekeeper/gatekeeper \
  --namespace gatekeeper-system \
  --create-namespace \
  --set replicas=3 \
  --set audit.replicas=1 \
  --set auditInterval=60
# 2. Verify Installation
kubectl get pods -n gatekeeper-system
kubectl get crd | grep gatekeeper
# 3. Create Constraint Template (Resource Limits Required)
cat > template-resource-limits.yaml << 'EOF'
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresources
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredResources
      validation:
        openAPIV3Schema:
          type: object
          properties:
            maxCPU:
              type: string
            maxMemory:
              type: string
            requireLimits:
              type: boolean
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredresources

        # Containers of a bare Pod, or of a workload's pod template
        # (Deployment, StatefulSet, DaemonSet).
        containers[c] {
          c := input.review.object.spec.containers[_]
        }
        containers[c] {
          c := input.review.object.spec.template.spec.containers[_]
        }

        # Convert a CPU quantity string ("500m", "2") to millicores so
        # limits are compared numerically rather than as strings.
        cpu_millicores(q) = m {
          endswith(q, "m")
          m := to_number(trim_suffix(q, "m"))
        }
        cpu_millicores(q) = m {
          not endswith(q, "m")
          m := to_number(q) * 1000
        }

        violation[{"msg": msg}] {
          container := containers[_]
          not container.resources.limits
          msg := sprintf("Container '%v' must have resource limits", [container.name])
        }
        violation[{"msg": msg}] {
          container := containers[_]
          not container.resources.requests
          msg := sprintf("Container '%v' must have resource requests", [container.name])
        }
        violation[{"msg": msg}] {
          container := containers[_]
          cpu_limit := container.resources.limits.cpu
          max_cpu := input.parameters.maxCPU
          cpu_millicores(cpu_limit) > cpu_millicores(max_cpu)
          msg := sprintf("Container '%v' CPU limit %v exceeds max %v", [container.name, cpu_limit, max_cpu])
        }
EOF
kubectl apply -f template-resource-limits.yaml
# 4. Create Constraint
cat > constraint-resource-limits.yaml << 'EOF'
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResources
metadata:
  name: require-resource-limits
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
      - apiGroups: ["apps"]
        kinds: ["Deployment", "StatefulSet", "DaemonSet"]
    excludedNamespaces:
      - kube-system
      - gatekeeper-system
  parameters:
    maxCPU: "4"
    maxMemory: "8Gi"
    requireLimits: true
EOF
kubectl apply -f constraint-resource-limits.yaml
echo "Gatekeeper installed and configured"
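The CPU ceiling in the constraint is a Kubernetes quantity string, so a limit such as "500m" has to be normalised to the same unit as "4" before the two can be compared. The Python sketch below illustrates that normalisation outside the cluster. The helper names `cpu_to_millicores` and `exceeds_max_cpu` are hypothetical, and only plain and whole-millicore notation is handled.

```python
def cpu_to_millicores(quantity) -> int:
    """Convert a Kubernetes CPU quantity ("500m", "2", 0.5) to millicores."""
    if isinstance(quantity, (int, float)):
        return int(round(quantity * 1000))
    text = str(quantity).strip()
    if text.endswith("m"):
        # Already expressed in millicores, e.g. "500m".
        return int(text[:-1])
    # Whole or fractional cores, e.g. "4" or "0.5".
    return int(round(float(text) * 1000))


def exceeds_max_cpu(limit, max_cpu) -> bool:
    """True when the container limit is above the allowed maximum."""
    return cpu_to_millicores(limit) > cpu_to_millicores(max_cpu)


if __name__ == "__main__":
    for limit in ["500m", "4", "4500m", "10"]:
        print(limit, "exceeds 4 cores:", exceeds_max_cpu(limit, "4"))
```

A plain string comparison would get both edge cases wrong: "500m" sorts after "4" and "10" sorts before it, which is why the values are converted first.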
Cost Optimization Policies
Policies for reducing costs
#!/usr/bin/env python3
# cost_policies.py: Cost Optimization Policies
import json
import logging
from typing import Dict, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("cost")


class CostOptimizationPolicies:
    def __init__(self):
        self.policies = {}

    def policy_catalog(self):
        """Catalog of cost policies with their expected impact."""
        return {
            "require_resource_limits": {
                "description": "Every container must define resource requests and limits",
                "impact": "Prevents unbounded resource usage; reduces cost by 15-25%",
                "rego": """
                    violation[{"msg": msg}] {
                        container := input.review.object.spec.containers[_]
                        not container.resources.limits.cpu
                        msg := sprintf("Container '%v' missing CPU limit", [container.name])
                    }
                """,
            },
            "max_replicas": {
                "description": "Cap the maximum number of replicas per environment",
                "impact": "Prevents over-provisioning; saves 10-20%",
                "max_values": {
                    "development": 2,
                    "staging": 3,
                    "production": 10,
                },
            },
            "require_spot_tolerations": {
                "description": "Non-critical workloads must run on spot instances",
                "impact": "Reduces compute cost by 60-70% for spot-eligible workloads",
                "applicable_to": ["batch jobs", "dev environments", "CI/CD runners"],
            },
            "block_expensive_storage": {
                "description": "Block storage classes that are more expensive than necessary",
                "impact": "Reduces storage cost by 30-50%",
                "allowed_classes": ["gp3", "standard"],
                "blocked_classes": ["io2", "io1"],
            },
            "require_autoscaling": {
                "description": "Production workloads must have an HPA",
                "impact": "Scales down when load is low; saves 20-40%",
            },
            "namespace_quotas": {
                "description": "Set a budget per namespace/team",
                "impact": "Keeps each team within its allocated budget",
                "example_quotas": {
                    "dev-team-a": {"cpu": "20", "memory": "40Gi", "pods": "50"},
                    "dev-team-b": {"cpu": "16", "memory": "32Gi", "pods": "40"},
                    "staging": {"cpu": "32", "memory": "64Gi", "pods": "100"},
                },
            },
            "require_cost_labels": {
                "description": "Every resource must carry a cost-center label",
                "impact": "Tracks cost per team/project",
                "required_labels": ["cost-center", "team", "environment"],
            },
        }

    def estimated_savings(self):
        """Illustrative before/after figures for a cluster applying all policies."""
        return {
            "before": {
                "monthly_cost": 50000,
                "currency": "USD",
                "issues": ["No resource limits", "Over-provisioned", "No spot usage", "No autoscaling"],
            },
            "after": {
                "monthly_cost": 30000,
                "savings_pct": 40,
                "savings_usd": 20000,
                "policies_applied": 7,
            },
        }


policies = CostOptimizationPolicies()
catalog = policies.policy_catalog()
print("Cost Optimization Policies:")
for name, info in catalog.items():
    print(f"  {name}: {info['impact']}")

savings = policies.estimated_savings()
print(f"\nEstimated Savings: ${savings['after']['savings_usd']:,}/month ({savings['after']['savings_pct']}%)")
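As an illustration of how the `max_replicas` entry in the catalog could be evaluated, the sketch below checks a replica count against the per-environment caps listed above. The function name `replica_violation` and the decision to flag unknown environments are assumptions made for this example, not part of Gatekeeper.

```python
from typing import Optional

# Per-environment caps, copied from the max_replicas policy in the catalog.
MAX_REPLICAS = {"development": 2, "staging": 3, "production": 10}


def replica_violation(environment: str, replicas: int) -> Optional[str]:
    """Return a violation message, or None when the replica count is allowed."""
    limit = MAX_REPLICAS.get(environment)
    if limit is None:
        # Assumption: an environment without a cap is treated as a violation.
        return f"Unknown environment '{environment}'"
    if replicas > limit:
        return f"{replicas} replicas exceeds max {limit} for {environment}"
    return None


if __name__ == "__main__":
    print(replica_violation("staging", 3))
    print(replica_violation("development", 5))
```

A check like this could run in CI before manifests reach the cluster, so that developers see the violation earlier than at admission time.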
Resource Quota Enforcement
Enforcing resource quotas
# === Resource Quota Enforcement ===
# 1. Namespace Resource Quotas
cat > namespace-quotas.yaml << 'EOF'
# Development namespace quota
apiVersion: v1
kind: ResourceQuota
metadata:
  name: dev-quota
  namespace: development
spec:
  hard:
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
    pods: "50"
    persistentvolumeclaims: "20"
    services.loadbalancers: "2"
---
# Limit Range (default limits per container)
apiVersion: v1
kind: LimitRange
metadata:
  name: dev-limit-range
  namespace: development
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"
        memory: "512Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      max:
        cpu: "2"
        memory: "4Gi"
      min:
        cpu: "50m"
        memory: "64Mi"
    - type: Pod
      max:
        cpu: "4"
        memory: "8Gi"
---
# Production namespace quota (higher limits)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: prod-quota
  namespace: production
spec:
  hard:
    requests.cpu: "100"
    requests.memory: "200Gi"
    limits.cpu: "200"
    limits.memory: "400Gi"
    pods: "200"
    persistentvolumeclaims: "50"
EOF
kubectl apply -f namespace-quotas.yaml
# 2. Gatekeeper Policy: Require Cost Labels
cat > template-cost-labels.yaml << 'EOF'
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          type: object
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("Missing required labels: %v", [missing])
        }
EOF
cat > constraint-cost-labels.yaml << 'EOF'
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-cost-labels
spec:
  match:
    kinds:
      - apiGroups: ["apps"]
        kinds: ["Deployment", "StatefulSet"]
    excludedNamespaces:
      - kube-system
      - gatekeeper-system
  parameters:
    labels:
      - "cost-center"
      - "team"
      - "environment"
EOF
kubectl apply -f template-cost-labels.yaml
kubectl apply -f constraint-cost-labels.yaml
echo "Resource quotas and labels enforced"
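The label rule above boils down to a set difference between the required labels and the labels a resource actually carries. The following Python sketch mirrors that logic so a manifest can be checked locally before it is applied. `missing_labels` is a hypothetical helper, and the manifest is assumed to be already parsed into a dict.

```python
# Labels required by the require-cost-labels constraint.
REQUIRED_LABELS = frozenset({"cost-center", "team", "environment"})


def missing_labels(manifest: dict, required=REQUIRED_LABELS) -> set:
    """Return the required labels that the manifest does not define."""
    labels = (manifest.get("metadata") or {}).get("labels") or {}
    return set(required) - set(labels)


if __name__ == "__main__":
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "api", "labels": {"team": "payments"}},
    }
    print(sorted(missing_labels(deployment)))
```

An empty result means the resource would pass the label constraint; any returned names correspond to the labels Gatekeeper would report as missing.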
Automated Cost Controls
Controls that keep costs in check automatically
#!/usr/bin/env python3
# cost_controls.py: Automated Cost Controls
import json
import logging
from typing import Dict, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("controls")


class AutomatedCostControls:
    def __init__(self):
        self.rules = {}

    def idle_resource_detection(self):
        """Detect and handle idle resources"""
        return {
            "detection_rules": {
                "idle_pods": {
                    "condition": "CPU utilization < 5% for 24 hours",
                    "action": "Alert team, suggest scale down",
                },
                "idle_pvcs": {
                    "condition": "PVC not mounted to any pod for 7 days",
                    "action": "Alert, suggest deletion",
                },
                "oversized_pods": {
                    "condition": "CPU request > 2x actual usage (7-day avg)",
                    "action": "Suggest right-sizing",
                },
                "unused_load_balancers": {
                    "condition": "No traffic for 7 days",
                    "action": "Alert, suggest deletion",
                    "cost": "$15-20/month each",
                },
            },
            "kubectl_commands": {
                "find_oversized": "kubectl top pods --all-namespaces --sort-by=cpu",
                "find_idle_pvcs": "kubectl get pvc --all-namespaces -o json | jq '.items[] | select(.status.phase==\"Bound\")'",
                "check_hpa": "kubectl get hpa --all-namespaces",
            },
        }

    def scheduling_policies(self):
        """Scheduling and sizing policies with their expected savings."""
        return {
            "dev_shutdown": {
                "description": "Shut down dev environments outside working hours",
                "schedule": "Scale to 0 at 20:00, scale up at 08:00 (Mon-Fri)",
                "savings": "60% of dev costs",
                "tool": "kube-downscaler or CronJob",
            },
            "spot_instance_policy": {
                "description": "Use spot instances for non-critical workloads",
                "eligible": ["dev", "staging", "batch", "ci-cd"],
                "not_eligible": ["production databases", "stateful services"],
                "savings": "60-70% vs on-demand",
            },
            "right_sizing": {
                "description": "Adjust resource requests/limits to match actual usage",
                "tool": "Kubernetes VPA (Vertical Pod Autoscaler)",
                "approach": "Monitor actual usage for 7 days, then adjust requests to P95",
                "savings": "20-40%",
            },
        }


controls = AutomatedCostControls()
idle = controls.idle_resource_detection()
print("Idle Resource Detection:")
for rule, info in idle["detection_rules"].items():
    print(f"  {rule}: {info['condition']}")

sched = controls.scheduling_policies()
print("\nScheduling Policies:")
for policy, info in sched.items():
    print(f"  {policy}: {info['savings']}")
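The right-sizing approach (adjust requests to P95 of observed usage) and the oversized-pod rule (request above 2x average usage) can be expressed as a small calculation. The sketch below assumes usage samples are already available as a list of millicore values. The names `percentile` and `suggest_cpu_request` are hypothetical, and the nearest-rank percentile method is one choice among several.

```python
def percentile(samples, pct: int):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1-100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def suggest_cpu_request(usage_millicores, current_request: int) -> dict:
    """Compare a CPU request with observed usage and suggest a P95-based value."""
    average = sum(usage_millicores) / len(usage_millicores)
    p95 = percentile(usage_millicores, 95)
    return {
        "average": average,
        "p95": p95,
        # Rule from the detection table: request > 2x average usage.
        "oversized": current_request > 2 * average,
        "suggested_request": p95,
    }


if __name__ == "__main__":
    # Hypothetical samples: usage of 1..100 millicores, request of 500m.
    print(suggest_cpu_request(list(range(1, 101)), current_request=500))
```

In practice the samples would come from a metrics source such as Prometheus, and the suggestion would normally get some headroom added before it is applied.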
Monitoring and Reporting
Dashboard for tracking costs
#!/usr/bin/env python3
# cost_dashboard.py: Cost Monitoring Dashboard
import json
import logging
from typing import Dict

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("dashboard")


class CostDashboard:
    def __init__(self):
        self.data = {}

    def dashboard(self):
        """Illustrative dashboard data for one month."""
        return {
            "monthly_overview": {
                "total_cost": 30000,
                "budget": 35000,
                "budget_usage_pct": 85.7,
                "trend": "decreasing (-12% MoM)",
                "savings_from_policies": 8500,
            },
            "cost_by_namespace": {
                "production": {"cost": 18000, "pct": 60},
                "staging": {"cost": 4500, "pct": 15},
                "development": {"cost": 3000, "pct": 10},
                "ci-cd": {"cost": 2500, "pct": 8.3},
                "monitoring": {"cost": 2000, "pct": 6.7},
            },
            "cost_by_resource": {
                "compute": {"cost": 20000, "pct": 66.7},
                "storage": {"cost": 5000, "pct": 16.7},
                "network": {"cost": 3000, "pct": 10},
                "load_balancers": {"cost": 2000, "pct": 6.6},
            },
            "policy_violations_30d": {
                "missing_resource_limits": 12,
                "missing_cost_labels": 8,
                "exceeded_max_replicas": 3,
                "blocked_expensive_storage": 2,
                "total_blocked": 25,
                "total_allowed": 1250,
            },
            "recommendations": [
                {"action": "Right-size 15 oversized pods", "savings": "$2,100/month"},
                {"action": "Delete 8 unused PVCs", "savings": "$320/month"},
                {"action": "Enable spot for CI/CD", "savings": "$1,500/month"},
                {"action": "Schedule dev shutdown nights/weekends", "savings": "$1,800/month"},
            ],
        }

    def tools(self):
        """Cost monitoring tools that complement Gatekeeper."""
        return {
            "kubecost": {
                "description": "Kubernetes cost monitoring (open source)",
                "install": "helm install kubecost kubecost/cost-analyzer",
                "features": ["Real-time cost allocation", "Savings recommendations", "Alerts"],
            },
            "opencost": {
                "description": "CNCF cost monitoring (fully open source)",
                "install": "helm install opencost opencost/opencost",
                "features": ["Cost allocation", "Prometheus integration", "API"],
            },
            "infracost": {
                "description": "Cost estimation for Terraform",
                "install": "brew install infracost",
                "features": ["PR cost comments", "CI/CD integration", "Policy checks"],
            },
        }


dash = CostDashboard()
data = dash.dashboard()
overview = data["monthly_overview"]
print(f"Monthly Cost: ${overview['total_cost']:,} ({overview['budget_usage_pct']}% of budget)")
print(f"Savings from policies: ${overview['savings_from_policies']:,}")
print(f"\nViolations blocked: {data['policy_violations_30d']['total_blocked']}")
print("\nRecommendations:")
for rec in data["recommendations"]:
    print(f"  {rec['action']}: {rec['savings']}")

tools = dash.tools()
print("\nCost Tools:")
for name, info in tools.items():
    print(f"  {name}: {info['description']}")
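The percentages in the dashboard are hard-coded, but they follow directly from the cost figures. As a rough sketch of that arithmetic, the helpers below derive budget usage and per-namespace share from raw costs. `budget_usage_pct` and `cost_shares` are hypothetical names introduced for this example.

```python
def budget_usage_pct(total_cost: float, budget: float) -> float:
    """Share of the budget consumed, as a percentage rounded to one decimal."""
    return round(total_cost * 100 / budget, 1)


def cost_shares(costs: dict) -> dict:
    """Each entry's share of the total cost, as a percentage rounded to one decimal."""
    total = sum(costs.values())
    return {name: round(cost * 100 / total, 1) for name, cost in costs.items()}


if __name__ == "__main__":
    by_namespace = {
        "production": 18000,
        "staging": 4500,
        "development": 3000,
        "ci-cd": 2500,
        "monitoring": 2000,
    }
    print(budget_usage_pct(sum(by_namespace.values()), 35000))
    print(cost_shares(by_namespace))
```

Computing the percentages rather than storing them keeps the dashboard consistent when the underlying cost data changes.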
FAQ: Frequently Asked Questions
Q: How do OPA Gatekeeper and Kyverno differ?
A: OPA Gatekeeper uses the Rego language to write policies. Rego is very flexible and supports complex logic, and Gatekeeper is part of the wider OPA ecosystem, which is not limited to Kubernetes, but the learning curve is steep. Kyverno uses YAML to write policies, is Kubernetes-native and easier to learn, and offers mutating policies (modifying a resource before it is admitted) as well as image verification. Choose Gatekeeper if you need complex policies, already use OPA elsewhere, or your team knows Rego. Choose Kyverno if you want to get started quickly, do not want to learn Rego, or need mutating policies.
Q: Does Gatekeeper slow down deployments?
A: Only slightly. Gatekeeper is an admission webhook that runs when resources are created or updated, so it has no effect on runtime performance. It adds roughly 5-50 ms of latency per admission request (depending on the number and complexity of the policies). A deployment normally takes seconds to minutes, so an extra 50 ms is negligible. Tips to reduce latency: write efficient constraint templates, avoid complex Rego logic, set an appropriate audit interval (60s default), and exclude namespaces that do not need checking (kube-system).
Q: Where should I start with cost optimization?
A: Start with the 3 policies that give the most impact: Require resource limits, so every container has requests/limits (saves 15-25%); Require cost labels, so costs can be tracked per team/project; and Block oversized resources, by capping max CPU/memory per container. After that: install Kubecost or OpenCost to see actual costs, right-size pods based on actual usage, schedule dev environments to shut down outside working hours, and use spot instances for non-critical workloads. Done step by step, this can reduce cloud costs by 20-40% within 2-3 months.
Q: How should policy violations be handled?
A: There are 2 modes. Enforce (default) rejects resources that violate a policy immediately and suits production. Dry-run records violations without blocking and suits the trial phase. Recommended approach: start with dry-run for 1-2 weeks, tune the policies until they fit, tell teams in advance what is changing, then switch to enforce one policy at a time, set up an exception process for legitimate cases, monitor violations with Gatekeeper audit, and report violations to Slack/Teams so teams are aware of them. For urgent cases, use excludedNamespaces or temporary exemptions.
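The dry-run and enforce modes map to the `enforcementAction` field in a constraint's `spec`, where `dryrun` records violations and `deny` rejects them. As a minimal sketch of the staged rollout described above, the Python below generates the cost-label constraint in either mode. `build_constraint` is a hypothetical helper, and the JSON it prints is assumed to be applied with `kubectl apply -f`.

```python
import json


def build_constraint(name: str, kind: str, labels, enforce: bool) -> dict:
    """Build a Gatekeeper constraint in dry-run or enforce mode."""
    return {
        "apiVersion": "constraints.gatekeeper.sh/v1beta1",
        "kind": kind,
        "metadata": {"name": name},
        "spec": {
            # "dryrun" only records violations; "deny" rejects the request.
            "enforcementAction": "deny" if enforce else "dryrun",
            "match": {
                "kinds": [{"apiGroups": ["apps"], "kinds": ["Deployment", "StatefulSet"]}],
                "excludedNamespaces": ["kube-system", "gatekeeper-system"],
            },
            "parameters": {"labels": list(labels)},
        },
    }


if __name__ == "__main__":
    constraint = build_constraint(
        "require-cost-labels",
        "K8sRequiredLabels",
        ["cost-center", "team", "environment"],
        enforce=False,  # start in dry-run, flip to True after the trial period
    )
    print(json.dumps(constraint, indent=2))
```

Keeping the mode as a single flag makes the switch from dry-run to enforce a one-line, reviewable change per policy.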