Better Uptime Zero Downtime Deployment: Deploy

Zero Downtime Deployment

This article covers Better Uptime monitoring together with zero-downtime deployment: blue-green, canary, and rolling updates, load balancers and health checks, status pages, incident management, and on-call.

Strategy	Downtime	Rollback	Risk	Cost
Blue-Green	0	Immediate (switch traffic)	Low	High (2x infrastructure)
Canary	0	Fast (reduce traffic)	Very low	Medium
Rolling	0	Slow (roll back pod by pod)	Medium	Low
Recreate	Yes	Redeploy	High	Low

Deployment Strategies

# === Zero Downtime Deployment Strategies ===

# Kubernetes Blue-Green Deployment
apiVersion: v1
kind: Service
metadata:
  name: my-app  # assumed name; the original value is missing from the source
spec:
  selector:
    app: my-app
    version: green  # Switch between blue/green
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-green  # assumed name; the original value is missing from the source
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      version: green
  template:
    metadata:
      labels:
        app: my-app
        version: green
    spec:
      containers:
        - name: my-app
          image: my-app:2.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20

# Rolling Update (Kubernetes Default)
apiVersion: apps/v1
kind: Deployment
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

from dataclasses import dataclass

@dataclass
class DeploymentStrategy:
    name: str  # restored: the constructor calls and s.name below require it
    downtime: str
    rollback_time: str
    traffic_control: str
    complexity: str
    use_case: str

strategies = [
    DeploymentStrategy("Blue-Green", "0", "< 1 min", "All-or-nothing", "Medium",
        "Critical apps that need fast rollback"),
    DeploymentStrategy("Canary", "0", "< 5 min", "Gradual %", "High",
        "Apps that must be tested with real traffic"),
    DeploymentStrategy("Rolling Update", "0", "5-15 min", "Per Instance", "Low",
        "General-purpose Kubernetes default"),
    DeploymentStrategy("Feature Flag", "0", "Instant", "Per Feature", "Medium",
        "Toggle features without redeploying"),
]

print("=== Deployment Strategies ===")
for s in strategies:
    print(f"\n  [{s.name}]")
    print(f"    Downtime: {s.downtime} | Rollback: {s.rollback_time}")
    print(f"    Traffic: {s.traffic_control} | Complexity: {s.complexity}")
    print(f"    Use: {s.use_case}")
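The blue-green manifest above switches traffic by changing the `version` label in the Service selector. As a rough sketch of how that switch could be scripted, the helper below builds a `kubectl patch` command that rewrites only that label. The Service name `my-app`, the `default` namespace, and the function name `build_switch_command` are assumptions made for illustration; they do not come from the article.

```python
import json


def build_switch_command(service: str, target_version: str,
                         namespace: str = "default") -> list:
    """Return a kubectl command that repoints the Service selector at target_version."""
    if target_version not in ("blue", "green"):
        raise ValueError(f"unexpected version label: {target_version!r}")
    # Patch only the `version` key of the selector; `app: my-app` is left untouched.
    patch = {"spec": {"selector": {"version": target_version}}}
    return [
        "kubectl", "patch", "service", service,
        "--namespace", namespace,
        "--patch", json.dumps(patch),
    ]


# Service name "my-app" is an assumed value, not taken from the article.
cmd = build_switch_command("my-app", "green")
print(" ".join(cmd))
# To apply it for real (needs kubectl and cluster access):
#   import subprocess; subprocess.run(cmd, check=True)
```

Rolling back is the same call with `"blue"` as the target, which is why blue-green rollback is close to instant as long as the old Deployment is still running.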

Monitoring Setup

# === Better Uptime + Monitoring Setup ===

# Better Uptime API
curl -X POST https://betteruptime.com/api/v2/monitors \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "monitor_type": "status",
    "url": "https://example.com",
    "pronounceable_name": "Main Website",
    "check_frequency": 30,
    "http_method": "get",
    "expected_status_codes": [200],
    "regions": ["us", "eu", "as"],
    "confirmation_period": 60,
    "call": true,
    "sms": true,
    "email": true
  }'

# Deployment Health Check Script
import requests
import time

def wait_for_healthy(url, timeout=300, interval=5):
    """Poll {url}/health until it reports healthy, or raise TimeoutError."""
    start = time.time()
    while time.time() - start < timeout:
        try:
            response = requests.get(f"{url}/health", timeout=5)
            if response.status_code == 200:
                data = response.json()
                if data.get("status") == "healthy":
                    print(f"Healthy after {time.time()-start:.0f}s")
                    return True
        except (requests.RequestException, ValueError):
            # Connection errors and non-JSON bodies both mean "not ready yet"
            pass
        time.sleep(interval)
    raise TimeoutError(f"Not healthy after {timeout}s")

# Canary Deployment Script
# set_traffic_split() and get_metrics() are placeholders for your own
# load balancer and metrics integrations; they are not defined here.
def canary_deploy(service, new_version, steps=(5, 25, 50, 100)):
    for pct in steps:
        print(f"Setting traffic to {pct}% for {new_version}")
        set_traffic_split(service, new_version, pct)
        time.sleep(300)  # Wait 5 min

        metrics = get_metrics(service, window="5m")
        if metrics["error_rate"] > 0.01:
            print(f"Error rate {metrics['error_rate']:.2%} > 1%, rolling back")
            set_traffic_split(service, new_version, 0)
            return False
        if metrics["p99_latency"] > 2000:
            print(f"P99 {metrics['p99_latency']}ms > 2000ms, rolling back")
            set_traffic_split(service, new_version, 0)
            return False
        print(f"Metrics OK at {pct}%")
    return True

monitors = {
    "Website": {"type": "HTTPS", "interval": 30, "regions": 3, "uptime": "99.95%"},
    "API": {"type": "HTTPS", "interval": 30, "regions": 3, "uptime": "99.98%"},
    "Database": {"type": "TCP", "interval": 60, "regions": 1, "uptime": "99.99%"},
    "CDN": {"type": "HTTPS", "interval": 60, "regions": 5, "uptime": "99.99%"},
    "Cron Jobs": {"type": "Heartbeat", "interval": 300, "regions": 1, "uptime": "99.90%"},
}

print("\nMonitoring Dashboard:")
for name, info in monitors.items():
    print(f"  [{info['uptime']}] {name} - {info['type']} every {info['interval']}s "
          f"({info['regions']} regions)")
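The canary script above depends on two helpers that the article does not define and sleeps for five minutes per step, so it is hard to try out as written. One possible way to make the same loop testable is to pass the traffic and metrics functions in as arguments. This is only a sketch: `canary_rollout`, `set_split`, `read_metrics`, and the fake readings are hypothetical names and values, while the steps and thresholds mirror the ones in the script.

```python
import time
from typing import Callable, Dict, Sequence


def canary_rollout(
    set_split: Callable[[int], None],
    read_metrics: Callable[[], Dict[str, float]],
    steps: Sequence[int] = (5, 25, 50, 100),
    soak_seconds: float = 300,
    max_error_rate: float = 0.01,
    max_p99_ms: float = 2000,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Walk through the traffic steps; reset to 0% and return False on a breach."""
    for pct in steps:
        set_split(pct)
        sleep(soak_seconds)  # let the new version take traffic before judging it
        m = read_metrics()
        if m["error_rate"] > max_error_rate or m["p99_latency"] > max_p99_ms:
            set_split(0)  # roll back: all traffic returns to the old version
            return False
    return True


# Dry run with fake callables (made-up readings, no real load balancer).
history = []
readings = iter([
    {"error_rate": 0.001, "p99_latency": 350},
    {"error_rate": 0.030, "p99_latency": 400},  # breaches the 1% limit
])
ok = canary_rollout(history.append, lambda: next(readings), sleep=lambda s: None)
print(ok, history)
```

In the dry run the second reading exceeds the error-rate limit, so the rollout stops at the 25% step and resets traffic to 0%.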

Status Page and Incident

# === Status Page & Incident Management ===

@dataclass
class Incident:
    title: str
    severity: str
    status: str
    duration_min: int
    root_cause: str
    resolution: str

incidents = [
    Incident("API Latency Spike", "Minor", "Resolved", 12,
        "Database connection pool exhaustion",
        "Increased pool size from 20 to 50"),
    Incident("CDN Cache Miss", "Major", "Resolved", 25,
        "Cache invalidation after deployment",
        "Pre-warm cache before switching traffic"),
    Incident("SSL Certificate Renewal", "Maintenance", "Completed", 5,
        "Scheduled certificate rotation",
        "Auto-renewed via Let's Encrypt"),
]

print("Incident History:")
for inc in incidents:
    print(f"\n  [{inc.severity}] {inc.title} - {inc.status}")
    print(f"    Duration: {inc.duration_min} min")
    print(f"    Cause: {inc.root_cause}")
    print(f"    Fix: {inc.resolution}")



# Deploy Checklist
checklist = [
    "Pre-deploy: Run full test suite",
    "Pre-deploy: Check database migrations",
    "Pre-deploy: Notify team via Slack",
    "Deploy: Use chosen strategy (Blue-Green/Canary/Rolling)",
    "Post-deploy: Verify health checks pass",
    "Post-deploy: Check error rates < 0.1%",
    "Post-deploy: Check P99 latency < 2s",
    "Post-deploy: Verify monitoring alerts normal",
    "Post-deploy: Run smoke tests",
    "Rollback: If metrics exceed thresholds, rollback immediately",
]

print("\n\nDeploy Checklist:")
for i, item in enumerate(checklist, 1):
    print(f"  {i}. {item}")
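Several of the post-deploy checklist items are numeric thresholds, so they can be checked by a script rather than by eye. The sketch below shows one way to turn "error rates < 0.1%" and "P99 latency < 2s" into a pass/fail gate. The names `GateResult` and `post_deploy_gate` are hypothetical, and the metric values would have to come from your own monitoring system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GateResult:
    passed: bool
    failures: List[str] = field(default_factory=list)


def post_deploy_gate(error_rate: float, p99_latency_ms: float, health_ok: bool,
                     max_error_rate: float = 0.001,
                     max_p99_ms: float = 2000) -> GateResult:
    """Evaluate the checklist items that can be checked numerically."""
    failures = []
    if not health_ok:
        failures.append("health checks failing")
    if error_rate >= max_error_rate:
        failures.append(
            f"error rate {error_rate:.2%} is not below {max_error_rate:.2%}")
    if p99_latency_ms >= max_p99_ms:
        failures.append(
            f"P99 {p99_latency_ms:.0f}ms is not below {max_p99_ms:.0f}ms")
    return GateResult(passed=not failures, failures=failures)


# Example values are made up for illustration.
result = post_deploy_gate(error_rate=0.0004, p99_latency_ms=850, health_ok=True)
print(result.passed, result.failures)
```

A failed gate would then trigger the last checklist item, the immediate rollback.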

Tips

Health Check: Every instance must expose a health check endpoint.
Canary: Start at 5%, wait 5 minutes, and review the metrics before increasing traffic.
Rollback: Automate rollback when the error rate is high.
Status Page: Notify customers when an incident occurs; this builds trust.
Pre-warm: Pre-warm the cache after deploying, before accepting traffic.
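To illustrate the first tip, here is a minimal sketch of a health check endpoint using only the Python standard library. It returns the `{"status": "healthy"}` JSON shape that the `wait_for_healthy` script polls for, on the `/health` path and port 8080 used by the probes. The function names and the placeholder `run_checks` are assumptions; a real service would check its actual dependencies there.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(checks: dict) -> tuple:
    """Return (status_code, payload) for a mapping of check name -> bool."""
    healthy = all(checks.values())
    payload = {"status": "healthy" if healthy else "unhealthy", "checks": checks}
    return (200 if healthy else 503), payload


def run_checks() -> dict:
    # Placeholder: replace with real dependency checks (database, cache, ...).
    return {"app": True}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, payload = health_response(run_checks())
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str = "0.0.0.0", port: int = 8080) -> HTTPServer:
    return HTTPServer((host, port), HealthHandler)

# To serve: make_server().serve_forever()
```

Returning 503 when any check fails lets a readiness probe take the instance out of rotation without restarting it.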

Applying This in Practice

Recommended learning resources include the official documentation, which is kept up to date; online courses from Coursera, Udemy, and edX; good YouTube channels in both Thai and English; and communities such as Discord, Reddit, and Stack Overflow, where you can exchange experience with developers worldwide.

Pros and Cons Compared

Pros	Cons
High performance: fast, accurate, and reduces repetitive work	Takes a fair amount of time to learn at first; the learning curve is steep
Large community with plenty of help and learning resources	Some features may still be unstable or change often between versions
Integrates with a wide range of other tools and services	Cost can be high for an enterprise license or cloud service
Open source, or offers a free tier to get started	Requires adequate hardware or infrastructure

The table suggests that the advantages outweigh the disadvantages, particularly in performance and the ability to scale. Most of the disadvantages can be mitigated by learning the tools systematically and planning resources appropriately.

What is Better Uptime?

A monitoring platform that checks uptime every 30 seconds and alerts by phone call, SMS, Slack, and email. It also provides status pages, incident management, on-call scheduling, and heartbeat monitoring.

What is Zero Downtime Deployment?

Deploying without downtime, using blue-green, canary, or rolling updates together with a load balancer and health checks, so that users do not notice the change.

How do Blue-Green and Canary differ?

Blue-green runs two environments and switches all traffic at once, which makes rollback fast. Canary increases traffic gradually (5%, 25%, 50%, 100%) and checks metrics at each step to reduce risk.
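The gradual canary percentages can be pictured with a small routing sketch. The version below assigns each user to a stable bucket by hashing the user ID, so a user who lands on the canary at 5% stays on it at 25% and beyond. This is one possible approach and an assumption on my part; many load balancers split traffic per request by weight instead. The function names are hypothetical.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def route(user_id: str, canary_pct: int) -> str:
    """Return 'canary' for roughly canary_pct% of users, else 'stable'."""
    return "canary" if bucket(user_id) < canary_pct else "stable"


# Made-up user IDs, used only to show how the share grows per step.
users = [f"user-{i}" for i in range(1000)]
for pct in (5, 25, 50, 100):
    share = sum(route(u, pct) == "canary" for u in users) / len(users)
    print(f"{pct}% target -> {share:.1%} of users on canary")
```

Blue-green corresponds to jumping straight from 0 to 100 in this model, which is why it has no intermediate step at which to inspect metrics.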

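How does a Rolling Update work?

It updates one instance at a time, running a health check before moving on to the next. It is the Kubernetes default, tuned with maxSurge and maxUnavailable, and it can be rolled back if problems appear.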
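To make maxSurge and maxUnavailable concrete, the sketch below works out how many pods can exist and how many must stay available during a rolling update. Kubernetes rounds a percentage maxSurge up and a percentage maxUnavailable down; the function name `rolling_update_bounds` is hypothetical, and edge cases such as both values resolving to zero are not handled here.

```python
import math


def rolling_update_bounds(replicas: int, max_surge_pct: int = 25,
                          max_unavailable_pct: int = 25) -> tuple:
    """Return (max_total_pods, min_available_pods) during a rolling update."""
    surge = math.ceil(replicas * max_surge_pct / 100)                # rounds up
    unavailable = math.floor(replicas * max_unavailable_pct / 100)   # rounds down
    return replicas + surge, replicas - unavailable


for n in (3, 4, 10):
    total, available = rolling_update_bounds(n)
    print(f"replicas={n}: at most {total} pods, at least {available} available")
```

With the 3 replicas and 25%/25% settings from the manifests, this gives at most 4 pods and at least 3 available, so capacity never drops below the desired count during the update.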

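Summary

Zero-downtime deployment combines a rollout strategy (blue-green, canary, or rolling update on Kubernetes) with a load balancer, health checks, and fast rollback. Better Uptime adds monitoring, status pages, and incident management on top, and a deploy checklist plus cache pre-warming helps keep each release uneventful.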
