SiamCafe.net Blog
Cybersecurity

Uptime Kuma Monitoring Load Testing Strategy

2025-08-26 · อ. บอม (SiamCafe.net) · 10,888 words

Load Testing

This article covers a load testing strategy built around Uptime Kuma monitoring: the tools (k6, Locust, JMeter), the key metrics (response time, throughput, error rate), and the test types (baseline, stress, spike, soak) measured against an SLA.

| Tool | Language | Protocols | Distributed | CI/CD | Best For |
|---|---|---|---|---|---|
| k6 | JavaScript | HTTP, gRPC, WS | k6 Cloud | Excellent | DevOps CI/CD |
| Locust | Python | HTTP, custom | Built-in | Good | Python teams |
| JMeter | Java/GUI | HTTP, FTP, JDBC | JMeter Remote | Moderate | Enterprise QA |
| Gatling | Scala/Java | HTTP, WS | Gatling Enterprise | Good | High performance |
| Artillery | YAML/JS | HTTP, WS, Socket | Artillery Pro | Good | Quick start |
| wrk | CLI + Lua | HTTP | None | Easy | Quick benchmarks |
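One way to read the table is as a decision rule. Below is a hypothetical helper (the function name and rules are illustrative, simply mirroring the rows above) that maps a team's main language and needs to a suggested tool:

```python
# Hypothetical tool picker mirroring the comparison table above.
def suggest_tool(team_language: str, need: str = "load") -> str:
    """Suggest a load testing tool based on the team's main language/need."""
    if need == "quick-bench":
        return "wrk"          # simplest CLI benchmark
    if need == "quick-start":
        return "Artillery"    # YAML config, fast to set up
    rules = {
        "javascript": "k6",     # strong CI/CD integration
        "python": "Locust",     # built-in distributed mode
        "java": "JMeter",       # enterprise QA, many protocols
        "scala": "Gatling",     # high-performance engine
    }
    return rules.get(team_language.lower(), "k6")

print(suggest_tool("python"))             # Locust
print(suggest_tool("go", "quick-bench"))  # wrk
```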

k6 Load Test

// === k6 Load Testing Script ===
// Install: download the k6 binary from https://k6.io (e.g. `brew install k6` on macOS);
// note that k6 is a standalone Go binary, not a pip package.
// Run: k6 run load-test.js

// load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';

const errorRate = new Rate('errors');
const responseTime = new Trend('response_time');

export const options = {
  stages: [
    { duration: '2m', target: 50 },   // Ramp up to 50 users
    { duration: '5m', target: 50 },   // Stay at 50
    { duration: '2m', target: 100 },  // Ramp up to 100
    { duration: '5m', target: 100 },  // Stay at 100
    { duration: '2m', target: 200 },  // Ramp up to 200
    { duration: '5m', target: 200 },  // Stay at 200
    { duration: '3m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    errors: ['rate<0.01'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/health');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 500ms': (r) => r.timings.duration < 500,
  });
  errorRate.add(res.status !== 200);
  responseTime.add(res.timings.duration);
  sleep(1);
}
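The same thresholds can be re-checked offline from an exported summary. A minimal sketch, assuming the JSON shape produced by `k6 run --summary-export=summary.json` (metrics keyed by name, percentiles stored under keys like `"p(95)"`); the sample values here are made up:

```python
# Sketch: re-check the script's thresholds against a k6 summary export.
# In practice: summary = json.load(open("summary.json"))
sample = {
    "metrics": {
        "http_req_duration": {"avg": 120.0, "p(95)": 480.0, "p(99)": 950.0},
        "http_req_failed": {"value": 0.004},
    }
}

def check_thresholds(summary: dict) -> dict:
    m = summary["metrics"]
    return {
        "p95<500ms": m["http_req_duration"]["p(95)"] < 500,
        "p99<1000ms": m["http_req_duration"]["p(99)"] < 1000,
        "errors<1%": m["http_req_failed"]["value"] < 0.01,
    }

for name, ok in check_thresholds(sample).items():
    print(f"{'PASS' if ok else 'FAIL'} {name}")
```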

from dataclasses import dataclass

@dataclass
class LoadTestPhase:
    phase: str
    duration: str
    vusers: int
    rps_target: str
    purpose: str

phases = [
    LoadTestPhase("Baseline", "5 min", 10, "50 rps", "Measure normal baseline values"),
    LoadTestPhase("Ramp Up", "5 min", 50, "250 rps", "Gradually increase load"),
    LoadTestPhase("Steady State", "10 min", 100, "500 rps", "Measure performance at expected load"),
    LoadTestPhase("Stress", "5 min", 200, "1000 rps", "2x beyond expected load"),
    LoadTestPhase("Spike", "1 min", 500, "2500 rps", "Test burst traffic"),
    LoadTestPhase("Recovery", "5 min", 50, "250 rps", "Check the system recovers to normal"),
]

print("=== Load Test Plan ===")
for p in phases:
    print(f"  [{p.phase}] Duration: {p.duration} | VUsers: {p.vusers}")
    print(f"    Target: {p.rps_target} | Purpose: {p.purpose}")
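A quick sanity check on the plan above: summing the `"N min"` duration strings gives the total wall-clock time to budget for one full run (phase names repeated here so the snippet stands alone):

```python
# Sum the planned durations of the test phases above ("N min" strings).
plan = [("Baseline", "5 min"), ("Ramp Up", "5 min"), ("Steady State", "10 min"),
        ("Stress", "5 min"), ("Spike", "1 min"), ("Recovery", "5 min")]

total_min = sum(int(d.split()[0]) for _, d in plan)
print(f"Total planned test time: {total_min} min")  # 31 min
```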

Monitoring During Test

# === Uptime Kuma + Load Test Integration ===

# Uptime Kuma Monitors during Load Test:
# 1. API Health — HTTP GET /health every 10s
# 2. API Response Time — HTTP GET /api/products every 20s
# 3. Database — TCP :5432 every 30s
# 4. Redis Cache — TCP :6379 every 30s
# 5. Load Balancer — HTTP GET / every 10s

# === Locust Load Test Script (locustfile.py) ===
from locust import HttpUser, task, between

class APIUser(HttpUser):
    wait_time = between(1, 3)
    host = "https://api.example.com"

    @task(3)
    def get_products(self):
        self.client.get("/api/products")

    @task(2)
    def get_product_detail(self):
        self.client.get("/api/products/1")

    @task(1)
    def create_order(self):
        self.client.post("/api/orders", json={
            "product_id": 1, "quantity": 1
        })

# Run: locust -f locustfile.py --headless -u 100 -r 10 -t 10m
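The Uptime Kuma monitors listed above can also be fed from the test side via a Push monitor, which accepts `GET /api/push/<token>?status=up&msg=...&ping=...`. A sketch of building that URL; the host and token below are placeholders, not real values:

```python
from urllib.parse import urlencode

# Sketch: report load-test health to an Uptime Kuma "Push" monitor.
# The base URL and token are placeholders; use your own instance's push URL.
def build_push_url(base: str, token: str, up: bool, ping_ms: float, msg: str = "OK") -> str:
    params = urlencode({"status": "up" if up else "down",
                        "msg": msg, "ping": f"{ping_ms:.0f}"})
    return f"{base}/api/push/{token}?{params}"

url = build_push_url("https://uptime.example.com", "abcd1234", True, 123.4)
print(url)
# During the test you could call this each interval, e.g. urllib.request.urlopen(url).
```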

@dataclass
class TestResult:
    metric: str
    baseline: str
    load_50: str
    load_100: str
    load_200: str
    sla: str
    status: str

results = [
    TestResult("Response p50", "45ms", "52ms", "68ms", "120ms", "<200ms", "PASS"),
    TestResult("Response p95", "120ms", "180ms", "350ms", "680ms", "<500ms", "FAIL at 200"),
    TestResult("Response p99", "250ms", "380ms", "650ms", "1200ms", "<1000ms", "FAIL at 200"),
    TestResult("Throughput", "50 rps", "245 rps", "480 rps", "620 rps", ">500 rps", "PASS"),
    TestResult("Error Rate", "0%", "0.1%", "0.3%", "2.5%", "<1%", "FAIL at 200"),
    TestResult("CPU Usage", "15%", "35%", "65%", "92%", "<80%", "FAIL at 200"),
    TestResult("Memory", "2.1GB", "2.4GB", "3.2GB", "4.8GB", "<4GB", "FAIL at 200"),
    TestResult("Uptime Kuma", "100%", "100%", "100%", "98.5%", "100%", "FAIL at 200"),
]

print("\n=== Load Test Results ===")
for r in results:
    print(f"  [{r.status}] {r.metric}")
    print(f"    Base: {r.baseline} | 50u: {r.load_50} | 100u: {r.load_100} | 200u: {r.load_200}")
    print(f"    SLA: {r.sla}")
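The pass/fail column above can be derived mechanically from the SLA strings. A hedged helper (it only handles the `<`/`>` forms used in this table, and ignores units by design):

```python
import re

# Sketch: evaluate a measured value against SLA strings like "<500ms" or ">500 rps".
def meets_sla(measured: str, sla: str) -> bool:
    num = lambda s: float(re.search(r"[\d.]+", s).group())  # strip units
    if sla.strip().startswith("<"):
        return num(measured) < num(sla)
    return num(measured) > num(sla)

print(meets_sla("680ms", "<500ms"))      # False: p95 fails at 200 users
print(meets_sla("620 rps", ">500 rps"))  # True:  throughput passes
print(meets_sla("2.5%", "<1%"))          # False: error rate fails
```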

Analysis and Optimization

# === Performance Analysis ===

@dataclass
class Bottleneck:
    component: str
    symptom: str
    metric: str
    root_cause: str
    fix: str
    impact: str

bottlenecks = [
    Bottleneck("Database", "Slow queries at 100+ users", "Query time >100ms",
        "Missing index on products.category", "CREATE INDEX idx_category", "p95 -40%"),
    Bottleneck("App Server", "CPU 92% at 200 users", "CPU saturated",
        "Single instance, no horizontal scale", "Add 2 more replicas", "Capacity 3x"),
    Bottleneck("Memory", "4.8GB at 200 users", "Memory leak suspected",
        "Connection pool not releasing", "Fix pool config max_idle=10", "Memory -30%"),
    Bottleneck("Cache", "Low hit rate 25%", "Cache miss high",
        "TTL too short, no warming", "Increase TTL to 5min + cache warming", "DB load -50%"),
]

print("Bottleneck Analysis:")
for b in bottlenecks:
    print(f"  [{b.component}] {b.symptom}")
    print(f"    Metric: {b.metric}")
    print(f"    Root Cause: {b.root_cause}")
    print(f"    Fix: {b.fix} | Impact: {b.impact}")

# After optimization — retest
after_fix = {
    "Response p95 (100u)": "350ms → 180ms (-49%)",
    "Response p95 (200u)": "680ms → 320ms (-53%)",
    "Error Rate (200u)": "2.5% → 0.2%",
    "CPU (200u)": "92% → 45% (3 replicas)",
    "Memory (200u)": "4.8GB → 2.8GB (pool fix)",
    "Max Capacity": "~150 users → ~400 users",
    "Uptime Kuma": "98.5% → 100% at 200 users",
}

print("\n\nAfter Optimization:")
for k, v in after_fix.items():
    print(f"  [{k}]: {v}")
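The percentage deltas quoted above follow from the raw before/after numbers:

```python
# Reproduce the improvement percentages above from raw before/after values.
def pct_change(before: float, after: float) -> int:
    return round((after - before) / before * 100)

print(pct_change(350, 180))  # -49  (p95 at 100 users)
print(pct_change(680, 320))  # -53  (p95 at 200 users)
```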

Tips

Operating a System in Production

Good production management requires comprehensive monitoring. Use tools such as Prometheus + Grafana for metrics collection and dashboards, or the ELK Stack for log management, and set alerts to fire when CPU exceeds 80%, RAM is nearly full, or disk usage runs high.
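Those alert rules can be sketched as a simple threshold check; the limits below are illustrative, not prescriptive:

```python
# Sketch of the alert rules above (CPU > 80%, RAM nearly full, disk high).
THRESHOLDS = {"cpu_pct": 80, "ram_pct": 90, "disk_pct": 85}

def alerts(metrics: dict) -> list:
    """Return a description of every metric that breaches its limit."""
    return [f"{k} at {metrics[k]}% (limit {v}%)"
            for k, v in THRESHOLDS.items() if metrics.get(k, 0) > v]

print(alerts({"cpu_pct": 92, "ram_pct": 60, "disk_pct": 50}))
# ['cpu_pct at 92% (limit 80%)']
```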

Plan your backup strategy carefully. Follow the 3-2-1 rule: keep at least 3 backup copies, on 2 different types of storage, with 1 copy off-site. Test restoring backups regularly, at least once a month, because a backup that cannot be restored is no backup at all.

Security hardening must start from day one. Close unnecessary ports, use SSH keys instead of passwords, set up Fail2ban against brute-force attacks, apply security patches consistently, and run vulnerability scans at least monthly. Apply the principle of least privilege: grant only the minimum permissions required.

How Do You Use Uptime Kuma with Load Testing?

Keep Uptime Kuma monitors running throughout the test: watch response time in real time while k6, Locust, or JMeter generates load, compare the graphs before and after, set alert thresholds, and check the status page for signs of service degradation.

What Load Testing Tools Are Available?

k6 (JavaScript, strong CI/CD integration), Locust (Python, built-in distributed mode), JMeter (Java GUI, many protocols), Gatling (Scala/Java), Artillery (YAML/JS), and quick CLI benchmarks such as wrk and ab (Apache Bench).

How Do You Measure Load Test Results?

Track response time percentiles (p50, p95, p99), throughput (RPS), error rate, and concurrent users, alongside CPU, memory, and disk usage; the Uptime Kuma graphs help locate the saturation point.

How Do You Plan a Load Test?

Define the SLA first, measure a baseline under normal conditions, then ramp up through stress, spike, and soak phases while monitoring throughout. Analyze the results against the SLA, then iterate: fix, retest.

Summary

Pair Uptime Kuma monitoring with load testing tools such as k6 and Locust: measure response time, throughput, and error rate against your SLA across baseline, stress, and spike phases, find the bottlenecks, and optimize before production traffic arrives.
