Prompt Service Mesh
A Midjourney prompt system built as microservices on an Istio service mesh: mTLS, circuit breaking, load balancing, queue-driven worker scaling, gRPC, and event-driven processing.
| Service | Function | Protocol | Scale Strategy |
|---|---|---|---|
| API Gateway | Route requests, authenticate | HTTP/REST | HPA (CPU) |
| Prompt Service | Generate prompts | gRPC | HPA (Request Rate) |
| Template Service | Template CRUD | gRPC | HPA (CPU) |
| Queue Service | Manage job queue | AMQP/Kafka | Fixed (2-3 replicas) |
| Generation Worker | Call Midjourney API | HTTP | KEDA (Queue Depth) |
| Rating Service | Rate image quality | gRPC | HPA (CPU) |
Microservices Architecture
# === Prompt System Microservices ===
from dataclasses import dataclass


@dataclass
class MicroService:
    name: str
    responsibility: str
    tech_stack: str
    dependencies: list  # names of services this one calls
    api: str


services = [
    MicroService("api-gateway",
                 "Accept client requests, authenticate, rate-limit, and route",
                 "Kong / Envoy / Nginx",
                 [],
                 "REST: POST /api/generate, GET /api/templates, GET /api/images"),
    MicroService("prompt-service",
                 "Build prompts from template + subject, validate quality",
                 "Python FastAPI + gRPC",
                 ["template-service"],
                 "gRPC: GeneratePrompt(template_id, subject) → Prompt"),
    MicroService("template-service",
                 "Template CRUD, search, version control",
                 "Python FastAPI + PostgreSQL + Redis Cache",
                 [],
                 "gRPC: GetTemplate, CreateTemplate, SearchTemplates"),
    MicroService("queue-service",
                 "Queue generation jobs, priority queue, status tracking",
                 "RabbitMQ / Kafka + Redis",
                 [],
                 "AMQP: Publish Job, Consume Job, Get Status"),
    MicroService("generation-worker",
                 "Pull jobs from the queue, call the Midjourney API, save results",
                 "Python + Discord API / Midjourney API + S3",
                 ["queue-service"],
                 "Consumer: Process Job → Call API → Save Image → Notify"),
    MicroService("rating-service",
                 "Assess image quality, collect feedback, store A/B test results",
                 "Python FastAPI + PostgreSQL",
                 [],
                 "gRPC: RateImage(image_id, score, feedback)"),
]

# Print a short summary of every service.
print("=== Microservices ===")
for s in services:
    print(f"\n  [{s.name}]")
    print(f"    Role:  {s.responsibility}")
    print(f"    Stack: {s.tech_stack}")
    print(f"    Deps:  {s.dependencies}")
    print(f"    API:   {s.api}")
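The `dependencies` fields above form a small dependency graph. As a rough sketch of one way to use them, the snippet below derives an order in which each service comes after the services it calls. The `DEPENDENCIES` mapping is copied by hand from the list above, and `startup_order` is a hypothetical helper, not part of the original system. It relies on `graphlib`, which is in the standard library from Python 3.9.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Service -> services it depends on, copied from the `dependencies` fields.
DEPENDENCIES = {
    "api-gateway": [],
    "prompt-service": ["template-service"],
    "template-service": [],
    "queue-service": [],
    "generation-worker": ["queue-service"],
    "rating-service": [],
}


def startup_order(deps: dict[str, list[str]]) -> list[str]:
    """Return an order in which every service follows its dependencies.

    TopologicalSorter takes a mapping of node -> predecessors and raises
    graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


order = startup_order(DEPENDENCIES)
print(order)
```

The exact order among independent services is not significant; the only guarantee is that `template-service` precedes `prompt-service` and `queue-service` precedes `generation-worker`.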
Istio Configuration
# === Istio Service Mesh Configuration ===

# PeerAuthentication - mTLS STRICT
# apiVersion: security.istio.io/v1beta1
# kind: PeerAuthentication
# metadata:
#   name: default
#   namespace: prompt-system
# spec:
#   mtls:
#     mode: STRICT

# DestinationRule - Circuit Breaking + Load Balancing
# apiVersion: networking.istio.io/v1beta1
# kind: DestinationRule
# metadata:
#   name: generation-worker
# spec:
#   host: generation-worker
#   trafficPolicy:
#     connectionPool:
#       tcp:
#         maxConnections: 100
#       http:
#         h2UpgradePolicy: UPGRADE
#         maxRequestsPerConnection: 10
#     outlierDetection:
#       consecutive5xxErrors: 3
#       interval: 30s
#       baseEjectionTime: 60s
#       maxEjectionPercent: 50

# VirtualService - Retry + Timeout
# apiVersion: networking.istio.io/v1beta1
# kind: VirtualService
# metadata:
#   name: generation-worker
# spec:
#   hosts:
#     - generation-worker
#   http:
#     - route:                  # an HTTP route needs a destination
#         - destination:
#             host: generation-worker
#       timeout: 120s           # Generation takes time
#       retries:
#         attempts: 3
#         perTryTimeout: 45s
#         retryOn: 5xx,reset,connect-failure
from dataclasses import dataclass  # needed if this snippet is run on its own


@dataclass
class IstioConfig:
    resource: str
    purpose: str
    key_settings: str
    impact: str


istio_configs = [
    IstioConfig("PeerAuthentication",
                "Encrypt service-to-service traffic with mTLS",
                "mode: STRICT in every namespace",
                "Prevents traffic sniffing and man-in-the-middle attacks"),
    IstioConfig("AuthorizationPolicy",
                "Restrict service-to-service access",
                "Only prompt-service may call template-service",
                "Prevents unauthorized service access"),
    IstioConfig("DestinationRule (Circuit Breaking)",
                "Prevent cascading failures",
                "consecutive5xxErrors: 3, baseEjectionTime: 60s",
                "Failing workers are ejected and stop receiving traffic"),
    IstioConfig("VirtualService (Retry)",
                "Retry automatically when an API call fails",
                "attempts: 3, perTryTimeout: 45s, retryOn: 5xx",
                "Lowers the failure rate when the Midjourney API is unstable"),
    IstioConfig("VirtualService (Timeout)",
                "Time out long-running requests",
                "timeout: 120s for generation",
                "Keeps hung requests from tying up resources"),
    IstioConfig("EnvoyFilter (Rate Limit)",
                "Limit requests per user",
                "10 req/min Free, 100 req/min Premium",
                "Prevents API abuse and reduces cost"),
]

# Print each Istio resource with its settings and effect.
print("=== Istio Configs ===")
for c in istio_configs:
    print(f"  [{c.resource}] {c.purpose}")
    print(f"    Settings: {c.key_settings}")
    print(f"    Impact:   {c.impact}")
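The VirtualService settings above combine a 120s overall timeout with `perTryTimeout: 45s` and `attempts: 3`, so not every try can run for its full 45 seconds. The arithmetic can be sketched as below. This assumes `attempts` counts retries on top of the initial try and ignores back-off between tries; `try_budgets` is a hypothetical helper written for illustration only.

```python
def try_budgets(timeout_s: float, per_try_timeout_s: float, retries: int) -> list:
    """Worst-case seconds available to each try when every try times out.

    Assumption: `retries` is the number of retries after the initial try,
    so at most retries + 1 tries happen. Back-off delays are ignored.
    """
    budgets = []
    remaining = timeout_s
    for _ in range(retries + 1):
        if remaining <= 0:
            break  # the overall timeout is used up; no further tries
        allowed = min(per_try_timeout_s, remaining)
        budgets.append(allowed)
        remaining -= allowed
    return budgets


budgets = try_budgets(timeout_s=120, per_try_timeout_s=45, retries=3)
print(budgets)  # [45, 45, 30]
```

Under these assumptions only two tries get the full 45 seconds, the third is cut to 30 seconds, and a fourth never starts. If every try should get its full budget, either the overall timeout or the per-try timeout would need adjusting.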
Scaling & Monitoring
# === Auto-scaling Configuration ===
from dataclasses import dataclass  # needed if this snippet is run on its own


@dataclass
class ScaleConfig:
    service: str
    scaler: str
    metric: str
    min_max: str
    cost_note: str


scaling = [
    ScaleConfig("prompt-service",
                "HPA (Horizontal Pod Autoscaler)",
                "CPU > 70% or request rate > 100/s",
                "Min: 2, Max: 10",
                "Lightweight; scales quickly"),
    ScaleConfig("template-service",
                "HPA",
                "CPU > 70%",
                "Min: 2, Max: 5",
                "Redis cache cuts load substantially"),
    ScaleConfig("generation-worker",
                "KEDA (Queue-based)",
                "Queue Depth > 10 Jobs",
                "Min: 1, Max: 20",
                "Spot instances cut cost by about 70%"),
    ScaleConfig("rating-service",
                "HPA",
                "CPU > 70%",
                "Min: 1, Max: 3",
                "Low load; rarely needs to scale"),
    ScaleConfig("queue-service",
                "Fixed",
                "-",
                "Fixed: 3 replicas",
                "3-node RabbitMQ cluster for HA"),
]

# Print the scaler, trigger metric, replica range, and cost note per service.
print("=== Scaling Configs ===")
for s in scaling:
    print(f"  [{s.service}] {s.scaler}")
    print(f"    Metric:   {s.metric}")
    print(f"    Replicas: {s.min_max}")
    print(f"    Cost:     {s.cost_note}")
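To make the queue-based rule for generation-worker concrete, here is a simplified sketch of how a queue-depth target turns into a replica count: divide the backlog by the jobs one replica should handle, round up, and clamp to the min/max range. Reading "Queue Depth > 10 Jobs" as a target of 10 jobs per replica is an assumption, and the function ignores details of the real autoscaler such as tolerance, polling intervals, and stabilization windows.

```python
import math


def desired_replicas(queue_depth: int, jobs_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Return ceil(queue_depth / jobs_per_replica), clamped to [min, max]."""
    if jobs_per_replica <= 0:
        raise ValueError("jobs_per_replica must be positive")
    raw = math.ceil(queue_depth / jobs_per_replica)
    return max(min_replicas, min(max_replicas, raw))


# generation-worker values from the table: Min 1, Max 20, 10 jobs per replica.
for depth in (0, 10, 35, 500):
    print(depth, desired_replicas(depth, jobs_per_replica=10,
                                  min_replicas=1, max_replicas=20))
```

With these inputs an empty queue stays at the minimum of 1 replica, 35 queued jobs ask for 4 replicas, and a backlog of 500 is capped at 20.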
Tips
- KEDA: Scale workers on queue depth with KEDA instead of guessing replica counts.
- Circuit Breaking: Configure circuit breaking so a Midjourney API outage does not take the rest of the system down.
- Cache: Cache templates in Redis; this can cut template-service load by 80% or more.
- Spot: Run workers on spot instances to cut cost by about 70%.
- Priority: Use a priority queue so Premium users get their images sooner.
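The priority-queue tip can be sketched in memory with `heapq`. In the real system the queue would live in RabbitMQ or Kafka; the `JobQueue` class, the tier constants, and the job IDs below are hypothetical and only show the ordering rule: Premium jobs first, first-in-first-out within a tier.

```python
import heapq
import itertools

PREMIUM, FREE = 0, 1  # lower number is served first (hypothetical tiers)


class JobQueue:
    """In-memory priority queue: Premium before Free, FIFO within a tier."""

    def __init__(self):
        self._heap = []
        # Monotonic counter breaks ties so equal-tier jobs keep arrival order.
        self._counter = itertools.count()

    def publish(self, job_id: str, tier: int) -> None:
        heapq.heappush(self._heap, (tier, next(self._counter), job_id))

    def consume(self) -> str:
        _tier, _seq, job_id = heapq.heappop(self._heap)
        return job_id

    def __len__(self) -> int:
        return len(self._heap)


queue = JobQueue()
queue.publish("free-1", FREE)
queue.publish("premium-1", PREMIUM)
queue.publish("free-2", FREE)
queue.publish("premium-2", PREMIUM)

served = [queue.consume() for _ in range(len(queue))]
print(served)  # ['premium-1', 'premium-2', 'free-1', 'free-2']
```

A strict two-tier ordering like this can starve Free jobs under sustained Premium load, so a production queue would likely need aging or a per-tier share.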
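Why use a service mesh with a prompt system?
Once the prompt system is split into microservices, a service mesh provides mTLS, load balancing, circuit breaking, retries, rate limiting, and observability (metrics and tracing) without changing application code. Together these guard against cascading failures and API abuse.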
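To illustrate how circuit breaking limits cascading failures, the sketch below simulates outlier ejection with the values used in this article: eject after 3 consecutive 5xx responses, keep the host out for 60 seconds, and never eject more than 50% of hosts. It is a simplified model rather than Envoy's actual algorithm (which evaluates on an interval and lengthens repeated ejections); `Host`, `OutlierDetector`, and the worker names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    consecutive_5xx: int = 0
    ejected_until: float = 0.0  # host is ejected while now < ejected_until


class OutlierDetector:
    def __init__(self, names, threshold=3, base_ejection_s=60.0,
                 max_ejection_percent=50):
        self.hosts = {name: Host(name) for name in names}
        self.threshold = threshold
        self.base_ejection_s = base_ejection_s
        self.max_ejection_percent = max_ejection_percent

    def available(self, now):
        """Names of hosts that may receive traffic at time `now`."""
        return [h.name for h in self.hosts.values() if h.ejected_until <= now]

    def record(self, name, status, now):
        """Record one response and eject the host if it crosses the threshold."""
        host = self.hosts[name]
        if status < 500:
            host.consecutive_5xx = 0  # any non-5xx response resets the streak
            return
        host.consecutive_5xx += 1
        if host.consecutive_5xx < self.threshold:
            return
        ejected = len(self.hosts) - len(self.available(now))
        # Eject only if the ejected share stays within max_ejection_percent.
        if (ejected + 1) * 100 <= self.max_ejection_percent * len(self.hosts):
            host.ejected_until = now + self.base_ejection_s
            host.consecutive_5xx = 0


detector = OutlierDetector(["worker-1", "worker-2"])
for _ in range(3):
    detector.record("worker-1", 503, now=0.0)
print(detector.available(now=1.0))   # ['worker-2']
print(detector.available(now=61.0))  # ['worker-1', 'worker-2']
```

The ejection cap is what keeps the mesh from removing every worker at once when the upstream API itself is failing.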
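How is the architecture designed?
Requests enter through the API Gateway and are handled by the Prompt Service, Template Service, Queue Service, Generation Worker, and Rating Service. Service-to-service calls use gRPC, while image generation is event-driven through RabbitMQ or Kafka, with Redis as a cache and S3 for image storage.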
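How is Istio configured?
PeerAuthentication enforces mTLS in STRICT mode, AuthorizationPolicy restricts which services may call each other, DestinationRule configures circuit breaking, VirtualService sets retries and timeouts, and EnvoyFilter applies rate limits.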
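The per-user limits listed earlier (10 req/min for Free, 100 req/min for Premium) are enforced by the mesh, not by application code. Purely to show the behavior such a limit produces, here is a token-bucket sketch in Python. The `TokenBucket` class and `LIMITS_PER_MINUTE` mapping are hypothetical, and the actual filter may use a different algorithm.

```python
LIMITS_PER_MINUTE = {"free": 10, "premium": 100}  # tiers from this article


class TokenBucket:
    """Allow bursts up to `per_minute`, refilling evenly over each minute."""

    def __init__(self, per_minute: int, now: float = 0.0):
        self.capacity = float(per_minute)
        self.refill_per_s = per_minute / 60.0
        self.tokens = self.capacity  # start with a full bucket
        self.updated = now

    def allow(self, now: float) -> bool:
        """Return True and spend one token if a request at `now` is allowed."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_s)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(LIMITS_PER_MINUTE["free"])
burst = [bucket.allow(now=0.0) for _ in range(12)]
print(burst.count(True))  # 10
```

A Free user who sends 12 requests at once gets 10 through and 2 rejected; tokens then return at one every 6 seconds.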
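How does it scale?
prompt-service scales with HPA on CPU and request rate, while generation-worker scales with KEDA on queue depth and can run on spot instances. A priority queue and Redis cache reduce pressure on the workers. Set min and max replicas per service, and consider scaling up ahead of expected peaks and down during off-peak hours to stay within budget.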
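Summary
A Midjourney prompt system on a service mesh combines microservices, Istio mTLS and circuit breaking, gRPC, a job queue, KEDA and HPA autoscaling, spot instances, Redis caching, and rate limiting into a setup suited to production.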
