MLOps Pipeline Container Orchestration: Management

MLOps Container

This article covers container orchestration for MLOps pipelines with Kubernetes, Kubeflow, and MLflow: model training and serving in Docker, GPU workloads, CI/CD, monitoring, drift detection, and retraining in production.

Platform	Type	K8s Required	Complexity	Best For
Kubeflow	Full Platform	Yes	High	Enterprise
MLflow	Tracking + Registry	Not required	Low	All sizes
Vertex AI	Managed (GCP)	Not required	Medium	GCP Users
SageMaker	Managed (AWS)	Not required	Medium	AWS Users

ML Pipeline

=== MLOps Pipeline ===

# Dockerfile: ML model serving
FROM python:3.11-slim
WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the model artifact and the API code
COPY model/ ./model/
COPY app.py .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]

# FastAPI: model serving (app.py)
from fastapi import FastAPI
import joblib
import numpy as np

app = FastAPI()

# Load the model once at startup instead of on every request
model = joblib.load("model/model.pkl")


@app.post("/predict")
async def predict(features: list[float]):
    # The request body is a JSON array of floats; the model expects a 2-D array
    prediction = model.predict(np.array([features]))
    return {"prediction": prediction.tolist()}


@app.get("/health")
async def health():
    return {"status": "healthy", "model_version": "v1.2.3"}
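A client for the `/predict` endpoint could look like the sketch below. It uses only the standard library and assumes the service is reachable at a base URL such as `http://localhost:8000` (an assumption, based on the port exposed in the Dockerfile). The helper names `build_predict_request` and `predict` are hypothetical. Because the endpoint declares `features: list[float]` without a wrapper model, the payload is expected to be a bare JSON array.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: list[float]) -> urllib.request.Request:
    """Build a POST request for the /predict endpoint (nothing is sent here)."""
    body = json.dumps(features).encode("utf-8")  # bare JSON array, e.g. [0.1, 0.2]
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url: str, features: list[float], timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON response."""
    request = build_predict_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds the request; call predict(...) once the service is running.
    req = build_predict_request("http://localhost:8000", [0.1, 0.2, 0.3])
    print(req.get_method(), req.full_url, req.data.decode())
```

Splitting request construction from sending keeps the payload format easy to check without a running server.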

# MLflow: experiment tracking
import mlflow
import mlflow.sklearn

mlflow.set_tracking_uri("http://mlflow-server:5000")
mlflow.set_experiment("fraud-detection")

with mlflow.start_run():
    # Hyperparameters used for this run
    mlflow.log_param("n_estimators", 100)
    mlflow.log_param("max_depth", 10)
    # Evaluation metrics
    mlflow.log_metric("accuracy", 0.95)
    mlflow.log_metric("f1_score", 0.92)
    # `model` is assumed to be a fitted scikit-learn estimator from the training step
    mlflow.sklearn.log_model(model, "model")

# ML pipeline stages
from dataclasses import dataclass


@dataclass
class MLPipeline:
    name: str
    stage: str
    container: str
    gpu: bool
    duration: str
    status: str


pipelines = [
    MLPipeline("Data Ingestion", "Extract", "data-loader:v2", False, "15 min", "Completed"),
    MLPipeline("Feature Engineering", "Transform", "feature-eng:v3", False, "30 min", "Completed"),
    MLPipeline("Model Training", "Train", "trainer:v5-gpu", True, "2 hours", "Running"),
    MLPipeline("Evaluation", "Evaluate", "evaluator:v2", False, "10 min", "Pending"),
    MLPipeline("Model Registry", "Register", "mlflow:v2", False, "2 min", "Pending"),
    MLPipeline("Deployment", "Deploy", "kserve:v1", True, "5 min", "Pending"),
]

print("=== ML Pipeline ===")
for p in pipelines:
    gpu_str = "GPU" if p.gpu else "CPU"
    print(f"  [{p.status}] {p.name} ({p.stage})")
    print(f"    Container: {p.container} | {gpu_str} | Duration: {p.duration}")
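The stage list above only records status. As a rough illustration of how an orchestrator might walk through such stages, the sketch below runs them in order and stops at the first failure. The `Stage` class, the `run` callback, and the "Failed"/"Skipped" statuses are hypothetical and not taken from Kubeflow or any other tool named in this article.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    name: str
    container: str
    run: Callable[[], bool]  # hypothetical hook: returns True on success
    status: str = "Pending"


def run_pipeline(stages: list[Stage]) -> list[Stage]:
    """Run stages in order; stop at the first failure and skip the rest."""
    failed = False
    for stage in stages:
        if failed:
            stage.status = "Skipped"
            continue
        stage.status = "Running"
        ok = stage.run()
        stage.status = "Completed" if ok else "Failed"
        failed = not ok
    return stages


if __name__ == "__main__":
    demo = [
        Stage("Data Ingestion", "data-loader:v2", lambda: True),
        Stage("Feature Engineering", "feature-eng:v3", lambda: True),
        Stage("Model Training", "trainer:v5-gpu", lambda: False),  # simulated failure
        Stage("Evaluation", "evaluator:v2", lambda: True),
    ]
    for s in run_pipeline(demo):
        print(f"[{s.status}] {s.name} ({s.container})")
```

Stopping at the first failure keeps a bad training run from reaching evaluation, registration, or deployment.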


Kubernetes Deployment

# === K8s ML Deployment ===

# KServe: model serving
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: fraud-detector  # assumed from the storageUri path; the name is missing in the source
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn  # assumed; the format name is missing in the source
      storageUri: s3://models/fraud-detector/v1.2.3
      resources:
        requests:
          cpu: "2"
          memory: "4Gi"
        limits:
          cpu: "4"
          memory: "8Gi"
          nvidia.com/gpu: "1"
    minReplicas: 2
    maxReplicas: 10
    scaleTarget: 10  # concurrent requests
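To make the effect of `minReplicas`, `maxReplicas`, and `scaleTarget` concrete, here is a simplified model of concurrency-based scaling. It is an assumption about the general idea (divide in-flight requests by the per-pod target, then clamp), not KServe's actual autoscaling algorithm, which also averages over time windows. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(in_flight: int, scale_target: int = 10,
                     min_replicas: int = 2, max_replicas: int = 10) -> int:
    """Replicas needed so each pod handles about `scale_target` concurrent requests."""
    if scale_target <= 0:
        raise ValueError("scale_target must be positive")
    needed = math.ceil(in_flight / scale_target)
    # Clamp to the configured bounds
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    for load in (0, 20, 35, 500):
        print(f"{load} in-flight requests: {desired_replicas(load)} replicas")
```

With the values from the manifest, the replica count never drops below 2 and never exceeds 10, regardless of load.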

# Canary deployment
spec:
  predictor:
    canaryTrafficPercent: 10
    model:
      storageUri: s3://models/fraud-detector/v1.3.0
# 90% -> v1.2.3, 10% -> v1.3.0
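The 90/10 split can be pictured as weighted random routing. The simulation below is only an illustration of that idea with a seeded random generator; it does not reproduce how KServe or the underlying ingress actually routes traffic, and the function names are hypothetical.

```python
import random
from collections import Counter


def pick_version(rng: random.Random, canary_percent: int,
                 stable: str = "v1.2.3", canary: str = "v1.3.0") -> str:
    """Route one request: about `canary_percent` of traffic goes to the canary."""
    return canary if rng.random() * 100 < canary_percent else stable


def simulate(requests: int, canary_percent: int, seed: int = 0) -> Counter:
    """Count how many simulated requests land on each version."""
    rng = random.Random(seed)
    return Counter(pick_version(rng, canary_percent) for _ in range(requests))


if __name__ == "__main__":
    counts = simulate(10_000, canary_percent=10)
    for version, n in sorted(counts.items()):
        print(f"{version}: {n} requests")
```

If the canary's latency and accuracy hold up on its share of traffic, the percentage can be raised step by step until the new version takes all requests.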

# GPU node pool
kubectl label nodes gpu-node-1 accelerator=nvidia-a100

# Pod spec: schedule onto the labeled GPU nodes
nodeSelector:
  accelerator: nvidia-a100
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
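The interplay of `nodeSelector` and `tolerations` can be sketched as a small matching check. This is a deliberately simplified model of Kubernetes scheduling rules: it ignores affinity, resource fit, and the empty-key wildcard toleration. The taint on the GPU node (`nvidia.com/gpu` with effect `NoSchedule`) is an assumption, since the source only shows the node label.

```python
def selector_matches(node_labels: dict[str, str], node_selector: dict[str, str]) -> bool:
    """Every nodeSelector key/value must appear in the node's labels."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())


def tolerates(taint: dict[str, str], tolerations: list[dict[str, str]]) -> bool:
    """True if at least one toleration covers the taint."""
    for t in tolerations:
        key_ok = t.get("key") == taint["key"]
        effect_ok = t.get("effect") in (None, taint["effect"])
        if t.get("operator", "Equal") == "Exists":
            value_ok = True  # Exists ignores the taint value
        else:
            value_ok = t.get("value") == taint.get("value")
        if key_ok and effect_ok and value_ok:
            return True
    return False


def can_schedule(node: dict, pod: dict) -> bool:
    """Simplified check: labels match and every NoSchedule taint is tolerated."""
    if not selector_matches(node["labels"], pod.get("nodeSelector", {})):
        return False
    return all(
        tolerates(taint, pod.get("tolerations", []))
        for taint in node.get("taints", [])
        if taint["effect"] == "NoSchedule"
    )


if __name__ == "__main__":
    gpu_node = {
        "labels": {"accelerator": "nvidia-a100"},
        "taints": [{"key": "nvidia.com/gpu", "value": "present", "effect": "NoSchedule"}],
    }
    gpu_pod = {
        "nodeSelector": {"accelerator": "nvidia-a100"},
        "tolerations": [{"key": "nvidia.com/gpu", "operator": "Exists", "effect": "NoSchedule"}],
    }
    print("GPU pod on GPU node:", can_schedule(gpu_node, gpu_pod))
    print("CPU pod on GPU node:", can_schedule(gpu_node, {}))
```

The taint keeps ordinary CPU pods off the GPU nodes, while the selector keeps GPU pods from landing on CPU nodes.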

# Model deployments
from dataclasses import dataclass


@dataclass
class ModelDeployment:
    model: str
    version: str
    replicas: int
    gpu: str
    rps: int
    latency_p99_ms: int
    accuracy: float


deployments = [
    ModelDeployment("fraud-detector", "v1.2.3", 4, "T4", 500, 45, 0.952),
    ModelDeployment("recommendation", "v3.1.0", 6, "A100", 2000, 25, 0.891),
    ModelDeployment("sentiment-analysis", "v2.0.1", 2, "None", 300, 80, 0.934),
    ModelDeployment("image-classifier", "v1.5.2", 3, "T4", 100, 120, 0.967),
    # 0.0 appears to be a placeholder: accuracy is not a natural metric for embeddings
    ModelDeployment("text-embedding", "v1.0.0", 8, "A100", 5000, 15, 0.0),
]

print("\n=== Model Deployments ===")
for d in deployments:
    print(f"  [{d.model}] {d.version} x{d.replicas}")
    print(f"    GPU: {d.gpu} | RPS: {d.rps} | p99: {d.latency_p99_ms}ms | Acc: {d.accuracy:.1%}")

Monitoring and Retraining

# === Model Monitoring ===

# Prometheus Metrics for ML
# - model_prediction_latency_seconds
# - model_prediction_total (by class)
# - model_accuracy_score
# - feature_drift_score
# - data_quality_score

# Grafana Dashboard
# - Prediction latency p50/p95/p99
# - Prediction volume per model
# - Feature distribution shift
# - Accuracy over time
# - GPU utilization per model

# Auto-retrain Trigger
# if feature_drift > threshold:
#     trigger_pipeline("retrain")
# if accuracy < sla_target:
#     trigger_pipeline("retrain")
# schedule: weekly retrain with fresh data

monitoring = {
    "Prediction Latency p99": "45ms (target < 100ms)",
    "Throughput": "3,500 predictions/sec",
    "Model Accuracy (7d)": "95.2% (target > 93%)",
    "Feature Drift Score": "0.12 (threshold: 0.3)",
    "Data Quality Score": "98.5%",
    "GPU Utilization": "72% average",
    "Failed Predictions": "0.05%",
    "Auto-retrain Count (30d)": "2 triggered",
}

print("ML Monitoring Dashboard:")
for k, v in monitoring.items():
    print(f"  {k}: {v}")

best_practices = [
    "Docker image: pin every library version in requirements.txt",
    "MLflow: track every experiment's parameters, metrics, and artifacts",
    "KServe: use a canary rollout for every new model",
    "GPU: use separate node pools for GPU and CPU workloads",
    "Monitoring: alert when accuracy drops below the SLA",
    "Drift: check feature drift daily and auto-retrain when drift is detected",
    "Registry: version every model in the MLflow Registry",
]

print("\n\nBest Practices:")
for i, p in enumerate(best_practices, 1):
    print(f"  {i}. {p}")
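The auto-retrain trigger in the comments above can be sketched as follows. The article does not say how `feature_drift_score` is computed, so this example substitutes the Population Stability Index (PSI) over binned feature distributions as one common choice; that substitution is an assumption. The thresholds 0.3 and 0.93 come from the dashboard values, while the function names `psi` and `should_retrain` are hypothetical.

```python
import math


def psi(expected: list[float], actual: list[float], eps: float = 1e-6) -> float:
    """Population Stability Index between two binned distributions (bin fractions)."""
    if len(expected) != len(actual):
        raise ValueError("distributions must have the same number of bins")
    score = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) on empty bins
        score += (a - e) * math.log(a / e)
    return score


def should_retrain(drift: float, accuracy: float,
                   drift_threshold: float = 0.3, sla_target: float = 0.93) -> bool:
    """Mirror the two triggers: drift above threshold or accuracy below the SLA."""
    return drift > drift_threshold or accuracy < sla_target


if __name__ == "__main__":
    training_bins = [0.25, 0.25, 0.25, 0.25]  # illustrative reference distribution
    live_bins = [0.10, 0.20, 0.30, 0.40]      # illustrative production distribution
    drift = psi(training_bins, live_bins)
    print(f"drift={drift:.3f} retrain={should_retrain(drift, accuracy=0.952)}")
```

The weekly scheduled retrain mentioned in the comments would sit outside this check, for example as a cron-style pipeline trigger.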

Tips

MLflow: Track every experiment, no matter how small
Docker: Pin every library version in the image
Canary: Test new models with a canary before a full deploy
GPU: Separate GPU and CPU node pools to save cost
Drift: Monitor feature drift and auto-retrain when needed
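The pinning tip is easy to enforce in CI. The sketch below flags lines in a `requirements.txt` that lack an exact `==` pin. It is a minimal check under simple assumptions: it skips pip option lines such as `-r other.txt` and does not handle URL or VCS requirements. The function name and the sample package versions are hypothetical.

```python
import re

# Package name, optional extras, then an exact "==" pin
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that lack an exact `==` version pin."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    sample = """\
fastapi==0.110.0
uvicorn[standard]==0.29.0
scikit-learn>=1.3
numpy
"""
    for line in unpinned_requirements(sample):
        print("not pinned:", line)
```

Failing the build when this list is non-empty keeps the serving image reproducible between rebuilds.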

Real-World Use in Organizations

For medium to large organizations, a Three-Tier Architecture is recommended: a Core Layer that forms the backbone of the system, a Distribution Layer that distributes traffic, and an Access Layer that connects directly to users. Clear separation between layers makes troubleshooting easier and lets the system scale as needed.

Network security is equally important. Deploy a Next-Generation Firewall capable of Deep Packet Inspection, use network segmentation with a separate VLAN for each department, install IDS/IPS to detect attacks, and run a regular security audit at least twice a year.

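What is MLOps?

MLOps (ML Operations) applies DevOps practices to machine learning: CI/CD for models, version control for models and data, monitoring, drift detection, and retraining. Common tools for running models in production include MLflow, Kubeflow, Vertex AI, and SageMaker.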

Why use containers with ML?

Containers give ML workloads reproducibility, dependency isolation, scalability, and portability. On Kubernetes they also allow GPU sharing and auto-scaling, and the environment stays the same locally and in the cloud.

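How do Kubeflow and MLflow differ?

Kubeflow is a full platform on Kubernetes that covers pipelines, notebooks, and serving, and it is heavyweight. MLflow covers tracking, the model registry, and serving; it is lightweight and does not need Kubernetes. The two can be used together.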

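How do you deploy an ML model?

Package the model in Docker behind a REST API built with FastAPI or Flask, then deploy it on Kubernetes with KServe or Seldon. Use A/B or canary rollouts, auto-scaling, and a GPU node pool, and monitor latency, accuracy, and drift.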

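Summary

Container orchestration for an MLOps pipeline combines Kubernetes, Kubeflow, MLflow, and Docker: models are trained and served in containers, scheduled onto GPUs, rolled out with KServe canary deployments, and tied together with CI/CD, while monitoring and drift detection drive retraining in production.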
