Model Registry Career Development IT

2026-04-22 · อ. บอม, SiamCafe.net · 10,798 words

Model Registry and Career Development

Model Registry is more than a tool for managing ML models. It is a skill that reflects an understanding of the whole MLOps lifecycle, from experiment tracking and model versioning through the deployment pipeline to monitoring. A deep understanding of Model Registry helps you grow in ML Engineering and MLOps Engineering.

Today's job market wants people who can not only train a model but also deploy and manage it in production. Model Registry is the bridge between data science and production engineering.
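To make the idea concrete, here is a minimal sketch of what a registry keeps track of: immutable numbered versions plus movable labels that point at one of them. The class `ToyModelRegistry`, its methods, and the `runs:/aaa/model` style URIs are hypothetical names invented for this illustration; this is not MLflow's API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModelRegistry:
    """Hypothetical in-memory registry; illustrates the concept only."""
    # model name -> list of artifact URIs; list index + 1 is the version number
    versions: dict[str, list[str]] = field(default_factory=dict)
    # (model name, alias) -> version number
    aliases: dict[tuple[str, str], int] = field(default_factory=dict)

    def register(self, name: str, artifact_uri: str) -> int:
        """Store a new version and return its number."""
        self.versions.setdefault(name, []).append(artifact_uri)
        return len(self.versions[name])

    def set_alias(self, name: str, alias: str, version: int) -> None:
        """Point a movable label (e.g. 'staging') at an existing version."""
        if not 1 <= version <= len(self.versions.get(name, [])):
            raise ValueError(f"{name} has no version {version}")
        self.aliases[(name, alias)] = version

    def resolve(self, name: str, alias: str) -> str:
        """Return the artifact URI that an alias currently points to."""
        version = self.aliases[(name, alias)]
        return self.versions[name][version - 1]


registry = ToyModelRegistry()
v1 = registry.register("iris_classifier", "runs:/aaa/model")  # placeholder URI
v2 = registry.register("iris_classifier", "runs:/bbb/model")  # placeholder URI
registry.set_alias("iris_classifier", "staging", v2)
print(registry.resolve("iris_classifier", "staging"))  # prints runs:/bbb/model
```

The serving side only ever asks for a label, so promoting or rolling back a model means moving the label rather than redeploying code. That separation is what lets data scientists and production engineers work independently.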

Career Paths in ML/MLOps

Level | Position | Experience | Core skills | Salary (approx.)
Entry | Junior ML Engineer | 0-2 years | Python, ML Basics, Git | 30-50K THB
Mid | ML Engineer | 2-5 years | MLflow, Docker, Cloud | 50-90K THB
Senior | Senior ML Engineer | 5-8 years | ML Platform, K8s, Architecture | 90-150K THB
Staff | Staff ML Engineer | 8+ years | Strategy, Cross-team Leadership | 150-250K THB
Specialist | MLOps Engineer | 3-5 years | CI/CD, Infrastructure, Monitoring | 60-120K THB
Lead | ML Platform Lead | 5-8 years | Platform Design, Team Management | 120-200K THB

Building a Portfolio with MLflow Model Registry

# portfolio_project.py: end-to-end ML project for a portfolio
# Skills demonstrated: Data Processing, Training, Experiment Tracking,
# Model Registry, Serving, Monitoring

import mlflow
import mlflow.sklearn
from mlflow.tracking import MlflowClient
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    classification_report
)
import pandas as pd

class MLPortfolioProject:
    """End-to-end ML project that demonstrates MLOps skills"""

    def __init__(self, experiment_name="iris_classification"):
        mlflow.set_tracking_uri("http://localhost:5000")
        mlflow.set_experiment(experiment_name)
        self.client = MlflowClient()

    def prepare_data(self):
        """Step 1: Data Preparation"""
        iris = load_iris()
        df = pd.DataFrame(iris.data, columns=iris.feature_names)
        df["target"] = iris.target

        # Data Quality Check
        print("=== Data Quality Report ===")
        print(f"Shape: {df.shape}")
        print(f"Missing values:\n{df.isnull().sum()}")
        print(f"Class distribution:\n{df['target'].value_counts()}")

        X = df.drop("target", axis=1)
        y = df["target"]

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42, stratify=y
        )

        return X_train, X_test, y_train, y_test

    def train_and_evaluate(self, model_class, params, X_train, y_train,
                           X_test, y_test, run_name):
        """Step 2: Training with Experiment Tracking"""
        with mlflow.start_run(run_name=run_name):
            # Log Parameters
            mlflow.log_params(params)
            mlflow.log_param("model_class", model_class.__name__)

            # Train
            model = model_class(**params)
            model.fit(X_train, y_train)

            # Predict
            y_pred = model.predict(X_test)

            # Metrics
            metrics = {
                "accuracy": accuracy_score(y_test, y_pred),
                "precision_macro": precision_score(y_test, y_pred, average="macro"),
                "recall_macro": recall_score(y_test, y_pred, average="macro"),
                "f1_macro": f1_score(y_test, y_pred, average="macro"),
            }

            # Cross Validation
            cv_scores = cross_val_score(model, X_train, y_train, cv=5)
            metrics["cv_mean"] = cv_scores.mean()
            metrics["cv_std"] = cv_scores.std()

            mlflow.log_metrics(metrics)

            # Log Model
            mlflow.sklearn.log_model(
                model, "model",
                registered_model_name="iris_classifier",
            )

            # Log Classification Report
            report = classification_report(y_test, y_pred)
            mlflow.log_text(report, "classification_report.txt")

            print(f"\n{run_name}: accuracy={metrics['accuracy']:.4f} "
                  f"f1={metrics['f1_macro']:.4f}")

            return model, metrics

    def compare_models(self, X_train, y_train, X_test, y_test):
        """Step 3: Model Comparison"""
        experiments = [
            ("RandomForest_v1", RandomForestClassifier,
             {"n_estimators": 100, "max_depth": 5, "random_state": 42}),
            ("RandomForest_v2", RandomForestClassifier,
             {"n_estimators": 200, "max_depth": 10, "random_state": 42}),
            ("GradientBoosting_v1", GradientBoostingClassifier,
             {"n_estimators": 100, "learning_rate": 0.1, "random_state": 42}),
            ("GradientBoosting_v2", GradientBoostingClassifier,
             {"n_estimators": 200, "learning_rate": 0.05, "random_state": 42}),
        ]

        results = []
        for name, model_class, params in experiments:
            model, metrics = self.train_and_evaluate(
                model_class, params, X_train, y_train, X_test, y_test, name
            )
            results.append({"name": name, "metrics": metrics})

        # Find the best model by macro F1 on the test set
        best = max(results, key=lambda r: r["metrics"]["f1_macro"])
        print(f"\nBest Model: {best['name']} "
              f"(F1: {best['metrics']['f1_macro']:.4f})")

        return results

    def promote_best_model(self, model_name="iris_classifier"):
        """Step 4: Model Registry Management"""
        # Fetch every registered version of this model
        versions = self.client.search_model_versions(
            f"name='{model_name}'"
        )

        if not versions:
            print("No model versions found")
            return

        # Find the version with the highest F1 score
        best_version = None
        best_f1 = 0

        for v in versions:
            run = self.client.get_run(v.run_id)
            f1 = run.data.metrics.get("f1_macro", 0)
            if f1 > best_f1:
                best_f1 = f1
                best_version = v

        if best_version:
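            # Promote to Staging.
            # Note: model stages are deprecated in recent MLflow releases
            # (2.9+) in favor of aliases, so this call may emit a warning.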
            self.client.transition_model_version_stage(
                name=model_name,
                version=best_version.version,
                stage="Staging",
            )
            print(f"Model v{best_version.version} promoted to Staging "
                  f"(F1: {best_f1:.4f})")

    def run_pipeline(self):
        """Run the full pipeline"""
        print("=" * 50)
        print("ML Portfolio Project — End-to-end Pipeline")
        print("=" * 50)

        X_train, X_test, y_train, y_test = self.prepare_data()
        results = self.compare_models(X_train, y_train, X_test, y_test)
        self.promote_best_model()

        print("\nPipeline completed successfully!")

# Run (requires an MLflow tracking server at http://localhost:5000)
# project = MLPortfolioProject()
# project.run_pipeline()
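The selection loop in `promote_best_model` starts from `best_f1 = 0` and treats a missing metric as 0, so a version whose run never logged `f1_macro` is silently ranked last, and ties are resolved by iteration order. A minimal sketch of a more explicit selection rule follows. It works on plain dictionaries rather than MLflow objects; the function name `pick_best_version` and the record shape are assumptions made for this illustration.

```python
import math


def pick_best_version(records, metric="f1_macro"):
    """Return the record with the highest non-NaN `metric`, or None.

    `records` is a list of dicts such as {"version": 3, "metrics": {...}};
    this shape is an assumption for the sketch, not an MLflow type.
    Ties go to the higher version number.
    """
    best = None
    for rec in records:
        value = rec["metrics"].get(metric)
        # Skip versions whose run never logged the metric (or logged NaN).
        if value is None or math.isnan(value):
            continue
        key = (value, rec["version"])
        if best is None or key > (best["metrics"][metric], best["version"]):
            best = rec
    return best


candidates = [
    {"version": 1, "metrics": {"f1_macro": 0.93}},
    {"version": 2, "metrics": {"f1_macro": 0.97}},
    {"version": 3, "metrics": {"accuracy": 0.99}},  # metric missing: skipped
    {"version": 4, "metrics": {"f1_macro": 0.97}},  # ties with version 2
]
print(pick_best_version(candidates)["version"])  # prints 4
```

Keeping the rule in a pure function like this also makes it easy to unit-test without a running tracking server.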

Skills to Learn

# learning_roadmap.py: learning roadmap for ML Engineers
# Lists the skills to learn at each level

roadmap = {
    "foundation": {
        "duration": "3-6 months",
        "skills": [
            "Python Programming (OOP, Data Structures)",
            "SQL & Database Fundamentals",
            "Git Version Control",
            "Linux Command Line",
            "Statistics & Probability",
            "Linear Algebra Basics",
        ],
        "projects": [
            "Data Analysis with Pandas",
            "SQL Query Optimization",
        ],
    },
    "ml_basics": {
        "duration": "3-6 months",
        "skills": [
            "Scikit-learn (Classification, Regression, Clustering)",
            "Feature Engineering",
            "Model Evaluation (CV, Metrics)",
            "Data Visualization (Matplotlib, Seaborn)",
            "Jupyter Notebooks",
        ],
        "projects": [
            "Kaggle Competition (Top 20%)",
            "End-to-end ML Project on GitHub",
        ],
    },
    "ml_engineering": {
        "duration": "6-12 months",
        "skills": [
            "Deep Learning (TensorFlow/PyTorch)",
            "MLflow Experiment Tracking & Model Registry",
            "Docker Containerization",
            "REST API (FastAPI/Flask)",
            "Unit Testing for ML",
            "CI/CD Basics (GitHub Actions)",
        ],
        "projects": [
            "ML API with FastAPI + Docker",
            "Automated Training Pipeline with MLflow",
        ],
    },
    "mlops": {
        "duration": "6-12 months",
        "skills": [
            "Kubernetes (Deployment, Services, ConfigMaps)",
            "Cloud ML Services (AWS SageMaker / GCP Vertex AI)",
            "Data Pipelines (Airflow/Prefect)",
            "Model Monitoring & Drift Detection",
            "Infrastructure as Code (Terraform)",
            "Observability (Prometheus, Grafana)",
        ],
        "projects": [
            "Full MLOps Pipeline on K8s",
            "Model Monitoring Dashboard",
            "Open-source Contribution",
        ],
    },
    "senior_staff": {
        "duration": "ongoing",
        "skills": [
            "System Design for ML",
            "ML Platform Architecture",
            "Cost Optimization",
            "Team Leadership & Mentoring",
            "Technical Writing & Communication",
            "Cross-functional Collaboration",
        ],
        "projects": [
            "Design ML Platform for Organization",
            "Tech Blog / Conference Talks",
            "Mentor Junior Engineers",
        ],
    },
}

# Print the roadmap
for level, info in roadmap.items():
    print(f"\n{'='*50}")
    print(f"Level: {level.upper()} ({info['duration']})")
    print(f"{'='*50}")
    print("Skills:")
    for skill in info["skills"]:
        print(f"  - {skill}")
    print("Projects:")
    for project in info["projects"]:
        print(f"  - {project}")
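As a rough worked example, the stage durations in the roadmap can be added up to estimate how long the path from Foundation through MLOps takes. The sketch below copies the duration strings from the roadmap and assumes the stages are taken one after another with no overlap, which is a simplification; the helper name `total_months` is made up for this example.

```python
import re

# Durations copied from the roadmap above; "ongoing" has no fixed length.
durations = {
    "foundation": "3-6 months",
    "ml_basics": "3-6 months",
    "ml_engineering": "6-12 months",
    "mlops": "6-12 months",
    "senior_staff": "ongoing",
}


def total_months(durations):
    """Sum the low and high ends of every 'A-B months' entry."""
    low_total = high_total = 0
    for text in durations.values():
        match = re.fullmatch(r"(\d+)-(\d+) months", text)
        if match is None:  # e.g. "ongoing": open-ended, so skip it
            continue
        low_total += int(match.group(1))
        high_total += int(match.group(2))
    return low_total, high_total


low, high = total_months(durations)
print(f"Foundation through MLOps: {low}-{high} months")
# prints: Foundation through MLOps: 18-36 months
```

In other words, under these assumptions the four bounded stages add up to roughly one and a half to three years before the open-ended senior/staff stage begins.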

Docker + FastAPI for Model Serving

# model_api.py: FastAPI model serving
# Example of a production-style API for a portfolio
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import mlflow
import mlflow.sklearn
import numpy as np
import logging

app = FastAPI(title="ML Model API", version="1.0")
logger = logging.getLogger(__name__)

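# Load the model from the MLflow Registry.
# A version must already be in the Production stage; the portfolio pipeline
# above only promotes to Staging, so promote a version to Production first.
MODEL_URI = "models:/iris_classifier/Production"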
model = None

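# Note: recent FastAPI versions deprecate on_event in favor of lifespan
# handlers; it is kept here for brevity.
@app.on_event("startup")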
async def load_model():
    global model
    try:
        model = mlflow.sklearn.load_model(MODEL_URI)
        logger.info(f"Model loaded from {MODEL_URI}")
    except Exception as e:
        logger.error(f"Failed to load model: {e}")

class PredictRequest(BaseModel):
    features: list[list[float]]

class PredictResponse(BaseModel):
    predictions: list[int]
    probabilities: list[list[float]]
    model_version: str

@app.post("/predict", response_model=PredictResponse)
async def predict(req: PredictRequest):
    if model is None:
        # 503 signals that the service is temporarily unable to serve requests
        raise HTTPException(status_code=503, detail="Model not loaded")

    X = np.array(req.features)
    predictions = model.predict(X).tolist()
    probabilities = model.predict_proba(X).tolist()

    return PredictResponse(
        predictions=predictions,
        probabilities=probabilities,
        model_version="production",
    )

@app.get("/health")
async def health():
    return {"status": "healthy", "model_loaded": model is not None}

# --- Dockerfile ---
# FROM python:3.12-slim
# WORKDIR /app
# COPY requirements.txt .
# RUN pip install --no-cache-dir -r requirements.txt
# COPY . .
# EXPOSE 8000
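# (python:3.12-slim ships without curl, so the health check uses Python)
# HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')" || exit 1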
# CMD ["uvicorn", "model_api:app", "--host", "0.0.0.0", "--port", "8000"]

# --- docker-compose.yml ---
# version: "3.8"
# services:
#   mlflow:
#     image: ghcr.io/mlflow/mlflow:latest
#     ports: ["5000:5000"]
#     command: mlflow server --host 0.0.0.0
#   model-api:
#     build: .
#     ports: ["8000:8000"]
#     environment:
#       MLFLOW_TRACKING_URI: http://mlflow:5000
#     depends_on: [mlflow]

Preparing for Job Interviews

How does Model Registry relate to career development?

Model Registry is a key skill for ML Engineers and MLOps Engineers. It demonstrates an understanding of the whole MLOps lifecycle, is a core skill the job market demands, and helps you grow in ML/AI engineering.

What skills does an ML Engineer need?

Python, ML frameworks (TensorFlow/PyTorch), MLOps tools (MLflow/Kubeflow), Docker/Kubernetes, CI/CD, cloud services (AWS SageMaker/GCP Vertex AI), basic data engineering, and software engineering best practices.

What does the ML Engineer career path look like?

Junior ML Engineer (0-2 years) → ML Engineer (2-5 years) → Senior ML Engineer (5-8 years) → Staff/Principal ML Engineer (8+ years), or branch into MLOps Engineer, ML Platform Lead, or ML Manager depending on your interests.

How do you build an ML portfolio?

Build end-to-end projects on GitHub that show data processing, training, MLflow tracking, and model serving with Docker. Write blog posts explaining your technical decisions, and include clean code, tests, documentation, and a CI/CD pipeline.

Summary

Model Registry is an important skill for career development in ML/AI because it demonstrates an understanding of the whole MLOps lifecycle. Build a portfolio of end-to-end projects using MLflow, Docker, and FastAPI; follow the roadmap from Foundation to MLOps; and prepare for interviews with technical questions, system design, and a portfolio that shows real skills.
