SiamCafe.net Blog
Technology

Stable Diffusion ComfyUI Metric Collection — เก็บ Metrics จาก AI Image Generation

2025-06-18 · Ajarn Bom — SiamCafe.net · 1,620 words

What Are Stable Diffusion and ComfyUI?

Stable Diffusion is an open-source text-to-image AI model, developed by Stability AI, that generates images from text prompts. It supports text-to-image, image-to-image, inpainting, outpainting, and ControlNet for controlling pose and composition.

ComfyUI is a node-based GUI for Stable Diffusion that lets users build image generation workflows by wiring nodes together. Its advantages: it is more flexible than Automatic1111 and supports complex workflows such as multi-model pipelines, conditional generation, batch processing, and custom nodes from the community.

Metric collection matters for AI image generation because it lets you track generation speed (images/second), GPU utilization (VRAM usage, compute), model performance (quality scores), cost per image (electricity, cloud GPU costs), and workflow efficiency (bottleneck identification).

Use cases that call for metrics include production image generation services, A/B testing of different models and samplers, capacity planning for GPU infrastructure, billing and usage tracking for SaaS, and quality monitoring to detect model degradation.

Installing ComfyUI and Stable Diffusion

How to install ComfyUI with monitoring

# === Install ComfyUI ===

# Prerequisites
# - Python 3.10+
# - NVIDIA GPU with CUDA support
# - Git

# Clone ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate   # Windows

# Install dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Install monitoring dependencies
pip install prometheus_client psutil gputil pynvml flask

# === Download Models ===
# SDXL 1.0
cd models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

# SD 1.5 (the runwayml repo has been pulled from Hugging Face before;
# switch to a community mirror if this URL returns 404)
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

# VAE
cd ../vae
wget https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors

cd ../..

# === Docker Setup with Monitoring ===
# docker-compose.yml
# services:
#   comfyui:
#     build: .
#     ports:
#       - "8188:8188"
#       - "9090:9090"  # Metrics endpoint
#     volumes:
#       - ./models:/app/models
#       - ./output:/app/output
#       - ./custom_nodes:/app/custom_nodes
#     deploy:
#       resources:
#         reservations:
#           devices:
#             - driver: nvidia
#               count: 1
#               capabilities: [gpu]
#     environment:
#       PROMETHEUS_METRICS: "true"
#
#   prometheus:
#     image: prom/prometheus:latest
#     ports: ["9091:9090"]
#     volumes:
#       - ./prometheus.yml:/etc/prometheus/prometheus.yml
#
#   grafana:
#     image: grafana/grafana:latest
#     ports: ["3000:3000"]
#     environment:
#       GF_SECURITY_ADMIN_PASSWORD: admin

# Start ComfyUI
python main.py --listen 0.0.0.0 --port 8188

# Open a browser: http://localhost:8188
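Before scripting against the server it helps to wait until the API actually answers. A minimal sketch polling ComfyUI's `/system_stats` endpoint with only the standard library (the timeout and poll interval are arbitrary choices):

```python
#!/usr/bin/env python3
# wait_ready.py — poll ComfyUI until its HTTP API answers
import time
import urllib.request
import urllib.error

def wait_for_comfyui(base_url: str = "http://localhost:8188", timeout: float = 60) -> bool:
    """Return True once GET /system_stats responds with 200, False on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/system_stats", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(2)  # server not up yet; retry
    return False

# if wait_for_comfyui():
#     print("ComfyUI is ready")
```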

Building a Workflow for Image Generation

A ComfyUI workflow via the API

#!/usr/bin/env python3
# comfyui_workflow.py — ComfyUI Workflow via API
import requests
import json
import uuid
import time
import logging
from pathlib import Path
from typing import Dict, Optional

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("comfyui")

class ComfyUIClient:
    def __init__(self, host="localhost", port=8188):
        self.base_url = f"http://{host}:{port}"
        self.client_id = str(uuid.uuid4())
    
    def queue_prompt(self, workflow: Dict) -> str:
        resp = requests.post(
            f"{self.base_url}/prompt",
            json={"prompt": workflow, "client_id": self.client_id},
        )
        resp.raise_for_status()
        return resp.json()["prompt_id"]
    
    def get_history(self, prompt_id: str) -> Dict:
        resp = requests.get(f"{self.base_url}/history/{prompt_id}")
        resp.raise_for_status()
        return resp.json()
    
    def get_image(self, filename: str, subfolder: str = "", folder_type: str = "output"):
        params = {"filename": filename, "subfolder": subfolder, "type": folder_type}
        resp = requests.get(f"{self.base_url}/view", params=params)
        return resp.content
    
    def wait_for_completion(self, prompt_id: str, timeout: int = 300):
        start = time.time()
        
        while time.time() - start < timeout:
            history = self.get_history(prompt_id)
            
            if prompt_id in history:
                outputs = history[prompt_id].get("outputs", {})
                if outputs:
                    return outputs
            
            time.sleep(1)
        
        raise TimeoutError(f"Generation timed out after {timeout}s")
    
    def generate_image(self, prompt: str, negative_prompt: str = "",
                       width: int = 1024, height: int = 1024,
                       steps: int = 20, cfg: float = 7.0,
                       seed: int = -1, sampler: str = "euler",
                       model: str = "sd_xl_base_1.0.safetensors"):
        
        if seed == -1:
            import random  # local import: only needed for random seeds
            seed = random.randint(0, 2**32 - 1)
        
        workflow = {
            "3": {
                "class_type": "KSampler",
                "inputs": {
                    "seed": seed,
                    "steps": steps,
                    "cfg": cfg,
                    "sampler_name": sampler,
                    "scheduler": "normal",
                    "denoise": 1.0,
                    "model": ["4", 0],
                    "positive": ["6", 0],
                    "negative": ["7", 0],
                    "latent_image": ["5", 0],
                },
            },
            "4": {
                "class_type": "CheckpointLoaderSimple",
                "inputs": {"ckpt_name": model},
            },
            "5": {
                "class_type": "EmptyLatentImage",
                "inputs": {"width": width, "height": height, "batch_size": 1},
            },
            "6": {
                "class_type": "CLIPTextEncode",
                "inputs": {"text": prompt, "clip": ["4", 1]},
            },
            "7": {
                "class_type": "CLIPTextEncode",
                "inputs": {"text": negative_prompt, "clip": ["4", 1]},
            },
            "8": {
                "class_type": "VAEDecode",
                "inputs": {"samples": ["3", 0], "vae": ["4", 2]},
            },
            "9": {
                "class_type": "SaveImage",
                "inputs": {"filename_prefix": "gen", "images": ["8", 0]},
            },
        }
        
        prompt_id = self.queue_prompt(workflow)
        logger.info(f"Queued generation: {prompt_id}")
        
        outputs = self.wait_for_completion(prompt_id)
        
        images = []
        for node_id, output in outputs.items():
            for img in output.get("images", []):
                images.append(img)
        
        return {"prompt_id": prompt_id, "seed": seed, "images": images}

# client = ComfyUIClient()
# result = client.generate_image(
#     prompt="beautiful sunset over mountains, photorealistic, 8k",
#     negative_prompt="blurry, low quality, text, watermark",
#     width=1024, height=1024, steps=25,
# )
# print(f"Generated: {result['images']}")
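The workflow dict above is a plain node graph: any input of the form `["4", 0]` is a (node id, output index) reference to another node. A small lint helper (a hypothetical utility, not part of the ComfyUI API) can catch dangling references before queuing:

```python
#!/usr/bin/env python3
# workflow_lint.py — sanity-check node references in a ComfyUI workflow dict
from typing import Dict, List

def find_dangling_refs(workflow: Dict) -> List[str]:
    """Return descriptions of [node_id, output_index] inputs that point to missing nodes."""
    problems = []
    for node_id, node in workflow.items():
        for name, value in node.get("inputs", {}).items():
            # node references are 2-element lists whose first item is a node-id string
            if isinstance(value, list) and len(value) == 2 and isinstance(value[0], str):
                if value[0] not in workflow:
                    problems.append(f"node {node_id} input '{name}' -> missing node {value[0]}")
    return problems

# find_dangling_refs(workflow) -> [] means every connection resolves
```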

Metric Collection for AI Image Generation

A system for collecting metrics from ComfyUI

#!/usr/bin/env python3
# metrics_collector.py — ComfyUI Metric Collection System
import time
import json
import logging
import threading
from datetime import datetime
from pathlib import Path
from typing import Dict, List
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("metrics")

try:
    import psutil
except ImportError:
    psutil = None

try:
    import pynvml
    pynvml.nvmlInit()
    HAS_GPU = True
except Exception:
    HAS_GPU = False

@dataclass
class GenerationMetrics:
    prompt_id: str
    timestamp: str
    model: str
    resolution: str
    steps: int
    sampler: str
    generation_time_ms: float
    vram_peak_mb: float
    gpu_utilization: float
    cpu_utilization: float
    seed: int
    prompt_tokens: int = 0

class MetricsCollector:
    def __init__(self, storage_path="metrics"):
        self.storage_path = Path(storage_path)
        self.storage_path.mkdir(exist_ok=True)
        self.metrics: List[GenerationMetrics] = []
        self._gpu_handle = None
        
        if HAS_GPU:
            try:
                self._gpu_handle = pynvml.nvmlDeviceGetHandleByIndex(0)
            except Exception:
                pass
    
    def get_gpu_stats(self):
        if not self._gpu_handle:
            return {"vram_used_mb": 0, "vram_total_mb": 0, "gpu_util": 0, "temp": 0}
        
        try:
            mem_info = pynvml.nvmlDeviceGetMemoryInfo(self._gpu_handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(self._gpu_handle)
            temp = pynvml.nvmlDeviceGetTemperature(self._gpu_handle, pynvml.NVML_TEMPERATURE_GPU)
            
            return {
                "vram_used_mb": mem_info.used // (1024 * 1024),
                "vram_total_mb": mem_info.total // (1024 * 1024),
                "gpu_util": util.gpu,
                "temp": temp,
            }
        except Exception:
            return {"vram_used_mb": 0, "vram_total_mb": 0, "gpu_util": 0, "temp": 0}
    
    def get_system_stats(self):
        # CPU/RAM stats come from psutil and should not depend on the GPU probe
        try:
            import psutil
        except ImportError:
            return {"cpu_percent": 0, "ram_used_mb": 0, "ram_total_mb": 0}
        vm = psutil.virtual_memory()
        return {
            "cpu_percent": psutil.cpu_percent(interval=0.1),
            "ram_used_mb": vm.used // (1024 * 1024),
            "ram_total_mb": vm.total // (1024 * 1024),
        }
    
    def record_generation(self, prompt_id, model, width, height,
                          steps, sampler, generation_time_ms, seed):
        gpu_stats = self.get_gpu_stats()
        sys_stats = self.get_system_stats()
        
        metric = GenerationMetrics(
            prompt_id=prompt_id,
            timestamp=datetime.utcnow().isoformat(),
            model=model,
            resolution=f"{width}x{height}",
            steps=steps,
            sampler=sampler,
            generation_time_ms=round(generation_time_ms, 1),
            vram_peak_mb=gpu_stats["vram_used_mb"],
            gpu_utilization=gpu_stats["gpu_util"],
            cpu_utilization=sys_stats["cpu_percent"],
            seed=seed,
        )
        
        self.metrics.append(metric)
        self._save_metric(metric)
        
        logger.info(
            f"Gen: {metric.resolution} {metric.steps}steps "
            f"{metric.generation_time_ms:.0f}ms VRAM:{metric.vram_peak_mb}MB"
        )
        
        return metric
    
    def _save_metric(self, metric):
        date_str = datetime.utcnow().strftime("%Y-%m-%d")
        filepath = self.storage_path / f"metrics_{date_str}.jsonl"
        
        with open(filepath, "a") as f:
            f.write(json.dumps(metric.__dict__) + "\n")
    
    def get_summary(self, hours=24):
        cutoff = datetime.utcnow().timestamp() - hours * 3600
        recent = [
            m for m in self.metrics
            if datetime.fromisoformat(m.timestamp).timestamp() >= cutoff
        ]
        
        if not recent:
            return {"total_generations": 0}
        
        gen_times = [m.generation_time_ms for m in recent]
        vram_usage = [m.vram_peak_mb for m in recent]
        
        models_used = {}
        for m in recent:
            models_used[m.model] = models_used.get(m.model, 0) + 1
        
        resolutions_used = {}
        for m in recent:
            resolutions_used[m.resolution] = resolutions_used.get(m.resolution, 0) + 1
        
        return {
            "total_generations": len(recent),
            "avg_generation_ms": round(sum(gen_times) / len(gen_times), 1),
            "min_generation_ms": round(min(gen_times), 1),
            "max_generation_ms": round(max(gen_times), 1),
            "avg_vram_mb": round(sum(vram_usage) / len(vram_usage), 0),
            "peak_vram_mb": round(max(vram_usage), 0),
            "models_used": models_used,
            "resolutions_used": resolutions_used,
            "images_per_hour": round(len(recent) / max(hours, 1), 1),
        }
    
    def export_prometheus_metrics(self):
        summary = self.get_summary()
        gpu = self.get_gpu_stats()
        
        lines = [
            f"comfyui_generations_total {summary['total_generations']}",
            f"comfyui_generation_avg_ms {summary.get('avg_generation_ms', 0)}",
            f"comfyui_gpu_vram_used_mb {gpu['vram_used_mb']}",
            f"comfyui_gpu_vram_total_mb {gpu['vram_total_mb']}",
            f"comfyui_gpu_utilization {gpu['gpu_util']}",
            f"comfyui_gpu_temperature {gpu['temp']}",
        ]
        
        return "\n".join(lines)

# collector = MetricsCollector()
# collector.record_generation("abc123", "sdxl", 1024, 1024, 20, "euler", 5200, 42)
# print(json.dumps(collector.get_summary(), indent=2))
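The summary above only reports avg/min/max, but for capacity planning tail latency usually matters more. A sketch that adds nearest-rank percentiles over the recorded generation times (a naive sort-based method, fine at this scale; the function names are illustrative):

```python
#!/usr/bin/env python3
# latency_percentiles.py — percentile summary for generation times
from typing import Dict, List

def percentile(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile over a pre-sorted list; 0.0 for empty input."""
    if not sorted_values:
        return 0.0
    idx = min(len(sorted_values) - 1,
              int(round(pct / 100 * (len(sorted_values) - 1))))
    return sorted_values[idx]

def latency_summary(gen_times_ms: List[float]) -> Dict[str, float]:
    values = sorted(gen_times_ms)
    return {
        "p50_ms": percentile(values, 50),
        "p95_ms": percentile(values, 95),
        "p99_ms": percentile(values, 99),
    }

# latency_summary([m.generation_time_ms for m in collector.metrics])
```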

Automation and Batch Processing

Batch generation with metrics tracking

#!/usr/bin/env python3
# batch_generator.py — Batch Image Generation with Metrics
import json
import time
import logging
import csv
from datetime import datetime
from pathlib import Path
from typing import List, Dict

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("batch")

class BatchGenerator:
    def __init__(self, comfyui_client, metrics_collector):
        self.client = comfyui_client
        self.metrics = metrics_collector
        self.results = []
    
    def generate_batch(self, prompts: List[Dict], output_dir="batch_output"):
        Path(output_dir).mkdir(exist_ok=True)
        
        logger.info(f"Starting batch: {len(prompts)} images")
        batch_start = time.time()
        
        for i, prompt_config in enumerate(prompts):
            logger.info(f"Generating {i+1}/{len(prompts)}: {prompt_config['prompt'][:50]}...")
            
            gen_start = time.time()
            
            try:
                result = self.client.generate_image(
                    prompt=prompt_config["prompt"],
                    negative_prompt=prompt_config.get("negative", ""),
                    width=prompt_config.get("width", 1024),
                    height=prompt_config.get("height", 1024),
                    steps=prompt_config.get("steps", 20),
                    cfg=prompt_config.get("cfg", 7.0),
                    seed=prompt_config.get("seed", -1),
                    model=prompt_config.get("model", "sd_xl_base_1.0.safetensors"),
                )
                
                gen_time = (time.time() - gen_start) * 1000
                
                # Record metrics
                self.metrics.record_generation(
                    prompt_id=result["prompt_id"],
                    model=prompt_config.get("model", "sdxl"),
                    width=prompt_config.get("width", 1024),
                    height=prompt_config.get("height", 1024),
                    steps=prompt_config.get("steps", 20),
                    sampler=prompt_config.get("sampler", "euler"),
                    generation_time_ms=gen_time,
                    seed=result["seed"],
                )
                
                self.results.append({
                    "index": i,
                    "prompt": prompt_config["prompt"],
                    "status": "success",
                    "generation_ms": round(gen_time),
                    "seed": result["seed"],
                    "images": result["images"],
                })
                
            except Exception as e:
                logger.error(f"Failed: {e}")
                self.results.append({
                    "index": i,
                    "prompt": prompt_config["prompt"],
                    "status": "error",
                    "error": str(e),
                })
        
        batch_time = time.time() - batch_start
        
        # Save batch report
        report = {
            "batch_id": datetime.utcnow().strftime("%Y%m%d_%H%M%S"),
            "total_prompts": len(prompts),
            "successful": sum(1 for r in self.results if r["status"] == "success"),
            "failed": sum(1 for r in self.results if r["status"] == "error"),
            "total_time_s": round(batch_time, 1),
            "avg_time_per_image_s": round(batch_time / max(len(prompts), 1), 1),
            "results": self.results,
        }
        
        report_path = Path(output_dir) / "batch_report.json"
        report_path.write_text(json.dumps(report, indent=2))
        
        logger.info(
            f"Batch complete: {report['successful']}/{report['total_prompts']} "
            f"in {report['total_time_s']}s"
        )
        
        return report
    
    def generate_from_csv(self, csv_file, output_dir="batch_output"):
        prompts = []
        
        with open(csv_file) as f:
            reader = csv.DictReader(f)
            for row in reader:
                prompts.append({
                    "prompt": row["prompt"],
                    "negative": row.get("negative_prompt", ""),
                    "width": int(row.get("width", 1024)),
                    "height": int(row.get("height", 1024)),
                    "steps": int(row.get("steps", 20)),
                    "cfg": float(row.get("cfg", 7.0)),
                    "seed": int(row.get("seed", -1)),
                })
        
        return self.generate_batch(prompts, output_dir)
    
    def ab_test_samplers(self, prompt, samplers=None):
        if samplers is None:
            samplers = ["euler", "euler_ancestral", "dpmpp_2m", "dpmpp_sde"]
        
        results = {}
        seed = 42  # fixed seed for fair comparison
        
        for sampler in samplers:
            logger.info(f"Testing sampler: {sampler}")
            gen_start = time.time()
            
            result = self.client.generate_image(
                prompt=prompt,
                seed=seed,
                steps=20,
                sampler=sampler,  # generate_image must forward this to KSampler's sampler_name
            )
            
            gen_time = (time.time() - gen_start) * 1000
            
            results[sampler] = {
                "generation_ms": round(gen_time),
                "images": result["images"],
            }
        
        # Rank by speed
        ranked = sorted(results.items(), key=lambda x: x[1]["generation_ms"])
        
        logger.info("=== Sampler Comparison ===")
        for name, data in ranked:
            logger.info(f"  {name}: {data['generation_ms']}ms")
        
        return results

# batch = BatchGenerator(comfyui_client, metrics_collector)
# prompts = [
#     {"prompt": "a cat sitting on a windowsill, watercolor painting"},
#     {"prompt": "futuristic city at night, cyberpunk, neon lights"},
#     {"prompt": "mountain landscape, sunrise, photorealistic"},
# ]
# batch.generate_batch(prompts)
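generate_from_csv expects a header row matching the keys it reads. A quick sketch of producing such a file programmatically (the filename and helper are illustrative; defaults are filled in so the int/float parsing above never sees empty cells):

```python
#!/usr/bin/env python3
# make_prompts_csv.py — write a prompts CSV in the shape generate_from_csv reads
import csv
from typing import Dict, List

def write_prompts_csv(path: str, rows: List[Dict]) -> int:
    """Write prompt rows with defaults filled in; returns number of data rows."""
    defaults = {"prompt": "", "negative_prompt": "", "width": 1024,
                "height": 1024, "steps": 20, "cfg": 7.0, "seed": -1}
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(defaults))
        writer.writeheader()
        for row in rows:
            # unknown keys are dropped; missing keys get defaults
            writer.writerow({**defaults, **{k: v for k, v in row.items() if k in defaults}})
    return len(rows)

# write_prompts_csv("prompts.csv", [
#     {"prompt": "a red fox in snow", "steps": 25},
# ])
```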

Performance Monitoring and Optimization

Monitoring and optimizing generation performance

# === Prometheus Configuration ===
# prometheus.yml
# global:
#   scrape_interval: 15s
#
# scrape_configs:
#   - job_name: 'comfyui'
#     static_configs:
#       - targets: ['localhost:9090']
#
#   - job_name: 'nvidia_gpu'
#     static_configs:
#       - targets: ['localhost:9400']

# === NVIDIA GPU Exporter (dcgm-exporter) ===
# docker run -d --gpus all \
#   -p 9400:9400 \
#   nvcr.io/nvidia/k8s/dcgm-exporter:latest

# === Performance Optimization Tips ===

# 1. Model Loading Optimization
# - Keep models in VRAM (--highvram flag)
# - Use FP16 models instead of FP32 (half VRAM)
# python main.py --listen 0.0.0.0 --highvram --fp16-vae

# 2. Xformers / Flash Attention
pip install xformers
# python main.py --use-pytorch-cross-attention  # or xformers auto

# 3. TensorRT Acceleration
# pip install tensorrt
# Convert model to TensorRT for 2-3x speedup
# python convert_to_trt.py --model sdxl --height 1024 --width 1024

# 4. Batch Size Optimization
# Test different batch sizes to find optimal for your GPU
# VRAM usage scales linearly with batch size
# RTX 3090 (24GB): batch 1-4 for SDXL, 1-8 for SD1.5
# RTX 4090 (24GB): batch 1-4 for SDXL, 1-8 for SD1.5
# A100 (80GB): batch 1-16 for SDXL

# 5. Resolution Optimization
# Generate at lower resolution, then upscale
# 512x512 -> upscale 4x = 2048x2048 (faster than generating at 2048)

# === Grafana Dashboard Queries ===

# GPU Utilization (a gauge, so average it rather than rate() it)
# avg_over_time(comfyui_gpu_utilization[5m])

# Generation Rate (images/hour)
# rate(comfyui_generations_total[1h]) * 3600

# Average Generation Time
# comfyui_generation_avg_ms

# VRAM Usage %
# comfyui_gpu_vram_used_mb / comfyui_gpu_vram_total_mb * 100

# === Alerting Rules ===
# alerting_rules.yml
# groups:
#   - name: comfyui
#     rules:
#       - alert: GPUHighTemperature
#         expr: comfyui_gpu_temperature > 85
#         for: 5m
#         labels:
#           severity: warning
#         annotations:
#           summary: "GPU temperature above 85C"
#
#       - alert: VRAMNearFull
#         expr: comfyui_gpu_vram_used_mb / comfyui_gpu_vram_total_mb > 0.95
#         for: 1m
#         labels:
#           severity: critical
#         annotations:
#           summary: "GPU VRAM usage above 95%"
#
#       - alert: SlowGeneration
#         expr: comfyui_generation_avg_ms > 30000
#         for: 10m
#         labels:
#           severity: warning
#         annotations:
#           summary: "Average generation time exceeds 30s"

echo "Performance monitoring configured"
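The Prometheus scrape job above assumes the collector's text is actually served over HTTP. A stdlib-only sketch of that endpoint (the port matches the compose file; the wiring to MetricsCollector is an assumption, passed in as a callable):

```python
#!/usr/bin/env python3
# metrics_server.py — serve Prometheus-format text over HTTP (stdlib only)
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

def make_handler(get_metrics_text):
    class MetricsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_response(404)
                self.end_headers()
                return
            body = get_metrics_text().encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):  # keep per-request logs off stdout
            pass
    return MetricsHandler

def serve_metrics(get_metrics_text, port=9090):
    """Start the server on a daemon thread; port 0 picks an ephemeral port."""
    server = HTTPServer(("0.0.0.0", port), make_handler(get_metrics_text))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

# serve_metrics(collector.export_prometheus_metrics, port=9090)
```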

FAQ: Frequently Asked Questions

Q: How is ComfyUI different from Automatic1111?

A: Automatic1111 (A1111) has a web UI that is easier for beginners and a large extensions ecosystem. ComfyUI uses a node-based workflow that is far more flexible: it can build complex pipelines, uses less VRAM thanks to efficient memory management, and is faster. For production workloads ComfyUI is the better fit because of its solid API support, easy automation, and reproducible workflows.

Q: What GPU do I need for Stable Diffusion?

A: The minimum is an NVIDIA GPU with 8GB VRAM (e.g. RTX 3060) for SD 1.5; 12GB+ VRAM (RTX 3060 12GB, RTX 4070) is recommended. SDXL requires 12GB+ VRAM. For production workloads that need fast generation, an RTX 4090 (24GB) or A100 (40/80GB) is appropriate; cloud A100s cost roughly $1-3/hr on-demand. AMD GPUs are supported via ROCm, but performance trails NVIDIA.

Q: How can I track cost per image?

A: Compute it as GPU power consumption (watts) * generation time (hours) * electricity rate (THB/kWh). For example, an RTX 4090 drawing ~450W over a 5-second generation uses 0.000625 kWh; at 5 THB/kWh that is about 0.003 THB/image. For cloud GPUs, divide the hourly rate by images per hour: an A100 at $3/hr generating 720 images/hr works out to $0.004/image. Use the metrics collector to track generation time and compute the cost automatically.
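That arithmetic is easy to wrap in a small helper (the wattage, duration, and rates below are the example's assumptions, not measured values):

```python
#!/usr/bin/env python3
# cost_per_image.py — estimate per-image generation cost
def electricity_cost_per_image(gpu_watts: float, gen_seconds: float,
                               rate_per_kwh: float) -> float:
    """kWh consumed by one generation times the electricity rate."""
    kwh = gpu_watts * (gen_seconds / 3600) / 1000
    return kwh * rate_per_kwh

def cloud_cost_per_image(hourly_rate: float, images_per_hour: float) -> float:
    """Hourly GPU rental cost spread over the hourly image count."""
    return hourly_rate / images_per_hour

# RTX 4090 at ~450 W, 5 s per image, 5 THB/kWh:
# electricity_cost_per_image(450, 5, 5)  -> ~0.003 THB/image
# A100 at $3/hr producing 720 images/hr:
# cloud_cost_per_image(3, 720)           -> ~$0.004/image
```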

Q: How can I monitor the quality of generated images?

A: Use automated quality metrics: FID (Frechet Inception Distance) measures how close the output distribution is to real images, CLIP Score measures prompt alignment, and an aesthetic score (e.g. the LAION aesthetic predictor) measures visual appeal. Store the scores in a metrics database and track them over time; declining scores can indicate model degradation or deteriorating prompt quality. For production, supplement the automated metrics with human evaluation.

📖 Related Articles

Stable Diffusion ComfyUI Multi-cloud Strategy · Read article →
Stable Diffusion ComfyUI Message Queue Design · Read article →
Stable Diffusion ComfyUI Certification Path · Read article →
Stable Diffusion ComfyUI Observability Stack · Read article →