SiamCafe · Blog
Stable Diffusion ComfyUI Metric Collection: Collecting Metrics from AI Image Generation

Published 28 May 2026 (B.E. 2569)

What Are Stable Diffusion and ComfyUI?

Stable Diffusion is an open-source text-to-image model that generates images from text prompts. It was developed by the CompVis group, Runway, and Stability AI. It supports text-to-image, image-to-image, inpainting, and outpainting, and it works with ControlNet to control pose and composition.

ComfyUI is a node-based GUI for Stable Diffusion in which users build image generation workflows by dragging nodes onto a canvas and wiring them together. It is more flexible than Automatic1111 and supports complex workflows such as multi-model pipelines, conditional generation, batch processing, and custom nodes from the community.

Metric collection matters for AI image generation because it lets you track generation speed (images/second), GPU utilization (VRAM usage, compute), model performance (quality scores), cost per image (electricity, cloud GPU costs), and workflow efficiency (bottleneck identification).

Use cases that need metrics include production image generation services, A/B testing of different models and samplers, capacity planning for GPU infrastructure, billing and usage tracking for SaaS, and quality monitoring to detect model degradation.

Installing ComfyUI and Stable Diffusion

How to install ComfyUI together with monitoring

=== Install ComfyUI ===

Prerequisites

  • Python 3.10+
  • NVIDIA GPU with CUDA support
  • Git

Clone ComfyUI

git clone https://github.com/comfyanonymous/ComfyUI.git

cd ComfyUI

Create a virtual environment

python -m venv venv

source venv/bin/activate # Linux/Mac

venv\Scripts\activate # Windows

Install dependencies

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

pip install -r requirements.txt

Install monitoring dependencies

pip install prometheus_client psutil gputil pynvml flask

=== Download Models ===

SDXL 1.0

cd models/checkpoints

wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors

SD 1.5

wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

VAE

cd ../vae

wget https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors

cd ../..

=== Docker Setup with Monitoring ===

docker-compose.yml

services:
  comfyui:
    build: .
    ports:
      - "8188:8188"
      - "9090:9090" # Metrics endpoint
    volumes:
      - ./models:/app/models
      - ./output:/app/output
      - ./custom_nodes:/app/custom_nodes
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    environment:
      PROMETHEUS_METRICS: "true"

  prometheus:
    image: prom/prometheus:latest
    ports: ["9091:9090"]
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin

Start ComfyUI

python main.py --listen 0.0.0.0 --port 8188

Open in a browser: http://localhost:8188

Building a Workflow for Image Generation

Driving a ComfyUI workflow through the API

#!/usr/bin/env python3
# comfyui_workflow.py: drive a ComfyUI workflow through its HTTP API
import logging
import random
import time
import uuid
from typing import Dict

import requests

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("comfyui")


class ComfyUIClient:
    def __init__(self, host="localhost", port=8188):
        self.base_url = f"http://{host}:{port}"
        self.client_id = str(uuid.uuid4())

    def queue_prompt(self, workflow: Dict) -> str:
        """Submit a workflow graph and return the prompt_id assigned by the server."""
        resp = requests.post(
            f"{self.base_url}/prompt",
            json={"prompt": workflow, "client_id": self.client_id},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["prompt_id"]

    def get_history(self, prompt_id: str) -> Dict:
        resp = requests.get(f"{self.base_url}/history/{prompt_id}", timeout=30)
        resp.raise_for_status()
        return resp.json()

    def get_image(self, filename: str, subfolder: str = "",
                  folder_type: str = "output") -> bytes:
        """Download the raw bytes of a generated image."""
        params = {"filename": filename, "subfolder": subfolder, "type": folder_type}
        resp = requests.get(f"{self.base_url}/view", params=params, timeout=60)
        resp.raise_for_status()
        return resp.content

    def wait_for_completion(self, prompt_id: str, timeout: int = 300) -> Dict:
        """Poll /history once per second until outputs appear or the timeout expires."""
        start = time.time()

        while time.time() - start < timeout:
            history = self.get_history(prompt_id)

            if prompt_id in history:
                outputs = history[prompt_id].get("outputs", {})
                if outputs:
                    return outputs

            time.sleep(1)

        raise TimeoutError(f"Generation timed out after {timeout}s")

    def generate_image(self, prompt: str, negative_prompt: str = "",
                       width: int = 1024, height: int = 1024,
                       steps: int = 20, cfg: float = 7.0,
                       seed: int = -1, model: str = "sd_xl_base_1.0.safetensors",
                       sampler: str = "euler"):
        # seed == -1 means "pick a random seed"; randint is inclusive on both ends
        if seed == -1:
            seed = random.randint(0, 2**32 - 1)

        # Node graph in ComfyUI's API format. Links are [source_node_id, output_index].
        workflow = {
            "3": {
                "class_type": "KSampler",
                "inputs": {
                    "seed": seed,
                    "steps": steps,
                    "cfg": cfg,
                    "sampler_name": sampler,
                    "scheduler": "normal",
                    "denoise": 1.0,
                    "model": ["4", 0],
                    "positive": ["6", 0],
                    "negative": ["7", 0],
                    "latent_image": ["5", 0],
                },
            },
            "4": {
                "class_type": "CheckpointLoaderSimple",
                "inputs": {"ckpt_name": model},
            },
            "5": {
                "class_type": "EmptyLatentImage",
                "inputs": {"width": width, "height": height, "batch_size": 1},
            },
            "6": {
                "class_type": "CLIPTextEncode",
                "inputs": {"text": prompt, "clip": ["4", 1]},
            },
            "7": {
                "class_type": "CLIPTextEncode",
                "inputs": {"text": negative_prompt, "clip": ["4", 1]},
            },
            "8": {
                "class_type": "VAEDecode",
                "inputs": {"samples": ["3", 0], "vae": ["4", 2]},
            },
            "9": {
                "class_type": "SaveImage",
                "inputs": {"filename_prefix": "gen", "images": ["8", 0]},
            },
        }

        prompt_id = self.queue_prompt(workflow)
        logger.info(f"Queued generation: {prompt_id}")

        outputs = self.wait_for_completion(prompt_id)

        # Collect the image descriptors (filename, subfolder, type) from every output node
        images = []
        for node_id, output in outputs.items():
            for img in output.get("images", []):
                images.append(img)

        return {"prompt_id": prompt_id, "seed": seed, "images": images}

# client = ComfyUIClient()
# result = client.generate_image(
#     prompt="beautiful sunset over mountains, photorealistic, 8k",
#     negative_prompt="blurry, low quality, text, watermark",
#     width=1024, height=1024, steps=25,
# )
# print(f"Generated: {result['images']}")
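The client above builds its node graph inline, but the same dictionary shape (node id mapped to `class_type` and `inputs`) can also be kept as a template and patched per request. The following is a minimal sketch of that idea; `apply_overrides` is a hypothetical helper, not part of ComfyUI, and the small `template` is a cut-down stand-in for a full workflow.

```python
import copy
from typing import Any, Dict

Workflow = Dict[str, Dict[str, Any]]


def apply_overrides(workflow: Workflow, overrides: Dict[str, Dict[str, Any]]) -> Workflow:
    """Return a copy of `workflow` with inputs replaced per node class.

    `overrides` maps a class_type (e.g. "KSampler") to the input values that
    should be replaced on every node of that class.
    """
    patched = copy.deepcopy(workflow)  # never mutate the caller's template
    for node in patched.values():
        changes = overrides.get(node.get("class_type"), {})
        for name, value in changes.items():
            if name not in node["inputs"]:
                raise KeyError(f"{node['class_type']} has no input {name!r}")
            node["inputs"][name] = value
    return patched


# Cut-down template with the same structure as the workflow dict above
template = {
    "3": {"class_type": "KSampler",
          "inputs": {"seed": 0, "steps": 20, "sampler_name": "euler"}},
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
}

patched = apply_overrides(template, {"KSampler": {"sampler_name": "dpmpp_2m", "seed": 42}})
print(patched["3"]["inputs"])
```

Matching by `class_type` changes every node of that class, so a workflow with two `CLIPTextEncode` nodes (positive and negative prompt) would need overrides keyed by node id instead.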

Metric Collection for AI Image Generation

A system for collecting metrics from ComfyUI

#!/usr/bin/env python3
# metrics_collector.py: ComfyUI metric collection system
import json
import logging
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("metrics")

# Probe the optional dependencies separately so CPU/RAM stats still work
# on machines without an NVIDIA GPU.
try:
    import psutil
    HAS_PSUTIL = True
except ImportError:
    HAS_PSUTIL = False

try:
    import pynvml
    pynvml.nvmlInit()
    HAS_GPU = True
except Exception:
    HAS_GPU = False

EMPTY_GPU_STATS = {"vram_used_mb": 0, "vram_total_mb": 0, "gpu_util": 0, "temp": 0}


@dataclass
class GenerationMetrics:
    prompt_id: str
    timestamp: str
    model: str
    resolution: str
    steps: int
    sampler: str
    generation_time_ms: float
    vram_peak_mb: float  # VRAM in use when the metric is recorded (a sample, not a true peak)
    gpu_utilization: float
    cpu_utilization: float
    seed: int
    prompt_tokens: int = 0


class MetricsCollector:
    def __init__(self, storage_path="metrics"):
        self.storage_path = Path(storage_path)
        self.storage_path.mkdir(parents=True, exist_ok=True)
        self.metrics: List[GenerationMetrics] = []
        self._gpu_handle = None

        if HAS_GPU:
            try:
                self._gpu_handle = pynvml.nvmlDeviceGetHandleByIndex(0)
            except Exception:
                logger.warning("NVML initialised but GPU 0 could not be opened")

    def get_gpu_stats(self):
        if self._gpu_handle is None:
            return dict(EMPTY_GPU_STATS)

        try:
            mem_info = pynvml.nvmlDeviceGetMemoryInfo(self._gpu_handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(self._gpu_handle)
            temp = pynvml.nvmlDeviceGetTemperature(self._gpu_handle, pynvml.NVML_TEMPERATURE_GPU)

            return {
                "vram_used_mb": mem_info.used // (1024 * 1024),
                "vram_total_mb": mem_info.total // (1024 * 1024),
                "gpu_util": util.gpu,
                "temp": temp,
            }
        except Exception:
            return dict(EMPTY_GPU_STATS)

    def get_system_stats(self):
        if not HAS_PSUTIL:
            return {"cpu_percent": 0, "ram_used_mb": 0, "ram_total_mb": 0}

        mem = psutil.virtual_memory()
        return {
            "cpu_percent": psutil.cpu_percent(interval=0.1),
            "ram_used_mb": mem.used // (1024 * 1024),
            "ram_total_mb": mem.total // (1024 * 1024),
        }

    def record_generation(self, prompt_id, model, width, height,
                          steps, sampler, generation_time_ms, seed):
        gpu_stats = self.get_gpu_stats()
        sys_stats = self.get_system_stats()

        metric = GenerationMetrics(
            prompt_id=prompt_id,
            timestamp=datetime.now(timezone.utc).isoformat(),
            model=model,
            resolution=f"{width}x{height}",
            steps=steps,
            sampler=sampler,
            generation_time_ms=round(generation_time_ms, 1),
            vram_peak_mb=gpu_stats["vram_used_mb"],
            gpu_utilization=gpu_stats["gpu_util"],
            cpu_utilization=sys_stats["cpu_percent"],
            seed=seed,
        )

        self.metrics.append(metric)
        self._save_metric(metric)

        logger.info(
            f"Gen: {metric.resolution} {metric.steps}steps "
            f"{metric.generation_time_ms:.0f}ms VRAM:{metric.vram_peak_mb}MB"
        )

        return metric

    def _save_metric(self, metric):
        # One JSON Lines file per UTC day
        date_str = datetime.now(timezone.utc).strftime("%Y-%m-%d")
        filepath = self.storage_path / f"metrics_{date_str}.jsonl"

        with open(filepath, "a", encoding="utf-8") as f:
            f.write(json.dumps(asdict(metric)) + "\n")

    def get_summary(self, hours=24):
        # Only generations recorded by this process within the last `hours` hours
        cutoff = datetime.now(timezone.utc) - timedelta(hours=hours)
        recent = [m for m in self.metrics
                  if datetime.fromisoformat(m.timestamp) >= cutoff]

        if not recent:
            return {"total_generations": 0}

        gen_times = [m.generation_time_ms for m in recent]
        vram_usage = [m.vram_peak_mb for m in recent]

        models_used = {}
        resolutions_used = {}
        for m in recent:
            models_used[m.model] = models_used.get(m.model, 0) + 1
            resolutions_used[m.resolution] = resolutions_used.get(m.resolution, 0) + 1

        return {
            "total_generations": len(recent),
            "avg_generation_ms": round(sum(gen_times) / len(gen_times), 1),
            "min_generation_ms": round(min(gen_times), 1),
            "max_generation_ms": round(max(gen_times), 1),
            "avg_vram_mb": round(sum(vram_usage) / len(vram_usage), 0),
            "peak_vram_mb": round(max(vram_usage), 0),
            "models_used": models_used,
            "resolutions_used": resolutions_used,
            "images_per_hour": round(len(recent) / max(hours, 1), 1),
        }

    def export_prometheus_metrics(self):
        summary = self.get_summary()
        gpu = self.get_gpu_stats()

        lines = [
            # Count since process start, so the value only grows and rate() stays meaningful
            f"comfyui_generations_total {len(self.metrics)}",
            f"comfyui_generation_avg_ms {summary.get('avg_generation_ms', 0)}",
            f"comfyui_gpu_vram_used_mb {gpu['vram_used_mb']}",
            f"comfyui_gpu_vram_total_mb {gpu['vram_total_mb']}",
            f"comfyui_gpu_utilization {gpu['gpu_util']}",
            f"comfyui_gpu_temperature {gpu['temp']}",
        ]

        # The Prometheus text format expects the last line to end with a newline
        return "\n".join(lines) + "\n"

# collector = MetricsCollector()
# collector.record_generation("abc123", "sdxl", 1024, 1024, 20, "euler", 5200, 42)
# print(json.dumps(collector.get_summary(), indent=2))
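`export_prometheus_metrics()` only returns a string, so something still has to serve it on the port that Prometheus scrapes. One way to do that with only the standard library is sketched below. `make_metrics_server` is a hypothetical helper and `fake_render` stands in for `collector.export_prometheus_metrics`; the demo binds to port 0 (any free port) and scrapes itself once.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Callable


def make_metrics_server(render: Callable[[], str], host: str = "127.0.0.1",
                        port: int = 9090) -> ThreadingHTTPServer:
    """Build an HTTP server that answers GET /metrics with render()'s output."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/metrics":
                self.send_error(404)
                return
            body = render().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep scrape requests out of the console

    return ThreadingHTTPServer((host, port), Handler)


def fake_render() -> str:
    # Stand-in for collector.export_prometheus_metrics
    return "comfyui_generations_total 3\n"


server = make_metrics_server(fake_render, port=0)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_address[1]}/metrics"
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))  # bypass any proxy
with opener.open(url, timeout=5) as resp:
    scraped = resp.read().decode("utf-8")

server.shutdown()
server.server_close()
print(scraped, end="")
```

Inside a container the server would need `host="0.0.0.0"` so that the published port is reachable from outside.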

Automation and Batch Processing

Batch generation with metrics tracking

#!/usr/bin/env python3
# batch_generator.py: batch image generation with metrics
import csv
import json
import logging
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("batch")

DEFAULT_MODEL = "sd_xl_base_1.0.safetensors"


class BatchGenerator:
    def __init__(self, comfyui_client, metrics_collector):
        self.client = comfyui_client
        self.metrics = metrics_collector
        self.results = []

    def generate_batch(self, prompts: List[Dict], output_dir="batch_output"):
        Path(output_dir).mkdir(parents=True, exist_ok=True)
        self.results = []  # start clean so the report covers only this batch

        logger.info(f"Starting batch: {len(prompts)} images")
        batch_start = time.time()

        for i, prompt_config in enumerate(prompts):
            logger.info(f"Generating {i+1}/{len(prompts)}: {prompt_config['prompt'][:50]}...")

            # Resolve defaults once so generation and metrics see the same values
            width = prompt_config.get("width", 1024)
            height = prompt_config.get("height", 1024)
            steps = prompt_config.get("steps", 20)
            model = prompt_config.get("model", DEFAULT_MODEL)
            sampler = prompt_config.get("sampler", "euler")

            gen_start = time.time()

            try:
                # Requires generate_image() to accept a `sampler` keyword
                result = self.client.generate_image(
                    prompt=prompt_config["prompt"],
                    negative_prompt=prompt_config.get("negative", ""),
                    width=width,
                    height=height,
                    steps=steps,
                    cfg=prompt_config.get("cfg", 7.0),
                    seed=prompt_config.get("seed", -1),
                    model=model,
                    sampler=sampler,
                )

                gen_time = (time.time() - gen_start) * 1000

                # Record metrics
                self.metrics.record_generation(
                    prompt_id=result["prompt_id"],
                    model=model,
                    width=width,
                    height=height,
                    steps=steps,
                    sampler=sampler,
                    generation_time_ms=gen_time,
                    seed=result["seed"],
                )

                self.results.append({
                    "index": i,
                    "prompt": prompt_config["prompt"],
                    "status": "success",
                    "generation_ms": round(gen_time),
                    "seed": result["seed"],
                    "images": result["images"],
                })

            except Exception as e:
                logger.error(f"Failed: {e}")
                self.results.append({
                    "index": i,
                    "prompt": prompt_config["prompt"],
                    "status": "error",
                    "error": str(e),
                })

        batch_time = time.time() - batch_start

        # Save batch report
        report = {
            "batch_id": datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S"),
            "total_prompts": len(prompts),
            "successful": sum(1 for r in self.results if r["status"] == "success"),
            "failed": sum(1 for r in self.results if r["status"] == "error"),
            "total_time_s": round(batch_time, 1),
            "avg_time_per_image_s": round(batch_time / max(len(prompts), 1), 1),
            "results": self.results,
        }

        report_path = Path(output_dir) / "batch_report.json"
        report_path.write_text(json.dumps(report, indent=2), encoding="utf-8")

        logger.info(
            f"Batch complete: {report['successful']}/{report['total_prompts']} "
            f"in {report['total_time_s']}s"
        )

        return report

    def generate_from_csv(self, csv_file, output_dir="batch_output"):
        prompts = []

        with open(csv_file, newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            for row in reader:
                # `or` also covers columns that exist but are left empty
                prompts.append({
                    "prompt": row["prompt"],
                    "negative": row.get("negative_prompt") or "",
                    "width": int(row.get("width") or 1024),
                    "height": int(row.get("height") or 1024),
                    "steps": int(row.get("steps") or 20),
                    "cfg": float(row.get("cfg") or 7.0),
                    "seed": int(row.get("seed") or -1),
                })

        return self.generate_batch(prompts, output_dir)

    def ab_test_samplers(self, prompt, samplers=None):
        if samplers is None:
            samplers = ["euler", "euler_ancestral", "dpmpp_2m", "dpmpp_sde"]

        results = {}
        seed = 42  # fixed seed for fair comparison

        # NOTE: the first run also pays the model-loading cost; do a warm-up
        # generation beforehand if the timings need to be comparable.
        for sampler in samplers:
            logger.info(f"Testing sampler: {sampler}")
            gen_start = time.time()

            # Requires generate_image() to accept a `sampler` keyword;
            # without it every run would use the same sampler.
            result = self.client.generate_image(
                prompt=prompt,
                seed=seed,
                steps=20,
                sampler=sampler,
            )

            gen_time = (time.time() - gen_start) * 1000

            results[sampler] = {
                "generation_ms": round(gen_time),
                "images": result["images"],
            }

        # Rank by speed
        ranked = sorted(results.items(), key=lambda x: x[1]["generation_ms"])

        logger.info("=== Sampler Comparison ===")
        for name, data in ranked:
            logger.info(f" {name}: {data['generation_ms']}ms")

        return results

# batch = BatchGenerator(comfyui_client, metrics_collector)
# prompts = [
#     {"prompt": "a cat sitting on a windowsill, watercolor painting"},
#     {"prompt": "futuristic city at night, cyberpunk, neon lights"},
#     {"prompt": "mountain landscape, sunrise, photorealistic"},
# ]
# batch.generate_batch(prompts)
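The batch report only carries an average time per image, which hides slow outliers such as the first generation after a model load. As a rough sketch, percentiles can be computed from the per-image `generation_ms` values in the report's `results` list. `percentile` and `latency_summary` are hypothetical helpers, and the sample data below is made up for illustration.

```python
import math
from typing import Dict, List, Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_summary(results: List[Dict]) -> Dict[str, float]:
    """Summarise generation_ms over the successful entries of a batch report."""
    times = [r["generation_ms"] for r in results if r.get("status") == "success"]
    return {
        "count": len(times),
        "p50_ms": percentile(times, 50),
        "p95_ms": percentile(times, 95),
        "max_ms": max(times),
    }


# Made-up report entries: ten successes from 1 s to 10 s plus one failure
results = [{"status": "success", "generation_ms": ms} for ms in range(1000, 11000, 1000)]
results.append({"status": "error", "error": "timeout"})

summary = latency_summary(results)
print(summary)
```

Failed entries are skipped because they have no `generation_ms`; a batch with no successes raises `ValueError` rather than reporting zeros.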

Performance Monitoring and Optimization

Monitoring and optimizing generation performance

=== Prometheus Configuration ===

prometheus.yml

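global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'comfyui'
    static_configs:
      # Compose service name; use localhost:9090 if Prometheus runs directly on the host
      - targets: ['comfyui:9090']

  - job_name: 'nvidia_gpu'
    static_configs:
      # Must be an address the Prometheus process can reach
      - targets: ['localhost:9400']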

=== NVIDIA GPU Exporter (dcgm-exporter) ===

docker run -d --gpus all \
  -p 9400:9400 \
  nvcr.io/nvidia/k8s/dcgm-exporter:latest

=== Performance Optimization Tips ===

1. Model Loading Optimization

  • Keep models in VRAM (--highvram flag)
  • Use FP16 weights instead of FP32 (roughly halves the VRAM needed for weights)

python main.py --listen 0.0.0.0 --highvram --fp16-vae

2. xFormers / PyTorch Attention

pip install xformers

python main.py --use-pytorch-cross-attention # or install xformers, which is picked up automatically

3. TensorRT Acceleration

pip install tensorrt

Convert the model to TensorRT (a 2-3x speedup is the figure usually quoted; measure on your own GPU).
convert_to_trt.py stands in for a conversion script of your own.

python convert_to_trt.py --model sdxl --height 1024 --width 1024

4. Batch Size Optimization

Test different batch sizes to find the best one for your GPU.

VRAM usage grows roughly linearly with batch size, on top of the fixed cost of the model weights.

RTX 3090 (24GB): batch 1-4 for SDXL, 1-8 for SD1.5

RTX 4090 (24GB): batch 1-4 for SDXL, 1-8 for SD1.5

A100 (80GB): batch 1-16 for SDXL

5. Resolution Optimization

Generate at a lower resolution, then upscale.

512x512 -> upscale 4x = 2048x2048 (faster than generating at 2048)
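If VRAM really does grow linearly with batch size, two measurements are enough to estimate the largest batch that fits. The sketch below assumes exactly that linear model; `max_batch_size` is a hypothetical helper and the MB figures are illustrative, not benchmark results.

```python
def max_batch_size(vram_b1_mb: float, vram_b2_mb: float, budget_mb: float) -> int:
    """Estimate the largest batch that fits in budget_mb.

    Assumes vram(batch) = fixed + per_image * batch, fitted from the VRAM
    measured at batch size 1 and batch size 2.
    """
    per_image = vram_b2_mb - vram_b1_mb
    if per_image <= 0:
        raise ValueError("VRAM at batch 2 must exceed VRAM at batch 1")
    fixed = vram_b1_mb - per_image  # model weights and other batch-independent cost
    return max(0, int((budget_mb - fixed) // per_image))


# Illustrative numbers: 9000 MB at batch 1, 11500 MB at batch 2,
# and a budget of 90% of a 24000 MB card to leave headroom.
estimate = max_batch_size(9000, 11500, 24000 * 0.9)
print(estimate)
```

The estimate is only a starting point: confirm it by running the candidate batch size and watching the VRAM metric, since real memory use is not perfectly linear.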

=== Grafana Dashboard Queries ===

GPU Utilization

avg_over_time(comfyui_gpu_utilization[5m])

Generation Rate (images/hour)

rate(comfyui_generations_total[1h]) * 3600

Average Generation Time

comfyui_generation_avg_ms

VRAM Usage %

comfyui_gpu_vram_used_mb / comfyui_gpu_vram_total_mb * 100

=== Alerting Rules ===

alerting_rules.yml

groups:
  - name: comfyui
    rules:
      - alert: GPUHighTemperature
        expr: comfyui_gpu_temperature > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "GPU temperature above 85C"

      - alert: VRAMNearFull
        expr: comfyui_gpu_vram_used_mb / comfyui_gpu_vram_total_mb > 0.95
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "GPU VRAM usage above 95%"

      - alert: SlowGeneration
        expr: comfyui_generation_avg_ms > 30000
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Average generation time exceeds 30s"

echo "Performance monitoring configured"

FAQ: Frequently Asked Questions

Q: How do ComfyUI and Automatic1111 differ?

A: Automatic1111 (A1111) has a web UI that is easier for beginners and a large extensions ecosystem. ComfyUI uses a node-based workflow that is much more flexible and can express complex pipelines. It also generally uses less VRAM, thanks to efficient memory management, and runs faster. For production workloads ComfyUI is the better fit because of its good API support, easy automation, and reproducible workflows.

Q: What GPU do I need for Stable Diffusion?

A: The minimum for SD 1.5 is an NVIDIA GPU with 8GB of VRAM (RTX 3060); 12GB or more is recommended (RTX 3060 12GB, RTX 4070). SDXL needs 12GB or more. For production workloads that must generate quickly, an RTX 4090 (24GB) or an A100 (40/80GB) is a good fit. A cloud A100 costs roughly $1-3/hr on demand. AMD GPUs are supported through ROCm, but performance is typically lower than on NVIDIA.

Q: How can I track cost per image?

A: Multiply GPU power draw (kW) by generation time (hours) and by the electricity rate (THB/kWh). For example, an RTX 4090 drawing about 450 W for a 5-second generation uses 0.000625 kWh; at 5 THB/kWh that is about 0.003 THB per image. For cloud GPUs, divide the hourly rate by the number of images per hour: an A100 at $3/hr generating 720 images/hr costs about $0.004 per image. Use the metrics collector to track generation time, then compute the cost automatically.
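The two formulas in this answer translate directly into code. This is a small sketch using the same example figures (450 W, 5 seconds, 5 THB/kWh, and $3/hr at 720 images/hr); the function names are illustrative.

```python
def electricity_cost_per_image(power_watts: float, seconds_per_image: float,
                               rate_per_kwh: float) -> float:
    """Energy cost of one image: kW x hours x price per kWh."""
    kwh = (power_watts / 1000) * (seconds_per_image / 3600)
    return kwh * rate_per_kwh


def cloud_cost_per_image(hourly_rate: float, images_per_hour: float) -> float:
    """Cloud GPU cost of one image: hourly price spread over the hourly output."""
    return hourly_rate / images_per_hour


local_thb = electricity_cost_per_image(450, 5, 5)  # RTX 4090 example, in THB
cloud_usd = cloud_cost_per_image(3, 720)           # A100 example, in USD

print(f"local: {local_thb:.4f} THB/image")
print(f"cloud: {cloud_usd:.4f} USD/image")
```

Feeding `generation_time_ms / 1000` from each recorded metric into `electricity_cost_per_image` gives a per-image cost that can be stored next to the other fields.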

Q: How can I monitor the quality of generated images?

A: Use automated quality metrics. FID (Frechet Inception Distance) measures how close the distribution of generated images is to that of real images, CLIP Score measures how well an image matches its prompt, and an aesthetic score, for example from the LAION aesthetic predictor, rates visual appeal. Store the scores in the metrics database and track them over time. Falling scores may point to model degradation or to declining prompt quality. For production, use human evaluation supplemented by the automated metrics.
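Tracking scores over time can be as simple as comparing a recent window against a baseline window. The sketch below flags a relative drop in the mean score; `score_dropped`, the 5% tolerance, and the sample scores are all illustrative assumptions, and computing the scores themselves (CLIP, aesthetic predictor) is out of scope here.

```python
from statistics import mean
from typing import Sequence


def score_dropped(baseline: Sequence[float], recent: Sequence[float],
                  tolerance: float = 0.05) -> bool:
    """True if the mean of `recent` is more than `tolerance` (relative) below the baseline mean."""
    base = mean(baseline)
    if base <= 0:
        raise ValueError("baseline mean must be positive")
    return (base - mean(recent)) / base > tolerance


# Made-up scores for illustration (higher is better)
baseline_scores = [0.30, 0.32, 0.31]
degraded = score_dropped(baseline_scores, [0.27, 0.28, 0.26])
stable = score_dropped(baseline_scores, [0.31, 0.30, 0.31])
print(degraded, stable)
```

This applies to metrics where higher is better, such as CLIP Score; for FID, where lower is better, the comparison would be reversed.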