SigNoz Observability Clean Architecture

2025-06-30 · Ajarn Bom, SiamCafe.net · 1,585 words

What Is SigNoz Observability with Clean Architecture?

SigNoz is an open-source observability platform that brings metrics, traces, and logs together in one place. It is an alternative to Datadog and New Relic that avoids expensive license fees. It uses OpenTelemetry as the standard for collecting telemetry and ClickHouse, a fast column-oriented database, as its storage backend. Clean Architecture is a software design principle that separates business logic from infrastructure, which makes code testable, maintainable, and flexible. Combining SigNoz observability with Clean Architecture makes a system observable without compromising the structure of the code.

SigNoz Architecture

# signoz_arch.py: SigNoz architecture overview

class SigNozArchitecture:
    COMPONENTS = {
        "otel_collector": {
            "name": "OpenTelemetry Collector",
            "role": "Receives telemetry data (metrics, traces, logs) from applications",
            "protocol": "OTLP (gRPC/HTTP), Jaeger, Zipkin, Prometheus",
        },
        "query_service": {
            "name": "Query Service",
            "role": "API that queries data from ClickHouse and serves it to the frontend",
            "tech": "Go-based, REST API",
        },
        "clickhouse": {
            "name": "ClickHouse",
            "role": "Storage backend: a column-oriented DB that is fast for time-series data",
            "benefit": "Fast trace/log queries, good compression, scalable",
        },
        "frontend": {
            "name": "Frontend (React)",
            "role": "Dashboard UI that shows metrics, traces, logs, and alerts",
            "features": "Service map, trace waterfall, log explorer, custom dashboards",
        },
        "alertmanager": {
            "name": "Alert Manager",
            "role": "Manages alerts and sends notifications via Slack, PagerDuty, or email",
        },
    }

    PILLARS = {
        "metrics": {
            "name": "Metrics",
            "description": "Measurable numbers: request count, latency, error rate, CPU, memory",
            "signoz": "Sent via OTLP → stored in ClickHouse → shown on dashboards",
        },
        "traces": {
            "name": "Traces (Distributed Tracing)",
            "description": "Follows a request across services to show the latency of each service",
            "signoz": "OpenTelemetry auto-instrumentation → trace waterfall view",
        },
        "logs": {
            "name": "Logs",
            "description": "Log messages from applications, structured or unstructured",
            "signoz": "Sent via OTLP/Fluentd → ClickHouse → Log Explorer",
        },
    }

    def show_components(self):
        print("=== SigNoz Components ===\n")
        for key, comp in self.COMPONENTS.items():
            print(f"[{comp['name']}]")
            print(f"  {comp['role']}")
            print()

    def show_pillars(self):
        print("=== Three Pillars ===")
        for key, pillar in self.PILLARS.items():
            print(f"\n[{pillar['name']}] {pillar['description']}")

arch = SigNozArchitecture()
arch.show_components()
arch.show_pillars()

Clean Architecture Principles

# clean_arch.py: Clean Architecture with observability

class CleanArchitecture:
    LAYERS = {
        "entities": {
            "name": "Entities (Domain)",
            "description": "Business rules and domain objects, independent of any framework or DB",
            "observability": "No observability code belongs in this layer",
        },
        "use_cases": {
            "name": "Use Cases (Application)",
            "description": "Application business rules — orchestrate entities",
            "observability": "Trace spans สำหรับ business operations, business metrics",
        },
        "interface_adapters": {
            "name": "Interface Adapters",
            "description": "Controllers, Presenters, Gateways that convert data between layers",
            "observability": "HTTP metrics, request/response logging, error tracking",
        },
        "frameworks": {
            "name": "Frameworks & Drivers",
            "description": "Web framework, DB, external services — outermost layer",
            "observability": "Auto-instrumentation: HTTP, DB queries, external calls",
        },
    }

    DEPENDENCY_RULE = {
        "rule": "Dependencies point inward — outer layers depend on inner layers, never the reverse",
        "observability_approach": "Inject observability via interfaces — inner layers define interfaces, outer layers implement with tracing/metrics",
    }

    def show_layers(self):
        print("=== Clean Architecture Layers ===\n")
        for key, layer in self.LAYERS.items():
            print(f"[{layer['name']}]")
            print(f"  {layer['description']}")
            print(f"  Observability: {layer['observability']}")
            print()

    def show_rule(self):
        print("=== Dependency Rule ===")
        print(f"  Rule: {self.DEPENDENCY_RULE['rule']}")
        print(f"  Approach: {self.DEPENDENCY_RULE['observability_approach']}")

ca = CleanArchitecture()
ca.show_layers()
ca.show_rule()
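
One way to keep the dependency rule honest is an automated check that inner-layer modules never import infrastructure packages. The sketch below is only an illustration under stated assumptions: the `FORBIDDEN_IN_INNER_LAYERS` set and the `find_forbidden_imports` helper are hypothetical names, not part of SigNoz or OpenTelemetry, and a real project would scan its own entity and use-case files rather than inline strings.

```python
import ast

# Hypothetical rule set: packages the inner layers must not import.
FORBIDDEN_IN_INNER_LAYERS = {"opentelemetry", "fastapi", "sqlalchemy"}


def find_forbidden_imports(source: str, forbidden=FORBIDDEN_IN_INNER_LAYERS) -> list:
    """Return the forbidden top-level packages imported by `source`, sorted."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue  # not an absolute import statement
        for name in names:
            top_level = name.split(".")[0]
            if top_level in forbidden:
                found.add(top_level)
    return sorted(found)


# An entity module that stays clean, and one that leaks infrastructure.
clean_entity = "from dataclasses import dataclass\n"
leaky_entity = "from opentelemetry import trace\nimport sqlalchemy.orm\n"

print(find_forbidden_imports(clean_entity))  # []
print(find_forbidden_imports(leaky_entity))  # ['opentelemetry', 'sqlalchemy']
```

Run as part of a test suite, a check like this fails as soon as someone imports a tracing or database library from the entities layer.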

Python Implementation

# implementation.py: Clean Architecture with SigNoz observability

class CleanArchObservability:
    CODE = """
# clean_arch_observability.py — Clean Architecture with OpenTelemetry
from opentelemetry import trace, metrics
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

# === Setup OpenTelemetry (Frameworks layer) ===
def setup_telemetry(service_name="my-service", endpoint="http://signoz:4317"):
    # Attach service.name so telemetry is grouped under this service
    resource = Resource.create({"service.name": service_name})

    # Traces: batch spans and export them over OTLP/gRPC
    trace_provider = TracerProvider(resource=resource)
    trace_provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint=endpoint, insecure=True))
    )
    trace.set_tracer_provider(trace_provider)

    # Metrics: export periodically over OTLP/gRPC
    metric_reader = PeriodicExportingMetricReader(
        OTLPMetricExporter(endpoint=endpoint, insecure=True)
    )
    meter_provider = MeterProvider(resource=resource, metric_readers=[metric_reader])
    metrics.set_meter_provider(meter_provider)

# === Entities Layer (no observability) ===
@dataclass
class Order:
    id: str
    user_id: str
    amount: float
    status: str = "pending"
    
    def validate(self):
        if self.amount <= 0:
            raise ValueError("Amount must be positive")
        return True

# === Use Case Layer (observability via interface) ===
class OrderRepository(ABC):
    @abstractmethod
    def save(self, order: Order) -> bool:
        pass
    
    @abstractmethod
    def find_by_id(self, order_id: str) -> Optional[Order]:
        pass

class PaymentGateway(ABC):
    @abstractmethod
    def charge(self, user_id: str, amount: float) -> bool:
        pass

class Telemetry(ABC):
    '''Observability interface — defined in use case layer'''
    @abstractmethod
    def start_span(self, name: str):
        pass
    
    @abstractmethod
    def record_metric(self, name: str, value: float, labels: dict = None):
        pass

class CreateOrderUseCase:
    def __init__(self, repo: OrderRepository, payment: PaymentGateway, telemetry: Telemetry):
        self.repo = repo
        self.payment = payment
        self.telemetry = telemetry
    
    def execute(self, user_id: str, amount: float) -> Order:
        with self.telemetry.start_span("create_order"):
            # Create order (placeholder ID; generate a unique one in real code)
            order = Order(id="order_123", user_id=user_id, amount=amount)
            order.validate()
            
            # Process payment
            with self.telemetry.start_span("process_payment"):
                success = self.payment.charge(user_id, amount)
                if not success:
                    self.telemetry.record_metric("order_payment_failed", 1)
                    raise Exception("Payment failed")
            
            # Save order
            with self.telemetry.start_span("save_order"):
                order.status = "confirmed"
                self.repo.save(order)
            
            self.telemetry.record_metric("order_created", 1, {"status": "success"})
            self.telemetry.record_metric("order_amount", amount)
            
            return order

# === Interface Adapter Layer (implements observability) ===
class OpenTelemetryAdapter(Telemetry):
    '''Implements Telemetry interface using OpenTelemetry'''
    def __init__(self, service_name="order-service"):
        self.tracer = trace.get_tracer(service_name)
        self.meter = metrics.get_meter(service_name)
        self._counters = {}
        self._histograms = {}
    
    def start_span(self, name: str):
        return self.tracer.start_as_current_span(name)
    
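    def record_metric(self, name: str, value: float, labels: dict = None):
        # Every metric is a monotonic counter here, so "order_amount" accumulates
        # a running total; use a histogram if you need averages or percentiles.
        if name not in self._counters:
            self._counters[name] = self.meter.create_counter(name)
        self._counters[name].add(value, labels or {})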

# === Frameworks Layer (wiring) ===
# setup_telemetry("order-service", "http://signoz:4317")
# telemetry = OpenTelemetryAdapter("order-service")
# use_case = CreateOrderUseCase(repo, payment, telemetry)
# order = use_case.execute("user_1", 99.99)
"""

    def show_code(self):
        print("=== Clean Architecture + Observability ===")
        print(self.CODE[:600])

impl = CleanArchObservability()
impl.show_code()
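
Because the use case depends only on the `Telemetry` interface, it can be unit tested without OpenTelemetry or a running SigNoz instance. The following is a minimal sketch of that idea, not the article's implementation: `InMemoryTelemetry` and `ChargeCustomerUseCase` are hypothetical names, and the use case is a deliberately reduced stand-in for `CreateOrderUseCase` so the example stays self-contained.

```python
from abc import ABC, abstractmethod
from contextlib import contextmanager


class Telemetry(ABC):
    """Same shape as the Telemetry interface in the listing above."""

    @abstractmethod
    def start_span(self, name: str): ...

    @abstractmethod
    def record_metric(self, name: str, value: float, labels: dict = None): ...


class InMemoryTelemetry(Telemetry):
    """Hypothetical test double: records spans and metrics in plain lists."""

    def __init__(self):
        self.spans = []
        self.metrics = []

    @contextmanager
    def start_span(self, name: str):
        self.spans.append(name)
        yield name

    def record_metric(self, name: str, value: float, labels: dict = None):
        self.metrics.append((name, value, labels or {}))


class ChargeCustomerUseCase:
    """Simplified stand-in for CreateOrderUseCase, to keep the sketch short."""

    def __init__(self, charge, telemetry: Telemetry):
        self.charge = charge  # callable(user_id, amount) -> bool
        self.telemetry = telemetry

    def execute(self, user_id: str, amount: float) -> bool:
        with self.telemetry.start_span("process_payment"):
            ok = self.charge(user_id, amount)
            metric = "order_created" if ok else "order_payment_failed"
            self.telemetry.record_metric(metric, 1)
            return ok


telemetry = InMemoryTelemetry()
# Fake payment gateway: approves any charge below 100.
use_case = ChargeCustomerUseCase(lambda user_id, amount: amount < 100, telemetry)

use_case.execute("user_1", 99.99)
use_case.execute("user_2", 500.0)
print(telemetry.spans)                              # ['process_payment', 'process_payment']
print([name for name, _, _ in telemetry.metrics])   # ['order_created', 'order_payment_failed']
```

In production wiring the same use case would receive the OpenTelemetry-backed adapter instead; only the object passed to the constructor changes.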

SigNoz Dashboard & Alerts

# dashboard.py: SigNoz dashboard and alerting

class SigNozDashboard:
    DASHBOARDS = {
        "service_overview": {
            "name": "Service Overview",
            "panels": [
                "P99 Latency by service",
                "Error rate by service",
                "Request rate (RPM)",
                "Service dependency map",
            ],
        },
        "business_metrics": {
            "name": "Business Metrics",
            "panels": [
                "Orders created per minute",
                "Payment success/failure rate",
                "Average order amount",
                "Revenue per hour",
            ],
        },
        "infrastructure": {
            "name": "Infrastructure",
            "panels": [
                "CPU/Memory usage by pod",
                "Database query latency",
                "External API response times",
                "Queue depth and processing time",
            ],
        },
    }

    ALERTS = {
        "high_error_rate": {
            "name": "High Error Rate",
            "condition": "error_rate > 5% for 5 minutes",
            "severity": "critical",
            "channel": "PagerDuty + Slack #incidents",
        },
        "high_latency": {
            "name": "High P99 Latency",
            "condition": "p99_latency > 2000ms for 10 minutes",
            "severity": "warning",
            "channel": "Slack #alerts",
        },
        "payment_failures": {
            "name": "Payment Failure Spike",
            "condition": "payment_failed_rate > 3% for 3 minutes",
            "severity": "critical",
            "channel": "PagerDuty + Slack #payments",
        },
    }

    def show_dashboards(self):
        print("=== SigNoz Dashboards ===\n")
        for key, dash in self.DASHBOARDS.items():
            print(f"[{dash['name']}]")
            for panel in dash['panels']:
                print(f"  • {panel}")
            print()

    def show_alerts(self):
        print("=== Alerts ===")
        for key, alert in self.ALERTS.items():
            print(f"\n[{alert['name']}] ({alert['severity']})")
            print(f"  Condition: {alert['condition']}")
            print(f"  Channel: {alert['channel']}")

dash = SigNozDashboard()
dash.show_dashboards()
dash.show_alerts()
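
The alert conditions above follow a "threshold held for N minutes" pattern. SigNoz evaluates alert rules itself, so the sketch below is only a conceptual illustration of that pattern, not SigNoz's evaluator: `AlertRule` and `is_firing` are hypothetical names, and the input is assumed to be one sample per minute, oldest first.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    name: str
    threshold: float   # e.g. 0.05 for 5%
    for_minutes: int   # how long the condition must hold


def is_firing(rule: AlertRule, per_minute_values: list) -> bool:
    """True if the last `for_minutes` samples all exceed the threshold."""
    window = per_minute_values[-rule.for_minutes:]
    if len(window) < rule.for_minutes:
        return False  # not enough data to cover the whole window
    return all(value > rule.threshold for value in window)


high_error_rate = AlertRule("High Error Rate", threshold=0.05, for_minutes=5)

brief_spike = [0.01, 0.02, 0.09, 0.08, 0.01, 0.02]  # recovers within the window
sustained = [0.01, 0.06, 0.07, 0.09, 0.08, 0.06]    # above 5% for five minutes

print(is_firing(high_error_rate, brief_spike))  # False
print(is_firing(high_error_rate, sustained))    # True
```

Requiring the condition to hold for the whole window is what keeps a short spike from paging anyone, which is why the critical payment alert uses a shorter window than the latency warning.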

Docker Compose Setup

# setup.py: SigNoz Docker Compose setup

class SigNozSetup:
    DOCKER_COMPOSE = """
# docker-compose.yaml — SigNoz with application
version: '3.8'

services:
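  # SigNoz (simplified single-service stand-in for dev; a real deployment runs
  # several containers, including ClickHouse and the OTel Collector, so start
  # from the compose file in the SigNoz repository)
  signoz:
    image: signoz/signoz:latest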
    ports:
      - "3301:3301"   # Frontend
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
    volumes:
      - signoz-data:/var/lib/signoz
    environment:
      - SIGNOZ_INSTRUMENTATION_KEY=your-key

  # Application with OpenTelemetry
  app:
    build: .
    ports:
      - "8000:8000"
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://signoz:4317
      - OTEL_SERVICE_NAME=order-service
      - OTEL_RESOURCE_ATTRIBUTES=deployment.environment=dev
    depends_on:
      - signoz

volumes:
  signoz-data:
"""

    INSTRUMENTATION = """
# requirements.txt additions for OpenTelemetry
opentelemetry-api
opentelemetry-sdk
opentelemetry-exporter-otlp-proto-grpc
opentelemetry-instrumentation-fastapi
opentelemetry-instrumentation-sqlalchemy
opentelemetry-instrumentation-requests
opentelemetry-instrumentation-redis
"""

    def show_compose(self):
        print("=== Docker Compose ===")
        print(self.DOCKER_COMPOSE[:400])

    def show_instrumentation(self):
        print("\n=== Dependencies ===")
        print(self.INSTRUMENTATION[:300])

setup = SigNozSetup()
setup.show_compose()
setup.show_instrumentation()
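
The `app` service in the compose file is configured purely through `OTEL_*` environment variables, which the OpenTelemetry SDK reads on its own. The sketch below only illustrates the `key=value,key=value` format of `OTEL_RESOURCE_ATTRIBUTES`; `parse_resource_attributes` is a hypothetical helper, it ignores details such as percent-encoded values, and application code normally does not need to parse these variables itself.

```python
def parse_resource_attributes(raw: str) -> dict:
    """Parse 'key1=value1,key2=value2' into a dict, skipping malformed pairs."""
    attributes = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep and key.strip():
            attributes[key.strip()] = value.strip()
    return attributes


# Values mirror the `app` service in the compose file above.
env = {
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://signoz:4317",
    "OTEL_SERVICE_NAME": "order-service",
    "OTEL_RESOURCE_ATTRIBUTES": "deployment.environment=dev",
}

service_name = env.get("OTEL_SERVICE_NAME", "unknown_service")
attributes = parse_resource_attributes(env.get("OTEL_RESOURCE_ATTRIBUTES", ""))

print(service_name)  # order-service
print(attributes)    # {'deployment.environment': 'dev'}
```

Keeping the endpoint and service name in the environment means the same image can be pointed at a different collector per environment without code changes.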

FAQ - Frequently Asked Questions

Q: SigNoz or the Grafana Stack (Loki + Tempo + Mimir): which is better?

A: SigNoz is all-in-one and simpler to deploy, has a fast ClickHouse backend, and gives a correlated view of metrics, traces, and logs. The Grafana Stack is very flexible, with a large community and many plugins, but you have to set up several components. Choose SigNoz if you want simplicity and correlated observability and would rather not manage multiple tools. Choose Grafana if you need flexibility, have a team that can operate it, or already use Grafana.

Q: Does Clean Architecture add overhead?

A: It adds code: you have to create interfaces and adapters and separate the layers, so there is more boilerplate. The runtime cost is very small, essentially the overhead of a few extra function calls. In return the code is testable (every dependency can be mocked), maintainable, and easy to move to different infrastructure. It pays off for large projects with a team of more than three people and a lifespan of more than a year. It does not pay off for prototypes, hackathons, or small solo projects.

Q: Which layer should observability code live in?

A: Entities: no observability code at all. Use Cases: define the Telemetry interface and use it to record business metrics. Interface Adapters: implement the Telemetry interface with OpenTelemetry. Frameworks: auto-instrumentation (HTTP, DB, external calls) plus setup. The guiding principle is Dependency Injection: inner layers never reference OpenTelemetry directly.

Q: Is SigNoz free?

A: SigNoz Community is free and 100% open source, and you can self-host it. SigNoz Cloud offers a free tier (30 days of data) at the time of writing, plus paid plans. Self-hosting means managing the infrastructure yourself (ClickHouse plus SigNoz). Cloud is managed, so there is no infrastructure to run, but cost scales with data volume. To get started, use Docker Compose for development, then the Kubernetes Helm chart for production.
