SiamCafe.net Blog
Cybersecurity

Text Generation WebUI Compliance Automation — Automated Compliance with a Local LLM

2025-06-14 · อ. บอม — SiamCafe.net · 1,626 words

What Is Text Generation WebUI?

Text Generation WebUI (oobabooga) is an open-source web interface for running Large Language Models (LLMs) locally. It supports a wide range of models such as LLaMA, Mistral, Phi, Qwen, and Gemma through multiple backends including Transformers, llama.cpp, ExLlamaV2, and AutoGPTQ, letting you run AI text generation on your own machine without sending data to the cloud.

Compliance automation means using LLMs to automate compliance work: document review (checking documents against regulations), policy checking (verifying content against company policies), data classification (labeling data by sensitivity level), report generation (producing compliance reports automatically), and risk assessment (evaluating risk from text content).

The advantages of running an LLM locally for compliance are data privacy (data never leaves the organization), no per-token API costs, customizability (you can fine-tune the model for domain-specific compliance), offline capability (it works without internet access), and a complete audit trail (you control all logging).

Installing Text Generation WebUI

How to install and configure:

# === Install Text Generation WebUI ===

# Prerequisites
# - Python 3.11
# - NVIDIA GPU with CUDA (recommended 12GB+ VRAM)
# - Git

# Clone repository
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui

# Linux/macOS — One-click installer
chmod +x start_linux.sh
./start_linux.sh

# Windows
# start_windows.bat

# Manual installation
python -m venv venv
source venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# === Download Models ===
# Mistral 7B (good for compliance tasks)
python download-model.py TheBloke/Mistral-7B-Instruct-v0.2-GPTQ

# Llama 3 8B
python download-model.py meta-llama/Meta-Llama-3-8B-Instruct

# Phi-3 Mini (lightweight)
python download-model.py microsoft/Phi-3-mini-4k-instruct

# === Start with API enabled ===
python server.py --api --listen --model Mistral-7B-Instruct-v0.2-GPTQ

# API available at: http://localhost:5000/v1
# WebUI at: http://localhost:7860

# === Docker Setup ===
# docker-compose.yml
# services:
#   text-gen:
#     image: atinoda/text-generation-webui:latest
#     ports:
#       - "7860:7860"
#       - "5000:5000"
#     volumes:
#       - ./models:/app/models
#       - ./characters:/app/characters
#       - ./loras:/app/loras
#     deploy:
#       resources:
#         reservations:
#           devices:
#             - driver: nvidia
#               count: 1
#               capabilities: [gpu]
#     environment:
#       - EXTRA_LAUNCH_ARGS=--api --listen
#
# docker compose up -d

# === API Test ===
curl -s http://localhost:5000/v1/models | jq .

curl -s http://localhost:5000/v1/chat/completions \
 -H "Content-Type: application/json" \
 -d '{
 "model": "Mistral-7B-Instruct-v0.2-GPTQ",
 "messages": [{"role": "user", "content": "Hello"}],
 "max_tokens": 100
 }' | jq .choices[0].message.content

echo "Text Generation WebUI installed"
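For scripting against the API, the same chat-completion call can be made from Python. A minimal sketch using only the standard library; the helper names `build_payload`, `extract_reply`, and `ask` are illustrative, not part of the project, and the URL assumes the default API port shown above:

```python
# Minimal client sketch for the OpenAI-compatible endpoint exposed by --api.
import json
import urllib.request

API_URL = "http://localhost:5000/v1/chat/completions"  # assumed default port

def build_payload(prompt: str, max_tokens: int = 100) -> dict:
    """Construct a chat-completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.1,
    }

def extract_reply(response: dict) -> str:
    """Pull the assistant text out of a chat-completion response."""
    return response["choices"][0]["message"]["content"]

def ask(prompt: str) -> str:
    """POST the payload and return the model's reply (requires a running server)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_reply(json.load(resp))
```

With the server from the previous step running, `ask("Hello")` returns the model's reply as a string.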

Building a Compliance Automation Pipeline

A pipeline for automated compliance checking:

#!/usr/bin/env python3
# compliance_pipeline.py — Automated Compliance Checking
import requests
import json
import logging
from datetime import datetime
from typing import List
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("compliance")


@dataclass
class ComplianceResult:
    document_id: str
    check_type: str
    status: str  # pass, fail, warning
    findings: List[str]
    score: float
    timestamp: str = field(default_factory=lambda: datetime.utcnow().isoformat())


class LocalLLMClient:
    def __init__(self, base_url="http://localhost:5000/v1"):
        self.base_url = base_url

    def generate(self, prompt, system_prompt="", max_tokens=2000, temperature=0.1):
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        resp = requests.post(
            f"{self.base_url}/chat/completions",
            json={
                "messages": messages,
                "max_tokens": max_tokens,
                "temperature": temperature,
            },
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]


class CompliancePipeline:
    SYSTEM_PROMPT = """You are a compliance analyst. Analyze documents for regulatory compliance.
Respond in JSON format with: status (pass/fail/warning), findings (list of issues), score (0-100).
Be precise and cite specific sections that violate policies."""

    def __init__(self, llm_client: LocalLLMClient):
        self.llm = llm_client
        self.results: List[ComplianceResult] = []

    def check_pii(self, document: str, doc_id: str) -> ComplianceResult:
        prompt = f"""Analyze this document for Personally Identifiable Information (PII).
Check for: names, email addresses, phone numbers, ID numbers, addresses,
financial information, health records.

Document:
{document[:3000]}

Respond in JSON: {{"status": "pass/fail", "findings": ["list of PII found"], "score": 0-100}}"""

        response = self.llm.generate(prompt, self.SYSTEM_PROMPT)

        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = {"status": "warning", "findings": ["Could not parse LLM response"], "score": 50}

        result = ComplianceResult(
            document_id=doc_id,
            check_type="pii_check",
            status=data.get("status", "warning"),
            findings=data.get("findings", []),
            score=data.get("score", 50),
        )

        self.results.append(result)
        logger.info(f"PII Check [{doc_id}]: {result.status} (score: {result.score})")
        return result

    def check_policy(self, document: str, doc_id: str, policy_rules: List[str]) -> ComplianceResult:
        rules_text = "\n".join(f"- {r}" for r in policy_rules)

        prompt = f"""Check if this document complies with these policies:
{rules_text}

Document:
{document[:3000]}

Respond in JSON: {{"status": "pass/fail/warning", "findings": ["violations found"], "score": 0-100}}"""

        response = self.llm.generate(prompt, self.SYSTEM_PROMPT)

        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = {"status": "warning", "findings": ["Parse error"], "score": 50}

        result = ComplianceResult(
            document_id=doc_id,
            check_type="policy_check",
            status=data.get("status", "warning"),
            findings=data.get("findings", []),
            score=data.get("score", 50),
        )

        self.results.append(result)
        return result

    def classify_data(self, document: str, doc_id: str) -> ComplianceResult:
        prompt = f"""Classify this document's data sensitivity level:
- PUBLIC: No sensitive information
- INTERNAL: Business information, not for external sharing
- CONFIDENTIAL: Contains sensitive business data
- RESTRICTED: Contains PII, financial data, or trade secrets

Document:
{document[:3000]}

Respond in JSON: {{"status": "classification_level", "findings": ["reasons"], "score": confidence_0-100}}"""

        response = self.llm.generate(prompt, self.SYSTEM_PROMPT)

        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = {"status": "INTERNAL", "findings": ["Default classification"], "score": 50}

        result = ComplianceResult(
            document_id=doc_id,
            check_type="data_classification",
            status=data.get("status", "INTERNAL"),
            findings=data.get("findings", []),
            score=data.get("score", 50),
        )

        self.results.append(result)
        return result

    def run_full_check(self, document: str, doc_id: str):
        policies = [
            "No sharing of customer data without consent",
            "All financial figures must have source citations",
            "No discriminatory language",
            "Must include data retention period",
            "Must specify data processing purpose",
        ]

        pii = self.check_pii(document, doc_id)
        policy = self.check_policy(document, doc_id, policies)
        classification = self.classify_data(document, doc_id)

        overall_score = (pii.score + policy.score + classification.score) / 3

        return {
            "document_id": doc_id,
            "overall_score": round(overall_score, 1),
            "pii_check": pii.__dict__,
            "policy_check": policy.__dict__,
            "classification": classification.__dict__,
        }

# llm = LocalLLMClient()
# pipeline = CompliancePipeline(llm)
# result = pipeline.run_full_check("Sample document text...", "DOC-001")
# print(json.dumps(result, indent=2))
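`run_full_check` averages the three scores with equal weight. In production you may prefer weighted aggregation with a fail-fast rule, so a single failed check can never be averaged away. A hypothetical sketch (the `aggregate` helper and its weighting scheme are illustrative, not part of the pipeline above):

```python
# Hypothetical aggregation: weight checks unequally and let any "fail"
# dominate the overall verdict, instead of the plain average used above.
def aggregate(checks: dict, weights: dict) -> dict:
    """checks maps check name -> (status, score); weights maps check name -> weight."""
    total_w = sum(weights[name] for name in checks)
    weighted = sum(score * weights[name] for name, (_, score) in checks.items()) / total_w
    statuses = [status for status, _ in checks.values()]
    if "fail" in statuses:
        verdict = "fail"       # any failed check fails the document
    elif "warning" in statuses:
        verdict = "warning"
    else:
        verdict = "pass"
    return {"verdict": verdict, "weighted_score": round(weighted, 1)}
```

For example, weighting the PII check twice as heavily as the policy check: `aggregate({"pii_check": ("pass", 90.0), "policy_check": ("warning", 70.0)}, {"pii_check": 2.0, "policy_check": 1.0})` yields a weighted score of 83.3 with verdict `warning`.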

Content Moderation with an LLM

An automated content moderation system:

#!/usr/bin/env python3
# content_moderator.py — LLM-Based Content Moderation
import requests
import json
import logging
import re
from datetime import datetime
from typing import Dict, List
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("moderator")


@dataclass
class ModerationResult:
    content_id: str
    action: str  # approve, reject, review
    categories: Dict[str, bool]
    confidence: float
    reason: str
    timestamp: str


class ContentModerator:
    CATEGORIES = [
        "hate_speech", "harassment", "violence", "sexual_content",
        "self_harm", "misinformation", "spam", "illegal_activity",
    ]

    def __init__(self, api_url="http://localhost:5000/v1"):
        self.api_url = api_url

    def _call_llm(self, prompt, system_prompt="", max_tokens=1000):
        resp = requests.post(
            f"{self.api_url}/chat/completions",
            json={
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": prompt},
                ],
                "max_tokens": max_tokens,
                "temperature": 0.05,
            },
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def moderate(self, content: str, content_id: str) -> ModerationResult:
        system = """You are a content moderation system. Analyze content for policy violations.
Respond ONLY in valid JSON format."""

        categories_str = ", ".join(self.CATEGORIES)

        prompt = f"""Analyze this content for policy violations.

Categories to check: {categories_str}

Content:
{content[:2000]}

Respond in JSON:
{{
  "action": "approve" or "reject" or "review",
  "categories": {{"category_name": true/false for each}},
  "confidence": 0.0-1.0,
  "reason": "brief explanation"
}}"""

        response = self._call_llm(prompt, system)

        try:
            # Extract JSON from response
            json_match = re.search(r'\{.*\}', response, re.DOTALL)
            if json_match:
                data = json.loads(json_match.group())
            else:
                data = json.loads(response)
        except (json.JSONDecodeError, AttributeError):
            data = {
                "action": "review",
                "categories": {c: False for c in self.CATEGORIES},
                "confidence": 0.0,
                "reason": "Failed to parse moderation response",
            }

        result = ModerationResult(
            content_id=content_id,
            action=data.get("action", "review"),
            categories=data.get("categories", {}),
            confidence=data.get("confidence", 0.0),
            reason=data.get("reason", ""),
            timestamp=datetime.utcnow().isoformat(),
        )

        flagged = [k for k, v in result.categories.items() if v]

        if flagged:
            logger.warning(f"Content {content_id}: {result.action} — {', '.join(flagged)}")
        else:
            logger.info(f"Content {content_id}: {result.action} (confidence: {result.confidence:.0%})")

        return result

    def batch_moderate(self, items: List[Dict]) -> List[ModerationResult]:
        results = []

        for item in items:
            result = self.moderate(item["content"], item["id"])
            results.append(result)

        summary = {
            "total": len(results),
            "approved": sum(1 for r in results if r.action == "approve"),
            "rejected": sum(1 for r in results if r.action == "reject"),
            "review": sum(1 for r in results if r.action == "review"),
        }

        logger.info(f"Batch complete: {json.dumps(summary)}")
        return results

    def generate_report(self, results: List[ModerationResult]):
        category_counts = {c: 0 for c in self.CATEGORIES}

        for r in results:
            for cat, flagged in r.categories.items():
                if flagged:
                    category_counts[cat] = category_counts.get(cat, 0) + 1

        return {
            "generated_at": datetime.utcnow().isoformat(),
            "total_reviewed": len(results),
            "actions": {
                "approved": sum(1 for r in results if r.action == "approve"),
                "rejected": sum(1 for r in results if r.action == "reject"),
                "manual_review": sum(1 for r in results if r.action == "review"),
            },
            "category_breakdown": category_counts,
            "avg_confidence": round(
                sum(r.confidence for r in results) / max(len(results), 1), 2
            ),
        }

# moderator = ContentModerator()
# result = moderator.moderate("Sample content to check...", "MSG-001")
# print(f"Action: {result.action}, Reason: {result.reason}")
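The `action` field above is taken directly from the model's output. Since low-confidence verdicts are unreliable, a common pattern is to override them and force human review. A small hypothetical post-filter (the `route_action` helper and the 0.75 threshold are illustrative assumptions):

```python
# Hypothetical post-filter: distrust low-confidence model verdicts and
# route them to human review regardless of the suggested action.
def route_action(action: str, confidence: float, threshold: float = 0.75) -> str:
    """Return the final action, overriding to 'review' below the threshold."""
    if action not in ("approve", "reject", "review"):
        return "review"  # unknown or malformed verdicts always go to a human
    if confidence < threshold and action != "review":
        return "review"
    return action
```

For example, a `reject` at confidence 0.5 becomes `review`, while an `approve` at 0.9 passes through unchanged.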

Audit Logging and Compliance Reporting

An audit logging and compliance reporting system:

#!/usr/bin/env python3
# audit_system.py — Compliance Audit Logging
import json
import sqlite3
import logging
from datetime import datetime, timedelta

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("audit")


class AuditLogger:
    def __init__(self, db_path="compliance_audit.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        conn = sqlite3.connect(self.db_path)
        conn.execute("""
            CREATE TABLE IF NOT EXISTS audit_log (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                timestamp TEXT NOT NULL,
                event_type TEXT NOT NULL,
                document_id TEXT,
                user_id TEXT,
                action TEXT NOT NULL,
                details TEXT,
                result TEXT,
                score REAL,
                model_used TEXT,
                processing_time_ms REAL
            )
        """)
        conn.execute("""
            CREATE TABLE IF NOT EXISTS compliance_reports (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                report_date TEXT NOT NULL,
                report_type TEXT NOT NULL,
                total_checks INTEGER,
                passed INTEGER,
                failed INTEGER,
                warnings INTEGER,
                avg_score REAL,
                details TEXT
            )
        """)
        conn.execute("CREATE INDEX IF NOT EXISTS idx_audit_timestamp ON audit_log(timestamp)")
        conn.execute("CREATE INDEX IF NOT EXISTS idx_audit_type ON audit_log(event_type)")
        conn.commit()
        conn.close()

    def log_event(self, event_type, document_id, action, details=None,
                  result=None, score=None, model=None, processing_ms=None, user_id=None):
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            """INSERT INTO audit_log
               (timestamp, event_type, document_id, user_id, action, details, result, score, model_used, processing_time_ms)
               VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
            (
                datetime.utcnow().isoformat(),
                event_type,
                document_id,
                user_id,
                action,
                json.dumps(details) if details else None,
                result,
                score,
                model,
                processing_ms,
            ),
        )
        conn.commit()
        conn.close()

        logger.info(f"Audit: {event_type} | {document_id} | {action} | {result}")

    def get_events(self, event_type=None, days=30, limit=100):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row

        cutoff = (datetime.utcnow() - timedelta(days=days)).isoformat()

        if event_type:
            rows = conn.execute(
                "SELECT * FROM audit_log WHERE event_type = ? AND timestamp > ? ORDER BY timestamp DESC LIMIT ?",
                (event_type, cutoff, limit),
            ).fetchall()
        else:
            rows = conn.execute(
                "SELECT * FROM audit_log WHERE timestamp > ? ORDER BY timestamp DESC LIMIT ?",
                (cutoff, limit),
            ).fetchall()

        conn.close()
        return [dict(r) for r in rows]

    def generate_compliance_report(self, days=30):
        events = self.get_events(days=days, limit=10000)

        if not events:
            return {"error": "No events in period"}

        total = len(events)
        by_type = {}
        scores = []

        for e in events:
            t = e["event_type"]
            by_type.setdefault(t, {"total": 0, "pass": 0, "fail": 0, "warning": 0})
            by_type[t]["total"] += 1

            result = e.get("result", "")
            if result == "pass":
                by_type[t]["pass"] += 1
            elif result == "fail":
                by_type[t]["fail"] += 1
            elif result == "warning":
                by_type[t]["warning"] += 1

            if e.get("score") is not None:
                scores.append(e["score"])

        report = {
            "period_days": days,
            "generated_at": datetime.utcnow().isoformat(),
            "total_events": total,
            "avg_score": round(sum(scores) / max(len(scores), 1), 1),
            "by_check_type": by_type,
            "compliance_rate": round(
                sum(1 for e in events if e.get("result") == "pass") / max(total, 1) * 100, 1
            ),
        }

        # Save report to DB
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            "INSERT INTO compliance_reports (report_date, report_type, total_checks, passed, failed, warnings, avg_score, details) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (
                datetime.utcnow().isoformat(), "monthly",
                total,
                sum(t["pass"] for t in by_type.values()),
                sum(t["fail"] for t in by_type.values()),
                sum(t["warning"] for t in by_type.values()),
                report["avg_score"],
                json.dumps(report),
            ),
        )
        conn.commit()
        conn.close()

        return report

# audit = AuditLogger()
# audit.log_event("pii_check", "DOC-001", "scan", result="pass", score=95, model="mistral-7b")
# report = audit.generate_compliance_report(days=30)
# print(json.dumps(report, indent=2))
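To enforce a data retention policy on the `audit_log` table above, a periodic purge job might look like this (a sketch; `purge_old_events` is a hypothetical helper, and the 365-day default is an assumption, not a regulatory recommendation):

```python
# Hypothetical retention helper for the audit_log schema above: delete
# rows older than the cutoff and reclaim file space with VACUUM.
import sqlite3
from datetime import datetime, timedelta

def purge_old_events(db_path: str, retain_days: int = 365) -> int:
    """Delete audit_log rows older than retain_days; return rows removed."""
    cutoff = (datetime.utcnow() - timedelta(days=retain_days)).isoformat()
    conn = sqlite3.connect(db_path)
    cur = conn.execute("DELETE FROM audit_log WHERE timestamp < ?", (cutoff,))
    conn.commit()
    deleted = cur.rowcount
    conn.execute("VACUUM")  # compact the database file after bulk deletes
    conn.close()
    return deleted
```

ISO-8601 timestamps sort lexicographically, so the string comparison in the `WHERE` clause is safe here.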

Production Deployment and Security

Guidelines for deploying to production:

# === Production Deployment Guide ===

# 1. Security Hardening
# ===================================

# Restrict API access (nginx reverse proxy)
# Note: limit_req_zone must be declared at http{} level (e.g. in nginx.conf),
# not inside a server block.
#
# limit_req_zone $binary_remote_addr zone=llm:10m rate=10r/s;
#
# /etc/nginx/sites-available/text-gen-api
# server {
#     listen 443 ssl;
#     server_name llm-api.internal.example.com;
#
#     ssl_certificate /etc/ssl/certs/server.crt;
#     ssl_certificate_key /etc/ssl/private/server.key;
#
#     # Allow only internal network
#     allow 10.0.0.0/8;
#     deny all;
#
#     location /v1/ {
#         limit_req zone=llm burst=20;
#         proxy_pass http://127.0.0.1:5000;
#         proxy_set_header X-Real-IP $remote_addr;
#
#         # Authentication
#         auth_basic "LLM API";
#         auth_basic_user_file /etc/nginx/.htpasswd;
#     }
# }

# Create auth credentials
# sudo htpasswd -c /etc/nginx/.htpasswd api_user

# 2. Resource Management
# ===================================
# Systemd service
# /etc/systemd/system/text-gen.service
# [Unit]
# Description=Text Generation WebUI
# After=network.target
#
# [Service]
# User=llm
# WorkingDirectory=/opt/text-generation-webui
# ExecStart=/opt/text-generation-webui/venv/bin/python server.py \
#     --api --listen --model Mistral-7B-Instruct-v0.2-GPTQ \
#     --api-port 5000 --nowebui
# Restart=always
# RestartSec=10
# Environment=CUDA_VISIBLE_DEVICES=0
#
# # Resource limits
# MemoryMax=32G
# CPUQuota=400%
#
# [Install]
# WantedBy=multi-user.target

sudo systemctl enable text-gen
sudo systemctl start text-gen

# 3. Monitoring
# ===================================
# Health check script
# #!/bin/bash
# RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
#     http://localhost:5000/v1/models)
# if [ "$RESPONSE" != "200" ]; then
#     echo "LLM API unhealthy: $RESPONSE"
#     systemctl restart text-gen
# fi

# Crontab health check every 5 minutes
# */5 * * * * /opt/scripts/health_check.sh >> /var/log/llm_health.log 2>&1

# 4. Data Protection
# ===================================
# - Never log full document content in production
# - Encrypt audit database at rest
# - Implement data retention policies
# - Use network isolation (no internet access for LLM server)
# - Regular security audits of API access logs

# Encrypt SQLite database
# pip install sqlcipher3
# Or use PostgreSQL with encryption

# 5. Backup Strategy
# ===================================
# Daily backup of audit logs and models
# 0 2 * * * /opt/scripts/backup_compliance.sh

#!/bin/bash
# backup_compliance.sh
BACKUP_DIR="/backup/compliance/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
cp /opt/compliance/compliance_audit.db "$BACKUP_DIR/"
tar czf "$BACKUP_DIR/models.tar.gz" /opt/text-generation-webui/models/
find /backup/compliance -mtime +30 -delete # Keep 30 days

echo "Production deployment configured"

FAQ

Q: Which GPU is suitable for running LLM compliance workloads?

A: For 7B models (Mistral, Llama 3 8B) you need at least 8-12GB of VRAM; an RTX 3060 12GB or RTX 4060 Ti 16GB is sufficient. For 13B models you need 16GB+ VRAM (RTX 4080, A4000). For production workloads that process many concurrent requests, an RTX 4090 24GB or an A100 is recommended. Quantized models (GPTQ, AWQ, GGUF) reduce VRAM usage by 50-70% with only a slight drop in quality.
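These VRAM figures can be sanity-checked with a back-of-envelope formula: weight memory ≈ parameters × bits per weight / 8, plus runtime overhead. A rough estimator sketch (the 20% overhead factor is an assumption; real usage varies with context length and backend):

```python
# Back-of-envelope VRAM estimate for model weights: params × bits / 8,
# plus a rough 20% overhead for KV cache and activations (assumption).
def estimate_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate VRAM in GB needed to load a quantized model."""
    weight_bytes = params_billion * 1e9 * bits / 8
    return round(weight_bytes * overhead / 1e9, 1)
```

For example, a 7B model at 4-bit quantization comes out around 4.2 GB, while the same model at 16-bit needs roughly 16.8 GB, which matches why quantization cuts VRAM use so sharply.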

Q: Is a local LLM accurate enough for compliance?

A: For tasks with clear criteria such as PII detection or keyword matching, 7B-13B models perform well (90%+ accuracy). For nuanced analysis such as policy interpretation or legal compliance, larger models (70B+) or fine-tuned models do better. A hybrid approach is recommended: use the local LLM for initial screening and escalate uncertain cases to human review. An LLM should never be the sole decision maker for critical compliance decisions.

Q: How do I fine-tune a model for compliance?

A: Use LoRA/QLoRA fine-tuning, which requires relatively little VRAM (16GB is enough). Build training data from labeled compliance examples (roughly 500-2,000 examples are needed). Use the Text Generation WebUI built-in Training tab or the Hugging Face PEFT library. Fine-tune on domain-specific data such as regulatory documents, policy violations, and data classification examples, and evaluate on a held-out test set before deploying.
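Before fine-tuning, the labeled examples need to be serialized into an instruction-tuning format. A minimal converter sketch; the exact field names expected depend on the training tool, and the `instruction`/`input`/`output` layout shown here is the common Alpaca-style convention, not a requirement of the WebUI:

```python
# Hypothetical converter: turn one labeled compliance example into an
# Alpaca-style instruction-tuning JSON line for a fine-tuning dataset.
import json

def to_training_record(document: str, label: dict) -> str:
    """Serialize one labeled example as a single JSONL training line."""
    record = {
        "instruction": "Analyze this document for compliance issues. Respond in JSON.",
        "input": document,
        "output": json.dumps(label),  # the target JSON verdict as a string
    }
    return json.dumps(record)
```

Writing one such line per labeled document produces a JSONL file that LoRA training tools can consume.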

Q: What does GDPR compliance for LLM processing involve?

A: Under GDPR you need a lawful basis for processing personal data through an LLM; data minimization (send only the data that is necessary to the LLM); transparency (inform data subjects that AI is used in processing); a right to explanation (be able to explain AI decisions); a data protection impact assessment (DPIA) for high-risk processing; and no data retention (no personal data retained in model weights). Running the LLM locally also helps with compliance around data transfer restrictions.
