SiamCafe.net Blog
Cybersecurity

Text Generation WebUI Compliance Automation — Automated Compliance with a Local LLM

2025-06-14 · อ. บอม — SiamCafe.net · 1,626 words

What Is Text Generation WebUI?

Text Generation WebUI (oobabooga) is an open-source web interface for running Large Language Models (LLMs) locally. It supports a wide range of models such as LLaMA, Mistral, Phi, Qwen, and Gemma through multiple backends including Transformers, llama.cpp, ExLlamaV2, and AutoGPTQ, letting you run AI text generation on your own machine without sending data to the cloud.

Compliance automation means using LLMs to automate compliance work: document review (checking documents against regulations), policy checking (verifying content against company policies), data classification (labeling data by sensitivity level), report generation (producing compliance reports automatically), and risk assessment (evaluating risk from text content).

The advantages of running an LLM locally for compliance are data privacy (data never leaves the organization), no per-token API costs, customizability (you can fine-tune the model for domain-specific compliance), offline capability (it works without internet access), and a complete audit trail (you control all logging).

Installing Text Generation WebUI

How to install and configure:

# === Install Text Generation WebUI ===

# Prerequisites
# - Python 3.11
# - NVIDIA GPU with CUDA (recommended 12GB+ VRAM)
# - Git

# Clone repository
git clone https://github.com/oobabooga/text-generation-webui.git
cd text-generation-webui

# Linux/macOS — One-click installer
chmod +x start_linux.sh
./start_linux.sh

# Windows
# start_windows.bat

# Manual installation
python -m venv venv
source venv/bin/activate
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# === Download Models ===
# Mistral 7B (good for compliance tasks)
python download-model.py TheBloke/Mistral-7B-Instruct-v0.2-GPTQ

# Llama 3 8B
python download-model.py meta-llama/Meta-Llama-3-8B-Instruct

# Phi-3 Mini (lightweight)
python download-model.py microsoft/Phi-3-mini-4k-instruct

# === Start with API enabled ===
python server.py --api --listen --model Mistral-7B-Instruct-v0.2-GPTQ

# API available at: http://localhost:5000/v1
# WebUI at: http://localhost:7860

# === Docker Setup ===
# docker-compose.yml
# services:
#   text-gen:
#     image: atinoda/text-generation-webui:latest
#     ports:
#       - "7860:7860"
#       - "5000:5000"
#     volumes:
#       - ./models:/app/models
#       - ./characters:/app/characters
#       - ./loras:/app/loras
#     deploy:
#       resources:
#         reservations:
#           devices:
#             - driver: nvidia
#               count: 1
#               capabilities: [gpu]
#     environment:
#       - EXTRA_LAUNCH_ARGS=--api --listen
#
# docker compose up -d

# === API Test ===
curl -s http://localhost:5000/v1/models | jq .

curl -s http://localhost:5000/v1/chat/completions \
 -H "Content-Type: application/json" \
 -d '{
 "model": "Mistral-7B-Instruct-v0.2-GPTQ",
 "messages": [{"role": "user", "content": "Hello"}],
 "max_tokens": 100
 }' | jq .choices[0].message.content

echo "Text Generation WebUI installed"
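For scripting against the API, the same chat-completion call can be made from Python. A minimal sketch using only the standard library; the helper names `build_payload`, `extract_reply`, and `ask` are illustrative, not part of the project, and the URL assumes the default API port shown above:

```python
# Minimal client sketch for the OpenAI-compatible endpoint exposed by --api.
import json
import urllib.request

API_URL = "http://localhost:5000/v1/chat/completions"  # assumed default port

def build_payload(prompt: str, max_tokens: int = 100) -> dict:
    """Construct a chat-completion request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.1,
    }

def extract_reply(response: dict) -> str:
    """Pull the assistant text out of a chat-completion response."""
    return response["choices"][0]["message"]["content"]

def ask(prompt: str) -> str:
    """POST the payload and return the model's reply (requires a running server)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_reply(json.load(resp))
```

With the server from the previous step running, `ask("Hello")` returns the model's reply as a string.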

Building a Compliance Automation Pipeline

A pipeline for automated compliance checking:

#!/usr/bin/env python3
# compliance_pipeline.py — Automated Compliance Checking
import requests
import json
import logging
from datetime import datetime
from typing import List
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("compliance")


@dataclass
class ComplianceResult:
    document_id: str
    check_type: str
    status: str  # pass, fail, warning
    findings: List[str]
    score: float
    timestamp: str = field(default_factory=lambda: datetime.utcnow().isoformat())


class LocalLLMClient:
    def __init__(self, base_url="http://localhost:5000/v1"):
        self.base_url = base_url

    def generate(self, prompt, system_prompt="", max_tokens=2000, temperature=0.1):
        messages = []
        if system_prompt:
            messages.append({"role": "system", "content": system_prompt})
        messages.append({"role": "user", "content": prompt})

        resp = requests.post(
            f"{self.base_url}/chat/completions",
            json={
                "messages": messages,
                "max_tokens": max_tokens,
                "temperature": temperature,
            },
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]


class CompliancePipeline:
    SYSTEM_PROMPT = """You are a compliance analyst. Analyze documents for regulatory compliance.
Respond in JSON format with: status (pass/fail/warning), findings (list of issues), score (0-100).
Be precise and cite specific sections that violate policies."""

    def __init__(self, llm_client: LocalLLMClient):
        self.llm = llm_client
        self.results: List[ComplianceResult] = []

    def check_pii(self, document: str, doc_id: str) -> ComplianceResult:
        prompt = f"""Analyze this document for Personally Identifiable Information (PII).
Check for: names, email addresses, phone numbers, ID numbers, addresses,
financial information, health records.

Document:
{document[:3000]}

Respond in JSON: {{"status": "pass/fail", "findings": ["list of PII found"], "score": 0-100}}"""

        response = self.llm.generate(prompt, self.SYSTEM_PROMPT)

        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = {"status": "warning", "findings": ["Could not parse LLM response"], "score": 50}

        result = ComplianceResult(
            document_id=doc_id,
            check_type="pii_check",
            status=data.get("status", "warning"),
            findings=data.get("findings", []),
            score=data.get("score", 50),
        )

        self.results.append(result)
        logger.info(f"PII Check [{doc_id}]: {result.status} (score: {result.score})")
        return result

    def check_policy(self, document: str, doc_id: str, policy_rules: List[str]) -> ComplianceResult:
        rules_text = "\n".join(f"- {r}" for r in policy_rules)

        prompt = f"""Check if this document complies with these policies:
{rules_text}

Document:
{document[:3000]}

Respond in JSON: {{"status": "pass/fail/warning", "findings": ["violations found"], "score": 0-100}}"""

        response = self.llm.generate(prompt, self.SYSTEM_PROMPT)

        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = {"status": "warning", "findings": ["Parse error"], "score": 50}

        result = ComplianceResult(
            document_id=doc_id,
            check_type="policy_check",
            status=data.get("status", "warning"),
            findings=data.get("findings", []),
            score=data.get("score", 50),
        )

        self.results.append(result)
        return result

    def classify_data(self, document: str, doc_id: str) -> ComplianceResult:
        prompt = f"""Classify this document's data sensitivity level:
- PUBLIC: No sensitive information
- INTERNAL: Business information, not for external sharing
- CONFIDENTIAL: Contains sensitive business data
- RESTRICTED: Contains PII, financial data, or trade secrets

Document:
{document[:3000]}

Respond in JSON: {{"status": "classification_level", "findings": ["reasons"], "score": confidence_0-100}}"""

        response = self.llm.generate(prompt, self.SYSTEM_PROMPT)

        try:
            data = json.loads(response)
        except json.JSONDecodeError:
            data = {"status": "INTERNAL", "findings": ["Default classification"], "score": 50}

        result = ComplianceResult(
            document_id=doc_id,
            check_type="data_classification",
            status=data.get("status", "INTERNAL"),
            findings=data.get("findings", []),
            score=data.get("score", 50),
        )

        self.results.append(result)
        return result

    def run_full_check(self, document: str, doc_id: str):
        policies = [
            "No sharing of customer data without consent",
            "All financial figures must have source citations",
            "No discriminatory language",
            "Must include data retention period",
            "Must specify data processing purpose",
        ]

        pii = self.check_pii(document, doc_id)
        policy = self.check_policy(document, doc_id, policies)
        classification = self.classify_data(document, doc_id)

        overall_score = (pii.score + policy.score + classification.score) / 3

        return {
            "document_id": doc_id,
            "overall_score": round(overall_score, 1),
            "pii_check": pii.__dict__,
            "policy_check": policy.__dict__,
            "classification": classification.__dict__,
        }

# llm = LocalLLMClient()
# pipeline = CompliancePipeline(llm)
# result = pipeline.run_full_check("Sample document text...", "DOC-001")
# print(json.dumps(result, indent=2))
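`run_full_check` averages the three scores with equal weight. In production you may prefer weighted aggregation with a fail-fast rule, so a single failed check can never be averaged away. A hypothetical sketch (the `aggregate` helper and its weighting scheme are illustrative, not part of the pipeline above):

```python
# Hypothetical aggregation: weight checks unequally and let any "fail"
# dominate the overall verdict, instead of the plain average used above.
def aggregate(checks: dict, weights: dict) -> dict:
    """checks maps check name -> (status, score); weights maps check name -> weight."""
    total_w = sum(weights[name] for name in checks)
    weighted = sum(score * weights[name] for name, (_, score) in checks.items()) / total_w
    statuses = [status for status, _ in checks.values()]
    if "fail" in statuses:
        verdict = "fail"       # any failed check fails the document
    elif "warning" in statuses:
        verdict = "warning"
    else:
        verdict = "pass"
    return {"verdict": verdict, "weighted_score": round(weighted, 1)}
```

For example, weighting the PII check twice as heavily as the policy check: `aggregate({"pii_check": ("pass", 90.0), "policy_check": ("warning", 70.0)}, {"pii_check": 2.0, "policy_check": 1.0})` yields a weighted score of 83.3 with verdict `warning`.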

Content Moderation with an LLM

An automated content moderation system:

#!/usr/bin/env python3
# content_moderator.py — LLM-Based Content Moderation
import requests
import json
import logging
import re
from datetime import datetime
from typing import Dict, List
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("moderator")


@dataclass
class ModerationResult:
    content_id: str
    action: str  # approve, reject, review
    categories: Dict[str, bool]
    confidence: float
    reason: str
    timestamp: str


class ContentModerator:
    CATEGORIES = [
        "hate_speech", "harassment", "violence", "sexual_content",
        "self_harm", "misinformation", "spam", "illegal_activity",
    ]

    def __init__(self, api_url="http://localhost:5000/v1"):
        self.api_url = api_url

    def _call_llm(self, prompt, system_prompt="", max_tokens=1000):
        resp = requests.post(
            f"{self.api_url}/chat/completions",
            json={
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": prompt},
                ],
                "max_tokens": max_tokens,
                "temperature": 0.05,
            },
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def moderate(self, content: str, content_id: str) -> ModerationResult:
        system = """You are a content moderation system. Analyze content for policy violations.
Respond ONLY in valid JSON format."""

        categories_str = ", ".join(self.CATEGORIES)

        prompt = f"""Analyze this content for policy violations.

Categories to check: {categories_str}

Content:
{content[:2000]}

Respond in JSON:
{{
  "action": "approve" or "reject" or "review",
  "categories": {{"category_name": true/false for each}},
  "confidence": 0.0-1.0,
  "reason": "brief explanation"
}}"""

        response = self._call_llm(prompt, system)

        try:
            # Extract JSON from response
            json_match = re.search(r'\{.*\}', response, re.DOTALL)
            if json_match:
                data = json.loads(json_match.group())
            else:
                data = json.loads(response)
        except (json.JSONDecodeError, AttributeError):
            data = {
                "action": "review",
                "categories": {c: False for c in self.CATEGORIES},
                "confidence": 0.0,
                "reason": "Failed to parse moderation response",
            }

        result = ModerationResult(
            content_id=content_id,
            action=data.get("action", "review"),
            categories=data.get("categories", {}),
            confidence=data.get("confidence", 0.0),
            reason=data.get("reason", ""),
            timestamp=datetime.utcnow().isoformat(),
        )

        flagged = [k for k, v in result.categories.items() if v]

        if flagged:
            logger.warning(f"Content {content_id}: {result.action} — {', '.join(flagged)}")
        else:
            logger.info(f"Content {content_id}: {result.action} (confidence: {result.confidence:.0%})")

        return result

    def batch_moderate(self, items: List[Dict]) -> List[ModerationResult]:
        results = []

        for item in items:
            result = self.moderate(item["content"], item["id"])
            results.append(result)

        summary = {
            "total": len(results),
            "approved": sum(1 for r in results if r.action == "approve"),
            "rejected": sum(1 for r in results if r.action == "reject"),
            "review": sum(1 for r in results if r.action == "review"),
        }

        logger.info(f"Batch complete: {json.dumps(summary)}")
        return results

    def generate_report(self, results: List[ModerationResult]):
        category_counts = {c: 0 for c in self.CATEGORIES}

        for r in results:
            for cat, flagged in r.categories.items():
                if flagged:
                    category_counts[cat] = category_counts.get(cat, 0) + 1

        return {
            "generated_at": datetime.utcnow().isoformat(),
            "total_reviewed": len(results),
            "actions": {
                "approved": sum(1 for r in results if r.action == "approve"),
                "rejected": sum(1 for r in results if r.action == "reject"),
                "manual_review": sum(1 for r in results if r.action == "review"),
            },
            "category_breakdown": category_counts,
            "avg_confidence": round(
                sum(r.confidence for r in results) / max(len(results), 1), 2
            ),
        }

# moderator = ContentModerator()
# result = moderator.moderate("Sample content to check...", "MSG-001")
# print(f"Action: {result.action}, Reason: {result.reason}")
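The `action` field above is taken directly from the model's output. Since low-confidence verdicts are unreliable, a common pattern is to override them and force human review. A small hypothetical post-filter (the `route_action` helper and the 0.75 threshold are illustrative assumptions):

```python
# Hypothetical post-filter: distrust low-confidence model verdicts and
# route them to human review regardless of the suggested action.
def route_action(action: str, confidence: float, threshold: float = 0.75) -> str:
    """Return the final action, overriding to 'review' below the threshold."""
    if action not in ("approve", "reject", "review"):
        return "review"  # unknown or malformed verdicts always go to a human
    if confidence < threshold and action != "review":
        return "review"
    return action
```

For example, a `reject` at confidence 0.5 becomes `review`, while an `approve` at 0.9 passes through unchanged.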

Audit Logging and Compliance Reporting

An audit logging and compliance reporting system:

#!/usr/bin/env python3
# audit_system.py — Compliance Audit Logging
import json
import sqlite3
import logging
from datetime import datetime, timedelta

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("audit")


class AuditLogger:
    def __init__(self, db_path="compliance_audit.db"):
        self.db_path = db_path
        self._init_db()

    def _init_db(self):
        conn = sqlite3.connect(self.db_path)
        conn.execute("""
            CREATE TABLE IF NOT EXISTS audit_log (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                timestamp TEXT NOT NULL,
                event_type TEXT NOT NULL,
                document_id TEXT,
                user_id TEXT,
                action TEXT NOT NULL,
                details TEXT,
                result TEXT,
                score REAL,
                model_used TEXT,
                processing_time_ms REAL
            )
        """)
        conn.execute("""
            CREATE TABLE IF NOT EXISTS compliance_reports (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                report_date TEXT NOT NULL,
                report_type TEXT NOT NULL,
                total_checks INTEGER,
                passed INTEGER,
                failed INTEGER,
                warnings INTEGER,
                avg_score REAL,
                details TEXT
            )
        """)
        conn.execute("CREATE INDEX IF NOT EXISTS idx_audit_timestamp ON audit_log(timestamp)")
        conn.execute("CREATE INDEX IF NOT EXISTS idx_audit_type ON audit_log(event_type)")
        conn.commit()
        conn.close()

    def log_event(self, event_type, document_id, action, details=None,
                  result=None, score=None, model=None, processing_ms=None, user_id=None):
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            """INSERT INTO audit_log
               (timestamp, event_type, document_id, user_id, action, details, result, score, model_used, processing_time_ms)
               VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)""",
            (
                datetime.utcnow().isoformat(),
                event_type,
                document_id,
                user_id,
                action,
                json.dumps(details) if details else None,
                result,
                score,
                model,
                processing_ms,
            ),
        )
        conn.commit()
        conn.close()

        logger.info(f"Audit: {event_type} | {document_id} | {action} | {result}")

    def get_events(self, event_type=None, days=30, limit=100):
        conn = sqlite3.connect(self.db_path)
        conn.row_factory = sqlite3.Row

        cutoff = (datetime.utcnow() - timedelta(days=days)).isoformat()

        if event_type:
            rows = conn.execute(
                "SELECT * FROM audit_log WHERE event_type = ? AND timestamp > ? ORDER BY timestamp DESC LIMIT ?",
                (event_type, cutoff, limit),
            ).fetchall()
        else:
            rows = conn.execute(
                "SELECT * FROM audit_log WHERE timestamp > ? ORDER BY timestamp DESC LIMIT ?",
                (cutoff, limit),
            ).fetchall()

        conn.close()
        return [dict(r) for r in rows]

    def generate_compliance_report(self, days=30):
        events = self.get_events(days=days, limit=10000)

        if not events:
            return {"error": "No events in period"}

        total = len(events)
        by_type = {}
        scores = []

        for e in events:
            t = e["event_type"]
            by_type.setdefault(t, {"total": 0, "pass": 0, "fail": 0, "warning": 0})
            by_type[t]["total"] += 1

            result = e.get("result", "")
            if result == "pass":
                by_type[t]["pass"] += 1
            elif result == "fail":
                by_type[t]["fail"] += 1
            elif result == "warning":
                by_type[t]["warning"] += 1

            if e.get("score") is not None:
                scores.append(e["score"])

        report = {
            "period_days": days,
            "generated_at": datetime.utcnow().isoformat(),
            "total_events": total,
            "avg_score": round(sum(scores) / max(len(scores), 1), 1),
            "by_check_type": by_type,
            "compliance_rate": round(
                sum(1 for e in events if e.get("result") == "pass") / max(total, 1) * 100, 1
            ),
        }

        # Save report to DB
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            "INSERT INTO compliance_reports (report_date, report_type, total_checks, passed, failed, warnings, avg_score, details) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (
                datetime.utcnow().isoformat(), "monthly",
                total,
                sum(t["pass"] for t in by_type.values()),
                sum(t["fail"] for t in by_type.values()),
                sum(t["warning"] for t in by_type.values()),
                report["avg_score"],
                json.dumps(report),
            ),
        )
        conn.commit()
        conn.close()

        return report

# audit = AuditLogger()
# audit.log_event("pii_check", "DOC-001", "scan", result="pass", score=95, model="mistral-7b")
# report = audit.generate_compliance_report(days=30)
# print(json.dumps(report, indent=2))
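To enforce a data retention policy on the `audit_log` table above, a periodic purge job might look like this (a sketch; `purge_old_events` is a hypothetical helper, and the 365-day default is an assumption, not a regulatory recommendation):

```python
# Hypothetical retention helper for the audit_log schema above: delete
# rows older than the cutoff and reclaim file space with VACUUM.
import sqlite3
from datetime import datetime, timedelta

def purge_old_events(db_path: str, retain_days: int = 365) -> int:
    """Delete audit_log rows older than retain_days; return rows removed."""
    cutoff = (datetime.utcnow() - timedelta(days=retain_days)).isoformat()
    conn = sqlite3.connect(db_path)
    cur = conn.execute("DELETE FROM audit_log WHERE timestamp < ?", (cutoff,))
    conn.commit()
    deleted = cur.rowcount
    conn.execute("VACUUM")  # compact the database file after bulk deletes
    conn.close()
    return deleted
```

ISO-8601 timestamps sort lexicographically, so the string comparison in the `WHERE` clause is safe here.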

Production Deployment and Security

Guidelines for deploying to production:

# === Production Deployment Guide ===

# 1. Security Hardening
# ===================================

# Restrict API access (nginx reverse proxy)
# Note: limit_req_zone must be declared at http{} level (e.g. in nginx.conf),
# not inside a server block.
#
# limit_req_zone $binary_remote_addr zone=llm:10m rate=10r/s;
#
# /etc/nginx/sites-available/text-gen-api
# server {
#     listen 443 ssl;
#     server_name llm-api.internal.example.com;
#
#     ssl_certificate /etc/ssl/certs/server.crt;
#     ssl_certificate_key /etc/ssl/private/server.key;
#
#     # Allow only internal network
#     allow 10.0.0.0/8;
#     deny all;
#
#     location /v1/ {
#         limit_req zone=llm burst=20;
#         proxy_pass http://127.0.0.1:5000;
#         proxy_set_header X-Real-IP $remote_addr;
#
#         # Authentication
#         auth_basic "LLM API";
#         auth_basic_user_file /etc/nginx/.htpasswd;
#     }
# }

# Create auth credentials
# sudo htpasswd -c /etc/nginx/.htpasswd api_user

# 2. Resource Management
# ===================================
# Systemd service
# /etc/systemd/system/text-gen.service
# [Unit]
# Description=Text Generation WebUI
# After=network.target
#
# [Service]
# User=llm
# WorkingDirectory=/opt/text-generation-webui
# ExecStart=/opt/text-generation-webui/venv/bin/python server.py \
#     --api --listen --model Mistral-7B-Instruct-v0.2-GPTQ \
#     --api-port 5000 --nowebui
# Restart=always
# RestartSec=10
# Environment=CUDA_VISIBLE_DEVICES=0
#
# # Resource limits
# MemoryMax=32G
# CPUQuota=400%
#
# [Install]
# WantedBy=multi-user.target

sudo systemctl enable text-gen
sudo systemctl start text-gen

# 3. Monitoring
# ===================================
# Health check script
# #!/bin/bash
# RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
#     http://localhost:5000/v1/models)
# if [ "$RESPONSE" != "200" ]; then
#     echo "LLM API unhealthy: $RESPONSE"
#     systemctl restart text-gen
# fi

# Crontab health check every 5 minutes
# */5 * * * * /opt/scripts/health_check.sh >> /var/log/llm_health.log 2>&1

# 4. Data Protection
# ===================================
# - Never log full document content in production
# - Encrypt audit database at rest
# - Implement data retention policies
# - Use network isolation (no internet access for LLM server)
# - Regular security audits of API access logs

# Encrypt SQLite database
# pip install sqlcipher3
# Or use PostgreSQL with encryption

# 5. Backup Strategy
# ===================================
# Daily backup of audit logs and models
# 0 2 * * * /opt/scripts/backup_compliance.sh

#!/bin/bash
# backup_compliance.sh
BACKUP_DIR="/backup/compliance/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"
cp /opt/compliance/compliance_audit.db "$BACKUP_DIR/"
tar czf "$BACKUP_DIR/models.tar.gz" /opt/text-generation-webui/models/
find /backup/compliance -mtime +30 -delete # Keep 30 days

echo "Production deployment configured"

FAQ

Q: Which GPU is suitable for running LLM compliance workloads?

A: For 7B models (Mistral, Llama 3 8B) you need at least 8-12GB of VRAM; an RTX 3060 12GB or RTX 4060 Ti 16GB is sufficient. For 13B models you need 16GB+ VRAM (RTX 4080, A4000). For production workloads that process many concurrent requests, an RTX 4090 24GB or an A100 is recommended. Quantized models (GPTQ, AWQ, GGUF) reduce VRAM usage by 50-70% with only a slight drop in quality.
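These VRAM figures can be sanity-checked with a back-of-envelope formula: weight memory ≈ parameters × bits per weight / 8, plus runtime overhead. A rough estimator sketch (the 20% overhead factor is an assumption; real usage varies with context length and backend):

```python
# Back-of-envelope VRAM estimate for model weights: params × bits / 8,
# plus a rough 20% overhead for KV cache and activations (assumption).
def estimate_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Approximate VRAM in GB needed to load a quantized model."""
    weight_bytes = params_billion * 1e9 * bits / 8
    return round(weight_bytes * overhead / 1e9, 1)
```

For example, a 7B model at 4-bit quantization comes out around 4.2 GB, while the same model at 16-bit needs roughly 16.8 GB, which matches why quantization cuts VRAM use so sharply.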

Q: Is a local LLM accurate enough for compliance?

A: For tasks with clear criteria such as PII detection or keyword matching, 7B-13B models perform well (90%+ accuracy). For nuanced analysis such as policy interpretation or legal compliance, larger models (70B+) or fine-tuned models do better. A hybrid approach is recommended: use the local LLM for initial screening and escalate uncertain cases to human review. An LLM should never be the sole decision maker for critical compliance decisions.

Q: How do I fine-tune a model for compliance?

A: Use LoRA/QLoRA fine-tuning, which requires relatively little VRAM (16GB is enough). Build training data from labeled compliance examples (roughly 500-2,000 examples are needed). Use the Text Generation WebUI built-in Training tab or the Hugging Face PEFT library. Fine-tune on domain-specific data such as regulatory documents, policy violations, and data classification examples, and evaluate on a held-out test set before deploying.
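Before fine-tuning, the labeled examples need to be serialized into an instruction-tuning format. A minimal converter sketch; the exact field names expected depend on the training tool, and the `instruction`/`input`/`output` layout shown here is the common Alpaca-style convention, not a requirement of the WebUI:

```python
# Hypothetical converter: turn one labeled compliance example into an
# Alpaca-style instruction-tuning JSON line for a fine-tuning dataset.
import json

def to_training_record(document: str, label: dict) -> str:
    """Serialize one labeled example as a single JSONL training line."""
    record = {
        "instruction": "Analyze this document for compliance issues. Respond in JSON.",
        "input": document,
        "output": json.dumps(label),  # the target JSON verdict as a string
    }
    return json.dumps(record)
```

Writing one such line per labeled document produces a JSONL file that LoRA training tools can consume.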

Q: What does GDPR compliance for LLM processing involve?

A: Under GDPR you need a lawful basis for processing personal data through an LLM; data minimization (send only the data that is necessary to the LLM); transparency (inform data subjects that AI is used in processing); a right to explanation (be able to explain AI decisions); a data protection impact assessment (DPIA) for high-risk processing; and no data retention (no personal data retained in model weights). Running the LLM locally also helps with compliance around data transfer restrictions.
