LocalAI Self-hosted Citizen Developer: Run AI

Published 28 May 2026 (2569 BE)

LocalAI Self-hosted

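LocalAI lets citizen developers self-host open-source LLMs behind an OpenAI-compatible API, so data stays private. This article covers GGUF models from Hugging Face, running on CPU or GPU, deployment with Docker, model selection, and chatbot and automation workflows.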

Platform    | API Compatible  | GPU Required | Cost        | Privacy                | Best for
LocalAI     | OpenAI          | Not required | Free        | 100%                   | Self-hosted
Ollama      | Custom + OpenAI | Not required | Free        | 100%                   | Developer
LM Studio   | OpenAI          | Not required | Free        | 100%                   | Desktop
vLLM        | OpenAI          | Required     | Free        | 100%                   | Production GPU
OpenAI API  | Native          | Cloud        | Pay-per-use | Data leaves your site  | Best Quality

LocalAI Setup

=== LocalAI Installation ===

Docker — Quick Start

docker run -p 8080:8080 localai/localai:latest
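After the container starts, the server may still be loading models, so the first request can fail. The following is a minimal readiness poll, assuming the server answers on the OpenAI-style `/v1/models` route at the port published above. The helper names `http_probe` and `wait_until_ready` are illustrative and not part of LocalAI.

```python
import time
import urllib.error
import urllib.request


def http_probe(url: str) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(url, probe=http_probe, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll `url` until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try
    return False
```

Calling `wait_until_ready("http://localhost:8080/v1/models")` before sending the first chat request gives the container up to about a minute to come up. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server.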

Docker Compose — Production

version: '3.8'

services:
  localai:
    image: localai/localai:latest-aio-cpu
    ports:
      - "8080:8080"
    volumes:
      - ./models:/build/models
      - ./config:/build/config
    environment:
      - THREADS=8
      - CONTEXT_SIZE=4096
      - DEBUG=false
    restart: unless-stopped

Download Models

curl -L "https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf" \
  -o models/mistral-7b.gguf
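A failed download (for example an HTML error page saved under the `.gguf` name) usually surfaces only later, as a model load error. GGUF files begin with the four ASCII bytes `GGUF`, so a quick sanity check is possible before starting the server. This is a minimal sketch; `looks_like_gguf` is a hypothetical helper name, and it checks only the file header, not the integrity of the whole file.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def looks_like_gguf(path) -> bool:
    """Return True if the file at `path` starts with the GGUF magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(4) == GGUF_MAGIC
```

For the download above, `looks_like_gguf("models/mistral-7b.gguf")` should return `True`. A `False` result suggests the URL or the download needs another look.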

Model Config — models/mistral.yaml

name: mistral
backend: llama-cpp
parameters:
  model: mistral-7b.gguf
  temperature: 0.7
  top_p: 0.9
  top_k: 40
context_size: 4096
threads: 8

Test API — OpenAI Compatible

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7
  }'

Python — Use OpenAI SDK

# Install the SDK first: pip install openai
from openai import OpenAI

# Point the SDK at the local server; the API key is only a placeholder.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="mistral",  # must match the name in models/mistral.yaml
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Python list comprehension"},
    ],
    temperature=0.7,
    max_tokens=1000,
)

print(response.choices[0].message.content)
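The call above is a single turn. For a chatbot, each request has to carry the earlier messages, because the chat completions API itself is stateless. The following sketch shows one way to keep that history. `Conversation` is a hypothetical helper, and it expects any object that exposes `chat.completions.create` in the same shape as the OpenAI SDK client.

```python
class Conversation:
    """Keep chat history so each request carries the earlier turns."""

    def __init__(self, client, model="mistral", system="You are a helpful assistant."):
        self.client = client
        self.model = model
        self.messages = [{"role": "system", "content": system}]

    def ask(self, user_text, **params):
        """Send one user turn and record the assistant's reply."""
        self.messages.append({"role": "user", "content": user_text})
        response = self.client.chat.completions.create(
            model=self.model,
            messages=list(self.messages),  # send a copy of the full history
            **params,
        )
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Usage, assuming a `client` created with the OpenAI SDK as shown earlier:
# chat = Conversation(client)
# print(chat.ask("Explain Python list comprehension", temperature=0.7))
# print(chat.ask("Now show a nested example"))
```

The history grows with every turn, so a long conversation will eventually exceed the configured context size. Trimming old turns is left out of this sketch.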

from dataclasses import dataclass


@dataclass
class ModelChoice:
    model: str
    size: str          # size of the quantized model file
    ram_required: str  # minimum system RAM
    speed: str
    quality: str
    use_case: str


models = [
    ModelChoice("Phi-3 Mini", "2.3GB", "4GB", "Very fast", "Good", "Chat, Code"),
    ModelChoice("Mistral 7B", "4.1GB", "8GB", "Fast", "Very good", "General Purpose"),
    ModelChoice("Llama 3 8B", "4.7GB", "8GB", "Fast", "Very good", "General Purpose"),
    ModelChoice("CodeLlama 7B", "4.1GB", "8GB", "Fast", "Very good (Code)", "Code Generation"),
    ModelChoice("Mixtral 8x7B", "26GB", "32GB", "Medium", "Excellent", "Complex Tasks"),
]

print("=== Model Choices ===")
for m in models:
    print(f" [{m.model}] Size: {m.size} | RAM: {m.ram_required}")
    print(f" Speed: {m.speed} | Quality: {m.quality}")
    print(f" Use Case: {m.use_case}")
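To turn the list into a decision, the RAM column can be used as a filter. The sketch below repeats the RAM figures from the model list as integers and returns the models that fit a given machine. `ModelOption` and `models_that_fit` are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str
    ram_required_gb: int


# RAM figures copied from the model list above.
OPTIONS = [
    ModelOption("Phi-3 Mini", 4),
    ModelOption("Mistral 7B", 8),
    ModelOption("Llama 3 8B", 8),
    ModelOption("CodeLlama 7B", 8),
    ModelOption("Mixtral 8x7B", 32),
]


def models_that_fit(options, available_gb):
    """Return the names of models whose RAM requirement fits the budget."""
    return [o.name for o in options if o.ram_required_gb <= available_gb]


print(models_that_fit(OPTIONS, 16))
# → ['Phi-3 Mini', 'Mistral 7B', 'Llama 3 8B', 'CodeLlama 7B']
```

On a 16GB machine everything except Mixtral 8x7B passes the filter. The choice among the remaining models then depends on the use case column.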

Citizen Developer Workflows

=== AI Workflows for Citizen Developers ===

Workflow 1: Document Summarizer

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="x")


def summarize_document(text, max_words=200):
    """Summarize `text` in Thai, in roughly `max_words` words."""
    response = client.chat.completions.create(
        model="mistral",
        messages=[
            {"role": "system", "content": f"Summarize in {max_words} words. Thai language."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
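The summarizer sends the whole document in one request, which fails once the text no longer fits the model's context window (4096 tokens in the configuration shown earlier). One common workaround, sketched here as an assumption rather than something the article prescribes, is to split the text, summarize each chunk, and then summarize the partial summaries. The `max_chars` default is an arbitrary placeholder and should be tuned for the model and language. `summarize` stands for any function that maps text to a summary, such as `summarize_document`.

```python
def split_into_chunks(text, max_chars=4000):
    """Split text on blank lines into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        # Hard-split any single paragraph that is longer than the limit.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate  # paragraph still fits in the current chunk
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def summarize_long(text, summarize, max_chars=4000):
    """Summarize each chunk, then summarize the combined partial summaries."""
    chunks = split_into_chunks(text, max_chars)
    if len(chunks) <= 1:
        return summarize(text)
    partials = [summarize(chunk) for chunk in chunks]
    return summarize("\n\n".join(partials))
```

With the function above, `summarize_long(long_text, summarize_document)` would issue one request per chunk plus one final request, so latency grows with document length.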

Workflow 2: Email Draft Generator

def draft_email(topic, tone="professional"):
    """Draft a Thai email about `topic` in the requested tone."""
    response = client.chat.completions.create(
        model="mistral",
        messages=[
            {"role": "system", "content": f"Draft a {tone} email in Thai."},
            {"role": "user", "content": f"Topic: {topic}"},
        ],
    )
    return response.choices[0].message.content

Workflow 3: Data Analysis Helper

def analyze_data(data_description, question):
    """Ask the model for Python code that answers `question` about the data."""
    response = client.chat.completions.create(
        model="mistral",
        messages=[
            {"role": "system", "content": "You are a data analyst. Give Python code."},
            {"role": "user", "content": f"Data: {data_description}\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
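Because the system prompt asks for Python code, the reply often mixes explanation with Markdown-fenced code. The sketch below pulls out only the fenced parts. It is an illustrative helper (`extract_code_blocks` is a hypothetical name) and it recognizes only bare fences or fences tagged `python` or `py`.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built from single backticks
FENCE_RE = re.compile(
    FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE,
    re.DOTALL | re.IGNORECASE,
)


def extract_code_blocks(reply: str):
    """Return the contents of every fenced code block found in `reply`."""
    return [block.strip() for block in FENCE_RE.findall(reply)]
```

For example, `extract_code_blocks(analyze_data("sales.csv with date and amount", "monthly totals?"))` would return a list of code strings, or an empty list if the model answered in plain prose. The extracted code is untrusted model output and should be read before it is run.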

from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    input_type: str
    output: str
    model: str
    time: str
    difficulty: str


workflows = [
    Workflow("Document Summary", "Text/PDF", "Summary", "Mistral 7B", "5-15 sec", "Easy"),
    Workflow("Email Draft", "Topic + Tone", "Draft Email", "Mistral 7B", "5-10 sec", "Easy"),
    Workflow("Code Generator", "Description", "Python Code", "CodeLlama 7B", "10-30 sec", "Easy"),
    Workflow("Data Analysis", "Data + Question", "Analysis + Code", "Mistral 7B", "10-20 sec", "Medium"),
    Workflow("Chatbot", "User Message", "AI Response", "Llama 3 8B", "3-10 sec", "Medium"),
    Workflow("Translation", "Text + Language", "Translated Text", "Mistral 7B", "5-15 sec", "Easy"),
]

print("\n=== Citizen Developer Workflows ===")
for w in workflows:
    print(f" [{w.name}] Difficulty: {w.difficulty}")
    print(f" Input: {w.input_type} → Output: {w.output}")
    print(f" Model: {w.model} | Time: {w.time}")

Production Considerations

# === Production LocalAI Setup ===

# Hardware Requirements
from dataclasses import dataclass


@dataclass
class HardwareReq:
    model_size: str
    ram: str
    cpu: str
    gpu: str
    users: str
    cost_estimate: str

requirements = [
    HardwareReq("3B (Phi-3)", "8GB", "4 cores", "Not required", "1-5", "$500 PC"),
    HardwareReq("7B (Mistral)", "16GB", "8 cores", "Recommended", "5-20", "$1,000 PC"),
    HardwareReq("13B (Llama)", "32GB", "8+ cores", "Recommended", "10-50", "$2,000 PC"),
    HardwareReq("8x7B (Mixtral)", "64GB", "16 cores", "Required", "20-100", "GPU Server"),
]

print("Hardware Requirements:")
for h in requirements:
    print(f"  [{h.model_size}] RAM: {h.ram} | CPU: {h.cpu}")
    print(f"    GPU: {h.gpu} | Users: {h.users} | Cost: {h.cost_estimate}")

# Cost Comparison
comparison = {
    "LocalAI (Mistral 7B)": {"setup": "$1,000 one-time", "monthly": "$50 electricity", "per_1k_tokens": "$0"},
    "OpenAI GPT-4o": {"setup": "$0", "monthly": "Variable", "per_1k_tokens": "$0.005-$0.015"},
    "OpenAI GPT-3.5": {"setup": "$0", "monthly": "Variable", "per_1k_tokens": "$0.0005-$0.0015"},
}

print("\n\nCost Comparison (10K requests/month):")
for name, costs in comparison.items():
    print(f"  [{name}]")
    for k, v in costs.items():
        print(f"    {k}: {v}")
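The comparison lists unit prices, so a break-even estimate needs a token volume. The sketch below assumes each of the 10K monthly requests uses roughly 1,000 tokens (an assumption, not a figure from the article) and applies the setup cost, electricity cost, and GPT-4o price range from the table.

```python
def monthly_api_cost(tokens_per_month, price_per_1k_tokens):
    """Cloud API cost for one month at a flat per-1K-token price."""
    return tokens_per_month / 1000 * price_per_1k_tokens


def breakeven_months(setup_cost, local_monthly, api_monthly):
    """Months until the one-time setup cost is recovered, or None if never."""
    saving = api_monthly - local_monthly
    if saving <= 0:
        return None  # self-hosting never catches up at this volume
    return setup_cost / saving


# Assumption: 10K requests/month at about 1,000 tokens each = 10M tokens.
tokens = 10_000 * 1_000
for label, price in [("GPT-4o low", 0.005), ("GPT-4o high", 0.015)]:
    api = monthly_api_cost(tokens, price)
    months = breakeven_months(1000, 50, api)
    print(label, round(api, 2), months)
```

Under these assumptions the low GPT-4o price costs about the same per month as the electricity, so the hardware never pays for itself, while at the high price it pays back in about 10 months. Different request sizes change the result substantially.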

Tips

  • Start Small: Begin with Phi-3 Mini to experiment, then scale up.
  • Q4_K_M: Use Q4_K_M quantization for a good balance of quality and file size.
  • OpenAI SDK: Use the OpenAI SDK; only base_url needs to change.
  • Privacy: Suitable for confidential data, since nothing leaves the machine.
  • Docker: Use Docker Compose for a production setup.
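The OpenAI SDK tip can be taken one step further by reading the endpoint from the environment, so the same script runs against LocalAI or a cloud API without code changes. This is a minimal sketch; the variable names `LLM_BASE_URL` and `LLM_API_KEY` are hypothetical and chosen only for this example.

```python
import os


def resolve_endpoint(env=os.environ):
    """Pick connection settings from environment variables.

    LLM_BASE_URL and LLM_API_KEY are hypothetical names used in this sketch.
    The defaults point at the local server from this article.
    """
    return {
        "base_url": env.get("LLM_BASE_URL", "http://localhost:8080/v1"),
        "api_key": env.get("LLM_API_KEY", "not-needed"),
    }


# Usage with the OpenAI SDK (requires `pip install openai`):
# from openai import OpenAI
# client = OpenAI(**resolve_endpoint())
```

With no variables set, the client talks to the local server. Setting both variables redirects the same code to another OpenAI-compatible endpoint.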

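What is LocalAI?

LocalAI is a free, open-source AI server that runs on your own machine and exposes an OpenAI-compatible API. It covers LLMs, image, audio, and embeddings, loads GGUF models from Hugging Face, runs on CPU or GPU, and ships as a Docker image, so data stays private.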