LocalAI Self-hosted for Citizen Developers: Running AI
LocalAI Self-hosted
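This article covers self-hosting LocalAI as a citizen developer: running open-source LLMs behind an OpenAI-compatible API to keep data private, using GGUF models from Hugging Face on CPU or GPU with Docker, selecting a model, and building chatbots and automation.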
| Platform | API Compatible | GPU Required | Cost | Privacy | Best for |
|---|---|---|---|---|---|
| LocalAI | OpenAI | Not required | Free | 100% | Self-hosted |
| Ollama | Custom + OpenAI | Not required | Free | 100% | Developer |
| LM Studio | OpenAI | Not required | Free | 100% | Desktop |
| vLLM | OpenAI | Required | Free | 100% | Production GPU |
| OpenAI API | Native | Cloud | Pay-per-use | Data leaves your machine | Best Quality |
LocalAI Setup
=== LocalAI Installation ===
Docker — Quick Start
docker run -p 8080:8080 localai/localai:latest
Docker Compose — Production
version: '3.8'
services:
  localai:
    image: localai/localai:latest-aio-cpu
    ports:
      - "8080:8080"
    volumes:
      - ./models:/build/models
      - ./config:/build/config
    environment:
      - THREADS=8
      - CONTEXT_SIZE=4096
      - DEBUG=false
    restart: unless-stopped
Download Models
curl -L "https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GGUF/resolve/main/mistral-7b-instruct-v0.2.Q4_K_M.gguf" \
  -o models/mistral-7b.gguf
Model Config — models/mistral.yaml
name: mistral
backend: llama-cpp
parameters:
  model: mistral-7b.gguf
  temperature: 0.7
  top_p: 0.9
  top_k: 40
context_size: 4096
threads: 8
Test API — OpenAI Compatible
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7
  }'
Python — Use OpenAI SDK
pip install openai
from openai import OpenAI

# Point the SDK at the local server; the key is only a placeholder.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")
response = client.chat.completions.create(
    model="mistral",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain Python list comprehension"},
    ],
    temperature=0.7,
    max_tokens=1000,
)
print(response.choices[0].message.content)
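Because the API is plain HTTP with JSON, the same request can also be built without the SDK. The following is a minimal sketch using only the Python standard library; it assumes LocalAI is listening on localhost:8080 as in the Docker examples, and the function names `build_chat_request` and `send_chat_request` are made up for illustration.

```python
import json
import urllib.request

# Assumption: LocalAI is reachable here, as in the Docker examples above.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt, model="mistral", temperature=0.7, base_url=BASE_URL):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=120):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(send_chat_request(build_chat_request("Hello!")))` should print the model's reply. Keeping the build step separate from the send step makes it easy to inspect the payload before anything goes over the network.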
from dataclasses import dataclass

@dataclass
class ModelChoice:
    model: str
    size: str
    ram_required: str
    speed: str
    quality: str
    use_case: str

# Sizes and RAM figures are for quantized GGUF files.
models = [
    ModelChoice("Phi-3 Mini", "2.3GB", "4GB", "Very fast", "Good", "Chat Code"),
    ModelChoice("Mistral 7B", "4.1GB", "8GB", "Fast", "Very good", "General Purpose"),
    ModelChoice("Llama 3 8B", "4.7GB", "8GB", "Fast", "Very good", "General Purpose"),
    ModelChoice("CodeLlama 7B", "4.1GB", "8GB", "Fast", "Very good (Code)", "Code Generation"),
    ModelChoice("Mixtral 8x7B", "26GB", "32GB", "Moderate", "Excellent", "Complex Tasks"),
]

print("=== Model Choices ===")
for m in models:
    print(f"  [{m.model}] Size: {m.size} | RAM: {m.ram_required}")
    print(f"    Speed: {m.speed} | Quality: {m.quality}")
    print(f"    Use Case: {m.use_case}")
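The model list above can be turned into a simple selection helper. This is a minimal sketch, not part of LocalAI: the `ModelOption` class and `models_that_fit` function are hypothetical names, and the RAM figures are copied from the list above as whole gigabytes.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str
    ram_required_gb: int
    use_case: str


# RAM figures copied from the model list above.
OPTIONS = [
    ModelOption("Phi-3 Mini", 4, "Chat Code"),
    ModelOption("Mistral 7B", 8, "General Purpose"),
    ModelOption("Llama 3 8B", 8, "General Purpose"),
    ModelOption("CodeLlama 7B", 8, "Code Generation"),
    ModelOption("Mixtral 8x7B", 32, "Complex Tasks"),
]


def models_that_fit(available_ram_gb, options=OPTIONS):
    """Return the models whose RAM requirement fits, largest first."""
    fitting = [m for m in options if m.ram_required_gb <= available_ram_gb]
    return sorted(fitting, key=lambda m: m.ram_required_gb, reverse=True)


# Example: a machine with 16GB of RAM.
for m in models_that_fit(16):
    print(f"{m.name}: needs {m.ram_required_gb}GB ({m.use_case})")
```

On a 16GB machine this lists every model except Mixtral 8x7B. The RAM requirement is only a first filter; speed and use case from the list still need a human decision.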
Citizen Developer Workflows
=== AI Workflows for Citizen Developers ===
Workflow 1: Document Summarizer
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="x")

def summarize_document(text, max_words=200):
    response = client.chat.completions.create(
        model="mistral",
        messages=[
            {"role": "system", "content": f"Summarize in {max_words} words. Thai language."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
Workflow 2: Email Draft Generator
def draft_email(topic, tone="professional"):
    response = client.chat.completions.create(
        model="mistral",
        messages=[
            {"role": "system", "content": f"Draft a {tone} email in Thai."},
            {"role": "user", "content": f"Topic: {topic}"},
        ],
    )
    return response.choices[0].message.content
Workflow 3: Data Analysis Helper
def analyze_data(data_description, question):
    response = client.chat.completions.create(
        model="mistral",
        messages=[
            {"role": "system", "content": "You are a data analyst. Give Python code."},
            {"role": "user", "content": f"Data: {data_description}\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
from dataclasses import dataclass

@dataclass
class Workflow:
    name: str
    input_type: str
    output: str
    model: str
    time: str
    difficulty: str

workflows = [
    Workflow("Document Summary", "Text/PDF", "Summary", "Mistral 7B", "5-15 sec", "Easy"),
    Workflow("Email Draft", "Topic + Tone", "Draft Email", "Mistral 7B", "5-10 sec", "Easy"),
    Workflow("Code Generator", "Description", "Python Code", "CodeLlama 7B", "10-30 sec", "Easy"),
    Workflow("Data Analysis", "Data + Question", "Analysis + Code", "Mistral 7B", "10-20 sec", "Medium"),
    Workflow("Chatbot", "User Message", "AI Response", "Llama 3 8B", "3-10 sec", "Medium"),
    Workflow("Translation", "Text + Language", "Translated Text", "Mistral 7B", "5-15 sec", "Easy"),
]

print("\n=== Citizen Developer Workflows ===")
for w in workflows:
    print(f"  [{w.name}] Difficulty: {w.difficulty}")
    print(f"    Input: {w.input_type} → Output: {w.output}")
    print(f"    Model: {w.model} | Time: {w.time}")
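The Document Summary workflow sends the whole text in one request, which may not fit in the 4096-token context configured earlier. One common workaround is to summarize in chunks and then summarize the partial summaries. The sketch below illustrates that idea under stated assumptions: `split_into_chunks` and `summarize_long_document` are hypothetical helpers, and the 6000-character limit is a rough guess rather than a measured value (Thai text usually needs more tokens per character than English, so tune it down if requests fail).

```python
def split_into_chunks(text, max_chars=6000):
    """Split text on blank lines into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is cut into fixed-size slices.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Oversized paragraph: flush what we have, then slice it.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def summarize_long_document(text, summarize, max_chars=6000):
    """Summarize each chunk, then summarize the combined partial summaries.

    `summarize` is any function taking text and returning a summary,
    for example summarize_document from Workflow 1.
    """
    chunks = split_into_chunks(text, max_chars)
    if len(chunks) <= 1:
        return summarize(text)
    partial = [summarize(chunk) for chunk in chunks]
    return summarize("\n\n".join(partial))
```

Usage would look like `summarize_long_document(long_text, summarize_document)`. Passing the summarizer in as an argument keeps the chunking logic testable without a running server.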
Production Considerations
# === Production LocalAI Setup ===
# Hardware Requirements
from dataclasses import dataclass

@dataclass
class HardwareReq:
    model_size: str
    ram: str
    cpu: str
    gpu: str
    users: str
    cost_estimate: str

requirements = [
    HardwareReq("3B (Phi-3)", "8GB", "4 cores", "Not required", "1-5", "$500 PC"),
    HardwareReq("7B (Mistral)", "16GB", "8 cores", "Recommended", "5-20", "$1,000 PC"),
    HardwareReq("13B (Llama)", "32GB", "8+ cores", "Recommended", "10-50", "$2,000 PC"),
    HardwareReq("8x7B (Mixtral)", "64GB", "16 cores", "Required", "20-100", "GPU Server"),
]

print("Hardware Requirements:")
for h in requirements:
    print(f"  [{h.model_size}] RAM: {h.ram} | CPU: {h.cpu}")
    print(f"    GPU: {h.gpu} | Users: {h.users} | Cost: {h.cost_estimate}")

# Cost Comparison
comparison = {
    "LocalAI (Mistral 7B)": {"setup": "$1,000 one-time", "monthly": "$50 electricity", "per_1k_tokens": "$0"},
    "OpenAI GPT-4o": {"setup": "$0", "monthly": "Variable", "per_1k_tokens": "$0.005-$0.015"},
    "OpenAI GPT-3.5": {"setup": "$0", "monthly": "Variable", "per_1k_tokens": "$0.0005-$0.0015"},
}

print("\n\nCost Comparison (10K requests/month):")
for name, costs in comparison.items():
    print(f"  [{name}]")
    for k, v in costs.items():
        print(f"    {k}: {v}")
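The cost comparison above lists prices but not when self-hosting pays for itself. The sketch below works through that arithmetic using the figures above ($1,000 setup, $50/month electricity, 10K requests/month). The 1,000 tokens per request is an assumption made here for illustration, not a figure from the comparison, and the function names are hypothetical.

```python
import math


def monthly_api_cost(requests_per_month, tokens_per_request, price_per_1k_tokens):
    """Monthly bill for a pay-per-token API, in dollars."""
    return requests_per_month * tokens_per_request / 1000 * price_per_1k_tokens


def break_even_months(setup_cost, local_monthly, api_monthly):
    """Whole months until the one-time setup cost is recovered, or None if never."""
    monthly_saving = api_monthly - local_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(setup_cost / monthly_saving)


# Assumption: 1,000 tokens per request; prices are the GPT-4o range quoted above.
for price in (0.005, 0.015):
    api = monthly_api_cost(10_000, 1000, price)
    months = break_even_months(1000, 50, api)
    print(f"${price}/1K tokens: API ${api:.2f}/month, break-even: {months}")
```

Under these assumptions the upper GPT-4o price comes to about $150/month, so the $1,000 machine pays for itself after roughly 10 months. At the lower price the API bill is about the same as the electricity cost and the hardware never breaks even. The result is very sensitive to request volume and tokens per request, so it is worth rerunning with real usage numbers.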
Tips
- Start Small: Begin with Phi-3 Mini to experiment, then scale up.
- Q4_K_M: Use Q4_K_M quantization; it gives a good balance of quality and file size.
- OpenAI SDK: Use the OpenAI SDK; only base_url needs to change.
- Privacy: Well suited to confidential data, since nothing leaves the machine.
- Docker: Use Docker Compose for a production setup.
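Since only base_url changes between LocalAI and the hosted API, the choice can live in configuration instead of code. This is a minimal sketch under stated assumptions: the `LLM_BACKEND` and `LOCALAI_BASE_URL` environment variables and the `client_settings` function are hypothetical names invented for this example, while the default URL and placeholder key match the earlier examples.

```python
import os

# Default matches the Docker examples earlier in the article.
DEFAULT_LOCALAI_URL = "http://localhost:8080/v1"


def client_settings(env=None):
    """Return keyword arguments for OpenAI(...) based on environment variables.

    LLM_BACKEND is a hypothetical switch: "local" (default) or "openai".
    """
    env = os.environ if env is None else env
    if env.get("LLM_BACKEND", "local") == "openai":
        # Hosted API: a real key is required and requests leave the machine.
        return {"api_key": env["OPENAI_API_KEY"]}
    # LocalAI: the earlier examples pass a placeholder key.
    return {
        "base_url": env.get("LOCALAI_BASE_URL", DEFAULT_LOCALAI_URL),
        "api_key": "not-needed",
    }
```

The client is then created with `client = OpenAI(**client_settings())`, and the rest of the workflow code stays the same whichever backend is selected.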
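What is LocalAI?
LocalAI is a free, open-source AI server that you run on your own machine. It exposes an OpenAI-compatible API and covers LLMs, image, audio, and embeddings. It loads GGUF models from Hugging Face, runs on CPU or GPU, ships as Docker images, and keeps your data private.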