
Ollama Local LLM Internal Developer Platform

2025-11-18 · อ. บอม — SiamCafe.net · 10,796 words

Ollama Local LLM + IDP


Model         | Size (Q4) | VRAM  | Use Case
--------------|-----------|-------|---------------------------
LLaMA 3 8B    | 4.7GB     | 6GB+  | General chat, code review
Mistral 7B    | 4.1GB     | 6GB+  | Fast inference, coding
CodeLlama 13B | 7.4GB     | 10GB+ | Code generation, review
Gemma 2 9B    | 5.4GB     | 8GB+  | General purpose, compact
LLaMA 3 70B   | 40GB      | 24GB+ | Complex analysis, expert tasks
Qwen 2 72B    | 41GB      | 24GB+ | Multilingual, code
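The table above can drive a simple model picker for an IDP that provisions nodes with different GPUs. A minimal sketch, using the sizes and VRAM figures from the table (the `pick_model` helper is illustrative, not part of Ollama):

```python
from dataclasses import dataclass

@dataclass
class ModelSpec:
    name: str
    size_gb: float    # Q4 quantized download size
    min_vram_gb: int  # minimum recommended VRAM

MODELS = [
    ModelSpec("llama3:8b", 4.7, 6),
    ModelSpec("mistral:7b", 4.1, 6),
    ModelSpec("codellama:13b", 7.4, 10),
    ModelSpec("gemma2:9b", 5.4, 8),
    ModelSpec("llama3:70b", 40.0, 24),
    ModelSpec("qwen2:72b", 41.0, 24),
]

def pick_model(vram_gb: int) -> str:
    """Return the largest model whose VRAM requirement fits the budget."""
    candidates = [m for m in MODELS if m.min_vram_gb <= vram_gb]
    if not candidates:
        raise ValueError(f"No model fits in {vram_gb}GB VRAM")
    return max(candidates, key=lambda m: m.size_gb).name

print(pick_model(8))   # gemma2:9b
print(pick_model(24))  # qwen2:72b
```

Preferring the largest model that fits is only one policy; a latency-sensitive deployment might instead prefer the smallest model that meets a quality bar.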

Ollama Setup & API

# === Ollama Installation & API ===

# Install (Linux/Mac)
# curl -fsSL https://ollama.com/install.sh | sh
#
# Install (Windows)
# Download from https://ollama.com/download
#
# Run Model
# ollama run llama3        # Interactive chat
# ollama run codellama     # Code generation
# ollama run mistral       # Fast inference
# ollama pull llama3       # Download only
# ollama list              # List installed models
#
# Docker
# docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# docker exec -it ollama ollama run llama3

# REST API Examples
# curl http://localhost:11434/api/generate -d '{
#   "model": "llama3",
#   "prompt": "Review this Python code:\ndef add(a, b): return a+b",
#   "stream": false
# }'
#
# curl http://localhost:11434/api/chat -d '{
#   "model": "llama3",
#   "messages": [
#     {"role": "system", "content": "You are a senior code reviewer."},
#     {"role": "user", "content": "Review this function..."}
#   ],
#   "stream": false
# }'
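The curl calls above map directly to a few lines of Python. A sketch using only the standard library, with the same endpoint and fields as the examples (a running Ollama server on localhost:11434 is assumed, so the actual call is left commented out):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def build_generate_payload(model: str, prompt: str, stream: bool = False) -> bytes:
    """Encode a /api/generate request body, mirroring the curl example."""
    return json.dumps({"model": model, "prompt": prompt, "stream": stream}).encode()

def generate(model: str, prompt: str) -> str:
    """POST to /api/generate and return the 'response' field (requires a running server)."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_generate_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# generate("llama3", "Review this Python code:\ndef add(a, b): return a+b")
```

With `"stream": false` the server returns one JSON object; with streaming enabled it returns newline-delimited JSON chunks, which the reader would need to consume line by line instead.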

from dataclasses import dataclass

@dataclass
class OllamaEndpoint:
    method: str
    path: str
    description: str
    example_body: str

endpoints = [
    OllamaEndpoint("POST", "/api/generate",
        "Generate text completion",
        '{"model":"llama3","prompt":"Explain Docker","stream":false}'),
    OllamaEndpoint("POST", "/api/chat",
        "Chat conversation (multi-turn)",
        '{"model":"llama3","messages":[{"role":"user","content":"Hello"}]}'),
    OllamaEndpoint("POST", "/api/embeddings",
        "Generate text embeddings (RAG)",
        '{"model":"llama3","prompt":"text to embed"}'),
    OllamaEndpoint("GET", "/api/tags",
        "List installed models",
        "No body required"),
    OllamaEndpoint("POST", "/api/pull",
        "Download model from library",
        '{"name":"codellama:13b"}'),
    OllamaEndpoint("DELETE", "/api/delete",
        "Delete installed model",
        '{"name":"llama3"}'),
]

print("=== Ollama API Endpoints ===")
for e in endpoints:
    print(f"  [{e.method}] {e.path}")
    print(f"    Desc: {e.description}")
    print(f"    Body: {e.example_body}")

IDP Integration

# === IDP + Ollama Integration ===

# Modelfile (Custom Configuration)
# FROM llama3
# SYSTEM """You are a senior developer at our company.
# You review code following our coding standards:
# - PEP 8 for Python
# - ESLint Airbnb for JavaScript
# - Always suggest tests
# - Flag security issues
# """
# PARAMETER temperature 0.3
# PARAMETER num_ctx 4096
#
# ollama create code-reviewer -f Modelfile
# ollama run code-reviewer
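When the IDP provisions a reviewer model per team, the Modelfile above can be generated from code instead of maintained by hand. A minimal sketch (the `render_modelfile` helper and its template are illustrative, not an Ollama API):

```python
def render_modelfile(base: str, system_prompt: str,
                     temperature: float = 0.3, num_ctx: int = 4096) -> str:
    """Render a Modelfile string like the example above."""
    return (
        f"FROM {base}\n"
        f'SYSTEM """{system_prompt}"""\n'
        f"PARAMETER temperature {temperature}\n"
        f"PARAMETER num_ctx {num_ctx}\n"
    )

mf = render_modelfile(
    "llama3",
    "You are a senior developer at our company.\n"
    "Review code following our coding standards.",
)
print(mf)
# Write mf to disk, then: ollama create code-reviewer -f Modelfile
```

A low temperature such as 0.3 keeps review output consistent between runs, which matters when the same diff may be re-reviewed after a force-push.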

@dataclass
class IDPFeature:
    feature: str
    model: str
    integration: str
    example: str

features = [
    IDPFeature("AI Code Review",
        "codellama:13b / llama3",
        "Git Hook → Ollama API → Comment on PR",
        "PR created → bot sends diff → Ollama reviews → comment posted"),
    IDPFeature("Auto Documentation",
        "llama3 / mistral",
        "CI Pipeline → Extract Code → Ollama → Generate Docs",
        "Push code → CI auto-generates API docs and README"),
    IDPFeature("Incident Analysis",
        "llama3:70b",
        "Alert → Collect Logs → Ollama Analyze → Suggest Fix",
        "PagerDuty alert → aggregate logs → AI analyzes → suggests a fix"),
    IDPFeature("ChatOps Assistant",
        "llama3 / mistral",
        "Slack Bot → Ollama API → Reply in Channel",
        "/ask how to deploy → AI answers from internal docs"),
    IDPFeature("Test Generation",
        "codellama:13b",
        "Developer Portal → Paste Code → Generate Tests",
        "Submit a function → Ollama generates unit tests → copy/paste"),
    IDPFeature("SQL Assistant",
        "codellama / sqlcoder",
        "Developer Portal → Natural Language → SQL Query",
        "Ask in Thai → Ollama translates to SQL → query the database"),
]

print("=== IDP + Ollama Features ===")
for f in features:
    print(f"\n  [{f.feature}] Model: {f.model}")
    print(f"    Integration: {f.integration}")
    print(f"    Example: {f.example}")
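The AI code-review flow in the table boils down to sending the PR diff through /api/chat. A sketch of the message construction (the system prompt and the `build_review_request` helper are illustrative; the endpoint and body shape follow the API examples earlier):

```python
import json

def build_review_request(diff: str, model: str = "codellama:13b") -> dict:
    """Build a /api/chat body that asks the model to review a PR diff."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a senior code reviewer. Flag bugs, "
                        "style issues, and missing tests."},
            {"role": "user", "content": f"Review this diff:\n{diff}"},
        ],
        "stream": False,
    }

req = build_review_request("+def add(a, b): return a+b")
print(json.dumps(req, indent=2))
```

A git hook or CI job would POST this body to the Ollama endpoint and paste the model's reply back onto the PR as a comment.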

Production Architecture

# === Production Architecture ===

@dataclass
class Component:
    name: str
    technology: str
    role: str
    scaling: str

components = [
    Component("IDP Portal",
        "Next.js / React + Tailwind",
        "Web UI for developers to access AI features",
        "Horizontal (replicas); CDN for static assets"),
    Component("API Gateway",
        "Kong / Nginx / Traefik",
        "Auth, rate limiting, routing, load balancing",
        "Horizontal (multiple instances)"),
    Component("Ollama Cluster",
        "Ollama Docker + GPU Node Pool (K8s)",
        "Serves LLM inference (code review, docs, chat)",
        "Horizontal (add GPU nodes) + queue"),
    Component("Request Queue",
        "Redis / RabbitMQ / Kafka",
        "Buffers requests to prevent GPU overload",
        "Cluster mode (Redis Cluster / Kafka partitions)"),
    Component("Response Cache",
        "Redis / Memcached",
        "Caches frequent responses, cutting GPU load 30-50%",
        "Cluster mode + TTL-based eviction"),
    Component("Monitoring",
        "Prometheus + Grafana",
        "GPU utilization, VRAM, latency, throughput, error rate",
        "Prometheus federation for multi-cluster"),
]

print("=== Production Components ===")
for c in components:
    print(f"  [{c.name}] {c.technology}")
    print(f"    Role: {c.role}")
    print(f"    Scaling: {c.scaling}")
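The response cache component can be prototyped in-process before reaching for Redis. A minimal TTL-cache sketch, keyed on model + prompt (class and function names, and the 300-second TTL, are illustrative):

```python
import hashlib
import time

class TTLCache:
    """In-process stand-in for the Redis response cache."""
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    @staticmethod
    def key(model: str, prompt: str) -> str:
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def get(self, model: str, prompt: str):
        entry = self.store.get(self.key(model, prompt))
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None

    def put(self, model: str, prompt: str, response: str):
        self.store[self.key(model, prompt)] = (time.monotonic() + self.ttl, response)

cache = TTLCache(ttl_seconds=300)

def cached_generate(model, prompt, infer):
    """Consult the cache before hitting the GPU-backed infer() callable."""
    hit = cache.get(model, prompt)
    if hit is not None:
        return hit
    result = infer(model, prompt)
    cache.put(model, prompt, result)
    return result

calls = []
def fake_infer(model, prompt):
    calls.append(prompt)
    return f"answer to {prompt}"

cached_generate("llama3", "Explain Docker", fake_infer)
cached_generate("llama3", "Explain Docker", fake_infer)
print(len(calls))  # 1 -- the second request was served from cache
```

Note that caching only pays off for deterministic prompts (temperature near 0); with sampling enabled, identical prompts are expected to yield different answers.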

Tips

What is Ollama?

Ollama is an open-source runner for local LLMs such as LLaMA, Mistral, Gemma, and CodeLlama. It exposes a REST API, accelerates inference on GPUs via CUDA or Metal, runs fully offline for privacy, is free to use, and supports custom model configuration through Modelfiles.

What is an Internal Developer Platform?

An IDP is a self-service portal for developers that bundles a service catalog, CI/CD automation, and observability. Pairing it with Ollama adds AI features such as code review, documentation generation, ChatOps, and test generation.

How does the API integration work?

Ollama serves a REST API on localhost:11434 with endpoints such as /api/generate, /api/chat, and /api/embeddings. There is a Python SDK, an OpenAI-compatible endpoint, and LangChain support, and it runs in Docker or on Kubernetes.

How do you set it up for production?

Place the IDP portal behind an API gateway, run the Ollama cluster on GPU nodes with a request queue in front, add a Redis response cache, monitor with Prometheus and Grafana, and secure the stack with network isolation, authentication, and rate limiting, scaling each layer horizontally.
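The request queue in front of the GPU cluster can be sketched with the standard library before committing to Redis or RabbitMQ. A minimal single-worker version (the worker and `fake_infer` stand in for a GPU node; all names are illustrative):

```python
import queue
import threading

requests_q = queue.Queue(maxsize=100)  # bounded: back-pressure when full
results = {}

def fake_infer(prompt: str) -> str:
    """Stand-in for a call to the Ollama /api/generate endpoint."""
    return f"response to {prompt!r}"

def gpu_worker():
    """Drain the queue one request at a time, like a single GPU node."""
    while True:
        job_id, prompt = requests_q.get()
        if job_id is None:          # poison pill shuts the worker down
            requests_q.task_done()
            break
        results[job_id] = fake_infer(prompt)
        requests_q.task_done()

worker = threading.Thread(target=gpu_worker, daemon=True)
worker.start()

for i, prompt in enumerate(["Explain Docker", "Review my diff"]):
    requests_q.put((i, prompt))
requests_q.put((None, None))  # signal shutdown
requests_q.join()             # block until every job is processed
print(results)
```

The bounded queue is the key design choice: when it fills up, producers block (or fail fast with `put_nowait`), which protects the GPUs instead of letting latency grow without limit.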

Summary

Running Ollama as the LLM backend of an Internal Developer Platform gives teams code review, documentation, and ChatOps over a simple REST API while keeping all data on-premises. With GPU nodes, a request queue, and a response cache, the setup preserves privacy and scales to production workloads.
