Proxmox VE Cluster
| Feature | Proxmox VE | VMware vSphere | Hyper-V |
|---|---|---|---|
| License | Free (AGPL) + Subscription optional | Paid ($$$) | Included with Windows Server |
| Hypervisor | KVM + LXC | ESXi | Hyper-V |
| Web UI | Built-in (port 8006) | vCenter (separate) | Windows Admin Center |
| HA | Built-in (3+ nodes) | vSphere HA | Failover Clustering |
| Storage | ZFS, Ceph, LVM, NFS | VMFS, vSAN, NFS | CSV, SMB, iSCSI |
| Backup | PBS (free) | Separate product | Windows Backup |
Cluster Setup
# === Proxmox Cluster Setup ===
# Node 1: Create cluster
# pvecm create my-cluster
# pvecm status
# Node 2: Join cluster
# pvecm add 192.168.1.101
# pvecm status
# Node 3: Join cluster
# pvecm add 192.168.1.101
# pvecm status
# Verify cluster
# pvecm nodes
# pvecm status        # should report "Quorate: Yes"
# pvecm expected 3    # only to reset expected votes manually (e.g. after node loss)
# Network configuration (/etc/network/interfaces)
# auto vmbr0
# iface vmbr0 inet static
# address 192.168.1.101/24
# gateway 192.168.1.1
# bridge-ports eno1
# bridge-stp off
#
# auto vmbr1
# iface vmbr1 inet static
# address 10.10.10.101/24
# bridge-ports eno2
# bridge-stp off
# # Cluster/Ceph network (10GbE)
# Corosync config (/etc/pve/corosync.conf — edit a copy and bump config_version before replacing)
# totem {
# version: 2
# cluster_name: my-cluster
# transport: knet
# interface {
# linknumber: 0
# }
# }
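The cluster steps above assume three nodes because Corosync only stays quorate while a majority of votes is online. A minimal sketch of that arithmetic (plain Python, not a Proxmox API):

```python
# Quorum arithmetic for a Corosync cluster: a cluster of n voting
# nodes stays quorate while floor(n/2) + 1 votes remain online.

def votes_needed(nodes: int) -> int:
    """Minimum votes required to keep quorum."""
    return nodes // 2 + 1

def tolerated_failures(nodes: int) -> int:
    """Nodes that can fail while the cluster stays quorate."""
    return nodes - votes_needed(nodes)

for n in (2, 3, 4, 5):
    print(f"{n} nodes: quorum at {votes_needed(n)} votes, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

Note that a 2-node cluster tolerates zero failures, which is why 3 is the practical minimum, and 4 nodes tolerate no more failures than 3 — hence the odd-number recommendation.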
from dataclasses import dataclass
@dataclass
class ClusterRequirement:
    component: str
    minimum: str
    recommended: str
    purpose: str
requirements = [
    ClusterRequirement("Nodes", "3 (for quorum)",
                       "3-5 nodes, odd number",
                       "Quorum voting, HA failover"),
    ClusterRequirement("CPU", "4 cores per node",
                       "16-32 cores (Xeon/EPYC)",
                       "VM/Container processing"),
    ClusterRequirement("RAM", "16 GB per node",
                       "64-256 GB ECC RAM",
                       "VM memory + ZFS ARC cache"),
    ClusterRequirement("Storage", "SSD 256 GB",
                       "NVMe 1-2TB + HDD for bulk",
                       "VM disks, Ceph OSD, ZFS pool"),
    ClusterRequirement("Network", "1GbE × 2",
                       "10GbE × 2 (management + storage)",
                       "Cluster comms, Ceph replication, migration"),
    ClusterRequirement("UPS", "Optional",
                       "UPS + NUT monitoring",
                       "Prevents data loss from power outages"),
]
print("=== Cluster Requirements ===")
for r in requirements:
    print(f" [{r.component}] Min: {r.minimum}")
    print(f" Recommended: {r.recommended}")
    print(f" Purpose: {r.purpose}")
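The RAM row combines VM allocations with the ZFS ARC cache. A rough sizing helper, using the ~1 GB ARC per 1 TB rule of thumb from the tips below (the host-overhead figure is an assumption):

```python
def node_ram_gb(vm_ram_gb: float, storage_tb: float,
                arc_gb_per_tb: float = 1.0,
                host_overhead_gb: float = 4.0) -> float:
    """Rule-of-thumb node RAM: VM allocations + ZFS ARC (~1 GB/TB)
    + host overhead (assumed 4 GB for PVE itself)."""
    return vm_ram_gb + storage_tb * arc_gb_per_tb + host_overhead_gb

# 48 GB of VMs on 8 TB of ZFS storage:
print(node_ram_gb(48, 8))  # → 60.0
```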
Storage and Backup
# === Storage Configuration ===
# ZFS Pool creation
# zpool create -f rpool mirror /dev/sda /dev/sdb
# zfs set compression=lz4 rpool
# zfs set atime=off rpool
# pvesm add zfspool local-zfs -pool rpool/data
# Ceph setup (on each node)
# pveceph install
# pveceph init --network 10.10.10.0/24
# pveceph mon create
# pveceph osd create /dev/sdc
# pveceph osd create /dev/sdd
# pveceph pool create vm-pool --pg_num 128
# Backup with PBS
# apt install proxmox-backup-server
# proxmox-backup-manager datastore create backups /mnt/backup
# # In PVE: Datacenter → Storage → Add → Proxmox Backup Server
# # Schedule via Datacenter → Backup (runs: vzdump --all --mode snapshot --storage pbs)
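PBS deduplicates unchanged chunks, so 30 daily backups cost far less than 30 full copies. A rough datastore-size estimate (the 5% daily change rate is an assumption; adjust per workload):

```python
def pbs_datastore_gb(vm_total_gb: float, days: int,
                     daily_change: float = 0.05) -> float:
    """Rough deduplicated datastore size: one full copy plus the
    changed chunks accumulated per retained day."""
    return vm_total_gb * (1 + days * daily_change)

# 500 GB of VMs, 30 daily backups retained:
print(pbs_datastore_gb(500, 30))  # → 1250.0 GB (vs 15000 GB without dedup)
```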
@dataclass
class StorageOption:
    storage: str
    type_: str
    performance: str
    ha_ready: str
    best_for: str
    cost: str
storages = [
    StorageOption("ZFS (local)",
                  "Local, Mirror/RAIDZ", "Very high (NVMe)",
                  "HA via ZFS replication (async)",
                  "Homelab, small cluster 2-3 nodes",
                  "Low (uses existing disks)"),
    StorageOption("Ceph (distributed)",
                  "Distributed, 3x replication", "High (10GbE required)",
                  "Built-in HA; auto-recovers when an OSD fails",
                  "Production cluster 3+ nodes",
                  "Medium (needs 10GbE + NVMe)"),
    StorageOption("NFS",
                  "Shared, network filesystem", "Medium",
                  "Shared, but the NFS server is a SPOF",
                  "Simple shared storage, migration",
                  "Low"),
    StorageOption("iSCSI/FC SAN",
                  "Block storage over network", "Very high",
                  "HA depends on SAN design",
                  "Enterprise, existing SAN infrastructure",
                  "High (SAN hardware)"),
]
print("=== Storage Options ===")
for s in storages:
    print(f" [{s.storage}] {s.type_}")
    print(f" Performance: {s.performance}")
    print(f" HA: {s.ha_ready}")
    print(f" Best for: {s.best_for}")
    print(f" Cost: {s.cost}")
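With Ceph's default size=3 replication, raw capacity divides by three, and OSDs should stay below roughly 80% full to leave rebalancing headroom. A sketch of the capacity math (the fill target is an assumption):

```python
def ceph_usable_tb(osd_count: int, osd_tb: float,
                   replicas: int = 3, fill_target: float = 0.8) -> float:
    """Usable Ceph capacity: raw space / replica count,
    capped at an assumed safe fill level."""
    return osd_count * osd_tb * fill_target / replicas

# 3 nodes × 3 OSDs × 2 TB NVMe each:
print(ceph_usable_tb(9, 2))  # → 4.8 TB usable from 18 TB raw
```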
Production Tips
# === Production Best Practices ===
@dataclass
class BestPractice:
    area: str
    practice: str
    why: str
    command: str
practices = [
    BestPractice("Network",
                 "Separate networks for management, storage, and VM traffic",
                 "Prevents congestion between the different traffic types",
                 "Create a separate vmbr bridge per network"),
    BestPractice("Backup",
                 "Daily backups with PBS, 30-day retention + offsite copy",
                 "Protects against data loss from hardware failure",
                 "vzdump --all --mode snapshot --storage pbs"),
    BestPractice("Monitoring",
                 "Monitor the cluster with Prometheus + Grafana",
                 "Catch CPU/RAM/disk/network problems before a node fails",
                 "pve-exporter + node-exporter + Grafana dashboard"),
    BestPractice("Update",
                 "Update Proxmox one node at a time (rolling update)",
                 "No downtime: HA migrates VMs to other nodes before each update",
                 "apt update && apt dist-upgrade (one node at a time)"),
    BestPractice("Security",
                 "Change port 8006, use 2FA, firewall, SSH keys",
                 "Prevents unauthorized access",
                 "Datacenter → Permissions → Two Factor (TOTP)"),
]
print("=== Best Practices ===")
for p in practices:
    print(f" [{p.area}] {p.practice}")
    print(f" Why: {p.why}")
    print(f" How: {p.command}")
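The rolling-update practice above can be sketched as a per-node checklist generator (node names are placeholders):

```python
def rolling_update_plan(nodes: list[str]) -> list[str]:
    """One drain/update/verify cycle per node, run strictly in sequence
    so the cluster never loses more than one node at a time."""
    steps = []
    for node in nodes:
        steps += [
            f"{node}: migrate or HA-evacuate all VMs to other nodes",
            f"{node}: apt update && apt dist-upgrade",
            f"{node}: reboot if a new kernel was installed",
            f"{node}: verify rejoin with pvecm status before the next node",
        ]
    return steps

for step in rolling_update_plan(["pve1", "pve2", "pve3"]):
    print(step)
```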
Tips
- Quorum: always use 3+ nodes for reliable HA
- 10GbE: use 10GbE for the storage/Ceph network; it makes a huge difference
- ZFS RAM: budget RAM for the ZFS ARC cache, roughly 1 GB per 1 TB of storage
- PBS: use Proxmox Backup Server (free); much better than plain vzdump thanks to deduplication and incremental backups
- Test: test HA failover every quarter; pull a node and verify VMs really migrate
What is Proxmox VE?
An open-source virtualization platform: KVM VMs and LXC containers managed from a built-in web UI, with clustering, HA, ZFS and Ceph storage, and PBS backup. Free license; suited to both homelab and enterprise use.
How do you build a cluster?
Use 3 nodes for quorum: pvecm create on the first node, pvecm add on the others. Put Corosync on its own network (10GbE recommended) and use shared storage (Ceph, ZFS replication, or NFS) so live migration works.
How does HA work?
ha-manager watches HA groups: when a node fails it is fenced and its VMs are restarted on other nodes automatically. This requires shared storage (Ceph/NFS) and 3-node quorum; failover typically takes 1-3 minutes.
Which storage is recommended?
ZFS local is fast with snapshots; Ceph is distributed with built-in HA; NFS is simple shared storage; iSCSI/FC SAN fits enterprises with existing infrastructure. For production: ZFS + Ceph over 10GbE with NVMe.
Summary
Proxmox VE delivers a production virtualization cluster: KVM/LXC workloads, HA backed by quorum and fencing, ZFS/Ceph storage, PBS backup, 10GbE networking, and monitoring.
