SiamCafe.net Blog
Technology

oVirt Virtualization High Availability (HA) Setup: Configuring HA for Virtual Machines

2026-04-19 · อ. บอม — SiamCafe.net · 1,473 words

What is oVirt High Availability?

oVirt is an open source virtualization platform built on the KVM hypervisor. It manages virtual machines, storage, and networks through a web-based management interface. High Availability (HA) in oVirt means that protected VMs keep running even when the host they run on fails: oVirt automatically restarts the affected VM on another healthy host.

oVirt HA is built around the hosted engine architecture, in which the oVirt Engine (the management plane) itself runs as a VM inside the cluster. If the host running the Engine fails, agents on the other hosts cooperate to restart the Engine VM on a healthy host. For workload VMs, a fencing mechanism confirms the host failure before the VMs are restarted on other hosts.

The main components of oVirt HA are:
- Hosted Engine Agent: manages the Engine VM
- VDSM (Virtual Desktop and Server Manager): manages the VMs on each host
- Fencing Agents: confirm host failure
- Shared Storage: holds VM disks, accessible from every host
- SPM (Storage Pool Manager): coordinates storage operations
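As a rough sketch of how these components interact, consider the following toy model (illustrative names only, not oVirt code): HA VMs from a failed host are restarted on surviving hosts, but only after fencing has confirmed the failure.

```python
# Toy model of oVirt-style VM HA: restart HA-enabled VMs from a failed
# host on surviving hosts, but only after the fencing agent confirms
# the failure (otherwise we risk split-brain).

def recover_ha_vms(vms, failed_host, surviving_hosts, fence_confirmed):
    """Return {vm_name: new_host} for HA VMs that were on the failed host.

    vms: list of (vm_name, host, ha_enabled) tuples.
    Without fence confirmation nothing is restarted.
    """
    if not fence_confirmed or not surviving_hosts:
        return {}
    placement = {}
    for i, (vm, host, ha) in enumerate(vms):
        if host == failed_host and ha:
            # naive round-robin placement across surviving hosts
            placement[vm] = surviving_hosts[i % len(surviving_hosts)]
    return placement

vms = [("web-01", "host01", True), ("db-01", "host01", False),
       ("app-01", "host02", True)]
print(recover_ha_vms(vms, "host01", ["host02", "host03"], fence_confirmed=True))
# → {'web-01': 'host02'}  (db-01 is not HA-enabled, app-01 was elsewhere)
print(recover_ha_vms(vms, "host01", ["host02", "host03"], fence_confirmed=False))
# → {}  (no fence confirmation: do nothing)
```

Note that the non-HA VM (`db-01`) stays down until someone restarts it manually; only VMs flagged as highly available are recovered automatically.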

Install the oVirt Engine and Hosts

Set up an oVirt cluster with HA

# === oVirt Installation ===

# 1. Install oVirt Engine (on CentOS Stream 9)
# Add oVirt repository
dnf install -y centos-release-ovirt45
dnf module enable -y pki-deps postgresql:15

# Install Engine
dnf install -y ovirt-engine

# 2. Configure Engine
engine-setup

# Answer the prompts:
# Configure Engine: Yes
# Configure Data Warehouse: Yes
# Configure Grafana: Yes
# Application mode: Both (Virt + Gluster)
# Firewall manager: firewalld
# Host FQDN: engine.example.com
# Organization name: MyOrg
# Admin password: ********
# Database: Local
# Default SAN wipe after delete: No
# NFS ISO domain: Yes (optional)

# 3. Install oVirt Node on Hosts
# Option A: oVirt Node (minimal OS image)
# Download oVirt Node ISO and install on bare metal

# Option B: Install on existing CentOS
dnf install -y centos-release-ovirt45
dnf install -y ovirt-host

# 4. Add Host to Cluster via Engine
# Web UI: https://engine.example.com
# Compute → Hosts → New
# Name: host01.example.com
# Hostname: host01.example.com
# Authentication: SSH Public Key or Password

# 5. Configure Network
# Create ovirtmgmt bridge (automatic during host add)
# Add additional networks:
# - storage: for NFS/iSCSI traffic
# - migration: for live migration traffic (dedicated NIC recommended)
# - vm: for VM traffic

# 6. Hosted Engine Deploy (self-hosted)
hosted-engine --deploy

# Answer prompts:
# Storage type: NFS
# Storage connection: nfs.example.com:/hosted-engine
# VM disk size: 80 GB
# VM memory: 16384 MB
# VM CPUs: 4
# Engine FQDN: engine.example.com
# Admin password: ********

# 7. Add Additional Hosts
# Each additional host runs the hosted-engine agent

# On additional hosts:
hosted-engine --deploy
# Select: Additional host

echo "oVirt Engine and Hosts installed"

Configure HA for VMs

Enable High Availability for virtual machines

# === VM High Availability Configuration ===

# 1. Via oVirt Engine REST API
# Enable HA for a VM
curl -k -u admin@internal:password \
  -H "Content-Type: application/xml" \
  -X PUT \
  "https://engine.example.com/ovirt-engine/api/vms/VM_ID" \
  -d '<vm>
        <high_availability>
          <enabled>true</enabled>
          <priority>100</priority>
        </high_availability>
      </vm>'
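When scripting against the REST API, building the XML body programmatically is less error-prone than inlining it in curl. A minimal sketch using only the Python standard library (the element names follow the oVirt API's `high_availability` element; sending it to the Engine is left out):

```python
# Build the <vm> payload that enables HA via the oVirt REST API.
import xml.etree.ElementTree as ET

def ha_payload(enabled: bool = True, priority: int = 100) -> str:
    """Serialize an oVirt <vm> update body that toggles high availability."""
    vm = ET.Element("vm")
    ha = ET.SubElement(vm, "high_availability")
    ET.SubElement(ha, "enabled").text = str(enabled).lower()
    ET.SubElement(ha, "priority").text = str(priority)
    return ET.tostring(vm, encoding="unicode")

print(ha_payload())
# → <vm><high_availability><enabled>true</enabled><priority>100</priority></high_availability></vm>
```

This string can be PUT to `/ovirt-engine/api/vms/VM_ID` with the same headers as the curl example above.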

# 2. Via Ansible (recommended for automation)
cat > ovirt_ha_setup.yml << 'EOF'
---
- name: Configure oVirt HA
  hosts: localhost
  connection: local
  gather_facts: false
  
  vars:
    engine_url: https://engine.example.com/ovirt-engine/api
    engine_user: admin@internal
    engine_password: "{{ vault_engine_password }}"
    engine_cafile: /etc/pki/ovirt-engine/ca.pem
  
  tasks:
    - name: Login to oVirt
      ovirt_auth:
        url: "{{ engine_url }}"
        username: "{{ engine_user }}"
        password: "{{ engine_password }}"
        ca_file: "{{ engine_cafile }}"
      register: ovirt_auth
    
    - name: Create HA VM
      ovirt_vm:
        auth: "{{ ovirt_auth.ovirt_auth }}"
        name: web-server-01
        cluster: Default
        template: centos9-template
        memory: 4GiB
        cpu_cores: 2
        cpu_sockets: 1
        high_availability: true
        high_availability_priority: 100
        operating_system: rhel_9x64
        type: server
        state: running
        nics:
          - name: nic1
            profile_name: ovirtmgmt
        disks:
          - name: web-server-01-disk
            size: 50GiB
            storage_domain: data-nfs
            interface: virtio_scsi
    
    - name: Configure VM HA Policy
      ovirt_vm:
        auth: "{{ ovirt_auth.ovirt_auth }}"
        name: web-server-01
        high_availability: true
        high_availability_priority: 100
        # storage domain holding the VM lease, so the VM can be
        # restarted even when its original host cannot be reached
        lease: data-nfs
        placement_policy: migratable
    
    - name: Logout
      ovirt_auth:
        ovirt_auth: "{{ ovirt_auth.ovirt_auth }}"
        state: absent
EOF

ansible-playbook ovirt_ha_setup.yml --ask-vault-pass

echo "HA VMs configured"

Fencing and Power Management

Configure fencing for host failure detection

# === Fencing Configuration ===

# Fencing is the mechanism that isolates a failed host
# so that its VMs can safely restart on another host,
# preventing split-brain (2 hosts running the same VM)

# 1. IPMI/BMC Fencing (physical servers)
# Configure via Engine UI:
# Compute → Hosts → host01 → Power Management
# Type: ipmilan
# Address: 192.168.1.101 (BMC IP)
# Username: admin
# Password: ********
# Options: lanplus=1

# API equivalent:
curl -k -u admin@internal:password \
  -H "Content-Type: application/xml" \
  -X POST \
  "https://engine.example.com/ovirt-engine/api/hosts/HOST_ID/fenceagents" \
  -d '<agent>
        <type>ipmilan</type>
        <address>192.168.1.101</address>
        <username>admin</username>
        <password>password</password>
        <order>1</order>
      </agent>'

# 2. Test Fencing
curl -k -u admin@internal:password \
  -H "Content-Type: application/xml" \
  -X POST \
  "https://engine.example.com/ovirt-engine/api/hosts/HOST_ID/fence" \
  -d '<action><fence_type>status</fence_type></action>'

# 3. Fencing Policy (Cluster Level)
# Compute → Clusters → Default → Fencing Policy
# Enable fencing: Yes
# Skip if host has live lease: Yes
# Skip if SD is active: Yes
# Skip if connectivity > threshold: Yes

# 4. Ansible Fencing Setup
cat > fencing_setup.yml << 'EOF'
---
- name: Configure Fencing
  hosts: localhost
  tasks:
    - name: Add fence agent
      ovirt_host_pm:
        auth: "{{ ovirt_auth.ovirt_auth }}"
        name: host01
        address: 192.168.1.101
        username: admin
        password: "{{ vault_bmc_password }}"
        type: ipmilan
        options:
          lanplus: 1
        order: 1
        state: present

    - name: Test fence status
      ovirt_host_pm:
        auth: "{{ ovirt_auth.ovirt_auth }}"
        name: host01
        state: status
      register: fence_status

    - name: Show fence status
      debug:
        var: fence_status
EOF

echo "Fencing configured"
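The cluster-level fencing policy boils down to a guard: fence only when fencing is enabled and none of the skip conditions hold. A sketch of that decision logic (illustrative, not the actual Engine code):

```python
# Decide whether to fence a host, mirroring oVirt's cluster fencing
# policy options (toy logic for illustration).

def should_fence(fencing_enabled, has_live_lease, storage_active,
                 hosts_with_connectivity_issues_pct, threshold_pct=50):
    """Fence only if enabled and no skip condition applies."""
    if not fencing_enabled:
        return False
    if has_live_lease:          # "Skip if host has live lease"
        return False
    if storage_active:          # "Skip if SD is active"
        return False
    # "Skip if connectivity > threshold": if many hosts lost connectivity
    # at once, the problem is likely the network, not this host.
    if hosts_with_connectivity_issues_pct > threshold_pct:
        return False
    return True

print(should_fence(True, False, False, 10))   # isolated host failure → fence
print(should_fence(True, False, False, 80))   # cluster-wide network issue → skip
```

The connectivity check is the interesting one: fencing a healthy host during a switch outage would make the outage worse, so the Engine deliberately holds back when many hosts look unreachable at once.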

Storage High Availability

Configure storage HA for oVirt

#!/usr/bin/env python3
# storage_ha.py: oVirt Storage HA Configuration
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("storage")

class OVirtStorageHA:
    def __init__(self):
        self.domains = []
    
    def storage_options(self):
        return {
            "nfs": {
                "description": "NFS shared storage",
                "ha_level": "Depends on NFS server HA",
                "performance": "Good for general workloads",
                "setup": "Simple, widely supported",
                "recommendation": "Use HA NFS (DRBD, GlusterFS, or NAS appliance)",
            },
            "iscsi": {
                "description": "iSCSI block storage",
                "ha_level": "Depends on target HA (multipath recommended)",
                "performance": "Better than NFS for I/O intensive",
                "setup": "Moderate complexity",
                "recommendation": "Use multipath with 2+ paths for HA",
            },
            "fc": {
                "description": "Fibre Channel SAN",
                "ha_level": "Very high (dual fabric)",
                "performance": "Best for enterprise workloads",
                "setup": "Complex, requires FC switches",
                "recommendation": "Enterprise only, dual fabric mandatory",
            },
            "glusterfs": {
                "description": "GlusterFS distributed storage",
                "ha_level": "Built-in replication (replica 3)",
                "performance": "Good, scales horizontally",
                "setup": "Moderate, can use oVirt hosts as storage",
                "recommendation": "Good for hyper-converged deployment",
            },
        }
    
    def glusterfs_setup(self):
        return {
            "description": "Hyper-converged: oVirt hosts also serve GlusterFS storage",
            "minimum_nodes": 3,
            "replica_count": 3,
            "steps": [
                "1. Install GlusterFS on all oVirt hosts",
                "2. Create GlusterFS volume with replica 3",
                "3. Add as storage domain in oVirt Engine",
                "4. VMs stored on replicated GlusterFS volume",
                "5. If 1 host fails, data still available on 2 others",
            ],
            "commands": [
                "gluster peer probe host02",
                "gluster peer probe host03",
                "gluster volume create vmstore replica 3 host01:/data/brick host02:/data/brick host03:/data/brick",
                "gluster volume start vmstore",
                "gluster volume set vmstore group virt",
                "gluster volume set vmstore storage.owner-uid 36",
                "gluster volume set vmstore storage.owner-gid 36",
            ],
        }
    
    def multipath_config(self):
        return {
            "description": "iSCSI multipath for storage HA",
            "config": {
                "file": "/etc/multipath.conf",
                "content": {
                    "defaults": {
                        "user_friendly_names": "yes",
                        "find_multipaths": "yes",
                        "path_grouping_policy": "failover",
                        "path_selector": "round-robin 0",
                        "failback": "immediate",
                        "no_path_retry": 5,
                    },
                },
            },
        }

storage = OVirtStorageHA()
options = storage.storage_options()
print("Storage Options:")
for name, opt in options.items():
    print(f"  {name}: {opt['ha_level']}")

gluster = storage.glusterfs_setup()
print(f"\nGlusterFS Setup ({gluster['minimum_nodes']} nodes, replica {gluster['replica_count']})")
for cmd in gluster["commands"][:3]:
    print(f"  {cmd}")
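The settings returned by multipath_config() can be rendered into an actual /etc/multipath.conf fragment. A small standalone sketch (the dict below repeats those settings so the snippet runs on its own):

```python
# Render a multipath.conf "defaults" section from a settings dict,
# matching the options suggested by OVirtStorageHA.multipath_config().

def render_multipath_defaults(settings: dict) -> str:
    lines = ["defaults {"]
    for key, value in settings.items():
        # quote values containing spaces, e.g. path_selector "round-robin 0"
        text = f'"{value}"' if isinstance(value, str) and " " in value else str(value)
        lines.append(f"    {key} {text}")
    lines.append("}")
    return "\n".join(lines)

defaults = {
    "user_friendly_names": "yes",
    "find_multipaths": "yes",
    "path_grouping_policy": "failover",
    "path_selector": "round-robin 0",
    "failback": "immediate",
    "no_path_retry": 5,
}
print(render_multipath_defaults(defaults))
```

After writing the file, reload with `systemctl reload multipathd` and verify paths with `multipath -ll`.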

Monitoring and Disaster Recovery

Monitor the oVirt cluster and plan for DR

# === oVirt Monitoring & DR ===

# 1. Built-in Grafana Dashboards
# oVirt Engine includes Grafana integration
# Access: https://engine.example.com/ovirt-engine-grafana/
# Dashboards:
# - Executive Dashboard (overview)
# - Inventory Dashboard (hosts, VMs, storage)
# - Service Level Dashboard (uptime, SLA)
# - Trend Dashboard (resource trends)

# 2. Prometheus Monitoring
cat > prometheus-ovirt.yml << 'EOF'
scrape_configs:
  - job_name: "ovirt-engine"
    metrics_path: /ovirt-engine/services/metrics
    static_configs:
      - targets: ["engine.example.com:443"]
    scheme: https
    tls_config:
      insecure_skip_verify: true

  - job_name: "ovirt-hosts"
    static_configs:
      - targets:
          - "host01:9100"
          - "host02:9100"
          - "host03:9100"
EOF

# 3. Health Check Script
cat > ovirt_health.sh << 'BASH'
#!/bin/bash
# oVirt Cluster Health Check

ENGINE_URL="https://engine.example.com/ovirt-engine/api"
AUTH="admin@internal:password"

echo "=== oVirt Cluster Health ==="

# Check hosts status
echo "Hosts:"
curl -sk -u $AUTH "$ENGINE_URL/hosts" \
  -H "Accept: application/json" | \
  python3 -c "
import sys, json
data = json.load(sys.stdin)
for h in data.get('host', []):
    print(f\"  {h['name']}: {h['status']}\")"

# Check VMs with HA enabled
echo -e "\nHA VMs:"
curl -sk -u $AUTH "$ENGINE_URL/vms?search=ha_enabled%3Dtrue" \
  -H "Accept: application/json" | \
  python3 -c "
import sys, json
data = json.load(sys.stdin)
for vm in data.get('vm', []):
    print(f\"  {vm['name']}: {vm['status']}\")"

# Check storage domains
echo -e "\nStorage Domains:"
curl -sk -u $AUTH "$ENGINE_URL/storagedomains" \
  -H "Accept: application/json" | \
  python3 -c "
import sys, json
data = json.load(sys.stdin)
for sd in data.get('storage_domain', []):
    avail = int(sd.get('available', 0)) // (1024**3)
    used = int(sd.get('used', 0)) // (1024**3)
    print(f\"  {sd['name']}: {sd['status']} (used: {used}GB, avail: {avail}GB)\")"
BASH

chmod +x ovirt_health.sh

# 4. Backup Engine
# Full backup:
engine-backup --mode=backup --file=engine-backup.tar.gz --log=backup.log

# Restore:
engine-backup --mode=restore --file=engine-backup.tar.gz --log=restore.log

# Schedule daily backup:
echo "0 2 * * * root engine-backup --mode=backup --file=/backup/engine-\$(date +\%Y\%m\%d).tar.gz --log=/var/log/engine-backup.log" > /etc/cron.d/ovirt-backup

echo "Monitoring and DR configured"

FAQ: Frequently Asked Questions

Q: oVirt or VMware vSphere, which should I choose?

A: oVirt is open source with no license cost, uses the KVM hypervisor (built into the Linux kernel), and relies mainly on community support, with commercial support available from Red Hat (RHEV). It suits teams with a limited budget and solid Linux skills. VMware vSphere is a commercial product with license costs, more polished features (vMotion, DRS, vSAN), stronger vendor support, and a larger ecosystem; it suits enterprises that need full support. Performance between KVM and ESXi is broadly comparable. oVirt has become an increasingly interesting option since the VMware Broadcom licensing changes.

Q: Is fencing necessary for HA?

A: Yes. Fencing is a requirement for VM HA. Without fencing, when a host fails oVirt cannot safely restart its VMs on another host, because it cannot be certain the host is really down; that risks split-brain (two copies of the same VM running at once, leading to data corruption). The fencing agent powers off the failed host first, and only then are the VMs restarted elsewhere. Fencing options include IPMI/BMC for physical servers and power-switch or PDU-based fencing; if no fencing hardware is available there is SSH fencing, but it is not recommended for production.

Q: How does Live Migration work?

A: Live Migration moves a running VM from one host to another with no downtime. The steps are: 1) copy memory pages from source to destination while the VM keeps running 2) repeatedly copy dirty pages (pages modified during the copy) 3) briefly pause the VM (milliseconds) to copy the last dirty pages 4) resume the VM on the destination 5) clean up the source. Requirements: both hosts must access the same shared storage, have compatible CPUs, and have network connectivity to each other. A dedicated migration network (10Gbps) is recommended for large VMs.
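The pre-copy loop in steps 1-3 can be simulated to see why migration only converges when the network copies pages faster than the VM dirties them (toy numbers, not a benchmark):

```python
# Toy simulation of pre-copy live migration: each round copies the
# remaining dirty pages while the running VM dirties new ones.

def precopy_rounds(total_pages, copy_rate, dirty_rate,
                   pause_threshold=100, max_rounds=30):
    """Return the number of pre-copy rounds before the final brief pause,
    or None if the migration never converges.

    copy_rate / dirty_rate are in pages per second.
    """
    dirty = total_pages
    for rounds in range(1, max_rounds + 1):
        seconds = dirty / copy_rate                      # time to copy this round
        dirty = min(total_pages, dirty_rate * seconds)   # pages re-dirtied meanwhile
        if dirty <= pause_threshold:   # small enough: pause, copy rest, resume
            return rounds
    return None                        # dirtying outpaces the network

print(precopy_rounds(1_000_000, copy_rate=100_000, dirty_rate=10_000))
# → 4  (converges in a few rounds)
print(precopy_rounds(1_000_000, copy_rate=100_000, dirty_rate=200_000))
# → None  (VM dirties memory faster than the network can copy)
```

This is exactly why the answer above recommends a dedicated fast migration network: a higher copy_rate shrinks each round and lets the dirty set converge below the pause threshold.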

Q: How does the Hosted Engine recover when the host running the Engine fails?

A: The Hosted Engine Agent on every host continuously monitors the Engine VM. When the host running the Engine fails, the agents on the remaining hosts negotiate (score-based) which host should start the Engine VM, and start it there. This typically takes around 3-5 minutes. During that window the VMs that are already running are unaffected, but the management interface is unavailable. At least 3 hosts in the hosted-engine cluster are recommended for sufficient redundancy.
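The score-based election can be sketched as picking the best-scoring live host (the scoring inputs here are invented for illustration; the real ovirt-ha-agent computes its score from memory, CPU load, gateway reachability, and so on):

```python
# Toy score-based election for restarting the hosted Engine VM:
# among hosts that are still alive, the highest-scoring host wins.

def elect_engine_host(hosts):
    """hosts: list of (name, alive, score). Return winner name, or None."""
    candidates = [(score, name) for name, alive, score in hosts if alive]
    if not candidates:
        return None
    # highest score wins; ties broken by name for determinism
    return max(candidates)[1]

hosts = [("host01", False, 3400),   # failed host that ran the Engine
         ("host02", True, 3000),
         ("host03", True, 3400)]
print(elect_engine_host(hosts))
# → host03  (best score among the hosts that are still alive)
```

With only two hosts, losing one leaves a single candidate and no real election, which is another reason the answer above recommends at least 3 hosted-engine hosts.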

📖 Related Articles

Flux CD GitOps High Availability HA Setup · Read article →
oVirt Virtualization Open Source Contribution · Read article →
oVirt Virtualization GreenOps Sustainability · Read article →
oVirt Virtualization Learning Path Roadmap · Read article →
oVirt Virtualization Edge Deployment · Read article →

📚 View all articles →