
BigQuery Scheduled Query Backup Recovery Strategy: Backing Up and Restoring Data

2025-06-16 · Ajarn Bom, SiamCafe.net · 1,387 words

A Backup Strategy for BigQuery Scheduled Queries

BigQuery Scheduled Query is widely used to build data pipelines that run SQL transformations on a schedule. Without a backup and recovery strategy, that work is exposed to accidental deletion, query bugs, unintended schema changes, and configuration drift.

What needs backing up: scheduled query configurations (SQL, schedule, destination), query results / destination tables, transfer configurations (Data Transfer Service), IAM permissions and access controls, and dependent resources (views, functions, procedures).

BigQuery ships with built-in protections: table snapshots (point-in-time recovery), time travel (7 days by default, configurable between 2 and 7 days), a fail-safe period (an additional 7 days after time travel ends), and dataset-level backup (cross-region replication). Scheduled query configurations, however, have no built-in backup at all; you must export them yourself.
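Time travel is the fastest of these protections to use in an incident. As a sketch, a small helper (the `time_travel_restore_sql` function is hypothetical, not part of any Google SDK) can build the documented FOR SYSTEM_TIME AS OF restore statement from a table name and a timestamp:

```python
from datetime import datetime, timezone

def time_travel_restore_sql(table: str, as_of: datetime) -> str:
    """Build a CREATE OR REPLACE ... FOR SYSTEM_TIME AS OF statement.

    `table` is a fully qualified `dataset.table` name; `as_of` must fall
    inside the time travel window (7 days by default).
    """
    ts = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    return (
        f"CREATE OR REPLACE TABLE `{table}` AS\n"
        f"SELECT * FROM `{table}`\n"
        f"  FOR SYSTEM_TIME AS OF TIMESTAMP('{ts}')"
    )

sql = time_travel_restore_sql(
    "analytics.orders",
    datetime(2024, 6, 14, 10, 0, tzinfo=timezone.utc),
)
print(sql)
```

Note that the statement both reads from and replaces the same table; BigQuery evaluates the time-travel read before the replace, which is what makes this single-statement restore safe.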

Setting Up Backup Automation

Automating backups of BigQuery resources:

# === BigQuery Backup Automation ===

# 1. Export Scheduled Query Configurations
cat > backup_scheduled_queries.sh << 'BASH'
#!/bin/bash
# Backup all scheduled query configurations
PROJECT_ID="my-project"
BACKUP_DIR="./backups/$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

echo "=== Backing up BigQuery Scheduled Queries ==="

# List all transfer configs (scheduled queries)
bq ls --transfer_config --transfer_location=asia-southeast1 \
  --project_id=$PROJECT_ID --format=json > "$BACKUP_DIR/transfer_configs.json"

# Export each config details
cat "$BACKUP_DIR/transfer_configs.json" | python3 -c "
import json, sys
configs = json.load(sys.stdin)
for config in configs:
    name = config.get('name', '').split('/')[-1]
    with open(f'$BACKUP_DIR/config_{name}.json', 'w') as f:
        json.dump(config, f, indent=2)
    print(f'Backed up: {name}')
"

# Backup table schemas
for dataset in $(bq ls --format=json | python3 -c "import json, sys;[print(d['datasetReference']['datasetId']) for d in json.load(sys.stdin)]"); do
    mkdir -p "$BACKUP_DIR/schemas/$dataset"
    for table in $(bq ls "$dataset" --format=json 2>/dev/null | python3 -c "import json, sys;[print(t['tableReference']['tableId']) for t in json.load(sys.stdin)]" 2>/dev/null); do
        bq show --schema --format=json "$dataset.$table" > "$BACKUP_DIR/schemas/$dataset/$table.json" 2>/dev/null
    done
done

# Backup views and routines
bq ls --routines "analytics" --format=json > "$BACKUP_DIR/routines.json" 2>/dev/null

echo "Backup complete: $BACKUP_DIR"
BASH

# 2. Table Snapshot for Point-in-Time Recovery
cat > create_snapshots.sql << 'SQL'
-- Create table snapshots for critical tables
CREATE SNAPSHOT TABLE `backup.orders_snapshot_20240615`
CLONE `analytics.orders`
OPTIONS(
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
);

CREATE SNAPSHOT TABLE `backup.daily_kpis_snapshot_20240615`
CLONE `analytics.daily_kpis`
OPTIONS(
  expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
);

-- Automated snapshot via scheduled query (daily)
-- Schedule: every day 01:00
DECLARE snapshot_name STRING;
SET snapshot_name = CONCAT('backup.orders_snap_', FORMAT_DATE('%Y%m%d', CURRENT_DATE()));

EXECUTE IMMEDIATE FORMAT("""
  CREATE SNAPSHOT TABLE `%s`
  CLONE `analytics.orders`
  OPTIONS(expiration_timestamp = TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL 30 DAY))
""", snapshot_name);
SQL

# 3. Cross-Region Replication
cat > cross_region_backup.sh << 'BASH'
#!/bin/bash
# Copy critical datasets to backup region
bq cp --force \
  "my-project:analytics.orders" \
  "my-project:analytics_backup_us.orders"

# Or use dataset copy (entire dataset)
bq mk --transfer_config \
  --project_id=my-project \
  --data_source=cross_region_copy \
  --target_dataset=analytics_backup_us \
  --display_name="Daily Analytics Backup" \
  --schedule="every day 02:00" \
  --params='{
    "source_dataset_id": "analytics",
    "source_project_id": "my-project",
    "overwrite_destination_table": "true"
  }'
BASH

chmod +x backup_scheduled_queries.sh cross_region_backup.sh
echo "Backup automation configured"
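The snapshot scripts above create a new dated snapshot every day; without pruning they accumulate indefinitely. A sketch of a tiered retention policy using the daily/weekly/monthly tiers listed later in the cost-optimization section (the `snapshots_to_expire` helper is hypothetical):

```python
from datetime import date, timedelta

def snapshots_to_expire(snapshot_dates, today,
                        keep_daily=7, keep_weekly=30, keep_monthly=365):
    """Return the snapshot dates that can be expired.

    Tiers: keep every snapshot for `keep_daily` days, Monday snapshots
    for `keep_weekly` days, and first-of-month snapshots for
    `keep_monthly` days. Everything else is eligible for expiration.
    """
    expired = []
    for d in snapshot_dates:
        age = (today - d).days
        if age <= keep_daily:
            continue                      # daily tier
        if d.weekday() == 0 and age <= keep_weekly:
            continue                      # weekly tier (Mondays)
        if d.day == 1 and age <= keep_monthly:
            continue                      # monthly tier (1st of month)
        expired.append(d)
    return expired

today = date(2024, 6, 15)
dates = [today - timedelta(days=n) for n in range(0, 40)]
exp = snapshots_to_expire(dates, today)
print(f"{len(exp)} of {len(dates)} snapshots can expire")
```

In practice you would map each returned date back to a snapshot name like `backup.orders_snap_YYYYMMDD` and either DROP it or shorten its expiration_timestamp.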

Recovery Strategy and Disaster Recovery

A recovery plan for data and for configurations:

#!/usr/bin/env python3
# recovery_manager.py - BigQuery Recovery Manager
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("recovery")

class BigQueryRecoveryManager:
    """Manage backup and recovery for BigQuery"""
    
    def __init__(self, project_id="my-project"):
        self.project_id = project_id
    
    def recovery_options(self):
        return {
            "time_travel": {
                "description": "Query data at a specific point in time",
                "retention": "7 days (default, configurable)",
                "sql": "SELECT * FROM `analytics.orders` FOR SYSTEM_TIME AS OF TIMESTAMP('2024-06-14 10:00:00 UTC')",
                "restore_sql": "CREATE OR REPLACE TABLE `analytics.orders` AS SELECT * FROM `analytics.orders` FOR SYSTEM_TIME AS OF TIMESTAMP('2024-06-14 10:00:00 UTC')",
                "cost": "Query cost only (no storage cost for time travel)",
                "use_case": "Accidental DELETE/UPDATE, wrong query result",
            },
            "table_snapshot": {
                "description": "Point-in-time copy of a table",
                "retention": "Configurable (set expiration)",
                "restore_sql": "CREATE OR REPLACE TABLE `analytics.orders` CLONE `backup.orders_snapshot_20240615`",
                "cost": "Storage for changed data only (incremental)",
                "use_case": "Before major schema changes, pre-deployment backup",
            },
            "dataset_copy": {
                "description": "Full dataset copy to another region/project",
                "restore_sql": "bq cp backup_project:analytics_backup.orders my-project:analytics.orders",
                "cost": "Full storage cost in backup location",
                "use_case": "Disaster recovery, region failure",
            },
            "export_to_gcs": {
                "description": "Export tables to Google Cloud Storage",
                "command": "bq extract --destination_format=PARQUET analytics.orders gs://backup-bucket/orders/*.parquet",
                "restore": "bq load --source_format=PARQUET analytics.orders gs://backup-bucket/orders/*.parquet",
                "cost": "GCS storage cost ($0.02/GB/month)",
                "use_case": "Long-term archive, compliance requirements",
            },
        }
    
    def recovery_plan(self, scenario):
        """Get recovery plan for specific scenario"""
        plans = {
            "accidental_delete": {
                "severity": "HIGH",
                "rto": "15 minutes",
                "steps": [
                    "1. Identify deleted data using INFORMATION_SCHEMA.TABLE_STORAGE",
                    "2. Use Time Travel to query data before deletion",
                    "3. Restore: CREATE OR REPLACE TABLE ... FOR SYSTEM_TIME AS OF ...",
                    "4. Validate row counts and data integrity",
                    "5. Notify stakeholders",
                ],
            },
            "bad_query_result": {
                "severity": "MEDIUM",
                "rto": "30 minutes",
                "steps": [
                    "1. Identify the scheduled query that produced bad results",
                    "2. Pause the scheduled query",
                    "3. Use Time Travel to restore previous good state",
                    "4. Fix the query SQL",
                    "5. Test with --dry_run before re-enabling",
                    "6. Re-enable scheduled query",
                ],
            },
            "region_outage": {
                "severity": "CRITICAL",
                "rto": "1-4 hours",
                "steps": [
                    "1. Confirm region outage from Google Cloud Status",
                    "2. Switch DNS/endpoints to backup region",
                    "3. Activate cross-region dataset copies",
                    "4. Re-create scheduled queries in backup region",
                    "5. Validate data freshness",
                    "6. Monitor until primary region recovers",
                ],
            },
        }
        return plans.get(scenario, {"error": "Unknown scenario"})

manager = BigQueryRecoveryManager()
options = manager.recovery_options()
print("Recovery Options:")
for name, info in options.items():
    print(f"\n  {name}:")
    print(f"    {info['description']}")
    print(f"    Use case: {info['use_case']}")
    print(f"    Cost: {info['cost']}")

plan = manager.recovery_plan("accidental_delete")
print(f"\nRecovery Plan (Accidental Delete):")
print(f"  Severity: {plan['severity']}, RTO: {plan['rto']}")
for step in plan["steps"]:
    print(f"  {step}")
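Step 4 of the accidental-delete plan says to validate row counts after restore. A minimal sketch of such a check; the `validate_restore` helper is hypothetical, and the tolerance allows for rows legitimately written between the restore point and the incident:

```python
def validate_restore(expected_rows: int, restored_rows: int,
                     tolerance: float = 0.001) -> bool:
    """Check a restored row count against the pre-incident count.

    `expected_rows` would come from INFORMATION_SCHEMA or job history;
    a relative drift within `tolerance` (0.1% here) passes.
    """
    if expected_rows == 0:
        return restored_rows == 0
    drift = abs(restored_rows - expected_rows) / expected_rows
    return drift <= tolerance

print(validate_restore(1_000_000, 999_500))   # within 0.1% tolerance
print(validate_restore(1_000_000, 990_000))   # 1% drift, fails
```

A stricter variant would also compare checksums of key columns (e.g. SUM of an amount column) rather than counts alone.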

Version Control for SQL Queries

Managing versions of scheduled queries:

# === Version Control for Scheduled Queries ===

# 1. Git-based query management
cat > queries/daily_sales.sql << 'SQL'
-- daily_sales.sql
-- Schedule: every day 02:00
-- Destination: analytics.daily_sales
-- Write disposition: WRITE_TRUNCATE

SELECT
  DATE(order_date) AS sale_date,
  product_category,
  COUNT(*) AS order_count,
  SUM(amount) AS total_revenue,
  AVG(amount) AS avg_order_value,
  COUNT(DISTINCT customer_id) AS unique_customers,
  CURRENT_TIMESTAMP() AS processed_at
FROM `raw.orders`
WHERE DATE(order_date) = @run_date
GROUP BY sale_date, product_category
SQL

# 2. Terraform for scheduled query management
cat > scheduled_queries.tf << 'EOF'
locals {
  scheduled_queries = {
    daily_sales = {
      display_name = "Daily Sales Aggregation"
      schedule     = "every day 02:00"
      query_file   = "queries/daily_sales.sql"
      destination  = "daily_sales"
      write_disp   = "WRITE_TRUNCATE"
    }
    hourly_metrics = {
      display_name = "Hourly Website Metrics"
      schedule     = "every 1 hours"
      query_file   = "queries/hourly_metrics.sql"
      destination  = "hourly_metrics_{run_date}"
      write_disp   = "WRITE_APPEND"
    }
    weekly_cohort = {
      display_name = "Weekly Cohort Analysis"
      schedule     = "every monday 04:00"
      query_file   = "queries/weekly_cohort.sql"
      destination  = "weekly_cohort"
      write_disp   = "WRITE_TRUNCATE"
    }
  }
}

resource "google_bigquery_data_transfer_config" "scheduled" {
  for_each = local.scheduled_queries

  display_name           = each.value.display_name
  location               = "asia-southeast1"
  data_source_id         = "scheduled_query"
  schedule               = each.value.schedule
  destination_dataset_id = "analytics"

  params = {
    destination_table_name_template = each.value.destination
    write_disposition               = each.value.write_disp
    query                           = file(each.value.query_file)
  }

  email_preferences {
    enable_failure_email = true
  }
}
EOF

# 3. CI/CD Pipeline
cat > .github/workflows/deploy-queries.yml << 'EOF'
name: Deploy Scheduled Queries

on:
  push:
    branches: [main]
    paths: ['queries/**', 'scheduled_queries.tf']

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate SQL
        run: |
          for f in queries/*.sql; do
            echo "Validating: $f"
            bq query --dry_run --use_legacy_sql=false < "$f"
          done

  deploy:
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - name: Terraform Apply
        run: |
          terraform init
          terraform plan -out=tfplan
          terraform apply tfplan
EOF

echo "Version control configured"
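The comment header in queries/daily_sales.sql doubles as machine-readable metadata. A sketch of a parser (hypothetical, with an inline copy of the header for illustration) that CI could use to diff the Git version against the deployed transfer config and catch drift:

```python
import re

def parse_query_header(sql_text: str) -> dict:
    """Extract `-- Key: value` metadata lines from a query file header."""
    meta = {}
    for line in sql_text.splitlines():
        m = re.match(r"--\s*([\w ]+):\s*(.+)", line)
        if m:
            key = m.group(1).strip().lower().replace(" ", "_")
            meta[key] = m.group(2).strip()
    return meta

sql = """-- daily_sales.sql
-- Schedule: every day 02:00
-- Destination: analytics.daily_sales
-- Write disposition: WRITE_TRUNCATE
SELECT 1
"""
meta = parse_query_header(sql)
print(meta)
```

Comparing `meta["schedule"]` against the `schedule` field of the deployed transfer config (from the JSON backup or the Terraform state) turns configuration drift into a failing CI check instead of a surprise.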

Monitoring and Alerting

A dashboard for tracking backup status:

#!/usr/bin/env python3
# backup_monitor.py - Backup Monitoring Dashboard
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("monitor")

class BackupMonitor:
    def __init__(self):
        pass
    
    def dashboard(self):
        return {
            "backup_status": {
                "scheduled_query_configs": {"status": "OK", "last_backup": "2024-06-15 01:00", "count": 12},
                "table_snapshots": {"status": "OK", "last_snapshot": "2024-06-15 01:30", "tables": 8},
                "cross_region_copy": {"status": "OK", "last_copy": "2024-06-15 02:00", "datasets": 2},
                "gcs_export": {"status": "WARNING", "last_export": "2024-06-14 03:00", "note": "24h+ since last export"},
            },
            "recovery_readiness": {
                "time_travel_available": True,
                "time_travel_window": "7 days",
                "latest_snapshot": "2024-06-15 01:30",
                "backup_region": "us-central1",
                "last_dr_test": "2024-05-15",
                "rto_target": "30 minutes",
                "rpo_target": "1 hour",
            },
            "storage_costs": {
                "primary_data": {"size": "2.1 TB", "cost": "$42/month"},
                "snapshots": {"size": "150 GB", "cost": "$3/month"},
                "cross_region": {"size": "2.1 TB", "cost": "$42/month"},
                "gcs_archive": {"size": "500 GB", "cost": "$10/month"},
                "total_backup_cost": "$55/month",
            },
            "alerts": [
                {"severity": "WARNING", "message": "GCS export not run in 24h+"},
                {"severity": "INFO", "message": "DR test due (last: 30 days ago)"},
            ],
            "scheduled_query_health": [
                {"name": "daily_sales", "last_run": "success", "duration": "45s", "rows": 1250},
                {"name": "hourly_metrics", "last_run": "success", "duration": "12s", "rows": 8500},
                {"name": "weekly_cohort", "last_run": "success", "duration": "3m20s", "rows": 45000},
                {"name": "daily_dq_checks", "last_run": "failed", "error": "Table not found"},
            ],
        }

monitor = BackupMonitor()
dash = monitor.dashboard()

print("Backup & Recovery Dashboard:")
for name, info in dash["backup_status"].items():
    last = next((info[k] for k in ("last_backup", "last_snapshot", "last_copy", "last_export") if k in info), "")
    print(f"  [{info['status']}] {name}: last={last}")

recovery = dash["recovery_readiness"]
print(f"\nRecovery Readiness:")
print(f"  Time Travel: {recovery['time_travel_window']}")
print(f"  RTO: {recovery['rto_target']}, RPO: {recovery['rpo_target']}")

costs = dash["storage_costs"]
print(f"\nBackup Costs: {costs['total_backup_cost']}")

print(f"\nScheduled Query Health:")
for q in dash["scheduled_query_health"]:
    status = "OK" if q["last_run"] == "success" else "FAIL"
    print(f"  [{status}] {q['name']}: {q.get('rows', q.get('error', ''))}")

Cost Optimization for Backups

Reducing the cost of backups:

# === Cost Optimization ===

cat > backup_cost_optimization.yaml << 'EOF'
backup_cost_optimization:
  table_snapshots:
    strategy: "Incremental snapshots (store only changes)"
    tip: "Set expiration so old snapshots are cleaned up automatically"
    retention:
      daily: "7 days"
      weekly: "30 days"
      monthly: "365 days"
    cost_savings: "60-80% vs full copies"

  time_travel:
    strategy: "Use the time travel window instead of manual snapshots for short-term recovery"
    default: "7 days (free, included in storage)"
    tip: "No snapshot needed when recovery within 7 days is enough"

  gcs_archive:
    strategy: "Export to Coldline/Archive storage"
    storage_classes:
      standard: "$0.020/GB/month"
      nearline: "$0.010/GB/month (30-day min)"
      coldline: "$0.004/GB/month (90-day min)"
      archive: "$0.0012/GB/month (365-day min)"
    tip: "Use lifecycle rules to transition Standard → Coldline → Archive"

  cross_region:
    strategy: "Replicate only critical tables, not the entire dataset"
    tip: "Copy only gold/silver tables via scheduled queries"
    cost_savings: "50-70% vs full dataset replication"

  query_optimization:
    strategy: "Optimize scheduled queries to reduce bytes processed"
    tips:
      - "Partition tables to cut scan cost by 80-95%"
      - "SELECT specific columns instead of SELECT *"
      - "Use materialized views for repeated queries"
      - "Set query priority BATCH for non-urgent backups"
EOF

python3 -c "
import yaml
with open('backup_cost_optimization.yaml') as f:
    data = yaml.safe_load(f)
opt = data['backup_cost_optimization']
print('Backup Cost Optimization:')
for name, info in opt.items():
    print(f'\n  {name}:')
    print(f'    Strategy: {info[\"strategy\"]}')
    print(f'    Tip: {info[\"tip\"]}')
    if 'cost_savings' in info:
        print(f'    Savings: {info[\"cost_savings\"]}')
"

echo "Cost optimization guide ready"

FAQ: Frequently Asked Questions

Q: How do BigQuery Time Travel and Table Snapshots differ, and when should each be used?

A: Time Travel is a built-in feature: you can query data as it was at any point in the past 7 days with no setup (and no extra storage cost) using the FOR SYSTEM_TIME AS OF syntax, which makes it ideal for quick recovery from accidental changes. A Table Snapshot is an explicit copy you create at a moment of your choosing, with a configurable expiration; it stores data incrementally (only the changes), is created with the CLONE syntax, and suits pre-deployment backups or retention beyond 7 days. Rule of thumb: use Time Travel for recovery within 7 days, take a Snapshot before major changes (schema migrations, large updates), and use Snapshots whenever you need retention longer than 7 days.

Q: How do you back up Scheduled Query configurations?

A: BigQuery has no built-in backup for scheduled query configurations; anything created with bq mk --transfer_config or in the Console can simply be lost. Best practice: manage configurations as code with Terraform/Pulumi, keep the SQL queries in a Git repository, export configurations to JSON on a schedule (the backup script above), and deploy scheduled queries from code via CI/CD. For recovery: from Terraform state, terraform import the existing configurations; from the Git repo, re-deploy with terraform apply; from a JSON backup, re-create the configs with the bq CLI. The key point: if you manage scheduled queries only through the Console, you have no code backup at all.

Q: What RPO and RTO targets should you set?

A: RPO (Recovery Point Objective) is the amount of data loss you can tolerate; RPO = 1 hour means losing at most the last hour of data. RTO (Recovery Time Objective) is how long recovery is allowed to take. For BigQuery: RPO 0 (no data loss) is achievable with Time Travel plus frequent snapshots, RPO of 1 hour with hourly scheduled snapshots, and RPO of 24 hours with daily backups. On the RTO side: under 15 minutes with Time Travel or a Snapshot (CLONE is fast), under 1 hour with a cross-region copy, and under 4 hours with GCS export plus re-import. Suggested targets: analytics workloads RPO 1h / RTO 30min; business-critical workloads RPO 0 / RTO 15min.
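Those RPO figures imply a snapshot frequency, since the worst-case loss is one full interval between snapshots. A sketch of that arithmetic (the `snapshots_per_day` helper is hypothetical):

```python
import math

def snapshots_per_day(rpo_hours: float) -> int:
    """Minimum scheduled snapshots per day so that the worst-case data
    loss (one full interval between snapshots) stays within the RPO."""
    if rpo_hours <= 0:
        raise ValueError(
            "RPO of zero needs continuous protection (Time Travel), not snapshots")
    return math.ceil(24 / rpo_hours)

print(snapshots_per_day(1))    # RPO 1h  -> 24 snapshots/day
print(snapshots_per_day(24))   # RPO 24h -> 1 snapshot/day
```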

Q: How much does backup cost, and how do you control it?

A: As a rule of thumb, backup cost runs about 10-20% of primary storage cost. Example: primary data 2 TB ($42/month), snapshots 150 GB ($3), cross-region copy 2 TB ($42, optional), GCS archive 500 GB ($2-10 depending on storage class), for a total of roughly $47-97/month. To keep backup costs down: rely on Time Travel (free) instead of snapshots for recovery within 7 days, snapshot only critical tables, replicate only the gold layer cross-region, export to Coldline/Archive classes in GCS, and set expirations so backups clean themselves up.
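The dollar figures above follow from per-GB prices. A sketch of the arithmetic, with approximate list prices hardcoded as assumptions (verify against current GCP pricing before budgeting):

```python
# Approximate list prices in USD/GB/month (assumptions, not current quotes).
PRICES = {
    "bq_active": 0.020,      # BigQuery active logical storage
    "gcs_coldline": 0.004,   # GCS Coldline class
}

def monthly_backup_cost(snapshot_gb: float, cross_region_gb: float,
                        gcs_archive_gb: float,
                        gcs_class: str = "gcs_coldline") -> float:
    """Estimate monthly backup storage cost in USD.

    Snapshots and cross-region copies bill at BigQuery storage rates;
    snapshots only store changed bytes, which is why snapshot_gb is small
    relative to the table size.
    """
    return round(
        snapshot_gb * PRICES["bq_active"]
        + cross_region_gb * PRICES["bq_active"]
        + gcs_archive_gb * PRICES[gcs_class], 2)

# Figures from the example above: 150 GB snapshots, 2 TB cross-region, 500 GB archive
print(monthly_backup_cost(150, 2100, 500))  # → 47.0
```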
