Skaffold Dev Batch Processing Pipeline: Development
Skaffold Batch Processing

This article covers building a batch processing pipeline with skaffold dev on Kubernetes: Jobs, CronJobs, and Argo Workflows, from the local dev loop through CI/CD to production.
| Tool | Use Case | Complexity | Scheduling | Best For |
|---|---|---|---|---|
| K8s Job | Single batch task | Low | Manual / CI trigger | Simple ETL, migration |
| K8s CronJob | Scheduled batch | Low | Cron expression | Daily reports, cleanup |
| Argo Workflows | Multi-step pipeline | Medium | Cron + Event | Complex DAG pipelines |
| Spark on K8s | Big data processing | High | Airflow / Argo | TB-scale data |
| Tekton | CI/CD + batch | Medium | Trigger-based | Build + process pipeline |
Skaffold Configuration
# === Skaffold + Batch Job Setup ===
# skaffold.yaml
apiVersion: skaffold/v4beta6
kind: Config
metadata:
  name: batch-pipeline
build:
  artifacts:
    - image: batch-etl
      docker:
        dockerfile: Dockerfile.etl
    - image: batch-report
      docker:
        dockerfile: Dockerfile.report
# In the v4beta schemas, raw manifests are listed under the top-level
# manifests.rawYaml key rather than under deploy.kubectl.manifests.
manifests:
  rawYaml:
    - k8s/etl-job.yaml
    - k8s/report-cronjob.yaml
deploy:
  kubectl: {}
profiles:
  - name: production
    build:
      tagPolicy:
        sha256: {}
    manifests:
      rawYaml:
        - k8s/production/*.yaml
# k8s/etl-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: etl-daily
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 3600
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: etl
          image: batch-etl
          command: ["python", "etl.py"]
          resources:
            requests: { cpu: "500m", memory: "1Gi" }
            limits: { cpu: "2", memory: "4Gi" }
          env:
            - name: DB_URL
              valueFrom:
                secretKeyRef: { name: db-secret, key: url }
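The skaffold.yaml above references k8s/report-cronjob.yaml, which the article does not show. The following is a minimal sketch of what such a manifest might contain, built as a Python dict and printed as JSON (kubectl accepts JSON manifests as well as YAML). The name `report-daily`, the schedule, and the deadline and TTL values are illustrative assumptions, not taken from the original setup.

```python
import json


def report_cronjob(schedule: str = "0 6 * * *") -> dict:
    """Build a CronJob manifest for the batch-report image as a plain dict."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": "report-daily"},  # hypothetical name
        "spec": {
            "schedule": schedule,  # assumed: every day at 06:00
            "concurrencyPolicy": "Forbid",  # do not start a run while one is active
            "jobTemplate": {
                "spec": {
                    "backoffLimit": 2,
                    "activeDeadlineSeconds": 900,  # assumed 15-minute deadline
                    "ttlSecondsAfterFinished": 86400,  # assumed 1-day cleanup
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": "report",
                                    "image": "batch-report",
                                    "command": ["python", "generate_report.py"],
                                }
                            ],
                        }
                    },
                }
            },
        },
    }


# Print the manifest so it can be redirected to a file and reviewed.
print(json.dumps(report_cronjob(), indent=2))
```

Generating the manifest from code is only one option; a hand-written YAML file with the same fields works equally well with Skaffold.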
# Commands
skaffold dev      # Watch mode, auto rebuild+deploy on change
skaffold run      # One-time build+deploy (CI/CD)
skaffold debug    # Deploy with debug ports
skaffold delete   # Clean up deployed resources
skaffold render   # Output rendered manifests
from dataclasses import dataclass


@dataclass
class SkaffoldCommand:
    command: str
    use_case: str
    when: str
    flags: str


commands = [
    SkaffoldCommand("skaffold dev", "Watch mode, rebuild on change",
                    "Local development", "--port-forward --tail"),
    SkaffoldCommand("skaffold run", "One-time build and deploy",
                    "CI/CD pipeline", "--tag=$GIT_SHA --profile=production"),
    SkaffoldCommand("skaffold debug", "Deploy with remote debug",
                    "Debugging batch job issues", "--port-forward"),
    SkaffoldCommand("skaffold render", "Output K8s manifests",
                    "Review what will be deployed", "--output=rendered.yaml"),
    SkaffoldCommand("skaffold delete", "Clean up resources",
                    "After testing, cleanup", ""),
    SkaffoldCommand("skaffold build", "Build images only",
                    "CI build step", "--tag=$GIT_SHA --push"),
]

print("=== Skaffold Commands ===")
for c in commands:
    print(f"  [{c.command}] {c.use_case}")
    print(f"    When: {c.when} | Flags: {c.flags}")
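In a CI job, the `skaffold run` invocation listed above is usually assembled from the commit SHA and a profile name. A small sketch of that step follows; the helper name `skaffold_run_args` and the sample SHA are hypothetical, and only the `--tag` and `--profile` flags from the table above are used.

```python
import shlex


def skaffold_run_args(git_sha: str, profile: str = "production") -> list[str]:
    """Return the argv list for a one-time build and deploy of one commit."""
    if not git_sha:
        raise ValueError("git_sha must not be empty")
    return ["skaffold", "run", f"--tag={git_sha}", f"--profile={profile}"]


# Show the command line a CI step would execute for a sample commit.
args = skaffold_run_args("abc1234")
print(shlex.join(args))
```

Passing the resulting list to `subprocess.run(args, check=True)` would execute it; keeping the arguments as a list avoids shell quoting problems.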
Pipeline Orchestration
# === Argo Workflows for Complex Pipelines ===
# argo-workflow.yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: batch-pipeline
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: extract
            template: etl-step
            arguments:
              parameters: [{name: step, value: extract}]
          - name: transform
            template: etl-step
            dependencies: [extract]
            arguments:
              parameters: [{name: step, value: transform}]
          - name: load
            template: etl-step
            dependencies: [transform]
            arguments:
              parameters: [{name: step, value: load}]
          - name: report
            template: report-step
            dependencies: [load]
    - name: etl-step
      # The template must declare the parameter it reads below.
      inputs:
        parameters:
          - name: step
      container:
        image: batch-etl
        command: [python, etl.py, "{{inputs.parameters.step}}"]
    - name: report-step
      container:
        image: batch-report
        command: [python, generate_report.py]
from dataclasses import dataclass


@dataclass
class PipelineStage:
    stage: str
    image: str
    resources: str
    timeout: str
    retry: int
    depends_on: str


pipeline = [
    PipelineStage("Extract", "batch-etl",
                  "CPU: 1, Memory: 2Gi", "30 min", 3, "None"),
    PipelineStage("Transform", "batch-etl",
                  "CPU: 2, Memory: 4Gi", "60 min", 2, "Extract"),
    PipelineStage("Load", "batch-etl",
                  "CPU: 1, Memory: 2Gi", "30 min", 3, "Transform"),
    PipelineStage("Report", "batch-report",
                  "CPU: 500m, Memory: 1Gi", "15 min", 2, "Load"),
    PipelineStage("Notify", "batch-notify",
                  "CPU: 100m, Memory: 128Mi", "5 min", 1, "Report"),
]

print("=== Pipeline Stages ===")
for p in pipeline:
    print(f"  [{p.stage}] Image: {p.image}")
    print(f"    Resources: {p.resources} | Timeout: {p.timeout}")
    print(f"    Retry: {p.retry} | Depends: {p.depends_on}")
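The `depends_on` column above defines a DAG, and an orchestrator such as Argo runs a stage only after its dependencies finish. As a rough illustration of how that ordering falls out of the dependency map, here is a sketch using the standard library's `graphlib` (Python 3.9+). The dict layout and the `execution_order` helper are assumptions for this example, not how Argo implements scheduling.

```python
from graphlib import TopologicalSorter

# Each stage maps to the list of stages it depends on (from the table above).
depends_on = {
    "Extract": [],
    "Transform": ["Extract"],
    "Load": ["Transform"],
    "Report": ["Load"],
    "Notify": ["Report"],
}


def execution_order(graph: dict[str, list[str]]) -> list[str]:
    """Return stages in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(depends_on)
print(" -> ".join(order))
```

For this linear chain there is only one valid order; with branching stages, several orders can be valid and independent stages could run in parallel.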
Monitoring and Alerts
# === Job Monitoring ===
from dataclasses import dataclass


@dataclass
class JobMetric:
    metric: str
    source: str
    alert: str
    action: str


metrics = [
    JobMetric("kube_job_status_failed", "kube-state-metrics",
              "Job failed after all retries",
              "Check logs, fix code, re-trigger"),
    # kube-state-metrics exposes start and completion timestamps;
    # job duration is derived from their difference.
    JobMetric("kube_job_status_completion_time - kube_job_status_start_time",
              "kube-state-metrics",
              "Duration > 2x average",
              "Check for data skew, resource limits"),
    JobMetric("kube_job_status_active", "kube-state-metrics",
              "Job active > deadline",
              "Kill job, investigate hang"),
    JobMetric("container_memory_usage_bytes", "cAdvisor",
              "Memory > 80% of limit",
              "Increase limit or optimize code"),
    JobMetric("kube_cronjob_next_schedule_time", "kube-state-metrics",
              "Missed scheduled run",
              "Check CronJob suspend, cluster health"),
]

print("=== Job Metrics ===")
for m in metrics:
    print(f"  [{m.metric}] Source: {m.source}")
    print(f"    Alert: {m.alert}")
    print(f"    Action: {m.action}")

# Best practices
practices = {
    "Idempotent Jobs": "Design Jobs so they can be re-run without side effects",
    "Resource Limits": "Set requests and limits on every Job to prevent noisy neighbors",
    "Deadline": "Set activeDeadlineSeconds to prevent hung Jobs",
    "Spot Nodes": "Use Spot/Preemptible nodes for batch workloads to cut cost by 60-90%",
    "Log Aggregation": "Ship logs to Loki/ELK instead of relying on kubectl logs",
    "Cleanup": "Set ttlSecondsAfterFinished to delete old Jobs automatically",
}

print("\n\nBest Practices:")
for k, v in practices.items():
    print(f"  [{k}]: {v}")
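Two of the alert conditions above, "Duration > 2x average" and "Memory > 80% of limit", are simple threshold checks. In practice they would normally be written as Prometheus alerting rules; the sketch below only illustrates the arithmetic in Python. The function names and sample numbers are made up for the example.

```python
from statistics import mean


def duration_alert(history_s: list[float], current_s: float,
                   factor: float = 2.0) -> bool:
    """True when the current run exceeds `factor` times the average of past runs."""
    # With no history there is no baseline, so no alert is raised.
    return bool(history_s) and current_s > factor * mean(history_s)


def memory_alert(usage_bytes: float, limit_bytes: float,
                 threshold: float = 0.8) -> bool:
    """True when memory usage exceeds `threshold` (a fraction) of the limit."""
    return limit_bytes > 0 and usage_bytes > threshold * limit_bytes


GIB = 1024 ** 3
# Past runs averaged 600 s, so a 1300 s run is above the 1200 s threshold.
print(duration_alert([600, 620, 580], 1300))
# 3.5 GiB of a 4 GiB limit is 87.5%, above the 80% threshold.
print(memory_alert(3.5 * GIB, 4 * GIB))
```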
Tips
- Dev Loop: use skaffold dev to cut the feedback loop from 5 minutes to 30 seconds
- Spot: run batch Jobs on Spot nodes to save 60-90%
- Idempotent: design Jobs so they can be re-run safely, preventing duplicate processing
- Deadline: set activeDeadlineSeconds on every Job to prevent runaway Jobs
- Cleanup: set ttlSecondsAfterFinished to remove finished Jobs and their Pods automatically
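The idempotency tip matters because a Job with backoffLimit greater than zero may re-run after a partial failure. One common way to make a load step safe to repeat is to key every row and upsert instead of appending. Below is a minimal sketch using an in-memory SQLite database; the `sales` table and its columns are hypothetical stand-ins for whatever the real etl.py writes.

```python
import sqlite3


def load_rows(conn: sqlite3.Connection, rows: list[tuple[str, float]]) -> int:
    """Upsert rows keyed by id and return the resulting row count.

    Re-running with the same rows leaves the table unchanged.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id TEXT PRIMARY KEY, amount REAL)"
    )
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO sales (id, amount) VALUES (?, ?)",
            rows,
        )
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]


conn = sqlite3.connect(":memory:")
batch = [("order-1", 10.0), ("order-2", 25.5)]
first = load_rows(conn, batch)
second = load_rows(conn, batch)  # simulated retry of the same batch
print(first, second)
```

Both calls report the same row count, so a retried Job does not duplicate data.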
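What is Skaffold?
Skaffold is a tool from Google that automates the Kubernetes dev loop by building and deploying automatically. It works with Docker, Helm, and Kustomize, and supports file sync, watch mode, debugging, and CI/CD pipelines.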





