GitHub Actions: How I Stopped Fearing Deploys

The Days I Dreaded Deploying

Before I discovered GitHub Actions, every deploy was a nightmare:

The manual deploy steps:

# 1. Build the app
npm run build

# 2. Run tests (sometimes forgotten)
npm test

# 3. Build Docker image
docker build -t myapp:latest .

# 4. Tag image
docker tag myapp:latest myregistry/myapp:v1.2.3

# 5. Push image
docker push myregistry/myapp:v1.2.3

# 6. Update Kubernetes deployment
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.2.3

# 7. Wait and pray 🙏
kubectl rollout status deployment/myapp

Problems I kept running into:

Forgotten tests: broken code gets deployed
Version mismatch between the code and the Docker image
Environment differences between local and production
Manual errors: typos and bad copy-paste
Hard rollbacks when something goes wrong
No audit trail: nobody knows who deployed what

Examples of what went wrong:

# Things that really happened 😅
docker push myregistry/myapp:v1.2.3
# Error: denied: requested access to the resource is denied

kubectl set image deployment/myapp myapp=myregistry/myapp:v1.2.4
# Wrong version! Built v1.2.3 but deployed v1.2.4

npm run build
# Forgot to switch the environment variables
# Production code ended up pointing at the development database! 😱
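The version mismatch above comes from typing the tag by hand in several commands. A minimal sketch of the idea that fixes it, assuming a hypothetical helper named image_ref (not part of any tool): derive the image reference once and reuse it everywhere. The commands are only printed here, so the sketch is safe to try.

```shell
#!/bin/sh
# Sketch: build the image reference from a single VERSION value so the
# pushed tag and the deployed tag cannot drift apart.
set -eu

# Hypothetical helper: prints "registry/app:version", rejecting odd versions.
image_ref() {
  registry="$1"; app="$2"; version="$3"
  case "$version" in
    v[0-9]*) ;;  # accept tags shaped like v1.2.3
    *) echo "invalid version: $version" >&2; return 1 ;;
  esac
  printf '%s/%s:%s\n' "$registry" "$app" "$version"
}

VERSION="${VERSION:-v1.2.3}"
IMAGE="$(image_ref myregistry myapp "$VERSION")"

# Print the commands instead of running them.
echo "docker build -t $IMAGE ."
echo "docker push $IMAGE"
echo "kubectl set image deployment/myapp myapp=$IMAGE"
```

This is the same thing a tag-triggered workflow does for you: the version comes from one place (the Git tag) rather than from memory.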

The result: slow, stressful deploys that often went wrong! 😰

Then one day I found GitHub Actions, and everything changed! 🚀

GitHub Actions Fundamentals

1. Start with a Basic Workflow

# .github/workflows/ci.yml
name: CI/CD Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    
    steps:
    - name: Checkout code
      uses: actions/checkout@v4
      
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '18'
        cache: 'npm'
        
    - name: Install dependencies
      run: npm ci
      
    - name: Run linter
      run: npm run lint
      
    - name: Run tests
      run: npm test
      
    - name: Run build
      run: npm run build

What each key means:

on: defines when the workflow runs
jobs: the jobs to run
runs-on: the OS (runner) the job runs on
steps: the steps within a job
uses: runs an Action that someone else has published
run: runs a shell command

2. Environment Variables and Secrets

# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    tags:
      - 'v*'

env:
  NODE_VERSION: '18'
  DOCKER_REGISTRY: 'ghcr.io'

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: ${{ env.NODE_VERSION }}
        
    - name: Build application
      env:
        API_URL: ${{ vars.API_URL }}
        DATABASE_URL: ${{ secrets.DATABASE_URL }}
        JWT_SECRET: ${{ secrets.JWT_SECRET }}
      run: |
        npm ci
        npm run build
        
    - name: Login to Container Registry
      uses: docker/login-action@v3
      with:
        registry: ${{ env.DOCKER_REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
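A missing variable or secret reaches a `run:` step as an empty string, which is how the "wrong database" accident from earlier can slip through. A minimal sketch of a guard for the build step, assuming a hypothetical helper named require_env; the values below are dummies for illustration.

```shell
#!/bin/sh
# Sketch: fail fast when a required variable is missing or empty.

# Hypothetical helper: returns non-zero and names the first missing variable.
require_env() {
  for name in "$@"; do
    # Indirect variable lookup that works in POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Dummy values standing in for what the workflow would inject.
API_URL="https://api.example.test"
DATABASE_URL="postgres://example"
JWT_SECRET="dummy"

if require_env API_URL DATABASE_URL JWT_SECRET; then
  echo "environment looks complete"
fi
```

In the workflow, the call would sit at the top of the build script, before `npm ci`.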

Advanced CI/CD Patterns

1. Multi-Stage Pipeline

# .github/workflows/full-pipeline.yml
name: Full CI/CD Pipeline

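on:
  push:
    branches: [ main, develop ]
    # Tags must trigger the workflow too: the build and production jobs check for refs/tags/v*
    tags: [ 'v*' ]
  pull_request:
    branches: [ main ]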

jobs:
  # Stage 1: Code Quality
  code-quality:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '18'
        cache: 'npm'
        
    - name: Install dependencies
      run: npm ci
      
    - name: ESLint
      run: npm run lint
      
    - name: Prettier
      run: npm run format:check
      
    - name: TypeScript check
      run: npm run type-check
      
    - name: Security audit
      run: npm audit --audit-level=moderate

  # Stage 2: Testing
  test:
    needs: code-quality
    runs-on: ubuntu-latest
    
    strategy:
      matrix:
        node-version: [16, 18, 20]
        
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js ${{ matrix.node-version }}
      uses: actions/setup-node@v4
      with:
        node-version: ${{ matrix.node-version }}
        cache: 'npm'
        
    - name: Install dependencies
      run: npm ci
      
    - name: Unit tests
      run: npm run test:unit
      
    - name: Integration tests
      run: npm run test:integration
      
    - name: Upload coverage
      uses: codecov/codecov-action@v3
      if: matrix.node-version == 18

  # Stage 3: Build Docker Image
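  build:
    needs: [code-quality, test]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
    # GITHUB_TOKEN needs package write access to push to ghcr.io
    permissions:
      contents: read
      packages: write
    
    outputs:
      # `tags` is a newline-separated list; `version` is a single tag value
      image-tag: ${{ steps.meta.outputs.version }}
      image-digest: ${{ steps.build.outputs.digest }}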
      
    steps:
    - uses: actions/checkout@v4
    
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v3
      
    - name: Login to Container Registry
      uses: docker/login-action@v3
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
        
    - name: Extract metadata
      id: meta
      uses: docker/metadata-action@v5
      with:
        images: ghcr.io/${{ github.repository }}
        tags: |
          type=ref,event=branch
          type=ref,event=pr
          type=semver,pattern={{version}}
          type=semver,pattern={{major}}.{{minor}}
          type=sha,prefix={{branch}}-
          
    - name: Build and push
      id: build
      uses: docker/build-push-action@v5
      with:
        context: .
        platforms: linux/amd64,linux/arm64
        push: true
        tags: ${{ steps.meta.outputs.tags }}
        labels: ${{ steps.meta.outputs.labels }}
        cache-from: type=gha
        cache-to: type=gha,mode=max
        build-args: |
          NODE_VERSION=18
          BUILD_DATE=${{ github.event.head_commit.timestamp }}
          GIT_SHA=${{ github.sha }}

  # Stage 4: Security Scanning
  security:
    needs: build
    runs-on: ubuntu-latest
    
    steps:
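    - name: Run Trivy vulnerability scanner
      uses: aquasecurity/trivy-action@master
      with:
        # Scan the exact image that was pushed, by digest
        # (assumes the repository name is lowercase, as image names require)
        image-ref: ghcr.io/${{ github.repository }}@${{ needs.build.outputs.image-digest }}
        format: 'sarif'
        output: 'trivy-results.sarif'
        
    - name: Upload Trivy scan results
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: 'trivy-results.sarif'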

  # Stage 5: Deploy to Staging
  deploy-staging:
    needs: [build, security]
    runs-on: ubuntu-latest
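    # Also run for version tags; otherwise this job is skipped and production, which needs it, never runs
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')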
    environment: staging
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Deploy to Staging
      run: |
        echo "Deploying to staging..."
        # In a real pipeline this would call ArgoCD or Helm
        
    - name: Run smoke tests
      run: |
        npm ci
        npm run test:smoke -- --env=staging

  # Stage 6: Deploy to Production
  deploy-production:
    needs: [build, security, deploy-staging]
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    environment: production
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Deploy to Production
      run: |
        echo "Deploying to production..."
        
    - name: Run health checks
      run: |
        npm ci
        npm run test:health -- --env=production
        
    - name: Notify team
      if: always()
      uses: 8398a7/action-slack@v3
      with:
        status: ${{ job.status }}
        text: |
          Deployment ${{ job.status }}
          Version: ${{ github.ref_name }}
          Commit: ${{ github.sha }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
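One detail worth knowing when passing image information between jobs: docker/metadata-action emits `tags` as a newline-separated list when more than one tag rule matches. A minimal sketch of picking a single reference out of such a list, assuming hypothetical helpers first_tag and tag_only and an illustrative ghcr.io/acme/myapp image name:

```shell
#!/bin/sh
# Sketch: pick one usable reference out of a newline-separated tag list.

# Hypothetical helper: first line of a multi-line tag list.
first_tag() {
  printf '%s\n' "$1" | sed -n '1p'
}

# Hypothetical helper: the part after the last colon. Assumes the reference
# carries a tag; a registry with a port and no tag would break this.
tag_only() {
  printf '%s\n' "${1##*:}"
}

# Example value shaped like a multi-rule `tags` output (illustrative names).
TAGS='ghcr.io/acme/myapp:main
ghcr.io/acme/myapp:main-abc1234'

IMAGE="$(first_tag "$TAGS")"
echo "image reference: $IMAGE"
echo "tag only: $(tag_only "$IMAGE")"
```

A scanner wants the full reference, while a Helm `image.tag` value usually wants only the tag part, so it helps to keep the two apart.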

Reusable Workflows and Actions

1. Reusable Workflow

# .github/workflows/reusable-deploy.yml
name: Reusable Deploy Workflow

on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
      image-tag:
        required: true
        type: string
      health-check-url:
        required: false
        type: string
        default: '/health'
    secrets:
      KUBECONFIG:
        required: true
      SLACK_WEBHOOK:
        required: false

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup kubectl
      uses: azure/setup-kubectl@v3
      
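    - name: Setup kubeconfig
      env:
        KUBECONFIG_B64: ${{ secrets.KUBECONFIG }}
      run: |
        mkdir -p ~/.kube
        # Pass the secret through env rather than interpolating it into the script
        echo "$KUBECONFIG_B64" | base64 -d > ~/.kube/config
        chmod 600 ~/.kube/config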
        
    - name: Deploy with Helm
      run: |
        helm upgrade --install myapp ./helm/myapp \
          --namespace myapp-${{ inputs.environment }} \
          --create-namespace \
          --values ./helm/values-${{ inputs.environment }}.yaml \
          --set image.tag=${{ inputs.image-tag }} \
          --wait --timeout=300s
          
    - name: Health check
      run: |
        # Wait for deployment to be ready
        kubectl wait --for=condition=available deployment/myapp \
          --namespace=myapp-${{ inputs.environment }} \
          --timeout=300s
          
        # Test health endpoint
        SERVICE_URL=$(kubectl get service myapp -o jsonpath='{.status.loadBalancer.ingress[0].ip}' -n myapp-${{ inputs.environment }})
        curl -f http://$SERVICE_URL${{ inputs.health-check-url }}
        
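    - name: Check Slack webhook
      id: slack
      if: always()
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
      run: |
        # The secrets context is not available in `if:`, so expose a flag instead
        if [ -n "$SLACK_WEBHOOK_URL" ]; then
          echo "enabled=true" >> "$GITHUB_OUTPUT"
        fi
        
    - name: Notify success
      if: success() && steps.slack.outputs.enabled == 'true'
      uses: 8398a7/action-slack@v3
      with:
        status: success
        text: |
          ✅ Successfully deployed to ${{ inputs.environment }}
          Image: ${{ inputs.image-tag }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
        
    - name: Notify failure
      if: failure() && steps.slack.outputs.enabled == 'true'
      uses: 8398a7/action-slack@v3
      with:
        status: failure
        text: |
          ❌ Failed to deploy to ${{ inputs.environment }}
          Image: ${{ inputs.image-tag }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}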

2. Using the Reusable Workflow

# .github/workflows/main-pipeline.yml
name: Main Pipeline

on:
  push:
    tags: ['v*']

jobs:
  build:
    # ... build job ...
    
  deploy-staging:
    needs: build
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: staging
      image-tag: ${{ needs.build.outputs.image-tag }}
    secrets:
      KUBECONFIG: ${{ secrets.STAGING_KUBECONFIG }}
      SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
      
  deploy-production:
    needs: [build, deploy-staging]
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: production
      image-tag: ${{ needs.build.outputs.image-tag }}
    secrets:
      KUBECONFIG: ${{ secrets.PRODUCTION_KUBECONFIG }}
      SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}

3. Custom Action

# .github/actions/deploy-notification/action.yml
name: 'Deploy Notification'
description: 'Send deployment notification with rich information'

inputs:
  environment:
    description: 'Deployment environment'
    required: true
  status:
    description: 'Deployment status'
    required: true
  image-tag:
    description: 'Docker image tag'
    required: true
  slack-webhook:
    description: 'Slack webhook URL'
    required: true

runs:
  using: 'node20'
  main: 'dist/index.js'

// .github/actions/deploy-notification/src/index.js
const core = require('@actions/core');
const axios = require('axios');

async function run() {
  try {
    const environment = core.getInput('environment');
    const status = core.getInput('status');
    const imageTag = core.getInput('image-tag');
    const slackWebhook = core.getInput('slack-webhook');
    
    const color = status === 'success' ? 'good' : 'danger';
    const emoji = status === 'success' ? '✅' : '❌';
    
    const payload = {
      attachments: [
        {
          color,
          title: `${emoji} Deployment ${status}`,
          fields: [
            {
              title: 'Environment',
              value: environment,
              short: true
            },
            {
              title: 'Image Tag',
              value: imageTag,
              short: true
            },
            {
              title: 'Repository',
              value: process.env.GITHUB_REPOSITORY,
              short: true
            },
            {
              title: 'Actor',
              value: process.env.GITHUB_ACTOR,
              short: true
            }
          ],
          footer: 'GitHub Actions',
          ts: Math.floor(Date.now() / 1000)
        }
      ]
    };
    
    await axios.post(slackWebhook, payload);
    core.info('Notification sent successfully');
    
  } catch (error) {
    core.setFailed(error.message);
  }
}

run();

Database Migrations and Deployment Strategies

1. Zero-Downtime Deployment

# .github/workflows/zero-downtime-deploy.yml
name: Zero Downtime Deployment

on:
  push:
    tags: ['v*']

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    
    steps:
    - uses: actions/checkout@v4
    
    # Pre-deployment checks
    - name: Pre-deployment health check
      run: |
        curl -f https://api.myapp.com/health
        
    # Database migrations (if needed)
    - name: Run database migrations
      run: |
        kubectl create job migration-${{ github.sha }} \
          --from=cronjob/db-migrations \
          --namespace=myapp-production
          
        kubectl wait --for=condition=complete job/migration-${{ github.sha }} \
          --namespace=myapp-production \
          --timeout=300s
          
    # Rolling deployment
    - name: Deploy new version
      run: |
        helm upgrade myapp ./helm/myapp \
          --namespace myapp-production \
          --values ./helm/values-production.yaml \
          --set image.tag=${{ github.ref_name }} \
          --wait --timeout=600s
          
    # Post-deployment verification
    - name: Verify deployment
      run: |
        # Check all pods are running
        kubectl get pods -l app=myapp -n myapp-production
        
        # Health check with retry; fail the step if every attempt fails
        HEALTHY=false
        for i in {1..30}; do
          if curl -f https://api.myapp.com/health; then
            echo "Health check passed"
            HEALTHY=true
            break
          fi
          echo "Attempt $i failed, retrying..."
          sleep 10
        done
        [ "$HEALTHY" = "true" ] || exit 1
        
    # Smoke tests
    - name: Run smoke tests
      run: |
        npm ci
        npm run test:smoke -- --env=production
        
    # Rollback on failure
    - name: Rollback on failure
      if: failure()
      run: |
        echo "Deployment failed, rolling back..."
        helm rollback myapp 0 --namespace=myapp-production
        
        # Wait for rollback to complete
        kubectl rollout status deployment/myapp \
          --namespace=myapp-production \
          --timeout=300s

2. Blue-Green Deployment

# .github/workflows/blue-green-deploy.yml
name: Blue-Green Deployment

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to deploy'
        required: true
        default: 'production'
        type: choice
        options:
        - staging
        - production

jobs:
  blue-green-deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Determine deployment slot
      id: slot
      run: |
        # Check current active slot
        CURRENT_SLOT=$(kubectl get service myapp -o jsonpath='{.spec.selector.slot}' -n myapp-${{ inputs.environment }})
        
        if [ "$CURRENT_SLOT" = "blue" ]; then
          NEW_SLOT="green"
        else
          NEW_SLOT="blue"
        fi
        
        echo "current-slot=$CURRENT_SLOT" >> $GITHUB_OUTPUT
        echo "new-slot=$NEW_SLOT" >> $GITHUB_OUTPUT
        
    - name: Deploy to inactive slot
      run: |
        helm upgrade --install myapp-${{ steps.slot.outputs.new-slot }} ./helm/myapp \
          --namespace myapp-${{ inputs.environment }} \
          --values ./helm/values-${{ inputs.environment }}.yaml \
          --set image.tag=${{ github.ref_name }} \
          --set deployment.slot=${{ steps.slot.outputs.new-slot }} \
          --wait --timeout=600s
          
    - name: Test new deployment
      run: |
        # Get new deployment service endpoint
        NEW_IP=$(kubectl get service myapp-${{ steps.slot.outputs.new-slot }} \
          -o jsonpath='{.status.loadBalancer.ingress[0].ip}' \
          -n myapp-${{ inputs.environment }})
          
        # Run tests against new deployment
        npm ci
        ENDPOINT=http://$NEW_IP npm run test:smoke
        
    - name: Switch traffic to new slot
      run: |
        # Update main service to point to new slot
        kubectl patch service myapp \
          -p '{"spec":{"selector":{"slot":"${{ steps.slot.outputs.new-slot }}"}}}' \
          -n myapp-${{ inputs.environment }}
          
        # Wait a bit for traffic switch
        sleep 30
        
    - name: Verify traffic switch
      run: |
        # Test main endpoint
        curl -f https://api-${{ inputs.environment }}.myapp.com/health
        
    - name: Cleanup old deployment
      run: |
        # Remove old deployment after successful switch
        helm uninstall myapp-${{ steps.slot.outputs.current-slot }} \
          --namespace myapp-${{ inputs.environment }} || true
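The slot decision is the heart of this workflow, and the first deploy is the edge case: the Service has no slot selector yet, so the current slot is empty. A minimal sketch of that logic on its own, assuming a hypothetical helper named next_slot:

```shell
#!/bin/sh
# Sketch: choose the inactive slot for a blue-green deploy.

# Hypothetical helper: blue -> green; anything else (green, empty, unknown) -> blue.
next_slot() {
  case "$1" in
    blue) echo green ;;
    *) echo blue ;;
  esac
}

CURRENT_SLOT=""   # simulates a first deploy with no active slot
NEW_SLOT="$(next_slot "$CURRENT_SLOT")"
echo "current=${CURRENT_SLOT:-none} new=$NEW_SLOT"

# Only clean up when there really was a previous slot.
if [ -n "$CURRENT_SLOT" ]; then
  echo "would uninstall myapp-$CURRENT_SLOT"
else
  echo "nothing to clean up"
fi
```

Guarding the cleanup on a non-empty slot avoids calling `helm uninstall` with a dangling release name on the first run.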

Monitoring and Observability Integration

1. Performance Monitoring

# .github/workflows/performance-monitoring.yml
name: Performance Monitoring

on:
  schedule:
    - cron: '*/15 * * * *'  # Every 15 minutes
  workflow_dispatch:

jobs:
  performance-check:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '18'
        
    - name: Install dependencies
      run: npm ci
      
    - name: Run performance tests
      run: |
        npm run test:performance -- --reporter=json > performance-results.json
        
    - name: Parse results
      id: results
      run: |
        RESPONSE_TIME=$(jq -r '.summary.responseTime.mean' performance-results.json)
        ERROR_RATE=$(jq -r '.summary.errorRate' performance-results.json)
        THROUGHPUT=$(jq -r '.summary.throughput' performance-results.json)
        
        echo "response-time=$RESPONSE_TIME" >> $GITHUB_OUTPUT
        echo "error-rate=$ERROR_RATE" >> $GITHUB_OUTPUT
        echo "throughput=$THROUGHPUT" >> $GITHUB_OUTPUT
        
    - name: Check performance thresholds
      run: |
        RESPONSE_TIME=${{ steps.results.outputs.response-time }}
        ERROR_RATE=${{ steps.results.outputs.error-rate }}
        
        if (( $(echo "$RESPONSE_TIME > 500" | bc -l) )); then
          echo "❌ Response time too high: ${RESPONSE_TIME}ms"
          exit 1
        fi
        
        if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
          echo "❌ Error rate too high: ${ERROR_RATE}"
          exit 1
        fi
        
        echo "✅ Performance checks passed"
        
    - name: Send metrics to monitoring
      run: |
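        # Send to Prometheus Pushgateway
        # (a cluster-internal address like this is only reachable from a
        # self-hosted runner inside the cluster, not from a GitHub-hosted one)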
        echo "http_response_time_ms ${{ steps.results.outputs.response-time }}" | \
          curl -X POST --data-binary @- \
          http://pushgateway.monitoring.svc.cluster.local:9091/metrics/job/github-actions
          
    - name: Alert on failure
      if: failure()
      uses: 8398a7/action-slack@v3
      with:
        status: failure
        text: |
          🚨 Performance degradation detected!
          Response Time: ${{ steps.results.outputs.response-time }}ms
          Error Rate: ${{ steps.results.outputs.error-rate }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}

2. Dependency Vulnerability Scanning

# .github/workflows/security-scan.yml
name: Security Scan

on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM
  pull_request:
    branches: [main]

jobs:
  dependency-scan:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '18'
        
    - name: Install dependencies
      run: npm ci
      
    - name: Run npm audit
      run: |
        npm audit --audit-level=moderate --json > audit-results.json || true
        
    - name: Parse audit results
      id: audit
      run: |
        VULNERABILITIES=$(jq '.metadata.vulnerabilities.total // 0' audit-results.json)
        HIGH_VULNS=$(jq '.metadata.vulnerabilities.high // 0' audit-results.json)
        CRITICAL_VULNS=$(jq '.metadata.vulnerabilities.critical // 0' audit-results.json)
        
        echo "total-vulnerabilities=$VULNERABILITIES" >> $GITHUB_OUTPUT
        echo "high-vulnerabilities=$HIGH_VULNS" >> $GITHUB_OUTPUT
        echo "critical-vulnerabilities=$CRITICAL_VULNS" >> $GITHUB_OUTPUT
        
    - name: Check vulnerability thresholds
      run: |
        if [ "${{ steps.audit.outputs.critical-vulnerabilities }}" -gt 0 ]; then
          echo "❌ Critical vulnerabilities found: ${{ steps.audit.outputs.critical-vulnerabilities }}"
          exit 1
        fi
        
        if [ "${{ steps.audit.outputs.high-vulnerabilities }}" -gt 5 ]; then
          echo "⚠️  Too many high vulnerabilities: ${{ steps.audit.outputs.high-vulnerabilities }}"
          exit 1
        fi
        
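    - name: Snyk security scan
      uses: snyk/actions/node@master
      # Keep going so the SARIF upload below still runs when Snyk finds issues
      continue-on-error: true
      env:
        SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
      with:
        # upload-sarif expects a SARIF file, not Snyk's JSON output
        args: --severity-threshold=high --sarif-file-output=snyk.sarif
        
    - name: Upload security scan results
      uses: github/codeql-action/upload-sarif@v3
      with:
        sarif_file: snyk.sarif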
        
    - name: Create security issue
      if: failure()
      uses: actions/github-script@v6
      with:
        script: |
          await github.rest.issues.create({
            owner: context.repo.owner,
            repo: context.repo.repo,
            title: '🔒 Security vulnerabilities detected',
            body: `
            Security scan failed with the following issues:
            
            - Critical vulnerabilities: ${{ steps.audit.outputs.critical-vulnerabilities }}
            - High vulnerabilities: ${{ steps.audit.outputs.high-vulnerabilities }}
            - Total vulnerabilities: ${{ steps.audit.outputs.total-vulnerabilities }}
            
            Please review and fix the security issues.
            
            Triggered by: ${context.workflow} #${context.runNumber}
            `,
            labels: ['security', 'critical']
          })

Advanced Deployment Strategies

1. Canary Deployment

# .github/workflows/canary-deploy.yml
name: Canary Deployment

on:
  workflow_dispatch:
    inputs:
      canary-percentage:
        description: 'Canary traffic percentage'
        required: true
        default: '10'
        type: choice
        options: ['5', '10', '25', '50', '100']

jobs:
  canary-deploy:
    runs-on: ubuntu-latest
    environment: production
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Deploy canary version
      run: |
        # Deploy canary deployment
        helm upgrade --install myapp-canary ./helm/myapp \
          --namespace myapp-production \
          --values ./helm/values-production.yaml \
          --set image.tag=${{ github.ref_name }} \
          --set deployment.type=canary \
          --set replicaCount=2 \
          --wait --timeout=300s
          
    - name: Configure traffic split
      env:
        CANARY_WEIGHT: ${{ inputs.canary-percentage }}
      run: |
        # Workflow expressions cannot do arithmetic, so compute the stable weight in the shell
        STABLE_WEIGHT=$((100 - CANARY_WEIGHT))
        
        # Update Istio VirtualService for traffic splitting
        kubectl apply -f - <<EOF
        apiVersion: networking.istio.io/v1beta1
        kind: VirtualService
        metadata:
          name: myapp-traffic-split
          namespace: myapp-production
        spec:
          hosts:
          - myapp.example.com
          http:
          - match:
            - headers:
                canary:
                  exact: "true"
            route:
            - destination:
                host: myapp-canary
              weight: 100
          - route:
            - destination:
                host: myapp
              weight: ${STABLE_WEIGHT}
            - destination:
                host: myapp-canary
              weight: ${CANARY_WEIGHT}
        EOF
        
    - name: Monitor canary metrics
      run: |
        # Monitor for 10 minutes
        for i in {1..60}; do
          # Check error rate
          ERROR_RATE=$(curl -s "http://prometheus:9090/api/v1/query?query=sum(rate(http_requests_total{service=\"myapp-canary\",status=~\"5..\"}[5m]))/sum(rate(http_requests_total{service=\"myapp-canary\"}[5m]))" | jq -r '.data.result[0].value[1] // 0')
          
          # Check response time
          RESPONSE_TIME=$(curl -s "http://prometheus:9090/api/v1/query?query=histogram_quantile(0.95,sum(rate(http_request_duration_seconds_bucket{service=\"myapp-canary\"}[5m]))by(le))" | jq -r '.data.result[0].value[1] // 0')
          
          echo "Check $i/60 - Error Rate: $ERROR_RATE, Response Time: ${RESPONSE_TIME}s"
          
          # Check thresholds
          if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
            echo "❌ Canary error rate too high: $ERROR_RATE"
            exit 1
          fi
          
          if (( $(echo "$RESPONSE_TIME > 0.5" | bc -l) )); then
            echo "❌ Canary response time too high: ${RESPONSE_TIME}s"
            exit 1
          fi
          
          sleep 10
        done
        
        echo "✅ Canary monitoring completed successfully"
        
    - name: Promote canary on success
      if: inputs.canary-percentage == '100'
      run: |
        # Replace main deployment with canary
        kubectl patch deployment myapp \
          --patch '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","image":"myapp:${{ github.ref_name }}"}]}}}}' \
          --namespace myapp-production
          
        # Wait for rollout
        kubectl rollout status deployment/myapp --namespace myapp-production
        
        # Remove canary deployment
        helm uninstall myapp-canary --namespace myapp-production
        
    - name: Rollback canary on failure
      if: failure()
      run: |
        echo "❌ Canary deployment failed, rolling back..."
        
        # Remove traffic from canary
        kubectl apply -f - <<EOF
        apiVersion: networking.istio.io/v1beta1
        kind: VirtualService
        metadata:
          name: myapp-traffic-split
          namespace: myapp-production
        spec:
          hosts:
          - myapp.example.com
          http:
          - route:
            - destination:
                host: myapp
              weight: 100
        EOF
        
        # Remove canary deployment
        helm uninstall myapp-canary --namespace myapp-production || true

2. Feature Flag Integration

# .github/workflows/feature-flag-deploy.yml
name: Feature Flag Deployment

on:
  push:
    branches: [main]

jobs:
  deploy-with-flags:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Extract feature flags from commit
      id: flags
      env:
        # Pass untrusted text through env; inlining it in the script allows shell injection
        COMMIT_MSG: ${{ github.event.head_commit.message }}
      run: |
        # Parse commit message for feature flags
        # Extract flags like [feature:new-ui=true]
        # (`|| true` keeps the step alive under pipefail when no flag is present)
        FLAGS=$(echo "$COMMIT_MSG" | { grep -o '\[feature:[^]]*\]' || true; } | sed 's/\[feature://g' | sed 's/\]//g' | tr '\n' ',' | sed 's/,$//')
        
        echo "flags=$FLAGS" >> "$GITHUB_OUTPUT"
        echo "Found feature flags: $FLAGS"
        
    - name: Deploy with feature flags
      env:
        FLAGS: ${{ steps.flags.outputs.flags }}
      run: |
        # Deploy with feature flags as environment variables
        IFS=',' read -ra FLAG_ARRAY <<< "$FLAGS"
        
        FEATURE_FLAGS=""
        for flag in "${FLAG_ARRAY[@]}"; do
          # new-ui -> NEW_UI (hyphens are not valid in environment variable names)
          FLAG_NAME=$(echo "$flag" | cut -d'=' -f1 | tr '[:lower:]' '[:upper:]' | tr '-' '_')
          FLAG_VALUE=$(echo "$flag" | cut -d'=' -f2)
          FEATURE_FLAGS="$FEATURE_FLAGS --set env.FEATURE_$FLAG_NAME=$FLAG_VALUE"
        done
        
        helm upgrade --install myapp ./helm/myapp \
          --namespace myapp-staging \
          --values ./helm/values-staging.yaml \
          --set image.tag=${{ github.sha }} \
          $FEATURE_FLAGS \
          --wait --timeout=300s
          
    - name: Run feature flag tests
      run: |
        # Run tests that verify feature flags work correctly
        npm ci
        npm run test:features -- --env=staging
        
    - name: Update feature flag service
      run: |
        # Update LaunchDarkly or similar service
        curl -X PATCH "https://app.launchdarkly.com/api/v2/flags/default/new-ui" \
          -H "Authorization: api-${{ secrets.LAUNCHDARKLY_TOKEN }}" \
          -H "Content-Type: application/json" \
          -d '{
            "patch": [
              {
                "op": "replace",
                "path": "/environments/staging/on",
                "value": true
              }
            ]
          }'
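The commit-message convention in this workflow (`[feature:new-ui=true]`) is easy to test locally before trusting it in a pipeline. A minimal sketch of the parsing, assuming hypothetical helpers extract_flags and flag_env_name; it relies on `grep -o`, which GNU and BSD grep both provide.

```shell
#!/bin/sh
# Sketch: parse [feature:name=value] markers out of a commit message.

# Hypothetical helper: prints "name=value,name=value" (empty if none found).
extract_flags() {
  printf '%s\n' "$1" \
    | grep -o '\[feature:[^]]*\]' \
    | sed -e 's/^\[feature://' -e 's/\]$//' \
    | tr '\n' ',' \
    | sed 's/,$//'
}

# Hypothetical helper: new-ui -> FEATURE_NEW_UI
# (hyphens are not valid in environment variable names).
flag_env_name() {
  printf 'FEATURE_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')"
}

MSG='Add new UI [feature:new-ui=true] and [feature:dark-mode=false]'
FLAGS="$(extract_flags "$MSG")"
echo "flags: $FLAGS"
echo "env name: $(flag_env_name new-ui)"
```

A message with no markers yields an empty string, so the deploy step should treat "no flags" as a normal case rather than an error.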

Rollback and Recovery

1. Automated Rollback

# .github/workflows/automated-rollback.yml
name: Automated Rollback

on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to rollback'
        required: true
        type: choice
        options: ['staging', 'production']
      revision:
        description: 'Revision to rollback to (0 for previous)'
        required: false
        default: '0'

jobs:
  rollback:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Setup kubectl and Helm
      uses: azure/setup-kubectl@v3
      
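    - name: Configure kubeconfig
      env:
        KUBECONFIG_B64: ${{ secrets.KUBECONFIG }}
      run: |
        mkdir -p ~/.kube
        # Pass the secret through env rather than interpolating it into the script
        echo "$KUBECONFIG_B64" | base64 -d > ~/.kube/config
        chmod 600 ~/.kube/config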
        
    - name: Get current deployment info
      id: current
      run: |
        CURRENT_REVISION=$(helm history myapp -n myapp-${{ inputs.environment }} --max 1 --output json | jq -r '.[0].revision')
        CURRENT_IMAGE=$(kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}' -n myapp-${{ inputs.environment }})
        
        echo "current-revision=$CURRENT_REVISION" >> $GITHUB_OUTPUT
        echo "current-image=$CURRENT_IMAGE" >> $GITHUB_OUTPUT
        
    - name: Perform rollback
      run: |
        REVISION="${{ inputs.revision }}"
        if [ "$REVISION" = "0" ]; then
          # Rollback to previous revision
          REVISION=$((${{ steps.current.outputs.current-revision }} - 1))
        fi
        
        echo "Rolling back to revision $REVISION"
        
        helm rollback myapp $REVISION --namespace myapp-${{ inputs.environment }}
        
    - name: Wait for rollback completion
      run: |
        kubectl rollout status deployment/myapp \
          --namespace myapp-${{ inputs.environment }} \
          --timeout=300s
          
    - name: Verify rollback
      run: |
        # Health check
        kubectl wait --for=condition=available deployment/myapp \
          --namespace myapp-${{ inputs.environment }} \
          --timeout=120s
          
        # Get new image
        NEW_IMAGE=$(kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}' -n myapp-${{ inputs.environment }})
        
        echo "Rollback completed:"
        echo "  Previous image: ${{ steps.current.outputs.current-image }}"
        echo "  Current image: $NEW_IMAGE"
        
    - name: Run smoke tests
      run: |
        npm ci
        npm run test:smoke -- --env=${{ inputs.environment }}
        
    - name: Notify team
      # Without always(), a failed rollback would never be reported
      if: always()
      uses: 8398a7/action-slack@v3
      with:
        status: ${{ job.status }}
        text: |
          🔄 Rollback ${{ job.status }} for ${{ inputs.environment }}
          From revision: ${{ steps.current.outputs.current-revision }}
          To revision: ${{ inputs.revision }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
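The "0 means previous" convention in this workflow hides one edge case: a release that is still on revision 1 has nothing to roll back to. A minimal sketch of resolving the target revision, assuming a hypothetical helper named target_revision:

```shell
#!/bin/sh
# Sketch: resolve which Helm revision to roll back to.

# Hypothetical helper: target_revision CURRENT [REQUESTED]
# REQUESTED of 0 (or omitted) means "the revision before CURRENT".
target_revision() {
  current="$1"; requested="${2:-0}"
  if [ "$requested" -gt 0 ]; then
    echo "$requested"
    return 0
  fi
  if [ "$current" -le 1 ]; then
    echo "no earlier revision to roll back to" >&2
    return 1
  fi
  echo "$((current - 1))"
}

echo "current 7, requested 0 -> $(target_revision 7 0)"
echo "current 7, requested 4 -> $(target_revision 7 4)"
```

Reporting the resolved number, rather than the raw input, also makes the Slack message more useful than "To revision: 0".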

2. Health Check and Auto-Recovery

# .github/workflows/health-monitor.yml
name: Health Monitor

on:
  schedule:
    - cron: '*/5 * * * *'  # Every 5 minutes
  workflow_dispatch:

jobs:
  health-check:
    runs-on: ubuntu-latest
    
    strategy:
      matrix:
        environment: [staging, production]
        
    steps:
    - name: Health check
      id: health
      run: |
        HEALTH_URL="https://api-${{ matrix.environment }}.myapp.com/health"
        
        # Try 3 times
        for i in {1..3}; do
          if curl -f --max-time 10 "$HEALTH_URL"; then
            echo "status=healthy" >> $GITHUB_OUTPUT
            exit 0
          fi
          echo "Attempt $i failed"
          sleep 10
        done
        
        echo "status=unhealthy" >> $GITHUB_OUTPUT
        
    - name: Check deployment status
      if: steps.health.outputs.status == 'unhealthy'
      id: deployment
      run: |
        # Count pods in the Running phase. Running is not the same as Ready,
        # so treat this as a rough signal only.
        # Assumes kubectl is already authenticated against the cluster.
        READY_PODS=$(kubectl get pods -l app=myapp -n myapp-${{ matrix.environment }} --field-selector=status.phase=Running --no-headers | wc -l)
        TOTAL_PODS=$(kubectl get pods -l app=myapp -n myapp-${{ matrix.environment }} --no-headers | wc -l)
        
        echo "ready-pods=$READY_PODS" >> $GITHUB_OUTPUT
        echo "total-pods=$TOTAL_PODS" >> $GITHUB_OUTPUT
        
    - name: Auto-restart if needed
      if: steps.health.outputs.status == 'unhealthy' && steps.deployment.outputs.ready-pods == '0'
      run: |
        echo "No healthy pods found, restarting deployment..."
        kubectl rollout restart deployment/myapp -n myapp-${{ matrix.environment }}
        kubectl rollout status deployment/myapp -n myapp-${{ matrix.environment }} --timeout=300s
        
    - name: Auto-rollback if restart fails
      if: failure() && matrix.environment == 'production'
      run: |
        echo "Restart failed, performing automatic rollback..."
        helm rollback myapp 0 --namespace myapp-production
        kubectl rollout status deployment/myapp --namespace myapp-production --timeout=300s
        
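    - name: Alert on persistent failure
      # Also fires when the endpoint is down while pods are still running,
      # a case that none of the recovery steps above handles
      if: failure() || (steps.health.outputs.status == 'unhealthy' && steps.deployment.outputs.ready-pods != '0')
      uses: 8398a7/action-slack@v3
      with:
        status: failure
        text: |
          🚨 CRITICAL: ${{ matrix.environment }} is DOWN!
          
          Health check failed and automatic recovery did not restore the service.
          Manual intervention required.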
          
          Ready pods: ${{ steps.deployment.outputs.ready-pods }}/${{ steps.deployment.outputs.total-pods }}
      env:
        SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
        
    - name: Create incident issue
      if: failure() && matrix.environment == 'production'
      uses: actions/github-script@v6
      with:
        script: |
          await github.rest.issues.create({
            owner: context.repo.owner,
            repo: context.repo.repo,
            title: '🚨 PRODUCTION DOWN - Auto-recovery failed',
            body: `
            ## Incident Report
            
            **Environment:** ${{ matrix.environment }}  
            **Time:** ${new Date().toISOString()}  
            **Status:** Health check failed, auto-recovery unsuccessful
            
            ### Details
            - Ready pods: ${{ steps.deployment.outputs.ready-pods }}/${{ steps.deployment.outputs.total-pods }}
            - Health endpoint: https://api-${{ matrix.environment }}.myapp.com/health
            
            ### Actions Taken
            - [x] Health check (failed)
            - [x] Deployment restart attempt (failed)
            - [x] Auto-rollback attempt (failed)
            
            ### Next Steps
            - [ ] Manual investigation required
            - [ ] Check infrastructure status
            - [ ] Review recent deployments
            
            **Assignees:** @oncall-team
            `,
            labels: ['incident', 'critical', 'production'],
            assignees: ['oncall-engineer']
          })
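
The "Try 3 times" loop in the Health check step is a pattern I reuse a lot, so it is worth pulling out. Below is a minimal sketch, assuming a hypothetical retry helper (the name and its arguments are my own, not something GitHub Actions provides). It runs any command up to a given number of attempts and sleeps between them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    echo "Attempt $i failed" >&2
    # No point sleeping after the final attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# In the workflow this would wrap the curl call, for example:
# retry 3 10 curl -f --max-time 10 "$HEALTH_URL"
# Here `true` stands in for the health check so the sketch runs anywhere.
if retry 3 0 true; then
  echo "status=healthy"
else
  echo "status=unhealthy"
fi
```

Unlike the inline loop, this version skips the sleep after the last failed attempt, which saves a few seconds on every unhealthy run.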

Real Case: From Manual Hell to Automated Heaven

Before GitHub Actions

Every deploy looked like this:

# Steps I had to remember and run by hand (30-45 minutes)
1. git pull origin main
2. npm install
3. npm run test                    # sometimes forgotten
4. npm run build
5. docker build -t registry/myapp:v1.2.3 .
6. docker push registry/myapp:v1.2.3
7. kubectl set image deployment/myapp myapp=registry/myapp:v1.2.3
8. kubectl rollout status deployment/myapp
9. curl https://api.myapp.com/health  # manual check
10. Notify the team in Slack (sometimes forgotten)

Problems I actually ran into:

Deploying to the wrong environment (staging code sent to production)
Version mismatch (Docker tag not matching the code)
Forgetting to run tests and deploying broken code
Manual errors: typos and bad copy-paste
No audit trail: no record of who deployed what
Hard rollbacks when something went wrong

After GitHub Actions

The new deploy:

# For staging
git push origin main

# For production
git tag v1.2.3
git push origin v1.2.3

# That's it! 🎉
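
The rule behind those two commands is simple: a push to main goes to staging, and a version tag goes to production. Here is a minimal sketch of that mapping, assuming a hypothetical helper named target_environment (my own name). It takes a ref in the format GitHub exposes as GITHUB_REF. The "none" result for every other ref is an assumption of this sketch, not something the pipeline above defines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a Git ref (GITHUB_REF format) to the
# environment the pipeline would deploy to.
target_environment() {
  case "$1" in
    refs/heads/main) echo "staging" ;;
    refs/tags/v*)    echo "production" ;;
    *)               echo "none" ;;   # assumption: other refs do not deploy
  esac
}

target_environment "refs/heads/main"       # prints: staging
target_environment "refs/tags/v1.2.3"      # prints: production
target_environment "refs/heads/feature-x"  # prints: none
```

In a real workflow the same decision is usually expressed with on.push filters or an if: condition on the deploy job, but the mapping is identical.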

What happens automatically:

✅ Code quality checks (ESLint, Prettier, TypeScript)
✅ Security scans (npm audit, Snyk, Trivy)
✅ Automated tests (Unit, Integration, E2E)
✅ Multi-platform builds (AMD64, ARM64)
✅ Zero-downtime deployment (Rolling update)
✅ Health checks and smoke tests
✅ Automatic notifications (Slack, Teams)
✅ Monitoring integration (Prometheus metrics)
✅ Auto-rollback when something goes wrong

Results:

Metric	Before	After	Improvement
Deploy Time	30-45 min	5-8 min	80% faster
Error Rate	25%	<2%	92% reduction
Rollback Time	20-30 min	2-3 min	90% faster
Manual Steps	10+	0	100% automated
Audit Trail	None	Complete	Full traceability
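
The Improvement column is plain percent-reduction arithmetic: (before - after) / before. Here is a minimal sketch, assuming a hypothetical percent_reduction helper and, for the ranges, the midpoint of each range converted to seconds (both are my choices for illustration). With midpoints, the deploy-time figure lands slightly above the table's rounded 80%.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent_reduction <before> <after>
# Integer arithmetic only, so the result is truncated toward zero.
percent_reduction() {
  local before="$1" after="$2"
  echo $(( (before - after) * 100 / before ))
}

# Error rate: 25% before, with 2% taken as the upper bound of "<2%"
percent_reduction 25 2        # prints 92
# Rollback time in seconds, range midpoints (25 min vs 2.5 min)
percent_reduction 1500 150    # prints 90
# Deploy time in seconds, range midpoints (37.5 min vs 6.5 min)
percent_reduction 2250 390    # prints 82
```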

Team Productivity Impact:

Deployment frequency: from 1-2 times per week to 5-10 times per day
Lead time: idea to production dropped from 2 weeks to 2 days
MTTR (Mean Time To Recovery): from 2 hours to 10 minutes
Developer confidence: from afraid to deploy to able to deploy anytime

Summary: How GitHub Actions Changed My Life as a Developer

Before GitHub Actions:

Deploy = stress, fear, and a lot of time 😰
Manual process = prone to errors
Too nervous to deploy often
Rollback = nightmare
No visibility into deployment process

After GitHub Actions:

Deployment is a git push. Simple as that! 🚀
Zero manual errors: everything is automated
Deploy confidence: comfortable deploying every day
Instant rollback when something goes wrong
Complete visibility into every step

Benefits I actually got:

Developer productivity up 5x: time once spent on manual deploys now goes into new features
Quality Gate: code that fails lint, tests, or security scans does not reach production
Faster Feedback: problems surface as soon as they occur
Team Collaboration: anyone can deploy, the same way every time
Compliance: complete audit trail

Best practices I learned:

Everything as Code: Workflow, Configuration, Infrastructure
Fail Fast: catch problems as early as possible
Progressive Deployment: Canary, Blue-Green strategies
Monitoring Integration: connect CI/CD to monitoring
Security First: security checks at every stage

Anti-patterns to avoid:

Manual steps inside an automated pipeline
No rollback strategy
Long-running workflows that block other people
No proper secret management
Over-complicated workflows

GitHub Actions is like a personal assistant that never makes mistakes

It turned deploying from “a scary job” into “an everyday thing”

I can no longer imagine working without CI/CD!

Because it means I don't have to fear deploys anymore! 🎯✨