The Days I Feared Deploying
Before I discovered GitHub Actions, deploying was a nightmare.
The manual deploy procedure:
# 1. Build the app
npm run build
# 2. Run tests (sometimes forgotten)
npm test
# 3. Build Docker image
docker build -t myapp:latest .
# 4. Tag image
docker tag myapp:latest myregistry/myapp:v1.2.3
# 5. Push image
docker push myregistry/myapp:v1.2.3
# 6. Update Kubernetes deployment
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.2.3
# 7. Wait and pray 🙏
kubectl rollout status deployment/myapp
Common problems:
- Forgetting to run the tests and deploying broken code
- Version mismatch between the code and the Docker image
- Environment differences between local and production
- Manual errors: typos and bad copy-paste
- Rollbacks are hard when something goes wrong
- No audit trail: nobody knows who deployed what
Examples of mistakes:
# Real ones I ran into 😅
docker push myregistry/myapp:v1.2.3
# Error: denied: requested access to the resource is denied
kubectl set image deployment/myapp myapp=myregistry/myapp:v1.2.4
# Wrong version! Built v1.2.3 but deployed v1.2.4
npm run build
# Forgot to switch the environment variables
# Production code deployed against the development database! 😱
The result: deploys were slow, stressful, and frequently went wrong! 😰
Then one day I found GitHub Actions, and everything changed! 🚀
GitHub Actions Fundamentals
1. Starting with a Basic Workflow
# .github/workflows/ci.yml
name: CI/CD Pipeline
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run linter
        run: npm run lint
      - name: Run tests
        run: npm test
      - name: Run build
        run: npm run build
Explanation:
- on: when the workflow runs
- jobs: the work to be done
- runs-on: the OS (runner) a job runs on
- steps: the steps that make up a job
- uses: run an Action that someone else has published
- run: run a shell command
2. Environment Variables and Secrets
# .github/workflows/deploy.yml
name: Deploy to Production
on:
  push:
    tags:
      - 'v*'
env:
  NODE_VERSION: '18'
  DOCKER_REGISTRY: 'ghcr.io'
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: ${{ env.NODE_VERSION }}
      - name: Build application
        env:
          API_URL: ${{ vars.API_URL }}
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
          JWT_SECRET: ${{ secrets.JWT_SECRET }}
        run: |
          npm ci
          npm run build
      - name: Login to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.DOCKER_REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
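A variable or secret that is not defined simply expands to an empty string, so a build can run with missing configuration and only fail later. A minimal sketch of a guard for the build step follows; `require_env` is a hypothetical helper, not part of the original workflow, and it assumes the variable names shown under `env:` above.

```shell
# Hypothetical helper: fail if any named environment variable is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (names are trusted input here)
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Inside the `run:` block it could be called as `require_env API_URL DATABASE_URL JWT_SECRET` before `npm run build`, so the step stops before building with incomplete configuration.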
Advanced CI/CD Patterns
1. Multi-Stage Pipeline
# .github/workflows/full-pipeline.yml
name: Full CI/CD Pipeline
on:
  push:
    branches: [ main, develop ]
    # Tag pushes must trigger the workflow too, or the tag-gated jobs below never run
    tags: [ 'v*' ]
  pull_request:
    branches: [ main ]
jobs:
  # Stage 1: Code Quality
  code-quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: ESLint
        run: npm run lint
      - name: Prettier
        run: npm run format:check
      - name: TypeScript check
        run: npm run type-check
      - name: Security audit
        run: npm audit --audit-level=moderate
  # Stage 2: Testing
  test:
    needs: code-quality
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [16, 18, 20]
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Unit tests
        run: npm run test:unit
      - name: Integration tests
        run: npm run test:integration
      - name: Upload coverage
        uses: codecov/codecov-action@v3
        if: matrix.node-version == 18
  # Stage 3: Build Docker Image
  build:
    needs: [code-quality, test]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
    # GITHUB_TOKEN needs package write access to push to ghcr.io
    permissions:
      contents: read
      packages: write
    outputs:
      # `tags` is a newline-separated list, so expose one reference built from
      # `version` instead (assumes the repository name is already lowercase)
      image-tag: ghcr.io/${{ github.repository }}:${{ steps.meta.outputs.version }}
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix={{branch}}-
      - name: Build and push
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            NODE_VERSION=18
            BUILD_DATE=${{ github.event.head_commit.timestamp }}
            GIT_SHA=${{ github.sha }}
  # Stage 4: Security Scanning
  security:
    needs: build
    runs-on: ubuntu-latest
    # upload-sarif needs permission to write code scanning results
    permissions:
      contents: read
      security-events: write
    steps:
      - name: Run Trivy vulnerability scanner
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ needs.build.outputs.image-tag }}
          format: 'sarif'
          output: 'trivy-results.sarif'
      - name: Upload Trivy scan results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'
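  # Stage 5: Deploy to Staging
  deploy-staging:
    needs: [build, security]
    runs-on: ubuntu-latest
    # Also runs for release tags: deploy-production needs this job,
    # and a skipped dependency would cause it to be skipped as well
    if: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
    environment: staging
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to Staging
        run: |
          echo "Deploying to staging..."
          # In a real setup this step would use ArgoCD or Helm
      - name: Run smoke tests
        run: |
          npm ci
          npm run test:smoke -- --env=staging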
  # Stage 6: Deploy to Production
  deploy-production:
    needs: [build, security, deploy-staging]
    runs-on: ubuntu-latest
    if: startsWith(github.ref, 'refs/tags/v')
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Deploy to Production
        run: |
          echo "Deploying to production..."
      - name: Run health checks
        run: |
          npm ci
          npm run test:health -- --env=production
      - name: Notify team
        if: always()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: |
            Deployment ${{ job.status }}
            Version: ${{ github.ref_name }}
            Commit: ${{ github.sha }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
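The `if:` on the build stage decides which refs ever produce an image, and it is easy to misread. As a rough sketch, the same rule can be written as a shell function and tried against a few refs locally; `should_build` is a hypothetical name, and the function only mirrors the condition, it is not how GitHub evaluates it.

```shell
# Mirrors: github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/v')
should_build() {
  case "$1" in
    refs/heads/main|refs/tags/v*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `should_build refs/tags/v1.2.3` succeeds, while `should_build refs/heads/develop` and pull request refs such as `refs/pull/7/merge` fail, which is why those runs stop after the test stage.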
Reusable Workflows and Actions
1. Reusable Workflow
# .github/workflows/reusable-deploy.yml
name: Reusable Deploy Workflow
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string
      image-tag:
        required: true
        type: string
      health-check-url:
        required: false
        type: string
        default: '/health'
    secrets:
      KUBECONFIG:
        required: true
      SLACK_WEBHOOK:
        required: false
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Setup kubectl
        uses: azure/setup-kubectl@v3
      - name: Setup kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > ~/.kube/config
      - name: Deploy with Helm
        run: |
          helm upgrade --install myapp ./helm/myapp \
            --namespace myapp-${{ inputs.environment }} \
            --create-namespace \
            --values ./helm/values-${{ inputs.environment }}.yaml \
            --set image.tag=${{ inputs.image-tag }} \
            --wait --timeout=300s
      - name: Health check
        run: |
          # Wait for deployment to be ready
          kubectl wait --for=condition=available deployment/myapp \
            --namespace=myapp-${{ inputs.environment }} \
            --timeout=300s
          # Test health endpoint
          SERVICE_URL=$(kubectl get service myapp -o jsonpath='{.status.loadBalancer.ingress[0].ip}' -n myapp-${{ inputs.environment }})
          curl -f http://$SERVICE_URL${{ inputs.health-check-url }}
      - name: Notify success
        # The secrets context is not available in `if:`, so the condition
        # tests the env variable that carries the webhook instead
        if: success() && env.SLACK_WEBHOOK_URL != ''
        uses: 8398a7/action-slack@v3
        with:
          status: success
          text: |
            ✅ Successfully deployed to ${{ inputs.environment }}
            Image: ${{ inputs.image-tag }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
      - name: Notify failure
        if: failure() && env.SLACK_WEBHOOK_URL != ''
        uses: 8398a7/action-slack@v3
        with:
          status: failure
          text: |
            ❌ Failed to deploy to ${{ inputs.environment }}
            Image: ${{ inputs.image-tag }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
2. Using the Reusable Workflow
# .github/workflows/main-pipeline.yml
name: Main Pipeline
on:
  push:
    tags: ['v*']
jobs:
  build:
    # ... build job ...
  deploy-staging:
    needs: build
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: staging
      image-tag: ${{ needs.build.outputs.image-tag }}
    secrets:
      KUBECONFIG: ${{ secrets.STAGING_KUBECONFIG }}
      SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
  deploy-production:
    needs: [build, deploy-staging]
    uses: ./.github/workflows/reusable-deploy.yml
    with:
      environment: production
      image-tag: ${{ needs.build.outputs.image-tag }}
    secrets:
      KUBECONFIG: ${{ secrets.PRODUCTION_KUBECONFIG }}
      SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
3. Custom Action
# .github/actions/deploy-notification/action.yml
name: 'Deploy Notification'
description: 'Send deployment notification with rich information'
inputs:
  environment:
    description: 'Deployment environment'
    required: true
  status:
    description: 'Deployment status'
    required: true
  image-tag:
    description: 'Docker image tag'
    required: true
  slack-webhook:
    description: 'Slack webhook URL'
    required: true
runs:
  using: 'node20'
  main: 'dist/index.js'
// .github/actions/deploy-notification/src/index.js
const core = require('@actions/core');
const axios = require('axios');

async function run() {
  try {
    const environment = core.getInput('environment');
    const status = core.getInput('status');
    const imageTag = core.getInput('image-tag');
    const slackWebhook = core.getInput('slack-webhook');
    const color = status === 'success' ? 'good' : 'danger';
    const emoji = status === 'success' ? '✅' : '❌';
    const payload = {
      attachments: [
        {
          color,
          title: `${emoji} Deployment ${status}`,
          fields: [
            { title: 'Environment', value: environment, short: true },
            { title: 'Image Tag', value: imageTag, short: true },
            { title: 'Repository', value: process.env.GITHUB_REPOSITORY, short: true },
            { title: 'Actor', value: process.env.GITHUB_ACTOR, short: true }
          ],
          footer: 'GitHub Actions',
          ts: Math.floor(Date.now() / 1000)
        }
      ]
    };
    await axios.post(slackWebhook, payload);
    core.info('Notification sent successfully');
  } catch (error) {
    core.setFailed(error.message);
  }
}

run();
Database Migrations and Deployment Strategies
1. Zero-Downtime Deployment
# .github/workflows/zero-downtime-deploy.yml
name: Zero Downtime Deployment
on:
  push:
    tags: ['v*']
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      # Pre-deployment checks
      - name: Pre-deployment health check
        run: |
          curl -f https://api.myapp.com/health
      # Database migrations (if needed)
      - name: Run database migrations
        run: |
          kubectl create job migration-${{ github.sha }} \
            --from=cronjob/db-migrations \
            --namespace=myapp-production
          kubectl wait --for=condition=complete job/migration-${{ github.sha }} \
            --namespace=myapp-production \
            --timeout=300s
      # Rolling deployment
      - name: Deploy new version
        run: |
          helm upgrade myapp ./helm/myapp \
            --namespace myapp-production \
            --values ./helm/values-production.yaml \
            --set image.tag=${{ github.ref_name }} \
            --wait --timeout=600s
      # Post-deployment verification
      - name: Verify deployment
        run: |
          # Check all pods are running
          kubectl get pods -l app=myapp -n myapp-production
          # Health check with retry
          for i in {1..30}; do
            if curl -f https://api.myapp.com/health; then
              echo "Health check passed"
              exit 0
            fi
            echo "Attempt $i failed, retrying..."
            sleep 10
          done
          # Fail the step once every attempt has failed, so the rollback step runs
          echo "Health check never passed"
          exit 1
      # Smoke tests
      - name: Run smoke tests
        run: |
          npm ci
          npm run test:smoke -- --env=production
      # Rollback on failure
      - name: Rollback on failure
        if: failure()
        run: |
          echo "Deployment failed, rolling back..."
          # Revision 0 tells Helm to roll back to the previous release
          helm rollback myapp 0 --namespace=myapp-production
          # Wait for rollback to complete
          kubectl rollout status deployment/myapp \
            --namespace=myapp-production \
            --timeout=300s
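A retry loop only protects the deployment if it reports failure once the attempts run out; otherwise the `if: failure()` rollback never triggers. A minimal sketch of a reusable retry helper follows. The name `retry` and its argument order are assumptions made for illustration, not something the workflow above defines.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between tries.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $n of $attempts failed" >&2
    # No need to wait after the final attempt
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}
```

In the verification step this could be used as `retry 30 10 curl -fsS https://api.myapp.com/health`, and the step fails on its own when the endpoint never recovers.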
2. Blue-Green Deployment
# .github/workflows/blue-green-deploy.yml
name: Blue-Green Deployment
on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to deploy'
        required: true
        default: 'production'
        type: choice
        options:
          - staging
          - production
jobs:
  blue-green-deploy:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Determine deployment slot
        id: slot
        run: |
          # Check current active slot
          CURRENT_SLOT=$(kubectl get service myapp -o jsonpath='{.spec.selector.slot}' -n myapp-${{ inputs.environment }})
          if [ "$CURRENT_SLOT" = "blue" ]; then
            NEW_SLOT="green"
          else
            NEW_SLOT="blue"
          fi
          echo "current-slot=$CURRENT_SLOT" >> $GITHUB_OUTPUT
          echo "new-slot=$NEW_SLOT" >> $GITHUB_OUTPUT
      - name: Deploy to inactive slot
        run: |
          helm upgrade --install myapp-${{ steps.slot.outputs.new-slot }} ./helm/myapp \
            --namespace myapp-${{ inputs.environment }} \
            --values ./helm/values-${{ inputs.environment }}.yaml \
            --set image.tag=${{ github.ref_name }} \
            --set deployment.slot=${{ steps.slot.outputs.new-slot }} \
            --wait --timeout=600s
      - name: Test new deployment
        run: |
          # Get new deployment service endpoint
          NEW_IP=$(kubectl get service myapp-${{ steps.slot.outputs.new-slot }} \
            -o jsonpath='{.status.loadBalancer.ingress[0].ip}' \
            -n myapp-${{ inputs.environment }})
          # Run tests against new deployment
          npm ci
          ENDPOINT=http://$NEW_IP npm run test:smoke
      - name: Switch traffic to new slot
        run: |
          # Update main service to point to new slot
          kubectl patch service myapp \
            -p '{"spec":{"selector":{"slot":"${{ steps.slot.outputs.new-slot }}"}}}' \
            -n myapp-${{ inputs.environment }}
          # Wait a bit for traffic switch
          sleep 30
      - name: Verify traffic switch
        run: |
          # Test main endpoint
          curl -f https://api-${{ inputs.environment }}.myapp.com/health
      - name: Cleanup old deployment
        run: |
          # Remove old deployment after successful switch
          helm uninstall myapp-${{ steps.slot.outputs.current-slot }} \
            --namespace myapp-${{ inputs.environment }} || true
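The slot selection in the first step is the heart of the blue-green flow: whatever is live now, deploy to the other one. As a small sketch, the same decision can be isolated in a function and checked without a cluster; `next_slot` is a hypothetical name, and like the workflow it treats any value other than `blue` (including an empty selector on a first deploy) as a reason to pick `blue`.

```shell
# Prints the inactive slot for a given active slot.
next_slot() {
  case "$1" in
    blue) echo green ;;
    *) echo blue ;;
  esac
}
```

For instance, `next_slot blue` prints `green`, and `next_slot ""` prints `blue`.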
Monitoring and Observability Integration
1. Performance Monitoring
# .github/workflows/performance-monitoring.yml
name: Performance Monitoring
on:
  schedule:
    - cron: '*/15 * * * *' # Every 15 minutes
  workflow_dispatch:
jobs:
  performance-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm ci
      - name: Run performance tests
        run: |
          npm run test:performance -- --reporter=json > performance-results.json
      - name: Parse results
        id: results
        run: |
          RESPONSE_TIME=$(jq -r '.summary.responseTime.mean' performance-results.json)
          ERROR_RATE=$(jq -r '.summary.errorRate' performance-results.json)
          THROUGHPUT=$(jq -r '.summary.throughput' performance-results.json)
          echo "response-time=$RESPONSE_TIME" >> $GITHUB_OUTPUT
          echo "error-rate=$ERROR_RATE" >> $GITHUB_OUTPUT
          echo "throughput=$THROUGHPUT" >> $GITHUB_OUTPUT
      - name: Check performance thresholds
        run: |
          RESPONSE_TIME=${{ steps.results.outputs.response-time }}
          ERROR_RATE=${{ steps.results.outputs.error-rate }}
          if (( $(echo "$RESPONSE_TIME > 500" | bc -l) )); then
            echo "❌ Response time too high: ${RESPONSE_TIME}ms"
            exit 1
          fi
          if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
            echo "❌ Error rate too high: ${ERROR_RATE}"
            exit 1
          fi
          echo "✅ Performance checks passed"
      - name: Send metrics to monitoring
        run: |
          # Send to Prometheus Pushgateway
          echo "http_response_time_ms ${{ steps.results.outputs.response-time }}" | \
            curl -X POST --data-binary @- \
            http://pushgateway.monitoring.svc.cluster.local:9091/metrics/job/github-actions
      - name: Alert on failure
        if: failure()
        uses: 8398a7/action-slack@v3
        with:
          status: failure
          text: |
            🚨 Performance degradation detected!
            Response Time: ${{ steps.results.outputs.response-time }}ms
            Error Rate: ${{ steps.results.outputs.error-rate }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
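The threshold step compares decimal numbers, which plain `[ ... -gt ... ]` cannot do. The workflow uses `bc` for that; as an alternative sketch, the same comparison can be done with `awk`. The helper name `exceeds` is hypothetical, and the limits in the usage note are the ones from the workflow above.

```shell
# Succeeds when value $1 is strictly greater than limit $2 (decimals allowed).
# Non-numeric input such as "null" is treated as 0 by awk, so validate upstream if that matters.
exceeds() {
  awk -v value="$1" -v limit="$2" 'BEGIN { exit !(value + 0 > limit + 0) }'
}
```

A check then reads `if exceeds "$RESPONSE_TIME" 500; then exit 1; fi`, and likewise `exceeds "$ERROR_RATE" 0.05` for the error rate.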
2. Dependency Vulnerability Scanning
# .github/workflows/security-scan.yml
name: Security Scan
on:
  schedule:
    - cron: '0 2 * * *' # Daily at 2 AM
  pull_request:
    branches: [main]
jobs:
  dependency-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '18'
      - name: Install dependencies
        run: npm ci
      - name: Run npm audit
        run: |
          npm audit --audit-level=moderate --json > audit-results.json || true
      - name: Parse audit results
        id: audit
        run: |
          # `.total` is the overall count; `| length` would only count the severity keys
          VULNERABILITIES=$(jq '.metadata.vulnerabilities.total // 0' audit-results.json)
          HIGH_VULNS=$(jq '.metadata.vulnerabilities.high // 0' audit-results.json)
          CRITICAL_VULNS=$(jq '.metadata.vulnerabilities.critical // 0' audit-results.json)
          echo "total-vulnerabilities=$VULNERABILITIES" >> $GITHUB_OUTPUT
          echo "high-vulnerabilities=$HIGH_VULNS" >> $GITHUB_OUTPUT
          echo "critical-vulnerabilities=$CRITICAL_VULNS" >> $GITHUB_OUTPUT
      - name: Check vulnerability thresholds
        run: |
          if [ "${{ steps.audit.outputs.critical-vulnerabilities }}" -gt 0 ]; then
            echo "❌ Critical vulnerabilities found: ${{ steps.audit.outputs.critical-vulnerabilities }}"
            exit 1
          fi
          if [ "${{ steps.audit.outputs.high-vulnerabilities }}" -gt 5 ]; then
            echo "⚠️ Too many high vulnerabilities: ${{ steps.audit.outputs.high-vulnerabilities }}"
            exit 1
          fi
      - name: Snyk security scan
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          # upload-sarif expects a SARIF file, so ask Snyk to write one
          args: --severity-threshold=high --sarif-file-output=snyk.sarif
      - name: Upload security scan results
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: snyk.sarif
      - name: Create security issue
        if: failure()
        uses: actions/github-script@v6
        with:
          script: |
            github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: '🔒 Security vulnerabilities detected',
              body: `
              Security scan failed with the following issues:
              - Critical vulnerabilities: ${{ steps.audit.outputs.critical-vulnerabilities }}
              - High vulnerabilities: ${{ steps.audit.outputs.high-vulnerabilities }}
              - Total vulnerabilities: ${{ steps.audit.outputs.total-vulnerabilities }}
              Please review and fix the security issues.
              Triggered by: ${context.workflow} #${context.runNumber}
              `,
              labels: ['security', 'critical']
            })
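The threshold rule in this workflow is simple: any critical vulnerability fails the run, and high-severity ones are tolerated up to a limit. A sketch of that rule as a standalone function is below; `audit_gate` and its argument order are assumptions for illustration, and the counts are expected to come from the parsed audit output.

```shell
# Hypothetical gate: $1 = critical count, $2 = high count, $3 = maximum high count allowed.
audit_gate() {
  critical=$1
  high=$2
  max_high=$3
  if [ "$critical" -gt 0 ]; then
    echo "critical vulnerabilities: $critical" >&2
    return 1
  fi
  if [ "$high" -gt "$max_high" ]; then
    echo "too many high vulnerabilities: $high (limit $max_high)" >&2
    return 1
  fi
  return 0
}
```

With the limits used above, the check becomes `audit_gate "$CRITICAL_VULNS" "$HIGH_VULNS" 5`. Keeping the limit as a parameter makes it easier to tighten over time.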
Advanced Deployment Strategies
1. Canary Deployment
# .github/workflows/canary-deploy.yml
name: Canary Deployment
on:
workflow_dispatch:
inputs:
canary-percentage:
description: 'Canary traffic percentage'
required: true
default: '10'
type: choice
options: ['5', '10', '25', '50', '100']
jobs:
canary-deploy:
runs-on: ubuntu-latest
environment: production
steps:
- uses: actions/checkout@v4
- name: Deploy canary version
run: |
# Deploy canary deployment
helm upgrade --install myapp-canary ./helm/myapp \
--namespace myapp-production \
--values ./helm/values-production.yaml \
--set image.tag=${{ github.ref_name }} \
--set deployment.type=canary \
--set replicaCount=2 \
--wait --timeout=300s
- name: Configure traffic split
run: |
# Update Istio VirtualService for traffic splitting
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: myapp-traffic-split
namespace: myapp-production
spec:
hosts:
- myapp.example.com
http:
- match:
- headers:
canary:
exact: "true"
route:
- destination:
host: myapp-canary
weight: 100
- route:
- destination:
host: myapp
weight: ${{ 100 - inputs.canary-percentage }}
- destination:
host: myapp-canary
weight: ${{ inputs.canary-percentage }}
EOF
      - name: Monitor canary metrics
        run: |
          # Monitor for 10 minutes (60 checks, 10 seconds apart)
          for i in {1..60}; do
            # Check error rate (-G with --data-urlencode keeps curl from globbing the {} and [] in the query)
            ERROR_RATE=$(curl -s -G "http://prometheus:9090/api/v1/query" --data-urlencode "query=sum(rate(http_requests_total{service=\"myapp-canary\",status=~\"5..\"}[5m]))/sum(rate(http_requests_total{service=\"myapp-canary\"}[5m]))" | jq -r '.data.result[0].value[1] // 0')
            # Check response time
            RESPONSE_TIME=$(curl -s -G "http://prometheus:9090/api/v1/query" --data-urlencode "query=histogram_quantile(0.95,sum(rate(http_request_duration_seconds_bucket{service=\"myapp-canary\"}[5m]))by(le))" | jq -r '.data.result[0].value[1] // 0')
            echo "Check $i - Error Rate: $ERROR_RATE, Response Time: ${RESPONSE_TIME}s"
            # Check thresholds
            if (( $(echo "$ERROR_RATE > 0.05" | bc -l) )); then
              echo "❌ Canary error rate too high: $ERROR_RATE"
              exit 1
            fi
            if (( $(echo "$RESPONSE_TIME > 0.5" | bc -l) )); then
              echo "❌ Canary response time too high: ${RESPONSE_TIME}s"
              exit 1
            fi
            sleep 10
          done
          echo "✅ Canary monitoring completed successfully"
      - name: Promote canary on success
        if: inputs.canary-percentage == '100'
        run: |
          # Replace main deployment with canary
          kubectl patch deployment myapp \
            --patch '{"spec":{"template":{"spec":{"containers":[{"name":"myapp","image":"myapp:${{ github.ref_name }}"}]}}}}' \
            --namespace myapp-production
          # Wait for rollout
          kubectl rollout status deployment/myapp --namespace myapp-production
          # Remove canary deployment
          helm uninstall myapp-canary --namespace myapp-production
      - name: Rollback canary on failure
        if: failure()
        run: |
          echo "❌ Canary deployment failed, rolling back..."
          # Remove traffic from canary
          kubectl apply -f - <<EOF
          apiVersion: networking.istio.io/v1beta1
          kind: VirtualService
          metadata:
            name: myapp-traffic-split
            namespace: myapp-production
          spec:
            hosts:
              - myapp.example.com
            http:
              - route:
                  - destination:
                      host: myapp
                    weight: 100
          EOF
          # Remove canary deployment
          helm uninstall myapp-canary --namespace myapp-production || true
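The traffic split comes down to one piece of arithmetic: the stable and canary weights are meant to add up to 100. A minimal sketch that computes and validates the pair in the shell is below; `split_weights` is a hypothetical helper, and the output format (stable weight, then canary weight) is an arbitrary choice.

```shell
# Prints "<stable> <canary>" for a canary percentage, or fails if the input is out of range.
split_weights() {
  canary=$1
  case "$canary" in
    # Reject empty input and anything that is not made of digits only
    ''|*[!0-9]*) echo "not a whole number: $canary" >&2; return 1 ;;
  esac
  if [ "$canary" -gt 100 ]; then
    echo "percentage above 100: $canary" >&2
    return 1
  fi
  echo "$((100 - canary)) $canary"
}
```

For example, `split_weights 10` prints `90 10`, and the two numbers can be read into variables with `read -r STABLE CANARY` before rendering the VirtualService.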
2. Feature Flag Integration
# .github/workflows/feature-flag-deploy.yml
name: Feature Flag Deployment
on:
  push:
    branches: [main]
jobs:
  deploy-with-flags:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Extract feature flags from commit
        id: flags
        env:
          # Pass the message through env instead of inlining it in the script,
          # so a crafted commit message cannot inject shell commands
          COMMIT_MSG: ${{ github.event.head_commit.message }}
        run: |
          # Parse commit message for feature flags
          # Extract flags like [feature:new-ui=true]
          FLAGS=$(echo "$COMMIT_MSG" | grep -o '\[feature:[^]]*\]' | sed 's/\[feature://g' | sed 's/\]//g' | tr '\n' ',' | sed 's/,$//')
          echo "flags=$FLAGS" >> $GITHUB_OUTPUT
          echo "Found feature flags: $FLAGS"
      - name: Deploy with feature flags
        run: |
          # Deploy with feature flags as environment variables
          IFS=',' read -ra FLAG_ARRAY <<< "${{ steps.flags.outputs.flags }}"
          FEATURE_FLAGS=""
          for flag in "${FLAG_ARRAY[@]}"; do
            # Upper-case the name and replace hyphens, which are not valid in env variable names
            FLAG_NAME=$(echo "$flag" | cut -d'=' -f1 | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')
            FLAG_VALUE=$(echo "$flag" | cut -d'=' -f2)
            FEATURE_FLAGS="$FEATURE_FLAGS --set env.FEATURE_$FLAG_NAME=$FLAG_VALUE"
          done
          helm upgrade --install myapp ./helm/myapp \
            --namespace myapp-staging \
            --values ./helm/values-staging.yaml \
            --set image.tag=${{ github.sha }} \
            $FEATURE_FLAGS \
            --wait --timeout=300s
      - name: Run feature flag tests
        run: |
          # Run tests that verify feature flags work correctly
          npm ci
          npm run test:features -- --env=staging
      - name: Update feature flag service
        run: |
          # Update LaunchDarkly or similar service
          curl -X PATCH "https://app.launchdarkly.com/api/v2/flags/default/new-ui" \
            -H "Authorization: api-${{ secrets.LAUNCHDARKLY_TOKEN }}" \
            -H "Content-Type: application/json" \
            -d '{
              "patch": [
                {
                  "op": "replace",
                  "path": "/environments/staging/on",
                  "value": true
                }
              ]
            }'
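The flag-parsing pipeline is the fiddly part of this workflow, so it helps to be able to run it outside CI. The sketch below splits it into two small functions; `extract_flags` and `flag_env_name` are hypothetical names, and the `[feature:name=value]` marker format is the one used in the commit-message convention above.

```shell
# Pulls "[feature:name=value]" markers out of a commit message, comma-separated.
extract_flags() {
  printf '%s\n' "$1" \
    | grep -o '\[feature:[^]]*\]' \
    | sed -e 's/^\[feature://' -e 's/\]$//' \
    | paste -sd, -
}

# Turns a flag name such as "new-ui" into an env-style name such as "FEATURE_NEW_UI".
flag_env_name() {
  printf 'FEATURE_%s\n' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g'
}
```

Calling `extract_flags "feat: redesign [feature:new-ui=true]"` prints `new-ui=true`, and `flag_env_name new-ui` prints `FEATURE_NEW_UI`. A message without markers yields an empty string.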
Rollback and Recovery
1. Automated Rollback
# .github/workflows/automated-rollback.yml
name: Automated Rollback
on:
  workflow_dispatch:
    inputs:
      environment:
        description: 'Environment to rollback'
        required: true
        type: choice
        options: ['staging', 'production']
      revision:
        description: 'Revision to rollback to (0 for previous)'
        required: false
        default: '0'
jobs:
  rollback:
    runs-on: ubuntu-latest
    environment: ${{ inputs.environment }}
    steps:
      - uses: actions/checkout@v4
      - name: Setup kubectl and Helm
        uses: azure/setup-kubectl@v3
      - name: Configure kubeconfig
        run: |
          mkdir -p ~/.kube
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > ~/.kube/config
      - name: Get current deployment info
        id: current
        run: |
          CURRENT_REVISION=$(helm history myapp -n myapp-${{ inputs.environment }} --max 1 --output json | jq -r '.[0].revision')
          CURRENT_IMAGE=$(kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}' -n myapp-${{ inputs.environment }})
          echo "current-revision=$CURRENT_REVISION" >> $GITHUB_OUTPUT
          echo "current-image=$CURRENT_IMAGE" >> $GITHUB_OUTPUT
      - name: Perform rollback
        run: |
          REVISION="${{ inputs.revision }}"
          if [ "$REVISION" = "0" ]; then
            # Rollback to previous revision
            REVISION=$((${{ steps.current.outputs.current-revision }} - 1))
          fi
          echo "Rolling back to revision $REVISION"
          helm rollback myapp $REVISION --namespace myapp-${{ inputs.environment }}
      - name: Wait for rollback completion
        run: |
          kubectl rollout status deployment/myapp \
            --namespace myapp-${{ inputs.environment }} \
            --timeout=300s
      - name: Verify rollback
        run: |
          # Health check
          kubectl wait --for=condition=available deployment/myapp \
            --namespace myapp-${{ inputs.environment }} \
            --timeout=120s
          # Get new image
          NEW_IMAGE=$(kubectl get deployment myapp -o jsonpath='{.spec.template.spec.containers[0].image}' -n myapp-${{ inputs.environment }})
          echo "Rollback completed:"
          echo "  Previous image: ${{ steps.current.outputs.current-image }}"
          echo "  Current image: $NEW_IMAGE"
      - name: Run smoke tests
        run: |
          npm ci
          npm run test:smoke -- --env=${{ inputs.environment }}
      - name: Notify team
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: |
            🔄 Rollback ${{ job.status }} for ${{ inputs.environment }}
            From revision: ${{ steps.current.outputs.current-revision }}
            To revision: ${{ inputs.revision }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
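The "0 means previous" rule in the rollback step is worth guarding: when the release is still on its first revision, subtracting one gives 0 and there is nothing earlier to return to. A small sketch of the revision logic with that guard is below; `resolve_revision` is a hypothetical helper, and it assumes both arguments are whole numbers.

```shell
# $1 = requested revision (0 means "the one before current"), $2 = current revision.
# Prints the revision to roll back to, or fails if there is no earlier revision.
resolve_revision() {
  requested=$1
  current=$2
  if [ "$requested" -eq 0 ]; then
    requested=$((current - 1))
  fi
  if [ "$requested" -lt 1 ]; then
    echo "no earlier revision to roll back to" >&2
    return 1
  fi
  echo "$requested"
}
```

For example, `resolve_revision 0 7` prints `6`. Printing the resolved number also gives the notification step something more useful to report than the raw input.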
2. Health Check และ Auto-Recovery
# .github/workflows/health-monitor.yml
name: Health Monitor
on:
  schedule:
    - cron: '*/5 * * * *' # Every 5 minutes
  workflow_dispatch:
jobs:
  health-check:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        environment: [staging, production]
    steps:
      - name: Health check
        id: health
        run: |
          HEALTH_URL="https://api-${{ matrix.environment }}.myapp.com/health"
          # Try 3 times
          for i in {1..3}; do
            if curl -f --max-time 10 "$HEALTH_URL"; then
              echo "status=healthy" >> $GITHUB_OUTPUT
              exit 0
            fi
            echo "Attempt $i failed"
            sleep 10
          done
          echo "status=unhealthy" >> $GITHUB_OUTPUT
      - name: Check deployment status
        if: steps.health.outputs.status == 'unhealthy'
        id: deployment
        run: |
          # Check if deployment is actually running
          READY_PODS=$(kubectl get pods -l app=myapp -n myapp-${{ matrix.environment }} --field-selector=status.phase=Running --no-headers | wc -l)
          TOTAL_PODS=$(kubectl get pods -l app=myapp -n myapp-${{ matrix.environment }} --no-headers | wc -l)
          echo "ready-pods=$READY_PODS" >> $GITHUB_OUTPUT
          echo "total-pods=$TOTAL_PODS" >> $GITHUB_OUTPUT
      - name: Auto-restart if needed
        if: steps.health.outputs.status == 'unhealthy' && steps.deployment.outputs.ready-pods == '0'
        run: |
          echo "No healthy pods found, restarting deployment..."
          kubectl rollout restart deployment/myapp -n myapp-${{ matrix.environment }}
          kubectl rollout status deployment/myapp -n myapp-${{ matrix.environment }} --timeout=300s
      - name: Auto-rollback if restart fails
        if: failure() && matrix.environment == 'production'
        run: |
          echo "Restart failed, performing automatic rollback..."
          helm rollback myapp 0 --namespace myapp-production
          kubectl rollout status deployment/myapp --namespace myapp-production --timeout=300s
      - name: Alert on persistent failure
        if: failure()
        uses: 8398a7/action-slack@v3
        with:
          status: failure
          text: |
            🚨 CRITICAL: ${{ matrix.environment }} is DOWN!
            Health check failed and auto-recovery attempts unsuccessful.
            Manual intervention required.
            Ready pods: ${{ steps.deployment.outputs.ready-pods }}/${{ steps.deployment.outputs.total-pods }}
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
      - name: Create incident issue
        if: failure() && matrix.environment == 'production'
        uses: actions/github-script@v6
        with:
          script: |
            github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: '🚨 PRODUCTION DOWN - Auto-recovery failed',
              body: `
              ## Incident Report
              **Environment:** ${{ matrix.environment }}
              **Time:** ${new Date().toISOString()}
              **Status:** Health check failed, auto-recovery unsuccessful
              ### Details
              - Ready pods: ${{ steps.deployment.outputs.ready-pods }}/${{ steps.deployment.outputs.total-pods }}
              - Health endpoint: https://api-${{ matrix.environment }}.myapp.com/health
              ### Actions Taken
              - [x] Health check (failed)
              - [x] Deployment restart attempt (failed)
              - [x] Auto-rollback attempt (failed)
              ### Next Steps
              - [ ] Manual investigation required
              - [ ] Check infrastructure status
              - [ ] Review recent deployments
              **Assignees:** @oncall-team
              `,
              labels: ['incident', 'critical', 'production'],
              assignees: ['oncall-engineer']
            })
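The recovery logic above is a small decision table: do nothing when healthy, restart when unhealthy with no running pods, and otherwise leave it to a human. As a sketch, the table can be written as a function and exercised without a cluster. `recovery_action` is a hypothetical helper, and the `investigate` outcome is an assumption added here to name the case the workflow does not act on.

```shell
# Prints the next action for a health status and a count of running pods.
recovery_action() {
  status=$1
  ready_pods=$2
  if [ "$status" = "healthy" ]; then
    echo none
  elif [ "$ready_pods" -eq 0 ]; then
    echo restart
  else
    # Pods are running but the endpoint is failing: a restart is not attempted
    echo investigate
  fi
}
```

For example, `recovery_action unhealthy 0` prints `restart`, while `recovery_action unhealthy 3` prints `investigate`.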
Case Study: From Manual Hell to Automated Heaven
Before GitHub Actions
Every deploy looked like this:
# Steps to remember and run by hand (30-45 minutes)
1. git pull origin main
2. npm install
3. npm run test # sometimes forgotten
4. npm run build
5. docker build -t myapp:v1.2.3 .
6. docker push registry/myapp:v1.2.3
7. kubectl set image deployment/myapp myapp=registry/myapp:v1.2.3
8. kubectl rollout status deployment/myapp
9. curl https://api.myapp.com/health # manual check
10. Notify the team in Slack (sometimes forgotten)
Problems we actually hit:
- Deploying to the wrong environment (staging code shipped to production)
- Version mismatch (Docker tag not matching the code)
- Forgetting to run the tests and deploying broken code
- Manual errors: typos and bad copy-paste
- No audit trail: nobody knew who had deployed
- Rollbacks were hard when something went wrong
After GitHub Actions
The new way to deploy:
# For staging
git push origin main
# For production
git tag v1.2.3
git push origin v1.2.3
# That's it! 🎉
What now happens automatically:
- ✅ Code quality checks (ESLint, Prettier, TypeScript)
- ✅ Security scans (npm audit, Snyk, Trivy)
- ✅ Automated tests (Unit, Integration, E2E)
- ✅ Multi-platform builds (AMD64, ARM64)
- ✅ Zero-downtime deployment (Rolling update)
- ✅ Health checks and smoke tests
- ✅ Automatic notifications (Slack, Teams)
- ✅ Monitoring integration (Prometheus metrics)
- ✅ Auto-rollback when something goes wrong
The results:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Deploy Time | 30-45 min | 5-8 min | 80% faster |
| Error Rate | 25% | <2% | 92% reduction |
| Rollback Time | 20-30 min | 2-3 min | 90% faster |
| Manual Steps | 10+ | 0 | 100% automated |
| Audit Trail | None | Complete | Full traceability |
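The Improvement column can be roughly reproduced from the Before and After values. The sketch below uses the midpoints of the quoted ranges and treats "<2%" as 2, both of which are assumptions; `pct_reduction` is a hypothetical helper name.

```shell
# Prints the percentage reduction from $1 (before) to $2 (after), rounded to a whole number.
pct_reduction() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}
```

With these inputs, `pct_reduction 25 2` gives 92 for the error rate, `pct_reduction 25 2.5` gives 90 for rollback time, and `pct_reduction 37.5 6.5` gives 83 for deploy time, close to the rounded 80% in the table.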
Team Productivity Impact:
- Deployment frequency: from 1-2 times a week to 5-10 times a day
- Lead time: from idea to production, down from 2 weeks to 2 days
- MTTR (Mean Time To Recovery): from 2 hours to 10 minutes
- Developer confidence: from fearing deploys to deploying at any time
Summary: How GitHub Actions Changed a Developer's Life
Before GitHub Actions:
- Deploying meant stress, fear, and a lot of time 😰
- A manual process, prone to errors
- Nobody dared to deploy often
- Rollback was a nightmare
- No visibility into the deployment process
After GitHub Actions:
- Deployment is a git push. Simple as that! 🚀
- Manual errors are gone because every step is automated
- Enough confidence to deploy every day
- Fast rollback when something goes wrong
- Complete visibility into every step
Benefits we actually saw:
- Developer productivity up 5x: the time once spent on manual deploys now goes into new features
- Quality gate: code that fails the checks is blocked before it reaches production
- Faster feedback: problems surface as soon as they occur
- Team collaboration: anyone can deploy, and everyone does it the same way
- Compliance: a complete audit trail
Best practices learned:
- Everything as Code: workflow, configuration, infrastructure
- Fail Fast: catch problems as early as possible
- Progressive Deployment: canary and blue-green strategies
- Monitoring Integration: connect CI/CD to monitoring
- Security First: security checks at every stage
Anti-patterns to avoid:
- Manual steps inside an automated pipeline
- No rollback strategy
- Long-running workflows that block other people
- No proper secret management
- Over-complicated workflows
GitHub Actions is like a personal assistant that never slips up.
It turned deploying from "a scary chore" into "an ordinary thing".
I can no longer imagine working without CI/CD,
because it means I never have to fear a deploy again! 🎯✨