Production Deployment
Moving containers from development to production requires careful consideration of reliability, scalability, and operational concerns. This chapter covers deployment strategies, health checks, and patterns for production-ready containers.
Production Readiness
The Twelve-Factor App
Containers work best when following twelve-factor app principles:
Twelve-Factor Principles for Containers
| Factor | Container Implementation |
|---|---|
| Codebase | One image per service |
| Dependencies | All deps in image |
| Config | Environment variables |
| Backing Services | Service discovery |
| Build, Release, Run | Separate build/deploy stages |
| Processes | Stateless containers |
| Port Binding | Exposed ports |
| Concurrency | Horizontal scaling |
| Disposability | Fast startup/shutdown |
| Dev/Prod Parity | Same image everywhere |
| Logs | Stream to stdout |
| Admin Processes | Run in containers |
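As a concrete illustration of the Config and Logs factors, an application can read its settings from environment variables and write log lines to stdout. A minimal Node.js sketch (the variable names here are illustrative, not part of any standard):
// config.js - twelve-factor Config: read settings from the environment
const config = {
  port: parseInt(process.env.PORT || "3000", 10),
  logLevel: process.env.LOG_LEVEL || "info",
  databaseUrl: process.env.DATABASE_URL, // injected by compose/orchestrator
};
// Twelve-factor Logs: write to stdout and let the platform collect the stream
console.log(JSON.stringify({ level: "info", msg: "config loaded", port: config.port }));
module.exports = config;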
Production Dockerfile
# syntax=docker/dockerfile:1
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --production
# Production stage
FROM node:20-alpine AS production
# Security: non-root user
RUN addgroup -S nodejs && adduser -S nodejs -G nodejs
WORKDIR /app
# Copy only production artifacts
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nodejs:nodejs /app/package.json ./
# Security: drop to non-root
USER nodejs
# Operational: expose port
EXPOSE 3000
# Reliability: health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
# Start application
CMD ["node", "dist/server.js"]Health Checks
Why Health Checks Matter
Health checks enable:
- Automatic container restart on failure
- Load balancer integration
- Rolling deployment verification
- Service discovery health awareness
Dockerfile Health Check
# Simple HTTP health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
# Using wget (smaller in alpine)
HEALTHCHECK CMD wget --quiet --tries=1 --spider http://localhost:8080/health || exit 1
# Custom script
HEALTHCHECK CMD /app/healthcheck.sh
Health Check Parameters
| Parameter | Default | Description |
|---|---|---|
| --interval | 30s | Time between checks |
| --timeout | 30s | Time to wait for a response |
| --start-period | 0s | Grace period for startup |
| --retries | 3 | Failures before unhealthy |
Application Health Endpoints
// Node.js health endpoint
app.get("/health", (req, res) => {
res.status(200).json({ status: "healthy" });
});
// With dependency checks
app.get("/health", async (req, res) => {
try {
await db.query("SELECT 1");
await redis.ping();
res.status(200).json({
status: "healthy",
checks: {
database: "ok",
redis: "ok",
},
});
} catch (error) {
res.status(503).json({
status: "unhealthy",
error: error.message,
});
}
});# Python/Flask health endpoint
@app.route('/health')
def health():
try:
# Check database
db.session.execute('SELECT 1')
return jsonify({'status': 'healthy'}), 200
except Exception as e:
return jsonify({'status': 'unhealthy', 'error': str(e)}), 503Liveness vs Readiness
In orchestrated environments such as Kubernetes, distinguish between liveness and readiness checks:
// Liveness: Is the process alive?
app.get("/livez", (req, res) => {
res.status(200).send("OK");
});
// Readiness: Can it serve traffic?
app.get("/readyz", async (req, res) => {
if (await isReady()) {
res.status(200).send("OK");
} else {
res.status(503).send("Not Ready");
}
});Note
Liveness checks should be simple—just verify the process is running. Readiness checks verify the application can handle requests (database connected, etc.).
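The isReady() helper above is left undefined; a minimal sketch, assuming the same db and redis clients used in the earlier health endpoint, could look like this:
// Readiness helper: ready only when critical dependencies respond.
// Assumes the db and redis clients from the dependency-check example above.
async function isReady() {
  try {
    await db.query("SELECT 1"); // database reachable
    await redis.ping(); // cache reachable
    return true;
  } catch (err) {
    return false;
  }
}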
Graceful Shutdown
Signal Handling
// Node.js graceful shutdown
const server = app.listen(3000);
process.on("SIGTERM", () => {
console.log("SIGTERM received, shutting down gracefully");
server.close(() => {
console.log("HTTP server closed");
// Close database connections
db.close(() => {
console.log("Database connection closed");
process.exit(0);
});
});
// Force close after timeout
setTimeout(() => {
console.error("Forced shutdown after timeout");
process.exit(1);
}, 30000);
});# Python graceful shutdown
import signal
import sys
def handle_sigterm(signum, frame):
    print("SIGTERM received, shutting down gracefully")
    # Cleanup code here
    sys.exit(0)
signal.signal(signal.SIGTERM, handle_sigterm)
Docker Stop Behavior
# Docker sends SIGTERM, waits, then SIGKILL
docker stop mycontainer # Default 10 second timeout
# Increase timeout for slow shutdown
docker stop --time 30 mycontainer
# Compose timeout
docker compose down --timeout 30
Using Tini
For proper signal handling and zombie process reaping:
FROM node:20-alpine
# Install tini
RUN apk add --no-cache tini
# Use tini as init
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "server.js"]Or use Docker's --init flag:
docker run --init myapp
Deployment Strategies
Rolling Update
Deploy new versions gradually:
# docker-compose.yml with scale
services:
  app:
    image: myapp:${VERSION}
    deploy:
      replicas: 3
      update_config:
        parallelism: 1
        delay: 10s
        order: start-first
        failure_action: rollback
      rollback_config:
        parallelism: 1
        delay: 10s
Blue-Green Deployment
Run two identical environments and switch traffic between them:
# Blue environment (current)
services:
  app-blue:
    image: myapp:1.0
    ports:
      - "3000:3000"
  # Green environment (new)
  app-green:
    image: myapp:2.0
    ports:
      - "3001:3000"
  # Switch by updating nginx/load balancer
  nginx:
    image: nginx
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
Canary Deployment
Route a small percentage of traffic to the new version:
# nginx.conf for canary
upstream app {
    server app-stable:3000 weight=9;
    server app-canary:3000 weight=1;
}
Deployment Strategy Comparison
| Strategy | Zero Downtime | Rollback Speed | Resource Usage |
|---|---|---|---|
| Rolling | Yes | Medium | Normal |
| Blue-Green | Yes | Instant | 2x during deploy |
| Canary | Yes | Instant | Slightly more |
| Recreate | No | Fast | Normal |
Container Orchestration
Docker Swarm
Docker's native orchestration:
# Initialize swarm
docker swarm init
# Deploy stack
docker stack deploy -c docker-compose.yml myapp
# Scale service
docker service scale myapp_api=5
# View services
docker service ls
docker service ps myapp_api
# Update service
docker service update --image myapp:2.0 myapp_api
# docker-compose.yml for Swarm
version: "3.8"
services:
  app:
    image: myapp:latest
    deploy:
      replicas: 3
      placement:
        constraints:
          - node.role == worker
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
    ports:
      - "3000:3000"
    networks:
      - app-network
networks:
  app-network:
    driver: overlay
Kubernetes Basics
While Kubernetes deserves its own guide, here's a basic deployment:
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:1.0
          ports:
            - containerPort: 3000
          resources:
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 3000
  type: LoadBalancer
Note
Docker skills transfer directly to Kubernetes—you're still building and running Docker images. Kubernetes adds powerful orchestration, scaling, and self-healing capabilities.
Load Balancing
Nginx Load Balancer
# docker-compose.yml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - app
  app:
    image: myapp:latest
    deploy:
      replicas: 3
# nginx.conf
upstream app {
    least_conn;
    server app:3000;
}
server {
    listen 80;
    location / {
        proxy_pass http://app;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
    }
    location /health {
        access_log off;
        proxy_pass http://app;
    }
}
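Because nginx terminates the client connection and forwards X-Forwarded-* headers, the upstream application should be told to trust the proxy so the real client IP and protocol are reported correctly. For an Express app (an assumption; adjust for your framework), a one-line sketch:
// Trust the first proxy hop (nginx) so req.ip, req.protocol, and
// secure-cookie handling use the X-Forwarded-* headers it sets.
app.set("trust proxy", 1);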
Traefik
# docker-compose.yml with Traefik
services:
  traefik:
    image: traefik:v3.0
    command:
      - "--providers.docker=true"
      - "--entrypoints.web.address=:80"
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
  app:
    image: myapp:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`app.example.com`)"
      - "traefik.http.services.app.loadbalancer.server.port=3000"
Environment Configuration
Configuration Management
# docker-compose.yml
services:
  app:
    image: myapp:latest
    environment:
      - NODE_ENV=production
      - LOG_LEVEL=info
    env_file:
      - .env.production
    configs:
      - source: app_config
        target: /app/config.json
configs:
  app_config:
    file: ./config/production.json
Secret Management
services:
  app:
    image: myapp:latest
    secrets:
      - db_password
      - api_key
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password
secrets:
  db_password:
    external: true # Created with: docker secret create
  api_key:
    file: ./secrets/api_key.txt
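On the application side, file-based secrets are read from the mounted path rather than from a plain environment variable. A minimal Node.js sketch, assuming the DB_PASSWORD_FILE variable set in the compose file above (the fallback to DB_PASSWORD is an illustrative convention, not a requirement):
// Read the database password from the secret file mounted by Docker.
const fs = require("fs");
function getDbPassword() {
  const file = process.env.DB_PASSWORD_FILE;
  if (file) {
    return fs.readFileSync(file, "utf8").trim();
  }
  return process.env.DB_PASSWORD; // fallback for local development (assumption)
}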
Zero-Downtime Deployment Script
#!/bin/bash
# deploy.sh
set -e
IMAGE_NAME="myapp"
NEW_VERSION=$1
OLD_VERSION=$(docker ps --filter name=myapp --format '{{.Image}}' | head -1)
echo "Deploying $IMAGE_NAME:$NEW_VERSION"
# Pull new image
docker pull $IMAGE_NAME:$NEW_VERSION
# Start new container
docker run -d \
--name myapp-new \
--network myapp-network \
--health-cmd='wget --quiet --tries=1 --spider http://localhost:3000/health || exit 1' \
--health-interval=5s \
--health-retries=3 \
$IMAGE_NAME:$NEW_VERSION
# Wait for healthy
echo "Waiting for new container to be healthy..."
until [ "$(docker inspect --format='{{.State.Health.Status}}' myapp-new)" == "healthy" ]; do
sleep 2
done
# Switch traffic to the new container
echo "Switching traffic..."
docker rename myapp myapp-old 2>/dev/null || true
docker rename myapp-new myapp
# Stop and remove the old container
docker stop myapp-old 2>/dev/null || true
docker rm myapp-old 2>/dev/null || true
echo "Deployment complete: $NEW_VERSION"
Production Checklist
Container Configuration
- Multi-stage build for minimal image
- Non-root user
- Health checks defined
- Resource limits set
- Read-only filesystem where possible
- Proper signal handling
- Graceful shutdown implemented
Deployment
- Deployment strategy chosen
- Rollback procedure tested
- Zero-downtime verified
- Load balancer configured
- SSL/TLS termination set up
- DNS configured
Operations
- Logging to stdout/stderr
- Monitoring configured
- Alerting set up
- Backup procedures for data
- Disaster recovery plan
Quick Reference
Health Check Options
| Option | Default | Purpose |
|---|---|---|
| --interval | 30s | Check frequency |
| --timeout | 30s | Timeout per check |
| --start-period | 0s | Initial grace period |
| --retries | 3 | Failures before unhealthy |
Swarm Commands
| Command | Purpose |
|---|---|
| docker swarm init | Initialize swarm |
| docker stack deploy | Deploy stack |
| docker service ls | List services |
| docker service scale | Scale replicas |
| docker service update | Update service |
In the next chapter, we'll explore monitoring and logging for containerized applications.