Monitoring and Logging

Observability is critical for production containers. This chapter covers logging strategies, metrics collection, and monitoring solutions for containerized applications.

The Three Pillars of Observability

PLAINTEXT
┌─────────────────────────────────────────────────────────┐
│                    Observability                         │
│  ┌───────────────┐ ┌───────────────┐ ┌───────────────┐  │
│  │    Logs       │ │   Metrics     │ │   Traces      │  │
│  │               │ │               │ │               │  │
│  │  What         │ │  How much     │ │  Where        │  │
│  │  happened     │ │  and when     │ │  (flow)       │  │
│  └───────────────┘ └───────────────┘ └───────────────┘  │
└─────────────────────────────────────────────────────────┘

Observability Pillars

| Pillar  | Purpose                | Examples                      |
|---------|------------------------|-------------------------------|
| Logs    | Event records          | Errors, requests, debug info  |
| Metrics | Numerical measurements | CPU, memory, request count    |
| Traces  | Request flow           | Distributed transaction path  |

Container Logging

Docker Logging Basics

Containers should log to stdout and stderr:

DOCKERFILE
# Application logs to stdout/stderr
FROM node:20-alpine
WORKDIR /app
COPY . .
# Logs go to container output
CMD ["node", "server.js"]
JAVASCRIPT
// Application logging to stdout/stderr
console.log("Info message"); // stdout
console.error("Error message"); // stderr
 
// Structured logging (inside a request handler, where `req` is in scope)
console.log(
  JSON.stringify({
    level: "info",
    message: "Request received",
    timestamp: new Date().toISOString(),
    requestId: req.id,
  }),
);

Viewing Logs

BASH
# View container logs
docker logs mycontainer
 
# Follow logs in real-time
docker logs -f mycontainer
 
# Show timestamps
docker logs -t mycontainer
 
# Last N lines
docker logs --tail 100 mycontainer
 
# Logs since specific time
docker logs --since 1h mycontainer
docker logs --since 2024-01-15T10:00:00 mycontainer
 
# Compose logs
docker compose logs
docker compose logs -f api
docker compose logs --tail 50 api db

Logging Drivers

Docker supports multiple logging drivers:

BASH
# Check current driver
docker info --format '{{.LoggingDriver}}'
 
# Run with specific driver
docker run --log-driver json-file myapp
docker run --log-driver syslog myapp
docker run --log-driver none myapp

Logging Drivers

| Driver    | Description               | Use Case                     |
|-----------|---------------------------|------------------------------|
| json-file | Default, local JSON files | Development, simple setups   |
| syslog    | System syslog             | Integration with syslog      |
| journald  | systemd journal           | systemd-based systems        |
| fluentd   | Fluentd collector         | Centralized logging          |
| awslogs   | AWS CloudWatch            | AWS deployments              |
| gcplogs   | Google Cloud Logging      | GCP deployments              |
| none      | No logging                | When using app-level logging |

Configuring json-file Driver

YAML
# docker-compose.yml
services:
  app:
    image: myapp
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "5"
        compress: "true"
JSON
// /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "5",
    "compress": "true"
  }
}

Warning

Without log rotation (max-size and max-file), container logs can consume all disk space. Always configure rotation in production.
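
To see how much space a container's json-file logs are already using, the file can be located with docker inspect (the path lives under /var/lib/docker and usually requires root to read):

BASH
# Locate the json-file log for a container and check its size
LOG_PATH=$(docker inspect --format '{{.LogPath}}' mycontainer)
sudo ls -lh "$LOG_PATH"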

Structured Logging

JSON Logging

JAVASCRIPT
// Node.js with pino
const pino = require("pino");
const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
});
 
logger.info({ requestId: "123", userId: "456" }, "User action");
 
// Output:
// {"level":"info","time":1704067200000,"requestId":"123","userId":"456","msg":"User action"}
PYTHON
# Python with structlog
import structlog
 
structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer()
    ]
)
 
logger = structlog.get_logger()
logger.info("user_action", request_id="123", user_id="456")
 
# Output:
# {"event": "user_action", "request_id": "123", "user_id": "456", "level": "info", "timestamp": "2024-01-15T10:00:00Z"}

Log Levels

JAVASCRIPT
// Define consistent log levels
const LEVELS = { debug: 0, info: 1, warn: 2, error: 3 };
 
// Minimum level comes from the LOG_LEVEL environment variable
const threshold = LEVELS[process.env.LOG_LEVEL || "info"];
 
const log = (level, stream) => (msg, meta) => {
  // Drop messages below the configured threshold
  if (LEVELS[level] < threshold) return;
  stream(JSON.stringify({ level, msg, ...meta }));
};
 
const logger = {
  debug: log("debug", console.log),
  info: log("info", console.log),
  warn: log("warn", console.warn),
  error: log("error", console.error),
};

Note

Use structured (JSON) logging in production. It enables filtering, searching, and analysis in log aggregation systems.
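
As a quick local example, error-level entries can be pulled out of a container's structured logs with jq (assuming jq is installed and the app emits one JSON object per line):

BASH
# Show only error-level entries from structured container logs
docker logs mycontainer 2>&1 | jq -R 'fromjson? | select(.level == "error")'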

Centralized Logging

ELK Stack (Elasticsearch, Logstash, Kibana)

YAML
# docker-compose.yml
services:
  elasticsearch:
    image: elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
 
  logstash:
    image: logstash:8.11.0
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    ports:
      - "5044:5044"
      - "12201:12201/udp" # GELF input used by the app's logging driver
    depends_on:
      - elasticsearch
 
  kibana:
    image: kibana:8.11.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch
 
  app:
    image: myapp
    logging:
      driver: gelf
      options:
        gelf-address: "udp://localhost:12201"
 
volumes:
  elasticsearch-data:
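
The logstash service mounts ./logstash/pipeline, but the pipeline itself is not shown above. A minimal sketch that accepts the GELF stream from the app service and forwards it to Elasticsearch could look like this (the filename logstash.conf is just a convention):

PLAINTEXT
# logstash/pipeline/logstash.conf
input {
  gelf {
    port => 12201
  }
}
 
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
  }
}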

Loki + Grafana

YAML
# docker-compose.yml
services:
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - loki-data:/loki
 
  promtail:
    image: grafana/promtail:2.9.0
    volumes:
      - /var/log:/var/log
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail-config.yml:/etc/promtail/config.yml
    command: -config.file=/etc/promtail/config.yml
 
  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    environment:
      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    volumes:
      - grafana-data:/var/lib/grafana
 
volumes:
  loki-data:
  grafana-data:
YAML
# promtail-config.yml
server:
  http_listen_port: 9080
 
positions:
  filename: /tmp/positions.yaml
 
clients:
  - url: http://loki:3100/loki/api/v1/push
 
scrape_configs:
  - job_name: docker
    static_configs:
      - targets:
          - localhost
        labels:
          job: docker
          __path__: /var/lib/docker/containers/*/*-json.log
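
Once Promtail is shipping those files into Loki, the logs can be queried from Grafana's Explore view with LogQL. Two examples against the job: docker label defined above (note that json-file lines are wrapped in Docker's own JSON envelope with log, stream, and time fields):

PLAINTEXT
# All Docker container logs containing "error"
{job="docker"} |= "error"
 
# Parse Docker's JSON envelope and keep only stderr lines
{job="docker"} | json | stream = "stderr"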

Metrics Collection

Prometheus Metrics

YAML
# docker-compose.yml
services:
  prometheus:
    image: prom/prometheus:v2.47.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
 
  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - prometheus
 
  app:
    image: myapp
    ports:
      - "3001:3000"
      - "9100:9100" # Metrics endpoint
 
volumes:
  prometheus-data:
  grafana-data:
YAML
# prometheus.yml
global:
  scrape_interval: 15s
 
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
 
  - job_name: "app"
    static_configs:
      - targets: ["app:9100"]
 
  - job_name: "docker"
    static_configs:
      - targets: ["host.docker.internal:9323"]

Exposing Application Metrics

JAVASCRIPT
// Node.js with prom-client
const client = require("prom-client");
const express = require("express");
const app = express();
 
// Create metrics
const httpRequestCounter = new client.Counter({
  name: "http_requests_total",
  help: "Total number of HTTP requests",
  labelNames: ["method", "path", "status"],
});
 
const httpRequestDuration = new client.Histogram({
  name: "http_request_duration_seconds",
  help: "HTTP request duration in seconds",
  labelNames: ["method", "path"],
  buckets: [0.1, 0.5, 1, 2, 5],
});
 
// Middleware to track requests
app.use((req, res, next) => {
  const end = httpRequestDuration.startTimer({
    method: req.method,
    path: req.path,
  });
  res.on("finish", () => {
    httpRequestCounter.inc({
      method: req.method,
      path: req.path,
      status: res.statusCode,
    });
    end();
  });
  next();
});
 
// Metrics endpoint
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", client.register.contentType);
  res.end(await client.register.metrics());
});
PYTHON
# Python with prometheus-client
import time
 
from flask import Flask, Response, request
from prometheus_client import Counter, Histogram, generate_latest
 
app = Flask(__name__)
 
REQUEST_COUNT = Counter('http_requests_total', 'Total requests', ['method', 'path', 'status'])
REQUEST_LATENCY = Histogram('http_request_duration_seconds', 'Request latency', ['method', 'path'])
 
@app.before_request
def before_request():
    request.start_time = time.time()
 
@app.after_request
def after_request(response):
    REQUEST_COUNT.labels(request.method, request.path, response.status_code).inc()
    REQUEST_LATENCY.labels(request.method, request.path).observe(time.time() - request.start_time)
    return response
 
@app.route('/metrics')
def metrics():
    return Response(generate_latest(), mimetype='text/plain')

Container Metrics with cAdvisor

YAML
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.0
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    privileged: true

Note

cAdvisor provides detailed container metrics including CPU, memory, network, and filesystem usage. It's essential for understanding container resource consumption.
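
To pull these metrics into Prometheus, add a scrape job pointing at the cAdvisor service, for example (service name cadvisor as in the Compose file above):

YAML
# prometheus.yml (excerpt)
scrape_configs:
  - job_name: "cadvisor"
    static_configs:
      - targets: ["cadvisor:8080"]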

Complete Monitoring Stack

Docker Compose Setup

YAML
# docker-compose.monitoring.yml
services:
  # Application
  app:
    image: myapp
    ports:
      - "3000:3000"
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    labels:
      - "prometheus.scrape=true"
      - "prometheus.port=9100"
 
  # Metrics
  prometheus:
    image: prom/prometheus:v2.47.0
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus-data:/prometheus
 
  # Container metrics
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.0
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
 
  # Log aggregation
  loki:
    image: grafana/loki:2.9.0
    ports:
      - "3100:3100"
 
  promtail:
    image: grafana/promtail:2.9.0
    volumes:
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - ./promtail:/etc/promtail
 
  # Visualization
  grafana:
    image: grafana/grafana:10.2.0
    ports:
      - "3001:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana
      - ./grafana/provisioning:/etc/grafana/provisioning
 
  # Alerting
  alertmanager:
    image: prom/alertmanager:v0.26.0
    ports:
      - "9093:9093"
    volumes:
      - ./alertmanager:/etc/alertmanager
 
volumes:
  prometheus-data:
  grafana-data:
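
The prometheus.scrape and prometheus.port labels on the app service are only hints; Prometheus still needs a discovery rule that reads them. A sketch using Prometheus's Docker service discovery (this assumes the Prometheus container also mounts /var/run/docker.sock):

YAML
# prometheus/prometheus.yml (excerpt)
scrape_configs:
  - job_name: "docker-containers"
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
    relabel_configs:
      # Keep only containers labeled prometheus.scrape=true
      - source_labels: [__meta_docker_container_label_prometheus_scrape]
        regex: "true"
        action: keep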

Alerting

Prometheus Alert Rules

YAML
# prometheus/alert.rules.yml
groups:
  - name: container_alerts
    rules:
      - alert: HighMemoryUsage
        expr: container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.name }} high memory usage"
          description: "Memory usage is above 90%"
 
      - alert: HighCPUUsage
        expr: rate(container_cpu_usage_seconds_total[5m]) > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Container {{ $labels.name }} high CPU usage"
 
      - alert: ContainerDown
        expr: up{job="containers"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.instance }} is down"

Alertmanager Configuration

YAML
# alertmanager/config.yml
global:
  smtp_smarthost: "smtp.example.com:587"
  smtp_from: "alerts@example.com"
 
route:
  receiver: "default"
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 1h
  routes:
    - match:
        severity: critical
      receiver: "pagerduty"
 
receivers:
  - name: "default"
    email_configs:
      - to: "team@example.com"
 
  - name: "pagerduty"
    pagerduty_configs:
      - service_key: "your-pagerduty-key"
 
  - name: "slack"
    slack_configs:
      - api_url: "https://hooks.slack.com/services/..."
        channel: "#alerts"

Docker Stats and Events

Built-in Monitoring

BASH
# Real-time container stats
docker stats
 
# Stats for specific containers
docker stats app db redis
 
# One-shot stats
docker stats --no-stream
 
# Custom format
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"
 
# Docker events
docker events
 
# Filter events
docker events --filter 'type=container'
docker events --filter 'event=die'
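
Events can also be emitted as JSON, which makes them easy to feed into the same log pipeline as application output:

BASH
# Stream events as one JSON object per line
docker events --format '{{json .}}'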

Docker System Information

BASH
# Overall system info
docker system df
 
# Detailed breakdown
docker system df -v
 
# System information
docker info

Quick Reference

Logging Commands

| Command              | Purpose             |
|----------------------|---------------------|
| docker logs          | View container logs |
| docker logs -f       | Follow logs         |
| docker logs --tail N | Last N lines        |
| docker logs --since  | Logs since time     |

Monitoring Tools

| Tool         | Purpose            |
|--------------|--------------------|
| Prometheus   | Metrics collection |
| Grafana      | Visualization      |
| Loki         | Log aggregation    |
| cAdvisor     | Container metrics  |
| Alertmanager | Alert routing      |

Essential Metrics

| Metric                                  | Description      |
|-----------------------------------------|------------------|
| container_cpu_usage_seconds_total       | CPU usage        |
| container_memory_usage_bytes            | Memory usage     |
| container_network_receive_bytes_total   | Network in       |
| container_network_transmit_bytes_total  | Network out      |
| container_fs_usage_bytes                | Filesystem usage |

In the next chapter, we'll explore integrating Docker into CI/CD pipelines.