Building Docker Images
Creating custom Docker images is a fundamental skill. This chapter covers Dockerfile syntax, the build process, and how to create efficient, production-ready images.
Introduction to Dockerfiles
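A Dockerfile is a text file containing the instructions Docker follows to build an image. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer to the final image; the remaining instructions only record metadata.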
# A simple Dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
Note
Dockerfiles are named Dockerfile by convention (no extension). You can use
alternative names with the -f flag when building.
The Build Process
Basic Build Command
# Build from current directory
docker build -t myapp:1.0 .
# The dot (.) specifies the build context
What Happens During a Build
- Docker sends the build context to the daemon
- Docker executes each instruction in order
- Instructions that change the filesystem (RUN, COPY, ADD) each create a new layer
- The final image is tagged with the specified name
$ docker build -t myapp:1.0 .
[+] Building 45.2s (10/10) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> [internal] load .dockerignore 0.0s
=> [internal] load metadata for docker.io/library/node:20-alpine 1.2s
=> [1/5] FROM docker.io/library/node:20-alpine@sha256:abc123... 5.3s
=> [2/5] WORKDIR /app 0.1s
=> [3/5] COPY package.json ./ 0.0s
=> [4/5] RUN npm install 25.6s
=> [5/5] COPY . . 0.1s
=> exporting to image 2.8s
Build Context
The build context is the set of files Docker can access during the build:
# Current directory as context
docker build .
# Specific directory as context
docker build ./app
# Build from a URL
docker build https://github.com/user/repo.git
# Build from a Dockerfile on stdin (no build context is sent)
docker build - < Dockerfile
Warning
Docker sends the entire build context to the daemon. Large contexts slow down
builds. Use .dockerignore to exclude unnecessary files.
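To get a rough sense of how much data a build might send, you can measure the context directory before building. The sketch below is only an approximation: `context_size_kb` and `largest_entries` are hypothetical helper names, and neither applies .dockerignore rules, so the numbers are an upper bound.

```shell
# Hypothetical helper: approximate size (in KiB) of a build context directory.
# It does not apply .dockerignore rules, so treat the result as an upper bound.
context_size_kb() {
  du -sk "${1:-.}" | cut -f1
}

# Hypothetical helper: list the five largest top-level entries (dotfiles
# included), which are the usual candidates for .dockerignore.
largest_entries() {
  find "${1:-.}" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n 5
}
```

Running `largest_entries .` in a Node.js project would typically put node_modules and .git near the top of the list.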
The .dockerignore File
Create a .dockerignore file to exclude files from the build context:
# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
.env
.env.*
*.md
!README.md
Dockerfile*
docker-compose*
.dockerignore
coverage
.nyc_output
dist
build
Impact of .dockerignore
# Without .dockerignore (includes node_modules)
Sending build context to Docker daemon 245.3MB
# With .dockerignore (excludes node_modules)
Sending build context to Docker daemon 156.7kB
Dockerfile Instructions
FROM - Base Image
Every Dockerfile must begin with a FROM instruction; only comments, parser directives, and ARG may come before it:
# Use an official base image
FROM node:20-alpine
# Use a specific version
FROM python:3.12.1-slim-bookworm
# Use scratch for minimal images
FROM scratch
# Multiple FROM for multi-stage builds
FROM node:20-alpine AS builder
FROM nginx:alpine AS production
WORKDIR - Working Directory
Set the working directory for subsequent instructions:
# Create and change to /app directory
WORKDIR /app
# Subsequent commands run from /app
# Copies the build context into /app
COPY . .
# Runs in /app
RUN npm install
Note
Always use WORKDIR instead of RUN cd /some/path. WORKDIR creates the
directory if it doesn't exist and persists across instructions.
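The reason a `RUN cd` does not carry over is that each shell-form RUN starts a new shell process. The following is only a host-side analogy, with separate `sh -c` invocations standing in for separate RUN instructions; it does not involve Docker at all.

```shell
# Analogy: each `sh -c` below stands in for one shell-form RUN instruction.
start_dir=$(pwd)

sh -c 'cd /tmp'              # like: RUN cd /tmp  (the change ends with this shell)
after_cd=$(sh -c 'pwd')      # like: RUN pwd      (a fresh shell, original directory)

# A directory change only helps within the same command:
same_shell=$(sh -c 'cd /tmp && pwd')

echo "separate shells: $after_cd"
echo "same shell:      $same_shell"
```

WORKDIR avoids the problem because the directory is recorded in the image configuration rather than in a short-lived shell.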
COPY - Copy Files
Copy files from build context to the image:
# Copy single file
COPY package.json ./
# Copy multiple files
COPY package.json package-lock.json ./
# Copy directory
COPY src/ ./src/
# Copy with glob pattern
COPY *.json ./
# Copy with different name
COPY config.prod.json ./config.json
# Copy with ownership
COPY --chown=node:node . .
ADD - Copy with Extras
ADD is like COPY but with additional features:
# Copy and extract tar archives
ADD app.tar.gz /app/
# Download from URL
ADD https://example.com/file.txt /app/
# Prefer COPY for simple file copying
# COPY leaves the archive as a single file
COPY app.tar.gz /app/
# ADD extracts a local archive automatically
ADD app.tar.gz /app/
Warning
Prefer COPY over ADD unless you specifically need auto-extraction or URL downloading. COPY is more transparent and predictable.
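The difference in behavior for a local archive can be illustrated on the host. This is an analogy under stated assumptions, not what Docker runs internally: `cp` stands in for COPY and `tar -x` for ADD, and all paths are temporary, hypothetical ones.

```shell
# Host-side analogy for COPY vs ADD with a local tar archive.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/copy_dest" "$work/add_dest"
echo "hello" > "$work/src/app.txt"
tar -czf "$work/app.tar.gz" -C "$work/src" app.txt

# COPY app.tar.gz /app/  -> the archive arrives as a single, unextracted file
cp "$work/app.tar.gz" "$work/copy_dest/"

# ADD app.tar.gz /app/   -> the archive contents are unpacked into the destination
tar -xzf "$work/app.tar.gz" -C "$work/add_dest"

ls "$work/copy_dest"   # app.tar.gz
ls "$work/add_dest"    # app.txt
```

Auto-extraction applies only to local archives; a file that ADD downloads from a URL is stored as-is.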
RUN - Execute Commands
Run commands during the build:
# Shell form (runs in shell)
RUN npm install
RUN apt-get update && apt-get install -y curl
# Exec form (no shell)
RUN ["npm", "install"]
# Multi-line for readability
RUN apt-get update && \
    apt-get install -y \
        curl \
        git \
        vim && \
    rm -rf /var/lib/apt/lists/*
CMD - Default Command
Specify the default command when the container starts:
# Exec form (preferred)
CMD ["npm", "start"]
CMD ["python", "app.py"]
CMD ["node", "server.js"]
# Shell form
CMD npm start
# Command with arguments
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]
Note
CMD can be overridden at runtime: docker run myapp python other-script.py
ENTRYPOINT - Container Executable
Define the container's main executable:
# The container behaves like this executable
ENTRYPOINT ["python"]
CMD ["app.py"]
# Running the container:
# docker run myapp → python app.py
# docker run myapp other.py → python other.py
ENTRYPOINT vs CMD
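How ENTRYPOINT, CMD, and run-time arguments combine can be summarized with a small model. The function below is a hypothetical illustration for the exec form only (`final_command` is not a Docker command): arguments given to `docker run` replace CMD, and whatever remains is appended to ENTRYPOINT.

```shell
# Hypothetical model of how the container's command line is assembled.
# Arguments: ENTRYPOINT, CMD, then any arguments passed to `docker run`.
final_command() {
  entrypoint=$1
  cmd=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    # Run-time arguments replace CMD entirely
    cmd="$*"
  fi
  # ENTRYPOINT (if set) is kept; CMD or the run-time arguments are appended
  echo "${entrypoint:+$entrypoint }$cmd"
}

final_command "python" "app.py"               # python app.py
final_command "python" "app.py" other.py      # python other.py
final_command ""       "python app.py" bash   # bash
```

The model leaves out `docker run --entrypoint`, which is the way to replace ENTRYPOINT itself at run time.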
# CMD only (easily overridden)
FROM python:3.12
CMD ["python", "app.py"]
# docker run myapp bash → runs bash, not python
# ENTRYPOINT + CMD (app arguments overridable)
FROM python:3.12
ENTRYPOINT ["python"]
CMD ["app.py"]
# docker run myapp bash → runs python bash (error)
# docker run myapp other.py → runs python other.py
# ENTRYPOINT with script (common pattern)
FROM node:20
COPY docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["npm", "start"]
ENV - Environment Variables
Set environment variables:
# Single variable
ENV NODE_ENV=production
# Multiple variables
ENV NODE_ENV=production \
PORT=3000 \
LOG_LEVEL=info
# Variables are available during the build
RUN echo $NODE_ENV
# They are also set when the container runs
ARG - Build Arguments
Define build-time variables:
# Define build argument
ARG NODE_VERSION=20
FROM node:${NODE_VERSION}-alpine
ARG APP_VERSION
ENV APP_VERSION=${APP_VERSION}
# Pass during build
# docker build --build-arg APP_VERSION=1.0.0 .
Warning
ARG values are not available in the running container unless you copy them to ENV. Don't use ARG for secrets: the values can be recovered from the image history (docker history).
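For values that must stay out of the image, BuildKit's secret mounts are one alternative to ARG. The sketch below writes an example Dockerfile to a temporary directory and prints the matching build command instead of running it. The file names, the secret id `api_token`, and the token path are all hypothetical, and the `wc -c` step is just a placeholder for a command that would actually use the secret.

```shell
# Sketch only: file names, the secret id, and the token path are hypothetical.
workdir=$(mktemp -d)

# A Dockerfile that reads a secret during one RUN step without storing it in a layer.
cat > "$workdir/Dockerfile.secret" <<'EOF'
# syntax=docker/dockerfile:1
FROM alpine:3.19
# The secret is mounted at /run/secrets/api_token for this RUN step only
RUN --mount=type=secret,id=api_token \
    wc -c /run/secrets/api_token
EOF

# The matching build command (requires BuildKit); printed rather than executed here.
build_cmd="docker build --secret id=api_token,src=./api_token.txt -f $workdir/Dockerfile.secret ."
echo "$build_cmd"
```

Because the secret is mounted rather than passed as an argument, it does not appear in the output of docker history.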
EXPOSE - Document Ports
Document which ports the container listens on:
# Single port
EXPOSE 3000
# Multiple ports
EXPOSE 80 443
# UDP port
EXPOSE 53/udp
# Note: EXPOSE doesn't publish ports
# You still need -p at runtime
Note
EXPOSE is documentation. It doesn't actually publish ports. Use docker run -p to publish ports at runtime.
VOLUME - Define Mount Points
Declare volumes for persistent data:
# Create a volume mount point
VOLUME /data
# Multiple volumes
VOLUME ["/data", "/logs"]
# The container expects data here
# Map at runtime: docker run -v mydata:/data myapp
USER - Run as Non-Root
Specify the user for RUN, CMD, and ENTRYPOINT:
# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Switch to non-root user
USER appuser
# Subsequent commands run as appuser
WORKDIR /home/appuser/app
COPY --chown=appuser:appgroup . .
CMD ["node", "server.js"]
LABEL - Metadata
Add metadata to images:
LABEL maintainer="dev@example.com"
LABEL version="1.0"
LABEL description="My application image"
# Multiple labels
LABEL org.opencontainers.image.title="My App" \
org.opencontainers.image.version="1.0.0" \
org.opencontainers.image.authors="team@example.com"
HEALTHCHECK - Container Health
Define how Docker checks if the container is healthy:
# Check every 30s, timeout after 10s
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
# Disable healthcheck from parent image
HEALTHCHECK NONE
Healthcheck Status
$ docker ps
CONTAINER ID IMAGE STATUS
abc123 myapp Up 2 min (healthy)
def456 myapp Up 5 min (unhealthy)
Complete Dockerfile Examples
Node.js Application
# Node.js Dockerfile
FROM node:20-alpine
# Set working directory
WORKDIR /app
# Copy package files first (better caching)
COPY package*.json ./
# Install production dependencies only
# (--omit=dev replaces the deprecated --only=production flag)
RUN npm ci --omit=dev
# Copy application code
COPY . .
# Create non-root user
RUN addgroup -S nodejs && adduser -S nodejs -G nodejs
USER nodejs
# Expose port
EXPOSE 3000
# Health check
HEALTHCHECK --interval=30s --timeout=3s \
CMD wget --quiet --tries=1 --spider http://localhost:3000/health || exit 1
# Start the application
CMD ["node", "server.js"]
Python Application
# Python Dockerfile
FROM python:3.12-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
# Set working directory
WORKDIR /app
# Install system dependencies
RUN apt-get update && \
apt-get install -y --no-install-recommends \
build-essential \
libpq-dev && \
rm -rf /var/lib/apt/lists/*
# Copy requirements first
COPY requirements.txt .
# Install Python dependencies
RUN pip install -r requirements.txt
# Copy application code
COPY . .
# Create non-root user
RUN useradd --create-home --shell /bin/bash appuser && \
chown -R appuser:appuser /app
USER appuser
# Expose port
EXPOSE 8000
# Start the application
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]
Go Application
# Go Dockerfile with multi-stage build
FROM golang:1.22-alpine AS builder
WORKDIR /app
# Copy go mod files
COPY go.mod go.sum ./
RUN go mod download
# Copy source code
COPY . .
# Build the binary
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server .
# Final stage - minimal image
FROM alpine:3.19
# Install ca-certificates for HTTPS
RUN apk --no-cache add ca-certificates
WORKDIR /app
# Copy binary from builder
COPY --from=builder /app/server .
# Create non-root user
RUN adduser -D -g '' appuser
USER appuser
EXPOSE 8080
CMD ["./server"]
Build Options
Common Build Flags
# Tag the image
docker build -t myapp:1.0 .
# Multiple tags
docker build -t myapp:1.0 -t myapp:latest .
# Specify Dockerfile
docker build -f Dockerfile.prod .
# Build arguments
docker build --build-arg VERSION=1.0.0 .
# No cache (rebuild everything)
docker build --no-cache .
# Pull fresh base images
docker build --pull .
# Show build output
docker build --progress=plain .
# Build for specific platform
docker build --platform linux/amd64 .
Build with BuildKit
BuildKit is Docker's modern build engine with enhanced features:
# Enable BuildKit (usually default now)
export DOCKER_BUILDKIT=1
# Build with BuildKit
docker build .
# BuildKit features:
# - Parallel stage execution
# - Better caching
# - Secret mounting
# - SSH forwarding
Caching and Layer Optimization
Understanding the Build Cache
Docker caches each layer. If an instruction and its inputs (such as the files a COPY reads) haven't changed, and every earlier step was also a cache hit, Docker reuses the cached layer:
Step 1/5 : FROM node:20-alpine
---> Using cache
Step 2/5 : WORKDIR /app
---> Using cache
Step 3/5 : COPY package.json ./
---> Using cache
Step 4/5 : RUN npm install
---> Running in abc123... # Cache miss here
Order Instructions for Caching
Place frequently changing instructions last:
# GOOD: Dependencies cached if package.json unchanged
COPY package.json ./
RUN npm install
# Source changes don't invalidate the npm install layer
COPY . .
# BAD: Any source change invalidates npm install cache
COPY . .
RUN npm install
Cache Optimization
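Why instruction order matters can be modeled with a checksum comparison. This is a simplified, hypothetical model rather than Docker's actual cache implementation: `install_if_changed` treats the checksum of a manifest file as the cache key and skips the expensive step when the key is unchanged.

```shell
# Hypothetical model of the cache check for `COPY package.json ./` followed by
# `RUN npm install`: the slow step is skipped when the file checksum matches.
cache_dir=$(mktemp -d)

install_if_changed() {
  manifest=$1
  new_sum=$(cksum < "$manifest")
  old_sum=$(cat "$cache_dir/manifest.sum" 2>/dev/null || true)
  if [ "$new_sum" = "$old_sum" ]; then
    echo "CACHED"
  else
    echo "$new_sum" > "$cache_dir/manifest.sum"
    echo "RUN npm install"   # stand-in for the real, slow step
  fi
}
```

Calling the function twice on an unchanged package.json reports a rebuild the first time and CACHED the second. If the whole source tree were the input instead, as with an early `COPY . .`, any edited file would change the key and force the install to run again.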
# Optimized Dockerfile
FROM node:20-alpine
WORKDIR /app
# Layer 1: Rarely changes
COPY package.json package-lock.json ./
# Layer 2: Only changes when dependencies change
RUN npm ci
# Layer 3: Changes frequently
COPY . .
# Layer 4: Rarely changes
CMD ["npm", "start"]
Debugging Builds
Build Progress Output
# Show detailed output
docker build --progress=plain .
# Force every step to re-run so no output is hidden behind cache hits
docker build --progress=plain --no-cache .
Inspect Intermediate Layers
# View image history
docker history myapp:1.0
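# Run an intermediate layer for debugging
# (legacy builder only; BuildKit does not keep intermediate images,
# so build up to a named stage with --target instead)
docker run -it <intermediate-layer-id> /bin/sh
Build Targets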
# Dockerfile with stages
FROM node:20-alpine AS base
WORKDIR /app
COPY package*.json ./
FROM base AS development
RUN npm install
COPY . .
CMD ["npm", "run", "dev"]
FROM base AS production
RUN npm ci --omit=dev
COPY . .
CMD ["npm", "start"]
# Build specific target
docker build --target development -t myapp:dev .
docker build --target production -t myapp:prod .
Quick Reference
| Instruction | Purpose |
|---|---|
| FROM | Set base image |
| WORKDIR | Set working directory |
| COPY | Copy files from context |
| ADD | Copy files with extras (tar, URLs) |
| RUN | Execute build commands |
| CMD | Default container command |
| ENTRYPOINT | Container executable |
| ENV | Set environment variables |
| ARG | Build-time arguments |
| EXPOSE | Document ports |
| VOLUME | Declare mount points |
| USER | Set user for commands |
| LABEL | Add metadata |
| HEALTHCHECK | Define health check |
In the next chapter, we'll explore advanced Dockerfile best practices including multi-stage builds and security considerations.