Writing Dockerfiles
A Dockerfile is a plain-text script of instructions that Docker executes top-to-bottom to build an image. Each instruction creates a new layer.
Dockerfile Instructions Reference
FROM — Base Image
FROM eclipse-temurin:21-jre-alpine
# Multi-stage: name each stage
FROM maven:3.9-eclipse-temurin-21 AS builder
FROM eclipse-temurin:21-jre-alpine AS runtime
Always pin to a specific version tag. Never use
FROM ubuntu(resolves tolatest).
WORKDIR — Set Working Directory
WORKDIR /app
# Creates the directory if it doesn't exist.
# All subsequent instructions run relative to this path.
# Preferred over RUN cd /app
COPY vs ADD
# COPY — preferred for most cases
COPY src/main/resources/application.yml /app/config/
COPY target/myapp.jar /app/myapp.jar
COPY . . # Copy everything from build context
# ADD — avoid unless you specifically need its extras:
# - Auto-extract tar archives (ADD app.tar.gz /app)
# - Fetch remote URLs (avoid — non-deterministic, use curl instead)
ADD https://example.com/file.tar.gz /tmp/ # ❌ Non-reproducible
RUN — Execute Commands
# Single command
RUN apt-get update
# Chain commands with && to keep in ONE layer
# ✅ Efficient — one layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl wget && \
rm -rf /var/lib/apt/lists/* # Clean up in same layer!
# ❌ Inefficient — three layers
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
Clean up package manager caches in the same RUN instruction — once the layer is committed, you can't remove files from it in a later layer.
CMD vs ENTRYPOINT
# CMD — default command, easily overridden at runtime
CMD ["java", "-jar", "/app/myapp.jar"]
docker run myapp # runs: java -jar /app/myapp.jar
docker run myapp java -jar other.jar # overrides CMD entirely
# ENTRYPOINT — fixed command, arguments are appended
ENTRYPOINT ["java"]
CMD ["-jar", "/app/myapp.jar"]
docker run myapp # runs: java -jar /app/myapp.jar
docker run myapp -jar other.jar # runs: java -jar other.jar (CMD overridden)
# Best pattern for applications — combine both:
ENTRYPOINT ["java"]
CMD ["-jar", "/app/myapp.jar"]
# Exec form (preferred) vs Shell form:
CMD ["java", "-jar", "app.jar"] # ✅ Exec form — signals go directly to process
CMD java -jar app.jar # ❌ Shell form — runs via /bin/sh -c, PID 1 is shell
ENV — Environment Variables
ENV APP_HOME=/app
ENV JAVA_OPTS="-Xms256m -Xmx512m"
ENV SPRING_PROFILES_ACTIVE=production
# Multi-line (Docker 1.9+)
ENV APP_HOME=/app \
APP_PORT=8080
# Access in RUN
RUN echo $APP_HOME
Prefer passing secrets at runtime (
docker run -e SECRET=...) — ENV values baked into image are visible in image history.
ARG — Build-Time Variables
ARG JAR_FILE=target/myapp.jar
ARG BUILD_DATE
ARG GIT_COMMIT
COPY ${JAR_FILE} /app/app.jar
# Pass at build time
docker build --build-arg JAR_FILE=target/myapp-1.0.jar \
--build-arg GIT_COMMIT=$(git rev-parse HEAD) .
# ARG values not present in final image (unlike ENV)
# ⚠ Exception: ARG before FROM applies to FROM only
ARG BASE_IMAGE=eclipse-temurin:21-jre-alpine
FROM ${BASE_IMAGE}
EXPOSE — Document Port
EXPOSE 8080
EXPOSE 8080/tcp
EXPOSE 9090/udp
# EXPOSE is documentation only — it does NOT publish the port.
# Use -p flag at runtime to actually publish:
docker run -p 8080:8080 myapp
VOLUME — Declare Mount Point
VOLUME ["/app/data"]
VOLUME /app/logs /app/uploads
# Declares that these paths should be persisted outside the container.
# Docker creates an anonymous volume automatically at runtime.
# Better: explicitly define volumes in docker run / docker-compose.
USER — Non-Root User
# Create group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Switch to non-root
USER appuser
# Or by UID
USER 1001
HEALTHCHECK — Container Health
HEALTHCHECK \
CMD curl -f http://localhost:8080/actuator/health || exit 1
# Status: healthy / unhealthy / starting
LABEL — Metadata
LABEL maintainer="team@example.com" \
version="1.0.0" \
description="My Spring Boot API" \
org.opencontainers.image.source="https://github.com/org/repo" \
org.opencontainers.image.revision="${GIT_COMMIT}"
Multi-Stage Builds
Build in one stage, copy only the result to the final image. Drastically reduces image size.
Spring Boot — Multi-Stage Build
# ─── Stage 1: Build ─────────────────────────────────────────────
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /build
# Copy POM first — layer cache: only re-download deps when pom.xml changes
COPY pom.xml .
RUN mvn dependency:go-offline -q
# Now copy source and build
COPY src ./src
RUN mvn package -DskipTests -q
# ─── Stage 2: Extract layers (Spring Boot layered JARs) ──────────
FROM builder AS extractor
WORKDIR /build
RUN java -Djarmode=layertools -jar target/*.jar extract --destination extracted
# ─── Stage 3: Runtime ────────────────────────────────────────────
FROM eclipse-temurin:21-jre-alpine AS runtime
WORKDIR /app
# Security: non-root user
RUN addgroup -S spring && adduser -S spring -G spring
# Copy Spring Boot layers (ordered by change frequency — slowest changing first)
COPY /build/extracted/dependencies/ ./
COPY /build/extracted/spring-boot-loader/ ./
COPY /build/extracted/snapshot-dependencies/ ./
COPY /build/extracted/application/ ./
USER spring
EXPOSE 8080
HEALTHCHECK \
CMD wget -qO- http://localhost:8080/actuator/health || exit 1
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
Without multi-stage: ~650 MB (JDK + Maven + source + test deps + JAR)
With multi-stage: ~180 MB (JRE only + app layers)
Why Spring Boot Layers?
Spring Boot layered JAR extracts into:
dependencies/ ← third-party JARs — rarely change
spring-boot-loader/ ← boot loader — rarely changes
snapshot-dependencies/ ← your SNAPSHOT deps — change occasionally
application/ ← your code — changes often
→ Only the "application" layer is rebuilt on code change
→ CI pipeline rebuilds only the changed layer (seconds, not minutes)
Layer Caching Optimization
The #1 performance tip: put things that change rarely at the top, things that change often at the bottom.
# ❌ BAD ORDER — source code change invalidates ALL layers below
FROM eclipse-temurin:21-jre-alpine
COPY . . # ← Copies everything including source
RUN mvn package # ← Must re-run every time
COPY target/app.jar /app/app.jar
# ✅ GOOD ORDER — cache pom.xml deps independently of source
FROM maven:3.9-eclipse-temurin-21 AS builder
COPY pom.xml . # ← Only changes when deps change
RUN mvn dependency:go-offline # ← Cached until pom.xml changes
COPY src ./src # ← Changes every commit
RUN mvn package -DskipTests # ← Only re-runs when src changes
.dockerignore
Exclude files from the build context sent to the Docker daemon. Speeds up builds and prevents secrets leaking into images.
# .dockerignore
.git
.gitignore
.idea
*.iml
target/ # Maven output — we build inside Docker
!target/myapp.jar # But include final JAR if pre-built
*.log
.env # ← NEVER send .env to Docker daemon
node_modules/
README.md
Dockerfile*
docker-compose*.yml
Security Hardening Checklist
# ✅ 1. Use minimal base image
FROM eclipse-temurin:21-jre-alpine # NOT full JDK, NOT Debian
# ✅ 2. Non-root user
RUN addgroup -S app && adduser -S app -G app
USER app
# ✅ 3. No secrets in image
# Never: ENV DB_PASSWORD=secret123
# Do: docker run -e DB_PASSWORD=$SECRET at runtime
# ✅ 4. Read-only filesystem (set at runtime or in K8s)
# docker run --read-only --tmpfs /tmp myapp
# ✅ 5. Minimal layers — no build tools in final image (multi-stage)
# ✅ 6. Pin exact versions
FROM eclipse-temurin:21.0.5_11-jre-alpine # pinned patch version
# NOT: FROM eclipse-temurin:21
# ✅ 7. Scan image for vulnerabilities
# trivy image myapp:1.0.0
Common Dockerfile Patterns
Pattern: Overridable JVM options
ENV JAVA_OPTS=""
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app/app.jar"]
# Override at runtime:
docker run -e JAVA_OPTS="-Xmx2g -XX:+UseG1GC" myapp
Pattern: Wait for dependencies
# Use wait-for-it.sh or dockerize to wait for DB before starting app
COPY wait-for-it.sh /wait-for-it.sh
RUN chmod +x /wait-for-it.sh
CMD ["/wait-for-it.sh", "db:5432", "--", "java", "-jar", "/app/app.jar"]
Pattern: Configuration via environment
# Spring Boot reads SPRING_* env vars automatically
docker run \
-e SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/mydb \
-e SPRING_DATASOURCE_USERNAME=user \
-e SPRING_DATASOURCE_PASSWORD=$DB_PASS \
-e SPRING_PROFILES_ACTIVE=production \
myapp:1.0.0
Build Commands
# Build an image
docker build -t myapp:1.0.0 .
docker build -t myapp:1.0.0 -f Dockerfile.prod . # Custom Dockerfile name
docker build -t myapp:1.0.0 --no-cache . # Force full rebuild
docker build -t myapp:1.0.0 --build-arg PROFILE=prod .
# Build and tag multiple
docker build -t myapp:1.0.0 -t myapp:latest .
# Multi-platform build (build for linux/amd64 AND linux/arm64)
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:1.0.0 --push .
# Check image layers and size
docker history myapp:1.0.0
docker image inspect myapp:1.0.0
Interview Questions
- What is the difference between
CMDandENTRYPOINT? - What is a multi-stage build and why is it used?
- Why does instruction order matter in a Dockerfile?
- What is the
.dockerignorefile and why is it important? - What is the difference between
COPYandADD? - Why should you use
RUN apt-get update && apt-get installin oneRUNinstruction? - How do Spring Boot layered JARs improve Docker build performance?
- Why should containers run as a non-root user?
- What is the difference between
ARGandENV? - What does
EXPOSEactually do?