Writing Dockerfiles
A Dockerfile is a plain-text script of instructions that Docker executes top-to-bottom to build an image. Each instruction creates a new layer.
Dockerfile Instructions Referenceβ
FROM β Base Imageβ
FROM eclipse-temurin:21-jre-alpine
# Multi-stage: name each stage
FROM maven:3.9-eclipse-temurin-21 AS builder
FROM eclipse-temurin:21-jre-alpine AS runtime
Always pin to a specific version tag. Never use
FROM ubuntu(resolves tolatest).
WORKDIR β Set Working Directoryβ
WORKDIR /app
# Creates the directory if it doesn't exist.
# All subsequent instructions run relative to this path.
# Preferred over RUN cd /app
COPY vs ADDβ
# COPY β preferred for most cases
COPY src/main/resources/application.yml /app/config/
COPY target/myapp.jar /app/myapp.jar
COPY . . # Copy everything from build context
# ADD β avoid unless you specifically need its extras:
# - Auto-extract tar archives (ADD app.tar.gz /app)
# - Fetch remote URLs (avoid β non-deterministic, use curl instead)
ADD https://example.com/file.tar.gz /tmp/ # β Non-reproducible
RUN β Execute Commandsβ
# Single command
RUN apt-get update
# Chain commands with && to keep in ONE layer
# β
Efficient β one layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl wget && \
rm -rf /var/lib/apt/lists/* # Clean up in same layer!
# β Inefficient β three layers
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
Clean up package manager caches in the same RUN instruction β once the layer is committed, you can't remove files from it in a later layer.
CMD vs ENTRYPOINTβ
# CMD β default command, easily overridden at runtime
CMD ["java", "-jar", "/app/myapp.jar"]
docker run myapp # runs: java -jar /app/myapp.jar
docker run myapp java -jar other.jar # overrides CMD entirely
# ENTRYPOINT β fixed command, arguments are appended
ENTRYPOINT ["java"]
CMD ["-jar", "/app/myapp.jar"]
docker run myapp # runs: java -jar /app/myapp.jar
docker run myapp -jar other.jar # runs: java -jar other.jar (CMD overridden)
# Best pattern for applications β combine both:
ENTRYPOINT ["java"]
CMD ["-jar", "/app/myapp.jar"]
# Exec form (preferred) vs Shell form:
CMD ["java", "-jar", "app.jar"] # β
Exec form β signals go directly to process
CMD java -jar app.jar # β Shell form β runs via /bin/sh -c, PID 1 is shell
ENV β Environment Variablesβ
ENV APP_HOME=/app
ENV JAVA_OPTS="-Xms256m -Xmx512m"
ENV SPRING_PROFILES_ACTIVE=production
# Multi-line (Docker 1.9+)
ENV APP_HOME=/app \
APP_PORT=8080
# Access in RUN
RUN echo $APP_HOME
Prefer passing secrets at runtime (
docker run -e SECRET=...) β ENV values baked into image are visible in image history.
ARG β Build-Time Variablesβ
ARG JAR_FILE=target/myapp.jar
ARG BUILD_DATE
ARG GIT_COMMIT
COPY ${JAR_FILE} /app/app.jar
# Pass at build time
docker build --build-arg JAR_FILE=target/myapp-1.0.jar \
--build-arg GIT_COMMIT=$(git rev-parse HEAD) .
# ARG values not present in final image (unlike ENV)
# β Exception: ARG before FROM applies to FROM only
ARG BASE_IMAGE=eclipse-temurin:21-jre-alpine
FROM ${BASE_IMAGE}
EXPOSE β Document Portβ
EXPOSE 8080
EXPOSE 8080/tcp
EXPOSE 9090/udp
# EXPOSE is documentation only β it does NOT publish the port.
# Use -p flag at runtime to actually publish:
docker run -p 8080:8080 myapp
VOLUME β Declare Mount Pointβ
VOLUME ["/app/data"]
VOLUME /app/logs /app/uploads
# Declares that these paths should be persisted outside the container.
# Docker creates an anonymous volume automatically at runtime.
# Better: explicitly define volumes in docker run / docker-compose.
USER β Non-Root Userβ
# Create group and user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Switch to non-root
USER appuser
# Or by UID
USER 1001
HEALTHCHECK β Container Healthβ
HEALTHCHECK --interval=30s --timeout=5s --start-period=30s --retries=3 \
CMD curl -f http://localhost:8080/actuator/health || exit 1
# Status: healthy / unhealthy / starting
LABEL β Metadataβ
LABEL maintainer="[email protected]" \
version="1.0.0" \
description="My Spring Boot API" \
org.opencontainers.image.source="https://github.com/org/repo" \
org.opencontainers.image.revision="${GIT_COMMIT}"
Multi-Stage Buildsβ
Build in one stage, copy only the result to the final image. Drastically reduces image size.
Spring Boot β Multi-Stage Buildβ
# βββ Stage 1: Build βββββββββββββββββββββββββββββββββββββββββββββ
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /build
# Copy POM first β layer cache: only re-download deps when pom.xml changes
COPY pom.xml .
RUN mvn dependency:go-offline -q
# Now copy source and build
COPY src ./src
RUN mvn package -DskipTests -q
# βββ Stage 2: Extract layers (Spring Boot layered JARs) ββββββββββ
FROM builder AS extractor
WORKDIR /build
RUN java -Djarmode=layertools -jar target/*.jar extract --destination extracted
# βββ Stage 3: Runtime ββββββββββββββββββββββββββββββββββββββββββββ
FROM eclipse-temurin:21-jre-alpine AS runtime
WORKDIR /app
# Security: non-root user
RUN addgroup -S spring && adduser -S spring -G spring
# Copy Spring Boot layers (ordered by change frequency β slowest changing first)
COPY --from=extractor --chown=spring:spring /build/extracted/dependencies/ ./
COPY --from=extractor --chown=spring:spring /build/extracted/spring-boot-loader/ ./
COPY --from=extractor --chown=spring:spring /build/extracted/snapshot-dependencies/ ./
COPY --from=extractor --chown=spring:spring /build/extracted/application/ ./
USER spring
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
CMD wget -qO- http://localhost:8080/actuator/health || exit 1
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
Without multi-stage: ~650 MB (JDK + Maven + source + test deps + JAR)
With multi-stage: ~180 MB (JRE only + app layers)
Why Spring Boot Layers?β
Spring Boot layered JAR extracts into:
dependencies/ β third-party JARs β rarely change
spring-boot-loader/ β boot loader β rarely changes
snapshot-dependencies/ β your SNAPSHOT deps β change occasionally
application/ β your code β changes often
β Only the "application" layer is rebuilt on code change
β CI pipeline rebuilds only the changed layer (seconds, not minutes)
Layer Caching Optimizationβ
The #1 performance tip: put things that change rarely at the top, things that change often at the bottom.
# β BAD ORDER β source code change invalidates ALL layers below
FROM eclipse-temurin:21-jre-alpine
COPY . . # β Copies everything including source
RUN mvn package # β Must re-run every time
COPY target/app.jar /app/app.jar
# β
GOOD ORDER β cache pom.xml deps independently of source
FROM maven:3.9-eclipse-temurin-21 AS builder
COPY pom.xml . # β Only changes when deps change
RUN mvn dependency:go-offline # β Cached until pom.xml changes
COPY src ./src # β Changes every commit
RUN mvn package -DskipTests # β Only re-runs when src changes
BuildKit Cache Mounts (Advanced)β
Modern Docker (using BuildKit) supports cache mounts, allowing package managers to keep their cache between builds without bloating the image layer.
# Syntax directive specifies BuildKit versions
# syntax=docker/dockerfile:1.4
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
# The --mount=type=cache persists the ~/.m2 cache directory across multiple `docker build` invocations
COPY pom.xml .
COPY src ./src
RUN --mount=type=cache,target=/root/.m2 \
mvn clean package -DskipTests
.dockerignoreβ
Exclude files from the build context sent to the Docker daemon. Speeds up builds and prevents secrets leaking into images.
# .dockerignore
.git
.gitignore
.idea
*.iml
target/ # Maven output β we build inside Docker
!target/myapp.jar # But include final JAR if pre-built
*.log
.env # β NEVER send .env to Docker daemon
node_modules/
README.md
Dockerfile*
docker-compose*.yml
Security Hardening Checklistβ
# β
1. Use Distroless or Alpine base images
# Distroless images contain ONLY your application and its runtime dependencies.
# They do NOT contain package managers, shells, or any other programs you would expect to find in a standard Linux distribution.
FROM gcr.io/distroless/java21-debian12
# No shell (/bin/sh) = massive attack surface reduction
# β
2. Non-root user
RUN addgroup -S app && adduser -S app -G app
USER app
# β
3. No secrets in image
# Never: ENV DB_PASSWORD=secret123
# Do: docker run -e DB_PASSWORD=$SECRET at runtime
# β
4. Read-only filesystem (set at runtime or in K8s)
# docker run --read-only --tmpfs /tmp myapp
# β
5. Minimal layers β no build tools in final image (multi-stage)
# β
6. Pin exact versions
FROM eclipse-temurin:21.0.5_11-jre-alpine # pinned patch version
# NOT: FROM eclipse-temurin:21
# β
7. Scan image for vulnerabilities
# trivy image myapp:1.0.0
Common Dockerfile Patternsβ
Pattern: Overridable JVM optionsβ
ENV JAVA_OPTS=""
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app/app.jar"]
# Override at runtime:
docker run -e JAVA_OPTS="-Xmx2g -XX:+UseG1GC" myapp
Pattern: Wait for dependenciesβ
# Use wait-for-it.sh or dockerize to wait for DB before starting app
COPY wait-for-it.sh /wait-for-it.sh
RUN chmod +x /wait-for-it.sh
CMD ["/wait-for-it.sh", "db:5432", "--", "java", "-jar", "/app/app.jar"]
Pattern: Configuration via environmentβ
# Spring Boot reads SPRING_* env vars automatically
docker run \
-e SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/mydb \
-e SPRING_DATASOURCE_USERNAME=user \
-e SPRING_DATASOURCE_PASSWORD=$DB_PASS \
-e SPRING_PROFILES_ACTIVE=production \
myapp:1.0.0
Build Commandsβ
# Build an image
docker build -t myapp:1.0.0 .
docker build -t myapp:1.0.0 -f Dockerfile.prod . # Custom Dockerfile name
docker build -t myapp:1.0.0 --no-cache . # Force full rebuild
docker build -t myapp:1.0.0 --build-arg PROFILE=prod .
# Build and tag multiple
docker build -t myapp:1.0.0 -t myapp:latest .
# Multi-platform build (build for linux/amd64 AND linux/arm64)
docker buildx build --platform linux/amd64,linux/arm64 -t myapp:1.0.0 --push .
# Check image layers and size
docker history myapp:1.0.0
docker image inspect myapp:1.0.0
Interview Questionsβ
- What is the difference between
CMDandENTRYPOINT? - What is a multi-stage build and why is it used?
- Why does instruction order matter in a Dockerfile?
- What is the
.dockerignorefile and why is it important? - What is the difference between
COPYandADD? - Why should you use
RUN apt-get update && apt-get installin oneRUNinstruction? - How do Spring Boot layered JARs improve Docker build performance?
- Why should containers run as a non-root user?
- What is the difference between
ARGandENV? - What does
EXPOSEactually do?