Configuration & Resource Management

Namespaces (Environment Isolation)

Use namespaces to separate environments, teams, or applications.

# Create namespaces
kubectl create namespace production
kubectl create namespace staging
kubectl create namespace monitoring

# Common convention
production    → prod workloads
staging       → pre-prod testing
development   → dev workloads
monitoring    → Prometheus, Grafana
kube-system   → K8s system components (don't touch)

# Reference namespace in resource
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  namespace: production     # ← specify namespace

# Work in a namespace
kubectl get pods -n production
kubectl get all -n production   # Common workload types only; excludes ConfigMaps, Secrets, Ingresses

# Set default namespace for your session
kubectl config set-context --current --namespace=production

Labels and Selectors

Labels are key-value pairs used to organise and select objects.

metadata:
  labels:
    app: my-api             # App name
    version: "1.2.0"        # Version
    environment: production # Environment
    team: backend           # Owning team
    tier: api               # Tier (api, worker, db)

# Filter by label
kubectl get pods -l app=my-api
kubectl get pods -l app=my-api,environment=production
kubectl get pods -l 'version in (1.0.0, 1.1.0)'
kubectl get pods -l 'environment notin (dev,staging)'

# Set label on running resource
kubectl label pod my-pod app=my-api
kubectl label pod my-pod release=canary --overwrite

# Remove label
kubectl label pod my-pod release-
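
The selector expressions above can be read as a list of requirements that must all hold. The following is a minimal sketch of that matching logic, not kubectl's actual implementation; the Pod names and the `matches` and `select` helpers are hypothetical and exist only for illustration.

```python
# Simplified model of label selector matching (illustrative, not the real implementation).

def matches(labels, requirements):
    """Return True if the label dict satisfies every (key, operator, values) requirement."""
    for key, operator, values in requirements:
        value = labels.get(key)
        if operator == "=" and value != values[0]:
            return False
        if operator == "in" and value not in values:
            return False
        # "notin" also matches objects that do not carry the key at all.
        if operator == "notin" and value in values:
            return False
    return True


# Hypothetical Pods and their labels.
PODS = {
    "api-1": {"app": "my-api", "environment": "production", "version": "1.0.0"},
    "api-2": {"app": "my-api", "environment": "staging", "version": "1.1.0"},
    "worker-1": {"app": "worker", "environment": "production", "version": "1.2.0"},
}


def select(requirements):
    """Return the sorted names of Pods whose labels satisfy all requirements."""
    return sorted(name for name, labels in PODS.items() if matches(labels, requirements))


# -l app=my-api,environment=production
print(select([("app", "=", ["my-api"]), ("environment", "=", ["production"])]))  # ['api-1']
# -l 'version in (1.0.0, 1.1.0)'
print(select([("version", "in", ["1.0.0", "1.1.0"])]))  # ['api-1', 'api-2']
# -l 'environment notin (dev,staging)'
print(select([("environment", "notin", ["dev", "staging"])]))  # ['api-1', 'worker-1']
```

Comma-separated requirements are ANDed together, which is why the first query returns only the Pod that carries both labels.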

Annotations

Annotations hold non-identifying metadata for tools, documentation, and automation. Unlike labels, they cannot be used in selectors.

metadata:
  annotations:
    description: "Spring Boot REST API"
    git-commit: "abc123def456"
    deployment-time: "2024-01-15T10:00:00Z"
    kubectl.kubernetes.io/last-applied-configuration: ...
    # ingress-nginx rate limit (requests per second per client IP)
    nginx.ingress.kubernetes.io/limit-rps: "100"
    # cert-manager config
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    # Prometheus scrape config
    prometheus.io/scrape: "true"
    prometheus.io/path: "/actuator/prometheus"
    prometheus.io/port: "9090"

ResourceQuota

Limit total resource consumption in a namespace.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    # Compute
    requests.cpu: "10"            # Total CPU requests in namespace
    requests.memory: 20Gi        # Total memory requests
    limits.cpu: "20"
    limits.memory: 40Gi

    # Object counts
    pods: "100"
    services: "20"
    secrets: "50"
    configmaps: "50"
    persistentvolumeclaims: "20"
    services.loadbalancers: "5"
    services.nodeports: "0"       # Disallow NodePort services

kubectl get resourcequota -n production
kubectl describe resourcequota production-quota -n production
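
To make the quota arithmetic concrete, here is a rough sketch of the check applied to `requests.cpu` when a new Pod is created: current usage plus the new request must not exceed the hard limit. The helper names and the usage figure are assumptions for illustration; the real admission logic handles many more resource types and quantity formats.

```python
# Simplified model of a ResourceQuota check for CPU requests (illustrative only).

def cpu_millicores(quantity):
    """Convert a CPU quantity such as "250m" or "10" to integer millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def within_quota(hard, used, requested):
    """Return True if used + requested stays within the hard limit."""
    return cpu_millicores(used) + cpu_millicores(requested) <= cpu_millicores(hard)


HARD_REQUESTS_CPU = "10"   # requests.cpu from the quota above
USED_REQUESTS_CPU = "9500m"  # hypothetical current usage in the namespace

print(within_quota(HARD_REQUESTS_CPU, USED_REQUESTS_CPU, "250m"))  # True
print(within_quota(HARD_REQUESTS_CPU, USED_REQUESTS_CPU, "750m"))  # False
```

Once a quota tracks a compute resource, Pods that omit that request or limit are rejected unless a LimitRange supplies a default.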

LimitRange

Set default requests/limits and enforce min/max per object (individual containers, or PersistentVolumeClaims for storage).

apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: production
spec:
  limits:
    - type: Container
      # Default values applied to containers without explicit settings
      default:
        cpu: "500m"
        memory: "512Mi"
      defaultRequest:
        cpu: "100m"
        memory: "128Mi"
      # Enforced bounds
      max:
        cpu: "4"
        memory: "4Gi"
      min:
        cpu: "50m"
        memory: "64Mi"

    - type: PersistentVolumeClaim
      max:
        storage: 100Gi
      min:
        storage: 1Gi
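
The defaulting and bounds checking for the `Container` entry can be sketched as follows. This is a simplified model under stated assumptions: quantities are plain integers (millicores and MiB), the `apply_limit_range` helper is hypothetical, and only the request-versus-min and limit-versus-max checks are modelled.

```python
# Simplified model of LimitRange defaulting for one container (illustrative only).
# CPU values are millicores, memory values are MiB.

LIMIT_RANGE = {
    "default": {"cpu": 500, "memory": 512},         # default limits
    "defaultRequest": {"cpu": 100, "memory": 128},  # default requests
    "max": {"cpu": 4000, "memory": 4096},
    "min": {"cpu": 50, "memory": 64},
}


def apply_limit_range(container):
    """Fill in missing requests/limits, then enforce the min/max bounds."""
    requests = dict(container.get("requests", {}))
    limits = dict(container.get("limits", {}))
    for resource in ("cpu", "memory"):
        # An explicit limit without a request makes the request equal to the limit.
        if resource in limits and resource not in requests:
            requests[resource] = limits[resource]
        requests.setdefault(resource, LIMIT_RANGE["defaultRequest"][resource])
        limits.setdefault(resource, LIMIT_RANGE["default"][resource])
        if requests[resource] < LIMIT_RANGE["min"][resource]:
            raise ValueError(f"{resource} request is below the minimum")
        if limits[resource] > LIMIT_RANGE["max"][resource]:
            raise ValueError(f"{resource} limit is above the maximum")
    return {"requests": requests, "limits": limits}


# A container with no resources section receives all four defaults.
print(apply_limit_range({}))
# A container that sets only a memory limit gets a matching memory request.
print(apply_limit_range({"limits": {"memory": 1024}}))
```

A container asking for more than the maximum, for example a CPU limit of 8000 millicores, raises an error in this model, which mirrors the Pod being rejected at admission.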

Taints and Tolerations

Taints on a Node repel Pods unless the Pod has a matching toleration.

# Add taint to a node
kubectl taint nodes node1 key=value:NoSchedule
kubectl taint nodes node1 gpu=true:NoSchedule
kubectl taint nodes node1 env=production:NoExecute   # Evict existing Pods too

# Remove taint
kubectl taint nodes node1 gpu=true:NoSchedule-

# Pod toleration — allows it to run on tainted node
spec:
  tolerations:
    - key: "gpu"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"

    # Tolerate every NoSchedule taint regardless of key (e.g., DaemonSet on control plane).
    # Omit `effect` as well to tolerate all taints, including NoExecute.
    - operator: "Exists"
      effect: "NoSchedule"

Taint Effects

Effect	Meaning
`NoSchedule`	Don't schedule new Pods without toleration
`PreferNoSchedule`	Prefer not to schedule (soft)
`NoExecute`	Don't schedule AND evict existing Pods without toleration

Use case: Dedicated nodes for GPU workloads, production vs dev isolation, control plane protection.
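
How a toleration is matched against a taint can be sketched in a few lines. This is a simplified model of the matching rules, not scheduler code; the `tolerates` and `schedulable` helpers are hypothetical names, and `PreferNoSchedule` is treated as never blocking because it is only a soft preference.

```python
# Simplified model of taint/toleration matching (illustrative only).

def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty effect or key on the toleration acts as a wildcard.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def schedulable(pod_tolerations, node_taints):
    """A Pod fits if every NoSchedule/NoExecute taint is matched by some toleration."""
    return all(
        any(tolerates(toleration, taint) for toleration in pod_tolerations)
        for taint in node_taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
    )


gpu_taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
prod_taint = {"key": "env", "value": "production", "effect": "NoExecute"}

gpu_toleration = {"key": "gpu", "operator": "Equal", "value": "true", "effect": "NoSchedule"}
any_noschedule = {"operator": "Exists", "effect": "NoSchedule"}

print(schedulable([gpu_toleration], [gpu_taint]))   # True
print(schedulable([], [gpu_taint]))                 # False
print(schedulable([any_noschedule], [prod_taint]))  # False (effect does not match)
```

The last line shows that a toleration restricted to `NoSchedule` does not cover a `NoExecute` taint.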

Advanced Scheduling (Affinity & Spread Constraints)

Node Affinity

spec:
  affinity:
    nodeAffinity:
      # HARD requirement — Pod won't schedule without this
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/arch
                operator: In
                values: ["amd64"]
              - key: node.kubernetes.io/instance-type
                operator: In
                values: ["m5.xlarge", "m5.2xlarge"]

      # SOFT preference — schedules anyway if no match
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]

Pod Topology Spread Constraints

Topology spread constraints improve availability by preventing replicas from piling up on a single node or in a single availability zone.

spec:
  topologySpreadConstraints:
    - maxSkew: 1                           # Max difference in pod count between zones
      topologyKey: topology.kubernetes.io/zone
      whenUnsatisfiable: DoNotSchedule     # Strict: don't schedule if it violates skew
      labelSelector:
        matchLabels:
          app: my-api
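
The effect of `maxSkew` is easiest to see with numbers. The sketch below assumes the usual definition of skew (matching Pods in a zone after placement, minus the lowest count across zones) and uses hypothetical zone counts; `allowed_zones` is an illustrative helper, not a Kubernetes API.

```python
# Simplified model of the maxSkew check with whenUnsatisfiable: DoNotSchedule.

def allowed_zones(pods_per_zone, max_skew=1):
    """Return the zones where one more matching Pod keeps the skew within max_skew."""
    lowest = min(pods_per_zone.values())
    return sorted(
        zone
        for zone, count in pods_per_zone.items()
        if (count + 1) - lowest <= max_skew
    )


# Hypothetical distribution of app=my-api Pods across three zones.
current = {"us-east-1a": 2, "us-east-1b": 1, "us-east-1c": 1}

print(allowed_zones(current))  # ['us-east-1b', 'us-east-1c']
```

Placing the next Pod in `us-east-1a` would give a skew of 3 - 1 = 2, which exceeds `maxSkew: 1`, so only the other two zones remain eligible.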

Pod Disruption Budget (PDB)

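A PodDisruptionBudget limits how many Pods of an application can be down at the same time during voluntary disruptions such as node drains and cluster upgrades. It does not protect against involuntary failures such as a node crash.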

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-api-pdb
spec:
  # Option 1: Minimum available
  minAvailable: 2             # At least 2 Pods must remain running

  # Option 2: Maximum unavailable (choose one)
  # maxUnavailable: 1         # At most 1 Pod can be unavailable

  selector:
    matchLabels:
      app: my-api

kubectl get pdb
# NAME       MIN AVAILABLE  MAX UNAVAILABLE  ALLOWED DISRUPTIONS  AGE
# my-api-pdb 2              N/A              1                    1d

Scenario: You have 3 replicas and minAvailable: 2. During kubectl drain node1, K8s will not remove a Pod if it would leave fewer than 2 running.
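
The ALLOWED DISRUPTIONS column follows from simple arithmetic, sketched below for integer values only. The `allowed_disruptions` helper is a hypothetical name, and percentage values for `minAvailable` or `maxUnavailable` are not modelled.

```python
# Simplified model of how many evictions a PDB currently permits (illustrative only).

def allowed_disruptions(healthy, expected, min_available=None, max_unavailable=None):
    """Return the number of Pods that may be evicted right now."""
    if (min_available is None) == (max_unavailable is None):
        raise ValueError("set exactly one of min_available or max_unavailable")
    if min_available is not None:
        required_healthy = min_available
    else:
        required_healthy = expected - max_unavailable
    return max(0, healthy - required_healthy)


# 3 replicas, minAvailable: 2 -> one Pod may be evicted.
print(allowed_disruptions(healthy=3, expected=3, min_available=2))    # 1
# After that eviction only 2 Pods are healthy, so further evictions are refused.
print(allowed_disruptions(healthy=2, expected=3, min_available=2))    # 0
# The maxUnavailable: 1 variant gives the same result for 3 replicas.
print(allowed_disruptions(healthy=3, expected=3, max_unavailable=1))  # 1
```

During a drain, the second case is why the node waits: the eviction is retried until a replacement Pod becomes healthy elsewhere.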

Production Secrets Management

By default, Kubernetes Secret objects are not encrypted: their values are only base64-encoded. Anyone with access to the YAML, the API, or etcd can decode them.
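
To see how little protection base64 offers, the short sketch below decodes a value the way anyone with read access could. The encoded string is a hypothetical example of what a Secret's `data` field might contain, and `decode_secret_value` is an illustrative helper.

```python
import base64


def decode_secret_value(encoded):
    """Decode a base64 string as found under a Secret's data field."""
    return base64.b64decode(encoded).decode("utf-8")


# Hypothetical value copied from `data.password` of a Secret manifest.
print(decode_secret_value("cGFzc3dvcmQ="))  # password
```

No key is involved at any point, so base64 here is an encoding for transport, not a security control.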

1. Enable Encryption at Rest

Configure the API server to encrypt secrets before saving them to etcd:

# EncryptionConfiguration passed to kube-apiserver
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:               # Listed first, so used for all new writes (AWS KMS / GCP KMS plugin)
          name: myKmsPlugin
          endpoint: unix:///tmp/kms-provider.sock
      - aescbc:            # Later providers are only tried when reading; local key (less secure)
          keys:
            - name: key1
              secret: <base64-encoded-key>
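
The `secret` placeholder for the `aescbc` provider expects a base64-encoded random key. A minimal sketch of generating one is shown below, assuming a 32-byte key; the function name is illustrative and the output should be treated as sensitive material.

```python
import base64
import os


def generate_encryption_key(num_bytes=32):
    """Return a random key of num_bytes, base64-encoded for the config file."""
    return base64.b64encode(os.urandom(num_bytes)).decode("ascii")


key = generate_encryption_key()
print(len(base64.b64decode(key)))  # 32
```

The key itself is random on every run, so only its decoded length is printed here.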

2. External Secrets Operator (ESO)

Recommended pattern: keep secret values out of Git and out of hand-applied YAML manifests. Use External Secrets Operator to sync them automatically from AWS Secrets Manager or HashiCorp Vault into the cluster.

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: "1h"
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: k8s-db-secret      # Name of the K8s Secret ESO will generate
  data:
    - secretKey: password    # Key in the K8s secret
      remoteRef:
        key: prod/db/creds   # Key in AWS Secrets Manager
        property: password

Contexts and kubeconfig

# View current context
kubectl config current-context

# List all contexts (each pairs a cluster with a user and a default namespace)
kubectl config get-contexts

# Switch context
kubectl config use-context my-prod-cluster

# Set default namespace in context
kubectl config set-context --current --namespace=production

# View kubeconfig
kubectl config view
cat ~/.kube/config

Multi-Cluster kubeconfig

# Merge multiple kubeconfig files
KUBECONFIG=~/.kube/config:~/.kube/dev-cluster:~/.kube/prod-cluster \
  kubectl config view --merge --flatten > ~/.kube/config-merged

export KUBECONFIG=~/.kube/config-merged

Production Configuration Checklist

# Every production Deployment should have:
spec:
  replicas: 3                          # ✅ Multiple replicas
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0               # ✅ Zero-downtime update
      maxSurge: 1

  template:
    spec:
      containers:
        - resources:
            requests:                 # ✅ Always set requests (for scheduler)
              cpu: "250m"
              memory: "256Mi"
            limits:                   # ✅ Always set limits (cap usage; exceeding memory gets OOM-killed)
              cpu: "1"
              memory: "512Mi"
          readinessProbe: ...         # ✅ Readiness probe (traffic routing)
          livenessProbe: ...          # ✅ Liveness probe (auto-restart)
          securityContext:
            runAsNonRoot: true        # ✅ Non-root
            readOnlyRootFilesystem: true  # ✅ Read-only filesystem

      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:                       # Which Pods to keep apart
                matchLabels:
                  app: my-api
              topologyKey: kubernetes.io/hostname  # ✅ Spread across nodes

Useful Configuration Commands

# Explain any resource field
kubectl explain pod.spec.containers.resources
kubectl explain deployment.spec.strategy

# Edit a live resource
kubectl edit deployment my-api           # Opens in $EDITOR
kubectl patch deployment my-api -p '{"spec":{"replicas":5}}'

# Apply from stdin
cat deployment.yaml | kubectl apply -f -

# Dry run (validate without applying)
kubectl apply -f deployment.yaml --dry-run=client
kubectl apply -f deployment.yaml --dry-run=server  # Server-side validation

# Generate YAML without applying
kubectl create deployment my-api --image=myapp:1.0.0 \
  --dry-run=client -o yaml > deployment.yaml

# Diff changes before applying
kubectl diff -f deployment.yaml

Interview Questions

What is the difference between a label and an annotation in Kubernetes?
What does a ResourceQuota do and in what scope does it apply?
What is a LimitRange and what does it add on top of ResourceQuota?
What is a taint and what problem does it solve?
What is the difference between NoSchedule and NoExecute taint effects?
What is a Pod Disruption Budget (PDB) and why is it important?
How do you spread Pods across availability zones?
What is a kubeconfig context?
What does kubectl apply --dry-run=server do?
Why should you always set resource requests on every container in production?
