All key concepts, bash commands, and architectures extracted from four official Red Hat student workbooks and organized into one progressive knowledge reference.
DO188 Intro to Containers with Podman (4.12)
DO180 OpenShift Admin I: Managing Containers & K8s (4.12)
DO288 Dev I: Containerizing Applications (4.2)
DO280 Admin II: Configuring a Production Cluster (4.14)
Chapter 1 · DO188
Container Fundamentals
A container is an encapsulated process that includes all required runtime dependencies. Unlike a virtual machine, a container shares the host kernel but isolates its filesystem, network, and process tree using Linux kernel primitives.
🧱
Namespaces
Kernel feature that isolates processes — each container gets its own PID, network, mount, UTS, IPC, and user namespaces, providing the illusion of a dedicated machine.
⚙️
Control Groups (cgroups)
Kernel mechanism for resource management. Limits and tracks CPU time, memory, disk I/O, and network bandwidth allocated to containers.
📦
Container Image
An immutable, layered archive defining an application and its libraries. Read-only layers are stacked via a union filesystem; a writable layer is added at runtime.
🏃
Container Instance
A running process created from a container image. Analogous to an object instantiated from a class. Many instances can run from a single image simultaneously.
📋
OCI Standard
The Open Container Initiative defines image-spec and runtime-spec so any compliant engine (Podman, Docker, CRI-O) can run the same images interchangeably.
🔄
Ephemeral by Default
Container engines remove the writable layer when a container is deleted. Any data written inside a container is lost unless explicitly persisted via a volume or bind mount.
Containers vs. Virtual Machines
Attribute | Virtual Machine | Container
Machine-level component | Hypervisor (KVM, VMware, Hyper-V) | Container engine (Podman, CRI-O)
Virtualization level | Fully virtualized environment with its own kernel | Shared host kernel; isolated user space only
Typical size | Gigabytes | Megabytes
Startup time | Minutes | Milliseconds to seconds
Portability | Usually tied to the same hypervisor | Any OCI-compliant engine
Best for | Full OS isolation, non-Linux workloads | Microservices, scale-out applications
Chapter 2 · DO188
Podman — Container Engine
Podman (Pod Manager) is a daemonless, rootless-capable OCI container engine from Red Hat. Unlike Docker, it does not require a background daemon — each podman invocation runs as a regular process, reducing attack surface.
Checking the Installation
verify podman
$ podman -v
# Output: podman version 4.x.x
$ podman info
# Shows host OS, kernel, storage driver, registry configuration
Running Your First Container
run containers
# Run a one-shot command inside a RHEL container (the image is pulled automatically if not present locally)
$ podman run registry.redhat.io/rhel7/rhel:7.9 echo 'Red Hat'
# Run interactively with a bash shell
$ podman run -it registry.redhat.io/rhel7/rhel:7.9 /bin/bash
# Run detached (background) with port mapping host:container
$ podman run -d -p 8080:8080 registry.access.redhat.com/ubi8/httpd-24:latest
# Run with auto-remove on exit, an environment variable, and a custom name
$ podman run --rm --name myapp -e NAME='Red Hat' registry.redhat.io/rhel7/rhel:7.9 printenv NAME
# Bind only to localhost (prevents external access)
$ podman run -p 127.0.0.1:8075:80 my-app
Managing Container Lifecycle
Command | Description
podman ps | List running containers
podman ps --all | List all containers, including stopped ones
podman ps --all --format=json | Output the container list as JSON
podman stop <name|id> | Send SIGTERM, then SIGKILL after a timeout (default 10 s)
podman stop -t 30 <name> | Graceful stop with a custom timeout
podman kill <name> | Send SIGKILL immediately
podman start <name> | Start a stopped container
podman restart <name> | Stop, then start a container
podman rm <name|id> | Remove a stopped container
podman rm -f <name> | Force-remove a running container
podman pause / unpause <name> | Freeze / resume all processes in a container using the cgroup freezer
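The SIGTERM-then-SIGKILL sequence used by podman stop only gives a clean shutdown if the containerized process handles SIGTERM. The following is a minimal sketch in plain bash, with no Podman involved: the hypothetical graceful_worker function stands in for a container's main process, and kill -TERM stands in for podman stop.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a container's main process.
graceful_worker() {
  # On SIGTERM, clean up and exit 0 instead of waiting to be killed.
  trap 'echo "SIGTERM received, shutting down cleanly"; exit 0' TERM
  echo "worker started"
  while true; do
    sleep 0.1   # short sleeps so the trap runs promptly
  done
}

# Run the worker in the background, then signal it the way a stop request would.
stop_demo() {
  graceful_worker &
  local pid=$!
  sleep 0.5
  kill -TERM "$pid"
  wait "$pid"
  echo "exit status: $?"
}

stop_demo
```

In a real image the same kind of trap would live in the entrypoint script. A process that ignores SIGTERM is left running until the stop timeout expires and SIGKILL is sent.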
Inspecting and Interacting with Containers
inspect & exec
# Run a command in a running container
$ podman exec -it myapp /bin/bash
$ podman exec myapp cat /etc/hostname
# Get full JSON metadata (IP, mounts, env vars, etc.)
$ podman inspect myapp
# Extract a specific field using a Go template
$ podman inspect myapp -f '{{.NetworkSettings.Networks.apps.IPAddress}}'
# Copy files between host and container
$ podman cp myapp:/etc/hosts /tmp/hosts
$ podman cp /tmp/config.yaml myapp:/app/config.yaml
# Show port mappings for a container
$ podman port myapp
$ podman port --all
# Tail container logs (follow mode)
$ podman logs -f myapp
$ podman logs --tail 50 myapp
Rootless Containers
Rootless Podman runs containers without root privileges. The user namespace maps the container's internal root (UID 0) to the current unprivileged host user. This limits the blast radius of container escapes.
rootless podman
# Run a container as your unprivileged user (no sudo needed)
$ podman run --userns=auto registry.access.redhat.com/ubi8/ubi-minimal bash
# View the user namespace mapping
$ podman unshare cat /proc/self/uid_map
# Storage location for rootless images
# Default: ~/.local/share/containers/storage
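Each line of /proc/self/uid_map has three fields: the first UID inside the namespace, the first UID outside it, and the length of the range. As a rough sketch of how that mapping is read, the hypothetical map_container_uid function below translates a container UID to a host UID. The sample map (host user 1000, subordinate UIDs starting at 100000) is an assumed example, not output from a real system.

```shell
#!/usr/bin/env bash
# Translate a UID inside the user namespace to the host UID, given uid_map
# lines on stdin of the form: <first-id-inside> <first-id-outside> <length>.
map_container_uid() {
  local uid=$1 inside outside length
  while read -r inside outside length; do
    if (( uid >= inside && uid < inside + length )); then
      echo $(( outside + uid - inside ))
      return 0
    fi
  done
  echo "unmapped" >&2
  return 1
}

# Assumed mapping: host user 1000 with subordinate UIDs starting at 100000.
sample_map='0 1000 1
1 100000 65536'

map_container_uid 0    <<< "$sample_map"   # container root
map_container_uid 1001 <<< "$sample_map"   # a non-root container user
```

With this sample map, container root resolves to the unprivileged host user, and every other container UID lands in the subordinate range, which is why a process that escapes the container holds no host root privileges.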
Chapter 3 · DO188 / DO288
Container Images & Registries
A container image is a read-only, layered archive (OCI image-spec). Each layer is a diff of filesystem changes. Layers are cached and shared between images to save disk space.
Image Naming Convention
[registry/][namespace/]image-name[:tag][@digest]
Examples:
registry.redhat.io/rhel9/rhel:9.2 ← Red Hat Registry, versioned tag
registry.access.redhat.com/ubi8:latest ← UBI image, floating tag
quay.io/myorg/myapp:v2.1.3 ← Quay.io, semantic version
myapp@sha256:a1b2c3... ← Pinned by digest (immutable)
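To make the naming convention concrete, here is a small sketch that splits a reference into its parts using bash parameter expansion. The parse_image_ref function is hypothetical, and its registry detection is a heuristic (the first component counts as a registry only if it contains a dot or a colon, or is localhost), so treat it as an illustration rather than a full reference parser.

```shell
#!/usr/bin/env bash
# Split an image reference of the form
#   [registry/][namespace/]image-name[:tag][@digest]
# into its parts. Prints one "key=value" line per part.
parse_image_ref() {
  local ref=$1 digest="" tag="" registry="" path
  # The digest, if present, follows the last "@".
  if [[ $ref == *@* ]]; then
    digest=${ref##*@}
    ref=${ref%@*}
  fi
  # A tag is a ":" suffix on the final path component
  # (a ":" earlier in the string is a registry port).
  local last=${ref##*/}
  if [[ $last == *:* ]]; then
    tag=${last##*:}
    ref=${ref%:*}
  fi
  # Treat the first component as a registry host only if it looks like one.
  local first=${ref%%/*}
  if [[ $ref == */* && ( $first == *.* || $first == *:* || $first == localhost ) ]]; then
    registry=$first
    path=${ref#*/}
  else
    path=$ref
  fi
  printf 'registry=%s\nrepository=%s\ntag=%s\ndigest=%s\n' \
    "$registry" "$path" "$tag" "$digest"
}

parse_image_ref quay.io/myorg/myapp:v2.1.3
```

An empty tag or digest field means that part was not given in the reference; container engines typically fall back to the latest tag in that case.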
Pulling, Listing, Tagging, Removing Images
image management
# Pull an image from a registry
$ podman pull registry.redhat.io/rhel7/rhel:7.9
# List all local images
$ podman images
# Tag a local image (creates an alias; does not copy data)
$ podman tag myapp:latest quay.io/myorg/myapp:v2.1.3
# Push image to a remote registry
$ podman push quay.io/myorg/myapp:v2.1.3
# Inspect image metadata (layers, env vars, entrypoint, etc.)
$ podman inspect registry.redhat.io/ubi8/ubi-minimal:latest
# Remove a local image
$ podman rmi myapp:latest
# Remove dangling images (add --all to remove every unused image)
$ podman image prune
# Search for images in registries
$ podman search ubi8
Registry Authentication
registry auth
# Log in to the Red Hat Registry (prompts for credentials)
$ podman login registry.redhat.io
# Log in to Quay.io
$ podman login quay.io
# Log in to the internal OpenShift registry using the oc token
$ podman login -u $(oc whoami) -p $(oc whoami -t) \
  default-route-openshift-image-registry.apps.ocp4.example.com
# Log out of a specific registry
$ podman logout quay.io
# Log out of all registries
$ podman logout --all
Skopeo — Registry Manipulation Without Pulling
Skopeo works with container images at the registry level, without needing a running container engine. It inspects, copies, and deletes images across registries efficiently.
skopeo
# Inspect image metadata without downloading it
$ skopeo inspect docker://registry.redhat.io/ubi8/ubi-minimal:latest
# Inspect with credentials
$ skopeo inspect --creds user:password \
  docker://registry.redhat.io/rhscl/postgresql-96-rhel7
# Copy image between registries (no intermediate local storage needed)
$ skopeo copy --dest-tls-verify=false \
  docker://registry.redhat.io/ubi8/ubi-minimal:latest \
  docker://registry.example.com/myorg/ubi-minimal:latest
# Copy from local storage to remote registry
$ skopeo copy --dest-tls-verify=false \
  containers-storage:myimage \
  docker://registry.example.com/myorg/myimage
# Copy between two private registries with different credentials
$ skopeo copy --src-creds=user1:pass1 --dest-creds=user2:pass2 \
  docker://src-registry.example.com/myimage \
  docker://dest-registry.example.com/myimage
# Delete an image from a registry
$ skopeo delete docker://registry.example.com/myorg/old-image:tag
Chapter 4 · DO188 / DO288
Building Custom Images — Containerfile
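A Containerfile (compatible with Dockerfile syntax) is a text recipe for building a container image. Instructions that change the filesystem (RUN, COPY, ADD) each add a new read-only layer; the other instructions record metadata in the image configuration.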
Essential Containerfile Instructions
Instruction | Purpose | Example
FROM | Base image to build upon. Every Containerfile starts here. | FROM ubi8/ubi-minimal:8.8
RUN | Execute a shell command during build (creates a layer). | RUN dnf install -y python3 && dnf clean all
COPY | Copy files from the build context into the image. | COPY app.py /app/app.py
ADD | Like COPY, but also handles URLs and auto-extracts tar archives. | ADD app.tar.gz /app/
WORKDIR | Set the working directory for subsequent instructions. | WORKDIR /app
ENV | Set environment variables available at build time and at runtime. | ENV PORT=8080 DEBUG=false
ARG | Build-time variable (not available at runtime unless also set with ENV). | ARG VERSION=1.0
EXPOSE | Documents which port the container listens on (informational only). | EXPOSE 8080
USER | Switch to a non-root user for all subsequent instructions and the final container. | USER 1001
ENTRYPOINT | Main command that always runs (use the exec form, a JSON array). | ENTRYPOINT ["python3", "-m", "http.server"]
CMD | Default arguments for ENTRYPOINT, or the default command if no ENTRYPOINT is set. |
Multi-stage builds produce small production images by separating the build environment from the runtime environment.
CONTAINERFILE — multi-stage Go application
# ── Stage 1: Build ──────────────────────────────────────
FROM registry.access.redhat.com/ubi8/go-toolset:1.17 AS build
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o myapp .
# ── Stage 2: Runtime (only the binary, no build tools) ──
FROM registry.access.redhat.com/ubi8/ubi-minimal:latest
WORKDIR /app
# Copy only the compiled binary from the build stage
COPY --from=build /app/myapp .
EXPOSE 8080
USER 1001
ENTRYPOINT ["./myapp"]
Building Images with Podman
podman build
# Build from the Containerfile in the current directory
$ podman build -t myapp:latest .
# Build from a specific Containerfile in a different context
$ podman build -f Containerfile.prod -t myapp:prod /path/to/context
# Pass build arguments
$ podman build --build-arg VERSION=2.0 -t myapp:2.0 .
# Squash the newly built layers into one (can reduce image size)
$ podman build --squash -t myapp:squashed .
# Multi-platform build (use --manifest rather than -t when building for several platforms)
$ podman build --platform linux/amd64,linux/arm64 --manifest myapp:multi .
# View image history (layers)
$ podman history myapp:latest
OpenShift Note
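By default, OpenShift does not let containers run as root. Under the default restricted SCC, containers run with an arbitrary non-root UID assigned from the project's range, and pods that request UID 0 are rejected unless a more permissive SCC is granted. Always add a non-root USER instruction such as USER 1001 as the last USER instruction, and avoid depending on one specific UID at runtime.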
Chapter 5 · DO188
Persisting Data — Volumes & Bind Mounts
Containers are ephemeral; data written inside is lost on removal. There are two main mechanisms to persist data outside the container lifecycle.
Podman Named Volumes
Named volumes are managed by Podman, stored under ~/.local/share/containers/storage/volumes/ (rootless) or /var/lib/containers/storage/volumes/ (root). They survive container removal.
named volumes
# Create a named volume
$ podman volume create mydata
# List volumes
$ podman volume ls
# Inspect volume (shows mountpoint on host)
$ podman volume inspect mydata
# Mount volume into a container at /var/lib/myapp
$ podman run -d -v mydata:/var/lib/myapp:Z myapp:latest
# Remove a volume (fails if in use)
$ podman volume rm mydata
# Remove all unused volumes
$ podman volume prune
Bind Mounts
Bind mounts map a host directory into the container. Useful during development to share source code.
bind mounts
# Mount host directory /host/data into /app/data inside the container
$ podman run -v /host/data:/app/data:Z myapp:latest
# Read-only bind mount
$ podman run -v /host/config:/app/config:ro,Z myapp:latest
# :Z tells Podman to relabel the files for SELinux with a private label (required on RHEL/CentOS)
# :z applies a shared label so several containers can use the files (less restrictive)
Running a Database Container with Persistent Storage
Podman uses a software-defined network layer (CNI or Netavark) to connect containers. By default, each container gets a private IP address on the podman bridge network. DNS-based name resolution works between containers on the same user-defined network.
Podman Network Commands
podman network
# Create a custom bridge network
$ podman network create example-net
# Create a network with a specific subnet
$ podman network create --subnet 192.168.100.0/24 my-subnet
# List all networks
$ podman network ls
# Inspect a network (shows subnet, gateway, connected containers)
$ podman network inspect example-net
# Remove a network (must have no connected containers)
$ podman network rm example-net
# Remove all unused networks
$ podman network prune
# Connect a new container to a network
$ podman run -d --name my-container --net example-net container-image:latest
# Connect a container to multiple networks at launch
$ podman run -d --name gateway --net frontend-net,backend-net my-gateway
# Connect an already running container to a network
$ podman network connect example-net my-container
DNS Resolution
Containers on the same user-defined network can reach each other by container name. For example, a web app on app-net can reach a database named db at db:5432. The default podman network does not provide DNS.
Chapter 6 · DO188
Troubleshooting Containers
Log Access and Debugging
debugging commands
# Stream container logs in real time
$ podman logs -f myapp
# Show the last 100 lines of logs
$ podman logs --tail 100 myapp
# Show logs with timestamps
$ podman logs -t myapp
# Run a debug shell in a running container
$ podman exec -it myapp /bin/sh
# Run a debug shell in a new temporary container from the same image
$ podman run --rm -it myapp:latest /bin/sh
# View running processes inside a container
$ podman top myapp
# Live resource usage stats
$ podman stats myapp
# Check the container exit code after a failure
$ podman inspect myapp -f '{{.State.ExitCode}}'
Chapter 7 · DO188
Multi-Container Applications with Podman Compose
Podman Compose reads a docker-compose.yml / compose.yaml file and translates each service into Podman containers, networks, and volumes. It is ideal for local development environments.
# Start all services (detached)
$ podman-compose up -d
# Start and rebuild images if changed
$ podman-compose up -d --build
# View running compose services
$ podman-compose ps
# Stream logs from all services
$ podman-compose logs -f
# Stop and remove all containers/networks created by compose
$ podman-compose down
# Scale a service to N replicas
$ podman-compose up --scale backend=3 -d
# List the containers that belong to a compose project, filtered by label
$ podman ps -a --filter label=io.podman.compose.project=myproject
Chapter 1 · DO180
Kubernetes Architecture
Kubernetes is an open-source container orchestration system. It groups containers into Pods, ensures desired state, scales workloads, and manages networking and storage across a cluster of nodes.
🖥️
Control Plane
Runs the API server (kube-apiserver), scheduler, controller manager, and etcd. Manages cluster state and makes scheduling decisions.
⚙️
Worker Nodes
Run the kubelet (node agent), kube-proxy, and a container runtime (CRI-O). Execute the actual workloads.
📦
Pod
The smallest deployable unit — a group of one or more tightly coupled containers sharing a network namespace (same IP) and storage volumes.
🔁
ReplicaSet
Ensures a specified number of Pod replicas are always running. Replaces failed pods automatically.
🚀
Deployment
Manages ReplicaSets declaratively. Enables rolling updates, rollbacks, and scaling. The standard way to deploy stateless apps.
🔌
Service
A stable virtual IP and DNS name that load-balances traffic to a set of matching Pods. Types: ClusterIP, NodePort, LoadBalancer.
🗃️
etcd
Distributed key-value store that holds all cluster state (resource definitions, secrets, configuration). The single source of truth.
📋
Namespace
Virtual cluster within Kubernetes for multi-tenancy. Resources in different namespaces are isolated, and resource quotas can be applied per namespace.
kubectl — Kubernetes CLI
kubectl essentials
# Imperative deployment (quick, not reproducible)
$ kubectl create deployment db-pod --port 3306 \
  --image registry.example.com/rhel8/mysql-80
# Set environment variables on a deployment
$ kubectl set env deployment/db-pod \
  MYSQL_USER='user1' \
  MYSQL_PASSWORD='mypass' \
  MYSQL_DATABASE='mydb'
# Apply a manifest declaratively (creates or updates)
$ kubectl apply -f deployment.yaml
# Apply all manifests in a directory recursively
$ kubectl apply -f manifests/ -R
# Preview what apply would change (dry-run diff)
$ kubectl diff -f deployment.yaml
# Delete resources from a manifest
$ kubectl delete -f deployment.yaml
# Generate a YAML manifest from an imperative command
$ kubectl create deployment hello -o yaml --dry-run=client \
  --image registry.example.com/redhattraining/hello:latest > hello.yaml
# Explain any field of a resource kind
$ kubectl explain deployment.spec.template.spec.containers
Chapter 1 · DO180 / DO288
Red Hat OpenShift Container Platform
Red Hat OpenShift (RHOCP) is an enterprise Kubernetes distribution that adds developer tools, security hardening, integrated CI/CD (Tekton/Pipelines), a web console, and enterprise support on top of upstream Kubernetes.
Foundation: Linux Kernel → Runtime: CRI-O → Orchestration: Kubernetes → Platform: OpenShift → Your App: Pods & Services
Logging In and Basic Navigation
oc login & navigation
# Log in to an OpenShift cluster
$ oc login -u developer -p developer https://api.ocp4.example.com:6443
# Print current user
$ oc whoami
# Get web console URL
$ oc whoami --show-console
# Get API server token (used for registry auth)
$ oc whoami -t
# Switch to a project (namespace)
$ oc project myproject
# List all accessible projects
$ oc projects
# Create a new project
$ oc new-project myapp --description "My Application"
Key OpenShift-Specific Concepts
🗂️
Project
OpenShift's enhanced Namespace. Adds access control, network policies, and resource quota isolation per team or application.
🛣️
Route
OpenShift extension to expose services to external traffic via a hostname. Backed by HAProxy; supports TLS termination (edge, passthrough, re-encrypt).
🔒
Security Context Constraint (SCC)
Cluster-level policy controlling what a pod can do (run as root, mount host paths, use host network, etc.). Default SCC prevents privileged operations.
🖼️
ImageStream
A pointer to container images that triggers automatic redeployments when the referenced image changes. Decouples image location from deployment config.
🏗️
BuildConfig
Defines how to build a container image from source — supports Docker, Source-to-Image (S2I), and custom build strategies.
⚙️
Operator
A Kubernetes controller that encodes operational knowledge for managing a complex application (e.g., database clusters). Extends the Kubernetes API with CRDs.
Chapters 2–5 · DO180
Managing Workloads in OpenShift
Deploying Applications with oc new-app
oc new-app
# Deploy from a container image
$ oc new-app --image registry.access.redhat.com/ubi8/httpd-24
# Deploy from a Git repository (auto-detects language via S2I)
$ oc new-app https://github.com/myorg/myapp
# Specify S2I builder + source repository
$ oc new-app php~http://gitserver.example.com/mygitrepo
# Deploy from local source directory
$ oc new-app . --name myapp
# Pass environment variables
$ oc new-app --image rhel8/mysql-80 \
  -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=mydb
Viewing and Managing Resources
oc get / describe / delete
# List all resources in the current project
$ oc get all
# List pods with status
$ oc get pods
# List pods with node and IP info
$ oc get pods -o wide
# Watch pods until ready
$ watch oc get pods
# Describe a pod (events, conditions, container states)
$ oc describe pod myapp-6d8c4-xyz
# Get pod logs
$ oc logs myapp-6d8c4-xyz
$ oc logs -f myapp-6d8c4-xyz          # follow
$ oc logs -c sidecar myapp-6d8c4-xyz  # specific container
# Get deployments
$ oc get deployments
# Scale a deployment to 3 replicas
$ oc scale deployment myapp --replicas 3
# Rollout status
$ oc rollout status deployment/myapp
# Roll back a deployment
$ oc rollout undo deployment/myapp
# Delete a resource
$ oc delete pod myapp-6d8c4-xyz
$ oc delete deployment myapp
Exposing Services via Routes
services & routes
# Expose a deployment as a ClusterIP service on port 8080
$ oc expose deployment myapp --port 8080
# Create a Route (external hostname) from a service
$ oc expose service myapp
# Get routes to see assigned hostnames
$ oc get routes
# Create a TLS edge-terminated route with a custom hostname
$ oc create route edge --service myapp \
  --hostname myapp.apps.ocp4.example.com \
  --cert tls.crt --key tls.key
StatefulSets — Stateful Workloads
For stateful applications (databases, message queues), use a StatefulSet. Pods get stable network identifiers (mydb-0, mydb-1) and each gets its own PersistentVolumeClaim.
statefulsets
# List StatefulSets
$ oc get statefulsets
# Scale a StatefulSet (pods are added/removed in order)
$ oc scale statefulset mydb --replicas 3
# Delete a StatefulSet without deleting its pods
$ oc delete statefulset mydb --cascade=orphan
Chapters 4–5 · DO288
Builds — Source-to-Image (S2I)
Source-to-Image (S2I) is an OpenShift build strategy that takes application source code and a builder image, injects the source, runs the build scripts, and produces a ready-to-run container image — without writing a Containerfile.
Input: Source Code (Git) + Input: S2I Builder Image → S2I Process: assemble script → Output: App Container Image → Deploy: Pods on Cluster
S2I Build Commands
s2i & oc start-build
# Build locally with the s2i CLI (for testing the S2I process)
$ s2i build https://github.com/myorg/myapp \
  registry.access.redhat.com/ubi8/python-38 \
  myorg/myapp:latest
# Trigger a new build in OpenShift
$ oc start-build myapp-build
# Trigger a build from local source (binary input)
$ oc start-build myapp-build --from-dir .
# Watch build logs in real time
$ oc logs -f bc/myapp-build
# List all builds
$ oc get builds
# Cancel a running build
$ oc cancel-build myapp-build-3
# List imagestreams in the current project
$ oc get imagestreams
# List imagestream tags
$ oc get imagestreamtags
# Import an external image into an imagestream
$ oc import-image myapp:latest \
  --from quay.io/myorg/myapp:latest \
  --confirm
# Tag an existing imagestream tag
$ oc tag myapp:latest myapp:stable
Chapter 2 · DO288 / Chapter 5 · DO180
Injecting Configuration — ConfigMaps & Secrets
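Applications should not bake configuration into images. ConfigMaps hold non-sensitive configuration; Secrets hold sensitive data (passwords, tokens, keys). Secret values are stored base64-encoded, which is an encoding rather than encryption, so read access to Secrets should be restricted with RBAC.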
ConfigMaps
configmap
# Create from literal values
$ oc create configmap app-config \
  --from-literal APP_PORT=8080 \
  --from-literal LOG_LEVEL=info
# Create from a file
$ oc create configmap nginx-conf --from-file nginx.conf
# Create from all files in a directory
$ oc create configmap app-props --from-file ./config/
# View the configmap
$ oc get configmap app-config -o yaml
# Use as environment variables in a pod spec
# spec.containers[].envFrom:
#   - configMapRef:
#       name: app-config
# Mount as a volume (files in the pod)
# spec.volumes[]: configMap: name: nginx-conf
# spec.containers[].volumeMounts[]: mountPath: /etc/nginx/conf.d
Secrets
secrets
# Create a generic secret from literals
$ oc create secret generic db-credentials \
  --from-literal username=myuser \
  --from-literal password='S3cr3t!'
# Create a TLS secret from certificate files
$ oc create secret tls my-tls --cert tls.crt --key tls.key
# Create a docker-registry secret (for pulling private images)
$ oc create secret docker-registry quay-pull-secret \
  --docker-server quay.io \
  --docker-username myuser \
  --docker-password mytoken
# Link pull secret to service account
$ oc secrets link default quay-pull-secret --for pull
# View a secret (base64-encoded)
$ oc get secret db-credentials -o yaml
# Decode a secret value
$ oc get secret db-credentials -o jsonpath='{.data.password}' | base64 -d
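The decode step above works without any key, which is worth seeing in isolation. This sketch uses only the base64 utility; the two helper functions are hypothetical names, and the sample value is the literal used in the command above.

```shell
#!/usr/bin/env bash
# Encode a value the way it appears under .data in a Secret manifest.
encode_secret_value() {
  printf '%s' "$1" | base64
}

# Decode it again; no key or password is involved.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

encoded=$(encode_secret_value 'S3cr3t!')
echo "stored in .data: $encoded"
echo "decoded:         $(decode_secret_value "$encoded")"
```

Anyone who can read the Secret object can recover the value this way, so the protection comes from access control on the Secret, not from the encoding.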
Chapter 5 · DO180
Persistent Storage in OpenShift
The Kubernetes storage model decouples how storage is provisioned (PersistentVolume) from how it is requested (PersistentVolumeClaim).
Storage Object Hierarchy
StorageClass (admin creates once)
↓ dynamically provisions
PersistentVolume (PV) — actual storage backing (NFS, iSCSI, AWS EBS, Ceph, etc.)
↓ bound to
PersistentVolumeClaim (PVC) — developer's storage request (size, access mode)
↓ mounted by
Pod — uses the PVC as a volume
Access Modes
Mode | Short Form | Meaning
ReadWriteOnce | RWO | One node can read and write. Suitable for block storage (AWS EBS, iSCSI).
ReadOnlyMany | ROX | Many nodes can read. Suitable for shared read-only configuration.
ReadWriteMany | RWX | Many nodes can read and write. Requires distributed storage (NFS, CephFS, GlusterFS).
ReadWriteOncePod | RWOP | Only a single Pod can access the volume at a time (K8s 1.22+). Strongest isolation.
Working with PVCs
persistent volume claims
# List storage classes available in the cluster
$ oc get storageclasses
# List PersistentVolumes (admin view)
$ oc get pv
# List PersistentVolumeClaims in the current project
$ oc get pvc
# Describe a PVC (status, bound PV, capacity)
$ oc describe pvc mydata-pvc
pvc.yaml — request 5Gi of ReadWriteOnce storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mydata-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: standard # omit to use default StorageClass
Chapter 6 · DO180
Application Reliability — Health Probes & Autoscaling
Health Probe Types
🚦
Liveness Probe
Checks if the application is still running. If it fails, Kubernetes restarts the container. Detects deadlocks and infinite loops.
✅
Readiness Probe
Checks if the application is ready to serve traffic. A failing readiness probe removes the pod from Service endpoints, preventing traffic to a still-starting app.
🔥
Startup Probe
Gates the liveness and readiness probes until the application has started. Critical for slow-starting applications to prevent premature restarts.
# Create an HPA targeting 70% CPU utilization, scaling from 2 to 10 pods
$ oc autoscale deployment myapp \
  --min 2 --max 10 --cpu-percent 70
# View HPA status (shows current / desired replicas)
$ oc get hpa
# Describe HPA events and scaling history
$ oc describe hpa myapp
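The replica count an HPA aims for follows the ratio of the observed metric to the target: desired = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The sketch below works through that arithmetic in bash. The hpa_desired_replicas function is a hypothetical helper, and it deliberately ignores the tolerance band and stabilization behavior that the real controller applies.

```shell
#!/usr/bin/env bash
# desired = ceil(current_replicas * current_metric / target_metric),
# clamped to [min, max]. Integer percentages keep the arithmetic exact.
hpa_desired_replicas() {
  local current=$1 current_cpu=$2 target_cpu=$3 min=$4 max=$5
  # Integer ceiling division: (a + b - 1) / b
  local desired=$(( (current * current_cpu + target_cpu - 1) / target_cpu ))
  (( desired < min )) && desired=$min
  (( desired > max )) && desired=$max
  echo "$desired"
}

# 4 pods averaging 90% CPU against a 70% target, bounds 2..10
hpa_desired_replicas 4 90 70 2 10
```

For the sample call, 4 × 90 / 70 is about 5.14, so the result rounds up to 6 pods, which is within the 2 to 10 range set by the oc autoscale command above.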
Resource Quotas & LimitRanges
quota & limits
# View resource quotas for the project
$ oc get resourcequota
# Describe quota usage (current vs. limit)
$ oc describe resourcequota compute-resources
# View LimitRange defaults applied to new pods
$ oc get limitrange
$ oc describe limitrange core-resource-limits
Chapter 3 · DO280
Authentication, Authorization & RBAC
OpenShift supports multiple identity providers. The most common lab/on-premises choice is HTPasswd. Role-Based Access Control (RBAC) governs what authenticated users can do.
HTPasswd Identity Provider
htpasswd identity provider setup
# Create an htpasswd file with a new user
$ htpasswd -c -B htpasswd-users user1
# Add another user to the existing file
$ htpasswd -B htpasswd-users user2
# Create a secret from the htpasswd file
$ oc create secret generic htpasswd-secret \
  --from-file htpasswd=htpasswd-users \
  -n openshift-config
# Update the OAuth cluster resource to use HTPasswd
$ oc edit oauth cluster
# Add under spec.identityProviders:
# - name: htpasswd-provider
#   mappingMethod: claim
#   type: HTPasswd
#   htpasswd:
#     fileData:
#       name: htpasswd-secret
# Verify OAuth pods are rolling out
$ oc get pods -n openshift-authentication
RBAC — Roles and RoleBindings
Resource | Scope | Purpose
Role | Namespace | Defines a set of allowed API verbs (get, list, watch, create, update, delete) on specific resources within a namespace.
RoleBinding | Namespace | Binds a Role (or ClusterRole) to users, groups, or service accounts within a namespace.
ClusterRole | Cluster-wide | Like a Role, but applies across all namespaces. Also used for non-namespaced resources (nodes, PVs).
ClusterRoleBinding | Cluster-wide | Binds a ClusterRole to subjects for cluster-wide access.
rbac commands
# List roles in the current project
$ oc get roles
# List cluster roles
$ oc get clusterroles
# Grant a user the edit role in the current project
$ oc adm policy add-role-to-user edit user1
# Grant a user the view role in a specific namespace
$ oc adm policy add-role-to-user view user2 -n production
# Grant cluster-admin rights (use with caution!)
$ oc adm policy add-cluster-role-to-user cluster-admin admin-user
# Remove a role from a user
$ oc adm policy remove-role-from-user edit user1
# Check what a user can do
$ oc auth can-i create pods --as user1
$ oc auth can-i '*' '*' --as system:serviceaccount:myproject:default
# Grant a service account the anyuid SCC (allows running as any user)
$ oc adm policy add-scc-to-user anyuid -z myserviceaccount
Groups
groups
# Create a group
$ oc adm groups new developers
# Add users to a group
$ oc adm groups add-users developers user1 user2
# Grant the group the edit role in a namespace
$ oc adm policy add-role-to-group edit developers -n myproject
# List groups
$ oc get groups
Chapter 1 · DO280
Declarative Resource Management & Kustomize
The declarative workflow describes desired state in YAML manifests and uses kubectl apply to reconcile the cluster to that state. This is reproducible, auditable (Git-based), and supports GitOps workflows.
Imperative vs. Declarative
Attribute | Imperative | Declarative
How | kubectl create / delete commands | YAML files + kubectl apply
Reproducibility | Difficult; depends on command history | High; files define the exact desired state
GitOps compatible | No | Yes
Best for | Quick one-off tasks, debugging | Production deployments, CI/CD pipelines
Kustomize — Configuration Overlays
Kustomize generates Kubernetes manifests from a base and environment-specific overlays without duplicating YAML. It is natively integrated into kubectl and oc.
# Preview rendered manifests without applying
$ kubectl kustomize overlay/production
# Apply a kustomization overlay to the cluster
$ kubectl apply -k overlay/production
$ oc apply -k overlay/staging
# Delete resources created by a kustomization
$ oc delete -k overlay/production
# Diff current cluster state vs. kustomization
$ kubectl diff -k overlay/production
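The commands above assume a base directory and one overlay per environment. As a rough sketch of what that layout can contain, the script below scaffolds a minimal base and a production overlay in a scratch directory. The scaffold_kustomize function, the myapp Deployment, the image reference, and the production namespace are all illustrative assumptions.

```shell
#!/usr/bin/env bash
# Scaffold a hypothetical base plus production overlay under the given root.
scaffold_kustomize() {
  local root=$1
  mkdir -p "$root/base" "$root/overlay/production"

  cat > "$root/base/deployment.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: quay.io/myorg/myapp:v2.1.3
EOF

  cat > "$root/base/kustomization.yaml" <<'EOF'
resources:
  - deployment.yaml
EOF

  # The overlay reuses the base and changes only what differs in production.
  cat > "$root/overlay/production/kustomization.yaml" <<'EOF'
resources:
  - ../../base
namespace: production
replicas:
  - name: myapp
    count: 3
EOF
}

workdir=$(mktemp -d)
scaffold_kustomize "$workdir"
find "$workdir" -type f | sort
```

Running kubectl kustomize against the generated overlay/production directory should render the base Deployment with the overlay's namespace and replica count applied, without the Deployment YAML being copied into the overlay.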
OpenShift Templates
Templates are OpenShift-native packaged resource sets with parameters. They are stored in the openshift namespace and deployable via the web console or CLI.
templates
# List available templates in the global template library
$ oc get templates -n openshift
# Describe a template (parameters, resources it creates)
$ oc describe template cache-service -n openshift
# Process a template with custom parameters
$ oc process -f mytemplate.yaml \
  -p APP_NAME=myapp \
  -p REPLICAS=3 | oc apply -f -
# Process a template from the openshift namespace
$ oc process -n openshift cakephp-mysql-persistent \
  -p NAME=myweb | oc apply -f -
# Export existing resources as YAML to use as a starting point for a template
# (oc export was removed in OpenShift 4; use oc get -o yaml instead)
$ oc get all -o yaml > exported-resources.yaml
Chapter 5 · DO280
Operators & Helm Charts
Kubernetes Operators
An Operator is a Kubernetes controller that manages a specific application using domain knowledge encoded in software. It watches Custom Resources (CRs) and reconciles the cluster to match the desired state defined in those CRs.
🔩
Custom Resource Definition (CRD)
Extends the Kubernetes API with new resource types. An Operator registers CRDs and watches for instances to reconcile.
📡
Operator Lifecycle Manager (OLM)
OpenShift's system for installing, updating, and managing Operators cluster-wide. Provides the OperatorHub web UI.
📦
ClusterServiceVersion (CSV)
Operator metadata file describing capabilities, required CRDs, install strategy, and lifecycle details. OLM uses the CSV to install an Operator.
📋
Subscription
Tells OLM which Operator to install, from which catalog, and update channel. OLM keeps the Operator updated as new versions are published.
operator management
# List all installed operators (CSVs) in a namespace
$ oc get clusterserviceversions
# List operator subscriptions
$ oc get subscriptions -n openshift-operators
# Describe a subscription (current/desired CSV, catalog)
$ oc describe subscription my-operator -n openshift-operators
# List Custom Resource Definitions
$ oc get crds
# List resources created by an operator (e.g., a database cluster CR)
$ oc get myoperatorresource
Helm — Package Management for Kubernetes
Helm packages Kubernetes manifests into charts (versioned, shareable archives). It uses Go templates to parameterize manifests, and tracks installed releases.
helm
# Add a chart repository
$ helm repo add bitnami https://charts.bitnami.com/bitnami
# Update local repo cache
$ helm repo update
# Search for a chart
$ helm search repo nginx
# Install a chart with a custom release name
$ helm install my-nginx bitnami/nginx
# Install with custom values
$ helm install my-nginx bitnami/nginx \
  --set replicaCount=2 \
  --set service.type=ClusterIP
# Install with a values file
$ helm install my-nginx bitnami/nginx -f values.yaml
# List all Helm releases in the current namespace
$ helm list
# Upgrade a release (apply changed values)
$ helm upgrade my-nginx bitnami/nginx --set replicaCount=4
# Roll back to a previous release revision
$ helm rollback my-nginx 1
# Uninstall a release (removes all its resources)
$ helm uninstall my-nginx
# Show rendered templates without installing (debug)
$ helm template my-nginx bitnami/nginx -f values.yaml
Chapter 6 · DO280
Network Policies & TLS
NetworkPolicy resources enforce firewall-like rules between pods and namespaces. By default, every pod in a project accepts traffic from any other pod in the cluster, including pods in other projects. NetworkPolicies restrict this.
Default Deny + Allow Pattern
network-policy — deny-all ingress, then allow specific traffic
# 1. Deny all ingress to this namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}   # applies to ALL pods in namespace
  policyTypes:
  - Ingress
---
# 2. Allow ingress only from pods with label app=frontend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
network policy & TLS
# List network policies in current project
$ oc get networkpolicies
# Describe a network policy
$ oc describe networkpolicy deny-all-ingress
# Apply a network policy
$ oc apply -f deny-all.yaml
# Generate a self-signed TLS certificate (for testing)
$ openssl req -newkey rsa:4096 -nodes -keyout tls.key \
  -x509 -days 365 -out tls.crt \
  -subj "/CN=myapp.apps.ocp4.example.com"
# Create a TLS edge-terminated route
$ oc create route edge myapp-tls \
  --service myapp \
  --cert tls.crt \
  --key tls.key \
  --hostname myapp.apps.ocp4.example.com
# Create a passthrough route (TLS terminated at the pod)
$ oc create route passthrough myapp-passthrough \
  --service myapp \
  --hostname myapp.apps.ocp4.example.com
Egress Controls
egress network policy
# Allow egress only to a specific external CIDR
# (EgressNetworkPolicy is an OpenShift SDN extension; OVN-Kubernetes clusters use EgressFirewall instead)
$ oc apply -f egress-policy.yaml
# Test connectivity from a pod
$ oc exec -it myapp-pod -- curl -v http://external-service:8080
Chapter 8 · DO280
Cluster & Operator Updates
OpenShift uses the Cluster Version Operator (CVO) to manage cluster updates. Updates follow a graph of supported upgrade paths and can be performed with minimal downtime using rolling node replacements.
Update Channels
Channel | Purpose
stable-4.x | Fully tested, production-recommended updates for minor version 4.x
fast-4.x | Updates released faster than stable; less bake time but still tested
candidate-4.x | Release candidates; not for production
eus-4.x | Extended Update Support, for customers staying on a fixed minor version (offered for even-numbered minor releases)
cluster update commands
# Check current cluster version and available updates
$ oc get clusterversion
$ oc describe clusterversion version
# View available update targets
$ oc adm upgrade
# Start a cluster upgrade to a specific version
$ oc adm upgrade --to 4.14.15
# Trigger an upgrade to the latest recommended version
$ oc adm upgrade --to-latest
# Watch cluster operators during upgrade
$ watch oc get clusteroperators
# View cluster operator health
$ oc get clusteroperators
# Check node status during rolling update
$ oc get nodes
# View machine config pools (controls node update batching)
$ oc get machineconfigpool
# Pause updates to the worker pool (e.g., during critical period)
$ oc patch mcp/worker --type merge \
  -p '{"spec":{"paused":true}}'
# Resume worker pool updates
$ oc patch mcp/worker --type merge \
  -p '{"spec":{"paused":false}}'
Updating Operators via OLM
operator updates
# List operator subscriptions and their update channels
$ oc get subscriptions -A
# Check install plans (pending/approved operator updates)
$ oc get installplans -n openshift-operators
# Approve a manual install plan (for manually-approved update policy)
$ oc patch installplan install-xxxxx --type merge \
  -p '{"spec":{"approved":true}}' \
  -n openshift-operators
# Watch operator CSV rollout
$ watch oc get csv -n openshift-operators
Important
OpenShift supports updates only between adjacent minor versions (e.g., 4.12 → 4.13 → 4.14). Minor versions cannot be skipped, so plan a multi-version update as a chain of hops and confirm each hop against Red Hat's update graph tool. Always update the cluster before updating Operators that depend on it.
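The adjacent-minor rule can be sketched as a small helper that lists the hops between two versions. This is an illustration only: minor_hops is a hypothetical function, it assumes both versions are 4.x, and it does not consult the real update graph, which decides which patch releases are valid at each hop.

```bash
#!/usr/bin/env bash
# Print the chain of minor-version hops between two OpenShift 4.x versions.
# Illustrative only: it does not query the real update graph.
minor_hops() {
  local from="$1" to="$2"
  local from_minor to_minor
  # Drop the major version ("4.") and any patch suffix (".15").
  from_minor="${from#*.}"; from_minor="${from_minor%%.*}"
  to_minor="${to#*.}";     to_minor="${to_minor%%.*}"
  if (( to_minor < from_minor )); then
    echo "downgrades are not supported" >&2
    return 1
  fi
  local path="4.${from_minor}" m
  for (( m = from_minor + 1; m <= to_minor; m++ )); do
    path+=" -> 4.${m}"
  done
  echo "$path"
}
```

Calling minor_hops 4.12.5 4.14.15 prints one line with every minor release the cluster has to pass through; each hop is then a separate oc adm upgrade run.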
Setting Up an OpenShift Cluster: User-Provisioned Infrastructure in Air-Gapped Environments
A complete, step-by-step guide to manually provisioning VMs, load balancers, and DNS for an OpenShift cluster in a disconnected network.
1 What is UPI?
With User-Provisioned Infrastructure (UPI), you have maximum control over the cluster setup. You are responsible for manually preparing all virtual machines, load balancers, and DNS records before the OpenShift installer runs. This is the preferred approach for secure, air-gapped, or heavily regulated environments.
2 Air-Gapped Prerequisites
The primary challenge in a disconnected environment is the absence of direct access to the Red Hat Container Registry. You must bridge this gap before initiating the installation.
Mirror Registry
Establish a local container registry (e.g., Red Hat Quay or JFrog Artifactory) within your secure perimeter. Use the oc mirror plugin to sync OpenShift release images, operator catalogs, and Helm charts from the internet to a portable medium, then load them into your local registry.
Internal DNS & NTP
Precise time synchronization and split-horizon DNS are non-negotiable. Every node must be able to resolve the local registry hostname and all internal API endpoints.
3 OS Strategy
Red Hat enforces a specific OS strategy to ensure the self-healing nature of OpenShift.
⚠️
Control Plane (Masters): RHCOS is mandatory. Standard RHEL, Ubuntu, or any other OS cannot be used. Masters are managed by the Machine Config Operator (MCO), which requires an immutable, container-optimized OS to push updates, roll back kernel changes, and manage configurations automatically.
ℹ️
Compute Nodes (Workers): RHCOS is strongly recommended. When you update OpenShift, the OS on the workers updates automatically via rpm-ostree for safe, transactional updates. You have some flexibility here, but RHCOS is the supported default.
4 Architecture & Helper Node
This setup uses a Helper Node as the backbone — it acts as a bridge between the external network and the internal cluster network using two network interfaces.
Network Interfaces
Interface | Zone | Subnet | Role
ens192 | External | 192.168.0.X | Front-end / internet-facing traffic and corporate LB
ens224 | Internal | 192.168.22.X | Back-end cluster communication and storage traffic
Infrastructure Services Provided by the Helper Node
DNS (BIND)
Resolves cluster hostnames and API endpoints
DHCP
Assigns static IPs to all cluster nodes
NAT Gateway
Routes internal node traffic through the helper
HAProxy
Load balances API and Ingress traffic
Apache Web Server
Hosts Ignition files for automated installation
NFS Server
Provides persistent storage for the registry
5 Cluster Node Roles
Temporary
Bootstrap Node
Used only during the initial installation to orchestrate Control Plane creation. Decommissioned once the control plane is healthy.
×3 Nodes — HA
Control Plane (Masters)
The "brains" of the cluster. Runs the API server, etcd database, and controllers. Three nodes ensure high availability.
Compute
Worker Nodes
Where your actual applications, containers, and pods run. CSRs must be manually approved in UPI mode.
6 Core Deployment Workflow
Phase I — Configuration & Manifest Generation
On the Bastion host, define the cluster in install-config.yaml, pointing to your local mirror registry. Generate Kubernetes manifests and convert them to Ignition configs (.ign files) that RHCOS nodes execute on first boot.
Phase II — Infrastructure Provisioning
Configure a high-availability load balancer (HAProxy/F5) for the API (port 6443), Machine Config Server (port 22623), and Ingress (ports 80/443). Host the generated .ign files on an internal HTTP server.
Phase III — Bootstrap Sequence
Boot the Bootstrap node → it pulls its config and initiates control plane creation. Boot Masters → they form the etcd quorum. Boot Workers → manually approve their CSRs to join the cluster.
Select 'Create Cluster' from the 'Clusters' navigation menu
Select 'Red Hat OpenShift Container Platform'
Select 'Run on Bare Metal'
Download the following files:
OpenShift Installer for Linux (openshift-install-linux.tar.gz)
Pull secret
Command Line Interface for Linux and your workstation's OS (openshift-client-linux.tar.gz)
Red Hat Enterprise Linux CoreOS (RHCOS)
rhcos-X.X.X-x86_64-metal.x86_64.raw.gz
rhcos-X.X.X-x86_64-installer.x86_64.iso (or rhcos-X.X.X-x86_64-live.x86_64.iso for newer versions)
Notes: Before powering on a single node, these must be ready:
1) Load Balancer:
Port 6443 (API): Points to Bootstrap + 3 Masters.
Port 22623 (Machine Config): Points to Bootstrap + 3 Masters.
Ports 80/443 (Apps): Points to all Worker nodes.
2) DNS:
api.<cluster>.<domain> -> LB VIP for 6443.
api-int.<cluster>.<domain> -> LB VIP for 6443/22623.
*.apps.<cluster>.<domain> -> LB VIP for 80/443.
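The DNS checklist above can be turned into a quick pre-flight helper. The sketch below only prints the names that must resolve; required_dns_records is a hypothetical function, and "test" is an arbitrary label used to probe the *.apps wildcard.

```bash
#!/usr/bin/env bash
# Print the DNS names that must resolve before any node is powered on.
# Arguments: cluster name and base domain (e.g. "lab" and "ocp.lan").
required_dns_records() {
  local cluster="$1" domain="$2"
  printf 'api.%s.%s\n'     "$cluster" "$domain"
  printf 'api-int.%s.%s\n' "$cluster" "$domain"
  # Any name under *.apps must resolve; "test" is an arbitrary probe label.
  printf 'test.apps.%s.%s\n' "$cluster" "$domain"
}
```

One way to use it is to feed each name to dig from the Helper Node, for example: required_dns_records lab ocp.lan | while read -r name; do dig +short "$name"; done. An empty answer for any name means the zone files need fixing before the installation starts.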
Step 1 — Install Client Tools
# Extract and install the OpenShift client tools
tar xvf openshift-client-linux.tar.gz
mv oc kubectl /usr/local/bin
# Verify installation
kubectl version
oc version
# Extract the OpenShift Installer
tar xvf openshift-install-linux.tar.gz
Step 2 — Configure Static IP for Internal NIC
Run nmtui-edit ens224 or edit /etc/sysconfig/network-scripts/ifcfg-ens224 with these values:
dnf install bind bind-utils -y
cp ~/ocp4-metal-install/dns/named.conf /etc/named.conf
cp -R ~/ocp4-metal-install/dns/zones /etc/named/
# Open firewall for DNS
firewall-cmd --add-port=53/udp --zone=internal --permanent
firewall-cmd --add-port=53/tcp --zone=internal --permanent # Required for OCP 4.9+
firewall-cmd --reload
# Enable and start BIND
systemctl enable named && systemctl start named && systemctl status named
Update the external NIC (ens192) to use 127.0.0.1 as its DNS server and enable "Ignore automatically obtained DNS parameters" via nmtui-edit ens192, then restart NetworkManager:
systemctl restart NetworkManager
# Verify DNS resolution
dig ocp.lan
dig -x 192.168.22.200 # Should resolve to ocp-bootstrap.lab.ocp.lan
Step 6 — Install & Configure DHCP
⚠️
Before copying the config, update ~/ocp4-metal-install/dhcpd.conf with the actual MAC addresses of each cluster machine.
# Allow HAProxy SELinux binding and start the service
setsebool -P haproxy_connect_any 1
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
Step 9 — Install & Configure NFS Server
Network File System (NFS) is a distributed file system protocol that allows a user on a client computer to access files over a network much like local storage is accessed. Originally developed by Sun Microsystems, it has become the standard for file sharing between Unix and Linux systems.
How NFS Works
NFS Server: Hosts the physical storage and "exports" (shares) specific directories to the network. It manages permissions and handles requests from clients.
NFS Client: Mounts the exported directory from the server onto its own local file system. To the user or application on the client side, the files appear to be stored locally.
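On the server side, an export is a single line in /etc/exports. The sketch below builds such a line; the share path /shares/registry used in the usage note is a hypothetical example, the subnet is taken from the internal network used in this guide, and nfs_export_line is an illustrative helper rather than part of any NFS tooling.

```bash
#!/usr/bin/env bash
# Build one /etc/exports entry: <directory> <allowed-clients>(<options>).
# rw          = clients may write
# sync        = reply only after data is committed to disk
# root_squash = map remote root to an unprivileged user
nfs_export_line() {
  local dir="$1" cidr="$2"
  printf '%s %s(rw,sync,root_squash,no_subtree_check)\n' "$dir" "$cidr"
}
```

For example, nfs_export_line /shares/registry 192.168.22.0/24 >> /etc/exports followed by exportfs -rv would publish the directory to the internal subnet only.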
To control whether workloads can run on Control Plane nodes, edit the scheduler manifest:
ls ~/ocp-install/manifests/cluster-scheduler-02-config.yml
# Set mastersSchedulable: true → allow workloads on masters
# Set mastersSchedulable: false → prevent workloads (default)
# Move the RHCOS metal image to the web server
mv ~/rhcos-X.X.X-x86_64-metal.x86_64.raw.gz /var/www/html/ocp4/rhcos
# Set correct SELinux context, ownership, and permissions
chcon -R -t httpd_sys_content_t /var/www/html/ocp4/
chown -R apache: /var/www/html/ocp4/
chmod 755 /var/www/html/ocp4/
# Confirm all files are accessible
curl localhost:8080/ocp4/
Step 12 — Boot Cluster Nodes
Boot each node type using the RHCOS ISO or PXE, passing the appropriate Ignition file URL via kernel arguments.
During boot, pass a kernel argument that tells each node where to download its Ignition file (its "brain"): coreos.inst.ignition_url=http:///bootstrap.ign
Order of Operations:
# Monitor bootstrap progress from the Helper Node
~/openshift-install --dir /var/www/html/ocp4/ wait-for bootstrap-complete --log-level=debug
Once bootstrapping completes, remove the Bootstrap node from HAProxy and shut it down:
# Remove ocp-bootstrap from /etc/haproxy/haproxy.cfg, then reload
systemctl reload haproxy
# Approve Worker CSRs so workers can join the cluster
oc get csr
oc adm certificate approve <csr-name>
# Verify all nodes are Ready
oc get nodes
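Approving CSRs one by one gets tedious when several workers join at once. A minimal sketch of a bulk approval follows; approve_pending_csrs is a hypothetical wrapper, and it assumes that every pending request in the cluster is expected and safe to approve.

```bash
#!/usr/bin/env bash
# Approve every CSR that has not been approved or denied yet.
approve_pending_csrs() {
  local pending
  # A CSR without a .status block is still pending.
  pending="$(oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}')"
  if [ -z "$pending" ]; then
    echo "no pending CSRs"
    return 0
  fi
  # Word splitting is intentional: one CSR name per argument.
  # shellcheck disable=SC2086
  oc adm certificate approve $pending
}
```

Each worker typically raises two requests in sequence (a client certificate, then a serving certificate), so the function usually has to be run more than once before oc get nodes shows the worker as Ready.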
Configure Storage: Define StorageClasses (NFS, OCS, or local storage) so applications can persist data. See the Configure Storage section below.
Set Up Identity Providers: Replace the temporary kubeadmin user with a permanent solution such as LDAP or OAuth. See the Identity Providers section below.
★ Configure Storage Post-Install
Once the cluster is healthy and all nodes are Ready, you must configure persistent storage. Without a working StorageClass, the internal image registry, monitoring stack, and most operators cannot persist data.
ℹ️
Why this matters immediately: The OpenShift internal image registry is set to Removed or EmptyDir by default after a UPI install. You must back it with persistent storage before pushing any images.
Option A — NFS StorageClass (Lab / Air-Gapped)
If you provisioned an NFS share on the Helper Node (Step 9), expose it as a dynamic StorageClass using the NFS Subdir External Provisioner. This is the fastest path for lab and air-gapped environments.
1. Configure storage for the Image Registry
If you check the cluster operators with oc get co, you will likely see the image-registry operator reporting AVAILABLE=False, or PROGRESSING=True without ever finishing, because it has no storage on which to deploy the registry pods.
Create the 'image-registry-storage' PVC by patching the registry configuration: set the management state to 'Managed' and add a 'pvc' entry with an empty 'claim' under the storage key. The exact oc patch command is listed under "Configure the Internal Image Registry" below.
After a short wait, the 'image-registry-storage' PVC should be in the 'Bound' state:
oc get pvc -n openshift-image-registry
Option B — OpenShift Data Foundation / ODF (Production)
ODF provides software-defined block, file, and object storage via Ceph running directly on your worker nodes. Minimum requirement: 3 worker nodes, each with at least one raw, unformatted additional disk.
oc create namespace openshift-storage
# Create OperatorGroup
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
  - openshift-storage
EOF
# Subscribe to ODF (adjust channel to match your OCP version)
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.14
  installPlanApproval: Automatic
  name: odf-operator
  source: redhat-operators   # Replace with mirrored CatalogSource in air-gapped
  sourceNamespace: openshift-marketplace
EOF
# Wait for all operator pods to be Running
oc get pods -n openshift-storage -w
3. Create the StorageCluster
cat <<EOF | oc apply -f -
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
  - name: ocs-deviceset
    count: 1          # 1 OSD per node x 3 nodes = 3 OSDs total
    replica: 3
    portable: false   # local PVs are bound to their node, so OSDs cannot move
    dataPVCTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 500Gi   # Size of each raw disk to claim
        volumeMode: Block
        storageClassName: localblock   # SC that presents raw block devices
EOF
# Monitor cluster initialisation (typically 5-15 minutes)
oc get storagecluster -n openshift-storage -w
4. StorageClasses created by ODF
StorageClass | Type | Access Mode | Best For
ocs-storagecluster-ceph-rbd | Block (Ceph RBD) | RWO | Databases (PostgreSQL, MongoDB), stateful apps
ocs-storagecluster-cephfs | File (CephFS) | RWX | Shared media folders, CMS uploads, ML pipelines
openshift-storage.noobaa.io | Object (S3 API) | S3 | Backups, AI/ML datasets, image registry
Option C — Local Storage Operator (LSO)
LSO presents raw node-local disks as PersistentVolumes without requiring a SAN or NFS server. It is commonly used as the backing layer for ODF.
# Install via Subscription
# channel: stable-4.14 | name: local-storage-operator
# After operator is Running, declare which disks to expose:
cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-disks
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - worker-0.lab.ocp.lan
        - worker-1.lab.ocp.lan
        - worker-2.lab.ocp.lan
  storageClassDevices:
  - storageClassName: localblock
    volumeMode: Block
    devicePaths:
    - /dev/sdb   # The second raw disk on each node
EOF
Configure the Internal Image Registry
After storage is ready, switch the registry from Removed to Managed and back it with a PVC:
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{
"spec": {
"managementState": "Managed",
"storage": {"pvc": {"claim": ""}},
"replicas": 1
}
}'
# The operator auto-creates a PVC; watch it bind
oc get pvc -n openshift-image-registry
# Confirm the registry pod is Running
oc get pods -n openshift-image-registry
ℹ️
Running more than 1 registry replica requires a ReadWriteMany (RWX) PVC, such as NFS or CephFS. For a single replica, ReadWriteOnce (RWO) is sufficient.
Set the Default StorageClass
# Mark one SC as the cluster default
oc patch storageclass nfs-client -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
# Remove the default annotation from any previously default SC
oc patch storageclass old-sc -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
# Verify
oc get storageclass
Enable Persistent Storage for the Monitoring Stack
Prometheus and Alertmanager use ephemeral storage by default. Configure persistence so metrics survive pod restarts:
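Persistence for the platform monitoring stack is configured through the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace. The sketch below renders such a ConfigMap; render_monitoring_config is a hypothetical helper, and the StorageClass name and volume sizes passed to it are assumptions to adapt to your environment.

```bash
#!/usr/bin/env bash
# Render a cluster-monitoring-config ConfigMap that gives Prometheus and
# Alertmanager persistent volumes.
# Arguments: StorageClass name, Prometheus volume size, Alertmanager volume size.
render_monitoring_config() {
  local sc="$1" prom_size="$2" am_size="$3"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      volumeClaimTemplate:
        spec:
          storageClassName: ${sc}
          resources:
            requests:
              storage: ${prom_size}
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: ${sc}
          resources:
            requests:
              storage: ${am_size}
EOF
}
```

A possible invocation is render_monitoring_config nfs-client 40Gi 10Gi | oc apply -f -, followed by oc get pvc -n openshift-monitoring to watch the claims bind. Note that upstream Prometheus documentation lists NFS as unsupported for its local storage, so an NFS-backed class is best kept to lab use; a block-backed class is the safer choice in production.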
★ Identity Providers
After installation the only user is the temporary kubeadmin. You must configure a permanent Identity Provider and then delete kubeadmin to enforce proper authentication and RBAC across the cluster.
⚠️
Do not delete kubeadmin until at least one other user has been granted cluster-admin privileges and you have confirmed that you can log in successfully as that user.
Option A — HTPasswd (Simplest / Lab)
HTPasswd stores usernames and bcrypt-hashed passwords in a flat file. Ideal for small teams and fully air-gapped labs where an external directory is not available.
1. Create the htpasswd file and Kubernetes Secret
dnf install httpd-tools -y
# -c creates a new file; omit -c when appending users
htpasswd -c -B -b /tmp/htpasswd admin RedHatAdmin1!
htpasswd -B -b /tmp/htpasswd developer DevPass123!
# Store the file as a Secret in openshift-config
oc create secret generic htpasswd-secret --from-file=htpasswd=/tmp/htpasswd -n openshift-config
2. Register the provider in the OAuth cluster object
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: htpasswd_provider
    mappingMethod: claim
    type: HTPasswd
    htpasswd:
      fileData:
        name: htpasswd-secret   # Must match the Secret name above
EOF
3. Grant cluster-admin and test login
# Allow ~30 s for oauth-server pods to restart, then:
oc adm policy add-cluster-role-to-user cluster-admin admin
oc login -u admin -p RedHatAdmin1! https://api.lab.ocp.lan:6443
oc whoami # Should return: admin
4. Adding or changing users later
# Pull the current file out of the Secret
oc extract secret/htpasswd-secret -n openshift-config --to=/tmp --confirm
# Modify it — add a user, change a password, etc.
htpasswd -B -b /tmp/htpasswd newuser NewPass456!
# Push the updated file back — oauth pods restart automatically
oc set data secret/htpasswd-secret --from-file=htpasswd=/tmp/htpasswd -n openshift-config
Option B — LDAP / Active Directory
Integrate OpenShift with an existing LDAP directory (Microsoft AD, Red Hat Directory Server, OpenLDAP). Authentication is delegated to the directory; no separate password management is needed inside OpenShift.
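A minimal sketch of an LDAP provider entry follows. render_ldap_oauth is a hypothetical helper; the Secret name ldap-bind-secret, the ConfigMap name ldap-ca, and the attribute mappings are assumptions. The Secret (key bindPassword) and the ConfigMap (key ca.crt) would have to exist in the openshift-config namespace, and Active Directory deployments usually map preferredUsername to sAMAccountName instead of uid.

```bash
#!/usr/bin/env bash
# Render an OAuth resource with a single LDAP identity provider.
# Arguments: LDAP search URL and the bind DN used for lookups.
render_ldap_oauth() {
  local url="$1" bind_dn="$2"
  cat <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: ldap_provider
    mappingMethod: claim
    type: LDAP
    ldap:
      url: "${url}"
      bindDN: "${bind_dn}"
      bindPassword:
        name: ldap-bind-secret   # assumed Secret in openshift-config
      ca:
        name: ldap-ca            # assumed ConfigMap in openshift-config
      insecure: false
      attributes:
        id: ["dn"]
        preferredUsername: ["uid"]
        name: ["cn"]
        email: ["mail"]
EOF
}
```

For example: render_ldap_oauth "ldaps://ldap.example.com/ou=users,dc=example,dc=com?uid" "cn=binduser,dc=example,dc=com" | oc apply -f -. Because spec.identityProviders is a list, applying a manifest that contains only one provider replaces any providers already configured; include the existing entries if they should be kept.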
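The manifest below configures a GitHub identity provider rather than LDAP. The client secret it references must exist as a Secret named github-client-secret in the openshift-config namespace.
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
  - name: github
    mappingMethod: claim
    type: GitHub
    github:
      clientID: "<your-github-client-id>"
      clientSecret:
        name: github-client-secret
      organizations:
      - my-github-org   # Restrict access to members of this org
EOF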
Assign Roles with RBAC
After users or groups are created, assign them the appropriate role. The most commonly used default cluster roles are:
Role | Scope | What It Allows
cluster-admin | Cluster-wide | Full, unrestricted access to every resource
cluster-reader | Cluster-wide | Read-only access to all resources
admin | Namespace | Full control within a specific project/namespace
edit | Namespace | Create, update, and delete most resources in a project
view | Namespace | Read-only access within a project
# Cluster-wide role assignments
oc adm policy add-cluster-role-to-user cluster-admin admin
oc adm policy add-cluster-role-to-group cluster-reader ops-team
# Namespace-scoped role assignments
oc adm policy add-role-to-user admin alice -n my-project
oc adm policy add-role-to-group edit dev-team -n my-project
oc adm policy add-role-to-user view bob -n my-project
# Verify what a user is allowed to do
oc auth can-i get pods --as=alice -n my-project
Delete the kubeadmin User
Once your permanent IdP is working and at least one user has cluster-admin, delete the kubeadmin Secret. This is a hard security requirement — the kubeadmin password is stored in etcd and must not remain permanently.
⚠️
This is irreversible. Confirm you can run oc get nodes as your new admin user before executing the delete command. You cannot recover kubeadmin without reinstalling the cluster.
# Log out of kubeadmin and verify your new admin works
oc logout
oc login -u admin -p RedHatAdmin1! https://api.lab.ocp.lan:6443
oc get nodes # Must return all nodes in Ready state
# Now it is safe to delete kubeadmin
oc delete secret kubeadmin -n kube-system
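The checks above can be folded into a guard so the delete only runs when it is safe. A minimal sketch, assuming the current oc session belongs to the new administrator; safe_delete_kubeadmin is a hypothetical wrapper, not an oc subcommand.

```bash
#!/usr/bin/env bash
# Delete the kubeadmin Secret only if the current session is a different
# user who holds cluster-wide administrative rights.
safe_delete_kubeadmin() {
  local me
  me="$(oc whoami)" || return 1
  # kubeadmin sessions identify themselves as "kube:admin".
  if [ "$me" = "kube:admin" ]; then
    echo "still logged in as kubeadmin; log in as the new admin first" >&2
    return 1
  fi
  # can-i exits non-zero when the answer is "no".
  if ! oc auth can-i '*' '*' --all-namespaces >/dev/null; then
    echo "user ${me} lacks cluster-wide admin rights; refusing to delete" >&2
    return 1
  fi
  oc delete secret kubeadmin -n kube-system
}
```

The function refuses to act in the two situations that would lock you out: running it while still logged in as kubeadmin, or running it as a user who cannot perform every verb on every resource.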
Verify the Complete Authentication Configuration
# List all configured identity providers
oc get oauth cluster -o jsonpath='{.spec.identityProviders[*].name}'
# List every user OpenShift knows about
oc get users
# List all identities (shows which provider created each entry)
oc get identity
# Check all cluster-admin bindings
oc get clusterrolebindings -o wide | grep cluster-admin
ℹ️
Multiple IdPs at once: You can list more than one provider in spec.identityProviders. Each entry needs a unique name field. Users authenticating via different providers are treated as separate identities unless you configure a lookup or add mapping method to merge them.
StorageClasses and OpenShift Data Foundation (ODF) Explained
After the cluster installation completes, the environment is "empty." You must now configure Persistent Storage for cluster operations.
OpenShift requires a StorageClass to fulfill Persistent Volume Claims (PVCs) dynamically. The best storage backend depends on the environment (NFS for simplicity, ODF/OCS for production-grade software-defined storage).
This section walks through the storage architecture of OpenShift, from the basic concepts of the Container Storage Interface (CSI) to advanced software-defined storage such as OpenShift Data Foundation (ODF).
1. The OpenShift Storage Hierarchy
To understand OpenShift storage, you must distinguish between the physical storage and the virtualized requests made by applications.
Persistent Volume (PV): The actual "disk" (network-attached or local) provisioned by the administrator.
Persistent Volume Claim (PVC): The request made by a developer for a certain amount of storage.
StorageClass (SC): The "template" that defines how a PV is created (e.g., fast SSD vs. slow HDD).
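To make the PV / PVC / StorageClass relationship concrete, the sketch below renders a claim that asks a named StorageClass for a volume. render_pvc is a hypothetical helper, and the claim name and size in the usage note are placeholders; the StorageClass name reuses one mentioned elsewhere in this guide.

```bash
#!/usr/bin/env bash
# Render a PersistentVolumeClaim manifest.
# Arguments: claim name, size, StorageClass name, access mode (default RWO).
render_pvc() {
  local name="$1" size="$2" sc="$3" mode="${4:-ReadWriteOnce}"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
  - ${mode}
  resources:
    requests:
      storage: ${size}
  storageClassName: ${sc}
EOF
}
```

For instance, render_pvc db-data 10Gi ocs-storagecluster-ceph-rbd | oc apply -n my-project -f - creates the claim; with dynamic provisioning the StorageClass then creates a matching PV and binds it, which oc get pvc -n my-project shows as STATUS Bound.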
2. Core Storage Types
OpenShift categorizes storage based on how many nodes can access it simultaneously.
A. Block Storage (RWO - ReadWriteOnce)
Behavior: Only one node can mount the volume at a time. It is highly performant and supports low-latency transactions.
B. File Storage (RWX - ReadWriteMany)
Technology: NFS, Azure Files, ODF CephFS.
Best For: Shared media folders, CMS uploads (WordPress), or data pipelines where multiple pods need to read/write the same files.
Behavior: Multiple nodes can mount the same volume simultaneously.
C. Object Storage
Technology: S3, MinIO, ODF NooBaa.
Best For: Backups, AI/ML datasets, and cloud-native applications.
Behavior: Accessed via API (HTTP/HTTPS) rather than a filesystem mount. It is virtually infinitely scalable.
3. Deep Dive: OpenShift Data Foundation (ODF)
Formerly known as OCS (OpenShift Container Storage), ODF is the "Gold Standard" for OpenShift storage. It is built on Ceph, Rook, and NooBaa.
Key Advantages:
Platform Agnostic: Whether you are on-premise (VMware/Bare Metal) or in the cloud (AWS/Azure), ODF provides the same StorageClasses.
Hyper-Converged: You don't need an external SAN. ODF uses the spare disks already inside your worker nodes.
Dynamic Provisioning: It automatically creates volumes as soon as a developer creates a PVC.
Resilience: By default, data is replicated across 3 different nodes. If one node fails, the data remains available.
ODF Component Breakdown:
Component | Function | Storage Type
Ceph RBD | High-performance block storage | Block (RWO)
CephFS | Shared filesystem storage | File (RWX)
NooBaa | Multi-cloud object gateway | Object (S3)
4. Hostpath and Local Storage
For edge cases or small-scale labs, you may encounter these:
HostPath: Uses a directory on the node’s local disk. Warning: If the pod moves to another node, the data stays behind and the pod loses access.
Local Storage Operator (LSO): A more robust way to use local NVMe/SSD disks. Unlike HostPath, LSO allows the scheduler to track which node "owns" the data.
5. Architectural Decision Matrix
As a Solution Architect, use this table to choose your storage backend:
Use Case | Recommended Storage | Access Mode
Database (Prod) | ODF RBD (Block) | RWO
Content Management | ODF CephFS or NFS | RWX
Machine Learning Models | ODF NooBaa (S3) | Object
Temporary Scratch Space | emptyDir | N/A (pod-local, no PVC)
Registry Storage | ODF CephFS | RWX
6. Pro-Tips for Production
Snapshotting: Ensure your storage provider supports CSI Snapshots for quick backups before application updates.
Expansion: Use a StorageClass with allowVolumeExpansion: true. This allows you to grow a disk without deleting the pod.
Quotas: In multi-tenant clusters, use ResourceQuota objects to cap the storage capacity and the number of PVCs each project can request. Quotas limit capacity, not IOPS or bandwidth; throughput limits depend on the storage backend.
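A per-project storage cap can be sketched with a ResourceQuota. render_storage_quota is a hypothetical helper, and the quota name storage-quota and the limits in the usage note are placeholders.

```bash
#!/usr/bin/env bash
# Render a ResourceQuota that caps total requested storage and PVC count.
# Arguments: namespace, total storage allowed, maximum number of PVCs.
render_storage_quota() {
  local ns="$1" total="$2" max_pvcs="$3"
  cat <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: ${ns}
spec:
  hard:
    requests.storage: ${total}
    persistentvolumeclaims: "${max_pvcs}"
EOF
}
```

Running render_storage_quota my-project 200Gi 10 | oc apply -f - would reject any new PVC in my-project once the sum of all claims exceeds 200Gi or the project already holds ten claims.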
YAML Configuration Details
The install-config.yaml Blueprint
This is the only file you create manually. It acts as the blueprint for the entire installation. Key fields to populate:
pullSecret — Authorizes nodes to pull OpenShift images from Red Hat registries.
sshKey — Allows SSH access into RHCOS nodes as the core user for troubleshooting.
networking — Defines cluster and service network CIDRs.
imageContentSources — Points to your local mirror registry (required for air-gapped installs).
How to Get the Pull Secret
Log in to the Red Hat OpenShift Cluster Manager at cloud.redhat.com/openshift.
Download the pull secret using the "Download pull secret" button.
Paste the entire single-line JSON string into your install-config.yaml inside single quotes.
How to Get the SSH Key
# Check for existing keys
ls ~/.ssh/id_rsa.pub || ls ~/.ssh/id_ed25519.pub
# Generate a new key pair (if needed)
ssh-keygen -t ed25519 -f ~/.ssh/id_ocp -C "admin@ocp-cluster"
# Output the public key to copy into install-config.yaml
cat ~/.ssh/id_ocp.pub
OpenShift Client vs. Installer — Quick Reference
Feature | OpenShift Client (oc) | OpenShift Installer
Filename | openshift-client-linux.tar.gz | openshift-install-linux.tar.gz
Primary Goal | Managing an existing cluster | Creating or destroying a cluster
Main Binary | oc (and kubectl) | openshift-install
Usage Period | Daily, for the life of the cluster | Primarily during Day 1 setup
Capabilities | Deploy apps, check logs, manage users | Provision VMs, generate Ignition files
Helper Node Interface Roles
Interface | Typical Role | Description
ens192 | External / Public | Front-end traffic: connects to the internet or corporate load balancer to serve applications.
Building an OpenShift Cluster On-Premises: Installation Methods
Building an OpenShift cluster on-premises requires shifting from the "push-button" automation of public clouds to a more hands-on infrastructure management approach. In 2026, the process is largely standardized through Red Hat's Assisted Installer or Agent-based methods.
1. Assisted Installer
A user-friendly web interface (hosted at https://console.redhat.com) that generates a discovery ISO. You boot your on-prem servers with this ISO, and they "call home" to the web console, allowing you to configure the cluster graphically.
2. IPI (Installer-Provisioned Infrastructure)
Full automation. The installer has API access to your infrastructure (like VMware vSphere or OpenStack) and creates the VMs, storage, and networking for you.
3. UPI (User-Provisioned Infrastructure)
Maximum control. You manually prepare the VMs, load balancers, and DNS. This is typical for Bare Metal or highly restricted "Air-Gapped" environments.
Minimum Cluster Hardware (Production Grade)
Node Type | CPU | RAM | Disk
Control Plane (3x) | 4 vCPU | 16 GB | 120 GB (SSD preferred)
Compute/Worker (2x+) | 4 vCPU | 16 GB | 120 GB
Bootstrap (1x) | 4 vCPU | 16 GB | 120 GB (Deleted after install)
Do We Need the Bootstrap Node to Add a New Control Plane or Worker Node?
No. Once the cluster is up and running (Day 2 operations), the Control Plane (Masters) takes over all management tasks.
1. Adding a New Worker Node
When you boot a new Worker node with its Ignition file, it communicates directly with the API Server on the Master nodes.
CSR Approval: You will need to approve the node's Certificate Signing Requests (CSRs) using oc get csr and oc adm certificate approve {name}.
2. Adding a New Control Plane (Master) Node
OpenShift clusters are typically designed with an odd number of Control Plane nodes (usually 3) to maintain etcd quorum. If you want to move from 3 to 5 Masters, you add them to the existing, healthy cluster. The new Master joins the existing etcd cluster managed by the current Masters.
Key Considerations for UPI
Ignition Expiry
Ignition files contain certificates valid for 24 hours. If you don't finish the install by then, you must regenerate them.
Disk Cleanup
If an install fails, you must wipe the disks of the nodes before retrying. RHCOS will not overwrite an existing partition table automatically.
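The 24-hour Ignition window noted above can be checked before booting nodes. The sketch below uses the file's modification time as a proxy for when the configs were generated, which is an assumption: a plain copy to the web server resets that timestamp, so point it at the original file in the install directory. ignition_is_stale is a hypothetical helper and relies on GNU stat.

```bash
#!/usr/bin/env bash
# Report the age of an Ignition file and succeed (exit 0) if it is 24 hours
# old or more, meaning the configs should be regenerated.
# Arguments: path to the .ign file, optional "now" as epoch seconds.
ignition_is_stale() {
  local file="$1" now="${2:-$(date +%s)}"
  local mtime age_hours
  mtime="$(stat -c %Y "$file")" || return 2
  age_hours=$(( (now - mtime) / 3600 ))
  echo "${file}: ${age_hours}h old"
  # Exit status 0 means "stale, regenerate".
  (( age_hours >= 24 ))
}
```

A typical guard would be: if ignition_is_stale ~/ocp-install/bootstrap.ign; then echo "regenerate the Ignition configs"; fi.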
How to Clean the Disk After a Failed Installation
Wiping the disks is a critical step because if RHCOS detects an existing ignition configuration or a partition table, it may fail to apply the new configuration, leading to a "zombie" node state.
RHCOS uses Ignition, which runs in the initramfs stage. If Ignition sees a partition labeled boot or root already on the disk, it might assume the installation was already completed and skip critical configuration steps.
Pro-Tip: If you are debugging a failed bootstrap, always wipe the Bootstrap node first. It is the source of truth for the rest of the cluster. If the Bootstrap node has old data, it will feed incorrect information to the Master nodes.
The "Live ISO" Method (Easiest for Manual Labs)
Boot the node using the RHCOS Live ISO.
Once you reach the prompt (or press CTRL+ALT+F2 to get a console), identify your disk (Usually, it is /dev/sda or /dev/nvme0n1):
lsblk
Then wipe the filesystem and partition-table signatures from that disk:
sudo wipefs -a /dev/sda
Reboot the node and start the installation again.
How Nodes Know Which Images to Pull from the Mirror Registry
When you run coreos-installer with the -u flag, the node downloads the raw RHCOS operating system image from your local web server — this is just the base OS with no OpenShift components. After first boot, the node needs to pull dozens of OpenShift container images (API server, etcd, operators, etc.) from a registry. In an air-gapped environment, two fields in install-config.yaml work together to make this seamless.
1. imageContentSources (Mirror Redirect Rules)
This field tells every node: "whenever you need an image from quay.io or registry.redhat.io, silently redirect that request to my local mirror instead." The node never needs to know it's in a disconnected environment — it requests images by their original Red Hat names and OpenShift handles the redirect automatically.
2. additionalTrustBundle (Internal CA Certificate)
Your local mirror registry uses a self-signed or internally-issued TLS certificate. Without this field, nodes would reject connections to it as untrusted. The additionalTrustBundle injects your internal CA certificate into every node's trust store so HTTPS connections to the mirror registry are accepted without error.
Below is a fully annotated install-config.yaml covering every field you need for a UPI air-gapped deployment. Every line is commented so you know exactly what it controls and why it exists.
⚠️
The installer consumes and deletes this file. Always keep a backup copy before running openshift-install create manifests. Once deleted, you cannot recover it from the generated output.
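A small wrapper makes the backup hard to forget. This is a sketch that assumes openshift-install is on the PATH; the function name and both path arguments are hypothetical and should be adapted to your layout.

```shell
# Copy install-config.yaml to a safe place, then generate the manifests.
# The installer removes install-config.yaml from the directory it is given.
create_manifests_with_backup() {
  install_dir=$1    # directory that holds install-config.yaml
  backup_file=$2    # where to keep the copy (outside install_dir)
  # Stop if the copy fails, so the only copy is never consumed.
  cp -p "$install_dir/install-config.yaml" "$backup_file" || return 1
  openshift-install create manifests --dir "$install_dir"
}
```

For example: create_manifests_with_backup ~/ocp-install ~/install-config.yaml.bak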
# ─────────────────────────────────────────────────────────────────
# API VERSION
# Must always be v1. This is the only supported version.
# ─────────────────────────────────────────────────────────────────
apiVersion: v1
# ─────────────────────────────────────────────────────────────────
# BASE DOMAIN
# The parent DNS domain for your cluster.
# The cluster name below is prepended to form the full domain:
# <clusterName>.<baseDomain> → lab.ocp.lan
# Your DNS must have records for:
# api.lab.ocp.lan → Load Balancer IP (port 6443)
# api-int.lab.ocp.lan → Load Balancer IP (ports 6443 and 22623)
# *.apps.lab.ocp.lan → Ingress Load Balancer IP
# ─────────────────────────────────────────────────────────────────
baseDomain: ocp.lan
# ─────────────────────────────────────────────────────────────────
# CLUSTER NAME
# Short name for this cluster. Combined with baseDomain above.
# Used in all internal DNS names and TLS certificates.
# ─────────────────────────────────────────────────────────────────
metadata:
  name: lab  # Cluster name
# ─────────────────────────────────────────────────────────────────
# COMPUTE (WORKER) NODES
# Defines the default worker MachineSet.
# In UPI mode, the installer does NOT create machines automatically.
# Set replicas: 0 — you will boot workers manually.
# hyperthreading: Enabled is the default and recommended setting.
# ─────────────────────────────────────────────────────────────────
compute:
- name: worker
  replicas: 0          # Must be 0 for UPI; you provision workers manually
  hyperthreading: Enabled
  architecture: amd64  # Use arm64 for ARM-based nodes
# ─────────────────────────────────────────────────────────────────
# CONTROL PLANE (MASTER) NODES
# Always set replicas: 3 for a production HA cluster.
# A single master (replicas: 1) is the single-node OpenShift topology
# and provides no high availability.
# ─────────────────────────────────────────────────────────────────
controlPlane:
  name: master
  replicas: 3          # 3 = HA. Never use 2: losing either member loses quorum.
  hyperthreading: Enabled
  architecture: amd64
# ─────────────────────────────────────────────────────────────────
# NETWORKING
# Defines the internal IP address ranges used inside the cluster.
# These are virtual ranges — they do NOT need to exist on your
# physical network. They must not overlap with your node IPs.
#
# networkType: OVNKubernetes is the current default and recommended.
# OpenShiftSDN is deprecated as of OCP 4.14 and cannot be selected
# for new installations from OCP 4.15 onward.
#
# clusterNetwork: The CIDR for pod IP addresses.
# hostPrefix: /23 means each node gets a /23 subnet (~510 pod IPs).
#
# serviceNetwork: The CIDR for Kubernetes Service (ClusterIP) objects.
# Must be a single entry. /16 gives 65,534 service IPs.
#
# machineNetwork: The CIDR of your physical node network.
# Must match the real subnet your nodes are on (192.168.22.0/24).
# ─────────────────────────────────────────────────────────────────
networking:
  networkType: OVNKubernetes
  clusterNetwork:
  - cidr: 10.128.0.0/14    # Pod IP range across the cluster
    hostPrefix: 23         # Subnet size allocated per node
  serviceNetwork:
  - 172.30.0.0/16          # Kubernetes service (ClusterIP) range
  machineNetwork:
  - cidr: 192.168.22.0/24  # Must match your physical node subnet
# ─────────────────────────────────────────────────────────────────
# PLATFORM
# Set to "none" for UPI — tells the installer not to create any
# cloud or virtualization resources automatically.
# ─────────────────────────────────────────────────────────────────
platform:
  none: {}
# ─────────────────────────────────────────────────────────────────
# FIPS MODE (Optional)
# Enables FIPS 140-2/3 validated cryptographic modules.
# Required for US federal / DoD environments.
# Cannot be changed after installation.
# ─────────────────────────────────────────────────────────────────
fips: false
# ─────────────────────────────────────────────────────────────────
# PUBLISH STRATEGY
# Controls how the user-facing endpoints (API, routes) are published.
# External: endpoints are published publicly (default).
# Internal: endpoints are private; supported only on cloud platforms.
# With platform "none" the installer rejects Internal, so keep
# External here. In UPI, who can reach the API is decided by your
# own load balancer, DNS, and firewall rules.
# ─────────────────────────────────────────────────────────────────
publish: External
# ─────────────────────────────────────────────────────────────────
# PULL SECRET
# Authenticates nodes to pull container images from:
# - registry.redhat.io (Red Hat operator images)
# - quay.io (OpenShift release images)
# - your local mirror (air-gapped environments)
#
# For air-gapped installs, add your mirror registry credentials
# into this JSON alongside the Red Hat entries.
#
# Get from: https://console.redhat.com/openshift/install/pull-secret
# Must be a single-line JSON string inside single quotes.
# ─────────────────────────────────────────────────────────────────
pullSecret: '{"auths":{"registry.redhat.io":{"auth":"<base64-encoded-credentials>"},"quay.io":{"auth":"<base64-encoded-credentials>"},"mirror-registry.ocp.lan:8443":{"auth":"<base64-encoded-mirror-credentials>"}}}'
# ─────────────────────────────────────────────────────────────────
# SSH KEY
# Your SSH public key, injected into every RHCOS node.
# Allows SSH access as the built-in "core" user for troubleshooting.
# Only the PUBLIC key goes here — never the private key.
# Generate with: ssh-keygen -t ed25519 -f ~/.ssh/id_ocp
# ─────────────────────────────────────────────────────────────────
sshKey: 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... admin@ocp-cluster'
# ─────────────────────────────────────────────────────────────────
# ADDITIONAL TRUST BUNDLE
# Your internal CA certificate in PEM format.
# Required when your mirror registry uses a self-signed or
# internally-issued TLS certificate.
# Injected into every node's system trust store on first boot.
# Must be indented under the key with 2 spaces.
# To obtain the certificate that the registry presents on port 8443,
# run the command below and paste the result into additionalTrustBundle:
#   openssl s_client -showcerts -connect mirror-registry.ocp.lan:8443 </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > registry-ca.crt
# Note: the | after the key is a YAML block scalar indicator. It is
# required here because it preserves the line breaks of the PEM content.
# ─────────────────────────────────────────────────────────────────
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  MIIFazCCA1OgAwIBAgIUYourInternalCAcertificateHere...
  <full PEM certificate content>
  -----END CERTIFICATE-----
# ─────────────────────────────────────────────────────────────────
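# Skipping certificate trust is not an install-config.yaml option:
# this file has no insecureRegistries field. That setting exists only
# on the cluster Image config after installation
# (image.config.openshift.io/cluster, spec.registrySources.insecureRegistries).
# For the installation itself, supply the CA in additionalTrustBundle.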
# ─────────────────────────────────────────────────────────────────
# IMAGE CONTENT SOURCES (renamed imageDigestSources in OCP 4.14+)
# Tells every node to redirect image pulls from Red Hat registries
# to your local mirror registry instead.
# The "source" is the original Red Hat registry path.
# The "mirrors" list is where to redirect requests.
# The installer bakes these rules into the .ign files and creates
# an ImageContentSourcePolicy object in the cluster on first boot.
# ─────────────────────────────────────────────────────────────────
imageContentSources:
- mirrors:
  - mirror-registry.ocp.lan:8443/openshift/release
  source: quay.io/openshift-release-dev/ocp-release
- mirrors:
  - mirror-registry.ocp.lan:8443/openshift/release
  source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
- mirrors:
  - mirror-registry.ocp.lan:8443/redhat
  source: registry.redhat.io/redhat
- mirrors:
  - mirror-registry.ocp.lan:8443/ubi8
  source: registry.redhat.io/ubi8
# ─────────────────────────────────────────────────────────────────
# PROXY (Optional)
# Only needed if your nodes reach the mirror registry through
# an HTTP/HTTPS proxy. Leave out entirely if no proxy is used.
# noProxy: comma-separated list of hosts/CIDRs to bypass the proxy.
# ─────────────────────────────────────────────────────────────────
# proxy:
#   httpProxy: http://proxy.example.com:3128
#   httpsProxy: http://proxy.example.com:3128
#   noProxy: 192.168.22.0/24,mirror-registry.ocp.lan,.ocp.lan
# ─────────────────────────────────────────────────────────────────
# CLUSTER CAPABILITIES (Optional — OCP 4.12+)
# Controls which optional cluster components get installed.
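# Use to reduce footprint in resource-constrained environments.
# "vCurrent" installs all capabilities for your OCP version, so the
# additionalEnabledCapabilities list only matters when the baseline
# is narrower (for example None or an older version set).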
# ─────────────────────────────────────────────────────────────────
# capabilities:
#   baselineCapabilitySet: vCurrent
#   additionalEnabledCapabilities:
#   - marketplace
#   - openshift-samples
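The clusterNetwork values in the file above determine how many nodes the pod network can hold and how many pod addresses each node receives. The sketch below works through that arithmetic for cidr 10.128.0.0/14 with hostPrefix 23. The function name pod_network_capacity is illustrative, and the per-node figure subtracts only the network and broadcast addresses, so the usable count in a real cluster may be slightly lower.

```shell
# node subnets     = 2^(hostPrefix - clusterPrefix)
# pod IPs per node = 2^(32 - hostPrefix) - 2
pod_network_capacity() {
  cluster_prefix=$1   # prefix length of clusterNetwork.cidr, e.g. 14
  host_prefix=$2      # clusterNetwork.hostPrefix, e.g. 23
  node_subnets=$(( 1 << (host_prefix - cluster_prefix) ))
  pod_ips_per_node=$(( (1 << (32 - host_prefix)) - 2 ))
  echo "node_subnets=$node_subnets pod_ips_per_node=$pod_ips_per_node"
}

pod_network_capacity 14 23
# node_subnets=512 pod_ips_per_node=510
```

A larger hostPrefix (for example 24) doubles the number of node subnets but halves the addresses available on each node, so size it against the pod density you expect per node.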