Thursday, April 30, 2026

OpenShift & Containers — Complete Guide

Unified Study Guide · DO188 · DO180 · DO288 · DO280

All key concepts, bash commands, and architectures extracted from four official Red Hat student workbooks and organized into one progressive knowledge reference.

DO188 · Intro to Containers with Podman (4.12)
DO180 · OpenShift Admin I: Managing Containers & K8s (4.12)
DO288 · Dev I: Containerizing Applications (4.2)
DO280 · Admin II: Configuring a Production Cluster (4.14)

Chapter 1 · DO188

Container Fundamentals

A container is an encapsulated process that includes all required runtime dependencies. Unlike a virtual machine, a container shares the host kernel but isolates its filesystem, network, and process tree using Linux kernel primitives.

🧱

Namespaces

Kernel feature that isolates processes — each container gets its own PID, network, mount, UTS, IPC, and user namespaces, providing the illusion of a dedicated machine.

⚙️

Control Groups (cgroups)

Kernel mechanism for resource management. Limits and tracks CPU time, memory, disk I/O, and network bandwidth allocated to containers.

📦

Container Image

An immutable, layered archive defining an application and its libraries. Read-only layers are stacked via a union filesystem; a writable layer is added at runtime.

🏃

Container Instance

A running process created from a container image. Analogous to an object instantiated from a class. Many instances can run from a single image simultaneously.

📋

OCI Standard

The Open Container Initiative defines image-spec and runtime-spec so any compliant engine (Podman, Docker, CRI-O) can run the same images interchangeably.

🔄

Ephemeral by Default

Container engines remove the writable layer when a container is deleted. Any data written inside a container is lost unless explicitly persisted via a volume or bind mount.
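
The namespace isolation described in the cards above can be observed on any Linux host, because the kernel exposes each process's namespaces under /proc. The sketch below is only an illustration: `list_namespaces` is a hypothetical helper name, and it assumes a Linux system with /proc mounted.

```shell
#!/bin/sh
# list_namespaces is a hypothetical helper (not a Podman command).
# It prints the namespaces a process belongs to by reading /proc/<pid>/ns.
list_namespaces() (
  pid=${1:-self}
  if [ ! -d "/proc/$pid/ns" ]; then
    echo "no /proc/$pid/ns on this system (Linux only)"
    exit 0
  fi
  for ns in "/proc/$pid/ns"/*; do
    # Each entry is a symlink whose target names the namespace type and an
    # inode number; processes sharing a namespace show the same inode.
    printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
  done
)

# Show the namespaces of the current shell
list_namespaces self
```

Running the same helper against the PID of a containerized process should show different inode numbers for the namespaces the container does not share with the host.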

Containers vs. Virtual Machines

Attribute	Virtual Machine	Container
Machine-level component	Hypervisor (KVM, VMware, Hyper-V)	Container engine (Podman, CRI-O)
Virtualization level	Fully virtualized environment + own kernel	Shared host kernel; isolated user-space only
Typical size	Gigabytes	Megabytes
Startup time	Minutes	Milliseconds to seconds
Portability	Usually tied to same hypervisor	Any OCI-compliant engine
Best for	Full OS isolation, non-Linux workloads	Microservices, scale-out applications

Chapter 2 · DO188

Podman — Container Engine

Podman (Pod Manager) is a daemonless, rootless-capable OCI container engine from Red Hat. Unlike Docker, it does not require a background daemon — each podman invocation runs as a regular process, reducing attack surface.

Checking the Installation

verify podman

    $ podman -v
    # Output: podman version 4.x.x
    $ podman info
    # Shows host OS, kernel, storage driver, registry configuration

Running Your First Container

run containers

    # Run a one-shot command inside a RHEL container (image is pulled automatically if not local)
    $ podman run registry.redhat.io/rhel7/rhel:7.9 echo 'Red Hat'
    # Run interactively with a bash shell
    $ podman run -it registry.redhat.io/rhel7/rhel:7.9 /bin/bash
    # Run detached (background) with port mapping host:container
    $ podman run -d -p 8080:8080 registry.access.redhat.com/ubi8/httpd-24:latest
    # Run with auto-remove on exit, environment variable, and custom name
    $ podman run --rm --name myapp -e NAME='Red Hat' registry.redhat.io/rhel7/rhel:7.9 printenv NAME
    # Bind only to localhost (prevents external access)
    $ podman run -p 127.0.0.1:8075:80 my-app

Managing Container Lifecycle

Command	Description
`podman ps`	List running containers
`podman ps --all`	List all containers including stopped ones
`podman ps --all --format=json`	Output container list as JSON
`podman stop <name|id>`	Send SIGTERM, then SIGKILL after timeout (default 10 s)
`podman stop -t 30 <name>`	Graceful stop with a custom 30-second timeout
`podman kill <name>`	Send SIGKILL immediately
`podman start <name>`	Start a stopped container
`podman restart <name>`	Stop then start a container
`podman rm <name|id>`	Remove a stopped container
`podman rm -f <name>`	Force-remove a running container
`podman pause / unpause <name>`	Freeze / resume all processes in a container using cgroup freezer

Inspecting and Interacting with Containers

inspect & exec

    # Run a command in a running container
    $ podman exec -it myapp /bin/bash
    $ podman exec myapp cat /etc/hostname
    # Get full JSON metadata (IP, mounts, env vars, etc.)
    $ podman inspect myapp
    # Extract a specific field using Go template
    $ podman inspect myapp -f '{{.NetworkSettings.Networks.apps.IPAddress}}'
    # Copy files between host and container
    $ podman cp myapp:/etc/hosts /tmp/hosts
    $ podman cp /tmp/config.yaml myapp:/app/config.yaml
    # Show port mappings for a container
    $ podman port myapp
    $ podman port --all
    # Tail container logs (follow mode)
    $ podman logs -f myapp
    $ podman logs --tail 50 myapp

Rootless Containers

Rootless Podman runs containers without root privileges. The user namespace maps the container's internal root (UID 0) to the current unprivileged host user. This limits the blast radius of container escapes.

rootless podman

    # Run a container as your unprivileged user (no sudo needed);
    # --userns=auto asks Podman to allocate a dedicated user namespace,
    # and -it keeps the bash shell attached to your terminal
    $ podman run --rm -it --userns=auto registry.access.redhat.com/ubi8/ubi-minimal bash
    # View the user namespace mapping
    $ podman unshare cat /proc/self/uid_map
    # Storage location for rootless images
    # Default: ~/.local/share/containers/storage
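
To make the UID mapping concrete, the sketch below translates a container UID into the host UID it corresponds to. It reads lines in the three-column layout of /proc/self/uid_map (start inside the namespace, start on the host, range length). `map_container_uid` is a hypothetical helper, and the example mapping (host user 1000, subordinate range starting at 100000) is an assumed illustration, not output captured from a real system.

```shell
#!/bin/sh
# map_container_uid is a hypothetical helper: translate a container UID to
# the host UID using uid_map-style lines "<inside-start> <outside-start> <length>".
map_container_uid() (
  uid=$1
  map=$2
  echo "$map" | while read -r inside outside length; do
    [ -n "$length" ] || continue
    if [ "$uid" -ge "$inside" ] && [ "$uid" -lt $((inside + length)) ]; then
      echo $((outside + uid - inside))
      break
    fi
  done
)

# Assumed example mapping (illustrative values only)
example_map='0 1000 1
1 100000 65536'

map_container_uid 0 "$example_map"      # container root maps to host UID 1000
map_container_uid 1001 "$example_map"   # 100000 + (1001 - 1) = 101000
```

Under this mapping, a process that escapes the container as "root" only holds the privileges of host UID 1000.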

Chapter 3 · DO188 / DO288

Container Images & Registries

A container image is a read-only, layered archive (OCI image-spec). Each layer is a diff of filesystem changes. Layers are cached and shared between images to save disk space.

Image Naming Convention

    [registry/][namespace/]image-name[:tag][@digest]

    Examples:
    registry.redhat.io/rhel9/rhel:9.2          ← Red Hat Registry, versioned tag
    registry.access.redhat.com/ubi8:latest     ← UBI image, floating tag
    quay.io/myorg/myapp:v2.1.3                 ← Quay.io, semantic version
    myapp@sha256:a1b2c3...                     ← Pinned by digest (immutable)
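
The naming convention above can be taken apart with plain shell parameter expansion. This is a minimal sketch: `parse_image_ref` is a hypothetical helper, and the rule it uses to spot a registry (first path component contains a dot or a colon, or is `localhost`) is a common heuristic assumed here, not something Podman exposes as an API.

```shell
#!/bin/sh
# parse_image_ref is a hypothetical helper that splits an image reference
# of the form [registry/][namespace/]image-name[:tag][@digest].
parse_image_ref() (
  ref=$1
  registry="" tag="" digest=""
  # Everything after "@" is the digest
  case $ref in
    *@*) digest=${ref##*@}; ref=${ref%@*} ;;
  esac
  # A tag is a colon after the last slash (a colon before it is a registry port)
  last=${ref##*/}
  case $last in
    *:*) tag=${last##*:}; ref=${ref%:*} ;;
  esac
  # Heuristic (assumption): the first component is a registry if it looks like a host
  case $ref in
    */*)
      first=${ref%%/*}
      case $first in
        *.*|*:*|localhost) registry=$first; ref=${ref#*/} ;;
      esac
      ;;
  esac
  printf 'registry=%s repository=%s tag=%s digest=%s\n' "$registry" "$ref" "$tag" "$digest"
)

parse_image_ref quay.io/myorg/myapp:v2.1.3
parse_image_ref myapp@sha256:a1b2c3
```

The repository field holds the namespace and image name together, since the registry is the only part that can be told apart reliably from the string alone.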

Pulling, Listing, Tagging, Removing Images

image management

    # Pull an image from a registry
    $ podman pull registry.redhat.io/rhel7/rhel:7.9
    # List all local images
    $ podman images
    # Tag a local image (does not copy data, creates an alias)
    $ podman tag myapp:latest quay.io/myorg/myapp:v2.1.3
    # Push image to a remote registry
    $ podman push quay.io/myorg/myapp:v2.1.3
    # Inspect image metadata (layers, env vars, entrypoint, etc.)
    $ podman inspect registry.redhat.io/ubi8/ubi-minimal:latest
    # Remove a local image
    $ podman rmi myapp:latest
    # Remove dangling (untagged) images; add -a to remove all unused images
    $ podman image prune
    # Search for images in registries
    $ podman search ubi8

Registry Authentication

registry auth

    # Log in to the Red Hat Registry (prompts for credentials)
    $ podman login registry.redhat.io
    # Log in to Quay.io
    $ podman login quay.io
    # Log in to the internal OpenShift registry using oc token
    $ podman login -u $(oc whoami) -p $(oc whoami -t) \
        default-route-openshift-image-registry.apps.ocp4.example.com
    # Log out of a specific registry
    $ podman logout quay.io
    # Log out of all registries
    $ podman logout --all

Skopeo — Registry Manipulation Without Pulling

Skopeo works with container images at the registry level, without needing a running container engine. It inspects, copies, and deletes images across registries efficiently.

skopeo

    # Inspect image metadata without downloading it
    $ skopeo inspect docker://registry.redhat.io/ubi8/ubi-minimal:latest
    # Inspect with credentials
    $ skopeo inspect --creds user:password \
        docker://registry.redhat.io/rhscl/postgresql-96-rhel7
    # Copy image between registries (no intermediate local storage needed)
    $ skopeo copy --dest-tls-verify=false \
        docker://registry.redhat.io/ubi8/ubi-minimal:latest \
        docker://registry.example.com/myorg/ubi-minimal:latest
    # Copy from local storage to remote registry
    $ skopeo copy --dest-tls-verify=false \
        containers-storage:myimage \
        docker://registry.example.com/myorg/myimage
    # Copy between two private registries with different credentials
    $ skopeo copy --src-creds=user1:pass1 --dest-creds=user2:pass2 \
        docker://src-registry.example.com/myimage \
        docker://dest-registry.example.com/myimage
    # Delete an image from a registry
    $ skopeo delete docker://registry.example.com/myorg/old-image:tag

Chapter 4 · DO188 / DO288

Building Custom Images — Containerfile

A Containerfile (compatible with Dockerfile syntax) is a text recipe for building a container image. Instructions that change the filesystem (RUN, COPY, ADD) each create a new read-only layer; the remaining instructions record image metadata.

Essential Containerfile Instructions

Instruction	Purpose	Example
`FROM`	Base image to build upon. Every Containerfile starts here.	`FROM ubi8/ubi-minimal:8.8`
`RUN`	Execute a shell command during build (creates a layer).	`RUN dnf install -y python3 && dnf clean all`
`COPY`	Copy files from build context into the image.	`COPY app.py /app/app.py`
`ADD`	Like COPY but also handles URLs and auto-extracts tar archives.	`ADD app.tar.gz /app/`
`WORKDIR`	Set working directory for subsequent instructions.	`WORKDIR /app`
`ENV`	Set environment variables available at build and runtime.	`ENV PORT=8080 DEBUG=false`
`ARG`	Build-time variable (not available at runtime unless set in ENV too).	`ARG VERSION=1.0`
`EXPOSE`	Documents which port the container listens on (informational only).	`EXPOSE 8080`
`USER`	Switch to a non-root user for all subsequent instructions and the final container.	`USER 1001`
`ENTRYPOINT`	Main command that always runs (use exec form JSON array).	`ENTRYPOINT ["python3", "-m", "http.server"]`
`CMD`	Default arguments for ENTRYPOINT, or the default command if no ENTRYPOINT.	`CMD ["8080"]`
`LABEL`	Attach key-value metadata to the image.	`LABEL version="1.0" maintainer="team@example.com"`
`VOLUME`	Declare a mount point for persistent storage.	`VOLUME /data`
`HEALTHCHECK`	Define a command to probe container health.	`HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1`

Multi-Stage Build Example

Multi-stage builds produce small production images by separating the build environment from the runtime environment.

CONTAINERFILE — multi-stage Go application

    # ── Stage 1: Build ──────────────────────────────────────
    FROM registry.access.redhat.com/ubi8/go-toolset:1.17 AS build
    WORKDIR /app
    COPY go.mod go.sum ./
    RUN go mod download
    COPY . .
    RUN go build -o myapp .

    # ── Stage 2: Runtime (only the binary, no build tools) ──
    FROM registry.access.redhat.com/ubi8/ubi-minimal:latest
    WORKDIR /app
    # Copy only the compiled binary from the build stage
    COPY --from=build /app/myapp .
    EXPOSE 8080
    USER 1001
    ENTRYPOINT ["./myapp"]

Building Images with Podman

podman build

    # Build from current directory Containerfile
    $ podman build -t myapp:latest .
    # Build from a specific Containerfile in a different context
    $ podman build -f Containerfile.prod -t myapp:prod /path/to/context
    # Pass build arguments
    $ podman build --build-arg VERSION=2.0 -t myapp:2.0 .
    # Squash the layers added by this build into one
    # (use --squash-all to also squash the base image layers)
    $ podman build --squash -t myapp:squashed .
    # Multi-platform build (collects the per-architecture images in a manifest list)
    $ podman build --platform linux/amd64,linux/arm64 --manifest myapp:multi .
    # View image history (layers)
    $ podman history myapp:latest

OpenShift Note

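Under the default restricted SCC, OpenShift runs containers as an arbitrary non-root UID taken from the project's assigned range, regardless of the USER set in the image. Declare a non-root user such as USER 1001 as the last USER instruction, and make the files the application writes accessible to group 0 so that the arbitrary UID can use them. Pods whose containers must run as UID 0 are rejected unless their service account is granted an SCC that allows it, such as anyuid.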

Chapter 5 · DO188

Persisting Data — Volumes & Bind Mounts

Containers are ephemeral; data written inside is lost on removal. There are two main mechanisms to persist data outside the container lifecycle.

Podman Named Volumes

Named volumes are managed by Podman, stored under ~/.local/share/containers/storage/volumes/ (rootless) or /var/lib/containers/storage/volumes/ (root). They survive container removal.

named volumes

    # Create a named volume
    $ podman volume create mydata
    # List volumes
    $ podman volume ls
    # Inspect volume (shows mountpoint on host)
    $ podman volume inspect mydata
    # Mount volume into a container at /var/lib/myapp
    $ podman run -d -v mydata:/var/lib/myapp:Z myapp:latest
    # Remove a volume (fails if in use)
    $ podman volume rm mydata
    # Remove all unused volumes
    $ podman volume prune

Bind Mounts

Bind mounts map a host directory into the container. Useful during development to share source code.

bind mounts

    # Mount host directory /host/data into /app/data inside container
    $ podman run -v /host/data:/app/data:Z myapp:latest
    # Read-only bind mount
    $ podman run -v /host/config:/app/config:ro,Z myapp:latest
    # :Z label tells Podman to relabel the files for SELinux (required on RHEL/CentOS)
    # :z shares the label between containers (less restrictive)

Running a Database Container with Persistent Storage

postgresql with volume

    $ podman volume create pgdata
    $ podman run -d --name postgres \
        -e POSTGRESQL_ADMIN_PASSWORD=redhat \
        -e POSTGRESQL_DATABASE=mydb \
        -e POSTGRESQL_USER=myuser \
        -e POSTGRESQL_PASSWORD=mypass \
        -v pgdata:/var/lib/pgsql/data:Z \
        -p 5432:5432 \
        registry.redhat.io/rhel8/postgresql-13:latest
    # Connect to the running database
    $ podman exec -it postgres psql -U myuser -d mydb

Chapter 2 · DO188

Container Networking

Podman uses a software-defined network layer (CNI or Netavark) to connect containers. By default, each container gets a private IP address on the podman bridge network. DNS-based name resolution works between containers on the same user-defined network.

Podman Network Commands

podman network

    # Create a custom bridge network
    $ podman network create example-net
    # Create a network with specific subnet
    $ podman network create --subnet 192.168.100.0/24 my-subnet
    # List all networks
    $ podman network ls
    # Inspect a network (shows subnet, gateway, connected containers)
    $ podman network inspect example-net
    # Remove a network (must have no connected containers)
    $ podman network rm example-net
    # Remove all unused networks
    $ podman network prune
    # Connect a new container to a network
    $ podman run -d --name my-container --net example-net container-image:latest
    # Connect a container to multiple networks at launch
    $ podman run -d --name gateway --net frontend-net,backend-net my-gateway
    # Connect an already running container to a network
    $ podman network connect example-net my-container

DNS Resolution

Containers on the same user-defined network can reach each other by container name. For example, a web app on app-net can reach a database named db at db:5432. The default podman network does not provide DNS.

Chapter 6 · DO188

Troubleshooting Containers

Log Access and Debugging

debugging commands

    # Stream container logs in real time
    $ podman logs -f myapp
    # Show last 100 lines of logs
    $ podman logs --tail 100 myapp
    # Show logs with timestamps
    $ podman logs -t myapp
    # Run a debug shell in a running container
    $ podman exec -it myapp /bin/sh
    # Run a debug shell in a new temporary container from same image
    $ podman run --rm -it myapp:latest /bin/sh
    # View running processes inside a container
    $ podman top myapp
    # Live resource usage stats
    $ podman stats myapp
    # Check container exit code after failure
    $ podman inspect myapp -f '{{.State.ExitCode}}'

Chapter 7 · DO188

Multi-Container Applications with Podman Compose

Podman Compose reads a docker-compose.yml / compose.yaml file and translates each service into Podman containers, networks, and volumes. It is ideal for local development environments.

Sample Compose File

compose.yaml — web + api + database

    version: "3.8"
    services:
      frontend:
        image: registry.example.com/myapp/frontend:latest
        ports:
          - "8080:8080"
        networks:
          - app-net
        depends_on:
          - backend
      backend:
        image: registry.example.com/myapp/backend:latest
        environment:
          DB_HOST: db
          DB_PORT: "5432"
        networks:
          - app-net
          - db-net
        depends_on:
          - db
      db:
        image: registry.redhat.io/rhel8/postgresql-13:latest
        environment:
          POSTGRESQL_ADMIN_PASSWORD: redhat
          POSTGRESQL_DATABASE: mydb
        volumes:
          - pgdata:/var/lib/pgsql/data
        networks:
          - db-net
    volumes:
      pgdata:
    networks:
      app-net:
      db-net:
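
The depends_on entries in the file above form a dependency graph, and the start order is a topological sort of that graph. As a rough illustration (this is not how podman-compose is implemented), the standard `tsort` utility produces the same ordering; `start_order` is a hypothetical helper name.

```shell
#!/bin/sh
# start_order is a hypothetical helper. Each input line is
# "<dependency> <dependent>": the service on the left must start first.
start_order() {
  tsort <<'EOF'
db backend
backend frontend
EOF
}

# Prints one service per line, dependencies first
start_order
```

Note that depends_on only orders container startup. It does not wait for the database to accept connections, so the backend should still retry its connection.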

Podman Compose Commands

podman-compose

    # Start all services (detached)
    $ podman-compose up -d
    # Start and rebuild images if changed
    $ podman-compose up -d --build
    # View running compose services
    $ podman-compose ps
    # Stream logs from all services
    $ podman-compose logs -f
    # Stop and remove all containers/networks created by compose
    $ podman-compose down
    # Scale a service to N replicas
    $ podman-compose up --scale backend=3 -d
    # List the containers that belong to a compose project by label
    $ podman ps -a --filter label=io.podman.compose.project=myproject

Chapter 1 · DO180

Kubernetes Architecture

Kubernetes is an open-source container orchestration system. It groups containers into Pods, ensures desired state, scales workloads, and manages networking and storage across a cluster of nodes.

🖥️

Control Plane

Runs the API server (kube-apiserver), scheduler, controller manager, and etcd. Manages cluster state and makes scheduling decisions.

⚙️

Worker Nodes

Run the kubelet (node agent), kube-proxy, and a container runtime (CRI-O). Execute the actual workloads.

📦

Pod

The smallest deployable unit — a group of one or more tightly coupled containers sharing a network namespace (same IP) and storage volumes.

🔁

ReplicaSet

Ensures a specified number of Pod replicas are always running. Replaces failed pods automatically.

🚀

Deployment

Manages ReplicaSets declaratively. Enables rolling updates, rollbacks, and scaling. The standard way to deploy stateless apps.

🔌

Service

A stable virtual IP and DNS name that load-balances traffic to a set of matching Pods. Types: ClusterIP, NodePort, LoadBalancer.

🗃️

etcd

Distributed key-value store that holds all cluster state (resource definitions, secrets, configuration). The single source of truth.

📋

Namespace

Virtual cluster within Kubernetes for multi-tenancy. Resources in different namespaces are isolated, and resource quotas can be applied per namespace.

kubectl — Kubernetes CLI

kubectl essentials

    # Imperative deployment (quick, not reproducible)
    $ kubectl create deployment db-pod --port 3306 \
        --image registry.example.com/rhel8/mysql-80
    # Set environment variables on a deployment
    $ kubectl set env deployment/db-pod \
        MYSQL_USER='user1' \
        MYSQL_PASSWORD='mypass' \
        MYSQL_DATABASE='mydb'
    # Apply a manifest declaratively (creates or updates)
    $ kubectl apply -f deployment.yaml
    # Apply all manifests in a directory recursively
    $ kubectl apply -f manifests/ -R
    # Preview what apply would change (dry-run diff)
    $ kubectl diff -f deployment.yaml
    # Delete resources from a manifest
    $ kubectl delete -f deployment.yaml
    # Generate YAML manifest from an imperative command
    $ kubectl create deployment hello -o yaml --dry-run=client \
        --image registry.example.com/redhattraining/hello:latest > hello.yaml
    # Explain any field of a resource kind
    $ kubectl explain deployment.spec.template.spec.containers

Chapter 1 · DO180 / DO288

Red Hat OpenShift Container Platform

Red Hat OpenShift (RHOCP) is an enterprise Kubernetes distribution that adds developer tools, security hardening, integrated CI/CD (Tekton/Pipelines), a web console, and enterprise support on top of upstream Kubernetes.

Foundation: Linux Kernel → Runtime: CRI-O → Orchestration: Kubernetes → Platform: OpenShift → Your App: Pods & Services

Logging In and Basic Navigation

oc login & navigation

    # Log in to an OpenShift cluster
    $ oc login -u developer -p developer https://api.ocp4.example.com:6443
    # Print current user
    $ oc whoami
    # Get web console URL
    $ oc whoami --show-console
    # Get API server token (used for registry auth)
    $ oc whoami -t
    # Switch to a project (namespace)
    $ oc project myproject
    # List all accessible projects
    $ oc projects
    # Create a new project
    $ oc new-project myapp --description "My Application"

Key OpenShift-Specific Concepts

🗂️

Project

OpenShift's enhanced Namespace. Adds access control, network policies, and resource quota isolation per team or application.

🛣️

Route

OpenShift extension to expose services to external traffic via a hostname. Backed by HAProxy; supports TLS termination (edge, passthrough, re-encrypt).

🔒

Security Context Constraint (SCC)

Cluster-level policy controlling what a pod can do (run as root, mount host paths, use host network, etc.). Default SCC prevents privileged operations.

🖼️

ImageStream

A pointer to container images that triggers automatic redeployments when the referenced image changes. Decouples image location from deployment config.

🏗️

BuildConfig

Defines how to build a container image from source — supports Docker, Source-to-Image (S2I), and custom build strategies.

⚙️

Operator

A Kubernetes controller that encodes operational knowledge for managing a complex application (e.g., database clusters). Extends the Kubernetes API with CRDs.

Chapters 2–5 · DO180

Managing Workloads in OpenShift

Deploying Applications with oc new-app

oc new-app

    # Deploy from a container image
    $ oc new-app --image registry.access.redhat.com/ubi8/httpd-24
    # Deploy from a Git repository (auto-detects language via S2I)
    $ oc new-app https://github.com/myorg/myapp
    # Specify S2I builder + source repository
    $ oc new-app php~http://gitserver.example.com/mygitrepo
    # Deploy from local source directory
    $ oc new-app . --name myapp
    # Pass environment variables
    $ oc new-app --image rhel8/mysql-80 \
        -e MYSQL_USER=user -e MYSQL_PASSWORD=pass -e MYSQL_DATABASE=mydb

Viewing and Managing Resources

oc get / describe / delete

    # List all resources in current project
    $ oc get all
    # List pods with status
    $ oc get pods
    # List pods with node and IP info
    $ oc get pods -o wide
    # Watch pods until ready
    $ watch oc get pods
    # Describe a pod (events, conditions, container states)
    $ oc describe pod myapp-6d8c4-xyz
    # Get pod logs
    $ oc logs myapp-6d8c4-xyz
    $ oc logs -f myapp-6d8c4-xyz              # follow
    $ oc logs -c sidecar myapp-6d8c4-xyz      # specific container
    # Get deployments
    $ oc get deployments
    # Scale a deployment to 3 replicas
    $ oc scale deployment myapp --replicas 3
    # Rollout status
    $ oc rollout status deployment/myapp
    # Roll back a deployment
    $ oc rollout undo deployment/myapp
    # Delete a resource
    $ oc delete pod myapp-6d8c4-xyz
    $ oc delete deployment myapp

Exposing Services via Routes

services & routes

    # Expose a deployment as a ClusterIP service on port 8080
    $ oc expose deployment myapp --port 8080
    # Create a Route (external hostname) from a service
    $ oc expose service myapp
    # Get routes to see assigned hostnames
    $ oc get routes
    # Create a TLS edge-terminated route with a custom hostname
    $ oc create route edge --service myapp \
        --hostname myapp.apps.ocp4.example.com \
        --cert tls.crt --key tls.key

StatefulSets — Stateful Workloads

For stateful applications (databases, message queues), use a StatefulSet. Pods get stable network identifiers (mydb-0, mydb-1) and each gets its own PersistentVolumeClaim.

statefulsets

    # List StatefulSets
    $ oc get statefulsets
    # Scale a StatefulSet (pods are added/removed in order)
    $ oc scale statefulset mydb --replicas 3
    # Delete a StatefulSet without deleting its pods
    $ oc delete statefulset mydb --cascade=orphan

Chapters 4–5 · DO288

Builds — Source-to-Image (S2I)

Source-to-Image (S2I) is an OpenShift build strategy that takes application source code and a builder image, injects the source, runs the build scripts, and produces a ready-to-run container image — without writing a Containerfile.

Input: Source Code (Git) + Input: S2I Builder Image → S2I Process: assemble script → Output: App Container Image → Deploy: Pods on Cluster
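
The assemble step in the flow above is itself a shell script shipped with the builder image. The sketch below shows the general shape of such a script for a Python application. It is an assumption-laden illustration: `S2I_SRC` and `APP_ROOT` are hypothetical variable names introduced here so the logic can be tried outside a builder image, and /tmp/src is the location where builder images commonly receive the injected source.

```shell
#!/bin/sh
# Sketch of an S2I "assemble" step. S2I_SRC and APP_ROOT are assumed names;
# a real builder image defines its own paths and install commands.
assemble() (
  src=${S2I_SRC:-/tmp/src}
  dest=${APP_ROOT:-$PWD}
  echo "---> Installing application source"
  mkdir -p "$dest"
  # Copy the injected source tree, including dotfiles, into the app directory
  cp -R "$src"/. "$dest"/
  if [ -f "$dest/requirements.txt" ]; then
    echo "---> Found requirements.txt"
    # A real Python builder would install dependencies at this point,
    # for example with: pip install -r requirements.txt
  fi
)
```

To try it locally, point `S2I_SRC` at a directory containing your source and `APP_ROOT` at an empty directory, then call `assemble`. The builder image's companion run script would then start the application from that directory.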

S2I Build Commands

s2i & oc start-build

    # Build locally with s2i CLI (for testing the S2I process)
    $ s2i build https://github.com/myorg/myapp \
        registry.access.redhat.com/ubi8/python-38 \
        myorg/myapp:latest
    # Trigger a new build in OpenShift
    $ oc start-build myapp-build
    # Trigger a build from local source (binary input)
    $ oc start-build myapp-build --from-dir .
    # Watch build logs in real time
    $ oc logs -f bc/myapp-build
    # List all builds
    $ oc get builds
    # Cancel a running build
    $ oc cancel-build myapp-build-3

BuildConfig YAML Example

buildconfig.yaml

    apiVersion: build.openshift.io/v1
    kind: BuildConfig
    metadata:
      name: myapp
    spec:
      source:
        type: Git
        git:
          uri: https://github.com/myorg/myapp
          ref: main
      strategy:
        type: Source            # S2I strategy
        sourceStrategy:
          from:
            kind: ImageStreamTag
            name: python:3.9-ubi8
            namespace: openshift
      output:
        to:
          kind: ImageStreamTag
          name: myapp:latest
      triggers:
        - type: GitHub          # Rebuild on push
          github:
            secret: my-github-secret
        - type: ImageChange     # Rebuild when builder image updates

ImageStreams

imagestream commands

    # List imagestreams in current project
    $ oc get imagestreams
    # List imagestream tags
    $ oc get imagestreamtags
    # Import an external image into an imagestream
    $ oc import-image myapp:latest \
        --from quay.io/myorg/myapp:latest \
        --confirm
    # Tag an existing imagestream tag
    $ oc tag myapp:latest myapp:stable

Chapter 2 · DO288 / Chapter 5 · DO180

Injecting Configuration — ConfigMaps & Secrets

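Applications should not bake configuration into images. ConfigMaps hold non-sensitive configuration; Secrets hold sensitive data (passwords, tokens, keys). Secret values are stored base64-encoded, which is an encoding and not encryption, so access to Secrets should be restricted with RBAC.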

ConfigMaps

configmap

    # Create from literal values
    $ oc create configmap app-config \
        --from-literal APP_PORT=8080 \
        --from-literal LOG_LEVEL=info
    # Create from a file
    $ oc create configmap nginx-conf --from-file nginx.conf
    # Create from all files in a directory
    $ oc create configmap app-props --from-file ./config/
    # View the configmap
    $ oc get configmap app-config -o yaml
    # Use as environment variables in a pod spec
    # spec.containers[].envFrom:
    #   - configMapRef:
    #       name: app-config
    # Mount as a volume (files in the pod)
    # spec.volumes[]:
    #   configMap:
    #     name: nginx-conf
    # spec.containers[].volumeMounts[]:
    #   mountPath: /etc/nginx/conf.d

Secrets

secrets

    # Create a generic secret from literals
    $ oc create secret generic db-credentials \
        --from-literal username=myuser \
        --from-literal password='S3cr3t!'
    # Create a TLS secret from certificate files
    $ oc create secret tls my-tls --cert tls.crt --key tls.key
    # Create a docker-registry secret (for pulling private images)
    $ oc create secret docker-registry quay-pull-secret \
        --docker-server quay.io \
        --docker-username myuser \
        --docker-password mytoken
    # Link pull secret to service account
    $ oc secrets link default quay-pull-secret --for pull
    # View a secret (base64-encoded)
    $ oc get secret db-credentials -o yaml
    # Decode a secret value
    $ oc get secret db-credentials -o jsonpath='{.data.password}' | base64 -d
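
The base64 step can be reproduced without a cluster, which makes it clear that anyone who can read the Secret can recover the value. A small sketch follows; `encode_secret` and `decode_secret` are hypothetical helper names, and the password is the sample value used in the commands above.

```shell
#!/bin/sh
# Hypothetical helpers showing what the .data fields of a Secret contain.
# printf is used instead of echo so that no trailing newline is encoded.
encode_secret() { printf '%s' "$1" | base64; }
decode_secret() { printf '%s' "$1" | base64 --decode; }

encoded=$(encode_secret 'S3cr3t!')
echo "value stored under .data.password: $encoded"
echo "decoded again: $(decode_secret "$encoded")"
```

A common pitfall when hand-writing Secret manifests is encoding with `echo`, which appends a newline and silently changes the stored password.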

Chapter 5 · DO180

Persistent Storage in OpenShift

The Kubernetes storage model decouples how storage is provisioned (PersistentVolume) from how it is requested (PersistentVolumeClaim).

Storage Object Hierarchy

    StorageClass (admin creates once)
        ↓ dynamically provisions
    PersistentVolume (PV): actual storage backing (NFS, iSCSI, AWS EBS, Ceph, etc.)
        ↓ bound to
    PersistentVolumeClaim (PVC): developer's storage request (size, access mode)
        ↓ mounted by
    Pod: uses the PVC as a volume

Access Modes

Mode	Short Form	Meaning
`ReadWriteOnce`	`RWO`	One node can read and write. Suitable for block storage (AWS EBS, iSCSI).
`ReadOnlyMany`	`ROX`	Many nodes can read. Suitable for shared read-only configuration.
`ReadWriteMany`	`RWX`	Many nodes can read and write. Requires distributed storage (NFS, CephFS, GlusterFS).
`ReadWriteOncePod`	`RWOP`	Only a single Pod can access at a time (K8s 1.22+). Strongest isolation.

Working with PVCs

persistent volume claims

    # List storage classes available in the cluster
    $ oc get storageclasses
    # List PersistentVolumes (admin view)
    $ oc get pv
    # List PersistentVolumeClaims in current project
    $ oc get pvc
    # Describe a PVC (status, bound PV, capacity)
    $ oc describe pvc mydata-pvc

pvc.yaml — request 5Gi of ReadWriteOnce storage

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: mydata-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      storageClassName: standard   # omit to use default StorageClass

Chapter 6 · DO180

Application Reliability — Health Probes & Autoscaling

Health Probe Types

🚦

Liveness Probe

Checks if the application is still running. If it fails, Kubernetes restarts the container. Detects deadlocks and infinite loops.

✅

Readiness Probe

Checks if the application is ready to serve traffic. A failing readiness probe removes the pod from Service endpoints, preventing traffic to a still-starting app.

🔥

Startup Probe

Gates the liveness and readiness probes until the application has started. Critical for slow-starting applications to prevent premature restarts.

deployment.yaml — health probes + resource limits

    spec:
      containers:
        - name: myapp
          image: myapp:latest
          resources:
            requests:
              cpu: "100m"        # minimum guaranteed CPU (1000m = 1 core)
              memory: "256Mi"    # minimum guaranteed memory
            limits:
              cpu: "500m"        # maximum allowed CPU
              memory: "512Mi"    # maximum allowed memory
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
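
CPU quantities in the manifest above can be written either in millicores ("500m") or in cores ("0.5"), which is easy to misread when comparing requests and limits. The sketch below normalizes both spellings to millicores using the 1000m = 1 core rule noted in the manifest; `cpu_to_millicores` is a hypothetical helper, not an oc or kubectl command.

```shell
#!/bin/sh
# cpu_to_millicores is a hypothetical helper: normalize a Kubernetes CPU
# quantity ("500m", "0.5", "2") to an integer number of millicores.
cpu_to_millicores() (
  case $1 in
    *m) echo "${1%m}" ;;
    # Plain core counts: multiply by 1000 and round to the nearest integer
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }' ;;
  esac
)

request=$(cpu_to_millicores 100m)
limit=$(cpu_to_millicores 0.5)
echo "request=${request}m limit=${limit}m burst headroom=$((limit - request))m"
```

With the values from the manifest, the container is guaranteed 100m and may burst by a further 400m before it is throttled.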

Horizontal Pod Autoscaler (HPA)

autoscaling

    # Create an HPA targeting 70% CPU utilization, scaling from 2 to 10 pods
    $ oc autoscale deployment myapp \
        --min 2 --max 10 --cpu-percent 70
    # View HPA status (shows current / desired replicas)
    $ oc get hpa
    # Describe HPA events and scaling history
    $ oc describe hpa myapp
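
The replica count the HPA aims for follows a simple ratio: desired = ceil(current replicas × current utilization / target utilization), clamped to the min and max bounds. The sketch below works through that arithmetic for the 70% target used above. It is a simplification: `desired_replicas` is a hypothetical helper, and the real controller also applies a tolerance band and stabilization windows that are not modelled here.

```shell
#!/bin/sh
# desired_replicas is a hypothetical helper implementing the basic HPA ratio.
# Arguments: current replicas, current CPU %, target CPU %, min, max.
desired_replicas() (
  current=$1 utilization=$2 target=$3 min=$4 max=$5
  # Integer ceiling of current * utilization / target
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

desired_replicas 4 140 70 2 10   # load is double the target: 4 pods become 8
desired_replicas 3 90 70 2 10    # 3.86 rounds up to 4
desired_replicas 2 35 70 2 10    # would be 1, but min keeps it at 2
```

Because utilization is measured against the CPU request, an HPA on CPU only works for pods that declare `resources.requests.cpu`.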

Resource Quotas & LimitRanges

quota & limits

    # View resource quotas for the project
    $ oc get resourcequota
    # Describe quota usage (current vs. limit)
    $ oc describe resourcequota compute-resources
    # View LimitRange defaults applied to new pods
    $ oc get limitrange
    $ oc describe limitrange core-resource-limits

Chapter 3 · DO280

Authentication, Authorization & RBAC

OpenShift supports multiple identity providers. The most common lab/on-premises choice is HTPasswd. Role-Based Access Control (RBAC) governs what authenticated users can do.

HTPasswd Identity Provider

htpasswd identity provider setup

    # Create an htpasswd file with a new user
    $ htpasswd -c -B htpasswd-users user1
    # Add another user to the existing file
    $ htpasswd -B htpasswd-users user2
    # Create a secret from the htpasswd file
    $ oc create secret generic htpasswd-secret \
        --from-file htpasswd=htpasswd-users \
        -n openshift-config
    # Update the OAuth cluster resource to use HTPasswd
    $ oc edit oauth cluster
    # Add under spec.identityProviders:
    #   - name: htpasswd-provider
    #     mappingMethod: claim
    #     type: HTPasswd
    #     htpasswd:
    #       fileData:
    #         name: htpasswd-secret
    # Verify OAuth pods are rolling out
    $ oc get pods -n openshift-authentication

RBAC — Roles and RoleBindings

Resource	Scope	Purpose
`Role`	Namespace	Defines a set of allowed API verbs (get, list, watch, create, update, delete) on specific resources within a namespace.
`RoleBinding`	Namespace	Binds a Role (or ClusterRole) to users/groups/service accounts within a namespace.
`ClusterRole`	Cluster-wide	Like Role but applies across all namespaces. Also used for non-namespaced resources (nodes, PVs).
`ClusterRoleBinding`	Cluster-wide	Binds a ClusterRole to subjects for cluster-wide access.

rbac commands

    # List roles in current project
    $ oc get roles
    # List cluster roles
    $ oc get clusterroles
    # Grant user the edit role in current project
    $ oc adm policy add-role-to-user edit user1
    # Grant user the view role in a specific namespace
    $ oc adm policy add-role-to-user view user2 -n production
    # Grant cluster-admin rights (use with caution!)
    $ oc adm policy add-cluster-role-to-user cluster-admin admin-user
    # Remove a role from a user
    $ oc adm policy remove-role-from-user edit user1
    # Check what a user can do
    $ oc auth can-i create pods --as user1
    $ oc auth can-i '*' '*' --as system:serviceaccount:myproject:default
    # Grant a service account the anyuid SCC (allows running as any user)
    $ oc adm policy add-scc-to-user anyuid -z myserviceaccount

Groups

groups

    # Create a group
    $ oc adm groups new developers
    # Add users to a group
    $ oc adm groups add-users developers user1 user2
    # Grant the group edit role in a namespace
    $ oc adm policy add-role-to-group edit developers -n myproject
    # List groups
    $ oc get groups

Chapter 1 · DO280

Declarative Resource Management & Kustomize

The declarative workflow describes desired state in YAML manifests and uses kubectl apply to reconcile the cluster to that state. This is reproducible, auditable (Git-based), and supports GitOps workflows.

Imperative vs. Declarative

	Imperative	Declarative
How	`kubectl create / delete` commands	YAML files + `kubectl apply`
Reproducibility	Difficult — depends on command history	High — files define the exact desired state
GitOps compatible	No	Yes
Best for	Quick one-off tasks, debugging	Production deployments, CI/CD pipelines

Kustomize — Configuration Overlays

Kustomize generates Kubernetes manifests from a base and environment-specific overlays without duplicating YAML. It is natively integrated into kubectl and oc.

my-app/
├── base/                        ← shared configuration for all environments
│   ├── kustomization.yaml
│   ├── deployment.yaml
│   └── service.yaml
└── overlays/
    ├── staging/
    │   ├── kustomization.yaml   ← bases: [../../base]; namespace: myapp-stage
    │   └── patch-replicas.yaml  ← overrides replicas: 1
    └── production/
        ├── kustomization.yaml   ← bases: [../../base]; commonLabels: {env: prod}
        └── patch-replicas.yaml  ← overrides replicas: 5
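
The overlay files in this layout could look roughly like the output of the sketch below. It is a minimal illustration, assuming a Deployment named `my-app` in the base; the function names are hypothetical. It lists the base under `resources:` because recent Kustomize releases deprecate the `bases:` field in favor of it.

```shell
#!/usr/bin/env bash
# Print the kustomization.yaml for one overlay.
# $1 = target namespace (for example myapp-stage)
render_overlay() {
  local namespace="$1"
  cat <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: ${namespace}
resources:
  - ../../base
patches:
  - path: patch-replicas.yaml
EOF
}

# Print the patch that overrides the replica count.
# $1 = Deployment name (assumed to match base/deployment.yaml), $2 = replicas
render_replica_patch() {
  local name="$1" replicas="$2"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${name}
spec:
  replicas: ${replicas}
EOF
}

# Example: contents for overlays/staging/
render_overlay myapp-stage
echo "---"
render_replica_patch my-app 1
```

Redirecting each function's output into `overlays/staging/kustomization.yaml` and `overlays/staging/patch-replicas.yaml` gives an overlay that changes only the namespace and replica count while everything else comes from the base.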

kustomize commands

# Preview rendered manifests without applying
$ kubectl kustomize overlays/production

# Apply a kustomization overlay to the cluster
$ kubectl apply -k overlays/production
$ oc apply -k overlays/staging

# Delete resources created by a kustomization
$ oc delete -k overlays/production

# Diff current cluster state vs. kustomization
$ kubectl diff -k overlays/production

OpenShift Templates

Templates are OpenShift-native packaged resource sets with parameters. The shared, cluster-wide templates are stored in the openshift namespace, and any template can be deployed from the web console or the CLI.

templates

# List available templates in the global template library
$ oc get templates -n openshift

# Describe a template (parameters, resources it creates)
$ oc describe template cache-service -n openshift

# Process a template with custom parameters
$ oc process -f mytemplate.yaml \
    -p APP_NAME=myapp \
    -p REPLICAS=3 | oc apply -f -

# Process a template from the openshift namespace
$ oc process -n openshift cakephp-mysql-persistent \
    -p NAME=myweb | oc apply -f -

# Export existing resources as YAML (oc export was removed in OpenShift 4)
$ oc get deployment,service -o yaml > exported-resources.yaml
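
For orientation, a template file accepting the `APP_NAME` and `REPLICAS` parameters might be shaped like the sketch below. The template name, the `IMAGE` parameter, and the image URL are assumptions made for this example. The quoted heredoc keeps the shell from expanding the `${...}` placeholders, which `oc process` substitutes later.

```shell
#!/usr/bin/env bash
# Print a minimal OpenShift Template with three parameters.
render_template() {
  cat <<'EOF'
apiVersion: template.openshift.io/v1
kind: Template
metadata:
  name: mytemplate
objects:
  - apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ${APP_NAME}
    spec:
      replicas: ${{REPLICAS}}
      selector:
        matchLabels:
          app: ${APP_NAME}
      template:
        metadata:
          labels:
            app: ${APP_NAME}
        spec:
          containers:
            - name: ${APP_NAME}
              image: ${IMAGE}
parameters:
  - name: APP_NAME
    description: Name applied to the Deployment and its labels
    required: true
  - name: REPLICAS
    description: Number of pod replicas
    value: "1"
  - name: IMAGE
    description: Container image to run (hypothetical default)
    value: registry.example.com/myorg/myapp:latest
EOF
}

# render_template > mytemplate.yaml
render_template
```

The double-brace form `${{REPLICAS}}` inserts the value unquoted, so the field receives a number rather than a string.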

Chapter 5 · DO280

Operators & Helm Charts

Kubernetes Operators

An Operator is a Kubernetes controller that manages a specific application using domain knowledge encoded in software. It watches Custom Resources (CRs) and reconciles the cluster to match the desired state defined in those CRs.

🔩

Custom Resource Definition (CRD)

Extends the Kubernetes API with new resource types. An Operator registers CRDs and watches for instances to reconcile.

📡

Operator Lifecycle Manager (OLM)

OpenShift's system for installing, updating, and managing Operators cluster-wide. Provides the OperatorHub web UI.

📦

ClusterServiceVersion (CSV)

Operator metadata file describing capabilities, required CRDs, install strategy, and lifecycle details. OLM uses the CSV to install an Operator.

📋

Subscription

Tells OLM which Operator to install, from which catalog, and update channel. OLM keeps the Operator updated as new versions are published.

operator management

# List all installed operators (CSVs) in a namespace
$ oc get clusterserviceversions

# List operator subscriptions
$ oc get subscriptions -n openshift-operators

# Describe a subscription (current/desired CSV, catalog)
$ oc describe subscription my-operator -n openshift-operators

# List Custom Resource Definitions
$ oc get crds

# List resources created by an operator (e.g., a database cluster CR)
$ oc get myoperatorresource

Helm — Package Management for Kubernetes

Helm packages Kubernetes manifests into charts (versioned, shareable archives). It uses Go templates to parameterize manifests, and tracks installed releases.

helm

# Add a chart repository
$ helm repo add bitnami https://charts.bitnami.com/bitnami

# Update local repo cache
$ helm repo update

# Search for a chart
$ helm search repo nginx

# Install a chart with a custom release name
$ helm install my-nginx bitnami/nginx

# Install with custom values
$ helm install my-nginx bitnami/nginx \
    --set replicaCount=2 \
    --set service.type=ClusterIP

# Install with a values file
$ helm install my-nginx bitnami/nginx -f values.yaml

# List all Helm releases in the current namespace
$ helm list

# Upgrade a release (apply changed values)
$ helm upgrade my-nginx bitnami/nginx --set replicaCount=4

# Roll back to a previous release revision
$ helm rollback my-nginx 1

# Uninstall a release (removes all its resources)
$ helm uninstall my-nginx

# Show rendered templates without installing (debug)
$ helm template my-nginx bitnami/nginx -f values.yaml
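
A values file expressing the same overrides as the `--set` flags could be generated as in the sketch below. The function name is hypothetical; the two keys are the ones used in the commands here, and other keys depend on the chart.

```shell
#!/usr/bin/env bash
# Print a values.yaml equivalent to:
#   --set replicaCount=<n> --set service.type=ClusterIP
# $1 = replica count (defaults to 2)
render_values() {
  local replicas="${1:-2}"
  cat <<EOF
replicaCount: ${replicas}
service:
  type: ClusterIP
EOF
}

# render_values 2 > values.yaml
# helm upgrade --install my-nginx bitnami/nginx -f values.yaml
render_values 2
```

Keeping overrides in a file rather than in `--set` flags lets them be reviewed and versioned alongside the rest of the deployment.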

Chapter 6 · DO280

Network Policies & TLS

NetworkPolicy resources enforce firewall-like rules between pods and namespaces. By default in OpenShift, all pods in a project can be reached by other pods, including pods in other projects. NetworkPolicies restrict this.

Default Deny + Allow Pattern

network-policy — deny-all ingress, then allow specific traffic

# 1. Deny all ingress to this namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all-ingress
spec:
  podSelector: {}          # applies to ALL pods in namespace
  policyTypes:
    - Ingress
---
# 2. Allow ingress only from pods with label app=frontend
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
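
A deny-all policy also blocks traffic arriving from the OpenShift router, so exposed routes stop responding. One common companion policy is sketched below, assuming the router's namespace carries the `network.openshift.io/policy-group: ingress` label; verify the label on your cluster before relying on it. The function and policy names are illustrative.

```shell
#!/usr/bin/env bash
# Print a NetworkPolicy that re-admits traffic from the ingress (router) namespace.
render_allow_from_ingress() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-openshift-ingress
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              network.openshift.io/policy-group: ingress
EOF
}

# render_allow_from_ingress | oc apply -f -
render_allow_from_ingress
```

NetworkPolicies are additive: a connection is allowed when any policy selecting the pod permits it, so this policy and the deny-all policy can coexist.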

network policy & TLS

# List network policies in current project
$ oc get networkpolicies

# Describe a network policy
$ oc describe networkpolicy deny-all-ingress

# Apply a network policy
$ oc apply -f deny-all.yaml

# Generate a self-signed TLS certificate (for testing)
$ openssl req -newkey rsa:4096 -nodes -keyout tls.key \
    -x509 -days 365 -out tls.crt \
    -subj "/CN=myapp.apps.ocp4.example.com"

# Create a TLS edge-terminated route
$ oc create route edge myapp-tls \
    --service myapp \
    --cert tls.crt \
    --key tls.key \
    --hostname myapp.apps.ocp4.example.com

# Create a passthrough route (TLS terminated at the pod)
$ oc create route passthrough myapp-passthrough \
    --service myapp \
    --hostname myapp.apps.ocp4.example.com

Egress Controls

egress network policy

# Allow egress only to a specific external CIDR
# (EgressNetworkPolicy is an OpenShift extension)
$ oc apply -f egress-policy.yaml

# Test connectivity from a pod
$ oc exec -it myapp-pod -- curl -v http://external-service:8080
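
The contents of `egress-policy.yaml` are not shown in the source, so the sketch below is only one plausible shape: allow a single CIDR, then deny everything else. It assumes the OpenShift SDN plugin for `EgressNetworkPolicy`; on clusters running OVN-Kubernetes the equivalent resource is `EgressFirewall` in the `k8s.ovn.org/v1` API group, which the optional second argument selects. The function name and the sample CIDR are illustrative.

```shell
#!/usr/bin/env bash
# Print an egress policy that allows one CIDR and denies all other destinations.
# $1 = allowed CIDR, $2 = network plugin: "sdn" (default) or "ovn"
render_egress_policy() {
  local cidr="$1" plugin="${2:-sdn}" api kind
  case "$plugin" in
    sdn) api="network.openshift.io/v1"; kind="EgressNetworkPolicy" ;;
    ovn) api="k8s.ovn.org/v1";          kind="EgressFirewall" ;;
    *)   echo "unknown plugin: $plugin" >&2; return 1 ;;
  esac
  cat <<EOF
apiVersion: ${api}
kind: ${kind}
metadata:
  name: default
spec:
  egress:
    - type: Allow
      to:
        cidrSelector: ${cidr}
    - type: Deny
      to:
        cidrSelector: 0.0.0.0/0
EOF
}

# render_egress_policy 203.0.113.0/24 > egress-policy.yaml
render_egress_policy 203.0.113.0/24
```

Rules are evaluated in order, which is why the specific Allow entry precedes the catch-all Deny.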

Chapter 8 · DO280

Cluster & Operator Updates

OpenShift uses the Cluster Version Operator (CVO) to manage cluster updates. Updates follow a graph of supported upgrade paths and can be performed with minimal downtime using rolling node replacements.

Update Channels

Channel	Purpose
`stable-4.x`	Fully tested, production-recommended updates for minor version 4.x
`fast-4.x`	Updates released faster than stable; less bake time but still tested
`candidate-4.x`	Release candidates; not for production
`eus-4.x`	Extended Update Support — for customers on a fixed minor version

cluster update commands

# Check current cluster version and available updates
$ oc get clusterversion
$ oc describe clusterversion version

# View available update targets
$ oc adm upgrade

# Start a cluster upgrade to a specific version
$ oc adm upgrade --to 4.14.15

# Trigger an upgrade to the latest recommended version
$ oc adm upgrade --to-latest

# Watch cluster operators during upgrade
$ watch oc get clusteroperators

# View cluster operator health
$ oc get clusteroperators

# Check node status during rolling update
$ oc get nodes

# View machine config pools (controls node update batching)
$ oc get machineconfigpool

# Pause updates to the worker pool (e.g., during critical period)
$ oc patch mcp/worker --type merge \
    -p '{"spec":{"paused":true}}'

# Resume worker pool updates
$ oc patch mcp/worker --type merge \
    -p '{"spec":{"paused":false}}'

Updating Operators via OLM

operator updates

# List operator subscriptions and their update channels
$ oc get subscriptions -A

# Check install plans (pending/approved operator updates)
$ oc get installplans -n openshift-operators

# Approve a manual install plan (for manually-approved update policy)
$ oc patch installplan install-xxxxx --type merge \
    -p '{"spec":{"approved":true}}' \
    -n openshift-operators

# Watch operator CSV rollout
$ watch oc get csv -n openshift-operators

Important

OpenShift supports updates only between adjacent minor versions (e.g., 4.12 → 4.13 → 4.14). Skipping minor versions requires following the official upgrade graph from Red Hat's upgrade graph tool. Always update the cluster before updating Operators that depend on it.
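
As a small illustration of the adjacent-minor rule, the sketch below compares two version strings and reports whether the target is at most one minor version ahead. It is a planning aid only, with hypothetical helper names; the authoritative list of valid targets is what `oc adm upgrade` reports for your cluster and channel.

```shell
#!/usr/bin/env bash
# Extract the minor number from a version such as 4.14.15 (prints 14).
minor_of() {
  local v="${1#*.}"   # drop the major version and the first dot
  echo "${v%%.*}"     # keep everything before the next dot
}

# Succeed when $2 has the same major version as $1 and is 0 or 1 minor ahead.
is_adjacent_minor() {
  local from_major="${1%%.*}" to_major="${2%%.*}" step
  step=$(( $(minor_of "$2") - $(minor_of "$1") ))
  [ "$from_major" = "$to_major" ] && [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}

for target in 4.13.5 4.14.15; do
  if is_adjacent_minor 4.12.30 "$target"; then
    echo "4.12.30 -> $target: single hop"
  else
    echo "4.12.30 -> $target: go through the intermediate minor versions"
  fi
done
```

When moving to the next minor version, the cluster's channel usually has to be switched first, for example with `oc adm upgrade channel stable-4.13`.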

Appendix

Quick Reference — Most-Used Commands

Podman Cheat Sheet

Goal	Command
Run a container	`podman run -d -p 8080:8080 --name myapp image:tag`
List running	`podman ps`
List all (including stopped)	`podman ps -a`
Shell into running container	`podman exec -it myapp /bin/sh`
View logs	`podman logs -f myapp`
Stop / remove container	`podman stop myapp && podman rm myapp`
Build image	`podman build -t myapp:latest .`
Push image	`podman push quay.io/myorg/myapp:latest`
List local images	`podman images`
Remove image	`podman rmi myapp:latest`
Multi-container app up	`podman-compose up -d`
Multi-container app down	`podman-compose down`

OpenShift / oc Cheat Sheet

Goal	Command
Log in	`oc login -u user -p pass https://api.cluster:6443`
Switch project	`oc project myproject`
Create project	`oc new-project myapp`
Deploy from image	`oc new-app --image registry.io/img:tag`
List everything	`oc get all`
Watch pods	`watch oc get pods`
Shell into pod	`oc rsh pod-name`
Pod logs	`oc logs -f pod-name`
Scale deployment	`oc scale deployment myapp --replicas 3`
Expose service as route	`oc expose service myapp`
Apply manifest	`oc apply -f manifest.yaml`
Apply kustomize overlay	`oc apply -k overlays/production`
Grant edit role	`oc adm policy add-role-to-user edit user1`
Check cluster version	`oc get clusterversion`
Start cluster upgrade	`oc adm upgrade --to 4.14.15`

Wednesday, April 22, 2026

Setting Up an OpenShift Cluster: User-Provisioned Infrastructure in Air-Gapped Environments

OpenShift · UPI · Air-Gapped

A complete, step-by-step guide to manually provisioning VMs, load balancers, and DNS for an OpenShift cluster in a disconnected network.

1 What is UPI?

With User-Provisioned Infrastructure (UPI), you have maximum control over the cluster setup. You are responsible for manually preparing all virtual machines, load balancers, and DNS records before the OpenShift installer runs. This is the preferred approach for secure, air-gapped, or heavily regulated environments.

2 Air-Gapped Prerequisites

The primary challenge in a disconnected environment is the absence of direct access to the Red Hat Container Registry. You must bridge this gap before initiating the installation.

Mirror Registry

Establish a local container registry (e.g., Red Hat Quay or JFrog Artifactory) within your secure perimeter. Use the oc mirror plugin to sync OpenShift release images, operator catalogs, and Helm charts from the internet to a portable medium, then load them into your local registry.

Internal DNS & NTP

Precise time synchronization and split-horizon DNS are non-negotiable. Every node must be able to resolve the local registry hostname and all internal API endpoints.

3 OS Strategy

Red Hat enforces a specific OS strategy to ensure the self-healing nature of OpenShift.

⚠️

Control Plane (Masters): RHCOS is mandatory. Standard RHEL, Ubuntu, or any other OS cannot be used. Masters are managed by the Machine Config Operator (MCO), which requires an immutable, container-optimized OS to push updates, roll back kernel changes, and manage configurations automatically.

ℹ️

Compute Nodes (Workers): RHCOS is strongly recommended. When you update OpenShift, the OS on the workers updates automatically via rpm-ostree for safe, transactional updates. You have some flexibility here, but RHCOS is the supported default.

4 Architecture & Helper Node

This setup uses a Helper Node as the backbone — it acts as a bridge between the external network and the internal cluster network using two network interfaces.

Network Interfaces

Interface	Zone	Subnet	Role
`ens192`	External	192.168.0.X	Front-end / internet-facing traffic and corporate LB
`ens224`	Internal	192.168.22.X	Back-end cluster communication and storage traffic

Infrastructure Services Provided by the Helper Node

DNS (BIND)

Resolves cluster hostnames and API endpoints

DHCP

Assigns fixed IPs to all cluster nodes through MAC-address reservations

NAT Gateway

Routes internal node traffic through the helper

HAProxy

Load balances API and Ingress traffic

Apache Web Server

Hosts Ignition files for automated installation

NFS Server

Provides persistent storage for the registry

5 Cluster Node Roles

Temporary

Bootstrap Node

Used only during the initial installation to orchestrate Control Plane creation. Decommissioned once the control plane is healthy.

×3 Nodes — HA

Control Plane (Masters)

The "brains" of the cluster. Runs the API server, etcd database, and controllers. Three nodes ensure high availability.

Compute

Worker Nodes

Where your actual applications, containers, and pods run. CSRs must be manually approved in UPI mode.

6 Core Deployment Workflow

Phase I — Configuration & Manifest Generation

On the Bastion host, define the cluster in install-config.yaml, pointing to your local mirror registry. Generate Kubernetes manifests and convert them to Ignition configs (.ign files) that RHCOS nodes execute on first boot.

Phase II — Infrastructure Provisioning

Configure a high-availability load balancer (HAProxy/F5) for the API (port 6443), Machine Config Server (port 22623), and Ingress (ports 80/443). Host the generated .ign files on an internal HTTP server.

Phase III — Bootstrap Sequence

Boot the Bootstrap node → it pulls its config and initiates control plane creation. Boot Masters → they form the etcd quorum. Boot Workers → manually approve their CSRs to join the cluster.

7 Detailed Setup Steps

Prepare the Bastion / Helper Node.

The Helper Node OS should be a Red Hat Enterprise Linux or CentOS 8 x86_64 image.
Log in to the Red Hat OpenShift Cluster Manager.
Select 'Create Cluster' from the 'Clusters' navigation menu.
Select 'Red Hat OpenShift Container Platform'.
Select 'Run on Bare Metal'.
Download the following files:
- OpenShift Installer for Linux (openshift-install-linux.tar.gz)
- Pull secret
- Command Line Interface for Linux and for your workstation's OS (openshift-client-linux.tar.gz)
- Red Hat Enterprise Linux CoreOS (RHCOS)
  - rhcos-X.X.X-x86_64-metal.x86_64.raw.gz
  - rhcos-X.X.X-x86_64-installer.x86_64.iso (or rhcos-X.X.X-x86_64-live.x86_64.iso for newer versions)

Notes: Before powering on a single node, these must be ready:
1) Load Balancer:

Port 6443 (API): Points to Bootstrap + 3 Masters.
Port 22623 (Machine Config): Points to Bootstrap + 3 Masters.
Ports 80/443 (Apps): Points to all Worker nodes.

2) DNS:

api.<cluster>.<domain> -> LB VIP for 6443.
api-int.<cluster>.<domain> -> LB VIP for 6443/22623.
*.apps.<cluster>.<domain> -> LB VIP for 80/443.
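
A quick way to confirm the DNS prerequisites before booting any node is sketched below. It assumes `dig` is installed on the Helper Node and uses `console-openshift-console` as a sample host under the `*.apps` wildcard; the function names are hypothetical, and `lab` / `ocp.lan` are the cluster name and domain used in the examples later in this guide.

```shell
#!/usr/bin/env bash
# List the DNS names that must resolve for a given cluster name and base domain.
required_records() {
  local cluster="$1" domain="$2"
  printf '%s\n' \
    "api.${cluster}.${domain}" \
    "api-int.${cluster}.${domain}" \
    "console-openshift-console.apps.${cluster}.${domain}"
}

# Resolve each required name and report which ones are missing.
check_dns() {
  local name
  required_records "$1" "$2" | while read -r name; do
    if [ -n "$(dig +short "$name")" ]; then
      echo "OK      $name"
    else
      echo "MISSING $name"
    fi
  done
}

# Print the names; run `check_dns lab ocp.lan` once DNS is configured.
required_records lab ocp.lan
```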

Step 1 — Install Client Tools

# Extract and install the OpenShift client tools
tar xvf openshift-client-linux.tar.gz
mv oc kubectl /usr/local/bin

# Verify installation (client versions only; no cluster exists yet)
kubectl version --client
oc version --client

# Extract the OpenShift Installer
tar xvf openshift-install-linux.tar.gz

Step 2 — Configure Static IP for Internal NIC

Run nmtui-edit ens224 or edit /etc/sysconfig/network-scripts/ifcfg-ens224 with these values:

Address:       192.168.22.1
DNS Server:    127.0.0.1
Search Domain: ocp.lan
Default Route: Disabled
Auto-connect:  Enabled

If changes don't apply, bounce the NIC: nmcli connection down ens224 && nmcli connection up ens224

Step 3 — Configure Firewall Zones

# Assign interfaces to zones
nmcli connection modify ens224 connection.zone internal
nmcli connection modify ens192 connection.zone external

# Enable masquerading (NAT) on both zones
firewall-cmd --zone=external --add-masquerade --permanent
firewall-cmd --zone=internal --add-masquerade --permanent
firewall-cmd --reload

# Verify zones and IP forwarding
firewall-cmd --get-active-zones
firewall-cmd --list-all --zone=internal
firewall-cmd --list-all --zone=external
cat /proc/sys/net/ipv4/ip_forward   # Should return 1

Step 4 — Clone Config Repository

dnf update -y
dnf install git -y
git clone https://github.com/ryanhay/ocp4-metal-install

Step 5 — Install & Configure DNS (BIND)

dnf install bind bind-utils -y
cp ~/ocp4-metal-install/dns/named.conf /etc/named.conf
cp -R ~/ocp4-metal-install/dns/zones /etc/named/

# Open firewall for DNS
firewall-cmd --add-port=53/udp --zone=internal --permanent
firewall-cmd --add-port=53/tcp --zone=internal --permanent  # Required for OCP 4.9+
firewall-cmd --reload

# Enable and start BIND
systemctl enable named && systemctl start named && systemctl status named

Update the external NIC (ens192) to use 127.0.0.1 as its DNS server and enable "Ignore automatically obtained DNS parameters" via nmtui-edit ens192, then restart NetworkManager:

systemctl restart NetworkManager

# Verify DNS resolution
dig ocp.lan
dig -x 192.168.22.200   # Should resolve to ocp-bootstrap.lab.ocp.lan

Step 6 — Install & Configure DHCP

⚠️

Before copying the config, update ~/ocp4-metal-install/dhcpd.conf with the actual MAC addresses of each cluster machine.

dnf install dhcp-server -y
cp ~/ocp4-metal-install/dhcpd.conf /etc/dhcp/dhcpd.conf

firewall-cmd --add-service=dhcp --zone=internal --permanent
firewall-cmd --reload

systemctl enable dhcpd && systemctl start dhcpd && systemctl status dhcpd
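
Each cluster machine needs a reservation in dhcpd.conf that ties its MAC address to a fixed IP. The sketch below prints one such `host` stanza; the function name and the MAC address are placeholders, while the bootstrap hostname and IP follow the values used in the DNS step.

```shell
#!/usr/bin/env bash
# Print a dhcpd.conf host reservation.
# $1 = host name, $2 = MAC address, $3 = fixed IP
dhcp_host() {
  local name="$1" mac="$2" ip="$3"
  printf 'host %s {\n  hardware ethernet %s;\n  fixed-address %s;\n}\n' \
    "$name" "$mac" "$ip"
}

# Placeholder MAC; replace it with the real address of the bootstrap VM.
dhcp_host ocp-bootstrap 00:50:56:00:00:01 192.168.22.200
```

After editing /etc/dhcp/dhcpd.conf, `dhcpd -t -cf /etc/dhcp/dhcpd.conf` checks the syntax before the service is restarted.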

Step 7 — Install & Configure Apache Web Server

dnf install httpd -y
# Change Apache to listen on port 8080 (avoids conflicts)
sed -i 's/Listen 80/Listen 0.0.0.0:8080/' /etc/httpd/conf/httpd.conf

firewall-cmd --add-port=8080/tcp --zone=internal --permanent
firewall-cmd --reload

systemctl enable httpd && systemctl start httpd && systemctl status httpd

# Verify it's running
curl localhost:8080

Step 8 — Install & Configure HAProxy

dnf install haproxy -y
cp ~/ocp4-metal-install/haproxy.cfg /etc/haproxy/haproxy.cfg

Open the required firewall ports:

# Control plane API
firewall-cmd --add-port=6443/tcp --zone=internal --permanent
firewall-cmd --add-port=6443/tcp --zone=external --permanent

# Machine Config Server
firewall-cmd --add-port=22623/tcp --zone=internal --permanent

# Application ingress (HTTP/HTTPS)
firewall-cmd --add-service=http --zone=internal --permanent
firewall-cmd --add-service=http --zone=external --permanent
firewall-cmd --add-service=https --zone=internal --permanent
firewall-cmd --add-service=https --zone=external --permanent

# HAProxy stats UI (accessible at http://<helper-ip>:9000/stats)
firewall-cmd --add-port=9000/tcp --zone=external --permanent
firewall-cmd --reload

# Allow HAProxy SELinux binding and start the service
setsebool -P haproxy_connect_any 1
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
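
The copied haproxy.cfg is not reproduced in this guide. As a rough idea of what the API section of such a file contains, the sketch below prints a TCP frontend and backend for port 6443. The section names and the control plane IP are assumptions; the file in the cloned repository may differ.

```shell
#!/usr/bin/env bash
# Print an HAProxy frontend/backend pair for the Kubernetes API (port 6443).
# Each argument is name=ip, for example ocp-bootstrap=192.168.22.200
render_api_lb() {
  local entry
  cat <<'EOF'
frontend ocp_api
    bind *:6443
    mode tcp
    default_backend ocp_api_nodes

backend ocp_api_nodes
    mode tcp
    balance source
EOF
  for entry in "$@"; do
    # Split name=ip into the server name and its address.
    printf '    server %s %s:6443 check\n' "${entry%%=*}" "${entry#*=}"
  done
}

# 192.168.22.201 is a hypothetical control plane address.
render_api_lb ocp-bootstrap=192.168.22.200 ocp-cp-1=192.168.22.201
```

The Machine Config Server (22623) and Ingress (80/443) sections follow the same pattern with their own backends. `haproxy -c -f /etc/haproxy/haproxy.cfg` validates the file before the service is started.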

Step 9 — Install & Configure NFS Server

Network File System (NFS) is a distributed file system protocol that allows a user on a client computer to access files over a network much like local storage is accessed. Originally developed by Sun Microsystems, it has become the standard for file sharing between Unix and Linux systems.

How NFS Works

NFS Server: Hosts the physical storage and "exports" (shares) specific directories to the network. It manages permissions and handles requests from clients.
NFS Client: Mounts the exported directory from the server onto its own local file system. To the user or application on the client side, the files appear to be stored locally.

dnf install nfs-utils -y

mkdir -p /shares/registry
chown -R nobody:nobody /shares/registry
chmod -R 777 /shares/registry

echo "/shares/registry  192.168.22.0/24(rw,sync,root_squash,no_subtree_check,no_wdelay)" > /etc/exports
exportfs -rv

firewall-cmd --zone=internal --add-service=mountd --permanent
firewall-cmd --zone=internal --add-service=rpc-bind --permanent
firewall-cmd --zone=internal --add-service=nfs --permanent
firewall-cmd --reload

systemctl enable nfs-server rpcbind
systemctl start nfs-server rpcbind nfs-mountd

Step 10 — Generate Installation Files

mkdir /var/www/html/ocp4
cp ~/ocp4-metal-install/install-config.yaml /var/www/html/ocp4

ℹ️

Edit install-config.yaml before proceeding: insert your Pull Secret and SSH public key. See the Configuration Details section below for guidance.
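
For reference, an install-config.yaml for a disconnected bare-metal UPI cluster typically has the shape printed by the sketch below. The mirror registry hostname and repository path are hypothetical, the network CIDRs are the commonly used defaults, and the angle-bracket values are placeholders to fill in by hand. Newer releases also accept `imageDigestSources` in place of `imageContentSources`, so check the documentation for your version.

```shell
#!/usr/bin/env bash
# Print a skeleton install-config.yaml.
# $1 = cluster name, $2 = base domain, $3 = mirror registry host:port (hypothetical)
render_install_config() {
  local cluster="$1" domain="$2" mirror="$3"
  cat <<EOF
apiVersion: v1
baseDomain: ${domain}
metadata:
  name: ${cluster}
compute:
  - name: worker
    replicas: 0            # UPI workers are booted manually
controlPlane:
  name: master
  replicas: 3
networking:
  clusterNetwork:
    - cidr: 10.128.0.0/14
      hostPrefix: 23
  networkType: OVNKubernetes
  serviceNetwork:
    - 172.30.0.0/16
platform:
  none: {}
pullSecret: '<pull secret JSON, including credentials for the mirror registry>'
sshKey: '<contents of your SSH public key>'
additionalTrustBundle: |
  <PEM certificate of the mirror registry CA>
imageContentSources:
  - mirrors:
      - ${mirror}/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
      - ${mirror}/ocp4/openshift4
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
EOF
}

render_install_config lab ocp.lan registry.example.lan:5000
```

The installer consumes and deletes install-config.yaml when it generates manifests, so keep a copy outside the installation directory.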

# Generate Kubernetes manifests
~/openshift-install create manifests --dir /var/www/html/ocp4

To control whether workloads can run on Control Plane nodes, edit the scheduler manifest:

vi /var/www/html/ocp4/manifests/cluster-scheduler-02-config.yml

# Set mastersSchedulable: true  → allow workloads on masters
# Set mastersSchedulable: false → prevent workloads (default)

# Generate Ignition configs and auth files
~/openshift-install create ignition-configs --dir /var/www/html/ocp4

Step 11 — Host RHCOS Image and Set Permissions

# Move the RHCOS metal image to the web server
mv ~/rhcos-X.X.X-x86_64-metal.x86_64.raw.gz /var/www/html/ocp4/rhcos

# Set correct SELinux context, ownership, and permissions
chcon -R -t httpd_sys_content_t /var/www/html/ocp4/
chown -R apache: /var/www/html/ocp4/
chmod 755 /var/www/html/ocp4/

# Confirm all files are accessible
curl localhost:8080/ocp4/

Step 12 — Boot Cluster Nodes

Boot each node type from the RHCOS (Red Hat Enterprise Linux CoreOS) ISO or over PXE. During boot, pass a kernel argument that tells the node where to fetch its Ignition file (its "brain"): coreos.inst.ignition_url=http:///bootstrap.ign, with the address of the web server that hosts the file placed between http:// and /bootstrap.ign.

Order of Operations:

1. Start Bootstrap node.
2. Start 3 Master nodes.
3. Wait for the API to come up.
4. Start Worker nodes.

# Bootstrap Node
sudo coreos-installer install /dev/sda \
  -u http://192.168.22.1:8080/ocp4/rhcos \
  -I http://192.168.22.1:8080/ocp4/bootstrap.ign \
  --insecure --insecure-ignition

# Control Plane (Master) Nodes
sudo coreos-installer install /dev/sda \
  -u http://192.168.22.1:8080/ocp4/rhcos \
  -I http://192.168.22.1:8080/ocp4/master.ign \
  --insecure --insecure-ignition

# Worker Nodes
sudo coreos-installer install /dev/sda \
  -u http://192.168.22.1:8080/ocp4/rhcos \
  -I http://192.168.22.1:8080/ocp4/worker.ign \
  --insecure --insecure-ignition

Step 13 — Monitor Bootstrap & Finalize

# Monitor bootstrap progress from the Helper Node
~/openshift-install --dir /var/www/html/ocp4/ wait-for bootstrap-complete --log-level=debug

Once bootstrapping completes, remove the Bootstrap node from HAProxy and shut it down:

# Remove ocp-bootstrap from /etc/haproxy/haproxy.cfg, then reload
systemctl reload haproxy

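# Point oc at the kubeconfig generated by the installer
export KUBECONFIG=/var/www/html/ocp4/auth/kubeconfig

# Approve Worker CSRs so workers can join the cluster
oc get csr
oc adm certificate approve <csr-name>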

# Verify all nodes are Ready
oc get nodes
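
Approving CSRs one at a time becomes tedious with several workers, and each worker normally submits two requests in sequence (a client certificate, then a serving certificate). The sketch below filters the `oc get csr` table for pending requests so they can be approved in one pass. The function name is hypothetical, and it assumes the default table output where CONDITION is the last column.

```shell
#!/usr/bin/env bash
# Read `oc get csr` table output on stdin and print the names of pending CSRs.
pending_csrs() {
  # Skip the header row; the CONDITION column is the last field.
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Usage against a live cluster (repeat until no CSRs remain pending):
#   oc get csr | pending_csrs | xargs --no-run-if-empty oc adm certificate approve

# Demonstration with sample rows:
printf '%s\n' \
  'NAME        AGE   SIGNERNAME   REQUESTOR   REQUESTEDDURATION   CONDITION' \
  'csr-abc12   2m    signer-a     node-boot   <none>              Pending' \
  'csr-def34   9m    signer-a     node-boot   <none>              Approved,Issued' \
  | pending_csrs
```

Only approve requests you expect; in a lab this means CSRs appearing right after you boot a worker.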

Step 14 — Post-Installation

# Retrieve the console URL and the kubeadmin password
oc whoami --show-console
cat /var/www/html/ocp4/auth/kubeadmin-password

Configure Storage: Define StorageClasses (NFS, ODF, or local storage) so applications can persist data. See the Configure Storage Post-Install section below.
Set Up Identity Providers: Replace the temporary kubeadmin user with a permanent identity provider such as HTPasswd or LDAP. See the Set Up Identity Providers Post-Install section below.

★ Configure Storage Post-Install

Once the cluster is healthy and all nodes are Ready, you must configure persistent storage. Without a working StorageClass, the internal image registry, monitoring stack, and most operators cannot persist data.

ℹ️

Why this matters immediately: The OpenShift internal image registry is set to Removed or EmptyDir by default after a UPI install. You must back it with persistent storage before pushing any images.

Option A — NFS StorageClass (Lab / Air-Gapped)

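If you provisioned an NFS share on the Helper Node (Step 9), you can use it to back the registry. The steps below bind the registry PVC to a manually created PersistentVolume, which is the fastest path for lab and air-gapped environments. For a dynamic StorageClass (such as the nfs-client class referenced later), deploy the NFS Subdir External Provisioner.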

1. Configure storage for the Image Registry

If you check the cluster operators with oc get co, you will likely see the image-registry operator reporting AVAILABLE=False, or PROGRESSING=True without ever completing, because it has no storage on which to deploy the registry pods.

Edit the registry configuration: set the management state to 'Managed' and add the 'pvc' and 'claim' keys under the 'storage' key. The operator then creates the 'image-registry-storage' PVC.

oc edit configs.imageregistry.operator.openshift.io

Here is the default configuration before the update:

apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
spec:
  managementState: Removed        # <--- KEY OBSERVATION 1
  storage: {}                     # <--- KEY OBSERVATION 2

Here is the file after the update:

apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
spec:
  managementState: Managed        # changed from Removed
  storage:
    pvc:
      claim:                      # leave the claim blank

2. Verify the Storage

Confirm that the 'image-registry-storage' PVC has been created and is currently in the 'Pending' state:

oc get pvc -n openshift-image-registry

3. Create Persistent Volume

Create an NFS-backed persistent volume for the 'image-registry-storage' PVC to bind to:

oc create -f ~/ocp4-metal-install/manifest/registry-pv.yaml

Note that registry-pv.yaml contains the following:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: registry-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 100Gi
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /shares/registry
    server: 192.168.22.1

4. Verify the Storage Again

After a short wait, the 'image-registry-storage' PVC should be in the 'Bound' state:

oc get pvc -n openshift-image-registry

Option B — OpenShift Data Foundation / ODF (Production)

ODF provides software-defined block, file, and object storage via Ceph running directly on your worker nodes. Minimum requirement: 3 worker nodes, each with at least one raw, unformatted additional disk.

1. Label the storage nodes

oc label node worker-0.lab.ocp.lan cluster.ocs.openshift.io/openshift-storage=""
oc label node worker-1.lab.ocp.lan cluster.ocs.openshift.io/openshift-storage=""
oc label node worker-2.lab.ocp.lan cluster.ocs.openshift.io/openshift-storage=""

2. Install the ODF Operator

oc create namespace openshift-storage

# Create OperatorGroup
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
    - openshift-storage
EOF

# Subscribe to ODF (adjust channel to match your OCP version)
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.14
  installPlanApproval: Automatic
  name: odf-operator
  source: redhat-operators       # Replace with mirrored CatalogSource in air-gapped
  sourceNamespace: openshift-marketplace
EOF

# Wait for all operator pods to be Running
oc get pods -n openshift-storage -w

3. Create the StorageCluster

cat <<EOF | oc apply -f -
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
    - name: ocs-deviceset
      count: 1           # 1 OSD per node x 3 nodes = 3 OSDs total
      replica: 3
      portable: true
      dataPVCTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Gi   # Size of each raw disk to claim
          volumeMode: Block
          storageClassName: localblock   # SC that presents raw block devices
EOF

# Monitor cluster initialisation (typically 5-15 minutes)
oc get storagecluster -n openshift-storage -w

4. StorageClasses created by ODF

StorageClass	Type	Access Mode	Best For
`ocs-storagecluster-ceph-rbd`	Block (Ceph RBD)	RWO	Databases (PostgreSQL, MongoDB), stateful apps
`ocs-storagecluster-cephfs`	File (CephFS)	RWX	Shared media folders, CMS uploads, ML pipelines
`openshift-storage.noobaa.io`	Object (S3 API)	S3	Backups, AI/ML datasets, image registry

Option C — Local Storage Operator (LSO)

LSO presents raw node-local disks as PersistentVolumes without requiring a SAN or NFS server. It is commonly used as the backing layer for ODF.

# Install via Subscription
# channel: stable-4.14 | name: local-storage-operator

# After operator is Running, declare which disks to expose:
cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-disks
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
              - worker-0.lab.ocp.lan
              - worker-1.lab.ocp.lan
              - worker-2.lab.ocp.lan
  storageClassDevices:
    - storageClassName: localblock
      volumeMode: Block
      devicePaths:
        - /dev/sdb      # The second raw disk on each node
EOF

Configure the Internal Image Registry

After storage is ready, switch the registry from Removed to Managed and back it with a PVC:

oc patch configs.imageregistry.operator.openshift.io cluster \
  --type merge \
  --patch '{
    "spec": {
      "managementState": "Managed",
      "storage": {"pvc": {"claim": ""}},
      "replicas": 1
    }
  }'

# The operator auto-creates a PVC; watch it bind
oc get pvc -n openshift-image-registry

# Confirm the registry pod is Running
oc get pods -n openshift-image-registry

ℹ️

Running more than 1 registry replica requires a ReadWriteMany (RWX) PVC, such as NFS or CephFS. For a single replica, ReadWriteOnce (RWO) is sufficient.

Set the Default StorageClass

# Mark one SC as the cluster default
oc patch storageclass nfs-client \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

# Remove the default annotation from any previously default SC
oc patch storageclass old-sc \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'

# Verify
oc get storageclass

Enable Persistent Storage for the Monitoring Stack

Prometheus and Alertmanager use ephemeral storage by default. Configure persistence so metrics survive pod restarts:

cat <<EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 15d
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client    # Or ocs-storagecluster-ceph-rbd
          resources:
            requests:
              storage: 50Gi
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client
          resources:
            requests:
              storage: 10Gi
EOF

★ Set Up Identity Providers Post-Install

After installation the only user is the temporary kubeadmin. You must configure a permanent Identity Provider and then delete kubeadmin to enforce proper authentication and RBAC across the cluster.

⚠️

Do not delete kubeadmin until at least one other user has been granted cluster-admin privileges and you have confirmed that you can log in successfully as that user.
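
The check described in this warning can be expressed as a small guard, sketched below. The helper name is hypothetical; it only compares two strings, so the actual `oc` calls appear in the usage comment. It assumes that `oc whoami` prints `kube:admin` for the temporary user and that `oc auth can-i` prints `yes` or `no`.

```shell
#!/usr/bin/env bash
# Succeed only when the current user is not kubeadmin and holds full privileges.
# $1 = output of `oc whoami`
# $2 = output of `oc auth can-i '*' '*' --all-namespaces`
safe_to_remove_kubeadmin() {
  local whoami="$1" can_i="$2"
  [ "$whoami" != "kube:admin" ] && [ "$can_i" = "yes" ]
}

# Usage, after logging in as the new administrator:
#   if safe_to_remove_kubeadmin "$(oc whoami)" "$(oc auth can-i '*' '*' --all-namespaces)"; then
#     oc delete secret kubeadmin -n kube-system
#   fi

# Demonstration with sample values:
if safe_to_remove_kubeadmin "admin" "yes"; then
  echo "admin has cluster-admin rights: kubeadmin can be removed"
fi
```

Deleting the kubeadmin secret cannot be undone through the API, which is why the guard insists on a different, fully privileged login first.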

Option A — HTPasswd (Simplest / Lab)

HTPasswd stores usernames and bcrypt-hashed passwords in a flat file. Ideal for small teams and fully air-gapped labs where an external directory is not available.

1. Create the htpasswd file and Kubernetes Secret

dnf install httpd-tools -y

# -c creates a new file; omit -c when appending users
htpasswd -c -B -b /tmp/htpasswd admin        RedHatAdmin1!
htpasswd    -B -b /tmp/htpasswd developer    DevPass123!

# Store the file as a Secret in openshift-config
oc create secret generic htpasswd-secret   --from-file=htpasswd=/tmp/htpasswd   -n openshift-config
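
Because the file above is built with `-B` (bcrypt), a quick sanity check before uploading it is to look for entries hashed some other way, for example users appended later without `-B`. This is a rough sketch; `non_bcrypt_users` is a hypothetical helper, and it only inspects the hash prefix.

```shell
#!/usr/bin/env bash
# non_bcrypt_users: prints the user names in an htpasswd file whose hash
# does not look like bcrypt ($2a$, $2b$ or $2y$ prefix). Hypothetical helper.
non_bcrypt_users() {
  local file=$1
  # Each line is "user:hash"; skip comment lines and lines without a hash.
  awk -F: 'NF >= 2 && $1 !~ /^#/ && $2 !~ /^\$2[aby]\$/ { print $1 }' "$file"
}
```

Usage: `non_bcrypt_users /tmp/htpasswd` prints nothing when every entry is bcrypt-hashed.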

2. Register the provider in the OAuth cluster object

oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: htpasswd_provider
      mappingMethod: claim
      type: HTPasswd
      htpasswd:
        fileData:
          name: htpasswd-secret    # Must match the Secret name above
EOF

3. Grant cluster-admin and test login

# Allow ~30 s for oauth-server pods to restart, then:
oc adm policy add-cluster-role-to-user cluster-admin admin

oc login -u admin -p RedHatAdmin1! https://api.lab.ocp.lan:6443
oc whoami    # Should return: admin

4. Adding or changing users later

# Pull the current file out of the Secret
oc extract secret/htpasswd-secret -n openshift-config --to=/tmp --confirm

# Modify it — add a user, change a password, etc.
htpasswd -B -b /tmp/htpasswd newuser NewPass456!

# Push the updated file back — oauth pods restart automatically
oc set data secret/htpasswd-secret   --from-file=htpasswd=/tmp/htpasswd   -n openshift-config

Option B — LDAP / Active Directory

Integrate OpenShift with an existing LDAP directory (Microsoft AD, Red Hat Directory Server, OpenLDAP). Authentication is delegated to the directory; no separate password management is needed inside OpenShift.

1. Store the LDAP bind password as a Secret

oc create secret generic ldap-bind-password   --from-literal=bindPassword='BindUserPassword123!'   -n openshift-config

2. Store the LDAP CA certificate (required for ldaps://)

oc create configmap ldap-ca-cert   --from-file=ca.crt=/path/to/your-ldap-ca.crt   -n openshift-config

3. Configure the OAuth object for LDAP

oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: ldap_provider
      mappingMethod: claim
      type: LDAP
      ldap:
        attributes:
          id:                [dn]
          email:             [mail]
          name:              [cn]
          preferredUsername: [sAMAccountName]   # For AD; use uid for OpenLDAP
        bindDN: "CN=ocp-svc,OU=ServiceAccounts,DC=corp,DC=lan"
        bindPassword:
          name: ldap-bind-password
        ca:
          name: ldap-ca-cert           # Remove this block for public-CA LDAP
        insecure: false
        url: "ldaps://dc01.corp.lan:636/OU=Users,DC=corp,DC=lan?sAMAccountName?sub?(objectClass=person)"
EOF

ℹ️

LDAP URL format: ldaps://<host>:<port>/<base-dn>?<attribute>?<scope>?<filter>. Always prefer ldaps:// (port 636) over ldap:// (port 389) to encrypt credentials in transit.
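
To make the URL format concrete, the sketch below splits an LDAP URL into the parts named above using only bash parameter expansion. `parse_ldap_url` is a hypothetical helper for illustration; it assumes the URL carries an explicit port and that the filter contains no `?` character.

```shell
#!/usr/bin/env bash
# parse_ldap_url: splits an LDAP URL into its parts and prints one
# "key=value" line per part. Hypothetical helper for illustration.
parse_ldap_url() {
  local url=$1 scheme rest hostport path base attr scope filter
  scheme=${url%%://*}          # ldap or ldaps
  rest=${url#*://}             # host:port/base?attr?scope?filter
  hostport=${rest%%/*}         # host:port
  path=${rest#*/}              # base?attr?scope?filter
  IFS='?' read -r base attr scope filter <<< "$path"
  printf 'scheme=%s\nhost=%s\nport=%s\nbase=%s\nattribute=%s\nscope=%s\nfilter=%s\n' \
    "$scheme" "${hostport%%:*}" "${hostport##*:}" "$base" "$attr" "$scope" "$filter"
}
```

Usage: `parse_ldap_url 'ldaps://dc01.corp.lan:636/OU=Users,DC=corp,DC=lan?sAMAccountName?sub?(objectClass=person)'` prints the scheme, host, port, base DN, attribute, scope, and filter on separate lines.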

4. Sync LDAP groups into OpenShift groups

cat > /tmp/ldap-sync.yaml <<EOF
kind: LDAPSyncConfig
apiVersion: v1
url: ldaps://dc01.corp.lan:636
bindDN: "CN=ocp-svc,OU=ServiceAccounts,DC=corp,DC=lan"
bindPassword: "BindUserPassword123!"
ca: /path/to/your-ldap-ca.crt
rfc2307:
  groupsQuery:
    baseDN: "OU=OCP-Groups,DC=corp,DC=lan"
    scope: sub
    derefAliases: never
    filter: (objectClass=group)
  groupUIDAttribute: dn
  groupNameAttributes: [cn]
  groupMembershipAttributes: [member]
  usersQuery:
    baseDN: "OU=Users,DC=corp,DC=lan"
    scope: sub
    derefAliases: never
  userUIDAttribute: dn
  userNameAttributes: [sAMAccountName]
EOF

# Dry-run — preview what will change without applying
oc adm groups sync --sync-config=/tmp/ldap-sync.yaml

# Apply the sync
oc adm groups sync --sync-config=/tmp/ldap-sync.yaml --confirm

# View synced groups
oc get groups

Option C — GitHub / GitLab OAuth

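Suited to teams that already use GitHub or GitLab for source control. This option only works when the cluster can reach the provider's OAuth endpoints, so it does not fit a fully air-gapped cluster unless the Git service is hosted inside the same network.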

1. Register an OAuth Application on GitHub or GitLab

GitHub path: Settings → Developer settings → OAuth Apps → New OAuth App
Set Authorization callback URL to: https://oauth-openshift.apps.lab.ocp.lan/oauth2callback/github
Copy the generated Client ID and Client Secret
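
The callback URL follows a fixed pattern built from the cluster name, the base domain, and the `name` of the identity provider entry. A small sketch, assuming the default `apps.<cluster>.<baseDomain>` ingress domain; `oauth_callback_url` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# oauth_callback_url: builds the OAuth callback URL for an identity provider.
# Arguments: cluster name, base domain, and the IdP "name" from the OAuth CR.
# Assumes the default ingress domain apps.<cluster>.<baseDomain>.
oauth_callback_url() {
  local cluster=$1 base_domain=$2 idp_name=$3
  printf 'https://oauth-openshift.apps.%s.%s/oauth2callback/%s\n' \
    "$cluster" "$base_domain" "$idp_name"
}
```

Usage: `oauth_callback_url lab ocp.lan github` prints the URL shown above. The last path segment must match the provider `name` used in the OAuth object.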

2. Store the Client Secret

oc create secret generic github-client-secret   --from-literal=clientSecret=<your-github-client-secret>   -n openshift-config

3. Configure the OAuth object

oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: github
      mappingMethod: claim
      type: GitHub
      github:
        clientID: "<your-github-client-id>"
        clientSecret:
          name: github-client-secret
        organizations:
          - my-github-org      # Restrict access to members of this org
EOF

Assign Roles with RBAC

After users or groups are created, assign them the appropriate role. OpenShift ships with several default cluster roles; these five are the ones used most often:

Role	Scope	What It Allows
`cluster-admin`	Cluster-wide	Full, unrestricted access to every resource
`cluster-reader`	Cluster-wide	Read-only access to all resources
`admin`	Namespace	Full control within a specific project/namespace
`edit`	Namespace	Create, update, and delete most resources in a project
`view`	Namespace	Read-only access within a project

# Cluster-wide role assignments
oc adm policy add-cluster-role-to-user  cluster-admin  admin
oc adm policy add-cluster-role-to-group cluster-reader ops-team

# Namespace-scoped role assignments
oc adm policy add-role-to-user  admin    alice    -n my-project
oc adm policy add-role-to-group edit     dev-team -n my-project
oc adm policy add-role-to-user  view     bob      -n my-project

# Verify what a user is allowed to do
oc auth can-i get pods --as=alice -n my-project
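
The single `can-i` check above can be repeated for a handful of actions to get a quick picture of what a role grants. A minimal sketch, assuming a logged-in `oc` session that is allowed to impersonate users; `check_access` and the list of verb and resource pairs are illustrative choices.

```shell
#!/usr/bin/env bash
# check_access: prints "verb resource: yes|no" for a few common actions,
# impersonating the given user in the given namespace.
check_access() {
  local user=$1 ns=$2 pair verb resource answer
  for pair in "get pods" "create deployments" "delete secrets"; do
    read -r verb resource <<< "$pair"
    # `oc auth can-i` prints yes or no and exits non-zero on "no".
    answer=$(oc auth can-i "$verb" "$resource" --as="$user" -n "$ns" 2>/dev/null) || true
    printf '%s %s: %s\n' "$verb" "$resource" "${answer:-no}"
  done
}
```

Usage: `check_access bob my-project`. For a user bound only to `view`, the first line would be expected to read yes and the other two no.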

Delete the kubeadmin User

Once your permanent IdP is working and at least one user has cluster-admin, delete the kubeadmin Secret. Red Hat recommends this as a hardening step: the kubeadmin password hash is stored in a Secret in the kube-system namespace, and a shared superuser login should not remain on the cluster permanently.

⚠️

This is irreversible. Confirm you can run oc get nodes as your new admin user before executing the delete command. You cannot recover kubeadmin without reinstalling the cluster.

# Log out of kubeadmin and verify your new admin works
oc logout
oc login -u admin -p RedHatAdmin1! https://api.lab.ocp.lan:6443
oc get nodes     # Must return all nodes in Ready state

# Now it is safe to delete kubeadmin
oc delete secret kubeadmin -n kube-system
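
Because the delete cannot be undone, the manual checks above can be wrapped in a guard. The sketch below is one possible shape, not an official procedure; `remove_kubeadmin` is a hypothetical helper, and it relies on `oc whoami` reporting `kube:admin` for the kubeadmin session.

```shell
#!/usr/bin/env bash
# remove_kubeadmin: deletes the kubeadmin Secret only when the current
# session belongs to a different user that holds full cluster privileges.
remove_kubeadmin() {
  local me
  me=$(oc whoami) || { echo "not logged in" >&2; return 1; }
  if [ "$me" = "kube:admin" ]; then
    echo "refusing: still logged in as kubeadmin" >&2
    return 1
  fi
  # '*' '*' asks whether every verb on every resource is allowed.
  if [ "$(oc auth can-i '*' '*' --all-namespaces 2>/dev/null)" != "yes" ]; then
    echo "refusing: $me does not have cluster-admin level access" >&2
    return 1
  fi
  oc delete secret kubeadmin -n kube-system
}
```

Usage: log in as the new admin user, then run `remove_kubeadmin`. The function stops without deleting anything if either check fails.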

Verify the Complete Authentication Configuration

# List all configured identity providers
oc get oauth cluster -o jsonpath='{.spec.identityProviders[*].name}'

# List every user OpenShift knows about
oc get users

# List all identities (shows which provider created each entry)
oc get identity

# Check all cluster-admin bindings
oc get clusterrolebindings -o wide | grep cluster-admin

ℹ️

Multiple IdPs at once: You can list more than one provider in spec.identityProviders. Each entry needs a unique name field. Users authenticating via different providers are treated as separate identities unless you configure a lookup or add mapping method to merge them.

StorageClasses and OpenShift Data Foundation (ODF)

After the cluster installation completes, the environment is "empty." You must now configure Persistent Storage for cluster operations.

OpenShift requires a StorageClass to fulfill Persistent Volume Claims (PVCs) dynamically. The best backend depends on the environment: NFS for simplicity, ODF (formerly OCS) for production-grade software-defined storage.

This section walks through the storage architecture of OpenShift, from the basic concepts of the Container Storage Interface (CSI) to advanced software-defined storage such as OpenShift Data Foundation (ODF).

1. The OpenShift Storage Hierarchy

To understand OpenShift storage, you must distinguish between the physical storage and the virtualized requests made by applications.

Persistent Volume (PV): The actual "disk" (network-attached or local), either pre-provisioned by an administrator or created dynamically through a StorageClass.
Persistent Volume Claim (PVC): The request made by a developer for a certain amount of storage.
StorageClass (SC): The "template" that defines how a PV is created (e.g., fast SSD vs. slow HDD).
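
The relationship between the three objects is easiest to see in a claim: the PVC names a StorageClass, and the StorageClass provisions a matching PV. A minimal sketch that prints such a claim; `render_pvc` is a hypothetical helper, and the claim name and size in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# render_pvc: prints a PersistentVolumeClaim manifest to stdout.
# Arguments: claim name, size (e.g. 5Gi), StorageClass name, access mode.
render_pvc() {
  local name=$1 size=$2 sc=$3 mode=${4:-ReadWriteOnce}
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
spec:
  accessModes:
    - ${mode}
  storageClassName: ${sc}
  resources:
    requests:
      storage: ${size}
EOF
}
```

Usage: `render_pvc db-data 5Gi nfs-client | oc apply -n my-project -f -`, then `oc get pvc -n my-project` to watch the claim bind to a newly created PV.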

2. Core Storage Types

Storage falls into three types, which differ in how workloads access the data and in how many nodes can use a volume simultaneously.

A. Block Storage (RWO - ReadWriteOnce)

Technology: iSCSI, Fibre Channel, AWS EBS, OpenStack Cinder, ODF RBD.
Best For: Databases (PostgreSQL, MongoDB).
Behavior: Only one node can mount the volume at a time. It is highly performant and supports low-latency transactions.

B. File Storage (RWX - ReadWriteMany)

Technology: NFS, Azure Files, ODF CephFS.
Best For: Shared media folders, CMS uploads (WordPress), or data pipelines where multiple pods need to read/write the same files.
Behavior: Multiple nodes can mount the same volume simultaneously.

C. Object Storage

Technology: S3, MinIO, ODF NooBaa.
Best For: Backups, AI/ML datasets, and cloud-native applications.
Behavior: Accessed via API (HTTP/HTTPS) rather than a filesystem mount. It is virtually infinitely scalable.

3. Deep Dive: OpenShift Data Foundation (ODF)

Formerly known as OCS (OpenShift Container Storage), ODF is the "Gold Standard" for OpenShift storage. It is built on Ceph, Rook, and NooBaa.

Key Advantages:

Platform Agnostic: Whether you are on-premise (VMware/Bare Metal) or in the cloud (AWS/Azure), ODF provides the same StorageClasses.

Hyper-Converged: You don't need an external SAN. ODF uses the spare disks already inside your worker nodes.

Dynamic Provisioning: It automatically creates volumes as soon as a developer creates a PVC.

Resilience: By default, data is replicated across 3 different nodes. If one node fails, the data remains available.

ODF Component Breakdown:

Component	Function	Storage Type
Ceph RBD	High-performance block storage	Block (RWO)
CephFS	Shared filesystem storage	File (RWX)
NooBaa	Multi-cloud object gateway	Object (S3)

4. Hostpath and Local Storage

For edge cases or small-scale labs, you may encounter these:

HostPath: Uses a directory on the node’s local disk. Warning: If the pod moves to another node, the data stays behind and the pod loses access.
Local Storage Operator (LSO): A more robust way to use local NVMe/SSD disks. Unlike HostPath, LSO allows the scheduler to track which node "owns" the data.

5. Architectural Decision Matrix

As a Solution Architect, use this table to choose your storage backend:

Use Case	Recommended Storage	Access Mode
Database (Prod)	ODF RBD (Block)	RWO
Content Management	ODF CephFS or NFS	RWX
Machine Learning Models	ODF NooBaa (S3)	Object
Temporary Scratch Space	emptyDir	N/A (pod-local, not a PVC)
Registry Storage	ODF CephFS	RWX

6. Pro-Tips for Production

Snapshotting: Ensure your storage provider supports CSI Snapshots for quick backups before application updates.
Expansion: Use a StorageClass with allowVolumeExpansion: true. This allows you to grow a disk without deleting the pod.
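Capacity Limiting: In multi-tenant clusters, use ResourceQuota objects to prevent one team from consuming all the storage capacity. A quota can cap the total requested storage and the number of PVCs per project; it does not limit IOPS or bandwidth, which must be controlled in the storage backend.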
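
Per-project storage capacity can be capped with a Kubernetes ResourceQuota. The sketch below prints one that limits total requested storage and the number of PVCs; `render_storage_quota` and the quota name `storage-quota` are hypothetical, and the values in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# render_storage_quota: prints a ResourceQuota manifest that caps the total
# storage requested by PVCs and the number of PVCs in one namespace.
# Arguments: namespace, total storage (e.g. 100Gi), maximum PVC count.
render_storage_quota() {
  local ns=$1 total=$2 max_pvcs=$3
  cat <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: ${ns}
spec:
  hard:
    requests.storage: ${total}
    persistentvolumeclaims: "${max_pvcs}"
EOF
}
```

Usage: `render_storage_quota my-project 100Gi 10 | oc apply -f -`, then `oc describe quota storage-quota -n my-project` to see current consumption.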

YAML Configuration Details

The install-config.yaml Blueprint

This is the only file you create manually. It acts as the blueprint for the entire installation. Key fields to populate:

pullSecret — Authorizes nodes to pull OpenShift images from Red Hat registries.
sshKey — Allows SSH access into RHCOS nodes as the core user for troubleshooting.
networking — Defines cluster and service network CIDRs.
imageContentSources — Points to your local mirror registry (required for air-gapped installs).

How to Get the Pull Secret

Log in to the Red Hat OpenShift Cluster Manager at console.redhat.com/openshift.
Download the pull secret using the "Download pull secret" button.
Paste the entire single-line JSON string into your install-config.yaml inside single quotes.
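
Each registry entry in the pull secret carries an `auth` value, which is the base64 encoding of `user:password`. For an air-gapped install, an entry for the local mirror registry has to be produced the same way. A minimal sketch; `registry_auth` and `pull_secret_entry` are hypothetical helpers, and the credentials in the usage line are placeholders.

```shell
#!/usr/bin/env bash
# registry_auth: prints the base64 "auth" value for a pull secret entry,
# computed from a user name and password.
registry_auth() {
  local user=$1 password=$2
  # tr strips the line wrapping that some base64 implementations add.
  printf '%s:%s' "$user" "$password" | base64 | tr -d '\n'
}

# pull_secret_entry: prints one JSON fragment for the "auths" object.
pull_secret_entry() {
  local registry=$1 user=$2 password=$3
  printf '"%s":{"auth":"%s"}\n' "$registry" "$(registry_auth "$user" "$password")"
}
```

Usage: `pull_secret_entry mirror-registry.ocp.lan:8443 init secret` prints a fragment that can be merged into the `auths` object next to the Red Hat entries. Keep the merged pull secret on a single line.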

How to Get the SSH Key

# Check for existing keys
ls ~/.ssh/id_rsa.pub || ls ~/.ssh/id_ed25519.pub

# Generate a new key pair (if needed)
ssh-keygen -t ed25519 -f ~/.ssh/id_ocp -C "admin@ocp-cluster"

# Output the public key to copy into install-config.yaml
cat ~/.ssh/id_ocp.pub

OpenShift Client vs. Installer — Quick Reference

Feature	OpenShift Client (`oc`)	OpenShift Installer
Filename	openshift-client-linux.tar.gz	openshift-install-linux.tar.gz
Primary Goal	Managing an existing cluster	Creating or destroying a cluster
Main Binary	`oc` (and `kubectl`)	`openshift-install`
Usage Period	Daily, for the life of the cluster	Primarily during Day 1 setup
Capabilities	Deploy apps, check logs, manage users	Provision VMs, generate Ignition files

Helper Node Interface Roles

Interface	Typical Role	Description
`ens192`	External / Public	Front-end traffic — connects to the internet or corporate load balancer to serve applications.
`ens224`	Internal / Private	Back-end traffic — master/worker node communication, storage traffic (CSI/NFS).

Building an OpenShift cluster on-premises - Installation Methods

Building an OpenShift cluster on-premises requires shifting from the "push-button" automation of public clouds to a more hands-on infrastructure management approach. In 2026, the process is largely standardized through Red Hat's Assisted Installer or Agent-based methods.

1. Assisted Installer

A user-friendly web interface (hosted at console.redhat.com) that generates a discovery ISO. You boot your on-prem servers with this ISO, and they "call home" to the web console, allowing you to configure the cluster graphically.

2. IPI (Installer-Provisioned Infrastructure)

Full automation. The installer has API access to your infrastructure (like VMware vSphere or OpenStack) and creates the VMs, storage, and networking for you.

3. UPI (User-Provisioned Infrastructure)

Maximum control. You manually prepare the VMs, load balancers, and DNS. This is typical for Bare Metal or highly restricted "Air-Gapped" environments.

Minimum Cluster Hardware (Production Grade)

Node Type	CPU	RAM	Disk
Control Plane (3x)	4 vCPU	16 GB	120 GB (SSD preferred)
Compute/Worker (2x+)	4 vCPU	16 GB	120 GB
Bootstrap (1x)	4 vCPU	16 GB	120 GB (Deleted after install)

Do We Need the Bootstrap Node to Add a New Control Plane or Worker Node?

No. Once the cluster is up and running (Day 2 operations), the Control Plane (Masters) takes over all management tasks, and the bootstrap node is no longer needed.

1. Adding a New Worker Node

When you boot a new Worker node with its Ignition file, it communicates directly with the API Server on the Master nodes.

CSR Approval: You must approve the node's Certificate Signing Requests (CSRs) before it can join. List them with oc get csr and approve each one with oc adm certificate approve <name>.
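
Approving CSRs one at a time gets tedious when several nodes join at once. The sketch below approves every CSR that has no status yet (that is, is still Pending). `approve_pending_csrs` is a hypothetical helper; on a real cluster, review the output of `oc get csr` first, because blanket approval trusts every requester.

```shell
#!/usr/bin/env bash
# approve_pending_csrs: approves every CSR that has no status yet (Pending).
approve_pending_csrs() {
  local name
  # The go-template prints only the names of CSRs without a status block.
  oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' |
  while read -r name; do
    if [ -n "$name" ]; then
      oc adm certificate approve "$name"
    fi
  done
}
```

Usage: run `approve_pending_csrs`, wait a moment, and run it again. A new node typically raises a client CSR first and a serving CSR only after the first one is approved.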

2. Adding a New Control Plane (Master) Node

OpenShift clusters are typically designed with an odd number of Control Plane nodes (usually 3) to maintain etcd quorum. If you want to move from 3 to 5 Masters, you add them to the existing, healthy cluster. Each new Master joins the existing etcd cluster managed by the current Masters.
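
The preference for an odd member count follows from the quorum arithmetic: a cluster of N etcd members needs floor(N/2) + 1 of them to agree. The short sketch below works this out for 3, 4, and 5 members; the function names are illustrative.

```shell
#!/usr/bin/env bash
# etcd_quorum: members that must agree for a cluster of N members.
etcd_quorum() { echo $(( $1 / 2 + 1 )); }

# etcd_fault_tolerance: members that can fail while quorum is kept.
etcd_fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5; do
  printf 'members=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(etcd_quorum "$n")" "$(etcd_fault_tolerance "$n")"
done
```

Four members tolerate one failure, the same as three, so an even count adds cost without adding resilience. Five members tolerate two.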

Key Considerations for UPI

Ignition Expiry

Ignition files contain certificates valid for 24 hours. If you don't finish the install by then, you must regenerate them.
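
A rough way to catch stale Ignition files before booting nodes is to check their age. The sketch below treats a file modified more than 24 hours ago as expired. `ignition_expired` is a hypothetical helper, and file modification time is only a proxy for the age of the embedded certificates.

```shell
#!/usr/bin/env bash
# ignition_expired: succeeds (exit 0) when the file was last modified more
# than 24 hours ago. Modification time is only a proxy for certificate age.
ignition_expired() {
  local file=$1
  # -mmin +1440 matches files older than 1440 minutes (24 hours).
  [ -n "$(find "$file" -maxdepth 0 -mmin +1440 2>/dev/null)" ]
}
```

Usage: `ignition_expired bootstrap.ign && echo "regenerate the Ignition configs"`, run from the directory that holds the generated files.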

Disk Cleanup

If an install fails, you must wipe the disks of the nodes before retrying. RHCOS will not overwrite an existing partition table automatically.

How to Clean the Disks After a Failed Installation

Wiping the disks is a critical step because if RHCOS detects an existing ignition configuration or a partition table, it may fail to apply the new configuration, leading to a "zombie" node state.

RHCOS uses Ignition, which runs in the initramfs stage. If Ignition sees a partition labeled boot or root already on the disk, it might assume the installation was already completed and skip critical configuration steps.

Pro-Tip: If you are debugging a failed bootstrap, always wipe the Bootstrap node first. It is the source of truth for the rest of the cluster. If the Bootstrap node has old data, it will feed incorrect information to the Master nodes.

The "Live ISO" Method (Easiest for Manual Labs)

Boot the node using the RHCOS Live ISO.
Once you reach the prompt (or press CTRL+ALT+F2 to get a console), identify your disk (Usually, it is /dev/sda or /dev/nvme0n1):
```
lsblk
```
Then wipe the filesystem and partition-table signatures from that disk:
```
sudo wipefs -a /dev/sda 
```
Reboot the node and start the installation again.

How Nodes Know Which Images to Pull from the Mirror Registry

When you run coreos-installer with the -u flag, the node downloads the raw RHCOS operating system image from your local web server — this is just the base OS with no OpenShift components. After first boot, the node needs to pull dozens of OpenShift container images (API server, etcd, operators, etc.) from a registry. In an air-gapped environment, two fields in install-config.yaml work together to make this seamless.

1. imageContentSources (Mirror Redirect Rules)

This field tells every node: "whenever you need an image from quay.io or registry.redhat.io, silently redirect that request to my local mirror instead." The node never needs to know it's in a disconnected environment — it requests images by their original Red Hat names and OpenShift handles the redirect automatically.

imageContentSources:
  - mirrors:
    - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
    - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
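
Conceptually, each rule is a prefix rewrite on the image reference: the `source` prefix is swapped for the `mirrors` entry and the rest of the reference is kept. The sketch below illustrates only that rewrite. `mirror_ref` is a hypothetical helper, the digest in the usage line is a placeholder, and the real redirect is performed by the container runtime on each node, typically for images pulled by digest.

```shell
#!/usr/bin/env bash
# mirror_ref: rewrites an image reference that starts with SOURCE so that it
# points at MIRROR; references that do not match pass through unchanged.
# Simplified: a plain string-prefix match, not a full repository match.
mirror_ref() {
  local image=$1 source=$2 mirror=$3
  case "$image" in
    "$source"*) printf '%s%s\n' "$mirror" "${image#"$source"}" ;;
    *)          printf '%s\n' "$image" ;;
  esac
}
```

Usage: `mirror_ref 'quay.io/openshift-release-dev/ocp-release@sha256:abc123' 'quay.io/openshift-release-dev/ocp-release' 'mirror-registry.ocp.lan:8443/openshift/release'` prints `mirror-registry.ocp.lan:8443/openshift/release@sha256:abc123`.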

2. additionalTrustBundle (Internal CA Certificate)

Your local mirror registry uses a self-signed or internally-issued TLS certificate. Without this field, nodes would reject connections to it as untrusted. The additionalTrustBundle injects your internal CA certificate into every node's trust store so HTTPS connections to the mirror registry are accepted without error.

additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  <your internal CA certificate here>
  -----END CERTIFICATE-----
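
Indentation errors in this block are a common cause of install-config.yaml parse failures. The sketch below generates the block from a PEM file so that every certificate line is indented by two spaces. `trust_bundle_yaml` is a hypothetical helper, and the file name in the usage line is a placeholder.

```shell
#!/usr/bin/env bash
# trust_bundle_yaml: prints an additionalTrustBundle block for
# install-config.yaml from a PEM file, indenting each line by two spaces.
trust_bundle_yaml() {
  local pem_file=$1
  printf 'additionalTrustBundle: |\n'
  sed 's/^/  /' "$pem_file"
}
```

Usage: `trust_bundle_yaml registry-ca.crt >> install-config.yaml`, after confirming the file does not already contain an additionalTrustBundle key.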

Complete `install-config.yaml`

Below is a fully annotated install-config.yaml covering every field you need for a UPI air-gapped deployment. Every line is commented so you know exactly what it controls and why it exists.

⚠️

The installer consumes and deletes this file. Always keep a backup copy before running openshift-install create manifests. Once deleted, you cannot recover it from the generated output.
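
Since the installer consumes the file, the backup is worth scripting. A minimal sketch that copies install-config.yaml to a timestamped file outside the installation directory; `backup_install_config` and both directory arguments are hypothetical.

```shell
#!/usr/bin/env bash
# backup_install_config: copies install-config.yaml from the installation
# directory into a backup directory under a timestamped name and prints
# the path of the copy.
backup_install_config() {
  local install_dir=$1 backup_dir=$2 src backup
  src="$install_dir/install-config.yaml"
  [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
  mkdir -p "$backup_dir" || return 1
  backup="$backup_dir/install-config.yaml.$(date +%Y%m%d-%H%M%S)"
  cp -p "$src" "$backup" && printf '%s\n' "$backup"
}
```

Usage: `backup_install_config ~/ocp-install ~/ocp-backups` before running `openshift-install create manifests --dir ~/ocp-install`.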

# ─────────────────────────────────────────────────────────────────
# API VERSION
# Must always be v1. This is the only supported version.
# ─────────────────────────────────────────────────────────────────
apiVersion: v1

# ─────────────────────────────────────────────────────────────────
# BASE DOMAIN
# The parent DNS domain for your cluster.
# The cluster name below is prepended to form the full domain:
#   <clusterName>.<baseDomain>  →  lab.ocp.lan
# Your DNS must have records for:
#   api.lab.ocp.lan        → Load Balancer IP (port 6443)
#   *.apps.lab.ocp.lan     → Ingress Load Balancer IP
# ─────────────────────────────────────────────────────────────────
baseDomain: ocp.lan

# ─────────────────────────────────────────────────────────────────
# CLUSTER NAME
# Short name for this cluster. Combined with baseDomain above.
# Used in all internal DNS names and TLS certificates.
# ─────────────────────────────────────────────────────────────────
metadata:
  name: lab             #Cluster name

# ─────────────────────────────────────────────────────────────────
# COMPUTE (WORKER) NODES
# Defines the default worker MachineSet.
# In UPI mode, the installer does NOT create machines automatically.
# Set replicas: 0 — you will boot workers manually.
# hyperthreading: Enabled is the default and recommended setting.
# ─────────────────────────────────────────────────────────────────
compute:
  - name: worker
    replicas: 0                  # Must be 0 for UPI — you provision workers manually
    hyperthreading: Enabled
    architecture: amd64          # Use arm64 for ARM-based nodes

# ─────────────────────────────────────────────────────────────────
# CONTROL PLANE (MASTER) NODES
# Always set replicas: 3 for a production HA cluster.
# A single master (replicas: 1) is supported only for dev/test.
# ─────────────────────────────────────────────────────────────────
controlPlane:
  name: master
  replicas: 3                    # 3 = HA. Never use 2 (no quorum).
  hyperthreading: Enabled
  architecture: amd64

# ─────────────────────────────────────────────────────────────────
# NETWORKING
# Defines the internal IP address ranges used inside the cluster.
# These are virtual ranges — they do NOT need to exist on your
# physical network. They must not overlap with your node IPs.
#
# networkType: OVNKubernetes is the current default and recommended.
#              OpenShiftSDN is deprecated as of OCP 4.14 and is not
#              available for new installations from OCP 4.15.
#
# clusterNetwork: The CIDR for pod IP addresses.
#   hostPrefix: /23 means each node gets a /23 subnet (~510 pod IPs).
#
# serviceNetwork: The CIDR for Kubernetes Service (ClusterIP) objects.
#   Must be a single entry. /16 gives 65,534 service IPs.
#
# machineNetwork: The CIDR of your physical node network.
#   Must match the real subnet your nodes are on (192.168.22.0/24).
# ─────────────────────────────────────────────────────────────────
networking:
  networkType: OVNKubernetes
  clusterNetwork:
    - cidr: 10.128.0.0/14        # Pod IP range across the cluster
      hostPrefix: 23             # Subnet size allocated per node
  serviceNetwork:
    - 172.30.0.0/16              # Kubernetes service (ClusterIP) range
  machineNetwork:
    - cidr: 192.168.22.0/24      # Must match your physical node subnet

# ─────────────────────────────────────────────────────────────────
# PLATFORM
# Set to "none" for UPI — tells the installer not to create any
# cloud or virtualization resources automatically.
# ─────────────────────────────────────────────────────────────────
platform:
  none: {}

# ─────────────────────────────────────────────────────────────────
# FIPS MODE (Optional)
# Enables FIPS 140-2/3 validated cryptographic modules.
# Required for US federal / DoD environments.
# Cannot be changed after installation.
# ─────────────────────────────────────────────────────────────────
fips: false

# ─────────────────────────────────────────────────────────────────
# PUBLISH STRATEGY
# Controls how the cluster's user-facing endpoints (API, routes) are exposed.
#   External: endpoints are published externally (default)
#   Internal: endpoints are published only on the private network
# Internal is supported only on cloud platforms. With platform "none",
# keep External and restrict access with your own DNS and load balancer.
# ─────────────────────────────────────────────────────────────────
publish: External

# ─────────────────────────────────────────────────────────────────
# PULL SECRET
# Authenticates nodes to pull container images from:
#   - registry.redhat.io   (Red Hat operator images)
#   - quay.io              (OpenShift release images)
#   - your local mirror    (air-gapped environments)
#
# For air-gapped installs, add your mirror registry credentials
# into this JSON alongside the Red Hat entries.
#
# Get from: https://console.redhat.com/openshift/install/pull-secret
# Must be a single-line JSON string inside single quotes.
# ─────────────────────────────────────────────────────────────────
pullSecret: '{"auths":{"registry.redhat.io":{"auth":"<base64-encoded-credentials>"},"quay.io":{"auth":"<base64-encoded-credentials>"},"mirror-registry.ocp.lan:8443":{"auth":"<base64-encoded-mirror-credentials>"}}}'

# ─────────────────────────────────────────────────────────────────
# SSH KEY
# Your SSH public key, injected into every RHCOS node.
# Allows SSH access as the built-in "core" user for troubleshooting.
# Only the PUBLIC key goes here — never the private key.
# Generate with: ssh-keygen -t ed25519 -f ~/.ssh/id_ocp
# ─────────────────────────────────────────────────────────────────
sshKey: 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... admin@ocp-cluster'

# ─────────────────────────────────────────────────────────────────
# ADDITIONAL TRUST BUNDLE
# Your internal CA certificate in PEM format.
# Required when your mirror registry uses a self-signed or
# internally-issued TLS certificate.
# Injected into every node's system trust store on first boot.
# Each certificate line must be indented by 2 spaces under the key.
# To capture the certificate chain that the mirror registry presents on
# port 8443, run the command below and paste the result into additionalTrustBundle:
# openssl s_client -showcerts -connect mirror-registry.ocp.lan:8443 </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > registry-ca.crt
# Note: The | symbol is required; it starts a YAML multi-line (literal block) string.
# ─────────────────────────────────────────────────────────────────
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  MIIFazCCA1OgAwIBAgIUYourInternalCAcertificateHere...
  <full PEM certificate content>
  -----END CERTIFICATE-----
# ─────────────────────────────────────────────────────────────────
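# If you cannot obtain the certificate: install-config.yaml has no
# insecureRegistries field. Insecure registries are configured after
# installation in the cluster Image config (image.config.openshift.io/cluster,
# spec.registrySources.insecureRegistries), so the trust bundle above is
# the route to use for the installation itself.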
# ─────────────────────────────────────────────────────────────────
# IMAGE CONTENT SOURCES  (superseded by imageDigestSources in OCP 4.14+)
# Tells every node to redirect image pulls from Red Hat registries
# to your local mirror registry instead.
# The "source" is the original Red Hat registry path.
# The "mirrors" list is where to redirect requests.
# The installer bakes these rules into the .ign files and creates
# an ImageContentSourcePolicy object in the cluster on first boot.
# ─────────────────────────────────────────────────────────────────
imageContentSources:
  - mirrors:
      - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-release

  - mirrors:
      - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev

  - mirrors:
      - mirror-registry.ocp.lan:8443/redhat
    source: registry.redhat.io/redhat

  - mirrors:
      - mirror-registry.ocp.lan:8443/ubi8
    source: registry.redhat.io/ubi8

# ─────────────────────────────────────────────────────────────────
# PROXY (Optional)
# Only needed if your nodes reach the mirror registry through
# an HTTP/HTTPS proxy. Leave out entirely if no proxy is used.
# noProxy: comma-separated list of hosts/CIDRs to bypass the proxy.
# ─────────────────────────────────────────────────────────────────
# proxy:
#   httpProxy: http://proxy.example.com:3128
#   httpsProxy: http://proxy.example.com:3128
#   noProxy: 192.168.22.0/24,mirror-registry.ocp.lan,.ocp.lan

# ─────────────────────────────────────────────────────────────────
# CLUSTER CAPABILITIES (Optional — OCP 4.12+)
# Controls which optional cluster components get installed.
# Use to reduce footprint in resource-constrained environments.
# "vCurrent" installs all capabilities for your OCP version.
# ─────────────────────────────────────────────────────────────────
# capabilities:
#   baselineCapabilitySet: vCurrent
#   additionalEnabledCapabilities:
#     - marketplace
#     - openshift-samples
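
The clusterNetwork figures in the NETWORKING section can be checked with a little arithmetic: a hostPrefix of 23 leaves 9 host bits per node, and a /14 cluster network split into /23 subnets yields 2^(23-14) node subnets. A minimal sketch; the function names are illustrative, and the per-node figure is an upper bound because the network plugin reserves a few addresses in each subnet.

```shell
#!/usr/bin/env bash
# pod_ips_per_node: usable pod addresses in each node's subnet.
# Argument: hostPrefix.
pod_ips_per_node() { echo $(( (1 << (32 - $1)) - 2 )); }

# max_node_subnets: how many per-node subnets fit in the cluster network.
# Arguments: clusterNetwork prefix length, hostPrefix.
max_node_subnets() { echo $(( 1 << ($2 - $1) )); }

echo "pod IPs per node: $(pod_ips_per_node 23)"
echo "node subnets:     $(max_node_subnets 14 23)"
```

With the values from the file above this reports 510 pod IPs per node and room for 512 nodes. Raising hostPrefix to 24 would halve the addresses per node and double the number of node subnets.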
