Setting Up an OpenShift Cluster
User-Provisioned Infrastructure
in Air-Gapped Environments
A complete, step-by-step guide to manually provisioning VMs, load balancers, and DNS for an OpenShift cluster in a disconnected network.
1 What is UPI?
With User-Provisioned Infrastructure (UPI), you have maximum control over the cluster setup. You are responsible for manually preparing all virtual machines, load balancers, and DNS records before the OpenShift installer runs. This is the preferred approach for secure, air-gapped, or heavily regulated environments.
2 Air-Gapped Prerequisites
The primary challenge in a disconnected environment is the absence of direct access to the Red Hat Container Registry. You must bridge this gap before initiating the installation.
Mirror Registry
Establish a local container registry (e.g., Red Hat Quay or JFrog Artifactory) within your secure perimeter. Use the oc mirror plugin to sync OpenShift release images, operator catalogs, and Helm charts from the internet to a portable medium, then load them into your local registry.
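For reference, a minimal oc mirror workflow looks roughly like the sketch below. The channel, operator list, output paths, and the mirror-registry.ocp.lan:8443 hostname are lab assumptions; adjust them to your environment.
# 1. Describe what to mirror (run on the connected host)
cat > imageset-config.yaml <<EOF
kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v1alpha2
storageConfig:
  local:
    path: ./mirror-metadata
mirror:
  platform:
    channels:
      - name: stable-4.14            # match your target OCP version
  operators:
    - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.14
      packages:
        - name: odf-operator         # mirror only the operators you need
EOF
# 2. Mirror to a portable directory / removable disk
oc mirror --config=imageset-config.yaml file://mirror-output
# 3. On the disconnected side, push the generated archive into the local registry
#    (the archive name follows oc-mirror's sequence naming)
oc mirror --from=./mirror-output/mirror_seq1_000000.tar docker://mirror-registry.ocp.lan:8443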
Internal DNS & NTP
Precise time synchronization and split-horizon DNS are non-negotiable. Every node must be able to resolve the local registry hostname and all internal API endpoints.
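A quick sanity check, using this guide's example hostnames, before going any further:
# Every node and the helper must resolve these names and be time-synced
dig +short mirror-registry.ocp.lan        # local mirror registry
dig +short api-int.lab.ocp.lan            # internal API endpoint
chronyc tracking                          # confirm chrony/NTP synchronization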
3 OS Strategy
Red Hat enforces a specific OS strategy to ensure the self-healing nature of OpenShift.
Control plane nodes must run Red Hat Enterprise Linux CoreOS (RHCOS), an immutable operating system that uses rpm-ostree for safe, transactional updates. For worker nodes you have some flexibility here, but RHCOS is the supported default.
4 Architecture & Helper Node
This setup uses a Helper Node as the backbone — it acts as a bridge between the external network and the internal cluster network using two network interfaces.
Network Interfaces
| Interface | Zone | Subnet | Role |
|---|---|---|---|
| ens192 | External | 192.168.0.X | Front-end / internet-facing traffic and corporate LB |
| ens224 | Internal | 192.168.22.X | Back-end cluster communication and storage traffic |
Infrastructure Services Provided by the Helper Node
The Helper Node hosts every supporting service the cluster needs: DNS (BIND), DHCP, an HTTP server (Apache) for Ignition files and the RHCOS image, load balancing (HAProxy), and NFS storage for the image registry. Steps 5 through 9 below configure each of these in turn.
5 Cluster Node Roles
Bootstrap Node
Used only during the initial installation to orchestrate Control Plane creation. Decommissioned once the control plane is healthy.
Control Plane (Masters)
The "brains" of the cluster. Runs the API server, etcd database, and controllers. Three nodes ensure high availability.
Worker Nodes
Where your actual applications, containers, and pods run. CSRs must be manually approved in UPI mode.
6 Core Deployment Workflow
Phase I — Configuration & Manifest Generation
On the Bastion host, define the cluster in install-config.yaml, pointing to your local mirror registry. Generate Kubernetes manifests and convert them to Ignition configs (.ign files) that RHCOS nodes execute on first boot.
Phase II — Infrastructure Provisioning
Configure a high-availability load balancer (HAProxy/F5) for the API (port 6443), Machine Config Server (port 22623), and Ingress (ports 80/443). Host the generated .ign files on an internal HTTP server.
Phase III — Bootstrap Sequence
Boot the Bootstrap node → it pulls its config and initiates control plane creation. Boot Masters → they form the etcd quorum. Boot Workers → manually approve their CSRs to join the cluster.
7 Detailed Setup Steps
Prepare the Bastion / Helper Node.
The Helper Node OS should be a Red Hat Enterprise Linux or CentOS 8 x86_64 image.
Log in to the Red Hat OpenShift Cluster Manager
Select 'Create Cluster' from the 'Clusters' navigation menu
Select 'Red Hat OpenShift Container Platform'
Select 'Run on Bare Metal'
Download the following files:
- Openshift Installer for Linux (openshift-install-linux.tar.gz)
- Pull secret
- Command Line Interface for Linux and your workstation's OS (openshift-client-linux.tar.gz)
- Red Hat Enterprise Linux CoreOS (RHCOS)
- rhcos-X.X.X-x86_64-metal.x86_64.raw.gz
- rhcos-X.X.X-x86_64-installer.x86_64.iso (or rhcos-X.X.X-x86_64-live.x86_64.iso for newer versions)
Notes: Before powering on a single node, these must be ready:
1) Load Balancer:
- Port 6443 (API): Points to Bootstrap + 3 Masters.
- Port 22623 (Machine Config): Points to Bootstrap + 3 Masters.
- Ports 80/443 (Apps): Points to all Worker nodes.
2) DNS Records (example zone records shown below):
- api.<cluster>.<domain> -> LB VIP for 6443.
- api-int.<cluster>.<domain> -> LB VIP for 6443/22623.
- *.apps.<cluster>.<domain> -> LB VIP for 80/443/8443.
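For illustration, the forward-zone records behind those names look roughly like this in BIND zone-file syntax. The cloned repository's zone files in Step 5 already contain equivalent entries; the VIP here is assumed to be the Helper Node at 192.168.22.1.
api.lab.ocp.lan.            IN  A  192.168.22.1     ; HAProxy VIP, port 6443
api-int.lab.ocp.lan.        IN  A  192.168.22.1     ; HAProxy VIP, ports 6443/22623
*.apps.lab.ocp.lan.         IN  A  192.168.22.1     ; Ingress VIP, ports 80/443
ocp-bootstrap.lab.ocp.lan.  IN  A  192.168.22.200   ; Bootstrap node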
Step 1 — Install Client Tools
# Extract and install the OpenShift client tools
tar xvf openshift-client-linux.tar.gz
mv oc kubectl /usr/local/bin
# Verify installation
kubectl version
oc version
# Extract the OpenShift Installer
tar xvf openshift-install-linux.tar.gz
Step 2 — Configure Static IP for Internal NIC
Run nmtui-edit ens224 or edit /etc/sysconfig/network-scripts/ifcfg-ens224 with these values:
Address: 192.168.22.1
DNS Server: 127.0.0.1
Search Domain: ocp.lan
Default Route: Disabled
Auto-connect: Enabled
If changes don't apply, bounce the NIC: nmcli connection down ens224 && nmcli connection up ens224
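The resulting /etc/sysconfig/network-scripts/ifcfg-ens224 should look roughly like this sketch (key names follow the legacy network-scripts format):
DEVICE=ens224
NAME=ens224
ONBOOT=yes           # Auto-connect: Enabled
BOOTPROTO=none       # Static addressing
IPADDR=192.168.22.1
PREFIX=24
DNS1=127.0.0.1
DOMAIN=ocp.lan       # Search domain
DEFROUTE=no          # Default route: Disabled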
Step 3 — Configure Firewall Zones
# Assign interfaces to zones
nmcli connection modify ens224 connection.zone internal
nmcli connection modify ens192 connection.zone external
# Enable masquerading (NAT) on both zones
firewall-cmd --zone=external --add-masquerade --permanent
firewall-cmd --zone=internal --add-masquerade --permanent
firewall-cmd --reload
# Verify zones and IP forwarding
firewall-cmd --get-active-zones
firewall-cmd --list-all --zone=internal
firewall-cmd --list-all --zone=external
cat /proc/sys/net/ipv4/ip_forward # Should return 1
Step 4 — Clone Config Repository
dnf update -y
dnf install git -y
git clone https://github.com/ryanhay/ocp4-metal-install
Step 5 — Install & Configure DNS (BIND)
dnf install bind bind-utils -y
cp ~/ocp4-metal-install/dns/named.conf /etc/named.conf
cp -R ~/ocp4-metal-install/dns/zones /etc/named/
# Open firewall for DNS
firewall-cmd --add-port=53/udp --zone=internal --permanent
firewall-cmd --add-port=53/tcp --zone=internal --permanent # Required for OCP 4.9+
firewall-cmd --reload
# Enable and start BIND
systemctl enable named && systemctl start named && systemctl status named
Update the external NIC (ens192) to use 127.0.0.1 as its DNS server and enable "Ignore automatically obtained DNS parameters" via nmtui-edit ens192, then restart NetworkManager:
systemctl restart NetworkManager
# Verify DNS resolution
dig ocp.lan
dig -x 192.168.22.200 # Should resolve to ocp-bootstrap.lab.ocp.lan
Step 6 — Install & Configure DHCP
Update ~/ocp4-metal-install/dhcpd.conf with the actual MAC addresses of each cluster machine, then install and configure the DHCP server:
dnf install dhcp-server -y
cp ~/ocp4-metal-install/dhcpd.conf /etc/dhcp/dhcpd.conf
firewall-cmd --add-service=dhcp --zone=internal --permanent
firewall-cmd --reload
systemctl enable dhcpd && systemctl start dhcpd && systemctl status dhcpd
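As a reference, a host reservation in dhcpd.conf follows the pattern below. The MAC address is a placeholder; the bootstrap IP matches the 192.168.22.200 used elsewhere in this guide.
subnet 192.168.22.0 netmask 255.255.255.0 {
  option routers 192.168.22.1;
  option domain-name-servers 192.168.22.1;
  option domain-name "lab.ocp.lan";

  host ocp-bootstrap {
    hardware ethernet 00:50:56:00:00:01;   # replace with the node's real MAC
    fixed-address 192.168.22.200;
  }
}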
Step 7 — Install & Configure Apache Web Server
dnf install httpd -y
# Change Apache to listen on port 8080 (avoids conflicts)
sed -i 's/Listen 80/Listen 0.0.0.0:8080/' /etc/httpd/conf/httpd.conf
firewall-cmd --add-port=8080/tcp --zone=internal --permanent
firewall-cmd --reload
systemctl enable httpd && systemctl start httpd && systemctl status httpd
# Verify it's running
curl localhost:8080
Step 8 — Install & Configure HAProxy
dnf install haproxy -y
cp ~/ocp4-metal-install/haproxy.cfg /etc/haproxy/haproxy.cfg
Open the required firewall ports:
# Control plane API
firewall-cmd --add-port=6443/tcp --zone=internal --permanent
firewall-cmd --add-port=6443/tcp --zone=external --permanent
# Machine Config Server
firewall-cmd --add-port=22623/tcp --zone=internal --permanent
# Application ingress (HTTP/HTTPS)
firewall-cmd --add-service=http --zone=internal --permanent
firewall-cmd --add-service=http --zone=external --permanent
firewall-cmd --add-service=https --zone=internal --permanent
firewall-cmd --add-service=https --zone=external --permanent
# HAProxy stats UI (accessible at http://<helper-ip>:9000/stats)
firewall-cmd --add-port=9000/tcp --zone=external --permanent
firewall-cmd --reload
# Allow HAProxy SELinux binding and start the service
setsebool -P haproxy_connect_any 1
systemctl enable haproxy && systemctl start haproxy && systemctl status haproxy
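For orientation, the API section of haproxy.cfg follows this pattern. This is a sketch: the control-plane IPs shown are placeholders, and the repository's haproxy.cfg already defines the full set of frontends and backends.
frontend ocp-api
    bind *:6443
    mode tcp
    default_backend ocp-api-be

backend ocp-api-be
    mode tcp
    balance roundrobin
    server ocp-bootstrap 192.168.22.200:6443 check   # remove after bootstrap completes
    server ocp-cp-1      192.168.22.201:6443 check
    server ocp-cp-2      192.168.22.202:6443 check
    server ocp-cp-3      192.168.22.203:6443 check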
Step 9 — Install & Configure NFS Server
Network File System (NFS) is a distributed file system protocol that allows a user on a client computer to access files over a network much like local storage is accessed. Originally developed by Sun Microsystems, it has become the standard for file sharing between Unix and Linux systems.
How NFS Works
- NFS Server: Hosts the physical storage and "exports" (shares) specific directories to the network. It manages permissions and handles requests from clients.
- NFS Client: Mounts the exported directory from the server onto its own local file system. To the user or application on the client side, the files appear to be stored locally.
dnf install nfs-utils -y
mkdir -p /shares/registry
chown -R nobody:nobody /shares/registry
chmod -R 777 /shares/registry
echo "/shares/registry 192.168.22.0/24(rw,sync,root_squash,no_subtree_check,no_wdelay)" > /etc/exports
exportfs -rv
firewall-cmd --zone=internal --add-service=mountd --permanent
firewall-cmd --zone=internal --add-service=rpc-bind --permanent
firewall-cmd --zone=internal --add-service=nfs --permanent
firewall-cmd --reload
systemctl enable nfs-server rpcbind
systemctl start nfs-server rpcbind nfs-mountd
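An optional sanity check from any host on the internal network confirms the export is reachable and writable:
showmount -e 192.168.22.1                    # should list /shares/registry
mount -t nfs 192.168.22.1:/shares/registry /mnt
touch /mnt/write-test && rm /mnt/write-test  # confirm read/write access
umount /mnt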
Step 10 — Generate Installation Files
mkdir /var/www/html/ocp4
cp ~/ocp4-metal-install/install-config.yaml /var/www/html/ocp4
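Because openshift-install create manifests consumes (deletes) install-config.yaml, it is worth keeping a copy before editing and generating anything:
cp /var/www/html/ocp4/install-config.yaml ~/install-config.yaml.bak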
Edit install-config.yaml before proceeding: insert your Pull Secret and SSH public key. See the Configuration Details section below for guidance.
# Generate Kubernetes manifests
~/openshift-install create manifests --dir /var/www/html/ocp4
To control whether workloads can run on Control Plane nodes, edit the scheduler manifest:
ls /var/www/html/ocp4/manifests/cluster-scheduler-02-config.yml
# Set mastersSchedulable: true → allow workloads on masters
# Set mastersSchedulable: false → prevent workloads (default)
# Generate Ignition configs and auth files
~/openshift-install create ignition-configs --dir /var/www/html/ocp4
Step 11 — Host RHCOS Image and Set Permissions
# Move the RHCOS metal image to the web server
mv ~/rhcos-X.X.X-x86_64-metal.x86_64.raw.gz /var/www/html/ocp4/rhcos
# Set correct SELinux context, ownership, and permissions
chcon -R -t httpd_sys_content_t /var/www/html/ocp4/
chown -R apache: /var/www/html/ocp4/
chmod 755 /var/www/html/ocp4/
# Confirm all files are accessible
curl localhost:8080/ocp4/
Step 12 — Boot Cluster Nodes
Boot each node using the RHCOS (Red Hat Enterprise Linux CoreOS) ISO or PXE. During the boot process, you must tell the node where its "brain" (the Ignition file) is, either with the coreos-installer commands below or with a kernel argument such as coreos.inst.ignition_url=http://192.168.22.1:8080/ocp4/<role>.ign
Order of Operations:
- 1. Start Bootstrap node.
- 2. Start 3 Master nodes.
- 3. Wait for the API to come up.
- 4. Start Worker nodes.
# Bootstrap Node
sudo coreos-installer install /dev/sda \
-u http://192.168.22.1:8080/ocp4/rhcos \
-I http://192.168.22.1:8080/ocp4/bootstrap.ign \
--insecure --insecure-ignition
# Control Plane (Master) Nodes
sudo coreos-installer install /dev/sda \
-u http://192.168.22.1:8080/ocp4/rhcos \
-I http://192.168.22.1:8080/ocp4/master.ign \
--insecure --insecure-ignition
# Worker Nodes
sudo coreos-installer install /dev/sda \
-u http://192.168.22.1:8080/ocp4/rhcos \
-I http://192.168.22.1:8080/ocp4/worker.ign \
--insecure --insecure-ignition
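If you PXE-boot instead of typing coreos-installer commands manually, the same locations can be passed as kernel arguments. A sketch, assuming the live kernel/initramfs/rootfs artifacts are also hosted on the Helper web server (exact file names vary by RHCOS version):
# Example PXE APPEND line (single line) for the Bootstrap node
APPEND initrd=rhcos-live-initramfs.x86_64.img ip=dhcp coreos.inst.install_dev=/dev/sda coreos.live.rootfs_url=http://192.168.22.1:8080/ocp4/rhcos-live-rootfs.x86_64.img coreos.inst.image_url=http://192.168.22.1:8080/ocp4/rhcos coreos.inst.ignition_url=http://192.168.22.1:8080/ocp4/bootstrap.ign coreos.inst.insecure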
Step 13 — Monitor Bootstrap & Finalize
# Monitor bootstrap progress from the Helper Node
~/openshift-install --dir /var/www/html/ocp4/ wait-for bootstrap-complete --log-level=debug
Once bootstrapping completes, remove the Bootstrap node from HAProxy and shut it down:
# Remove ocp-bootstrap from /etc/haproxy/haproxy.cfg, then reload
systemctl reload haproxy
# Approve Worker CSRs so workers can join the cluster
oc get csr
oc adm certificate approve <csr-name>
# Verify all nodes are Ready
oc get nodes
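Worker CSRs arrive in two waves (client certificates, then serving certificates). As a convenience, the one-liner below approves everything currently pending; repeat it until no new CSRs appear.
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve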
Step 14 — Post-Installation
# Wait for the installation to finish; this prints the console URL and kubeadmin credentials
~/openshift-install --dir /var/www/html/ocp4 wait-for install-complete
# The kubeadmin password is also stored on disk
cat /var/www/html/ocp4/auth/kubeadmin-password
- Configure Storage: Define StorageClasses (NFS, OCS, or local storage) so applications can persist data. See the Configure Storage section below.
- Set Up Identity Providers: Replace the temporary kubeadmin user with a permanent solution such as LDAP or OAuth. See the Identity Providers section below.
★ Configure Storage Post-Install
Once the cluster is healthy and all nodes are Ready, you must configure persistent storage. Without a working StorageClass, the internal image registry, monitoring stack, and most operators cannot persist data.
The internal image registry is set to Removed (or backed by EmptyDir) by default after a UPI install. You must back it with persistent storage before pushing any images.
Option A — NFS StorageClass (Lab / Air-Gapped)
If you provisioned an NFS share on the Helper Node (Step 9), expose it as a dynamic StorageClass using the NFS Subdir External Provisioner. This is the fastest path for lab and air-gapped environments.
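A minimal sketch of deploying that provisioner with Helm. In an air-gapped cluster the chart and its container image must already be available from your mirror rather than the public repo shown here; the share reuses the Helper Node export from Step 9, and the resulting class name matches the nfs-client references later in this guide.
helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --namespace nfs-provisioner --create-namespace \
  --set nfs.server=192.168.22.1 \
  --set nfs.path=/shares/registry \
  --set storageClass.name=nfs-client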
1. Configure storage for the Image Registry
If you check the cluster operators oc get co, you will likely see the image-registry operator reporting AVAILABLE=False or PROGRESSING=True (but stuck) because it lacks the resources to deploy the registry pods.
Run the following command to edit the registry configuration: set the management state to 'Managed' and add the 'pvc' and 'claim' keys under 'storage'. This causes the operator to create the 'image-registry-storage' PVC.
oc edit configs.imageregistry.operator.openshift.io
Here is the default output before updates:
apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
spec:
  managementState: Removed   # <--- KEY OBSERVATION 1
  storage: {}                # <--- KEY OBSERVATION 2
Here is the file after our updates:
apiVersion: imageregistry.operator.openshift.io/v1
kind: Config
metadata:
  name: cluster
spec:
  managementState: Managed   # <--- KEY OBSERVATION 1
  storage:                   # <--- KEY OBSERVATION 2
    pvc:
      claim:                 # leave the claim blank
2. Verify the Storage
Confirm the 'image-registry-storage' pvc has been created and is currently in a 'Pending' state
oc get pvc -n openshift-image-registry
3. Create Persistent Volume
Create the persistent volume for the 'image-registry-storage' pvc to bind to NFS.
oc create -f ~/ocp4-metal-install/manifest/registry-pv.yaml
Note that the registry-pv.yaml file contains the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: registry-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 100Gi
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /shares/registry
    server: 192.168.22.1
4. Verify the Storage again!
After a short wait the 'image-registry-storage' pvc should now be in a 'bound' state
oc get pvc -n openshift-image-registry
Option B — OpenShift Data Foundation / ODF (Production)
ODF provides software-defined block, file, and object storage via Ceph running directly on your worker nodes. Minimum requirement: 3 worker nodes, each with at least one raw, unformatted additional disk.
1. Label the storage nodes
oc label node worker-0.lab.ocp.lan cluster.ocs.openshift.io/openshift-storage=""
oc label node worker-1.lab.ocp.lan cluster.ocs.openshift.io/openshift-storage=""
oc label node worker-2.lab.ocp.lan cluster.ocs.openshift.io/openshift-storage=""
2. Install the ODF Operator
oc create namespace openshift-storage
# Create OperatorGroup
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
    - openshift-storage
EOF
# Subscribe to ODF (adjust channel to match your OCP version)
cat <<EOF | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: odf-operator
  namespace: openshift-storage
spec:
  channel: stable-4.14
  installPlanApproval: Automatic
  name: odf-operator
  source: redhat-operators        # Replace with mirrored CatalogSource in air-gapped
  sourceNamespace: openshift-marketplace
EOF
# Wait for all operator pods to be Running
oc get pods -n openshift-storage -w
3. Create the StorageCluster
cat <<EOF | oc apply -f -
apiVersion: ocs.openshift.io/v1
kind: StorageCluster
metadata:
  name: ocs-storagecluster
  namespace: openshift-storage
spec:
  manageNodes: false
  monDataDirHostPath: /var/lib/rook
  storageDeviceSets:
    - name: ocs-deviceset
      count: 1                     # 1 OSD per node x 3 nodes = 3 OSDs total
      replica: 3
      portable: true
      dataPVCTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 500Gi       # Size of each raw disk to claim
          volumeMode: Block
          storageClassName: localblock   # SC that presents raw block devices
EOF
# Monitor cluster initialisation (typically 5-15 minutes)
oc get storagecluster -n openshift-storage -w
4. StorageClasses created by ODF
| StorageClass | Type | Access Mode | Best For |
|---|---|---|---|
| ocs-storagecluster-ceph-rbd | Block (Ceph RBD) | RWO | Databases (PostgreSQL, MongoDB), stateful apps |
| ocs-storagecluster-cephfs | File (CephFS) | RWX | Shared media folders, CMS uploads, ML pipelines |
| openshift-storage.noobaa.io | Object (S3 API) | S3 | Backups, AI/ML datasets, image registry |
Option C — Local Storage Operator (LSO)
LSO presents raw node-local disks as PersistentVolumes without requiring a SAN or NFS server. It is commonly used as the backing layer for ODF.
# Install via Subscription
# channel: stable-4.14 | name: local-storage-operator
# After operator is Running, declare which disks to expose:
cat <<EOF | oc apply -f -
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-disks
  namespace: openshift-local-storage
spec:
  nodeSelector:
    nodeSelectorTerms:
      - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
              - worker-0.lab.ocp.lan
              - worker-1.lab.ocp.lan
              - worker-2.lab.ocp.lan
  storageClassDevices:
    - storageClassName: localblock
      volumeMode: Block
      devicePaths:
        - /dev/sdb                 # The second raw disk on each node
EOF
Configure the Internal Image Registry
After storage is ready, switch the registry from Removed to Managed and back it with a PVC:
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{
  "spec": {
    "managementState": "Managed",
    "storage": {"pvc": {"claim": ""}},
    "replicas": 1
  }
}'
# The operator auto-creates a PVC; watch it bind
oc get pvc -n openshift-image-registry
# Confirm the registry pod is Running
oc get pods -n openshift-image-registry
Note: Running more than one registry replica requires a ReadWriteMany (RWX) PVC, such as NFS or CephFS. For a single replica, ReadWriteOnce (RWO) is sufficient.
Set the Default StorageClass
# Mark one SC as the cluster default
oc patch storageclass nfs-client -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
# Remove the default annotation from any previously default SC
oc patch storageclass old-sc -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
# Verify
oc get storageclass
Enable Persistent Storage for the Monitoring Stack
Prometheus and Alertmanager use ephemeral storage by default. Configure persistence so metrics survive pod restarts:
cat <<EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 15d
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client   # Or ocs-storagecluster-ceph-rbd
          resources:
            requests:
              storage: 50Gi
    alertmanagerMain:
      volumeClaimTemplate:
        spec:
          storageClassName: nfs-client
          resources:
            requests:
              storage: 10Gi
EOF
★ Set Up Identity Providers Post-Install
After installation the only user is the temporary kubeadmin. You must configure a permanent Identity Provider and then delete kubeadmin to enforce proper authentication and RBAC across the cluster.
Before removing kubeadmin, make sure at least one permanent user has cluster-admin privileges and you have confirmed that you can log in successfully as that user.
Option A — HTPasswd (Simplest / Lab)
HTPasswd stores usernames and bcrypt-hashed passwords in a flat file. Ideal for small teams and fully air-gapped labs where an external directory is not available.
1. Create the htpasswd file and Kubernetes Secret
dnf install httpd-tools -y
# -c creates a new file; omit -c when appending users
htpasswd -c -B -b /tmp/htpasswd admin RedHatAdmin1!
htpasswd -B -b /tmp/htpasswd developer DevPass123!
# Store the file as a Secret in openshift-config
oc create secret generic htpasswd-secret --from-file=htpasswd=/tmp/htpasswd -n openshift-config
2. Register the provider in the OAuth cluster object
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: htpasswd_provider
      mappingMethod: claim
      type: HTPasswd
      htpasswd:
        fileData:
          name: htpasswd-secret    # Must match the Secret name above
EOF
3. Grant cluster-admin and test login
# Allow ~30 s for oauth-server pods to restart, then:
oc adm policy add-cluster-role-to-user cluster-admin admin
oc login -u admin -p RedHatAdmin1! https://api.lab.ocp.lan:6443
oc whoami # Should return: admin
4. Adding or changing users later
# Pull the current file out of the Secret
oc extract secret/htpasswd-secret -n openshift-config --to=/tmp --confirm
# Modify it — add a user, change a password, etc.
htpasswd -B -b /tmp/htpasswd newuser NewPass456!
# Push the updated file back — oauth pods restart automatically
oc set data secret/htpasswd-secret --from-file=htpasswd=/tmp/htpasswd -n openshift-config
Option B — LDAP / Active Directory
Integrate OpenShift with an existing LDAP directory (Microsoft AD, Red Hat Directory Server, OpenLDAP). Authentication is delegated to the directory; no separate password management is needed inside OpenShift.
1. Store the LDAP bind password as a Secret
oc create secret generic ldap-bind-password --from-literal=bindPassword='BindUserPassword123!' -n openshift-config
2. Store the LDAP CA certificate (required for ldaps://)
oc create configmap ldap-ca-cert --from-file=ca.crt=/path/to/your-ldap-ca.crt -n openshift-config
3. Configure the OAuth object for LDAP
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: ldap_provider
      mappingMethod: claim
      type: LDAP
      ldap:
        attributes:
          id: [dn]
          email: [mail]
          name: [cn]
          preferredUsername: [sAMAccountName]   # For AD; use uid for OpenLDAP
        bindDN: "CN=ocp-svc,OU=ServiceAccounts,DC=corp,DC=lan"
        bindPassword:
          name: ldap-bind-password
        ca:
          name: ldap-ca-cert       # Remove this block for public-CA LDAP
        insecure: false
        url: "ldaps://dc01.corp.lan:636/OU=Users,DC=corp,DC=lan?sAMAccountName?sub?(objectClass=person)"
EOF
The LDAP URL follows the format ldaps://<host>:<port>/<base-dn>?<attribute>?<scope>?<filter>. Always prefer ldaps:// (port 636) over ldap:// (port 389) to encrypt credentials in transit.
4. Sync LDAP groups into OpenShift groups
cat > /tmp/ldap-sync.yaml <<EOF
kind: LDAPSyncConfig
apiVersion: v1
url: ldaps://dc01.corp.lan:636
bindDN: "CN=ocp-svc,OU=ServiceAccounts,DC=corp,DC=lan"
bindPassword: "BindUserPassword123!"
ca: /path/to/your-ldap-ca.crt
rfc2307:
  groupsQuery:
    baseDN: "OU=OCP-Groups,DC=corp,DC=lan"
    scope: sub
    derefAliases: never
    filter: (objectClass=group)
  groupUIDAttribute: dn
  groupNameAttributes: [cn]
  groupMembershipAttributes: [member]
  usersQuery:
    baseDN: "OU=Users,DC=corp,DC=lan"
    scope: sub
    derefAliases: never
  userUIDAttribute: dn
  userNameAttributes: [sAMAccountName]
EOF
# Dry-run — preview what will change without applying
oc adm groups sync --sync-config=/tmp/ldap-sync.yaml
# Apply the sync
oc adm groups sync --sync-config=/tmp/ldap-sync.yaml --confirm
# View synced groups
oc get groups
Option C — GitHub / GitLab OAuth
For teams already using GitHub Enterprise or self-hosted GitLab. Only applicable when the cluster can reach the OAuth server endpoint.
1. Register an OAuth Application on GitHub or GitLab
- GitHub path: Settings → Developer settings → OAuth Apps → New OAuth App
- Set Authorization callback URL to: https://oauth-openshift.apps.lab.ocp.lan/oauth2callback/github
- Copy the generated Client ID and Client Secret
2. Store the Client Secret
oc create secret generic github-client-secret --from-literal=clientSecret=<your-github-client-secret> -n openshift-config
3. Configure the OAuth object
oc apply -f - <<EOF
apiVersion: config.openshift.io/v1
kind: OAuth
metadata:
  name: cluster
spec:
  identityProviders:
    - name: github
      mappingMethod: claim
      type: GitHub
      github:
        clientID: "<your-github-client-id>"
        clientSecret:
          name: github-client-secret
        organizations:
          - my-github-org          # Restrict access to members of this org
EOF
Assign Roles with RBAC
After users or groups are created, assign them the appropriate role. OpenShift ships with five built-in cluster roles:
| Role | Scope | What It Allows |
|---|---|---|
| cluster-admin | Cluster-wide | Full, unrestricted access to every resource |
| cluster-reader | Cluster-wide | Read-only access to all resources |
| admin | Namespace | Full control within a specific project/namespace |
| edit | Namespace | Create, update, and delete most resources in a project |
| view | Namespace | Read-only access within a project |
# Cluster-wide role assignments
oc adm policy add-cluster-role-to-user cluster-admin admin
oc adm policy add-cluster-role-to-group cluster-reader ops-team
# Namespace-scoped role assignments
oc adm policy add-role-to-user admin alice -n my-project
oc adm policy add-role-to-group edit dev-team -n my-project
oc adm policy add-role-to-user view bob -n my-project
# Verify what a user is allowed to do
oc auth can-i get pods --as=alice -n my-project
Delete the kubeadmin User
Once your permanent IdP is working and at least one user has cluster-admin, delete the kubeadmin Secret. This is a hard security requirement — the kubeadmin password is stored in etcd and must not remain permanently.
Run oc get nodes as your new admin user before executing the delete command. You cannot recover kubeadmin without reinstalling the cluster.
# Log out of kubeadmin and verify your new admin works
oc logout
oc login -u admin -p RedHatAdmin1! https://api.lab.ocp.lan:6443
oc get nodes # Must return all nodes in Ready state
# Now it is safe to delete kubeadmin
oc delete secret kubeadmin -n kube-system
Verify the Complete Authentication Configuration
# List all configured identity providers
oc get oauth cluster -o jsonpath='{.spec.identityProviders[*].name}'
# List every user OpenShift knows about
oc get users
# List all identities (shows which provider created each entry)
oc get identity
# Check all cluster-admin bindings
oc get clusterrolebindings -o wide | grep cluster-admin
You can configure multiple identity providers at the same time by listing them all under spec.identityProviders. Each entry needs a unique name field. Users authenticating via different providers are treated as separate identities unless you configure the lookup or add mappingMethod to merge them.
StorageClasses: Explaining OpenShift Data Foundation (ODF)
After the cluster installation completes, the environment is "empty." You must now configure Persistent Storage for cluster operations.
OpenShift requires a StorageClass to fulfill Persistent Volume Claims (PVCs). Best storage depends on the environment (NFS for simplicity, ODF/OCS for production-grade software-defined storage).
Dive into the storage architecture of OpenShift, moving from the basic concepts of the Container Storage Interface (CSI) to advanced software-defined storage like OpenShift Data Foundation (ODF).
1. The OpenShift Storage Hierarchy
To understand OpenShift storage, you must distinguish between the physical storage and the virtualized requests made by applications.
- Persistent Volume (PV): The actual "disk" (network-attached or local) provisioned by the administrator.
- Persistent Volume Claim (PVC): The request made by a developer for a certain amount of storage.
- StorageClass (SC): The "template" that defines how a PV is created (e.g., fast SSD vs. slow HDD).
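To make the relationship concrete, here is a minimal PVC a developer might submit. It simply requests 10Gi from the nfs-client StorageClass used elsewhere in this guide; the matching PV is provisioned or bound behind the scenes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce              # single-node access is enough for most apps
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs-client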
2. Core Storage Types
OpenShift categorizes storage based on how many nodes can access it simultaneously.
A. Block Storage (RWO - ReadWriteOnce)
- Technology: iSCSI, Fibre Channel, AWS EBS, OpenStack Cinder, ODF RBD.
- Best For: Databases (PostgreSQL, MongoDB).
- Behavior: Only one node can mount the volume at a time. It is highly performant and supports low-latency transactions.
B. File Storage (RWX - ReadWriteMany)
- Technology: NFS, Azure Files, ODF CephFS.
- Best For: Shared media folders, CMS uploads (WordPress), or data pipelines where multiple pods need to read/write the same files.
- Behavior: Multiple nodes can mount the same volume simultaneously.
C. Object Storage
- Technology: S3, MinIO, ODF NooBaa.
- Best For: Backups, AI/ML datasets, and cloud-native applications.
- Behavior: Accessed via API (HTTP/HTTPS) rather than a filesystem mount. It is virtually infinitely scalable.
3. Deep Dive: OpenShift Data Foundation (ODF)
Formerly known as OCS (OpenShift Container Storage), ODF is the "Gold Standard" for OpenShift storage. It is built on Ceph, Rook, and NooBaa.
Key Advantages:
- Platform Agnostic: Whether you are on-premise (VMware/Bare Metal) or in the cloud (AWS/Azure), ODF provides the same StorageClasses.
- Hyper-Converged: You don't need an external SAN. ODF uses the spare disks already inside your worker nodes.
- Dynamic Provisioning: It automatically creates volumes as soon as a developer creates a PVC.
- Resilience: By default, data is replicated across 3 different nodes. If one node fails, the data remains available.
ODF Component Breakdown:
| Component | Function | Storage Type |
|---|---|---|
| Ceph RBD | High-performance block storage | Block (RWO) |
| CephFS | Shared filesystem storage | File (RWX) |
| NooBaa | Multi-cloud object gateway | Object (S3) |
4. Hostpath and Local Storage
For edge cases or small-scale labs, you may encounter these:
- HostPath: Uses a directory on the node’s local disk. Warning: If the pod moves to another node, the data stays behind and the pod loses access.
- Local Storage Operator (LSO): A more robust way to use local NVMe/SSD disks. Unlike HostPath, LSO allows the scheduler to track which node "owns" the data.
5. Architectural Decision Matrix
As a Solution Architect, use this table to choose your storage backend:
| Use Case | Recommended Storage | Access Mode |
|---|---|---|
| Database (Prod) | ODF RBD (Block) | RWO |
| Content Management | ODF CephFS or NFS | RWX |
| Machine Learning Models | ODF NooBaa (S3) | Object |
| Temporary Scratch Space | emptyDir | RWO |
| Registry Storage | ODF CephFS | RWX |
6. Pro-Tips for Production
- Snapshotting: Ensure your storage provider supports CSI Snapshots for quick backups before application updates.
- Expansion: Use a StorageClass with allowVolumeExpansion: true (see the sketch below). This allows you to grow a disk without deleting the pod.
- IOPS Limiting: In multi-tenant clusters, use storage quotas to prevent one team from consuming all the storage bandwidth or capacity.
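As an illustration of the expansion tip, a StorageClass sketch with expansion enabled. The provisioner shown is ODF's Ceph RBD CSI driver; a real ODF class also carries pool, clusterID, and secret parameters, so treat this as a minimal outline and swap in whichever provisioner your cluster uses.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable-rbd
provisioner: openshift-storage.rbd.csi.ceph.com   # ODF / Ceph RBD CSI driver
allowVolumeExpansion: true                        # permits growing PVCs in place
reclaimPolicy: Delete
volumeBindingMode: Immediate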
YAML Configuration Details
The install-config.yaml Blueprint
This is the only file you create manually. It acts as the blueprint for the entire installation. Key fields to populate:
- pullSecret — Authorizes nodes to pull OpenShift images from Red Hat registries.
- sshKey — Allows SSH access into RHCOS nodes as the core user for troubleshooting.
- networking — Defines cluster and service network CIDRs.
- imageContentSources — Points to your local mirror registry (required for air-gapped installs).
How to Get the Pull Secret
- Log in to the Red Hat OpenShift Cluster Manager at cloud.redhat.com/openshift.
- Download the pull secret using the "Download pull secret" button.
- Paste the entire single-line JSON string into your install-config.yaml inside single quotes.
How to Get the SSH Key
# Check for existing keys
ls ~/.ssh/id_rsa.pub || ls ~/.ssh/id_ed25519.pub
# Generate a new key pair (if needed)
ssh-keygen -t ed25519 -f ~/.ssh/id_ocp -C "admin@ocp-cluster"
# Output the public key to copy into install-config.yaml
cat ~/.ssh/id_ocp.pub
OpenShift Client vs. Installer — Quick Reference
| Feature | OpenShift Client (oc) | OpenShift Installer |
|---|---|---|
| Filename | openshift-client-linux.tar.gz | openshift-install-linux.tar.gz |
| Primary Goal | Managing an existing cluster | Creating or destroying a cluster |
| Main Binary | oc (and kubectl) | openshift-install |
| Usage Period | Daily, for the life of the cluster | Primarily during Day 1 setup |
| Capabilities | Deploy apps, check logs, manage users | Provision VMs, generate Ignition files |
Helper Node Interface Roles
| Interface | Typical Role | Description |
|---|---|---|
| ens192 | External / Public | Front-end traffic — connects to the internet or corporate load balancer to serve applications. |
| ens224 | Internal / Private | Back-end traffic — master/worker node communication, storage traffic (CSI/NFS). |
Building an OpenShift Cluster On-Premises — Installation Methods
Building an OpenShift cluster on-premises requires shifting from the "push-button" automation of public clouds to a more hands-on infrastructure management approach. In 2026, the process is largely standardized through Red Hat's Assisted Installer or Agent-based methods.
1. Assisted Installer
A user-friendly web interface (hosted at console.redhat.com) that generates a discovery ISO. You boot your on-prem servers with this ISO, and they "call home" to the web console, allowing you to configure the cluster graphically.
2. IPI (Installer-Provisioned Infrastructure)
Full automation. The installer has API access to your infrastructure (like VMware vSphere or OpenStack) and creates the VMs, storage, and networking for you.
3. UPI (User-Provisioned Infrastructure)
Maximum control. You manually prepare the VMs, load balancers, and DNS. This is typical for Bare Metal or highly restricted "Air-Gapped" environments.
Minimum Cluster Hardware (Production Grade)
| Node Type | CPU | RAM | Disk |
|---|---|---|---|
| Control Plane (3x) | 4 vCPU | 16 GB | 120 GB (SSD preferred) |
| Compute/Worker (2x+) | 4 vCPU | 16 GB | 120 GB |
| Bootstrap (1x) | 4 vCPU | 16 GB | 120 GB (Deleted after install) |
Do We Need the Bootstrap Node to Add a New Control Plane or Worker Node?
Once the cluster is up and running (Day 2 operations), the Control Plane (Masters) takes over all management tasks.
1. Adding a New Worker Node
When you boot a new Worker node with its Ignition file, it communicates directly with the API Server on the Master nodes. CSR Approval: you will need to approve the Certificate Signing Requests (CSRs) using oc get csr and oc adm certificate approve <name>.
2. Adding a New Control Plane (Master) Node
OpenShift clusters are typically designed with an odd number of Control Plane nodes (usually 3) to maintain etcd quorum. If you want to move from 3 to 5 Masters, you add them to the existing, healthy cluster. The new Master joins the existing etcd cluster managed by the current Masters.
Key Considerations for UPI
Ignition Expiry
Ignition files contain certificates valid for 24 hours. If you don't finish the install by then, you must regenerate them.
Disk Cleanup
If an install fails, you must wipe the disks of the nodes before retrying. RHCOS will not overwrite an existing partition table automatically.
How to Clean Disk after installation fails?
Wiping the disks is a critical step because if RHCOS detects an existing ignition configuration or a partition table, it may fail to apply the new configuration, leading to a "zombie" node state.
RHCOS uses Ignition, which runs in the initramfs stage. If Ignition sees a partition labeled boot or root already on the disk, it might assume the installation was already completed and skip critical configuration steps.
Pro-Tip: If you are debugging a failed bootstrap, always wipe the Bootstrap node first. It is the source of truth for the rest of the cluster. If the Bootstrap node has old data, it will feed incorrect information to the Master nodes.
The "Live ISO" Method (Easiest for Manual Labs)
- Boot the node using the RHCOS Live ISO.
- Once you reach the prompt (or press CTRL+ALT+F2 to get a console), identify your disk (Usually, it is /dev/sda or /dev/nvme0n1):
lsblk
- Then wipe the identified disk:
sudo wipefs -a /dev/sda
- Reboot the node and start the installation again.
How Nodes Know Which Images to Pull from the Mirror Registry
When you run coreos-installer with the -u flag, the node downloads the raw RHCOS operating system image from your local web server — this is just the base OS with no OpenShift components. After first boot, the node needs to pull dozens of OpenShift container images (API server, etcd, operators, etc.) from a registry. In an air-gapped environment, two fields in install-config.yaml work together to make this seamless.
1. imageContentSources (Mirror Redirect Rules)
This field tells every node: "whenever you need an image from quay.io or registry.redhat.io, silently redirect that request to my local mirror instead." The node never needs to know it's in a disconnected environment — it requests images by their original Red Hat names and OpenShift handles the redirect automatically.
imageContentSources:
  - mirrors:
      - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
      - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
2. additionalTrustBundle (Internal CA Certificate)
Your local mirror registry uses a self-signed or internally-issued TLS certificate. Without this field, nodes would reject connections to it as untrusted. The additionalTrustBundle injects your internal CA certificate into every node's trust store so HTTPS connections to the mirror registry are accepted without error.
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  <your internal CA certificate here>
  -----END CERTIFICATE-----
Complete install-config.yaml
Below is a fully annotated install-config.yaml covering every field you need for a UPI air-gapped deployment. Every line is commented so you know exactly what it controls and why it exists.
Keep a backup of this file: the installer consumes (deletes) install-config.yaml when you run openshift-install create manifests. Once deleted, you cannot recover it from the generated output.
# ─────────────────────────────────────────────────────────────────
# API VERSION
# Must always be v1. This is the only supported version.
# ─────────────────────────────────────────────────────────────────
apiVersion: v1
# ─────────────────────────────────────────────────────────────────
# BASE DOMAIN
# The parent DNS domain for your cluster.
# The cluster name below is prepended to form the full domain:
# <clusterName>.<baseDomain> → lab.ocp.lan
# Your DNS must have records for:
# api.lab.ocp.lan → Load Balancer IP (port 6443)
# *.apps.lab.ocp.lan → Ingress Load Balancer IP
# ─────────────────────────────────────────────────────────────────
baseDomain: ocp.lan
# ─────────────────────────────────────────────────────────────────
# CLUSTER NAME
# Short name for this cluster. Combined with baseDomain above.
# Used in all internal DNS names and TLS certificates.
# ─────────────────────────────────────────────────────────────────
metadata:
  name: lab              # Cluster name
# ─────────────────────────────────────────────────────────────────
# COMPUTE (WORKER) NODES
# Defines the default worker MachineSet.
# In UPI mode, the installer does NOT create machines automatically.
# Set replicas: 0 — you will boot workers manually.
# hyperthreading: Enabled is the default and recommended setting.
# ─────────────────────────────────────────────────────────────────
compute:
  - name: worker
    replicas: 0                # Must be 0 for UPI — you provision workers manually
    hyperthreading: Enabled
    architecture: amd64        # Use arm64 for ARM-based nodes
# ─────────────────────────────────────────────────────────────────
# CONTROL PLANE (MASTER) NODES
# Always set replicas: 3 for a production HA cluster.
# A single master (replicas: 1) is supported only for dev/test.
# ─────────────────────────────────────────────────────────────────
controlPlane:
  name: master
  replicas: 3                  # 3 = HA. Never use 2 (no quorum).
  hyperthreading: Enabled
  architecture: amd64
# ─────────────────────────────────────────────────────────────────
# NETWORKING
# Defines the internal IP address ranges used inside the cluster.
# These are virtual ranges — they do NOT need to exist on your
# physical network. They must not overlap with your node IPs.
#
# networkType: OVNKubernetes is the current default and recommended.
# OpenShiftSDN is deprecated as of OCP 4.15.
#
# clusterNetwork: The CIDR for pod IP addresses.
# hostPrefix: /23 means each node gets a /23 subnet (~510 pod IPs).
#
# serviceNetwork: The CIDR for Kubernetes Service (ClusterIP) objects.
# Must be a single entry. /16 gives 65,534 service IPs.
#
# machineNetwork: The CIDR of your physical node network.
# Must match the real subnet your nodes are on (192.168.22.0/24).
# ─────────────────────────────────────────────────────────────────
networking:
  networkType: OVNKubernetes
  clusterNetwork:
    - cidr: 10.128.0.0/14      # Pod IP range across the cluster
      hostPrefix: 23           # Subnet size allocated per node
  serviceNetwork:
    - 172.30.0.0/16            # Kubernetes service (ClusterIP) range
  machineNetwork:
    - cidr: 192.168.22.0/24    # Must match your physical node subnet
# ─────────────────────────────────────────────────────────────────
# PLATFORM
# Set to "none" for UPI — tells the installer not to create any
# cloud or virtualization resources automatically.
# ─────────────────────────────────────────────────────────────────
platform:
  none: {}
# ─────────────────────────────────────────────────────────────────
# FIPS MODE (Optional)
# Enables FIPS 140-2/3 validated cryptographic modules.
# Required for US federal / DoD environments.
# Cannot be changed after installation.
# ─────────────────────────────────────────────────────────────────
fips: false
# ─────────────────────────────────────────────────────────────────
# PUBLISH STRATEGY
# Controls how the API server endpoint is exposed.
# External: API accessible from outside the cluster network (default)
# Internal: API accessible only within the cluster network
# For air-gapped environments, "Internal" is typically used.
# ─────────────────────────────────────────────────────────────────
publish: Internal
# ─────────────────────────────────────────────────────────────────
# PULL SECRET
# Authenticates nodes to pull container images from:
# - registry.redhat.io (Red Hat operator images)
# - quay.io (OpenShift release images)
# - your local mirror (air-gapped environments)
#
# For air-gapped installs, add your mirror registry credentials
# into this JSON alongside the Red Hat entries.
#
# Get from: https://console.redhat.com/openshift/install/pull-secret
# Must be a single-line JSON string inside single quotes.
# ─────────────────────────────────────────────────────────────────
pullSecret: '{"auths":{"registry.redhat.io":{"auth":"<base64-encoded-credentials>"},"quay.io":{"auth":"<base64-encoded-credentials>"},"mirror-registry.ocp.lan:8443":{"auth":"<base64-encoded-mirror-credentials>"}}}'
# ─────────────────────────────────────────────────────────────────
# SSH KEY
# Your SSH public key, injected into every RHCOS node.
# Allows SSH access as the built-in "core" user for troubleshooting.
# Only the PUBLIC key goes here — never the private key.
# Generate with: ssh-keygen -t ed25519 -f ~/.ssh/id_ocp
# ─────────────────────────────────────────────────────────────────
sshKey: 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... admin@ocp-cluster'
# ─────────────────────────────────────────────────────────────────
# ADDITIONAL TRUST BUNDLE
# Your internal CA certificate in PEM format.
# Required when your mirror registry uses a self-signed or
# internally-issued TLS certificate.
# Injected into every node's system trust store on first boot.
# Must be indented under the key with 2 spaces.
# To trust the mirror registry on port 8443, capture its CA certificate and add it to additionalTrustBundle:
# openssl s_client -showcerts -connect mirror-registry.ocp.lan:8443 </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > registry-ca.crt
# Note: The | symbol is mandatory in YAML; it allows for a multi-line string.
# ─────────────────────────────────────────────────────────────────
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  MIIFazCCA1OgAwIBAgIUYourInternalCAcertificateHere...
  <full PEM certificate content>
  -----END CERTIFICATE-----
# ─────────────────────────────────────────────────────────────────
# If you cannot obtain the certificate, skip the previous step and add this instead:
# insecureRegistries:
# - mirror-registry.ocp.lan:8443
# ─────────────────────────────────────────────────────────────────
# IMAGE CONTENT SOURCES (imageDigestMirrors in OCP 4.13+)
# Tells every node to redirect image pulls from Red Hat registries
# to your local mirror registry instead.
# The "source" is the original Red Hat registry path.
# The "mirrors" list is where to redirect requests.
# The installer bakes these rules into the .ign files and creates
# an ImageContentSourcePolicy object in the cluster on first boot.
# ─────────────────────────────────────────────────────────────────
imageContentSources:
  - mirrors:
      - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-release
  - mirrors:
      - mirror-registry.ocp.lan:8443/openshift/release
    source: quay.io/openshift-release-dev/ocp-v4.0-art-dev
  - mirrors:
      - mirror-registry.ocp.lan:8443/redhat
    source: registry.redhat.io/redhat
  - mirrors:
      - mirror-registry.ocp.lan:8443/ubi8
    source: registry.redhat.io/ubi8
# ─────────────────────────────────────────────────────────────────
# PROXY (Optional)
# Only needed if your nodes reach the mirror registry through
# an HTTP/HTTPS proxy. Leave out entirely if no proxy is used.
# noProxy: comma-separated list of hosts/CIDRs to bypass the proxy.
# ─────────────────────────────────────────────────────────────────
# proxy:
# httpProxy: http://proxy.example.com:3128
# httpsProxy: http://proxy.example.com:3128
# noProxy: 192.168.22.0/24,mirror-registry.ocp.lan,.ocp.lan
# ─────────────────────────────────────────────────────────────────
# CLUSTER CAPABILITIES (Optional — OCP 4.12+)
# Controls which optional cluster components get installed.
# Use to reduce footprint in resource-constrained environments.
# "vCurrent" installs all capabilities for your OCP version.
# ─────────────────────────────────────────────────────────────────
# capabilities:
# baselineCapabilitySet: vCurrent
# additionalEnabledCapabilities:
# - marketplace
# - openShiftSamples
