Offline DB2 Backup on OpenShift: A Field-Tested Runbook for Maximo Architects
Why This Is Different From Every DB2 Backup Guide You've Read Before
If you've spent years managing Db2 on traditional bare-metal or VM-based Maximo deployments, the mental model you've built — SSH to the server, switch to the instance owner, run db2 backup — is mostly intact when moving to MAS on OpenShift (OCP). What changes, subtly but critically, is where you run each command. Get this wrong and you'll see cryptic permission errors, SQL1092N, or worse, a backup that completes without error but is internally corrupted.
This article builds a precise mental model around the three execution contexts that govern every DB2 operation on OCP, then walks through a complete offline backup runbook — with the kind of operational nuance that doesn't make it into official IBM documentation.
The Mental Model: Three Domains, Three Identities
The single most common mistake when operating DB2 on OCP is domain confusion — running a DB2 command in the OCP control plane, or trying to use oc commands from inside the container. Understanding the three distinct execution domains is the prerequisite for everything else.
┌─────────────────────┐     oc exec     ┌──────────────────────┐     su -     ┌─────────────────┐
│ Domain 1            │ ──────────────► │ Domain 2             │ ───────────► │ Domain 3        │
│ OCP Control Plane   │                 │ Container OS         │              │ DB2 Engine      │
│ Role: cluster-admin │                 │ Role: root →         │              │ Role: db2inst1  │
│ Tools: oc get,      │                 │       db2inst1       │              │ Tools: db2,     │
│        oc exec      │                 │ Tools: su, cd, ls    │              │        db2ckbkp │
└─────────────────────┘                 └──────────────────────┘              └─────────────────┘
Domain 1 — OCP Control Plane: Your local terminal where oc commands are authoritative. This domain manages Kubernetes/OpenShift objects: namespaces, pods, PVCs. It has zero awareness of DB2 semantics. Running db2 list db directory here simply results in "command not found."
Domain 2 — Container OS: The shell session inside the DB2 pod, reached via oc exec. You land here as root (or a restricted UID depending on your SCC configuration). DB2 binaries exist here, but the DB2 environment variables, profile, and instance permissions are only loaded after you switch to the instance owner.
Domain 3 — DB2 Engine: The shell session after running su - db2inst1. The hyphen (-) is not optional — it performs a full login shell switch, sourcing ~/.bash_profile which loads DB2INSTANCE, DB2DIR, library paths, and the rest of the DB2 environment. Without the hyphen, you'll have the wrong PATH and DB2 commands will either fail or behave unpredictably.
Critical nuance for MAS environments: In some MAS DB2U operator deployments, the instance owner may not be named db2inst1. Always verify with cat /etc/passwd | grep db2 from within the container before proceeding.
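The check above can be wrapped in a small helper that also shows each account's UID and home directory. This is a minimal sketch: the function name list_db2_accounts is hypothetical, and it assumes the relevant accounts have login names starting with "db2".

```shell
# List accounts whose login name starts with "db2", with UID and home directory.
# Reads a passwd-format file; defaults to /etc/passwd inside the container.
list_db2_accounts() {
  passwd_file="${1:-/etc/passwd}"
  awk -F: '$1 ~ /^db2/ { printf "%s uid=%s home=%s\n", $1, $3, $6 }' "$passwd_file"
}

# Usage (inside the pod, Domain 2):
#   list_db2_accounts
```

The instance owner is normally the account whose home directory contains sqllib/, which separates it from fenced or admin accounts that share the prefix.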
Pre-Flight: Locating the Right Pod
MAS on OCP deploys DB2 via the DB2U operator, which creates a StatefulSet. Understanding the naming convention saves you from running the backup against a standby or an auxiliary pod.
Step 1 — Identify the namespace
oc get namespaces | grep -i db2
In a standard MAS deployment, the DB2 namespace typically follows the pattern db2u-<suffix> or may be deployed within the MAS instance namespace (e.g., mas-<instanceid>-core). In HA configurations, DB2U may have its own dedicated namespace.
# List all pods across all namespaces matching DB2 patterns
oc get pods -A | grep db2u
Step 2 — Identify the primary engine pod
oc get pods -n <NAMESPACE> | grep db2u
The pod naming convention for DB2U StatefulSets appends -db2u-0 to the release name. For example: c-db2oltp-wh-db2u-0.
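Why -db2u-0 matters: The ordinal-0 pod is typically the one that hosts the catalog partition, and it is where the instance-level commands in this runbook are run. Pods with higher ordinal suffixes (e.g., -db2u-1) are further members of the same StatefulSet, so do not infer an HADR role from the ordinal alone; in HADR setups the standby may be a separate DB2U deployment altogether. If HADR is configured, confirm the role with db2pd -db BLUDB -hadr once you are in Domain 3 and look for HADR_ROLE = PRIMARY. The backup must be taken on the primary, because backup operations are not supported on an HADR standby.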
# Confirm pod status and readiness before proceeding
oc get pod c-db2oltp-wh-db2u-0 -n <NAMESPACE> -o wide
Verify the pod shows Running with all containers in Ready state. A pod in CrashLoopBackOff or with degraded readiness containers indicates a pre-existing issue that must be resolved before attempting any backup.
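For automation, the readiness check can be expressed as a filter over the oc get pod row. The sketch below is an assumption about how you might script it, not part of any IBM tooling; pod_is_ready is a hypothetical name, and it expects the default column order (NAME, READY, STATUS).

```shell
# Succeeds only if every `oc get pod` row on stdin shows STATUS Running
# and READY n/n with n > 0. A header row, if present, is ignored.
pod_is_ready() {
  awk '
    $1 == "NAME" { next }                 # skip the header row
    {
      seen = 1
      split($2, r, "/")                   # READY column, e.g. "1/1"
      if ($3 != "Running" || r[1] != r[2] || r[1] + 0 == 0) bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}

# Usage (Domain 1):
#   oc get pod c-db2oltp-wh-db2u-0 -n <NAMESPACE> --no-headers | pod_is_ready \
#     && echo "pod ready"
```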
Entering the Engine: Shell Access and Identity Switching
Step 3 — Open an interactive session in the DB2 pod
oc exec -it c-db2oltp-wh-db2u-0 -n <NAMESPACE> -- bash
The -- separator marks the end of oc's own options: everything after it is passed unchanged as the command to run inside the container, so flags meant for that command are not parsed by oc. After this command, you are inside Domain 2 (Container OS).
Security context consideration: Depending on your OCP Security Context Constraints (SCC), you may land as a non-root UID. Check with whoami and id. You need the ability to su to the DB2 instance owner. If your SCC restricts this, work with your OCP cluster administrator to use a service account with the anyuid SCC or equivalent — this is standard for DB2U deployments.
Step 4 — Switch to the DB2 instance owner
su - db2inst1
After this switch, confirm the environment is correctly loaded:
# DB2INSTANCE should print the instance name, DB2DIR the DB2 install path
echo $DB2INSTANCE
echo $DB2DIR
db2 get instance
Expected output from db2 get instance:
The current database manager instance is: db2inst1
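db2 get instance only reports the instance named by DB2INSTANCE, and it succeeds even when the instance is stopped. An error at this stage therefore means the DB2 environment was not loaded, so repeat su - db2inst1 and make sure the hyphen is present. If later commands fail with SQL1032N ("No start database manager command was issued"), the instance itself is not running and you'll need to start it with db2start before proceeding.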
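A scripted version of this sanity check might look like the following. It is a sketch under the assumption that a correctly loaded profile both exports DB2INSTANCE and puts db2 on PATH; check_db2_env is a hypothetical helper name.

```shell
# Fail fast if the DB2 environment was not loaded by the login shell.
check_db2_env() {
  if [ -z "${DB2INSTANCE:-}" ]; then
    echo "DB2INSTANCE is not set; was 'su -' used with the hyphen?" >&2
    return 1
  fi
  if ! command -v db2 >/dev/null 2>&1; then
    echo "db2 is not on PATH; the instance profile was not sourced" >&2
    return 1
  fi
  echo "environment loaded for instance ${DB2INSTANCE}"
}

# Usage (Domain 3):
#   check_db2_env || exit 1
```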
Target Acquisition: Confirming the Database Name
Step 5 — List all databases managed by the instance
db2 list db directory
This command queries the system database directory and lists every database catalogued on this instance. In a MAS environment you will typically see:
- BLUDB — the primary Maximo application database (default name used by DB2U)
- Potentially TESTDB or staging databases if the instance hosts multiple environments
Sample output:
System Database Directory
Number of entries in the directory = 1
Entry 1:
Database alias = BLUDB
Database name = BLUDB
Local database directory = /mnt/blumeta0
Database release level = f.00
Comment =
Directory entry type = Indirect
Catalog database partition number = 0
Alternate server hostname =
Alternate server port number =
Note the Local database directory — this is the path inside the container where DB2 metadata and configuration files live. It is separate from the persistent volume path where backup images will be written.
Architectural note for MAS: In DB2U deployments, the database storage is backed by Persistent Volume Claims (PVCs). The database directory (/mnt/blumeta0) and the data tablespace paths are all mounted PVs. This means the data persists across pod restarts, but the backup destination must also be on a mounted PV, not on ephemeral container storage. Otherwise your backup disappears when the pod is recycled.
The Quiesce Phase: Achieving a Consistent Offline State
An offline backup requires the database to be fully deactivated — no active connections, no in-flight transactions, and all buffer pool pages flushed to disk. This is a two-step process.
Step 6 — Disconnect all active applications
db2 force applications all
This tells DB2 to terminate every connection to every database on this instance. The command is asynchronous, so it can return before the last agent has actually gone. In a production Maximo environment, this means all MAS/Maximo application pods lose their DB2 connections at roughly the same time.
What happens to MAS pods when you do this:
Maximo application pods will begin logging connection errors and may start failing readiness/liveness probes. Depending on your MAS configuration, Kubernetes may begin restarting application pods. This is expected behaviour during a maintenance window. Ensure that:
- A change window has been communicated and MAS application pods are scaled down, or
- You have confirmed MAS is in maintenance mode with no active user sessions
To scale down MAS application components before backup (recommended approach):
# Scale down Maximo Manage server bundle (from Domain 1, OCP Control Plane)
# Run this BEFORE entering the DB2 pod
oc scale deployment <manage-deployment-name> --replicas=0 -n <MAS_NAMESPACE>
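To make the later scale-up reproducible, it helps to record the replica count before scaling to zero. The helpers below are a sketch only: the function names and the state file are hypothetical, and if an operator manages the deployment it may reconcile the replica count on its own, so verify the behaviour in your environment.

```shell
# Print the current replica count of a deployment using oc's jsonpath output.
current_replicas() {
  oc get deployment "$1" -n "$2" -o jsonpath='{.spec.replicas}'
}

# Save the count to a file, then scale the deployment to zero.
scale_down_saving_count() {
  deploy="$1"; ns="$2"; state_file="$3"
  count=$(current_replicas "$deploy" "$ns") || return 1
  printf '%s\n' "$count" > "$state_file"
  oc scale deployment "$deploy" --replicas=0 -n "$ns"
}

# Restore the saved count once the backup has been verified.
scale_back_up() {
  deploy="$1"; ns="$2"; state_file="$3"
  oc scale deployment "$deploy" --replicas="$(cat "$state_file")" -n "$ns"
}

# Usage (Domain 1):
#   scale_down_saving_count <manage-deployment-name> <MAS_NAMESPACE> ./replicas.txt
#   ... run the backup ...
#   scale_back_up <manage-deployment-name> <MAS_NAMESPACE> ./replicas.txt
```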
Step 7 — Deactivate the database
db2 deactivate db BLUDB
This command does more than just disconnect users — it triggers a clean shutdown of the database engine's services for BLUDB specifically:
- All dirty pages in buffer pools are written (flushed) to the tablespace containers on disk
- Database services (log writer, page cleaner, prefetcher threads) are stopped
- The database transitions to a fully consistent, offline state
What distinguishes deactivate from simply disconnecting: A database can be "inactive" from a connection standpoint but still have a dirty buffer pool. deactivate guarantees the on-disk state is consistent — which is the whole point of an offline backup. You're capturing a clean, point-in-time snapshot with no recovery required upon restore.
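Verify the database is offline:
db2 list active databases
BLUDB must not appear in the output; when no database is active at all, the command returns SQL1611W ("No data was returned by Database System Monitor"). Do not test by connecting, because a connection implicitly reactivates the database. SQL1032N ("No start database manager command was issued") would mean the whole instance is stopped, which is not the goal here.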
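One way to check activation state without opening a connection is to parse the output of db2 list active databases. The filter below is a sketch; db_listed_active is a hypothetical name, and it assumes each active database is reported on a line beginning with "Database name" followed by "= <NAME>".

```shell
# Reads `db2 list active databases` output on stdin.
# Succeeds if the named database is reported as active.
db_listed_active() {
  awk -v want="$1" '
    /^Database name/ {
      if (toupper($NF) == toupper(want)) found = 1   # $NF is the value after "="
    }
    END { exit found ? 0 : 1 }
  '
}

# Usage (Domain 3):
#   if db2 list active databases | db_listed_active BLUDB; then
#     echo "BLUDB is still active; do not start the backup" >&2
#   fi
```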
The Extraction: Running the Backup
Step 8 — Execute the backup
db2 backup database BLUDB to /mnt/backup/
Critical prerequisite — the backup destination must be a mounted PV:
Before running this command, verify the backup target directory:
# Confirm the path exists and is writable by db2inst1
ls -la /mnt/backup/
df -h /mnt/backup/
If /mnt/backup/ is not mounted to a PVC, your backup will be written to the container's ephemeral overlay filesystem and will be permanently lost when the pod restarts. In MAS/DB2U deployments, a backup PVC is typically pre-configured and mounted at a path like /mnt/backup — but verify this in your specific deployment.
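The directory check can be scripted as a guard that refuses obviously ephemeral targets. Treat this as a heuristic sketch: the function names are hypothetical, it assumes GNU df (for --output=fstype), and an emptyDir volume can still report a node filesystem type, so the pod spec remains the authoritative source.

```shell
# Succeeds if the filesystem type looks like container-local storage.
is_ephemeral_fstype() {
  case "$1" in
    overlay|overlayfs|tmpfs|aufs) return 0 ;;
    *) return 1 ;;
  esac
}

# Refuse a backup directory that is missing, read-only, or on ephemeral storage.
check_backup_dir() {
  dir="$1"
  [ -d "$dir" ] && [ -w "$dir" ] || { echo "$dir missing or not writable" >&2; return 1; }
  fstype=$(df --output=fstype "$dir" | tail -n 1)
  if is_ephemeral_fstype "$fstype"; then
    echo "$dir is on $fstype: a backup here would not survive a pod restart" >&2
    return 1
  fi
  echo "$dir is on $fstype"
}

# Usage (Domain 3):
#   check_backup_dir /mnt/backup || exit 1
```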
To check PVC mounts for the pod (from Domain 1):
oc describe pod c-db2oltp-wh-db2u-0 -n <NAMESPACE> | grep -A30 "Volumes:"
Understanding the backup output:
A successful backup produces a timestamped image file. The filename format is:
BLUDB.0.db2inst1.DBPART000.YYYYMMDDHHMMSS.001
The components are: database alias, backup type (0 for a full database backup), instance name, database partition number, timestamp, and sequence number. DB2 generates this name automatically. Do not rename backup files, because DB2 relies on the naming convention during recovery operations.
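The dot-separated fields can be pulled apart in a script, for instance to extract the timestamp needed by a later restore. A minimal sketch follows; parse_backup_name is a hypothetical helper, and the second field is labelled as the backup type (0 for a full database backup).

```shell
# Split a backup image name into its six dot-separated fields.
parse_backup_name() {
  base=$(basename "$1")
  old_ifs=$IFS
  IFS=.
  set -- $base            # word-split the name on dots
  IFS=$old_ifs
  [ "$#" -eq 6 ] || { echo "unexpected name: $base" >&2; return 1; }
  printf 'alias=%s type=%s instance=%s partition=%s timestamp=%s sequence=%s\n' \
    "$1" "$2" "$3" "$4" "$5" "$6"
}

parse_backup_name /mnt/backup/BLUDB.0.db2inst1.DBPART000.20250518120000.001
# prints: alias=BLUDB type=0 instance=db2inst1 partition=DBPART000 timestamp=20250518120000 sequence=001
```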
Backup duration considerations for MAS databases:
For active Maximo environments with years of work order, asset, and inspection data, database sizes of 50–500+ GB are common. An offline backup to a local PV is typically faster than network-based methods, but plan your maintenance window accordingly. Monitor progress:
# In another terminal/session, from Domain 1
oc exec c-db2oltp-wh-db2u-0 -n <NAMESPACE> -- du -sh /mnt/backup/
Verification: Never Trust an Unverified Backup
The instinct to consider the job done when db2 backup returns "completed successfully" is a dangerous one. The DB2 backup utility writes a backup image header and data pages, but file system or storage-layer issues can produce a file that appears complete but is internally corrupt. The only way to know for certain is to validate the image.
Step 9 — Confirm the backup file exists on disk
cd /mnt/backup && ls -lh
Note the full filename — you'll need it for the next step. Confirm the file size is non-zero and appears consistent with your database size.
Step 10 — Validate the backup image integrity
db2ckbkp /mnt/backup/BLUDB.0.db2inst1.DBPART000.YYYYMMDDHHMMSS.001
Replace the filename with the actual timestamped filename from Step 9.
db2ckbkp (DB2 Check Backup) scans the internal structure of the backup image and validates:
- The backup header is intact and parseable
- The image was written completely (no truncation)
- Internal checksums match across data pages
- The backup type, database name, and timestamp metadata are consistent
Successful output ends with a verification line similar to:
[1] Buffers processed:  #######

Image Verification Complete - successful.
Anything other than that success line (for example a line reporting "ERRORS DETECTED"), or a non-zero exit code, means the backup cannot be trusted. In that case: do not reactivate the database, investigate the storage layer, and re-run the backup.
Operational best practice: Integrate db2ckbkp into your backup automation or runbook as a mandatory step. A backup without integrity validation is a liability, not an asset.
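As a starting point for that automation, the wrapper below checks both the exit code and the output of db2ckbkp. It is a sketch: verify_backup_image is a hypothetical name, and the success string it looks for ("Image Verification Complete - successful") is an assumption you should confirm against the output of your own DB2 level.

```shell
# Run db2ckbkp on an image and succeed only if the exit code is zero
# AND the output contains the expected success line.
verify_backup_image() {
  image="$1"
  if ! output=$(db2ckbkp "$image" 2>&1); then
    echo "db2ckbkp exited non-zero for $image" >&2
    return 1
  fi
  if ! printf '%s\n' "$output" | grep -q 'Image Verification Complete - successful'; then
    echo "no success line in db2ckbkp output for $image" >&2
    return 1
  fi
  echo "verified: $image"
}

# Usage (Domain 3):
#   verify_backup_image /mnt/backup/<FILENAME> || echo "do NOT reactivate yet" >&2
```

Checking both signals is deliberate: it guards against a wrapper or pipeline swallowing the exit code, and against output that changes between versions being mistaken for success.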
Reactivation: Bringing the Database Back Online
Step 11 — Reactivate the database
db2 activate db BLUDB
This performs the inverse of deactivate: it allocates buffer pool memory, starts background threads, and makes the database available for new connections.
Verify the database is active and accepting connections:
db2 connect to BLUDB
db2 "select count(*) from syscat.tables"
db2 disconnect BLUDB
A successful SELECT confirms the database engine is running and the data is accessible. Once verified, scale MAS application pods back up (from Domain 1):
oc scale deployment <manage-deployment-name> --replicas=<original-count> -n <MAS_NAMESPACE>
Cluster-Level Confirmation: Querying the DB2 History File
After reactivation, perform a final confirmation from the OCP control plane level — without re-entering the pod interactively. This is useful for automation, auditing, and confirming that the backup is formally registered in DB2's internal recovery history.
Step 12 — Query the DB2 backup history from Domain 1
oc exec c-db2oltp-wh-db2u-0 -n <NAMESPACE> -- \
su - db2inst1 -c "db2 list history backup all for BLUDB"
This single command chains through all three domains in one shot: oc exec enters Domain 2, su - transitions to Domain 3, and db2 list history queries the recovery history file.
A registered backup entry in the history file confirms that DB2 recorded the backup as completed and knows where the image is located. This is distinct from file-level existence: the history file is what DB2's restore and rollforward commands consult when building a recovery plan. It is not an integrity check, which is what db2ckbkp in Step 10 provides.
Sample output showing a confirmed backup:
List History File for BLUDB
Number of matching file entries = 1
Op Obj Timestamp+Sequence Type Dev Earliest Log Current Log Backup ID
--- ---- ------------------ ---- --- ------------ ----------- --------------
B D 20250518120000001 F
----------------------------------------------------------------------------
Contains 4 tablespace(s):
...
Comment: DB2 BACKUP BLUDB OFFLINE
Start Time: 20250518120000
End Time: 20250518121532
Status: A
----------------------------------------------------------------------------
EID: 1 Location: /mnt/backup
The Status: A (Active) confirms this is a usable backup. F under Type indicates a Full offline backup.
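For auditing, the newest backup timestamp can be extracted from the history listing. The filter below is a sketch that assumes entry lines shaped like the sample above (Op "B", Obj "D", then a 17-digit timestamp plus sequence); latest_backup_timestamp is a hypothetical name.

```shell
# Print the timestamp (YYYYMMDDHHMMSS) of the most recent database backup entry
# found in `db2 list history backup` output on stdin; fail if there is none.
latest_backup_timestamp() {
  awk '
    $1 == "B" && $2 == "D" && length($3) == 17 && $3 ~ /^[0-9]+$/ {
      ts = substr($3, 1, 14)              # drop the 3-digit sequence suffix
      if (ts > latest) latest = ts        # equal-length digit strings sort correctly
    }
    END { if (latest == "") exit 1; print latest }
  '
}

# Usage (Domain 1):
#   oc exec c-db2oltp-wh-db2u-0 -n <NAMESPACE> -- \
#     su - db2inst1 -c "db2 list history backup all for BLUDB" | latest_backup_timestamp
```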
The Complete Runbook: At a Glance
| Phase | Domain | Command | Purpose |
|---|---|---|---|
| LOCATE | OCP Control Plane | oc get namespaces | Find the DB2 namespace |
| LOCATE | OCP Control Plane | oc get pods -n <NS> \| grep db2u | Identify the primary pod |
| ISOLATE | OCP Control Plane | oc exec -it <POD> -n <NS> -- bash | Enter the container |
| ISOLATE | Container OS | su - db2inst1 | Switch to instance owner |
| ISOLATE | DB2 Engine | db2 list db directory | Confirm database name |
| ISOLATE | DB2 Engine | db2 force applications all | Disconnect all users |
| ISOLATE | DB2 Engine | db2 deactivate db BLUDB | Quiesce the database |
| EXTRACT | DB2 Engine | db2 backup database BLUDB to /mnt/backup/ | Run the backup |
| VERIFY | DB2 Engine | cd /mnt/backup && ls -lh | Confirm file exists |
| VERIFY | DB2 Engine | db2ckbkp /mnt/backup/<FILENAME> | Validate image integrity |
| REACTIVATE | DB2 Engine | db2 activate db BLUDB | Bring database online |
| CONFIRM | OCP Control Plane | oc exec ... -c 'db2 list history backup all for BLUDB' | Verify history registration |
Common Failure Scenarios and Remediation
SQL1035N (the database is currently in use) when starting the backup, or SQL1495W from deactivate
Cause: force applications all is asynchronous and had not finished, or an application reconnected in the meantime (connection pools retry aggressively). Wait 30–60 seconds, rerun db2 force applications all, and then re-attempt deactivate. If MAS pods keep reconnecting, scale them down first.
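The wait-and-retry advice can be scripted with a small generic loop. This is a sketch under stated assumptions: retry, no_connections, and force_then_check are hypothetical helpers, and the check assumes db2 list applications reports SQL1611W when nothing is connected.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# Succeeds when no application is connected to the given database.
no_connections() {
  db2 list applications for database "$1" 2>&1 | grep -q 'SQL1611W'
}

# One attempt: force connections off, then check that none remain.
force_then_check() {
  db2 force applications all >/dev/null 2>&1
  no_connections "$1"
}

# Usage (Domain 3):
#   retry 3 30 force_then_check BLUDB && db2 deactivate db BLUDB
```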
SQL2062N: An error occurred while accessing media during backup
Cause: The backup destination path is not writable, the filesystem is full, or the PVC mount is unavailable. Verify with df -h /mnt/backup/ and check PVC binding status from the OCP control plane.
db2ckbkp reports corrupt image
Cause: Storage I/O errors during write, pod resource constraints, or a storage class without proper write guarantees. Check the pod's events (oc describe pod) and the underlying storage class. Do not use this backup for restore. Re-run from Step 7.
su - db2inst1 fails with "authentication failure"
Cause: SCC restrictions on UID switching. The DB2U pod requires the anyuid SCC. Verify with oc get pod <POD> -o yaml | grep -i scc and engage your OCP administrator.
Architectural Considerations for MAS Production Environments
Backup frequency and log management: Offline backups capture a consistent point-in-time image but require downtime. For production MAS environments where zero downtime is required, consider online backups with archive logging — this allows backup during active workloads but requires log archiving to be configured (LOGARCHMETH1). The offline procedure above is ideal for scheduled maintenance windows, DR testing, and pre-upgrade snapshots.
Backup image externalisation: The backup image on /mnt/backup/ is still inside your OCP cluster. For true DR capability, the backup must be copied to an external target — object storage (IBM Cloud Object Storage, S3-compatible), a network filesystem, or a dedicated backup appliance. Automate this with a Kubernetes Job or a CronJob that runs oc exec to trigger the backup and then mounts an S3-compatible volume for externalisation.
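For small images, or as a manual stopgap, a copy out of the pod with a checksum comparison might look like the sketch below. The function name is hypothetical; it assumes tar and sha256sum are available in the container (oc cp depends on tar), and for images in the hundreds of gigabytes a Job that writes straight to the external target is the more realistic route.

```shell
# Copy one backup image out of the pod and confirm the local copy matches.
copy_backup_out() {
  ns="$1"; pod="$2"; remote="$3"; local_dir="$4"
  name=$(basename "$remote")
  oc cp "$ns/$pod:$remote" "$local_dir/$name" || return 1
  remote_sum=$(oc exec "$pod" -n "$ns" -- sha256sum "$remote" | awk '{print $1}')
  local_sum=$(sha256sum "$local_dir/$name" | awk '{print $1}')
  # An empty checksum means one of the commands failed; treat that as a mismatch.
  [ -n "$remote_sum" ] && [ "$remote_sum" = "$local_sum" ]
}

# Usage (Domain 1):
#   copy_backup_out <NAMESPACE> c-db2oltp-wh-db2u-0 /mnt/backup/<FILENAME> ./offsite \
#     || echo "copy failed or checksum mismatch" >&2
```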
Pre-upgrade snapshot strategy: Before applying any MAS or DB2U operator upgrade, an offline backup is your rollback safety net. Combine this with a YAML export (oc get <kind> <name> -n <NAMESPACE> -o yaml) of all relevant Custom Resources (the MAS, ManageApp, and Db2uCluster CRs) to ensure full recovery capability if the upgrade fails mid-flight.
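The Custom Resource export can be captured in a loop so that nothing is forgotten under time pressure. A minimal sketch: export_crs is a hypothetical helper, and the resource kind names you pass in are assumptions to check against oc api-resources in your cluster.

```shell
# Write one YAML file per resource kind into a target directory.
export_crs() {
  ns="$1"; out_dir="$2"; shift 2
  mkdir -p "$out_dir" || return 1
  for kind in "$@"; do
    oc get "$kind" -n "$ns" -o yaml > "$out_dir/$kind.yaml" || return 1
  done
}

# Usage (Domain 1):
#   export_crs <NAMESPACE> ./pre-upgrade-crs db2ucluster
```

Keep the exported files alongside the backup image so that the database state and the cluster configuration that produced it are archived together.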
HADR considerations: If your deployment uses DB2 HADR, an offline backup against the primary (-db2u-0) also serves as the basis for re-initialising a standby from scratch. After a primary restore from backup, the standby must be re-seeded from the primary using the standard HADR initialisation procedure — the old standby's data files are no longer consistent with the restored primary.
This runbook reflects field experience with MAS 8.x and 9.x on OCP 4.x deployments. Always validate procedures against your specific DB2U operator version and test restore procedures in a non-production environment before relying on them for production recovery.