Replication Options
1) Db2 High Availability Disaster Recovery (HADR):
Active/passive replication that supports up to three remote standby servers.
When the active database goes down, a standby database can take over in seconds. The original primary database can later be brought back up and returned to its primary role, which is known as failback. A failback can be initiated once the old primary database is consistent with the new primary database. After the old primary database is reintegrated into the HADR setup as a standby database, the database roles are switched so that the original primary database becomes the primary again. A sketch of the commands involved follows below.
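As an illustration, here is a minimal sketch of the Db2 commands involved, assuming a database named SAMPLE, a primary on host1, a standby on host2, port 50010, and instance db2inst1; all of these names are placeholders, and the standby must first be created from a backup of the primary:

    # On the primary (host1): point HADR at the standby
    db2 "UPDATE DB CFG FOR SAMPLE USING HADR_LOCAL_HOST host1 HADR_LOCAL_SVC 50010 HADR_REMOTE_HOST host2 HADR_REMOTE_SVC 50010 HADR_REMOTE_INST db2inst1 HADR_SYNCMODE NEARSYNC"

    # On the standby (host2): restore the backup, set the mirror-image HADR parameters, then
    db2 "START HADR ON DB SAMPLE AS STANDBY"

    # Back on the primary
    db2 "START HADR ON DB SAMPLE AS PRIMARY"

    # Takeover in seconds: run on the standby if the primary goes down
    db2 "TAKEOVER HADR ON DB SAMPLE BY FORCE"

    # Failback: once the old primary has been reintegrated as a standby and the pair is in peer state,
    # a normal (non-forced) takeover on the old primary switches the roles back
    db2 "TAKEOVER HADR ON DB SAMPLE"

    # Check HADR state at any point
    db2pd -db SAMPLE -hadr

With more than one standby, the HADR_TARGET_LIST configuration parameter lists the standby servers.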
2) Db2 pureScale: designed for continuous availability. All software components are installed and configured from a single host. pureScale scales your database solution using multiple database servers, known as members, that process incoming database requests; these members operate in a clustered system and share data. You can transparently add more members to scale out to meet even the most demanding business needs, with no application changes to make, no data to redistribute, and no performance tuning to do. A sketch of adding a member follows below.
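For illustration only, a rough sketch of scaling out by adding a member, assuming a pureScale instance named db2sdin1, a new host member3 with cluster-interconnect name member3-ib, and member id 3; the exact db2iupdt options should be verified against the pureScale documentation for your release:

    # As root on an existing host: add the new host to the instance as a member
    db2iupdt -add -m member3 -mnet member3-ib db2sdin1

    # As the instance owner: start the new member so it begins taking work
    db2start member 3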
3) IBM InfoSphere Data Replication (IIDR):
IIDR offers three alternative components:
- Change Data Capture (CDC): for heterogeneous databases, e.g., replication between Oracle and Db2.
- SQL Replication: the older approach, used in broadcast topologies; it creates staging tables in the source database to capture all changes, which increases the database size.
- Q Replication: uses IBM MQ to capture all database changes as messages, providing high volume and low latency.
Q Replication: the best solution in IIDR
Q Replication is a high-volume, low-latency replication solution that uses WebSphere MQ message queues to transmit transactions between source and target databases.
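To make the MQ side concrete, here is a minimal sketch of the queue objects such a setup typically needs, assuming a single queue manager QM1 serving both source and target and purely illustrative queue names; a real deployment normally uses a queue manager on each side, connected by channels and transmission queues:

    # Create and start the queue manager
    crtmqm QM1
    strmqm QM1

    # Define the queues used by Q Capture and Q Apply
    runmqsc QM1 <<'EOF'
    * Admin queue: Q Apply sends control messages back to Q Capture here
    DEFINE QLOCAL('ASN.ADMINQ')
    * Restart queue: Q Capture stores its restart information here
    DEFINE QLOCAL('ASN.RESTARTQ')
    * Send/receive queue: carries the captured transactions to the target
    DEFINE QLOCAL('ASN.QM1.TO.QM1.DATAQ')
    EOF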
Q Replication High availability scenarios
- Two nodes for failover: update workloads execute on a primary node; the second node is not available for any workload.
- Two nodes with one read-only node for query offloading: update workloads execute on a primary node; read-only workloads are allowed on the second node.
- Two nodes, Active/Active, with strict conflict rules: update workloads execute on two different nodes; conflicts are managed; deployed only when conflicts can be carefully managed.
- Three nodes with at least one read-only node: update workloads execute on a primary node; read-only workloads execute on the second and third nodes; conflicts are tightly managed.
- Three nodes, Active/Active, with strict conflict rules: update workloads execute on three different nodes; conflicts are managed using data partitioning and workload distribution; used when connection topologies are unstable or slow.
Q Replication components
1) The Q Capture and Q Apply programs and their associated DB2 control tables (listed as Capture, Apply, and Contr in the diagram)
2) The administration tools, which include the Replication Center (db2rc) and the ASNCLP command-line interface (an ASNCLP script sketch follows this list)
3) The Data Replication Dashboard and the ASNMON utility, which provide a live web-based monitoring tool and an alert monitor, respectively
4) Additional utilities such as the ASNTDIFF table-compare program and the asnqmfmt program for browsing Q Replication messages on a WebSphere MQ queue
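As an example of the ASNCLP interface mentioned above, here is a rough sketch of a script that creates the control tables, a replication queue map, and a single Q subscription. The databases SRCDB and TGTDB, the ASN schema, queue manager QM1, the queue names, and the table DATA.EMPLOYEE are all placeholders, and the exact command options should be checked against the IIDR documentation. The script would be saved to a file and run with something like asnclp -f create_qsub.asnclp:

    ASNCLP SESSION SET TO Q REPLICATION;
    SET SERVER CAPTURE TO DB SRCDB;
    SET SERVER TARGET TO DB TGTDB;
    SET CAPTURE SCHEMA SOURCE ASN;
    SET APPLY SCHEMA ASN;
    SET QMANAGER "QM1" FOR CAPTURE SCHEMA;
    SET QMANAGER "QM1" FOR APPLY SCHEMA;
    CREATE CONTROL TABLES FOR CAPTURE SERVER USING RESTARTQ "ASN.RESTARTQ" ADMINQ "ASN.ADMINQ";
    CREATE CONTROL TABLES FOR APPLY SERVER;
    CREATE REPLQMAP SRC_TO_TGT_MAP USING ADMINQ "ASN.ADMINQ" RECVQ "ASN.QM1.TO.QM1.DATAQ" SENDQ "ASN.QM1.TO.QM1.DATAQ";
    CREATE QSUB USING REPLQMAP SRC_TO_TGT_MAP (SUBNAME EMP_SUB DATA.EMPLOYEE TARGET NAME DATA.EMPLOYEE);
    QUIT;

The REPLQMAP ties the MQ queues to a named queue map, and the QSUB pairs a source table with a target table under that map; Q Capture and Q Apply then pick these definitions up from their control tables.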
Notes:
- The Q Capture program is log-based
- The Q Apply program applies multiple transactions to the target Db2 database in parallel
- The Q Capture program reads the DB2 recovery log for changes to a source table defined to replication. The program then sends transactions as WebSphere MQ messages over queues, where they are read and applied to target tables by the Q Apply program.
- Asynchronous delivery: the Q Apply program receives transactions without having to connect to the source database or subsystem. Both the Q Capture and Q Apply programs operate independently of each other; neither one requires the other to be running. A sketch of starting both programs follows these notes.
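For completeness, a sketch of starting the two programs with the placeholder names used above (capture schema ASN on SRCDB, apply schema ASN on TGTDB):

    # On the source system: Q Capture reads the Db2 recovery log and publishes changes as MQ messages
    asnqcap capture_server=SRCDB capture_schema=ASN

    # On the target system: Q Apply reads the receive queue and applies transactions in parallel
    asnqapp apply_server=TGTDB apply_schema=ASN

Because delivery is asynchronous through MQ, either program can be stopped and restarted without the other: Q Capture keeps publishing to the queues while Q Apply is down, and Q Apply drains the backlog when it comes back.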
InfoSphere Information Server
InfoSphere Information Server is an IBM data integration platform that provides a comprehensive set of tools and capabilities for managing and integrating data across various sources and systems. It is designed to help organizations address data quality, data integration, data transformation, and data governance challenges.
InfoSphere Information Server enables businesses to access, transform, and deliver trusted and timely data for a wide range of data integration use cases, such as data warehousing, data migration, data synchronization, and data consolidation. It offers a unified and scalable platform that supports both batch processing and real-time data integration.
Key components of InfoSphere Information Server include:
1) DataStage: A powerful ETL (Extract, Transform, Load) tool that allows users to design, develop, and execute data integration jobs. It provides a graphical interface for building data integration workflows and supports a wide range of data sources and targets.
2) QualityStage: A data quality tool that helps identify and resolve data quality issues by profiling, cleansing, standardizing, and matching data. It incorporates various data quality techniques and algorithms to improve the accuracy and consistency of data.
3) Information Governance Catalog: A metadata management tool that enables users to capture, store, and manage metadata about data assets, including data sources, data definitions, data lineage, and data ownership. It helps organizations establish data governance practices and provides a centralized repository for managing and searching metadata.
4) Data Click: A self-service data preparation tool that allows business users to discover, explore, and transform data without the need for extensive technical skills. It provides an intuitive and user-friendly interface for data profiling, data cleansing, and data enrichment.
5) Information Analyzer: A data profiling and analysis tool that helps assess the quality, structure, and content of data. It allows users to discover data anomalies, identify data relationships, and generate data quality reports.
InfoSphere Information Server provides a comprehensive and integrated platform for managing the entire data integration lifecycle, from data discovery and profiling to data quality management and data delivery. It helps organizations improve data consistency, data accuracy, and data governance, leading to better decision-making and increased operational efficiency.
For more information, visit:
https://www.youtube.com/watch?v=U_PN8QLTec8