
Mass Ingestion

Oracle sources

To use Oracle sources in database ingestion tasks, first prepare the source database and review the usage considerations.

Source preparation

  • Define the ORACLE_HOME environment variable on the Linux or Windows system where the Secure Agent runs so that Mass Ingestion Databases can use the Oracle Call Interface (OCI) to communicate with the Oracle source database.
  • Make sure the Mass Ingestion Databases user has the Oracle privileges that are required for the database ingestion load type to be performed. For more information, see Oracle privileges.
  • Database ingestion jobs require read access to Oracle online and archive redo logs to read incremental change data. If the redo logs are remote from the on-premises system where the Secure Agent runs, make sure that read access to the logs is provided, for example, by using Oracle Automatic Storage Management (ASM), mounting the logs to a network file system (NFS), or configuring BFILE access to logs that are on the Oracle file system.
  • For incremental load or combined initial and incremental load operations, perform the following prerequisite tasks in Oracle:
    • Enable ARCHIVELOG mode for the Oracle database. If the database is not in an Amazon RDS environment, issue the following SQL statements:
      SHUTDOWN IMMEDIATE;
      STARTUP MOUNT;
      ALTER DATABASE ARCHIVELOG;
      ALTER DATABASE OPEN;
      SHUTDOWN IMMEDIATE;
      STARTUP;
      For an Amazon RDS for Oracle database, set the backup retention period to place the database in ARCHIVELOG mode and enable automated backups.
    • Define an archive log destination.
    • Enable Oracle minimal global supplemental logging on the source database.
    • If your Oracle source tables have primary keys, ensure that supplemental logging is enabled for all primary key columns. For source tables that do not have primary keys, ensure that supplemental logging is enabled for all columns from which change data will be captured.
      When you create a database ingestion task, you have the option of generating a script that implements supplemental logging for all columns or only primary key columns for the selected source tables.
    • Ensure that the Oracle MAX_STRING_SIZE initialization parameter is not set to EXTENDED. If it is set to EXTENDED, Mass Ingestion Databases cannot replicate inserts and updates for tables that contain large (extended-size) VARCHAR2, NVARCHAR2, or RAW columns.
    If you do not have the authority to perform these tasks, ask your Oracle database administrator to perform them. For more information, see the Oracle documentation.
  • Ensure that the Oracle Database Client or Instant Client is installed and configured on the Secure Agent server for the Secure Agent to communicate with Oracle. If you do not already have an Oracle client installed, you can download a client and access installation information from the Oracle web site, or ask your Oracle DBA to download and configure an Oracle client.
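The incremental-load prerequisites above can be sketched as a SQL*Plus session run with DBA privileges. This is a minimal sketch under the assumptions stated here, not part of this guide: MYSCHEMA.ORDERS and MYSCHEMA.ORDERS_NOPK are placeholder table names.

```sql
-- Confirm the database is in ARCHIVELOG mode (expect LOG_MODE = 'ARCHIVELOG').
SELECT log_mode FROM v$database;

-- Enable minimal global supplemental logging on the source database.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

-- For a table that has a primary key, log the primary key columns
-- (MYSCHEMA.ORDERS is a placeholder name).
ALTER TABLE myschema.orders ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;

-- For a table without a primary key, log all columns to be captured.
ALTER TABLE myschema.orders_nopk ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Verify that MAX_STRING_SIZE is not set to EXTENDED (expect STANDARD).
SELECT value FROM v$parameter WHERE name = 'max_string_size';
```

Running the per-table ALTER TABLE statements manually is only necessary if you do not use the script that the database ingestion task can generate for the selected source tables.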
Amazon Relational Database Service (RDS) for Oracle source preparation:
  1. Create the ONLINELOG_DIR and ARCHIVELOG_DIR directories that will hold the online and archive redo logs, respectively, on the RDS file system. Use the following exec statements:
    exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
    exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;
  2. Grant the Oracle privileges that are required for the Amazon RDS for Oracle source type to the Mass Ingestion Databases user.
    For more information about the privileges required for an Amazon RDS for Oracle source, see Oracle log access methods for CDC.
  3. Define an appropriate retention time for the archived redo logs. Use the following exec statement:
    exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention days', number_of_days);
  4. In the Amazon RDS console, set the backup retention period for the source database to a value greater than zero to enable automated backups of the database instance.
    This step enables ARCHIVELOG mode for the database.
  5. Ensure that supplemental logging is enabled at the database level. Use the following statement:
    exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
    When you create a database ingestion task, you can generate a script to enable supplemental logging for the selected source tables.
  6. Optionally, in the Amazon RDS console, you can create a parameter group and define cache sizes for buffers that use nonstandard database block sizes. The default buffer pool, which is sized by DB_CACHE_SIZE, holds only buffers that use the primary database block size. Use the following parameters for the nonstandard block sizes:
    • DB_2K_CACHE_SIZE
    • DB_4K_CACHE_SIZE
    • DB_16K_CACHE_SIZE
    • DB_32K_CACHE_SIZE
    Then select the parameter group for the source database.
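Steps 1, 3, and 5 above can be run together in a single SQL*Plus session against the RDS instance. The 7-day retention period below is only an example value; choose one that covers your longest expected capture outage.

```sql
-- Step 1: create directories for the archive and online redo logs.
exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;

-- Step 3: retain archived redo logs for 7 days (example value).
exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention days', 7);

-- Step 5: enable supplemental logging at the database level.
exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
```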

Usage considerations

  • Informatica recommends that each source table have a primary key. Mass Ingestion Databases does not honor unique indexes in place of a primary key. If no primary key is specified, Mass Ingestion Databases treats all columns as if they are part of the primary key.
  • If Oracle source CHAR or VARCHAR columns contain nulls, the database ingestion job does not delimit the null values with double-quotation marks (") or any other delimiter when writing data to an Amazon S3, Flat File, Microsoft Azure Data Lake, or Microsoft Azure Synapse Analytics target.
  • Mass Ingestion Databases supports Oracle Data Guard logical and physical standby databases as sources. For implementation details, contact Informatica Global Customer Support.
  • Alternative strategies for accessing the Oracle redo logs are available. For more information, see Oracle log access methods for CDC.
  • If a database ingestion incremental load or combined initial and incremental load task contains an Oracle source table name or one or more column names that are longer than 30 characters, Oracle suppresses supplemental logging for the entire table, including primary keys and foreign keys. As a result, most operations on the table fail. This problem is caused by an Oracle restriction. In this situation, exclude the table from capture processing or rename the long table and column names to names of 30 characters or less.
  • Database ingestion initial load jobs can replicate data from Oracle BLOB, CLOB, and NCLOB columns to Snowflake targets. If a BLOB column contains more than 8 MB of data, the data is truncated to 8388608 bytes before being written to a BINARY column on the target. If a CLOB or NCLOB column contains more than 16 MB of data, the data is truncated to 16777216 bytes before being written to a VARCHAR column on the target. To replicate data from BLOB, CLOB, or NCLOB columns, you must select Include LOBs under Advanced on the Source page when you configure the task.
  • Mass Ingestion Databases does not support the following Oracle source data types with any target type or any load type:
    • INTERVAL
    • LOBs, except for BLOBs, CLOBs, and NCLOBs, which are supported for initial load jobs with Snowflake targets
    • LONG
    • LONG RAW
    • TIMESTAMP WITH LOCAL TIME ZONE
    • TIMESTAMP WITH TIME ZONE
    • UROWID
    • XMLTYPE
    Source columns that have unsupported data types are excluded from the target definition.
    For information about the default mappings of supported data types, see the Data Type Mappings Reference.
  • Mass Ingestion Databases does not support invisible columns in Oracle source tables, regardless of the target type. For these columns, database ingestion incremental load jobs and combined initial and incremental load jobs propagate nulls to the corresponding target columns.
  • For Oracle sources that use the multitenant architecture, the source tables must reside in a single pluggable database (PDB) within a multitenant container database (CDB).
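To spot source tables affected by the 30-character name restriction described above, a query along these lines can help. This is a sketch, not part of this guide; run it as a user that can see the source schemas, and add a schema filter as needed.

```sql
-- List tables whose table name or any column name exceeds 30 characters.
SELECT DISTINCT owner, table_name
FROM   all_tab_columns
WHERE  LENGTH(table_name) > 30
   OR  LENGTH(column_name) > 30;
```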
