
Mass Ingestion

Oracle sources

To use Oracle sources in database ingestion tasks, first prepare the source database and review the usage considerations.

Source preparation

  • Define the ORACLE_HOME environment variable on the Linux or Windows system where the Secure Agent runs so that Mass Ingestion Databases can use the Oracle Call Interface (OCI) to communicate with the Oracle source database.
  • Make sure the Mass Ingestion Databases user has the Oracle privileges that are required for the database ingestion load type to be performed. For more information, see Oracle privileges.
  • Database ingestion jobs require read access to Oracle online and archive redo logs to read incremental change data. If the redo logs are remote from the on-premises system where the Secure Agent runs, make sure that read access to the logs is provided, for example, by using Oracle Automatic Storage Management (ASM), mounting the logs to a network file system (NFS), or configuring BFILE access to logs that are on the Oracle file system.
  • For incremental load or combined initial and incremental load operations, perform the following prerequisite tasks in Oracle:
    • Enable ARCHIVELOG mode for the Oracle database. If the database is not in an Amazon RDS environment, issue the following SQL statements:
      SHUTDOWN IMMEDIATE;
      STARTUP MOUNT;
      ALTER DATABASE ARCHIVELOG;
      ALTER DATABASE OPEN;
      SHUTDOWN IMMEDIATE;
      STARTUP;
      For an Amazon RDS for Oracle database, set the backup retention period to place the database in ARCHIVELOG mode and enable automated backups.
    • Define an archive log destination.
    • Enable Oracle minimal global supplemental logging on the source database.
    • If your Oracle source tables have primary keys, ensure that supplemental logging is enabled for all primary key columns. For source tables that do not have primary keys, ensure that supplemental logging is enabled for all columns from which change data will be captured.
      When you create a database ingestion task, you have the option of generating a script that implements supplemental logging for all columns or only primary key columns for the selected source tables.
    • Ensure that the Oracle MAX_STRING_SIZE initialization parameter is not set to EXTENDED. If it is set to EXTENDED, Mass Ingestion Databases cannot replicate inserts and updates for tables that contain large (extended-size) VARCHAR2, NVARCHAR2, or RAW columns.
    If you do not have the authority to perform these tasks, ask your Oracle database administrator to perform them. For more information, see the Oracle documentation.
  • Ensure that the Oracle Database Client or Instant Client is installed and configured on the Secure Agent server for the Secure Agent to communicate with Oracle. If you do not already have an Oracle client installed, you can download a client and access installation information from the Oracle web site, or ask your Oracle DBA to download and configure an Oracle client.
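The incremental-load prerequisites above can be sketched as a SQL*Plus session run with DBA privileges. This is a minimal sketch under the assumptions stated here, not part of this guide: MYSCHEMA.ORDERS and MYSCHEMA.ORDERS_NOPK are placeholder table names.

```sql
-- Confirm the database is in ARCHIVELOG mode (expect LOG_MODE = 'ARCHIVELOG').
SELECT log_mode FROM v$database;

-- Enable minimal global supplemental logging on the source database.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

-- For a table that has a primary key, log the primary key columns
-- (MYSCHEMA.ORDERS is a placeholder name).
ALTER TABLE myschema.orders ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;

-- For a table without a primary key, log all columns to be captured.
ALTER TABLE myschema.orders_nopk ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- Verify that MAX_STRING_SIZE is not set to EXTENDED (expect STANDARD).
SELECT value FROM v$parameter WHERE name = 'max_string_size';
```

Running the per-table ALTER TABLE statements manually is only necessary if you do not use the script that the database ingestion task can generate for the selected source tables.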
Amazon Relational Database Service (RDS) for Oracle source preparation:
  1. Create the ONLINELOG_DIR and ARCHIVELOG_DIR directories that will hold the online and archive redo logs, respectively, on the RDS file system. Use the following exec statements:
    exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
    exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;
  2. Grant the Oracle privileges that are required for the Amazon RDS for Oracle source type to the Mass Ingestion Databases user.
    For more information about the privileges required for an Amazon RDS for Oracle source, see Oracle log access methods for CDC.
  3. Define an appropriate retention time for the archived redo logs. Use the following exec statement:
    exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention days', number_of_days);
  4. In the Amazon RDS console, set the backup retention period for the source database to a value greater than zero to enable automated backups of the database instance.
    This step enables ARCHIVELOG mode for the database.
  5. Ensure that supplemental logging is enabled at the database level. Use the following statement:
    exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
    When you create a database ingestion task, you can generate a script to enable supplemental logging for the selected source tables.
  6. Optionally, in the Amazon RDS console, you can create a parameter group and define cache sizes for buffers that use nonstandard database block sizes. The default buffer pool, which is sized by DB_CACHE_SIZE, holds only buffers that use the primary database block size. Use the following parameters for the nonstandard block sizes:
    • DB_2K_CACHE_SIZE
    • DB_4K_CACHE_SIZE
    • DB_16K_CACHE_SIZE
    • DB_32K_CACHE_SIZE
    Then select the parameter group for the source database.
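Steps 1, 3, and 5 above can be run together in a single SQL*Plus session against the RDS instance. The 7-day retention period below is only an example value; choose one that covers your longest expected capture outage.

```sql
-- Step 1: create directories for the archive and online redo logs.
exec rdsadmin.rdsadmin_master_util.create_archivelog_dir;
exec rdsadmin.rdsadmin_master_util.create_onlinelog_dir;

-- Step 3: retain archived redo logs for 7 days (example value).
exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention days', 7);

-- Step 5: enable supplemental logging at the database level.
exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD');
```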

Usage considerations

  • Informatica recommends that each source table have a primary key. Mass Ingestion Databases does not honor unique indexes in place of a primary key. If no primary key is specified, Mass Ingestion Databases treats all columns as if they are part of the primary key.
  • If Oracle source CHAR or VARCHAR columns contain nulls, the database ingestion job does not delimit the null values with double-quotation marks (") or any other delimiter when writing data to an Amazon S3, Flat File, Microsoft Azure Data Lake, or Microsoft Azure Synapse Analytics target.
  • Mass Ingestion Databases supports Oracle Data Guard logical and physical standby databases as sources. For implementation details, contact Informatica Global Customer Support.
  • Alternative strategies for accessing the Oracle redo logs are available. For more information, see Oracle log access methods for CDC.
  • If a database ingestion incremental load or combined initial and incremental load task contains an Oracle source table name or one or more column names that are longer than 30 characters, Oracle suppresses supplemental logging for the entire table, including primary keys and foreign keys. As a result, most operations on the table fail. This problem is caused by an Oracle restriction. In this situation, exclude the table from capture processing or rename the long table and column names to names of 30 characters or less.
  • Database ingestion initial load jobs can replicate data from Oracle BLOB, CLOB, and NCLOB columns to Snowflake targets. If a BLOB column contains more than 8 MB of data, the data is truncated to 8388608 bytes before being written to a BINARY column on the target. If a CLOB or NCLOB column contains more than 16 MB of data, the data is truncated to 16777216 bytes before being written to a VARCHAR column on the target. To replicate data from BLOB, CLOB, or NCLOB columns, you must select Include LOBs under Advanced on the Source page when you configure the task.
  • Mass Ingestion Databases does not support the following Oracle source data types with any target type or any load type:
    • INTERVAL
    • LOBs, except for BLOBs, CLOBs, and NCLOBs, which are supported for initial load jobs with Snowflake targets
    • LONG
    • LONG RAW
    • TIMESTAMP WITH LOCAL TIME ZONE
    • TIMESTAMP WITH TIME ZONE
    • UROWID
    • XMLTYPE
    Source columns that have unsupported data types are excluded from the target definition.
    For information about the default mappings of supported data types, see the Data Type Mappings Reference.
  • Mass Ingestion Databases does not support invisible columns in Oracle source tables, regardless of the target type. For these columns, database ingestion incremental load jobs and combined initial and incremental load jobs propagate nulls to the corresponding target columns.
  • For Oracle sources that use the multitenant architecture, the source tables must reside in a single pluggable database (PDB) within a multitenant container database (CDB).
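To spot source tables affected by the 30-character name restriction described above, a query along these lines can help. This is a sketch, not part of this guide; run it as a user that can see the source schemas, and add a schema filter as needed.

```sql
-- List tables whose table name or any column name exceeds 30 characters.
SELECT DISTINCT owner, table_name
FROM   all_tab_columns
WHERE  LENGTH(table_name) > 30
   OR  LENGTH(column_name) > 30;
```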
