
Mass Ingestion

SAP HANA sources

To use SAP HANA sources in database ingestion tasks, first prepare the source database and review the usage considerations.

Source preparation

  • The SAP HANA Database Ingestion connector uses JDBC to connect to the SAP HANA database to read data and metadata and to test connection properties. You must download the SAP HANA JDBC driver file, ngdbc.jar, and copy it to a specific subdirectory of the Secure Agent installation directory on the machine where the Secure Agent runs.
    1. Download the SAP HANA JDBC driver jar file, ngdbc.jar, to the Linux or Windows machine where the Secure Agent runs.
      Verify that you download the most recent version of the file. If you encounter any issues with downloading the file, contact SAP Customer Support.
    2. Copy the ngdbc.jar file to the following directory:
      <Secure Agent installation directory>/ext/connectors/thirdparty/informatica.hanami
    3. Restart the Secure Agent.
  • To deploy and run a database ingestion task that includes an SAP HANA source, the source connection must specify a database user who has the privileges to read the following monitoring and system views (example grants for these views appear after this list):
    • SYS.M_DATABASE
    • SYS.M_CS_PARTITIONS
    • SYS.SCHEMAS
    • SYS.TABLES
    • SYS.TABLE_COLUMNS
    • SYS.INDEXES
    • SYS.INDEX_COLUMNS
    For incremental load tasks, grant the following privileges:
    To enable the Mass Ingestion Databases user to write information about captured changes to the PKLOG table and to write change data to the shadow _CDC tables, execute the following grant statement:
    GRANT INSERT ON SCHEMA schema_name TO [user_id | user_role];
    This statement grants the INSERT permission on the schema to the user that runs insert, update, and delete operations on the base source tables.
    GRANT INSERT ON SCHEMA schema_name TO [schema_user];
    This statement grants the INSERT privilege on the schema where the PKLOG, PROCESSED, and shadow _CDC tables exist to the schema (as the user) where the triggers exist. This permission enables the triggers to run with the permissions held by the schema in which they exist.
    If you want to capture data from all or most tables in a database, execute the following statements to grant access to all objects in the source database:
    GRANT SELECT ON SCHEMA schema_name TO [user_id | user_role];
    GRANT TRIGGER ON SCHEMA schema_name TO [user_id | user_role];
    If you want to capture data from just a few tables, you can limit access to only those tables by executing the following statements for each selected source table:
    GRANT SELECT ON database.table_name TO [user_id | user_role];
    GRANT TRIGGER ON database.table_name TO [user_id | user_role];
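
For reference, here is a minimal sketch of how read access on the system and monitoring views listed above might be granted. It is illustrative only: the user name CMI_USER is a hypothetical placeholder for your own user ID or role, and in some environments a broader system privilege such as CATALOG READ may be used instead.

GRANT SELECT ON SYS.M_DATABASE TO CMI_USER;        -- CMI_USER is a hypothetical user
GRANT SELECT ON SYS.M_CS_PARTITIONS TO CMI_USER;
GRANT SELECT ON SYS.SCHEMAS TO CMI_USER;
GRANT SELECT ON SYS.TABLES TO CMI_USER;
GRANT SELECT ON SYS.TABLE_COLUMNS TO CMI_USER;
GRANT SELECT ON SYS.INDEXES TO CMI_USER;
GRANT SELECT ON SYS.INDEX_COLUMNS TO CMI_USER;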

Usage considerations

  • Mass Ingestion Databases supports SAP HANA sources on Red Hat Linux or SUSE Linux for initial load and incremental load jobs but not for combined initial and incremental load jobs.
  • Initial load and incremental load jobs with an SAP HANA source can have any target type except for Apache Kafka or Azure Event Hubs.
  • Mass Ingestion Databases does not require primary keys on the source tables for initial load or incremental load jobs.
  • Mass Ingestion Databases does not support the following source data types, even though they are mapped to default column data types on the target:
    • ARRAY
    • BINTEXT
    • BLOB
    • CLOB
    • NCLOB
    • ST_GEOMETRY
    • ST_POINT
    • TEXT
    Mass Ingestion Databases jobs propagate nulls for columns that have these data types.
    For information about the default mappings of supported data types, see the Data Type Mappings Reference.
  • For incremental load jobs, Mass Ingestion Databases requires the following tables in the source database:
    • PKLOG log table. Contains metadata about captured DML changes, such as the change type and timestamp, transaction ID, schema name, and table name.
    • PROCESSED log table. Contains the maximum sequence number (SCN) for the most recent change data capture cycle.
    • Shadow <schema>.<tablename>_CDC tables. Contain before images of updates and after images of inserts, updates, and deletes captured from the source tables, with metadata such as the transaction ID and timestamp. A shadow table must exist for each source table from which changes are captured.
    Also, because SAP HANA does not provide direct access to its log files, Mass Ingestion Databases uses AFTER DELETE, AFTER INSERT, and AFTER UPDATE triggers on the source tables to get before images and after images of the DML changes for each source table and to write entries for the changes to the PKLOG table and shadow _CDC tables. Mass Ingestion Databases also writes SAP HANA sequence values to each shadow _CDC table and to the PKLOG table for each insert, update, and delete row processed. The sequence values link the rows of the shadow _CDC table to the rows of the PKLOG table during CDC processing.
    From the Source page in the task wizard, you can download or execute a CDC script that creates these tables, triggers, and sequences. If you specified a Trigger Prefix value in the SAP HANA Database Ingestion connection properties, the names of the generated triggers begin with prefix_. A simplified sketch of these objects appears after this list.
    When you deploy a task, Mass Ingestion Databases validates the existence of the PKLOG, PROCESSED, and shadow _CDC tables and the triggers and sequences. The deploy operation fails if these items do not exist.
  • Database ingestion incremental load jobs support SAP HANA table names up to 120 characters in length.
  • Schema drift options are not supported for incremental load jobs with SAP HANA sources.
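
Because the CDC objects are created for you by the generated script, you do not normally write them by hand. The following simplified sketch in SAP HANA SQL only illustrates how the sequence, the PKLOG table, a shadow _CDC table, and an AFTER INSERT trigger fit together. All object names and column definitions here are hypothetical assumptions, not the definitions that the generated CDC script produces.

-- Hypothetical names and columns for illustration only.
CREATE SEQUENCE demo_schema.demo_cdc_seq;

-- Simplified PKLOG: metadata about each captured DML change.
CREATE TABLE demo_schema.PKLOG (
    seq_no      BIGINT,          -- sequence value that links to the shadow table row
    change_type VARCHAR(1),      -- 'I', 'U', or 'D'
    change_ts   TIMESTAMP,
    txn_id      BIGINT,
    schema_name NVARCHAR(256),
    table_name  NVARCHAR(256)
);

-- Simplified shadow table for a hypothetical source table demo_schema.ORDERS.
CREATE TABLE demo_schema.ORDERS_CDC (
    seq_no   BIGINT,             -- same sequence value that is written to PKLOG
    image    VARCHAR(1),         -- before ('B') or after ('A') image indicator
    order_id INTEGER,            -- mirrors the source table columns
    amount   DECIMAL(10,2)
);

-- Simplified AFTER INSERT trigger; AFTER UPDATE and AFTER DELETE are analogous.
CREATE TRIGGER demo_schema.trg_orders_ins
AFTER INSERT ON demo_schema.ORDERS
REFERENCING NEW ROW AS newrow
FOR EACH ROW
BEGIN
    DECLARE seq BIGINT;
    SELECT demo_schema.demo_cdc_seq.NEXTVAL INTO seq FROM DUMMY;
    INSERT INTO demo_schema.ORDERS_CDC
        VALUES (:seq, 'A', :newrow.order_id, :newrow.amount);
    INSERT INTO demo_schema.PKLOG
        VALUES (:seq, 'I', CURRENT_TIMESTAMP, CURRENT_UPDATE_TRANSACTION(), 'DEMO_SCHEMA', 'ORDERS');
END;

In an actual deployment, run the script generated from the Source page instead; it produces the definitive DDL, including any Trigger Prefix that you configured in the connection properties.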
