Hi, I'm Ask INFA!
What would you like to know?
ASK INFAPreview
Please to access Ask INFA.

Table of Contents

Search

  1. Introducing Mass Ingestion
  2. Getting Started with Mass Ingestion
  3. Connectors and Connections
  4. Mass Ingestion Applications
  5. Mass Ingestion Databases
  6. Mass Ingestion Files
  7. Mass Ingestion Streaming
  8. Monitoring Mass Ingestion Jobs
  9. Asset Management
  10. Troubleshooting

Mass Ingestion

Mass Ingestion

PostgreSQL sources

PostgreSQL sources

To use PostgreSQL sources in database ingestion tasks, first prepare the source database and review the usage considerations.

PostgreSQL source preparation

To use PostgreSQL sources in database ingestion tasks, first prepare the source database and review the usage considerations.
  1. On Windows, install the latest version of the 64-bit PostgreSQL ODBC driver.
    1. Download and install the PostgreSQL ODBC driver.
      If the source database contains objects with multibyte-character names, such as table names, column names, and publication names, you must use either a PostgreSQL Unicode ODBC driver or the DataDirect ODBC for PostgreSQL driver. This requirement applies to all PostgreSQL source types, including Amazon Aurora PostgreSQL and RDS for PostgreSQL. If you do not use a Unicode-compatible ODBC driver, your incremental load jobs will fail when encountering a multibyte-character name.
    2. Set the PGSQL_ODBC_DRIVER environment variable to the driver name that is displayed by ODBC Data Source Administrator (64-bit).
    On Linux and UNIX, the DataDirect ODBC driver for PostgreSQL is delivered as part of the Linux installation. You can optionally install the unixODBC driver manager or PostgreSQL ODBC driver.
    1. Optional. Install the unixODBC or iODBC driver manager.
    2. Optional. Install the PostgreSQL ODBC driver if you do not want to use the DataDirect ODBC for PostgreSQL driver that is provided in the Linux installation.
      If the source database contains objects with multibyte-character names, such as table names, column names, and publication names, you must use either a PostgreSQL Unicode ODBC driver or the DataDirect ODBC for PostgreSQL driver. This requirement applies to all PostgreSQL source types, including Amazon Aurora PostgreSQL and RDS for PostgreSQL. If you do not use a Unicode-compatible ODBC driver, your incremental load jobs will fail when encountering a multibyte-character name.
    3. Add a PostgreSQL entry to odbcinst.ini.
      [PGSQL] Description = ODBC for PostgreSQL Driver = Setup = Driver64 = /usr/pgsql-9.6/lib/psqlodbca.so Setup64 = /usr/lib64/libodbcpsqlS.so FileUsage = 1
    4. Optional. Set the following environment variables:
      • Set the ODBCSYSINI variable to the directory where odbcinst.ini is located. If odbcinst.ini is located in the default /etc directory, you do not need to set the ODBCSYSINI variable.
      • Add the directory where the PostgreSQL ODBC driver is installed to the LD_LIBRARY_PATH variable. If the driver is installed in the default directory of /usr/lib64, you do not need to add the path to the LD_LIBRARY_PATH variable.
      • Set the PGSQL_ODBC_DRIVER parameter to the driver name that you specified in odbcinst.ini.
      For example:
      export ODBCSYSINI=/root/infaagent export LD_LIBRARY_PATH=/usr/pgsql-9.6/lib export PGSQL_ODBC_DRIVER=PGSQL
  2. For incremental load jobs, ensure that the PostgreSQL postgresql.conf configuration file specifies the wal_level=logical parameter.
    This parameter determines how much information PostgreSQL writes to the Write-Ahead Log (WAL). The setting of logical adds information that is required to support logical decoding.
    To set wal_level to logical on Amazon Aurora PostgreSQL or Amazon Relational Database Service (RDS) for PostgreSQL sources, set the rds.logical_replication parameter to 1 in the cluster parameter group.
  3. If you use the DataDirect ODBC for PostgreSQL driver, ensure that the database does not use the SCRAM-SHA-256 authentication method. Use another authentication method, such as MD5.
    The PostgreSQL ODBC driver supports the SCRAM-SHA-256 authentication method. In PostgreSQL 13, this authentication method became the default method.
  4. To deploy and run a database ingestion task that includes a PostgreSQL source, the source connection must specify a database user who has the required privileges. Create the user and grant privileges to that user in the following ways:
    • For initial load jobs, use the following SQL statements:
      CREATE USER
      dbmi_user
      WITH PASSWORD
      password
      ; GRANT SELECT ON ALL TABLES IN SCHEMA
      schema
      TO
      dbmi_user
      ;
    • For incremental load jobs, use the following SQL statement:
      CREATE USER
      dbmi_user
      WITH PASSWORD
      password
      REPLICATION;
      Also, if you use the pgoutput plugin, use the following SQL statement to grant ownership of the tables in the database that you want to add to the pgoutput publication to the
      dbmi_user
      that you created:
      GRANT CREATE ON DATABASE
      database
      TO
      dbmi_user
      ;
  5. If you plan to use the wal2json plugin for logical decoding output for incremental load jobs, install the plugin.
  6. If you plan to use the pgoutput plugin for incremental load jobs, use the following SQL statement to create publications for database ingestion jobs:
    CREATE PUBLICATION
    publication_name
    [FOR TABLE [ONLY]
    table_name
    [*] [,...] | FOR ALL TABLES ]
    Ensure that the publication includes all tables that you want to replicate to the target.
  7. For incremental load jobs with PostgreSQL 9.6 sources, ensure that the max_replication_slots parameter in the postgresql.conf configuration file has a value greater than or equal to the number of concurrent database ingestion jobs that you plan to use.
    All replication slots must be unique across all concurrent jobs.
  8. For incremental load jobs, ensure that the PostgreSQL sources use the UTF-8 encoding.

Usage considerations

  • Mass Ingestion Databases
    initial or incremental load jobs support RDS for PostgreSQL sources.
  • Mass Ingestion Databases
    incremental load jobs support Amazon Aurora PostgreSQL sources.
  • Mass Ingestion Databases supports schema drift options for PostgreSQL sources in database ingestion incremental load jobs. The following limitations apply:
    • PostgreSQL does not support renaming tables and columns for which Change Data Capture (CDC) is enabled.
    • PostgreSQL does not support changing primary keys for CDC tables.
    • Database ingestion jobs cannot capture DML changes from source tables if the table partition IDs are changed.
  • For PostgreSQL 9.6, the pgoutput plugin is not available.
  • For initial load jobs,
    Mass Ingestion Databases
    does not support the following PostgreSQL data types:
    • ABSTIME
    • Array types
    • JSON
    • NAME
    • Object identifier types
    • PG_LSN
    • RELTIME
    • Text search types:
      • TSQUERY
      • TSVECTOR
    • User-defined types
    For incremental load jobs,
    Mass Ingestion Databases
    does not support the following PostgreSQL data types, in addition to those not supported for initial load jobs:
    • BYTEA
    • MONEY
    • Spatial types
      • Box
      • Circle
      • Line
      • LSeg
      • Path
      • Point
      • Polygon
    • TEXT
    • Unbounded varying types
    • XML
    Database ingestion
    jobs either fail to deploy or propagate nulls for columns that have these data types.
    For information about the default mappings of supported data types, see the Data Type Mappings Reference.

0 COMMENTS

We’d like to hear from you!