

Mass Ingestion

Kafka targets and Kafka-enabled Azure Event Hubs targets

The following list identifies considerations for using Kafka targets:
  • Mass Ingestion Databases supports Apache Kafka, Confluent Kafka, Amazon Managed Streaming for Apache Kafka (MSK), and Kafka-enabled Azure Event Hubs as targets for incremental load jobs. All of these Kafka target types use the Kafka connection type.
    To indicate the Kafka target type, you must specify Kafka producer properties in the task definition or in the Kafka connection properties. To specify these properties for a task, enter a comma-separated list of key:value pairs in the Producer Configuration Properties field on the Target page of the task wizard. To specify the producer properties for all tasks that use a Kafka connection, enter the list of properties in the Additional Connection Properties field in the connection properties. You can override the connection-level properties for specific tasks by also defining producer properties at the task level. For a sample task-level entry, see the example after this list. For more information about producer properties, see the Apache Kafka, Confluent Kafka, Amazon MSK, or Azure Event Hubs for Kafka documentation.
  • If you select AVRO as the output format for a Kafka target, Mass Ingestion Databases generates a schema definition file for each table with a name in the following format:
    schemaname_tablename.txt
    If a source schema change is expected to alter the target in an incremental load job, Mass Ingestion Databases regenerates the Avro schema definition file with a unique name that includes a timestamp:
    schemaname_tablename_YYYYMMDDhhmmss.txt
    This unique naming pattern preserves older schema definition files for audit purposes. For example file names, see the illustration after this list.
  • If you have a Confluent Kafka target that uses Confluent Schema Registry to store schemas, you must configure the following settings on the Target page of the task wizard:
    • In the Output Format field, select AVRO.
    • In the Avro Serialization Format field, select None.
  • You can specify Kafka producer properties either in the Producer Configuration Properties field on the Target page of the task wizard or in the Additional Connection Properties field in the Kafka connection properties. Enter property=value pairs that meet your business needs and are supported by your Kafka vendor.
    For example, if you use Confluent Kafka, you can use the following Additional Connection Properties entry to specify the Schema Registry URL:
    schema.registry.url=http://abcxqa01:8081 key.serializer=org.apache.kafka.common.serialization.StringSerializer value.serializer=io.confluent.kafka.serializers.KafkaAvroSerializer
    If you use Amazon MSK, you can use the following Additional Connection Properties entry to enable IAM role authentication for access to Amazon MSK targets:
    security.protocol=SASL_SSL,sasl.mechanism=AWS_MSK_IAM,sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;,sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
    Ensure that you enable IAM role authentication on the Amazon EC2 instance where the Secure Agent is installed.
    For more information about Kafka properties, see the documentation of your Kafka vendor.
  • Database ingestion incremental load jobs can replicate change data to Kafka targets that support SASL_SSL secured access, including Confluent Kafka, Amazon MSK, and Azure Event Hubs targets. In Administrator, you must configure a Kafka connection that includes the appropriate properties in the Additional Connection Properties field. For example, for Azure Event Hubs, you could use the following Additional Connection Properties entry to enable SASL_SSL:
    bootstrap.servers=NAMESPACENAME.servicebus.windows.net:9093 security.protocol=SASL_SSL sasl.mechanism=PLAIN sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="{YOUR.EVENTHUBS.CONNECTION.STRING}";
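
The following is a hypothetical task-level entry for the Producer Configuration Properties field, referenced in the first list item. It assumes the comma-separated key:value format described above; the property names (compression.type, linger.ms, acks) are standard Apache Kafka producer settings, and the values shown are placeholders rather than recommendations:

    compression.type:gzip,linger.ms:100,acks:all

Any producer property that you define at the task level overrides the same property defined in the Additional Connection Properties field of the Kafka connection.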
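
To illustrate the Avro schema definition file naming pattern, assume a hypothetical source table named ORDERS in schema SALES. The initial schema definition file, and the file regenerated after a schema change detected at 10:30:45 on January 15, 2024, would be named as follows:

    SALES_ORDERS.txt
    SALES_ORDERS_20240115103045.txt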
