

Troubleshooting a streaming ingestion task


While deploying a streaming ingestion task with an Amazon Kinesis source, data loss occurs if you send data immediately after the task status changes to Up and Running.
Workaround: After you deploy the task to run as a job, wait for some time before sending the data.
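
For example, here is a minimal sketch of the workaround using the boto3 Kinesis client. The stream name, region, and wait interval are placeholders and not part of the product; the delay you actually need depends on your environment.

import time
import boto3

# Hypothetical values; replace with your own stream, region, and settling time.
STREAM_NAME = "my-ingestion-stream"
WAIT_SECONDS = 60  # assumed settling time after the job reports Up and Running

kinesis = boto3.client("kinesis", region_name="us-east-1")

# Wait before sending data so the newly deployed job has time to start consuming.
time.sleep(WAIT_SECONDS)

kinesis.put_record(
    StreamName=STREAM_NAME,
    Data=b'{"event": "sample"}',
    PartitionKey="sample-key",
)
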
A streaming ingestion job with a Kinesis Streams source that consumes streams from Amazon DynamoDB fails to read the data ingested from Kinesis Streams. The streaming ingestion job runs with Up and Running status but does not return any error.
Provide the following required permissions to the Amazon DynamoDB user:
- dynamodb:CreateTable
- dynamodb:DescribeTable
- dynamodb:Scan
- dynamodb:PutItem
- dynamodb:GetItem
- dynamodb:UpdateItem
- dynamodb:DeleteItem
Resource:
- !Join ["", ["arn:aws:dynamodb:*:", !Ref 'AWS::AccountId', ":table/*"]]

Provide the following permissions for Amazon CloudWatch:
"Action": "cloudwatch:DescribeAlarms" "Action": "cloudwatch:PutMetricData"
While creating a Kafka connection, if the SSL mode is set to disabled, any additional connection property values that you declare are not considered.
Workaround: Declare the additional connection properties in the Additional Security Properties field.

When you try to ingest a high volume of data to a Kafka target, the streaming ingestion job runs with the following error:
A message in the stream exceeds the maximum allowed message size of 1048576 byte.
You get this error message if the received message size is more than 1 MB, which is the maximum message size that the Kafka server accepts.
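
One way to avoid hitting the limit is to check the serialized payload size on the producer side before publishing. The sketch below uses the kafka-python client, which is an assumption for illustration and not part of the streaming ingestion job; the broker address and topic name are placeholders.

from kafka import KafkaProducer

MAX_MESSAGE_BYTES = 1048576  # 1 MB limit reported in the error above
BOOTSTRAP_SERVERS = "broker1:9092"  # placeholder broker address
TOPIC = "ingestion-topic"           # placeholder topic name

producer = KafkaProducer(bootstrap_servers=BOOTSTRAP_SERVERS)

def send_if_within_limit(payload: bytes) -> bool:
    """Publish the payload only if it fits within the 1 MB limit."""
    if len(payload) > MAX_MESSAGE_BYTES:
        # Oversized messages are rejected; split or compress them upstream instead.
        return False
    producer.send(TOPIC, value=payload)
    return True

send_if_within_limit(b'{"event": "sample"}')
producer.flush()
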
When you try to ingest a high volume of data to an Amazon Kinesis Firehose or Amazon Kinesis Streams target, the streaming ingestion job runs with the following error:
INFO - Mon Feb 04 07:01:44 UTC 2019PutKinesisStream[id=0421419e-b24f-4e3f-ad19-9a1fbc7b0f3c] Failed to publish to kinesis records StandardFlowFileRecord[uuid=36926fcd-dfed-46f9-ae41-a6d32c3d5633,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1549263211727-1, container=default, section=1], offset=2758, length=1067926],offset=0,name=testKinesis.2758-1070684.txt,size=1067926] because the size was greater than 1024000 bytes
You get this error message if the received message size is more than 1 MB, which is the maximum message buffer size of Amazon Kinesis. Any message larger than 1 MB never reaches the target and is lost in transit.
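
As a producer-side sketch only, assuming the boto3 Kinesis client and a hypothetical stream name, large payloads can be split so that each record stays under the limit before it is sent:

import boto3

MAX_RECORD_BYTES = 1024000  # stay under the size reported in the error above
STREAM_NAME = "my-ingestion-stream"  # placeholder stream name

kinesis = boto3.client("kinesis", region_name="us-east-1")

def put_in_chunks(payload: bytes, partition_key: str) -> None:
    """Split an oversized payload into records that fit within the limit."""
    for start in range(0, len(payload), MAX_RECORD_BYTES):
        chunk = payload[start:start + MAX_RECORD_BYTES]
        kinesis.put_record(
            StreamName=STREAM_NAME,
            Data=chunk,
            PartitionKey=partition_key,
        )

put_in_chunks(b"x" * 2_000_000, partition_key="sample-key")

Note that the consumer must reassemble the chunks, so splitting is only appropriate when the downstream processing can handle partial records.
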
While deploying a streaming ingestion task with an Amazon Kinesis source, data loss is encountered.
Workaround: After you deploy the task to run as a job, do not send data while the dataflow is in the Stopped, Deploying, Edit, or Redeploy state.
