Mass Ingestion

Amazon Redshift V2 target properties

When you define a file ingestion task with an Amazon Redshift V2 target, you must enter target options on the Target tab of the task wizard.
The Amazon Redshift V2 connection provides the following options. Select one of them to determine how the COPY command is specified:
  • Define Redshift Copy Command Properties. Select this option to define the Amazon Redshift COPY command properties in the task wizard.
  • Enter Custom Redshift Copy Command. Select this option to provide a custom Amazon Redshift COPY command that the file ingestion task uses.
The following table describes the advanced target options that you can configure in a file ingestion task if you select the Define Redshift Copy Command Properties option:
Target Table Name
    Name of the table in Amazon Redshift to which the files are loaded.

Schema
    The Amazon Redshift schema name. Default is the schema that is used when establishing the target connection.

Add Parameters
    Create an expression to add the Schema and Target Table Name values as parameters. For more information, see Add Parameters.

Truncate Target Table
    Truncates the target table before loading data to the table.

Analyze Target Table
    Runs the ANALYZE command, which collects statistics about the contents of tables in the database to help determine the most efficient execution plans for queries.

Vacuum Target Table
    Vacuums the target table to recover disk space and sort rows. Select one of the following recovery options:
      • Full. Sorts the specified table and recovers disk space occupied by rows marked for deletion by previous update and delete operations.
      • Sort. Sorts the specified table without recovering space freed by deleted rows.
      • Delete. Recovers disk space occupied by rows marked for deletion by previous update and delete operations, and compacts the table to free the space.

File Format and Copy Options
    Select the format with which to copy data. Select one of the following options:
      • DELIMITER. A single ASCII character that separates fields in the input file. You can use characters such as pipe (|), tilde (~), or tab (\t). The delimiter you specify cannot be part of the data.
      • QUOTE. Specifies the quote character used to identify nvarchar characters and skip them.
      • COMPUPDATE. Overrides current compression encoding and applies compression to an empty table.
      • AWS_IAM_ROLE. Specify the Amazon Redshift Role Resource Name to run on an Amazon EC2 system.
      • IGNOREHEADER. The number of header rows to skip at the beginning of the file. For example, if you specify IGNOREHEADER 0, no rows are skipped and the task processes data from the first row.
      • DATEFORMAT. Specify the format for date fields.
      • TIMEFORMAT. Specify the format for time fields.
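To make the mapping from wizard properties to Amazon Redshift SQL concrete, the statements a task might issue could look like the following sketch. This is a hypothetical illustration, not output captured from a task: the schema and table (public.messages), the S3 path, the IAM role ARN, and all format values are placeholder examples.

```sql
-- Hypothetical SQL for a task with Truncate Target Table, Analyze Target
-- Table, and Vacuum Target Table (Full) selected. All names are placeholders.
TRUNCATE TABLE public.messages;                       -- Truncate Target Table

COPY public.messages
FROM 's3://example-bucket/staging/'                   -- staging location (example)
IAM_ROLE 'arn:aws:iam::123456789012:role/copy-role'   -- AWS_IAM_ROLE option (example ARN)
CSV QUOTE '"' DELIMITER ','                           -- File Format, QUOTE, DELIMITER options
IGNOREHEADER 1                                        -- skip one header row
DATEFORMAT 'YYYY-MM-DD'                               -- DATEFORMAT option
TIMEFORMAT 'HH24:MI:SS';                              -- TIMEFORMAT option

ANALYZE public.messages;                              -- Analyze Target Table
VACUUM FULL public.messages;                          -- Vacuum Target Table, Full option
```

In Amazon Redshift, the QUOTE parameter is valid only together with CSV format, which is why CSV appears in the sketch.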
The following table describes the advanced target options that you can configure in a file ingestion task if you select the Enter Custom Redshift Copy Command option:
Copy Command
    The Amazon Redshift COPY command appends data to any existing rows in the table.
    If the Amazon S3 staging directory and the Amazon Redshift target belong to different regions, you must specify the region in the COPY command.
    For example:
    copy public.messages from '{{FROM-S3PATH}}' credentials 'aws_access_key_id={{ACCESS-KEY-ID}};aws_secret_access_key={{SECRET-ACCESS-KEY-ID}}' MAXERROR 0 REGION '' QUOTE '"' DELIMITER ',' NULL '' CSV;
    Here, public is the schema and messages is the table name.
    For more information about the COPY command, see the AWS documentation.
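As a hedged illustration of the cross-region case, a custom command might look like the following. The {{...}} placeholders are the ones the file ingestion task substitutes at run time, as in the example above; the REGION value 'us-west-2' is only an example, not a recommended setting.

```sql
-- Hypothetical custom COPY command for a staging bucket in another region.
-- {{FROM-S3PATH}}, {{ACCESS-KEY-ID}}, and {{SECRET-ACCESS-KEY-ID}} are
-- substituted by the file ingestion task; 'us-west-2' is an example region.
copy public.messages
from '{{FROM-S3PATH}}'
credentials 'aws_access_key_id={{ACCESS-KEY-ID}};aws_secret_access_key={{SECRET-ACCESS-KEY-ID}}'
MAXERROR 0
REGION 'us-west-2'
QUOTE '"' DELIMITER ',' NULL '' CSV;
```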
The following table describes the Amazon Redshift advanced target options that you can configure in a file ingestion task after you select one of the copy command methods:
Pre SQL
    SQL command to run before the file ingestion task runs the COPY command.

Post SQL
    SQL command to run after the file ingestion task runs the COPY command.

S3 Staging Directory
    Specify the Amazon S3 staging directory in <bucket_name/folder_name> format.
    The staging directory is deleted after the file ingestion task runs.

Upload to Redshift with no Intermediate Staging
    Uploads files from Amazon S3 to Amazon Redshift directly from the Amazon S3 source directory, with no additional intermediate staging.
    If you select this option, ensure that the Amazon S3 bucket and the Amazon S3 staging directory belong to the same region.
    If you do not select this option, ensure that the Amazon S3 staging directory and the Amazon Redshift target belong to the same region.

File Compression*
    Determines whether files are compressed before they are transferred to the target directory. Select one of the following options:
      • None. Files are not compressed.
      • GZIP. Files are compressed using GZIP compression.

File Encryption Type*
    Type of Amazon S3 file encryption to use during file transfer. Select one of the following options:
      • None. Files are not encrypted during transfer.
      • S3 server-side encryption. Amazon S3 encrypts the files using AWS-managed encryption keys.
      • S3 client-side encryption. Ensure that unrestricted policies are implemented for the agent JVM, and that the master symmetric key for the connection is set.
    Client-side encryption does not apply to tasks where Amazon S3 is the source.

S3 Accelerated Transfer*
    Select whether to use Amazon S3 Transfer Acceleration on the S3 bucket. To use Transfer Acceleration, accelerated transfer must be enabled for the bucket. Select one of the following options:
      • Disabled. Do not use Amazon S3 Transfer Acceleration.
      • Accelerated. Use Amazon S3 Transfer Acceleration.
      • Dualstack Accelerated. Use Amazon S3 Transfer Acceleration on a dual-stack endpoint.

Minimum Upload Part Size*
    Minimum upload part size in megabytes when uploading a large file as a set of multiple independent parts. Use this option to tune the file load to Amazon S3.

Multipart Upload Threshold*
    Minimum file size in megabytes above which objects are uploaded in multiple parts in parallel.

*Not applicable when you read data from Amazon S3 to Amazon Redshift V2.
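A typical use of Pre SQL and Post SQL is to prepare the target before the COPY and refresh statistics afterward. The following is a minimal sketch under stated assumptions: the table public.messages and its load_date column are hypothetical names, and the reload-by-date pattern is one possible use, not the documented behavior of the task.

```sql
-- Hypothetical Pre SQL: remove rows for the date being reloaded so that the
-- appending COPY does not create duplicates (load_date is an assumed column).
DELETE FROM public.messages WHERE load_date = CURRENT_DATE;

-- Hypothetical Post SQL: refresh planner statistics after the load.
ANALYZE public.messages;
```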
