
Mass Ingestion

Databricks Delta source properties

In a file ingestion task, you can configure Databricks Delta source properties to transfer tables from a Databricks Delta source to a Microsoft Azure Data Lake Store Gen2 target or an Amazon S3 V2 target. The tables from the Databricks Delta source are stored as Parquet files in the target.
You can override the file name pattern, folder, and table parameters, and define your own source variables, by using the job resource of the Mass Ingestion Files REST API. For more information, see Mass Ingestion Files REST API.
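For illustration, the following Python sketch shows how a job run request might override the Database and Table Pattern parameters through the job resource. The base URL, resource path, and payload field names here are assumptions, not the documented contract; check the Mass Ingestion Files REST API reference for the exact request format.

    # Hypothetical sketch of starting a file ingestion job with parameter
    # overrides. The endpoint path and payload fields are assumed.
    import requests

    BASE_URL = "https://example.informaticacloud.com/mftsaas/api/v1"  # assumed base URL
    SESSION_ID = "<session-id-from-login>"  # obtained from the platform login API

    payload = {
        "taskId": "<file-ingestion-task-id>",
        # Assumed override fields for parameters defined in the task.
        "parameters": {
            "Database": "sales_db",
            "TablePattern": "orders_*",
        },
    }

    response = requests.post(
        f"{BASE_URL}/job",  # assumed job resource path
        json=payload,
        headers={"INFA-SESSION-ID": SESSION_ID, "Accept": "application/json"},
        timeout=30,
    )
    response.raise_for_status()
    print(response.json())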
The following table describes the source options:

Database
    Required. Name of the Databricks Delta database that contains the source tables.

Add Parameters
    Create an expression to add the Database and Table Pattern options as parameters. For more information, see Add Parameters.
Table Pattern Type
    Required. Type of pattern that determines how you select the tables for the transfer. Select Wildcard or Regex.
    Default is Wildcard.

Table Pattern
    Required. Enter the table name pattern for the pattern type that you selected:
    • For a wildcard pattern, enter a pattern that uses the following characters:
        • An asterisk (*) to represent any number of characters.
        • A question mark (?) to represent a single character.
    • For a regex pattern, enter a regular expression.
    For a comparison of the two pattern types, see the example after this table.
Batch Size
    Required. The maximum number of tables that a file ingestion task can transfer from a Databricks Delta source to a Microsoft Azure Data Lake Store Gen2 target in a batch.
    Default is 5. The maximum number of tables that the task can transfer in a batch is 1000. For example, with the default batch size of 5, a job that selects 12 tables transfers them in three batches of 5, 5, and 2 tables.
    The task transfers tables with no intermediate staging.
    If a job fails with the following error, see the cluster logs for more information:
    [ERROR] Job execution failed. State : JOB_FAILED ; State Message :
