Databricks Delta Connector

Databricks Delta target in mappings

In a mapping, you can configure a Target transformation to represent a Databricks Delta object.
The following table describes the Databricks Delta properties that you can configure in a Target transformation:
Property
Description
Connection
Name of the target connection. Select a target connection or click New Parameter to define a new parameter for the target connection.
Target Type
Target type. Select one of the following types:
  • Single Object.
  • Parameter. Select Parameter to define the target type when you configure the task.
Object
Name of the target object.
Create Target
Creates a target.
Enter a name for the target object and select the source fields that you want to use. By default, all source fields are used. You can select an existing target object or create a new target object at runtime.
You cannot parameterize the target at runtime.
Operation
Defines the type of operation to be performed on the target table.
Select from the following list of operations:
  • Insert (Default)
  • Update
  • Upsert
  • Delete
  • Data Driven
When you use an upsert operation, you must configure the Update Mode in the target details as Update else Insert.
If the key column receives a null value from the source, the following actions take place for the different operations:
  • Update. Skips the operation and does not update the row.
  • Delete. Skips the operation and does not delete the row.
  • Upsert. Inserts a new row instead of updating the existing row.
Update Columns
The fields to use as temporary primary key columns when you update, upsert, or delete data on the Databricks Delta target tables. When you select more than one update column, the mapping task uses the AND operator with the update columns to identify matching rows.
Applies to update, upsert, delete, and data driven operations.
Data Driven Condition
Flags rows for an insert, update, delete, or reject operation based on the expressions that you define.
For example, the following IIF statement flags a row for reject if the ID field is null. Otherwise, it flags the row for update:
IIF(ISNULL(ID), DD_REJECT, DD_UPDATE)
Required if you select the data driven operation.
The following table describes the Databricks Delta advanced properties that you can configure in a Target transformation:
Advanced Property
Description
Target Database Name ¹
Overrides the database name provided in the connection and the database selected in the metadata browser for existing targets.
You cannot override the database name when you create a new target at runtime.
Target Table Name ¹
Overrides the table name at runtime for existing targets.
Update Override Query
Overrides the default update query that the agent generates with the update query that you specify in this field.
Use the merge command for the update operation.
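A merge-based override might look like the following sketch. The table and column names here are illustrative, and you should check the connector documentation for the exact placeholder syntax that represents the incoming data:

```sql
-- Illustrative sketch only: a MERGE-based update override.
-- orders (target) and staged_updates (incoming data) are hypothetical names.
MERGE INTO orders AS t
USING staged_updates AS s
ON t.order_id = s.order_id AND t.region = s.region  -- multiple update columns combined with AND
WHEN MATCHED THEN
  UPDATE SET t.status = s.status, t.amount = s.amount
```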
Write Disposition
Overwrites or adds data to the existing data in a table. You can select from the following options:
  • Append. Appends data to the existing data in the table even if the table is empty.
  • Truncate. Overwrites the existing data in the table.
Update Mode ¹
Defines how rows are updated in the target tables. Select from the following options:
  • Update As Update: Rows matching the selected update columns are updated in the target.
  • Update Else Insert: Rows matching the selected update columns are updated in the target. Rows that don't match are appended to the target.
Staging Location
Relative directory path to store the staging files.
  • If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
  • If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
When you use Unity Catalog, you must provide a pre-existing location on your cloud storage as the staging location.
Pre SQL
The pre-SQL command to run before the agent writes to Databricks Delta.
For example, if you want to assign a sequence object to a primary key field of the target table before you write data to the table, specify a pre-SQL statement.
You can specify multiple pre-SQL commands, each separated with a semicolon.
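For instance, a pre-SQL entry with two commands separated by a semicolon might look like the following sketch (the table names are illustrative):

```sql
-- Illustrative pre-SQL: purge a hypothetical work table, then ensure an audit table exists
DELETE FROM work_orders WHERE load_date < current_date(); CREATE TABLE IF NOT EXISTS audit_log (run_id STRING, run_ts TIMESTAMP)
```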
Post SQL
The post-SQL command to run after the agent completes the write operation.
For example, if you want to alter a table created with the Create Target option and assign constraints to it after you write data to the table, specify a post-SQL statement.
You can specify multiple post-SQL commands, each separated with a semicolon.
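For instance, a post-SQL entry might constrain and document a table created with the Create Target option (the table and column names are illustrative):

```sql
-- Illustrative post-SQL: add a NOT NULL constraint and a table comment after the write
ALTER TABLE orders ALTER COLUMN order_id SET NOT NULL; COMMENT ON TABLE orders IS 'Loaded by the mapping task'
```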
Job Timeout ¹
Maximum time in seconds that the Spark job can take to complete processing. If the job does not complete within the specified time, the Databricks cluster terminates the job and the mapping fails.
If you do not specify a job timeout, the mapping shows success or failure based on the job completion.
Job Status Poll Interval ¹
Poll interval in seconds at which the Secure Agent checks the status of the job completion.
Default is 30 seconds.
DB REST API Timeout ¹
Maximum time in seconds for which the Secure Agent retries the REST API calls to Databricks when there is an error due to the network connection or when the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval ¹
The time interval in seconds at which the Secure Agent must retry the REST API call when there is an error due to the network connection or when the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the job status poll interval value for the job status REST API.
Default is 30 seconds.
Forward Rejected Rows
Determines whether the transformation passes rejected rows to the next transformation or drops rejected rows. By default, the agent forwards rejected rows to the next transformation.
¹ Doesn't apply to mappings in advanced mode.
