Data Ingestion and Replication
The following properties define a Databricks Delta connection:

- **Connection Name**: Required. The name of the connection. The name is not case sensitive and must be unique within the domain. You can change this property after you create the connection. The name cannot exceed 128 characters, contain spaces, or contain any of the following special characters:
  ``~ ` ! $ % ^ & * ( ) - + = { [ } ] | \ : ; " ' < , > . ? /``
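The naming rules above can be sketched as a quick pre-flight check. This helper is illustrative only and not part of any Informatica API:

```python
# Characters the Connection Name property rejects, per the rules above.
FORBIDDEN = set("~`!$%^&*()-+={[}]|\\:;\"'<,>.?/")

def is_valid_connection_name(name: str) -> bool:
    """Check a proposed connection name against the documented rules:
    non-empty, at most 128 characters, no spaces, and none of the
    forbidden special characters."""
    if not name or len(name) > 128:
        return False
    if " " in name:
        return False
    return not any(ch in FORBIDDEN for ch in name)
```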
- **Description**: Description of the connection. The description cannot exceed 4,000 characters.
- **Type**: Required. Select Databricks Delta.
- **Runtime Environment**: Required. Name of the runtime environment where you want to run the tasks.
- **Databricks Host**: Required. The host name of the endpoint that the Databricks account belongs to. The host appears in the cluster JDBC URL, which uses the following syntax:
  `jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>`
  You can get the URL from the Databricks Delta analytics cluster or all-purpose cluster under Advanced Options > JDBC/ODBC.
  In the Databricks Host, Org Id, and Cluster ID properties, the value of PWD is always `<personal-access-token>`.
- **Org Id**: Required. The unique organization ID for the workspace in Databricks. It appears as `<Org Id>` in the same JDBC URL:
  `jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>`
- **Cluster ID**: Required. The ID of the Databricks analytics cluster. It appears as `<Cluster ID>` in the same JDBC URL:
  `jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/<Org Id>/<Cluster ID>;AuthMech=3;UID=token;PWD=<personal-access-token>`
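Because the Databricks Host, Org Id, and Cluster ID values all come from the same cluster JDBC URL, extracting them can be sketched in one place. The sample URL below uses hypothetical values for illustration:

```python
import re

def parse_jdbc_url(url: str) -> dict:
    """Pull the host, Org Id, and Cluster ID out of a Databricks
    cluster JDBC URL of the documented shape."""
    m = re.match(
        r"jdbc:spark://(?P<host>[^:]+):443/default;"
        r".*httpPath=sql/protocolv1/o/(?P<org_id>[^/]+)/(?P<cluster_id>[^;]+);",
        url,
    )
    if m is None:
        raise ValueError("URL does not match the documented syntax")
    return m.groupdict()

# Hypothetical example values, for illustration only.
url = ("jdbc:spark://dbc-a1b2c3d4-e5f6.cloud.databricks.com:443/default;"
       "transportMode=http;ssl=1;httpPath=sql/protocolv1/o/8201866235151xxx/"
       "0123-456789-abcde123;AuthMech=3;UID=token;PWD=<personal-access-token>")
fields = parse_jdbc_url(url)
```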
- **Databricks Token**: Required. Personal access token to access Databricks. You must have permission to attach to the cluster identified in the Cluster ID property.
- **SQL Endpoint JDBC URL**: Databricks SQL endpoint JDBC connection URL. Use the following syntax:
  `jdbc:spark://<Databricks Host>:443/default;transportMode=http;ssl=1;AuthMech=3;httpPath=/sql/1.0/endpoints/<SQL endpoint cluster ID>;`
  The Databricks Host, Org Id, and Cluster ID properties are ignored if you configure the SQL Endpoint JDBC URL property. For more information on the Databricks Delta SQL endpoint, contact Informatica Global Customer Support.
- **Database**: The database in Databricks Delta that you want to connect to.
- **JDBC Driver Class Name**: Required. The name of the JDBC driver class.
- **Cluster Environment**: Select the cloud provider where the Databricks cluster is deployed. Default is AWS.
- **Min Workers**: The minimum number of worker nodes to use for the Spark job.
- **Max Workers**: The maximum number of worker nodes to use for the Spark job. If you do not want the cluster to autoscale, set Max Workers equal to Min Workers or leave Max Workers unset.
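Assuming the Min Workers and Max Workers properties map onto the `num_workers` and `autoscale` fields of a Databricks cluster specification (an assumption about the connector's internals, not stated above), the autoscaling rule can be sketched as:

```python
def worker_config(min_workers, max_workers=None):
    """Translate Min Workers / Max Workers into a cluster-spec fragment.
    Assumption: autoscaling applies only when Max Workers is set and
    greater than Min Workers; otherwise the cluster is fixed-size."""
    if max_workers is None or max_workers == min_workers:
        return {"num_workers": min_workers}
    if max_workers < min_workers:
        raise ValueError("Max Workers must be >= Min Workers")
    return {"autoscale": {"min_workers": min_workers,
                          "max_workers": max_workers}}
```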
- **DB Runtime Version**: The Databricks runtime version. Select 7.3 LTS from the list.
- **Worker Node Type**: Required. The instance type of the machine used for the Spark worker node.
- **Driver Node Type**: The instance type of the machine used for the Spark driver node. If you do not specify a value, the worker node type is used.
- **Instance Pool ID**: The instance pool used for the Spark cluster.
- **Enable Elastic Disk**: Enable this option so the cluster dynamically acquires additional disk space when the Spark workers run low on disk space.
- **Spark Configuration**: The Spark configuration to use in the Databricks cluster. The configuration must be in the following format:
  `"key1"="value1";"key2"="value2";...`
  For example: `"spark.executor.userClassPathFirst"="False"`
- **Spark Environment Variables**: The environment variables to export before launching the Spark driver and workers. The variables must be in the following format:
  `"key1"="value1";"key2"="value2";...`
  For example: `"MY_ENVIRONMENT_VARIABLE"="true"`
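The Spark Configuration and Spark Environment Variables properties share the same `"key1"="value1";"key2"="value2"` format, which a small parser can illustrate. This helper is a sketch of the documented format only:

```python
import re

# One "key"="value" pair; the quotes are required by the documented format.
PAIR = re.compile(r'"([^"]+)"="([^"]*)"')

def parse_kv_string(raw: str) -> dict:
    """Parse a string in the documented
    "key1"="value1";"key2"="value2" format into a dict."""
    pairs = {}
    for chunk in filter(None, raw.split(";")):
        m = PAIR.fullmatch(chunk)
        if m is None:
            raise ValueError(f"malformed entry: {chunk!r}")
        pairs[m.group(1)] = m.group(2)
    return pairs

# The documented examples, plus one extra hypothetical variable.
conf = parse_kv_string('"spark.executor.userClassPathFirst"="False"')
env = parse_kv_string('"MY_ENVIRONMENT_VARIABLE"="true";"OTHER_FLAG"="1"')
```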