Data Ingestion and Replication
Bucket
└───connection_folder
    └───job_folder
        ├───Cdc-cycle
        │   ├───Completed
        │   │   ├───completed_cycle_folder
        │   │   │   └───Cycle-timestamp.csv
        │   │   │   ...
        │   │   └───completed_cycle_folder
        │   │       └───Cycle-timestamp.csv
        │   └───Contents
        │       ├───cycle_folder
        │       │   └───Cycle-contents-timestamp.csv
        │       │   ...
        │       └───cycle_folder
        │           └───Cycle-contents-timestamp.csv
        └───Cdc-data
            └───object_name
                ├───Data
                │   ├───cycle_folder
                │   │   └───object_name_timestamp.csv
                │   │   ...
                │   └───cycle_folder
                │       └───object_name_timestamp.csv
                └───Schema
                    └───V1
                        └───object_name.schema

Folder | Description |
---|---|
connection_folder | Contains the Mass Ingestion Applications objects. This folder is specified in the Folder Path field of the Amazon S3 connection properties or in the Directory Path field of the Microsoft Azure Data Lake Storage Gen2 connection properties. This folder is not created for Google Cloud Storage targets. |
job_folder | Contains job output files. This folder is specified in the Directory field on the Target page of the application ingestion task wizard. |
Cdc-cycle/Completed | Contains a subfolder for each completed CDC cycle. Each cycle subfolder contains a completed cycle file. |
Cdc-cycle/Contents | Contains a subfolder for each CDC cycle. Each cycle subfolder contains a cycle contents file. |
Cdc-data | Contains output data files and schema files for each object. |
Cdc-data/object_name/Schema/V1 | Contains a schema file. Mass Ingestion Applications does not save a schema file in this folder if the output files use the Parquet format. |
Cdc-data/object_name/Data | Contains a subfolder for each CDC cycle that produces output data files. |
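
The folder layout described in the table can be expressed as a small path helper. The following is a minimal sketch, not part of the product: the function name cdc_folders, the dictionary keys, and the sample folder and object names are hypothetical. It treats the connection folder as optional because that folder is not created for Google Cloud Storage targets.

```python
from pathlib import PurePosixPath
from typing import Optional


def cdc_folders(job_folder: str, object_name: str,
                connection_folder: Optional[str] = None) -> dict:
    """Return the documented CDC folder paths, relative to the bucket root.

    Hypothetical helper for illustration only.
    """
    # The connection folder is optional: it is not created for
    # Google Cloud Storage targets.
    root = PurePosixPath(connection_folder or "") / job_folder
    data_root = root / "Cdc-data" / object_name
    return {
        # One subfolder per completed CDC cycle.
        "completed_cycles": str(root / "Cdc-cycle" / "Completed"),
        # One subfolder per CDC cycle, each with a cycle contents file.
        "cycle_contents": str(root / "Cdc-cycle" / "Contents"),
        # One subfolder per CDC cycle that produces output data files.
        "data": str(data_root / "Data"),
        # Holds object_name.schema, except when output is Parquet.
        "schema": str(data_root / "Schema" / "V1"),
    }


# Example with made-up folder and object names.
paths = cdc_folders("orders_job", "Account", connection_folder="ingest")
print(paths["data"])
```

A consumer that lists the bucket could use these prefixes to locate data files and schema files for one object without hard-coding the intermediate folder names.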
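
The folders and files in this structure use the following naming patterns:
- Cycle folders: [dt=]yyyy-mm-dd-hh-mm-ss
- Cycle contents files: Cycle-contents-timestamp.csv
- Completed cycle files: Cycle-timestamp.csv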
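
The [dt=]yyyy-mm-dd-hh-mm-ss pattern can be parsed with the Python standard library. The sketch below rests on several assumptions: that the pattern names cycle folders, that the square brackets mark the dt= prefix as optional, and that the fields are year, month, day, hour, minute, and second. The function name parse_cycle_folder and the sample folder names are hypothetical, and the source does not state a time zone, so the result is a naive datetime.

```python
from datetime import datetime


def parse_cycle_folder(name: str) -> datetime:
    """Parse a cycle folder name into a naive datetime.

    Hypothetical helper; assumes the [dt=]yyyy-mm-dd-hh-mm-ss pattern.
    """
    # Strip the optional "dt=" prefix if it is present.
    stamp = name[3:] if name.startswith("dt=") else name
    # Raises ValueError when the name does not match the pattern.
    return datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S")


# Example: order made-up cycle folder names chronologically.
folders = ["dt=2024-02-01-00-00-00", "dt=2024-01-31-08-15-00"]
print(sorted(folders, key=parse_cycle_folder))
```

Parsing the names rather than sorting them as strings keeps the ordering correct even if some folders carry the prefix and others do not.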