Integrating Informatica® Big Data Management 10.2.2 HF1 SP1 with Qubole

Create a Cluster Configuration

After the Hadoop administrator prepares the cluster for import, the Informatica administrator must create a cluster configuration.
A cluster configuration is an object in the domain that contains configuration information about the Hadoop cluster. The cluster configuration enables the Data Integration Service to push mapping logic to the Hadoop environment.
Usually, the Informatica administrator imports configuration properties directly from the Hadoop cluster to create a cluster configuration. In the Qubole integration process, you have already copied the *-site.xml configuration files to a directory on the domain node (see "Edit *-site.xml Cluster Configuration Values from the Qubole Cluster"). The import process reads the values from these files into configuration sets, one configuration set for each *-site.xml file.
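
To illustrate how each *-site.xml file maps to its own configuration set, the following is a minimal sketch, not the wizard's actual implementation. It assumes the standard Hadoop layout of `<configuration>` containing `<property>` elements with `<name>` and `<value>` children. The function name `load_configuration_sets`, the use of the file stem as the set name, and the sample host name are hypothetical choices made for this example.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical sample content; the host name is a placeholder.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
</configuration>
"""


def load_configuration_sets(site_dir):
    """Return {set_name: {property_name: value}}, one entry per *-site.xml file."""
    config_sets = {}
    for xml_path in sorted(Path(site_dir).glob("*-site.xml")):
        root = ET.parse(xml_path).getroot()
        properties = {}
        for prop in root.findall("property"):
            name = prop.findtext("name")
            if name is None:
                continue  # skip malformed entries that have no <name>
            properties[name.strip()] = (prop.findtext("value") or "").strip()
        # Assumption: name the set after the file stem, e.g. "hive-site".
        config_sets[xml_path.stem] = properties
    return config_sets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "hive-site.xml").write_text(SAMPLE_HIVE_SITE)
        for set_name, props in load_configuration_sets(tmp).items():
            print(set_name, props)
```

A quick check like this can help confirm that the copied files are well-formed and contain the expected properties, such as hive.metastore.uris, before you run the import.
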
The cluster configuration wizard can create Hadoop, HDFS, and Hive connections to enable the Informatica domain to access the Hadoop environment. If you choose to create the connections, the wizard also associates the cluster configuration with the connections.
For more information about the cluster configuration, see the Data Engineering Administrator Guide.
  1. From the Connections tab, click the ClusterConfigurations node in the Domain Navigator.
  2. From the Actions menu, select New > Cluster Configuration.
    The Cluster Configuration wizard opens.
  3. Configure the following properties. Each property name is followed by its description:
    Cluster configuration name
      Name of the cluster configuration.
    Description
      Optional description of the cluster configuration.
    Distribution type
      The cluster Hadoop distribution type. Select Qubole.
    Distribution version
      Version of the Hadoop distribution.
      Each distribution type has a default version, which is the latest version of the Hadoop distribution that Big Data Management supports.
      When the cluster version differs from the default version, the cluster configuration wizard populates the cluster configuration Hadoop distribution property with the most recent supported version that is not later than the cluster version. For example, suppose Informatica supports versions 5.10 and 5.13, and the cluster version is 5.12. In this case, the wizard populates the version with 5.10.
      You can edit the property to choose any supported version. Restart the Data Integration Service for the changes to take effect.
    Method to import the cluster configuration
      Choose Import from file to import properties from an archive file.
    Create connections
      Choose to create Hadoop, HDFS, Hive, and HBase connections.
      If you choose to create connections, the Cluster Configuration wizard associates the cluster configuration with each connection that it creates.
      If you do not choose to create connections, you must manually create them and associate the cluster configuration with them.
      When the wizard creates the Hive connection, it populates the Metadata Connection String and the Data Access Connection String properties with the value from the hive.metastore.uris property. If the Hive metastore and HiveServer2 are running on different nodes, you must update the Metadata Connection String to point to the HiveServer2 host.
  4. Click Browse to select a file. Select the file and click Open.
  5. Click Next and verify the cluster configuration information on the summary page.
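
The default chosen for the Distribution version property in step 3 can be sketched as follows. This is an illustration inferred from the single 5.10/5.13/5.12 example in the property description, not the wizard's actual code. The function name `pick_distribution_version` is hypothetical, and returning `None` when no supported version is at or below the cluster version is an assumption, because the guide does not describe that case.

```python
def version_key(version):
    """Turn '5.10' into (5, 10) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_distribution_version(supported_versions, cluster_version):
    """Return the newest supported version that is not later than the cluster version.

    Returns None when every supported version is later than the cluster
    version (assumed behavior; the guide does not cover this case).
    """
    limit = version_key(cluster_version)
    candidates = [v for v in supported_versions if version_key(v) <= limit]
    if not candidates:
        return None
    return max(candidates, key=version_key)


if __name__ == "__main__":
    # The example from the property description: the result is 5.10, not 5.13.
    print(pick_distribution_version(["5.10", "5.13"], "5.12"))
```

Comparing the parsed numeric parts rather than the raw strings matters here: as plain strings, "5.9" would sort after "5.10".
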
