Integrating Informatica® Big Data Management 10.2.2 HF1 SP1 with Qubole


Edit *-site.xml Cluster Configuration Values

Big Data Management uses the property-value pairs from a set of cluster configuration files on the Qubole cluster to create a cluster configuration object. The Data Integration Service uses the cluster configuration to run mappings on the cluster.

To create a cluster configuration, you must retrieve the set of cluster configuration files from the Qubole cluster and save them to the domain node.

In the AWS console, browse to the list of running EC2 instances. Select the instance that hosts the Qubole master node.

In the Description tab of the EC2 instance, copy the value of the IPv4 Public IP property.

The following image shows the location of the IPv4 Public IP property:

Open a shell on the Informatica domain host using PuTTY or a similar utility.

Type the following command to open an SSH channel to the Qubole cluster master node:

ssh ec2-user@<IPv4 address>

Browse to the Hadoop configuration directory on the cluster master node:

cd /etc/hadoop

When you list the files in the Hadoop configuration directory, you see several files whose names end in -site.xml. These are the Hadoop cluster configuration files.

In this step and the next step, you copy files from the cluster master node to any directory on the Informatica domain node.

Copy the following files from the cluster master node to a directory on the domain node:

core-site.xml

hdfs-site.xml

mapred-site.xml

yarn-site.xml

Browse to the /usr/lib/hive1.2/conf directory and copy the hive-site.xml file to the directory on the domain node where you copied the other *-site.xml files.
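The two copy steps above could also be run from the domain host with scp. The following is a minimal sketch, not the documented procedure: MASTER_IP, DEST_DIR, and DRY_RUN are hypothetical names, the address shown is a placeholder, and the Hive configuration path is assumed to be the absolute path /usr/lib/hive1.2/conf. By default the script only prints the commands it would run.

```shell
# Sketch only: MASTER_IP, DEST_DIR, and DRY_RUN are hypothetical names.
MASTER_IP="${MASTER_IP:-203.0.113.10}"      # placeholder; use the IPv4 Public IP you copied
DEST_DIR="${DEST_DIR:-$HOME/qubole-conf}"   # any directory on the domain node
DRY_RUN="${DRY_RUN:-1}"                     # 1 = print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

copy_site_files() {
  run mkdir -p "$DEST_DIR"
  # The four Hadoop files from the configuration directory named in this procedure.
  for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
    run scp "ec2-user@${MASTER_IP}:/etc/hadoop/${f}" "$DEST_DIR/"
  done
  # Assumes the Hive configuration directory is the absolute path /usr/lib/hive1.2/conf.
  run scp "ec2-user@${MASTER_IP}:/usr/lib/hive1.2/conf/hive-site.xml" "$DEST_DIR/"
}

copy_site_files
```

Set DRY_RUN=0 to run the copies. Depending on how the EC2 instance is secured, scp may also need the -i option with the instance's private key file.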

Get the fully qualified host name of the cluster master node. To get this value, type the following command in the SSH session on the master node:

hostname

The command returns a value similar to the following:

ip-10-20-30-40.ec2.internal

In the directory on the Informatica domain node where you copied the *-site.xml files, open the core-site.xml file for editing. Make the following change:

Locate the fs.defaultFS property. Its value is:

hdfs://${master.hostname}:9000

Replace ${master.hostname} with the fully qualified host name from the previous step. For example, the resulting value is similar to the following:

hdfs://ip-10-20-30-40.ec2.internal:9000

Save and close the core-site.xml file.
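The core-site.xml edit can also be scripted. The following is a minimal sketch under stated assumptions: set_default_fs is a hypothetical helper name, and the file is assumed to contain the literal text hdfs://${master.hostname}:9000 exactly as shown in this procedure. It writes to a temporary file and then replaces the original, so keep a backup copy if you need the unedited file.

```shell
# Hypothetical helper: rewrite fs.defaultFS in a local copy of core-site.xml.
# Usage: set_default_fs <path to core-site.xml> <fully qualified host name>
set_default_fs() {
  file="$1"
  fqdn="$2"
  tmp="${file}.tmp"
  # [$] matches a literal dollar sign, and the escaped dot matches a literal dot.
  # The host name is assumed to contain no "|" or "&" characters.
  sed 's|hdfs://[$]{master\.hostname}:9000|hdfs://'"$fqdn"':9000|' "$file" > "$tmp" &&
    mv "$tmp" "$file"
}
```

For example, `set_default_fs ./core-site.xml ip-10-20-30-40.ec2.internal` changes the value to hdfs://ip-10-20-30-40.ec2.internal:9000.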

Open the hive-site.xml file for editing. Make the following change:

Copy the following property entry and paste it into the file, inside the <configuration> element:

<property>
   <name>hive.metastore.uris</name>
   <value>thrift://<fully qualified master host name>:10000</value>
   <description>Thrift URI of the Hive metastore</description>
</property>

Replace <fully qualified master host name> with the fully qualified host name that you retrieved earlier.

For example, the resulting value is similar to the following:

thrift://ip-10-20-30-40.ec2.internal:10000

Save and close the hive-site.xml file.
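The hive-site.xml change can be scripted in a similar way. The following is a minimal sketch: add_metastore_uri is a hypothetical helper name, and the file is assumed to use the usual Hadoop layout with the closing </configuration> tag on a line of its own. The helper refuses to run if a hive.metastore.uris property is already present, so that it does not create a duplicate entry.

```shell
# Hypothetical helper: add hive.metastore.uris to a local copy of hive-site.xml.
# Usage: add_metastore_uri <path to hive-site.xml> <fully qualified host name>
add_metastore_uri() {
  file="$1"
  fqdn="$2"
  tmp="${file}.tmp"
  if grep -q '<name>hive.metastore.uris</name>' "$file"; then
    echo "hive.metastore.uris is already set in $file" >&2
    return 1
  fi
  # Print the new property just before the closing </configuration> tag.
  awk -v uri="thrift://${fqdn}:10000" '
    /<\/configuration>/ {
      print "<property>"
      print "   <name>hive.metastore.uris</name>"
      print "   <value>" uri "</value>"
      print "</property>"
    }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, `add_metastore_uri ./hive-site.xml ip-10-20-30-40.ec2.internal` adds a property with the value thrift://ip-10-20-30-40.ec2.internal:10000.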

Create a .zip archive containing the five *-site.xml files.

Use this .zip archive file to create the cluster configuration on the domain.
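One way to build the archive from the command line is sketched below. make_conf_zip is a hypothetical helper name, and the sketch assumes that the Info-ZIP zip utility is installed on the domain node and that the five files should sit at the top level of the archive. It checks that all five files are present before it creates the archive.

```shell
# Hypothetical helper: zip the five *-site.xml files from one directory.
# Usage: make_conf_zip <directory with the files> <output .zip path>
make_conf_zip() {
  dir="$1"
  out="$2"
  for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml hive-site.xml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      return 1
    fi
  done
  # -j stores the files without their directory paths; -q suppresses progress output.
  zip -j -q "$out" \
    "$dir/core-site.xml" "$dir/hdfs-site.xml" "$dir/mapred-site.xml" \
    "$dir/yarn-site.xml" "$dir/hive-site.xml"
}
```

For example, `make_conf_zip "$HOME/qubole-conf" "$HOME/qubole-cluster-conf.zip"`, where both paths are placeholders.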
