Managed File Transfer


Integration of Hadoop Distributed File System (HDFS) with Sterling Integrator

By Tanvi Kakodkar posted Thu January 30, 2020 05:54 AM

  
Originally updated on October 14, 2015 by BipinChandra


Introduction:

'Every two days now we create as much information as we did from the dawn of civilization up until 2003' - Eric Schmidt. Indeed, more than 90 percent of the world’s data was created in the last two years. Enterprises need to store and analyze huge amounts of structured and unstructured data from different data sources—data too massive to manage effectively with traditional relational databases.

Combining social data with traditional customer data can increase customer engagement. Enterprises that leverage 'Big Data' can become smarter about their customers, operations, and business environment. As volumes of business data continue to increase, organizations are rapidly adopting Hadoop to store, manage, and process 'Big Data' for use in analytics, business intelligence (BI), and decision support. Experts believe that the size of Hadoop installations will grow significantly as companies incorporate new data sources and business applications into their Hadoop systems.

Hadoop has become popular because it is designed to cheaply store data in the Hadoop Distributed File System (HDFS) and run large-scale MapReduce jobs for batch analysis. HDFS is a Java-based file system that provides scalable and reliable data storage across clusters of commodity servers.

It’s hard to avoid the term 'big data' if you work in the B2B domain. This documentation shows our customers how to use Hadoop to store structured and unstructured data in the Hadoop Distributed File System. Customers can then build on it to perform any analysis or research that benefits their business and their partners' business needs.

This step-by-step documentation demonstrates a seamless integration of HDFS with IBM Sterling B2B Integrator. Sterling Integrator is a business process-centric transaction engine that enables companies to address all of their B2B integration needs through a single, flexible B2B gateway. We make use of business processes and services to integrate with HDFS, which can be enhanced to perform all the HDFS-supported shell commands, such as list files, put files, get files, delete files, create a directory, and more.

By bringing structured information together with unstructured data, customers can leverage it for B2B analysis. Enterprises must find fast and cost-effective ways to move information from many different sources into the Hadoop system. This implementation provides a solution that moves Big Data in and out of Apache Hadoop using IBM Sterling B2B Integrator, enabling companies to load and access massive amounts of data for analytics.

Assessing current technology can help companies understand the basic functional needs of various IT initiatives and discover new capabilities they can already deliver. In many cases, an existing B2B gateway can be extended to enable the success of other, non-e-commerce IT initiatives.

Benefits:

  • Cheap, abundant storage for structured and unstructured data

  • Enterprise connectivity to Hadoop

  • Consolidate file transfer processes under one managed solution, IBM B2Bi

  • Can move records from any database to the Hadoop Distributed File System

  • Can move records from the local file system or the SI server to the Hadoop Distributed File System

  • Helps organizations with access to large data collections harness the most relevant data and use it for optimized decision making

HDFS Client Service

The HDFS Client Service from Sterling Integrator creates seamless connectivity to the Hadoop Distributed File System (HDFS). HDFS is a scalable, reliable, and portable file system within the Apache Hadoop framework. The HDFS Client Service allows users to interact with HDFS: SI users can easily get and put files and perform many more functions on the HDFS server. Customers can extend it to implement all the other supported HDFS shell commands.

 

HDFS-SI Integration Architecture Diagram

Figure 1.0

 

Java implementation:

HDFS is a Java-based file system that provides scalable and reliable data storage across clusters of commodity servers. Download the Hadoop package from the Apache Software Foundation and use the HDFS libraries to create a Java implementation class, HadoopClient, which implements SI's IService interface to form the HDFS Client Service.

  1. Creating a configuration object: To read from or write to HDFS, you need to create a Configuration object and pass configuration parameters to it using the Hadoop configuration files.

    Configuration conf = new Configuration();
    conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));
    conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/hdfs-site.xml"));

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
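
    If the Hadoop client XML files are not available on the SI host, the same setting can be supplied programmatically. A minimal sketch, assuming a hypothetical NameNode address (not the host used elsewhere in this article):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    // Point the client at the NameNode directly instead of loading XML resources.
    // "namenode.example.com:9000" is a placeholder for your NameNode host and port.
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode.example.com:9000"); // fs.defaultFS in newer Hadoop
    FileSystem fileSystem = FileSystem.get(conf);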

       

    2. Adding a file to HDFS: Create a FileSystem object and use a file stream to add a file.

      FileSystem fileSystem = FileSystem.get(conf);
      // Check if the file already exists
      Path path = new Path("/path/to/file.ext");
      if (fileSystem.exists(path)) {
          System.out.println("File " + path + " already exists");
          return;
      }
      // Create a new file and write data to it ("source" is the local file path).
      FSDataOutputStream out = fileSystem.create(path);
      InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));
      byte[] b = new byte[1024];
      int numBytes = 0;
      while ((numBytes = in.read(b)) > 0) {
          out.write(b, 0, numBytes);
      }
      // Close all the file descriptors
      in.close();
      out.close();
      fileSystem.close();

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
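
      As an aside, the Hadoop library ships a helper that replaces the manual copy loop above. A sketch using org.apache.hadoop.io.IOUtils:

      import org.apache.hadoop.io.IOUtils;

      // Copies the stream in 4 KB chunks and closes both streams when finished.
      IOUtils.copyBytes(in, out, 4096, true);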

       

    3. Reading a file from HDFS: Create a file stream object to a file in HDFS and read it.

      FileSystem fileSystem = FileSystem.get(conf);
      String file = "/path/to/file.ext";
      Path path = new Path(file);
      if (!fileSystem.exists(path)) {
          System.out.println("File does not exist");
          return;
      }
      FSDataInputStream in = fileSystem.open(path);
      // Derive the local file name from the HDFS path
      String filename = file.substring(file.lastIndexOf('/') + 1, file.length());
      OutputStream out = new BufferedOutputStream(new FileOutputStream(new File(filename)));
      byte[] b = new byte[1024];
      int numBytes = 0;
      while ((numBytes = in.read(b)) > 0) {
          out.write(b, 0, numBytes);
      }
      in.close();
      out.close();
      fileSystem.close();

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
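
      For completeness, the FileSystem API also offers a one-call download that does the same thing. A sketch assuming the same HDFS path:

      // Downloads the HDFS file to the local working directory in a single call.
      fileSystem.copyToLocalFile(new Path("/path/to/file.ext"), new Path("file.ext"));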

       

    4. Deleting a file from HDFS: Create a Path object to a file in HDFS and delete it.

      FileSystem fileSystem = FileSystem.get(conf);
      Path path = new Path("/path/to/file.ext");
      if (!fileSystem.exists(path)) {
          System.out.println("File does not exist");
          return;
      }
      // Delete the file; the second argument enables recursive deletion for directories
      fileSystem.delete(path, true);
      fileSystem.close();

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

       

    5. Creating a directory in HDFS: Create a Path object for the directory and create it.

      FileSystem fileSystem = FileSystem.get(conf);
      Path path = new Path("/path/to/dir");
      if (fileSystem.exists(path)) {
          System.out.println("Dir " + path + " already exists");
          return;
      }
      // Create the directory, including any missing parent directories
      fileSystem.mkdirs(path);
      fileSystem.close();

      -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

       

    Here is the complete working client Java class.

    package com.sterlingcommerce.fg.services;                                                                                                                                                                               
    import com.sterlingcommerce.woodstock.mailbox.MailboxException;
    import com.sterlingcommerce.woodstock.mailbox.MailboxFactory;
    import com.sterlingcommerce.woodstock.mailbox.repository.IRepository;
    import com.sterlingcommerce.woodstock.mailbox.repository.Message;
    import com.sterlingcommerce.woodstock.services.IService;
    import com.sterlingcommerce.woodstock.util.frame.Manager;
    import com.sterlingcommerce.woodstock.util.frame.log.LogService;
    import com.sterlingcommerce.woodstock.util.frame.log.Logger;
    import com.sterlingcommerce.woodstock.workflow.Document;
    import com.sterlingcommerce.woodstock.workflow.DocumentNotFoundException;
    import com.sterlingcommerce.woodstock.workflow.WorkFlowContext;
    import com.sterlingcommerce.woodstock.workflow.WorkFlowException;
    import java.io.BufferedInputStream;
    import java.io.BufferedOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.sql.SQLException;
    import java.util.Properties;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HadoopClient implements IService {
    private static Logger logger = LogService.getLogger("systemlogger");
    private static String defaultDirectory = null;
    private static boolean isOK = true;

    static {
        Properties props = Manager.getProperties("jdbcService");
        defaultDirectory = props.getProperty("document_dir");
    }

    public WorkFlowContext processData(WorkFlowContext wfc)  throws WorkFlowException {
        isOK = true;
        try {
            wfc.harnessRegister();
            String tmp = wfc.getParm("action");
            String local_path = wfc.getParm("local_path");
            String messageId = wfc.getParm("messageId");
            String hdfs_path = wfc.getParm("hdfs_path");

            if ("add".equalsIgnoreCase(tmp)) {
               System.out.println("Usage: hdfsclient add <local_path> " + "<hdfs_path>");
               System.out.println("HadoopClient.processData(): messageId:" + messageId);
               System.out.println("HadoopClient.processData(): hdfs_path:" + hdfs_path);
               if (local_path != null && !"".equalsIgnoreCase(local_path)) {
                    addFile(local_path, hdfs_path);
               } else if (messageId != null && !"".equalsIgnoreCase(messageId)) {
                    addFileFromMailbox(messageId, hdfs_path);
               }
            } else if ("read".equalsIgnoreCase(tmp)) {
                    System.out.println("Usage: hdfsclient read <hdfs_path>");
                    readFile(hdfs_path);
            } else if ("delete".equalsIgnoreCase(tmp)) {
                    System.out.println("Usage: hdfsclient delete <hdfs_path>");
                    deleteFile(hdfs_path);
            } else if ("mkdir".equalsIgnoreCase(tmp)) {
                    System.out.println("Usage: hdfsclient mkdir <hdfs_path>");
                    mkdir(hdfs_path);
            } else {
                    System.out.println("Usage: hdfsclient add/read/delete/mkdir" + " [<local_path> <hdfs_path>]");
                    // Never call System.exit() inside a service: it would bring down the SI JVM.
                    isOK = false;
            }

            System.out.println("Done!");
       } catch (IOException e) {
            isOK = false;
            wfc.setBasicStatus(1);
            throw new WorkFlowException(e.toString());
       } finally {
            wfc.unregisterThread();
            if (isOK) {
                    wfc.setBasicStatusSuccess();
            } else {
                    wfc.setBasicStatusError();
            }
       }
       return wfc;
    }

    public void addFile(String source, String dest) throws IOException {
            Configuration conf = new Configuration();

            // Conf object will read the HDFS configuration parameters from these XML files.
            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));
            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/hdfs-site.xml"));
           
            FileSystem fileSystem = FileSystem.get(conf);

            // Get the filename out of the file path
            String filename = null;
            System.out.println("HadoopClient.addFile():dest 1:" + dest);                                                                                                                 
            if (System.getProperty("os.name").contains("Windows")) {
                    filename = source.substring(source.lastIndexOf('\\') + 1,
                    source.length());
                    // Create the destination path including the filename.
                    if (dest.charAt(dest.length() - 1) != '\\') {
                            dest = dest + "\\" + filename;
                    } else {
                            dest = dest + filename;
                    }
                    System.out.println("HadoopClient.addFile():WIN");
            } else {
                    filename = source.substring(source.lastIndexOf('/') + 1,
                    source.length());
                    // Create the destination path including the filename.
                    if (dest.charAt(dest.length() - 1) != '/') {
                            dest = dest + "/" + filename;
                    } else {
                            dest = dest + filename;
                    }
                    System.out.println("HadoopClient.addFile():Non WIN");
            }

            System.out.println("HadoopClient.addFile():filename:" + filename);
            System.out.println("HadoopClient.addFile():dest 2:" + dest);

            // Check if the file already exists
            Path path = new Path(dest);
            if (fileSystem.exists(path)) {
                    System.out.println("File " + dest + " already exists");
                    return;
            }

            // Create a new file and write data to it.
            FSDataOutputStream out = fileSystem.create(path);
            InputStream in = new BufferedInputStream(new FileInputStream(new File(source)));                                                                           
            byte[] b = new byte[1024];
            int numBytes = 0;
            while ((numBytes = in.read(b)) > 0) {
                    out.write(b, 0, numBytes);
            }

            // Close all the file descriptors
            in.close();
            out.close();
            fileSystem.close();
    }

    public void addFileFromMailbox(String messageId, String dest) throws IOException {
            Configuration conf = new Configuration();

            // Conf object will read the HDFS configuration parameters from these XML files.
            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));
            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/hdfs-site.xml"));
            FileSystem fileSystem = FileSystem.get(conf);
            IRepository repos = MailboxFactory.getRepository();
            String filename = null;
            Message message = null;
            Document batchDoc = null;
            try {
                    System.out.println("HadoopClient.addFileFromMailbox():messageId:" + messageId);
                    message = repos.getMessage(Long.parseLong(messageId));// source here is messageid
                    System.out.println("HadoopClient.addFileFromMailbox(): message:" + message.toString());                                                      
                    String docId = message.getDocumentId();
                    System.out.println("HadoopClient.addFileFromMailbox(): docId:" + docId);
                    filename = message.getMessageName();
                    System.out.println("HadoopClient.addFileFromMailbox():filename:" + filename);
                    if (docId != null) {
                            batchDoc = new Document(docId, true);
                    }
                    System.out.println("HadoopClient.addFileFromMailbox(): batchDoc:" + batchDoc);
            } catch (NumberFormatException e) {
                    e.printStackTrace();
            } catch (MailboxException e) {
                    e.printStackTrace();
            } catch (DocumentNotFoundException e) {
                    e.printStackTrace();
            } catch (SQLException e) {
                    e.printStackTrace();
            }

            // Get the filename out of the file path
            System.out.println("HadoopClient.addFileFromMailbox():dest 1:" + dest);
            System.out.println("HadoopClient.addFileFromMailbox():OS:" + System.getProperty("os.name"));                                                   
            if (System.getProperty("os.name").contains("Windows")) {
                    if (dest.charAt(dest.length() - 1) != '\\') {
                            dest = dest + "\\";
                    }
                    System.out.println("HadoopClient.addFileFromMailbox():WIN");
            } else {
                    if (dest.charAt(dest.length() - 1) != '/') {
                    dest = dest + "/";
                    }
                    System.out.println("HadoopClient.addFileFromMailbox():Non WIN");
            }

            System.out.println("HadoopClient.addFileFromMailbox():dest 2:" + dest);
            dest = dest + filename;
            // System.out.println("Adding file to " + destination);

            // Check if the file already exists
            Path path = new Path(dest);
            if (fileSystem.exists(path)) {
                    System.out.println("File " + dest + " already exists");
                    return;
            }

            // Create a new file and write data to it.
            FSDataOutputStream out = fileSystem.create(path);
            InputStream in = batchDoc.getInputStream();

            byte[] b = new byte[1024];
            int numBytes = 0;
            while ((numBytes = in.read(b)) > 0) {
                    out.write(b, 0, numBytes);
            }

            // Close all the file descriptors
            in.close();
            out.close();
            fileSystem.close();
    }

    public void readFile(String file) throws IOException {
            Configuration conf = new Configuration();

            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));                                                                       
            FileSystem fileSystem = FileSystem.get(conf);
            Path path = new Path(file);
            if (!fileSystem.exists(path)) {
                    System.out.println("File " + file + " does not exist");
                    return;
            }

            FSDataInputStream in = fileSystem.open(path);

            String filename = file.substring(file.lastIndexOf('/') + 1, file.length());

            OutputStream out = new BufferedOutputStream(new FileOutputStream(new File(filename)));                                                            

            byte[] b = new byte[1024];
            int numBytes = 0;
            while ((numBytes = in.read(b)) > 0) {
                    out.write(b, 0, numBytes);
            }

            in.close();
            out.close();
            fileSystem.close();
    }

    public void deleteFile(String file) throws IOException {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));                                                                       

            FileSystem fileSystem = FileSystem.get(conf);

            Path path = new Path(file);
            if (!fileSystem.exists(path)) {
                    System.out.println("File " + file + " does not exist");
                    return;
            }

            fileSystem.delete(new Path(file), true);

            fileSystem.close();
    }

    public void mkdir(String dir) throws IOException {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));                                                                       

            FileSystem fileSystem = FileSystem.get(conf);

            Path path = new Path(dir);
            if (fileSystem.exists(path)) {
                    System.out.println("Dir " + dir + " already exists");
                    return;
            }

            fileSystem.mkdirs(path);

            fileSystem.close();
    }

    }

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
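
    To sanity-check the class before wiring it into a business process, its public methods can be driven directly. A hypothetical main method added to HadoopClient (note the static initializer expects SI's jdbcService properties to resolve, so the classpath and properties must mirror the SI runtime; all paths below are examples reusing the HDFS host from this article):

    // Hypothetical smoke test for HadoopClient; paths are placeholders.
    public static void main(String[] args) throws IOException {
        HadoopClient client = new HadoopClient();
        client.mkdir("hdfs://blrgislin33.sciblr.in.ibm.com:9000/smoketest");
        client.addFile("/tmp/testfile.txt", "hdfs://blrgislin33.sciblr.in.ibm.com:9000/smoketest");
        client.readFile("hdfs://blrgislin33.sciblr.in.ibm.com:9000/smoketest/testfile.txt");
        client.deleteFile("hdfs://blrgislin33.sciblr.in.ibm.com:9000/smoketest/testfile.txt");
    }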

     

     
    Service Implementation – serviceinstances.xml

    <?xml version="1.0" encoding="UTF-8"?>                                                                                                                                                    
    <services>
            <service parentdefname="TestFileServiceType" name="TestFileService"
            displayname="Simple Test File Service"
            description="Simple Test File Service"
            targetenv="all"
            activestatus="1"
            systemservice="0"
            parentdefid="-1"/>
    </services>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     
    Service Definitions – servicedefs/TestServices.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <SERVICES>
            <SERVICE name="TestFileServiceType"
            description="Simple Test File Service"
            label="TestFileService"
            implementationType="CLASS"
            JNDIName="com.sterlingcommerce.fg.services.HadoopClient"
            type="Basic"
            adapterType="N/A"
            adapterClass="N/A"
            version="1.0"
            SystemService="NO" />
    </SERVICES>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    Business Processes:

    The following BPs make use of the HDFS Client Service to GET a file, PUT a file, DELETE a file, and create a directory on the HDFS file system.

    1> Using the Lightweight JDBC adapter – pulling records from a relational database table, creating an XML file from them, and putting it on the HDFS server.

    <process name="TestFileService">                                                                                            
      <sequence>
        <operation name="LightweightJDBCAdapterType">
          <participant name="LightweightJDBCAdapterQuery"/>
          <output message="LightweightJDBCAdapterTypeInputMessage">
            <assign to="schedHour">-1</assign>
            <assign to="result_name">result</assign>
            <assign to="schedDay">-2</assign>
            <assign to="sql">SELECT * FROM woodstock.Persons</assign>
            <assign to="pool">mysqlPool</assign>
            <assign to="row_name">row</assign>
            <assign to="schedMinute">-1</assign>
            <assign to="query_type">SELECT</assign>
            <assign to="." from="*"></assign>
          </output>
          <input message="inmsg">
            <assign to="." from="*"></assign>
          </input>
        </operation>
    
        <operation name="File System Adapter">
          <participant name="LWJDBC_TEST"/>
          <output message="FileSystemInputMessage">
            <assign to="assignedFilename">AllPersons.xml</assign>
            <assign to="schedMinute">-1</assign>
            <assign to="schedHour">-1</assign>
            <assign to="Action">FS_EXTRACT</assign>
            <assign to="schedDay">-2</assign>
            <assign to="extractionFolder">/ais_local/share/bchandra/526_61700/g/platform_core/install/regression/lightweight-jdbcadapter/
    resultdata/</assign>
            <assign to="assignFilename">true</assign>
            <assign to="." from="*"></assign>
          </output>
          <input message="inmsg">
            <assign to="." from="*"></assign>
          </input>
        </operation>
            <operation name="Extract File">
              <participant name='TestFileService'/>
              <output message='xout'>
                <assign to='action'>ADD</assign>
                    <assign to='local_path'>/ais_local/share/bchandra/526_61700/g/platform_core/install/regression/lightweight-jdbcadapter/
    resultdata/AllPersons.xml</assign>
                    <assign to='hdfs_path'>hdfs://blrgislin33.sciblr.in.ibm.com:9000/a</assign>
                <assign to='.' from='PrimaryDocument' />
              </output>
             <input message="xin">
               <assign to="." from="*"/>
             </input>
            </operation>
      </sequence>
    </process>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    2> PUT File

    <process name="TestFileService">
      <sequence>
            <operation name="Extract File">
              <participant name='TestFileService'/>
              <output message='xout'>
                <assign to='action'>ADD</assign>
                    <assign to='local_path'>/ais_local/share/bchandra/526_61700/g/platform_core/install/documents/testfile.txt</assign>
                    <assign to='hdfs_path'>hdfs://blrgislin33.sciblr.in.ibm.com:9000/a</assign>
                <assign to='.' from='PrimaryDocument' />
              </output>
             <input message="xin">
               <assign to="." from="*"/>
             </input>
            </operation>
      </sequence>
    </process>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    3> GET File

    <process name="TestFileService">
      <sequence>
            <operation name="Extract File">
              <participant name='TestFileService'/>
              <output message='xout'>
                <assign to='action'>READ</assign>
                    <assign to='hdfs_path'>hdfs://blrgislin33.sciblr.in.ibm.com:9000/a/testfile.txt</assign>                               
                <assign to='.' from='PrimaryDocument' />
              </output>
             <input message="xin">
               <assign to="." from="*"/>
             </input>
            </operation>
      </sequence>
    </process>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    4> DELETE File

    <process name="TestFileService">
      <sequence>
            <operation name="Extract File">
              <participant name='TestFileService'/>
              <output message='xout'>
                <assign to='action'>DELETE</assign>
                    <assign to='hdfs_path'>hdfs://blrgislin33.sciblr.in.ibm.com:9000/a/testfile.txt</assign>                               
                <assign to='.' from='PrimaryDocument' />
              </output>
             <input message="xin">
               <assign to="." from="*"/>
             </input>
            </operation>
      </sequence>
    </process>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    5> MAKE DIRECTORY

    <process name="TestFileService">
      <sequence>
            <operation name="Extract File">
              <participant name='TestFileService'/>
              <output message='xout'>
                <assign to='action'>MKDIR</assign>
                    <assign to='hdfs_path'>hdfs://blrgislin33.sciblr.in.ibm.com:9000/bipin</assign>                                     
                <assign to='.' from='PrimaryDocument' />
              </output>
             <input message="xin">
               <assign to="." from="*"/>
             </input>
            </operation>
      </sequence>
    </process>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    Hadoop Client Configuration:

     

    core-site.xml

    The core-site.xml file contains information such as the port number used for the Hadoop instance, the memory allocated for the file system, the memory limit for storing data, and the size of the read/write buffers.

    Open core-site.xml and add the following properties between the <configuration> and </configuration> tags.

    Create the file /ais_local/share/bchandra/hadoop/client/core-site.xml:

    <configuration>
            <property>
                    <name>fs.default.name</name>
                    <value>hdfs://blrgislin33.sciblr.in.ibm.com:9000</value>
            </property>
    </configuration>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
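
    To confirm the client is picking up the NameNode address, the loaded configuration can be inspected. A minimal sketch, assuming the file path above:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    conf.addResource(new Path("/ais_local/share/bchandra/hadoop/client/core-site.xml"));
    // Prints hdfs://blrgislin33.sciblr.in.ibm.com:9000 if the resource was found and parsed
    System.out.println(conf.get("fs.default.name"));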

     

    hdfs-site.xml

    The hdfs-site.xml file contains information such as the replication factor, the namenode path, and the datanode paths on your local file systems, that is, the locations where you want the Hadoop infrastructure to store its data.

    Create the file /ais_local/share/bchandra/hadoop/client/hdfs-site.xml:

    <configuration>
            <property>
                    <name>dfs.replication</name>
                    <value>1</value>
            </property>
            <property>
                    <name>dfs.name.dir</name>
                    <value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
            </property>
            <property>
                    <name>dfs.data.dir</name>
                    <value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
            </property>
    </configuration>

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    Installation of all dependent client jars – the following list shows all the Hadoop client jars that need to be in dynamicclasspath.cfg.

     Go to ./install/bin

    To resolve compilation issues:

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/hadoop-common-2.7.1.jar                                         

     

    To resolve runtime issues:

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/asm-3.2.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/commons-cli-1.2.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/commons-codec-1.4.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/commons-daemon-1.0.13.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/commons-io-2.4.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/commons-lang-2.6.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/commons-logging-1.1.3.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/guava-11.0.2.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/htrace-core-3.1.0-incubating.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jackson-core-asl-1.9.13.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jackson-mapper-asl-1.9.13.jar                                  

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jersey-core-1.9.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jersey-server-1.9.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jetty-6.1.26.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jetty-util-6.1.26.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/jsr305-3.0.0.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/leveldbjni-all-1.8.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/log4j-1.2.17.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/netty-3.6.2.Final.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/netty-all-4.0.23.Final.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/protobuf-java-2.5.0.jar                                             

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/servlet-api-2.5.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/xercesImpl-2.9.1.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/xml-apis-1.3.04.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/lib/xmlenc-0.52.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/hdfs/hadoop-hdfs-2.7.1.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/commons-collections-3.2.1.jar                            

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/hadoop-annotations-2.7.1.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/hadoop-auth-2.7.1.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/htrace-core-3.1.0-incubating.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/slf4j-api-1.7.10.jar

    ./install3rdParty.sh hadoop 2_7_1 -j /home/bchandra/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar

    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
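
    Once the jars are registered, a quick way to confirm they resolve on SI's dynamic classpath is to load a couple of core Hadoop classes. A hypothetical check (for example, from a temporary test service):

    // Throws ClassNotFoundException if the Hadoop jars are not on the classpath.
    Class.forName("org.apache.hadoop.fs.FileSystem");
    Class.forName("org.apache.hadoop.hdfs.DistributedFileSystem");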

     

    Create Service Installable jar

    All of the above class files and configuration files need to be bundled in a jar file as follows and then installed as a service in SI.

    The sample jar file, installHDFSClient.jar, has been uploaded for your reference. Make sure the directory and its files match the directory structure provided in the attached jar file.

     

    Installing HDFS Client Service jar

    Download the attached HDFS Client jar to try the sample. Click this link to download - installHDFSClient.jar

    Now go to ./install/bin

    ./InstallService.sh /home/bchandra/hadoop/client/installHDFSClient.jar

     
    Setting up Hadoop Server

    Follow this URL step by step to set up the Hadoop environment: http://www.tutorialspoint.com/hadoop/hadoop_enviornment_setup.htm

     
    Execute CRUD operations

    Execute the above BPs to perform the desired actions.

    References:

    http://www.tutorialspoint.com/hadoop/hadoop_enviornment_setup.htm

    http://blog.rajeevsharma.in/2009/06/using-hdfs-in-java-0200.html

