Operation and Setup of HDFSBridge


#1

I am trying to set up and operate HdfsBridge by following a document prepared by Josh Patterson.
Here is the link to the document:
https://openpdc.svn.codeplex.com/svn/Hadoop/Current%20Version/HdfsBridge/docs/openPDC%20HdfsBridge%20Setup%20Guide.pdf
I have completed some of the configuration steps described in the document, but from the section on operating HdfsBridge through to the end I am confused. For example, in the HDFS Checksumming part, I tried to run the “HDFSCHKSM” command in the WinSCP console, but I got a “command not found” error.
Could someone explain exactly how to configure this, and show me how to run it and send archived data to Hadoop?


#2

If the bridge is operational, you can enable Hadoop replication for the local openPDC historian in the openPDC.exe.config file. It is easier to edit this file with the included XML Configuration Editor; you can find a shortcut to it in the Start menu under openPDC.

Look for the setting category called ppaHadoopReplicationProvider and set the Enabled flag to true. You will also need to configure some of the other settings there, e.g., ArchiveLocation, which is typically C:\Program Files\openPDC\Archive\, as well as ReplicaLocation, which is the Hadoop mirrored location…
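If you want to confirm whether that category is present before editing anything, a small script along these lines can list what the file contains. This is only a sketch: it assumes the config file stores its settings as `<add name="..." value="..." />` entries grouped by category under a `categorizedSettings` element, and the sample XML, the `find_category` and `list_categories` helpers, and the install path in the comment are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Hypothetical sample that mimics the assumed layout of openPDC.exe.config.
SAMPLE = r"""<configuration>
  <categorizedSettings>
    <ppaHadoopReplicationProvider>
      <add name="Enabled" value="False" />
      <add name="ArchiveLocation" value="C:\Program Files\openPDC\Archive\" />
    </ppaHadoopReplicationProvider>
  </categorizedSettings>
</configuration>"""


def list_categories(root: ET.Element) -> List[str]:
    """Return the names of all setting categories found in the file."""
    parent = root.find("categorizedSettings")
    return [] if parent is None else [child.tag for child in parent]


def find_category(root: ET.Element, category: str) -> Optional[Dict[str, str]]:
    """Return {setting name: value} for one category, or None if it is absent."""
    section = root.find("categorizedSettings/" + category)
    if section is None:
        return None
    return {item.get("name"): item.get("value") for item in section.findall("add")}


if __name__ == "__main__":
    # For a real file, replace the next line with something like (path is an assumption):
    # root = ET.parse(r"C:\Program Files\openPDC\openPDC.exe.config").getroot()
    root = ET.fromstring(SAMPLE)

    settings = find_category(root, "ppaHadoopReplicationProvider")
    if settings is None:
        print("Category not found. Categories present:", list_categories(root))
    else:
        for name in ("Enabled", "ArchiveLocation", "ReplicaLocation"):
            print(name, "=", settings.get(name, "<missing>"))
```

If the script reports that the category is not found, the list of categories it prints shows what the installed version actually exposes.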


#3

Dear ritchiecarroll,
Thank you for your reply. I found the openPDC.exe.config file, but the ppaHadoopReplicationProvider category does not exist in it, so I cannot set it to enabled. I also did not find ArchiveLocation or ReplicaLocation. In fact, there is no Archive folder in the openPDC installation path. Could this be related to the openPDC version? I installed v2.2.
Is this setting required if I want to bridge openPDC and HDFS, using HdfsBridge as a gateway that sends data from openPDC to HDFS over FTP? Alternatively, could openPDC itself support that mechanism without HdfsBridge?


#4

You may need to re-run the configuration setup utility to make sure you enabled the PPA historian during the initial configuration process:


#5

Hello Ritchie,

I re-ran the configuration setup utility, but I still did not find the categories you mentioned in openPDC.exe.config.

By the way, I am a beginner in this area. I want to send data to Hadoop, assuming openPDC has already received the data from the PDCs. Let me explain my progress. First, I installed openPDC on Windows 7. Second, I made a few configuration changes in the “hdfs-over-ftp-master” package, based on http://openpdc.codeplex.com/wikipage?title=Using%20Hadoop%20(Developers).
Third, as I mentioned, I followed the "openPDC HdfsBridge: Setup / Operation" document and completed some parts of the setup and configuration. After that, I ran into the issue I described.

Please tell me whether I am doing this correctly, and point out any mistakes.
Is there another way to send the data?


#6

You should still be able to find HDFS-Bridge code here:

See /Hadoop/Current Version/HdfsBridge

I am not sure this code was ever migrated to GitHub…

Thanks,
Ritchie


#7

Actually - looks like it ended up in GSF:

Thanks,
Ritchie