Entry point
In the hdfs script, every dfs operation is routed through the class org.apache.hadoop.fs.FsShell. The relevant code is at line 153 of the script:
elif [ "$COMMAND" = "dfs" ] ; then
  CLASS=org.apache.hadoop.fs.FsShell
  HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"
The main method of this class is only a few lines long:
/**
 * main() has some simple utility methods
 * @param argv the command and its arguments
 * @throws Exception upon error
 */
public static void main(String argv[]) throws Exception {
  FsShell shell = newShellInstance();
  Configuration conf = new Configuration();
  conf.setQuietMode(false);
  shell.setConf(conf);
  int res;
  try {
    res = ToolRunner.run(shell, argv);
  } finally {
    shell.close();
  }
  System.exit(res);
}
ToolRunner is a shared runner utility. To run a task with it, the class that holds the task logic must implement the Tool interface, including that interface's run(String[] args) method.
public class ToolRunner {
  /**
   * Runs the given <code>Tool</code> by {@link Tool#run(String[])}, after
   * parsing with the given generic arguments. Uses the given
   * <code>Configuration</code>, or builds one if null.
   *
   * Sets the <code>Tool</code>'s configuration with the possibly modified
   * version of the <code>conf</code>.
   *
   * @param conf <code>Configuration</code> for the <code>Tool</code>.
   * @param tool <code>Tool</code> to run.
   * @param args command-line arguments to the tool.
   * @return exit code of the {@link Tool#run(String[])} method.
   */
  public static int run(Configuration conf, Tool tool, String[] args)
      throws Exception {
    if (conf == null) {
      conf = new Configuration();
    }
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    // set the configuration back, so that Tool can configure itself
    tool.setConf(conf);
    // get the args w/o generic hadoop args
    String[] toolArgs = parser.getRemainingArgs();
    return tool.run(toolArgs);
  }

  /**
   * Runs the <code>Tool</code> with its <code>Configuration</code>.
   *
   * Equivalent to <code>run(tool.getConf(), tool, args)</code>.
   *
   * @param tool <code>Tool</code> to run.
   * @param args command-line arguments to the tool.
   * @return exit code of the {@link Tool#run(String[])} method.
   */
  public static int run(Tool tool, String[] args)
      throws Exception {
    return run(tool.getConf(), tool, args);
  }
  // (remaining members are omitted from this excerpt)
}
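To make the hand-off concrete, here is a minimal, self-contained sketch of the same flow: parse the generic options into a configuration, set that configuration on the tool, then call the tool with the remaining arguments. It does not use Hadoop's classes. MiniTool, MiniToolRunner, EchoTool and the demo.key setting are hypothetical stand-ins invented for illustration, the configuration is modelled as a plain Map, and only leading "-D key=value" pairs are treated as generic options.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Demo entry point for the sketch.
final class ToolRunnerSketch {
  public static void main(String[] argv) {
    EchoTool tool = new EchoTool();
    int exit = MiniToolRunner.run(
        tool, new String[] {"-D", "demo.key=demo-value", "-ls", "/tmp"});
    System.out.println("exit=" + exit + " conf=" + tool.getConf()
        + " toolArgs=" + String.join(" ", tool.seenArgs));
  }
}

// Hypothetical stand-in for the Tool interface: configurable, with a run method.
interface MiniTool {
  void setConf(Map<String, String> conf);
  Map<String, String> getConf();
  int run(String[] args);
}

// Hypothetical stand-in for ToolRunner plus the generic options parser.
final class MiniToolRunner {
  private MiniToolRunner() {}

  static int run(MiniTool tool, String[] args) {
    Map<String, String> conf = tool.getConf();
    if (conf == null) {
      conf = new LinkedHashMap<>();
    }
    // Assumption of this sketch: only leading "-D key=value" pairs are generic options.
    int i = 0;
    while (i + 1 < args.length && "-D".equals(args[i]) && args[i + 1].contains("=")) {
      String[] kv = args[i + 1].split("=", 2);
      conf.put(kv[0], kv[1]);
      i += 2;
    }
    // Everything after the generic options belongs to the tool itself.
    List<String> remaining = new ArrayList<>();
    for (; i < args.length; i++) {
      remaining.add(args[i]);
    }
    // Set the (possibly modified) configuration back so the tool can configure itself.
    tool.setConf(conf);
    return tool.run(remaining.toArray(new String[0]));
  }
}

// A trivial tool that records the arguments it was given.
final class EchoTool implements MiniTool {
  private Map<String, String> conf;
  String[] seenArgs = new String[0];

  @Override public void setConf(Map<String, String> conf) { this.conf = conf; }
  @Override public Map<String, String> getConf() { return conf; }
  @Override public int run(String[] args) {
    seenArgs = args;
    return args.length == 0 ? 1 : 0;
  }
}
```

The point of the split is that the tool never sees the generic options: by the time run is called, they have already been folded into its configuration.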
From the main method of FsShell we can see that the Tool implementation passed in is the FsShell object itself, so we go back and look at the run method of FsShell.
/**
 * run
 */
@Override
public int run(String argv[]) throws Exception {
  // initialize FsShell
  init();
  int exitCode = -1;
  if (argv.length < 1) {
    printUsage(System.err);
  } else {
    String cmd = argv[0];
    Command instance = null;
    try {
      /**
       * This line looks up the class that implements the requested command.
       * HDFS wraps every command behind the Command type.
       */
      instance = commandFactory.getInstance(cmd);
      if (instance == null) {
        throw new UnknownCommandException();
      }
      exitCode = instance.run(Arrays.copyOfRange(argv, 1, argv.length));
    } catch (IllegalArgumentException e) {
      displayError(cmd, e.getLocalizedMessage());
      if (instance != null) {
        printInstanceUsage(System.err, instance);
      }
    } catch (Exception e) {
      // instance.run catches IOE, so something is REALLY wrong if here
      LOG.debug("Error", e);
      displayError(cmd, "Fatal internal error");
      e.printStackTrace(System.err);
    }
  }
  return exitCode;
}
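The dispatch logic in this run method can be reduced to a small sketch: look the command name up in a factory, fail with an error if nothing is registered, otherwise hand the rest of the arguments to the command and return its exit code. MiniShell, MiniCommandFactory and MiniCommand below are hypothetical names, not Hadoop classes, and the "-count" command is invented for the demo. The real factory and its registration mechanism are more involved than a map of suppliers.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Demo entry point for the sketch.
final class DispatchSketch {
  public static void main(String[] argv) {
    MiniCommandFactory factory = new MiniCommandFactory();
    // Hypothetical command: its exit code is the number of arguments it received.
    factory.register("-count", () -> args -> args.length);
    MiniShell shell = new MiniShell(factory);
    System.out.println(shell.run(new String[] {"-count", "a", "b"}));
    System.out.println(shell.run(new String[] {"-missing"}));
  }
}

// Hypothetical stand-in for a shell command.
interface MiniCommand {
  int run(String[] args);
}

// Maps a command name to a supplier that creates a fresh command instance.
final class MiniCommandFactory {
  private final Map<String, Supplier<MiniCommand>> registry = new HashMap<>();

  void register(String name, Supplier<MiniCommand> supplier) {
    registry.put(name, supplier);
  }

  // Returns null when no command is registered under the given name.
  MiniCommand getInstance(String name) {
    Supplier<MiniCommand> supplier = registry.get(name);
    return supplier == null ? null : supplier.get();
  }
}

// Mirrors the shape of the run method above: -1 unless a command ran and returned a code.
final class MiniShell {
  private final MiniCommandFactory factory;

  MiniShell(MiniCommandFactory factory) {
    this.factory = factory;
  }

  int run(String[] argv) {
    int exitCode = -1;
    if (argv.length < 1) {
      System.err.println("Usage: <command> [args...]");
      return exitCode;
    }
    String cmd = argv[0];
    try {
      MiniCommand instance = factory.getInstance(cmd);
      if (instance == null) {
        throw new IllegalArgumentException("Unknown command");
      }
      // The command only sees the arguments that follow its own name.
      exitCode = instance.run(Arrays.copyOfRange(argv, 1, argv.length));
    } catch (IllegalArgumentException e) {
      System.err.println(cmd + ": " + e.getMessage());
    }
    return exitCode;
  }
}
```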
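Using the tooling in IDEA we can see that Command declares a number of abstract methods and that its run method calls them in a fixed order. The subclasses of Command are FsCommand and DFSAdminCommand, and each of them in turn has several subclasses. This is a textbook use of the template method pattern. The overall class structure is shown in the figure below.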

With this structure in mind, inspecting any particular dfs operation comes down to reading the corresponding concrete implementation class.
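As a reminder of what the template method pattern looks like in code, here is a minimal sketch: the base class fixes the order of the steps in a final run method, and subclasses only fill in the steps. TemplateCommand, CountCommand and the hook names are illustrative assumptions and do not reproduce the signatures of Hadoop's Command class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Demo entry point for the sketch.
final class TemplateMethodSketch {
  public static void main(String[] argv) {
    CountCommand cmd = new CountCommand();
    int exit = cmd.run("-q", "/a", "/b");
    System.out.println("exit=" + exit + " quiet=" + cmd.quiet + " count=" + cmd.count);
  }
}

// The base class owns the algorithm; subclasses supply the individual steps.
abstract class TemplateCommand {
  // Template method: final, so the order of the steps cannot be changed by subclasses.
  public final int run(String... argv) {
    List<String> args = new ArrayList<>(Arrays.asList(argv));
    try {
      processOptions(args);    // step 1: consume flags
      processArguments(args);  // step 2: act on what is left
      return 0;
    } catch (IllegalArgumentException e) {
      System.err.println(getClass().getSimpleName() + ": " + e.getMessage());
      return 1;
    }
  }

  protected abstract void processOptions(List<String> args);

  protected abstract void processArguments(List<String> args);
}

// Hypothetical concrete command: counts its path arguments, with an optional -q flag.
final class CountCommand extends TemplateCommand {
  boolean quiet;
  int count;

  @Override
  protected void processOptions(List<String> args) {
    if (!args.isEmpty() && "-q".equals(args.get(0))) {
      quiet = true;
      args.remove(0);
    }
  }

  @Override
  protected void processArguments(List<String> args) {
    if (args.isEmpty()) {
      throw new IllegalArgumentException("at least one path is required");
    }
    count = args.size();
    if (!quiet) {
      System.out.println(count + " path(s)");
    }
  }
}
```

Adding a new command in this style means writing one more subclass; the caller keeps invoking run and never needs to know which subclass it holds.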
Remote execution
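The Command classes run inside a Java process on the local machine, while the dfs operations themselves are carried out on the HDFS cluster. Hadoop uses the protobuf framework for communication between processes, and the protocol spoken between the HDFS client and server is org.apache.hadoop.hdfs.protocol.ClientProtocol. On the client side, the DFSClient class wraps a proxy for this protocol; on the server side, NameNodeRpcServer implements it. When a command is executed, the Command implementation talks to NameNodeRpcServer through a DFSClient instance, and the actual work is done inside the NameNode.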
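The division of roles described above (a shared protocol interface, a client that wraps a proxy for it, a server that implements it) can be sketched with a JDK dynamic proxy. Everything here is a hypothetical stand-in: MiniProtocol plays the role of the protocol interface, MiniNameNode the server-side implementation, and MiniClient the client wrapper. The "transport" is a direct in-process call, whereas a real RPC layer would serialize each call and send it over the network.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Demo entry point for the sketch.
final class RpcProxySketch {
  public static void main(String[] argv) {
    MiniClient client = new MiniClient(new MiniNameNode());
    client.mkdirs("/user/demo");
    System.out.println(client.list("/user") + " via calls " + client.callLog);
  }
}

// Hypothetical protocol shared by client and server.
interface MiniProtocol {
  boolean mkdirs(String path);
  List<String> listing(String prefix);
}

// Server side: implements the protocol and owns the namespace state.
final class MiniNameNode implements MiniProtocol {
  private final Set<String> dirs = new TreeSet<>();

  @Override
  public boolean mkdirs(String path) {
    return dirs.add(path); // false if the directory already existed
  }

  @Override
  public List<String> listing(String prefix) {
    List<String> result = new ArrayList<>();
    for (String dir : dirs) {
      if (dir.startsWith(prefix)) {
        result.add(dir);
      }
    }
    return result;
  }
}

// Client side: wraps a proxy for the protocol and exposes convenience methods.
final class MiniClient {
  final List<String> callLog = new ArrayList<>();
  private final MiniProtocol namenode;

  MiniClient(MiniProtocol server) {
    InvocationHandler handler = (proxy, method, args) -> {
      callLog.add(method.getName());
      // Stand-in for the network hop: invoke the server object directly.
      return method.invoke(server, args);
    };
    this.namenode = (MiniProtocol) Proxy.newProxyInstance(
        MiniProtocol.class.getClassLoader(),
        new Class<?>[] {MiniProtocol.class},
        handler);
  }

  boolean mkdirs(String path) {
    return namenode.mkdirs(path);
  }

  List<String> list(String prefix) {
    return namenode.listing(prefix);
  }
}
```

In this arrangement the command code depends only on the client wrapper, so the state and the logic that changes it stay on the server side.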