Hadoop Web Page Information

Sometimes we want to confirm which branch the Hadoop code was compiled from. Normally we can find this information on the Hadoop web pages, for example the namenode web page:

(screenshot of the namenode web UI version information omitted)

If we want to change this information ourselves, we should modify the following file:
$HADOOP_HOME/hadoop-common-project/hadoop-common/src/main/resources/common-version-info.properties

version=${pom.version}
revision=${version-info.scm.commit}
branch=${version-info.scm.branch}
user=${user.name}
date=${version-info.build.time}
url=${version-info.scm.uri}
srcChecksum=${version-info.source.md5}
protocVersion=${protobuf.version}

Just modify these values and the namenode web page changes accordingly. For example, we can set branch=tag-20160817, as shown in the image above.
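These values are what the org.apache.hadoop.util.VersionInfo class reads at runtime, which is also where the web page gets them. A minimal sketch of reading them programmatically (the class name PrintVersion is just for illustration):

import org.apache.hadoop.util.VersionInfo;

public class PrintVersion {
  public static void main(String[] args) {
    // These getters read common-version-info.properties from the classpath
    System.out.println("version:  " + VersionInfo.getVersion());
    System.out.println("revision: " + VersionInfo.getRevision());
    System.out.println("branch:   " + VersionInfo.getBranch());
    System.out.println("compiled by " + VersionInfo.getUser() + " on " + VersionInfo.getDate());
  }
}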

Hadoop Balancer Problem

Recently I looked into how the cluster Balancer runs. Because the cluster is very large and some long-running jobs run continuously, disk imbalance is a serious problem, so balancing the data quickly is essential.
From the logs, however, our Balancer often hangs. Our previous workaround was a cron script that checks the Balancer and automatically restarts it after it has hung for a while.
Over the past two days I went through the logs and the code and found that the hang is caused by an NPE. The relevant code is as follows:

             // update locations
             for (String datanodeUuid : blk.getDatanodeUuids()) {
               final BalancerDatanode d = datanodeMap.get(datanodeUuid);
-              if (datanode != null) { // not an unknown datanode
+              if (d != null) { // not an unknown datanode
                 block.addLocation(d);
               }
             }

The wrong object was checked: changing datanode to d fixes it. The original check made the copy tasks submitted by the Balancer throw an NPE, which then bites in the dispatcher wait loop:

    for (Source source : sources) {
      futures[i++] = dispatcherExecutor.submit(source.new BlockMoveDispatcher());
    }

    // wait for all dispatcher threads to finish
    for (Future<?> future : futures) {
      try {
        future.get();
      } catch (ExecutionException e) {
        LOG.warn("Dispatcher thread failed", e.getCause());
      }
    }

In this part, because a submitted BlockMoveDispatcher task throws the NPE and future.get() has no timeout, the Balancer hangs forever.
Solution: besides fixing the NPE, we can also give future.get() a timeout to make sure it never hangs indefinitely, as sketched below.
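A minimal sketch of the bounded wait (the 30-minute timeout and the cancel-on-timeout behavior are assumptions for illustration, not the actual patch; it needs java.util.concurrent.TimeUnit and TimeoutException imports):

    for (Future<?> future : futures) {
      try {
        // bound the wait so one stuck BlockMoveDispatcher cannot hang the Balancer forever
        future.get(30, TimeUnit.MINUTES);
      } catch (ExecutionException e) {
        LOG.warn("Dispatcher thread failed", e.getCause());
      } catch (TimeoutException e) {
        LOG.warn("Dispatcher thread timed out", e);
        future.cancel(true); // interrupt the stuck task
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }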

Some Notes About Deploying Hadoop Clusters

1. Download the source code and use Maven to compile and package it; remember to compile the native libraries on the same OS version as the target servers.

mvn package -Pdist -Pnative -Dtar -DskipTests

2. Edit the core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml files (a minimal core-site.xml sketch follows the /etc/bashrc example).
You should also add JVM parameters and log locations in /etc/bashrc, for example:

export JAVA_HOME=/usr/local/jdk1.7.0_67
export JRE_HOME=/usr/local/jdk1.7.0_67/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export HADOOP_HOME=/usr/local/hadoop-2.4.0
export HADOOP_LOG_DIR=/data0/hadoop/log/hadoop
export HADOOP_PID_DIR=/data0/hadoop/pid/hadoop
export YARN_LOG_DIR=/data0/hadoop/log/yarn
export YARN_PID_DIR=/data0/hadoop/pid/yarn
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_NAMENODE_OPTS=" -Xmx20480m -Xms20480m -Xmn3072m -verbose:gc -Xloggc:/data0/hadoop/gclog/namenode.gc.log -XX:ErrorFile=/data0/hadoop/gclog/hs_err_pid.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=85 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=1000 -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -Dcom.sun.management.jmxremote.port=6000 -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.password.file=/usr/local/hadoop-2.4.0/etc/hadoop/jmxremote.password"
export YARN_RESOURCEMANAGER_OPTS=" -Xmx10240m -Xms10240m -Xmn3072m -verbose:gc -Xloggc:/data0/hadoop/gclog/yarn.gc.log -XX:ErrorFile=/data0/hadoop/gclog/hs_err_pid.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=1000 -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -Dcom.sun.management.jmxremote.port=6001 -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.password.file=/usr/local/hadoop-2.4.0/etc/hadoop/jmxremote.password"
ulimit -u 65535
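Back to the XML files themselves: a minimal core-site.xml sketch for an HA setup (the nameservice name ns1 is an assumption; it must match dfs.nameservices in hdfs-site.xml):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ns1</value>
</property>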

3. Distribute the packaged code to all the servers, including namenodes, resourcemanagers, nodemanagers and datanodes, for example with rsync as sketched below.
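A sketch; the host list file and paths are assumptions:

for host in $(cat all_hosts.txt); do
  rsync -az hadoop-2.4.0/ $host:/usr/local/hadoop-2.4.0/
done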
4. Start the journalnodes first (assuming you use qjournal):

hadoop-daemon.sh start journalnode

5. Format the namenode for a specific namespace:

hdfs namenode -format
If you use federation, make sure the cluster IDs are the same: if the first nameservice's cluster ID is abcdefg, the second nameservice should be formatted with that cluster ID:

hdfs namenode -format -clusterId abcdefg

6. Initialize the standby namenode for the same namespace:

hdfs namenode -bootstrapStandby

7. Start the namenodes and datanodes:

hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode

8. Transition a namenode to active.

For example, if the nameservice is ns and the namenode to activate is nn1:
hdfs haadmin -ns ns -transitionToActive nn1
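You can verify the result afterwards (a hedged example; the same -ns flag picks the nameservice):

hdfs haadmin -ns ns -getServiceState nn1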

9. Create HDFS home directories for the hadoop user and the mapred user (the user that starts the resourcemanager and the history server); see the sketch below.
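A sketch, assuming home directories live under /user:

hdfs dfs -mkdir -p /user/hadoop /user/mapred
hdfs dfs -chown hadoop:hadoop /user/hadoop
hdfs dfs -chown mapred:mapred /user/mapred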
10. Create the directories for the history server. For example, if mapred-site.xml sets the history directories like this:

<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>hdfs://hbasens/hadoop/history/tmp</value>
  <final>true</final>
</property>
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>hdfs://hbasens/hadoop/history/done</value>
  <final>true</final>
</property>

then you have to create the directories like this:

hdfs dfs -mkdir -p /hadoop/history/tmp
hdfs dfs -chown -R mapred:mapred /hadoop/history
hdfs dfs -chmod -R 1777 /hadoop/history/tmp
hdfs dfs -mkdir -p /hadoop/history/done
hdfs dfs -chmod -R 1777 /hadoop/history/done

11. Start the resourcemanager, the nodemanagers and the MR job history server:

yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
mr-jobhistory-daemon.sh start historyserver
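To confirm the daemons are up and the workers have registered, two standard checks:

hdfs dfsadmin -report
yarn node -list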

Hadoop-pipes Native Compilation Errors

Recently, compiling the Hadoop native code failed after a kernel version upgrade; the failing module was hadoop-pipes. The compile command was mvn clean package -Pnative -DskipTests. The build produced a large number of errors like the following:

[exec] /usr/bin/c++ -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fPIC CMakeFiles/pipes-sort.dir/main/native/examples/impl/sort.cc.o -o examples/pipes-sort -rdynamic libhadooppipes.a libhadooputils.a -lssl -lpthread
[exec] /usr/bin/cmake: /usr/lib64/libcrypto.so.10: no version information available (required by /usr/lib64/libssl.so.10)
[exec] /usr/bin/cmake: /usr/lib64/libcrypto.so.10: no version information available (required by /usr/lib64/libssl.so.10)
[exec] /usr/bin/cmake: /usr/lib64/libcrypto.so.10: no version information available (required by /usr/lib64/libssl.so.10)
[exec] /usr/bin/cmake: /usr/lib64/libcrypto.so.10: no version information available (required by /usr/lib64/libssl.so.10)
[exec] make[2]: Leaving directory `/data0/jiangyu2/compile/201504/hadoop-tools/hadoop-pipes/target/native'
[exec] make[1]: Leaving directory `/data0/jiangyu2/compile/201504/hadoop-tools/hadoop-pipes/target/native'
[exec] /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../lib64/libssl.so: undefined reference to `OPENSSL_init_library@libcrypto.so.10'

It turned out libcrypto was missing from the link line. To fix it, modify $HADOOP_PIPES/src/CMakeLists.txt and add one line as shown below:

target_link_libraries(hadooppipes
${OPENSSL_LIBRARIES}
pthread
+ crypto
)

Hadoop Classpath Problem

Recently, while using Druid, a system from the Hadoop ecosystem, I ran into a problem. The HDFS storage location in Druid was configured as follows:

# Deep storage (local filesystem for examples - don't use this in production)
druid.storage.type=hdfs
druid.storage.storageDirectory=hdfs://ns****/druid/localStorage

The startup script was:

java -Xmx256m -Ddruid.realtime.specFile=examples/wikipedia/wikipedia_realtime_kafka.spec -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath config/_common:config/realtime:lib/* io.druid.cli.Main server realtime

Most HA-enabled Hadoop clusters are accessed with exactly this kind of configuration.

After starting Druid, I found it was not writing to the cluster I intended but to a different one. I searched back and forth, suspecting a wrong hdfs-site.xml had been bundled during packaging, but found nothing. Finally, with a colleague's help, I discovered that we had once configured a VIP domain name: ns**** pointed to the gateway machine of some Hadoop cluster. Looking at the problem again, the cause was not a wrongly bundled hdfs-site.xml; the Hadoop conf directory was simply missing from the classpath, so ns**** was resolved through DNS instead of through the HA nameservice configuration.

I changed the startup script to:

java -Xmx256m -Ddruid.realtime.specFile=examples/wikipedia/wikipedia_realtime_kafka.spec -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath config/_common:config/realtime:$HADOOP_HOME/etc/hadoop:lib/* io.druid.cli.Main server realtime

Everything worked after restarting. Lesson: for systems around Hadoop, always put the Hadoop conf directory on the CLASSPATH.
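As an alternative to hard-coding the conf path, the hadoop classpath command prints the full Hadoop classpath (conf directory included), so a sketch like this also works:

java -Xmx256m -Ddruid.realtime.specFile=examples/wikipedia/wikipedia_realtime_kafka.spec -Duser.timezone=UTC -Dfile.encoding=UTF-8 -classpath config/_common:config/realtime:$(hadoop classpath):lib/* io.druid.cli.Main server realtime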

JobTracker Hang Caused by Port Scanning

Recently, the JobTracker used by our HBase cluster (Hadoop baseline version 1.0.2) hung every night and stopped serving: it could not respond to any TaskTracker heartbeat and therefore could not schedule tasks. Below is the heartbeat error:

(screenshot of the heartbeat error stack trace omitted)

As the trace shows, the failure happens in the FairScheduler task-scheduling phase: a pool name is null, which makes the final sort throw an NPE.

Tracing where this null came from, I found through jira that the community had already hit the problem: https://issues.apache.org/jira/browse/MAPREDUCE-4195. Briefly: the first step is that someone calls the jobqueue_details.jsp page through a script or tool (not through a page link), and the error starts in the following code:

  String queueName = request.getParameter("queueName");
  TaskScheduler scheduler = tracker.getTaskScheduler();
  Collection<JobInProgress> jobs = scheduler.getJobs(queueName);

The queueName passed in is null, and it stays null when FairScheduler's getJobs is called. FairScheduler then calls PoolManager's getPool method:

  public synchronized Pool getPool(String name) {
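    // name can be null here, so a pool with a null name is created and cached below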
    Pool pool = pools.get(name);
    if (pool == null) {
      pool = new Pool(scheduler, name);
      pool.setSchedulingMode(defaultSchedulingMode);
      pools.put(name, pool);
    }
    return pool;
  }

Through this method, a pool whose name is null is created. From then on, as described above, the scheduler hangs and the JT cannot schedule anything. Solution: see the jira; a sketch of the idea follows.
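A minimal sketch of the kind of guard that prevents this (the fallback to the default pool is an assumption for illustration; see the jira for the actual patch):

  public synchronized Pool getPool(String name) {
    if (name == null) {
      name = "default"; // assumption: treat a missing queue name as the default pool
    }
    Pool pool = pools.get(name);
    if (pool == null) {
      pool = new Pool(scheduler, name);
      pool.setSchedulingMode(defaultSchedulingMode);
      pools.put(name, pool);
    }
    return pool;
  }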

Finally, a word on how we triggered this: Sina's internal network had been compromised before, so the security team keeps two servers continuously scanning internal server ports, and that scanning is what triggered this hadoop bug.