How to Avoid Druid Filling Up the /tmp Directory

Recently I noticed that after starting some of the realtime nodes, Druid easily fills the /tmp directory. The file names all start with filePeon.
After investigating Druid's code and configuration, I found that Druid writes its index files under druid.indexer.task.baseDir, whose default value is System.getProperty("java.io.tmpdir").
So we can point java.io.tmpdir at another directory when starting the realtime node, as below:

java -Djava.io.tmpdir=/data0/druid/tmp -Xmx10g -Xms10g -XX:NewSize=2g -XX:MaxNewSize=2g -XX:MaxDirectMemorySize=25g -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Ddruid.realtime.specFile=config/realtime/wb_ad_interest_druid.spec -classpath config/_common:config/realtime:/usr/local/hadoop-2.4.0/etc/hadoop:lib/* io.druid.cli.Main server realtime
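
If you would rather not touch the JVM flags, the same effect should be achievable by setting the base directory in Druid's configuration instead. A minimal sketch, assuming the common properties file lives at config/_common/common.runtime.properties as suggested by the classpath above:

# assumed file: config/_common/common.runtime.properties
# point peon/index scratch files away from /tmp
druid.indexer.task.baseDir=/data0/druid/tmp

Either way, make sure the target directory sits on a disk with enough space for the peon index files.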

ResourceManager Dispatcher Handler Slow Because of Synchronized RMStateStore Methods

Recently I noticed that the async dispatcher in our ResourceManager sometimes backs up with pending events.
Here are some logs:

2016-05-24 00:46:20,398 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 24000
2016-05-24 00:46:21,008 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 25000
2016-05-24 00:46:21,632 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 26000
2016-05-24 00:46:22,251 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 27000
2016-05-24 00:46:22,873 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 28000
2016-05-24 00:46:23,501 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 29000
2016-05-24 00:46:24,109 INFO org.apache.hadoop.yarn.event.AsyncDispatcher: Size of event-queue is 30000  

Normally the async dispatcher in the ResourceManager handles events quickly enough, but the log shows the backlog was serious.
So we investigated the problem and ran jstack against the ResourceManager process while events were pending. Here is the jstack output:

"AsyncDispatcher event handler" prio=10 tid=0x00007f4d6db10000 nid=0x5bca waiting for monitor entry [0x00007f4d3aa8c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplication(RMStateStore.java:375)
        - waiting to lock <0x00000003bae88af0> (a org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppNewlySavingTransition.transition(RMAppImpl.java:881)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl$RMAppNewlySavingTransition.transition(RMAppImpl.java:872)
        at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        - locked <0x0000000394cbae40> (a org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:645)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:82)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:690)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationEventDispatcher.handle(ResourceManager.java:674)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:662)

"AsyncDispatcher event handler" daemon prio=10 tid=0x00007f4d6d8f6000 nid=0x5c32 in Object.wait() [0x00007f4d3a183000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:2031)
        - locked <0x000000032bc7bd58> (a java.util.LinkedList)
        at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:2015)
        at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2113)
        - locked <0x000000032bc7ba80> (a org.apache.hadoop.hdfs.DFSOutputStream)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:70)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:103)
        at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.writeFile(FileSystemRMStateStore.java:528)
        at org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.storeApplicationStateInternal(FileSystemRMStateStore.java:329)
        - locked <0x00000003bae88af0> (a org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore)
        at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.handleStoreEvent(RMStateStore.java:625)
        at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:770)
        at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler.handle(RMStateStore.java:765)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
        at java.lang.Thread.run(Thread.java:662)

It seems the async dispatcher in the ResourceManager was blocked in the storeNewApplication method of RMStateStore.
From the code we know there are two async dispatchers in the ResourceManager process. One is the main dispatcher for the whole ResourceManager, which deals with application submission, scheduling and other work. The other is the dispatcher inside the RMStateStore, whose role is explained in this blog. Because the RMStateStore backs up state to HDFS or ZooKeeper, processing is slow, so it has its own dispatcher to avoid stalling the main dispatcher of the ResourceManager.
Unfortunately we use HDFS as our RMStateStore backend. Deep inside the code:

  public synchronized void storeNewApplication(RMApp app) {
    ApplicationSubmissionContext context = app
                                            .getApplicationSubmissionContext();
    assert context instanceof ApplicationSubmissionContextPBImpl;
    ApplicationState appState =
        new ApplicationState(app.getSubmitTime(), app.getStartTime(), context,
          app.getUser());
    dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState));
  }

The main dispatcher handles the event to store the new application's context information. The method only passes the event on to the RMStateStore dispatcher and returns immediately, yet it is synchronized. And in the RMStateStore subclass FileSystemRMStateStore, the code that stores the application to HDFS is as follows:

  @Override
  public synchronized void storeApplicationStateInternal(ApplicationId appId,
      ApplicationStateData appStateDataPB) throws Exception {
    String appIdStr = appId.toString();
    Path appDirPath = getAppDir(rmAppRoot, appIdStr);
    fs.mkdirs(appDirPath);
    Path nodeCreatePath = getNodePath(appDirPath, appIdStr);

    LOG.info("Storing info for app: " + appId + " at: " + nodeCreatePath);
    byte[] appStateData = appStateDataPB.getProto().toByteArray();
    try {
      // currently throw all exceptions. May need to respond differently for HA
      // based on whether we have lost the right to write to FS
      writeFile(nodeCreatePath, appStateData);
    } catch (Exception e) {
      LOG.info("Error storing info for app: " + appId, e);
      throw e;
    }
  }

This method is also synchronized, on the same store object. So while the RMStateStore dispatcher holds the lock during a slow HDFS write, the main dispatcher blocks waiting for it. The lock is meaningless for the main dispatcher, so we can simply get rid of it.
There is a related JIRA for this problem: YARN-4398.
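
The essence of the fix is simply to drop the lock on the fast path. A sketch of the idea (not the exact upstream patch; see YARN-4398 for the real change):

  // Before: synchronized, so the main dispatcher must wait on the
  // FileSystemRMStateStore monitor even though it only enqueues an event.
  public synchronized void storeNewApplication(RMApp app) { ... }

  // After (sketch): the method just forwards the event to the store's own
  // dispatcher, so it no longer contends with the slow HDFS write that
  // storeApplicationStateInternal performs while holding the same lock.
  public void storeNewApplication(RMApp app) {
    ApplicationSubmissionContext context = app
                                            .getApplicationSubmissionContext();
    assert context instanceof ApplicationSubmissionContextPBImpl;
    ApplicationState appState =
        new ApplicationState(app.getSubmitTime(), app.getStartTime(), context,
          app.getUser());
    dispatcher.getEventHandler().handle(new RMStateStoreAppEvent(appState));
  }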

Druid Architecture Overview (Part 2)

1. Introduction

This post continues the previous one, Druid Architecture Overview, which covered Druid's basic architecture and usage. This installment introduces the Realtime Index Service, which replaces the realtime nodes described earlier for real-time ingestion, index building, hand-off and related tasks.
First, some differences between the realtime node and the index service:
[comparison table: realtime node vs. index service]
As the comparison shows, once a Druid cluster grows large, using the Realtime Index Service becomes a necessity.

2. Architecture and Workflow

Compared with the architecture described in the previous post, a Druid deployment using the Realtime Index Service adds several components. The current architecture is shown below:
[diagram: Druid push-mode architecture]
The previous post covered Druid's pull mode: each Realtime Node pulls data from Kafka or similar sources, builds indexes, and hands segments off to the Historical Nodes. As the business and cluster grew, managing the Realtime Nodes became very tedious, so Druid developed a push mode to solve this. This is a problem most distributed systems eventually have to solve: making deployment and operations simple and automated.
This post focuses on the push mode, which adds several roles: the Overlord Node, the MiddleManager Node, the Peon, and Tranquility on the client side. Each module's function and the overall workflow are described below.

(1) Roles

1. Tranquility
The client-side sender: users push data into Druid in real time through Tranquility. It talks to ZooKeeper, interacts with the Overlord, and forwards valid data to the Peons according to their timestamps.
2. Overlord
Assigns tasks to the Middle Managers, similar to the ResourceManager in YARN.
3. Middle Manager
Launches Peons for the tasks it receives and monitors the Peons' state after launch, similar to the NodeManager.
4. Peon
The Peon takes over most of the Realtime Node's functions; it is started by the Middle Manager as a standalone process.

(2) Workflow

1. The user's spec file is defined in Tranquility. Tranquility initializes from the spec, obtains the Overlord's address from ZooKeeper, and communicates with the Overlord.
2. When the Overlord receives a new ingestion task, it looks up node information in ZooKeeper, picks a Middle Manager node to launch a Peon, and writes the assignment into ZooKeeper.
3. The Middle Manager watches ZooKeeper continuously; when it finds a new task assigned to it, it starts a Peon process and monitors the Peon's state.
4. The Peon's workflow is basically the same as the Realtime Node's, except that the Peon receives data through an HTTP interface, while the Realtime Node mostly relies on internal threads that keep pulling data from Kafka.
5. Tranquility then obtains the Peon's host and port from ZooKeeper and streams data to the Peon continuously.
6. Following the rules in the spec, the Peon builds an index periodically (or once enough data accumulates) and hands it off to deep storage (HDFS).
7. The Coordinator then uses the Peon's information in ZooKeeper to write the segment metadata into the SQL metadata store and assigns a Historical Node to fetch the index data from deep storage.
8. The Historical Node pulls the index data from deep storage to local disk and rebuilds the index in memory; at that point the data flow is complete.

3. Summary

With the Realtime Index Service's push mode, deploying, operating and managing Druid becomes simpler and more user-friendly. Future posts will dig into the Druid code.

Rendering a Tree Structure with Recursion in JavaScript

var treeData = {
        name: 'root',
        children: [{
            name: 'child1',
            children: [{
                name: 'child1_1',
                children: [{
                    name: 'child1_1_1'
                }]
            }]
        }, {
            name: 'child2',
            children: [{
                name: 'child2_1'
            }]
        }, {
            name: 'child3'
        }]
    };
    var strArr = [treeData.name];
    // Render the tree recursively. The key is abstracting the recursion
    // parameters: node (the current node), rootOrder (the hierarchical
    // numbering so far) and fn (renders each node).
    function goThroughTree(node, rootOrder, fn) {
        var children = node.children || [];
        for (var i = 0; i < children.length; i++) {
            var item = children[i];
            var index = i + 1;
            var order = rootOrder ? rootOrder + '.' + index : index;
            fn(item, order);
            goThroughTree(item, order, fn);
        }
    }
    goThroughTree(treeData, 0, function (item, order) {
        strArr.push('<div>');
        strArr.push(order);
        strArr.push(item.name);
        strArr.push('</div>');
    });
    document.write(strArr.join(''));

Output:
root
1child1
1.1child1_1
1.1.1child1_1_1
2child2
2.1child2_1
3child3

A NIC Configuration Scare

This is a cluster problem I ran into recently, a software-level symptom caused mainly by hardware configuration. We first noticed it because some machines in our YARN-based real-time streaming system had very low TPS. We did not suspect hardware at first, so we spent a long time debugging the streaming framework itself. Only when we found that even the ping latency was abnormally high did we think of an OS-level configuration problem.
Here is the ping latency on an affected machine:

64 bytes from 10.39.****: icmp_seq=34 ttl=63 time=27.4 ms
64 bytes from 10.39.****: icmp_seq=35 ttl=63 time=21.2 ms
64 bytes from 10.39.****: icmp_seq=36 ttl=63 time=5.95 ms
64 bytes from 10.39.****: icmp_seq=37 ttl=63 time=16.5 ms
64 bytes from 10.39.****: icmp_seq=38 ttl=63 time=12.3 ms

And here is normal ping latency:

64 bytes from 10.39.****: icmp_seq=13 ttl=63 time=0.230 ms
64 bytes from 10.39.****: icmp_seq=14 ttl=63 time=0.203 ms
64 bytes from 10.39.****: icmp_seq=15 ttl=63 time=0.255 ms
64 bytes from 10.39.****: icmp_seq=16 ttl=63 time=0.229 ms

We suspected the NIC. Our servers bond two gigabit NICs, and /proc/interrupts showed that each NIC's interrupts were bound to a single core; that core was constantly saturated and dropping packets. A proper NIC setup uses multiple queues, with different queues pinned to different cores to improve TPS. For background, see 网卡多队列简介 (an introduction to NIC multi-queue).
Because the real-time system mostly sends small packets, the impact was this severe. The fix is painful: after configuring multi-queue, the server has to be rebooted.
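
To check for the symptom and enable multi-queue, something like the following works (a sketch; the interface name eth0, the IRQ number and the queue count are placeholders for your own hardware):

# See how the NIC's interrupts are spread across cores
grep eth0 /proc/interrupts
# Show how many RX/TX queues the hardware supports and how many are enabled
ethtool -l eth0
# Enable more combined queues (count is hardware-dependent; 8 is an example)
ethtool -L eth0 combined 8
# Or pin a single IRQ (e.g. IRQ 98) to core 1 by writing its CPU affinity mask
echo 2 > /proc/irq/98/smp_affinity

In our case the multi-queue configuration only took effect after rebooting the server.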

Some Notes About Deploying Hadoop Clusters

1. Download the source code and compile and package it with Maven; remember to build the native libraries on the same OS version.

mvn package -Pdist -Pnative -Dtar -DskipTests

2. Edit the core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml files.
You should also add JVM parameters and log locations in /etc/bashrc; for example:

export JAVA_HOME=/usr/local/jdk1.7.0_67
export JRE_HOME=/usr/local/jdk1.7.0_67/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export HADOOP_HOME=/usr/local/hadoop-2.4.0
export HADOOP_LOG_DIR=/data0/hadoop/log/hadoop
export HADOOP_PID_DIR=/data0/hadoop/pid/hadoop
export YARN_LOG_DIR=/data0/hadoop/log/yarn
export YARN_PID_DIR=/data0/hadoop/pid/yarn
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_NAMENODE_OPTS=" -Xmx20480m -Xms20480m -Xmn3072m -verbose:gc -Xloggc:/data0/hadoop/gclog/namenode.gc.log -XX:ErrorFile=/data0/hadoop/gclog/hs_err_pid.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=85 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=1000 -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -Dcom.sun.management.jmxremote.port=6000 -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.password.file=/usr/local/hadoop-2.4.0/etc/hadoop/jmxremote.password"
export YARN_RESOURCEMANAGER_OPTS=" -Xmx10240m -Xms10240m -Xmn3072m -verbose:gc -Xloggc:/data0/hadoop/gclog/yarn.gc.log -XX:ErrorFile=/data0/hadoop/gclog/hs_err_pid.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:CMSMaxAbortablePrecleanTime=1000 -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -Dcom.sun.management.jmxremote.port=6001 -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.password.file=/usr/local/hadoop-2.4.0/etc/hadoop/jmxremote.password"
ulimit -u 65535

3. Distribute the packaged build to all servers: namenodes, resourcemanagers, nodemanagers and datanodes.
4. Start the journalnodes first (assuming you use QJM):

hadoop-daemon.sh start journalnode

5. Format the namenode for each namespace:

hdfs namenode -format
If you use federation, make sure the cluster IDs match: if the first nameservice's cluster ID is abcdefg, format the second one with that same cluster ID:

hdfs namenode -format -clusterId abcdefg

6. Initialize the standby namenode for the same namespace:

hdfs namenode -bootstrapStandby

7. Start the namenodes and datanodes:

hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode

8. Transition one namenode to active. For example, if the nameservice is ns and the active namenode is nn1:

hdfs haadmin -ns ns -transitionToActive nn1

9. Create HDFS directories for the hadoop user and the mapred user (the user that starts the resourcemanager and the history server); see the sketch below.
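
A sketch, assuming home directories live under /user and that you run this as the HDFS superuser:

hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -chown hadoop:hadoop /user/hadoop
hdfs dfs -mkdir -p /user/mapred
hdfs dfs -chown mapred:mapred /user/mapred
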
10. Create the directories for the history server. For example, if mapred-site.xml configures the history directories as follows:

<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>hdfs://hbasens/hadoop/history/tmp</value>
  <final>true</final>
</property>
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>hdfs://hbasens/hadoop/history/done</value>
  <final>true</final>
</property>

then you have to create the directories and set their permissions like this:

hdfs dfs -mkdir -p /hadoop/history/tmp
hdfs dfs -chown -R mapred:mapred /hadoop/history
hdfs dfs -chmod -R 1777 /hadoop/history/tmp
hdfs dfs -mkdir -p /hadoop/history/done
hdfs dfs -chmod -R 1777 /hadoop/history/done

11. Start the resourcemanager, the nodemanagers and the MR history server:

yarn-daemon.sh start resourcemanager
yarn-daemon.sh start nodemanager
mr-jobhistory-daemon.sh start historyserver
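
Finally, a quick sanity check that everything came up (a sketch; ns and nn1 follow the naming used in step 8):

hdfs dfsadmin -report
hdfs haadmin -ns ns -getServiceState nn1
yarn node -list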

Setting Up a Decoupled Front-end/Back-end Development Environment with Grunt

With the rise of OPOA (one page, one application), separating front-end and back-end development has become the trend in web development, and it is important to have an environment that supports fast parallel development and independent integration debugging. Thanks to front-end build tools like Grunt and Gulp, this has become possible. Below is a Grunt setup that achieves the following goals and that front-end engineers can copy and use as-is:

  • 1 Run a local server
  • 2 Watch js/html/css files and reload automatically on change (without browser plugins)
  • 3 Proxy requests to the back-end API so the front-end files under src can be debugged against it.

Assume our site exposes the back-end API under the apis path, all front-end files live under src, and index.html is the entry file. The package.json and Gruntfile are as follows:

{
  "name": "webtools",
  "version": "0.0.1",
  "description": "grunt tools for web development",
  "main": "Gruntfile.js",
  "repository": {
    "type": "git",
    "url": ""
  },
  "author": "zhangmeng<zhangmeng712@126.com>",
  "license": "ISC",
  "devDependencies": {
    "grunt": "^0.4.5",
    "grunt-contrib-connect": "^1.0.0",
    "grunt-contrib-watch": "^0.6.1",
    "grunt-connect-proxy": "",
    "connect-livereload": "^0.5.4",
    "serve-static": "^1.10.2"
  }
}

var HOSTNAME = '0.0.0.0';
var LIVERELOAD_PORT = 9991;
var SERVER_PORT = 9966;
var serveStatic = require('serve-static');
// If the request headers must carry auth info such as a cookie, configure it here yourself
var cookie = 'xxxxxxx';
module.exports = function (grunt) {
    require('load-grunt-tasks')(grunt);
    grunt.initConfig({
        connect: {
            options: {
                port: SERVER_PORT,
                hostname: HOSTNAME
            },
            proxies: [{
                // The path exposed by the grunt server; with this config it is
                // http://127.0.0.1:9966/apis/
                context: '/apis',
                host: 'dev.alp.xx-inc.com',
                port: 80, // remote server port
                headers: {
                    'cookie': cookie,
                    'host': 'dev.alp.xx-inc.com'
                },
                rewrite: {
                    // Path mapping: regex-replace the part starting at context;
                    // not needed if the remote path is the same as context.
                    '^/apis/': '/'
                }
            }],
            livereload: {
                options: {
                    open: {
                        target: 'http://' + HOSTNAME + ':' + SERVER_PORT + '/index.html?debug&mock&local'
                    },
                    // Reload the page through the LiveReload script
                    middleware: function (connect, options) {
                        var proxyMid = require('grunt-connect-proxy/lib/utils').proxyRequest;
                        var livereloadMid = require('connect-livereload')({port: LIVERELOAD_PORT});
                        var serveMid = serveStatic(__dirname + '/');
                        var midArr;
                        // Decide between back-end integration (liantiao) and mock development
                        var isLiantiao = grunt.option('liantiao');
                        if (isLiantiao) {
                            midArr = [livereloadMid, proxyMid, serveMid];
                        } else {
                            midArr = [livereloadMid, serveMid];
                        }
                        return midArr;
                    }
                }
            }
        },
        watch: {
            doc: {
                files: ['src/**/*.js', 'src/**/*.html', 'src/**/*.css'],
                tasks: []
            },
            options: {
                livereload: LIVERELOAD_PORT,
                spawn: true
            }
        }
    });
    grunt.registerTask('server', ['configureProxies:server', 'connect:livereload', 'watch']);
};
    grunt server:livereload #mock debugging
    grunt server:livereload --liantiao #front-end/back-end integration debugging

Notes

  • load-grunt-tasks loads every grunt-* dependency listed in package.json, replacing repeated grunt.loadNpmTasks('grunt-xxx') calls
  • var serveStatic = require('serve-static'); older connect versions used connect.static, which has since been replaced by the serve-static middleware
  • If the back-end is a plain open API, the proxies configuration above is enough. If the system needs cookie information (e.g. produced by an SSL-certificate login), visit the site first, grab the corresponding headers, and copy them into Gruntfile.js manually; or fetch them from the command line with curl -i -L 'http://dev.alp.xxxx-inc.com/' -k --cert cer.pem and copy the header values over.

Front-end Automated Testing (Part 3): Angular and Protractor

Protractor is a front-end UI automation testing tool, purpose-built for Angular applications.

Features

  • End-to-end (e2e) testing
  • Uses Jasmine as the test framework
  • Built on WebDriverJS (selenium-webdriver)
  • Adds Angular-specific locators, which are convenient and practical
  • Waits automatically, so you can drop sleep-based waits; async code reads like sync
  • Supports debugging test code
  • Supports parallel UI testing across multiple browsers

Usage

Setup

  • 1 Install Protractor: npm install -g protractor
  • 2 Install the Selenium standalone server: webdriver-manager update
  • 3 Start the Selenium server: webdriver-manager start

Writing specs

spec.js is the file that holds the test cases. Protractor uses Jasmine as its default test framework. As the simplest possible example, a spec file can be written as below: describe defines a test "block", it defines a single case, and expect makes the assertion; the global variable browser drives the browser.

describe('Protractor Demo App', function() {
    it('should have a title', function() {
        browser.get('http://juliemr.github.io/protractor-demo/');
        expect(browser.getTitle()).toEqual('Super Calculator');
    });
});

Running

  • Configure conf.js
    Before testing we need to create a conf.js file where the test run is configured, for example:
    • multiCapabilities: which browsers to test with
    • chromeOptions: Chrome launch options (which extensions to load, etc.)
    • framework: which test framework to use: cucumber, mocha or jasmine
    • specs: which spec files to run
      For the full list of options, see the reference configuration
exports.config = {
   // directConnect: true,

    // Capabilities to be passed to the webdriver instance.
    //capabilities: {
    //    'browserName': 'chrome'
    //},
    multiCapabilities: [ {
        'browserName': 'chrome',
        //'chromeOptions': {
        //    'args': ['--load-extension=/opt/local/share/nginx/html/radar/tanxtag'],
        //}
    }],
    // Framework to use. Jasmine is recommended.
    framework: 'jasmine',

    // Spec patterns are relative to the current working directory when
    // protractor is called.
    specs: ['basic/demo_spec.js','basic/angular_spec.js'],

    // Options to be passed to Jasmine.
    jasmineNodeOpts: {
        defaultTimeoutInterval: 30000
    }
};
  • Run the tests
protractor conf.js

Running tests with Grunt

Sometimes we want to configure the test tasks through Grunt. The code below uses the grunt-concurrent module to run the tests in multiple browsers in parallel (the same can also be achieved with multiCapabilities in conf.js):

module.exports = grunt => {
    //This module will read the dependencies/devDependencies/peerDependencies/optionalDependencies in your package.json
    // and load grunt tasks that match the provided patterns.
    require('load-grunt-tasks')(grunt);
    grunt.initConfig({
        concurrent: {
            protractor_test: ['protractor-chrome', 'protractor-firefox', 'protractor-safari']
        },
        protractor: {
            options: {
                keepAlive: true,
                singleRun: false,
                configFile: "conf.js"
            },
            run_chrome: {
                options: {
                    args: {
                        browser: "chrome"
                    }
                }
            },
            run_firefox: {
                options: {
                    args: {
                        browser: "firefox"
                    }
                }
            },
            run_safari: {
                options: {
                    args: {
                        browser: "safari"
                    }
                }
            }
        }
    });

    grunt.registerTask('protractor-chrome', ['protractor:run_chrome']);
    grunt.registerTask('protractor-firefox', ['protractor:run_firefox']);
    grunt.registerTask('protractor-safari', ['protractor:run_safari']);
    grunt.registerTask('protractor-e2e', ['concurrent:protractor_test']);
};
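
With the tasks registered above, the parallel multi-browser run is started with:

grunt protractor-e2e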

Debugging tests

Besides its convenient runner, Protractor also provides a handy way to debug. Driving the browser with raw selenium-webdriver is very hard to debug; Protractor helps here:
add browser.pause(); to your code, then type "repl" in the terminal, and you can debug with WebDriver commands:

wd-debug> repl
> element
function (locator) {
    return new ElementArrayFinder(ptor).all(locator).toElementFinder_();
  }
> 

Testing non-Angular applications

Protractor has built-in support for Angular apps; for example, it automatically waits for the Angular page to finish loading before running the test. When testing a non-Angular app you need to:
– 1 Use browser.driver instead of driver
– 2 Set browser.ignoreSynchronization = true
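
A minimal sketch of a non-Angular test (the URL and the selector are placeholders):

it('works on a non-Angular page', function () {
    browser.ignoreSynchronization = true;        // do not wait for Angular
    browser.driver.get('http://example.com/');   // use the raw WebDriver instance
    browser.driver.findElement(by.css('h1')).getText().then(function (text) {
        expect(text).toBeTruthy();
    });
});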
Reference

Protractor in detail

Protractor exposes a few major groups of test APIs; see the protractor API docs for details:
– browser: browser operations
– element & by: locating and fetching page elements
– ExpectedConditions: predicates about page state, usually used together with wait
– webdriver: native Selenium functions
– promise: Selenium's built-in promise utilities

Browser operations: browser

Commonly used operations:

  • browser.get: load a URL
  • browser.findElement: locate an element on the page
  • browser.switchTo().frame(): switch into an iframe
  • browser.executeScript: run synchronous JavaScript in the page
  • browser.executeAsyncScript: run asynchronous JavaScript in the page
  • browser.wait: wait until a condition holds
  • browser.sleep: pause for a fixed amount of time

Locators: by & element

Multiple locator strategies are supported:
– by.css()
– by.id()
– by.xpath()
– by.name()
– by.tagName()
– by.model(): Angular-specific
– by.binding(): Angular-specific
– by.repeater(): Angular-specific

Fetch elements through element: element(by.id('frameId')) or element.all(by.css('some-css'));
in non-Angular applications use browser.driver.findElement(by.id('frameId')).

ExpectedConditions

Predefined conditions for wait; the commonly used ones are:
– elementToBeClickable: the element can be clicked
– presenceOf: the element is present in the DOM
– titleContains: the title contains a given string
– visibilityOf: the element is visible

    var EC = protractor.ExpectedConditions;
    var button = $('#xyz');
    var isClickable = EC.elementToBeClickable(button);

    browser.get(URL);
    browser.wait(isClickable, 5000); //wait for an element to become clickable
    button.click();

A complete example

it('test login error', function () {
    _driver.get('http://subway.simba.taobao.com/#!/login');
    _driver.wait(protractor.until.elementLocated(by.css('.login-ifr')), 1000).then(function (elem) {
        _driver.switchTo().frame(elem);
        _driver.findElement(by.name('TPL_username')).sendKeys('zhangmeng1986712');
        _driver.findElement(by.name('TPL_password')).sendKeys('xxxxx');
        _driver.findElement(by.id('J_SubmitStatic')).click();
        _driver.sleep(1000);
        browser.driver.findElement(by.css('.error')).then(function (elem) {
            return elem.getInnerHtml().then(function (text) {
                expect(text).toMatch('密码和账户名不匹配');
            });
        });
    });
});

page object pattern

The page object pattern should be familiar to everyone; organized properly, it makes test code much easier to maintain. For example:

//A page class wrapping the operations on the input
var AngularHomepage = function() {
  var nameInput = element(by.model('yourName'));
  var greeting = element(by.binding('yourName'));

  this.get = function() {
    browser.get('http://www.angularjs.org');
  };

  this.setName = function(name) {
    nameInput.sendKeys(name);
  };

  this.getGreeting = function() {
    return greeting.getText();
  };
};
//Test code
describe('angularjs homepage', function() {
  it('should greet the named user', function() {
    var angularHomepage = new AngularHomepage();
    angularHomepage.get();
    angularHomepage.setName('Julie');
    expect(angularHomepage.getGreeting()).toEqual('Hello Julie!');
  });
});

Testing on mobile

See the reference for details.
That example uses Appium as the server side; because selenium-webdriver cannot connect to Appium directly, wd-bridge is used as a workaround.

Design guidelines for e2e tests

Reference

Sample code

The sample code for this post is on GitHub.