NodeManager going down because YARN container cleanup kills other processes


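During my recent upgrade, I often found NodeManagers going down for no apparent reason. There was nothing in the logs before they went down, and dmesg showed nothing abnormal either. A failure like this is very hard to track down. After some investigation by a colleague, we finally determined that it was a bug triggered by YARN container cleanup.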


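1. First it sends SIGTERM to the pid, so that the container can exit gracefully.
2. Then, 250 ms later, it sends SIGKILL to the pid, i.e. a plain kill -9.
3. This is where things can go wrong: if the container has already exited within those 250 ms and its pid has been reassigned to some other thread, the SIGKILL hits that newly started thread. If the thread was started by the same user, the whole process it belongs to may be killed.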
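This only happens under certain conditions, though. With the Linux Container Executor, if containers are launched as different users, the kill on the reused pid does not take effect, so the other process survives. With the Default Container Executor, the problem does occur.
So we need to patch the code. The fix is simple: before the kill -9, run ps on the pid and check whether it still belongs to the containerId that was launched earlier. The actual code is on GitHub.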
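
The check described above can be sketched in shell. This is only an illustration and not the actual patch: the function names `pid_belongs_to_container` and `kill9_if_same_container` are hypothetical, it reads `/proc/<pid>/cmdline` (the same information ps reports on Linux), and it assumes that the container ID shows up somewhere in the process's command line.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the command line of PID still
# mentions CONTAINER_ID. Assumes Linux (/proc) and that the container ID
# appears somewhere in the process's arguments.
pid_belongs_to_container() {
    pid="$1"
    container_id="$2"
    # The process may already be gone.
    [ -r "/proc/$pid/cmdline" ] || return 1
    # cmdline is NUL-separated; turn it into one space-separated line.
    tr '\0' ' ' < "/proc/$pid/cmdline" | grep -qF -- "$container_id"
}

# Hypothetical wrapper: send SIGKILL only when the pid still looks like ours.
kill9_if_same_container() {
    if pid_belongs_to_container "$1" "$2"; then
        kill -9 "$1"
    else
        echo "pid $1 is not $2 any more, skipping kill -9" >&2
        return 1
    fi
}
```

Note that a small window remains between the check and the kill, so this narrows the race rather than removing it completely.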

Using ssmtp to send Gmail on a Linux server

Sometimes we want to send email from a Linux server to alert us about some event. We could fake the sender's address and just send it, but unfortunately most mail servers will treat such messages as spam, which is not very convenient. So instead we want to send mail through Gmail with our own username and password. Here are the steps to configure and use ssmtp to do that.

My Linux distribution is CentOS. Ubuntu works too, just use apt instead of yum.

1. # yum install ssmtp
2. # vi /etc/ssmtp/ssmtp.conf     (edit the configuration)
Here are the settings you should add.
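
As a minimal sketch of what such settings might look like for Gmail, the snippet below writes an example file to the current directory so it can be reviewed before being copied to /etc/ssmtp/ssmtp.conf as root. All values are assumptions on my side: the account and password are placeholders, `SSMTP_CONF_OUT` is a hypothetical variable used only by this snippet, and the CA bundle path is the usual CentOS location and may differ on other distributions.

```shell
#!/bin/sh
# Write an example config to a scratch file (not to /etc) for review.
out="${SSMTP_CONF_OUT:-./ssmtp.conf.example}"

cat > "$out" <<'EOF'
# Placeholder account: replace with your own Gmail address and password.
root=YOUR_ACCOUNT@gmail.com
mailhub=smtp.gmail.com:587
AuthUser=YOUR_ACCOUNT@gmail.com
AuthPass=YOUR_PASSWORD
UseTLS=YES
UseSTARTTLS=YES
# Assumed CentOS CA bundle path; adjust for your distribution.
TLS_CA_File=/etc/pki/tls/certs/ca-bundle.crt
FromLineOverride=YES
EOF

# The file contains a password, so keep it readable by the owner only.
chmod 600 "$out"
```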

Beware: you have to set TLS_CA_File in the configuration. If you don't, you will get a "Cannot open smtp.gmail.com:587" error.
After that, you can test your settings:
echo "test" | ssmtp -vvv TESTEMAIL@ADDRESS
If everything goes well, congratulations, it works. If not, check /var/log/maillog. I think the most common error is “Authorization failed (534 5.7.14 https://support.google.com/mail/answer/78754 uy4sm4234351pbc.69 – gsmtp)”.
This problem is caused by Google's security policy. You can resolve it as follows:
1. Google will send you an email to notify you of a "Sign-in attempt prevented" event. Log in to your Google account and permit the login from your server.
2. Then go to https://www.google.com/settings/security/lesssecureapps and set “Access for less secure apps” to ON.
You can test again with the command mentioned above. If you still cannot send email, check /var/log/maillog and search for the error message.
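
Once the test message goes through, a real alert usually wants a Subject line as well. ssmtp reads the message, headers included, from standard input, so a small helper can build it. This is just a sketch: `build_alert` is a hypothetical name and the addresses are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal message (headers, blank line, body).
build_alert() {
    to="$1"
    subject="$2"
    body="$3"
    printf 'To: %s\nSubject: %s\n\n%s\n' "$to" "$subject" "$body"
}

# Usage (this would really send mail, so it is left commented out):
# build_alert admin@example.com "disk almost full" "/data is at 95%" | ssmtp admin@example.com
```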