My HDFS balancer is slow

An elephant keeper tells me his HDFS balancer is slow and he can’t sleep well at night. He asks me if I can help speed it up.

OK, by design the HDFS balancer runs slowly in the background, balancing the whole cluster periodically. It’s fine for it to be slow, I tell him, because that way it does not affect normal cluster activity. Your users submit jobs, copy data in and out, and operate the cluster for fun, without knowing that a balancer is running in the meantime. So go to sleep and sleep well. Don’t worry about a slow balancer.


My Standby NameNode hangs from time to time

One elephant keeper asked me whether he should be concerned that his standby NameNode hangs occasionally, for 10 to 30 seconds at a time. Sometimes he found it unresponsive to block reports, failover requests, or other operations; fortunately, the standby NN was able to recover later. There may be other short hangs that he was not aware of.


DistCp to Amazon S3 reports FileNotFoundException

An elephant keeper told me that he was trying to copy data from his HDFS to S3 and saw quite a few FileNotFoundException errors. However, when he checked the failing files immediately afterwards in the Amazon S3 web console, he could see them in the S3 bucket. I then kindly asked him one question: did you use the -p option in your DistCp command line? He said yes, because he did not want to lose the file metadata, so he thought it was good practice to preserve file attributes when copying files.


Hello World

Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub.
