Tuesday, November 05, 2013

Hadoop – Set map and reduce tasks from the command line


A quick way to try different map and reduce task counts on your cluster, which is handy for cluster tuning.

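You can set the number of map and reduce tasks on the command line with -D options, and the examples jar that ships with CDH is a convenient job to test with. The first command below runs with the default settings; the second requests 8 map tasks and 6 reduce tasks. Note that mapred.map.tasks is only a hint, since the actual number of map tasks depends on the input splits: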

$ hadoop jar /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output 'dfs[a-z.]+'
$ hadoop jar /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep -D mapred.map.tasks=8 -D mapred.reduce.tasks=6 input output 'dfs[a-z.]+'
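
For tuning, it can help to run the same job with several reduce counts and compare the timings. The script below is a rough sketch of that idea, not a tested benchmark. The helper names build_cmd and sweep, the EXAMPLES_JAR and RUN variables, the reduce counts 2/4/6/8 and the output_rN directory names are all assumptions made for illustration. It only prints the commands unless RUN=1 is set, and it gives each run its own output directory because a job fails if its output directory already exists.

```shell
#!/usr/bin/env bash
# Sketch: sweep reduce-task counts for the grep example job.
# EXAMPLES_JAR defaults to the jar path used in this post; override it for your install.
EXAMPLES_JAR="${EXAMPLES_JAR:-/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar}"

# build_cmd (hypothetical helper, not part of Hadoop):
# prints the hadoop command line for a map hint, a reduce count and an output dir.
build_cmd() {
  maps="$1"; reduces="$2"; out="$3"
  printf "hadoop jar %s grep -D mapred.map.tasks=%s -D mapred.reduce.tasks=%s input %s 'dfs[a-z.]+'\n" \
    "$EXAMPLES_JAR" "$maps" "$reduces" "$out"
}

# sweep (hypothetical helper): dry run by default, set RUN=1 to really submit the jobs.
sweep() {
  for r in 2 4 6 8; do
    cmd="$(build_cmd 8 "$r" "output_r$r")"
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then
      start=$(date +%s)
      eval "$cmd"
      end=$(date +%s)
      echo "reduces=$r took $((end - start))s"
    fi
  done
}

sweep
```

Run it once as-is to check the printed commands, then run it again with RUN=1 on a gateway node to submit the jobs.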


For other properties:
http://archive.cloudera.com/cdh4/cdh/4/mr1/mapred-default.html

2 comments:

Sundara rami reddy said...

Hi, you have gathered valuable information on Hadoop. I am much impressed with it, and it is useful for Hadoop learners. Blogs like this are valuable because they provide such informative content for everyone.

Tony Xu said...

Thanks Sundara for your comment.