Tuesday, November 05, 2013

Hadoop – Set the number of map and reduce tasks from the command line

A quick way to try different map and reduce task counts on your cluster, which is handy for cluster tuning.

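You can set the number of map and reduce tasks from the command line with -D, using the examples jar that ships with Hadoop. The first command runs with the default settings, and the second asks for 8 map tasks and 6 reduce tasks. Note that mapred.map.tasks is only a hint (the actual number of map tasks depends on the input splits), and that the output directory must not already exist, so remove it between runs: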

$ hadoop jar /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input output 'dfs[a-z.]+'
$ hadoop jar /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep -D mapred.map.tasks=8 -D mapred.reduce.tasks=6 input output 'dfs[a-z.]+'
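
If you want to compare several settings in one go, something like the following sketch may help. It is only an illustration: the function names build_cmd and sweep, the DRY_RUN switch, and the per-run output directory names (output-m8-r2 and so on) are my own assumptions, not anything Hadoop provides. By default it only prints the commands; set DRY_RUN=0 to actually run and time them.

```shell
#!/usr/bin/env bash
# Sketch: sweep reduce-task counts for the example grep job and time each run.
# EXAMPLES_JAR, DRY_RUN, build_cmd, sweep and the output-m*-r* names are
# hypothetical helpers for illustration, not part of Hadoop.

EXAMPLES_JAR="${EXAMPLES_JAR:-/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the command line for a given number of map and reduce tasks.
# Each combination gets its own output directory, because the job
# fails if the output directory already exists.
build_cmd() {
  local maps="$1" reduces="$2"
  echo "hadoop jar $EXAMPLES_JAR grep -D mapred.map.tasks=$maps -D mapred.reduce.tasks=$reduces input output-m$maps-r$reduces 'dfs[a-z.]+'"
}

# sweep MAPS REDUCES... : one run per reduce-task count.
sweep() {
  local maps="$1"; shift
  local reduces cmd
  for reduces in "$@"; do
    cmd="$(build_cmd "$maps" "$reduces")"
    if [ "$DRY_RUN" = "1" ]; then
      echo "$cmd"
    else
      echo "== $cmd"
      time eval "$cmd"
    fi
  done
}

sweep 8 2 4 6
```

Using a separate output directory per run keeps the earlier results around for comparison, at the cost of having to clean them up afterwards.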

For other properties:
