# B1 - Hadoop Word Count > [!NOTE] > These are generic instructions, need to refine them. --- 1. Copy and paste the following code in `WordCount.java` file: ```java import java.io.IOException; import java.util.StringTokenizer; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.Path; import org.apache.hadoop.io.IntWritable; import org.apache.hadoop.io.Text; import org.apache.hadoop.mapreduce.Job; import org.apache.hadoop.mapreduce.Mapper; import org.apache.hadoop.mapreduce.Reducer; import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; public class WordCount { // Mapper Class public static class TokenizerMapper extends Mapper { private final static IntWritable one = new IntWritable(1); private Text word = new Text(); public void map(Object key, Text value, Context context) throws IOException, InterruptedException { StringTokenizer itr = new StringTokenizer(value.toString()); while (itr.hasMoreTokens()) { word.set(itr.nextToken().toLowerCase().replaceAll("[^a-zA-Z0-9]", "")); if (!word.toString().isEmpty()) { context.write(word, one); } } } } // Reducer Class public static class IntSumReducer extends Reducer { private IntWritable result = new IntWritable(); public void reduce(Text key, Iterable values, Context context) throws IOException, InterruptedException { int sum = 0; for (IntWritable val : values) { sum += val.get(); } result.set(sum); context.write(key, result); } } public static void main(String[] args) throws Exception { if (args.length != 2) { System.err.println("Usage: WordCount "); System.exit(-1); } Configuration conf = new Configuration(); Job job = Job.getInstance(conf, "word count"); job.setJarByClass(WordCount.class); job.setMapperClass(TokenizerMapper.class); job.setCombinerClass(IntSumReducer.class); // optional job.setReducerClass(IntSumReducer.class); job.setOutputKeyClass(Text.class); job.setOutputValueClass(IntWritable.class); FileInputFormat.addInputPath(job, new Path(args[0])); FileOutputFormat.setOutputPath(job, new Path(args[1])); System.exit(job.waitForCompletion(true) ? 0 : 1); } } ``` 2. Create an `input.txt` file in the same directory as the above code: ```text This is a sample code. All the way from KSKA Git. Hello world Meow meow meow meow ``` 3. In the same directory, open a `Terminal` window and compile the Java code: ```shell javac -classpath `hadoop classpath` -d . WordCount.java jar cvf WordCount.jar *.class jar tf WordCount.jar ``` > [!NOTE] > Compiled code will be saved in the current working directory. 4. Create an input and output directory in Hadoop FS: ```shell hadoop fs -mkdir /user/hadoop/input hadoop fs -mkdir /user/hadoop/output ``` 5. Upload the `input.txt` file to Hadoop FS: ```shell hadoop fs -put input.txt /user/hadoop/input/ ``` 6. Run the WordCount job: ```shell hadoop jar WordCount.jar WordCount /user/hadoop/input/input.txt /user/hadoop/output ``` 7. View the output: ```shell hadoop fs -cat /user/hadoop/output/part-r-00000 ``` ---