Step-by-step: running Hadoop's bundled wordcount example
1. Create a folder named input under /usr/local, the directory where Hadoop is installed
root@ubuntu:/usr/local# mkdir input
2. In the input folder, create two text files, file1.txt and file2.txt. file1.txt contains "hello word"; file2.txt contains "hello hadoop" and "hello mapreduce" on two separate lines.
root@ubuntu:/usr/local# cd input
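root@ubuntu:/usr/local/input# echo "hello word" > file1.txt
root@ubuntu:/usr/local/input# echo "hello hadoop" > file2.txt
root@ubuntu:/usr/local/input# echo "hello mapreduce" > file2.txt
(Note: the second ">" overwrites the "hello hadoop" written just before, so after these commands file2.txt contains only "hello mapreduce". Edit file2.txt, for example with gedit, so that it holds both lines.)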
root@ubuntu:/usr/local/input# ls
file1.txt  file2.txt
To display the file contents:
root@ubuntu:/usr/local/input# more file1.txt
hello word
root@ubuntu:/usr/local/input# more file2.txt
hello mapreduce
hello hadoop
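
The overwrite described in step 2 can also be avoided without an editor by appending with ">>" instead of ">". The following is a minimal sketch, not the original author's procedure; it assumes a throwaway directory created with mktemp stands in for /usr/local/input.

```shell
# Hypothetical scratch directory; the article itself works in /usr/local/input.
workdir="$(mktemp -d)"
cd "$workdir"

echo "hello word" > file1.txt

# ">" creates (or truncates) the file, ">>" appends to it,
# so file2.txt ends up with both lines.
echo "hello mapreduce" > file2.txt
echo "hello hadoop" >> file2.txt

cat file2.txt
wc -c file1.txt file2.txt
```

With these contents the files are 11 and 29 bytes long, which matches the sizes that the HDFS listing in step 3 reports for file1.txt and file2.txt.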
3. Create an input folder named wc_input on HDFS and upload the two text files from the local input folder into it
root@ubuntu:/usr/local/hadoop-1.2.1# bin/hadoop fs -mkdir wc_input
root@ubuntu:/usr/local/hadoop-1.2.1# bin/hadoop fs -put /usr/local/input/file* wc_input
List the files in wc_input:
root@ubuntu:/usr/local/hadoop-1.2.1# bin/hadoop fs -ls wc_input
Found 2 items
-rw-r--r-- 1 root supergroup 11 2014-03-13 01:19 /user/root/wc_input/file1.txt
-rw-r--r-- 1 root supergroup 29 2014-03-13 01:19 /user/root/wc_input/file2.txt
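4. Start all daemons and check that they are running:
(First verify that passwordless login to localhost works. If it does, the output below appears; otherwise passwordless SSH has to be configured, see http://blog.csdn.net/joe_007/article/details/8298814 for the steps.)
root@ubuntu:/# ssh localhost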
Welcome to Ubuntu 12.04.3 LTS (GNU/Linux 3.2.0-24-generic-pae i686)
* Documentation: https://help.ubuntu.com/
Last login: Mon Mar 3 04:44:23 2014 from localhost
root@ubuntu:~# exit
logout
Connection to localhost closed.
root@ubuntu:/usr/local/hadoop-1.2.1/bin# ./start-all.sh
starting namenode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-root-namenode-ubuntu.out
localhost: starting datanode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-root-datanode-ubuntu.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-root-secondarynamenode-ubuntu.out
starting jobtracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-root-jobtracker-ubuntu.out
localhost: starting tasktracker, logging to /usr/local/hadoop-1.2.1/libexec/../logs/hadoop-root-tasktracker-ubuntu.out
root@ubuntu:/usr/local/hadoop-1.2.1/bin# jps
7847 SecondaryNameNode
4196
7634 DataNode
7423 NameNode
8319 Jps
7938 JobTracker
8157 TaskTracker
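
Rather than reading the jps listing by eye, the check can be scripted. The sketch below is only an illustration: the function name missing_daemons is hypothetical, and it assumes the five daemon names shown in the listing above are the ones expected on a single-node Hadoop 1.x setup.

```shell
# missing_daemons: reads jps output on stdin and prints the name of every
# expected Hadoop 1.x daemon that does not appear in it.
# (Hypothetical helper, not part of Hadoop.)
missing_daemons() {
  jps_output="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -w matches whole words only, so "NameNode" is not satisfied
    # by a "SecondaryNameNode" line.
    printf '%s\n' "$jps_output" | grep -qw "$daemon" || echo "$daemon"
  done
}
```

It would be used as "jps | missing_daemons"; no output means every expected daemon was found.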
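Run the wordcount example from the jar bundled with Hadoop (note: before running it again, the output folder from the previous run must be deleted first)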
root@ubuntu:/usr/local/hadoop-1.2.1# bin/hadoop jar ./hadoop-examples-1.2.1.jar wordcount wc_input wc_output
14/03/13 01:48:40 INFO input.FileInputFormat: Total input paths to process : 2
14/03/13 01:48:40 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/03/13 01:48:40 WARN snappy.LoadSnappy: Snappy native library not loaded
14/03/13 01:48:42 INFO mapred.JobClient: Running job: job_201403130031_0001
14/03/13 01:48:44 INFO mapred.JobClient: map 0% reduce 0%
14/03/13 01:52:47 INFO mapred.JobClient: map 50% reduce 0%
14/03/13 01:53:50 INFO mapred.JobClient: map 100% reduce 0%
14/03/13 01:54:14 INFO mapred.JobClient: map 100% reduce 100%
... ...
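
Because the job refuses to start when the output folder already exists, a rerun needs a delete step first. The sketch below wraps both commands in one function. The function name rerun_wordcount and the HADOOP variable are hypothetical; the sketch assumes it is invoked from /usr/local/hadoop-1.2.1 as in the transcript, and it uses "fs -rmr", the recursive delete of the Hadoop 1.x shell.

```shell
# Path to the hadoop launcher; defaults to the relative path used in the article.
HADOOP="${HADOOP:-bin/hadoop}"

# rerun_wordcount: delete the previous output folder, then run the job again.
# (Hypothetical helper, not part of Hadoop.)
rerun_wordcount() {
  # -rmr fails if wc_output does not exist yet; that case is harmless here.
  "$HADOOP" fs -rmr wc_output || true
  "$HADOOP" jar ./hadoop-examples-1.2.1.jar wordcount wc_input wc_output
}
```

Calling rerun_wordcount on the first run also works, since a failed delete of a missing folder is ignored.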
5. List the output folder
root@ubuntu:/usr/local/hadoop-1.2.1# bin/hadoop fs -ls wc_output
Found 3 items
-rw-r--r-- 1 root supergroup 0 2014-03-13 01:54 /user/root/wc_output/_SUCCESS
drwxr-xr-x - root supergroup 0 2014-03-13 01:48 /user/root/wc_output/_logs
-rw-r--r-- 1 root supergroup 36 2014-03-13 01:54 /user/root/wc_output/part-r-00000
(The actual results are in part-r-00000.)
6. View the contents of the output file part-r-00000
root@ubuntu:/usr/local/hadoop-1.2.1# bin/hadoop fs -cat /user/root/wc_output/part-r-00000
hadoop 1
hello 3
mapreduce 1
word 1
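
As a sanity check, the same counts can be reproduced locally with standard Unix tools, without Hadoop. This is a rough sketch under stated assumptions: it recreates the two input files from step 2 in a scratch directory from mktemp, and it only splits on spaces and newlines, which is enough for this input.

```shell
# Hypothetical scratch directory holding copies of the two input files.
tmpdir="$(mktemp -d)"
printf 'hello word\n' > "$tmpdir/file1.txt"
printf 'hello mapreduce\nhello hadoop\n' > "$tmpdir/file2.txt"

# One word per line, sort, count duplicates, then print "word<TAB>count"
# in the same layout as part-r-00000.
counts="$(cat "$tmpdir"/file*.txt \
  | tr -s ' ' '\n' \
  | LC_ALL=C sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}')"

printf '%s\n' "$counts"
```

The pipeline mirrors the job's structure: tr plays the role of the map step that emits individual words, sort groups identical keys together, and uniq -c acts as the reduce step that sums each group.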
7. Stop all daemons
root@ubuntu:/usr/local/hadoop-1.2.1/bin# ./stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: stopping datanode
localhost: stopping secondarynamenode
Original article: http://www.cnblogs.com/xuepei/p/3599202.html