Hadoop 1.0 introduced quite a few changes that are not reflected in Hadoop's official setup guide. This document supplements that guide with updates for version 1.0.
Please follow the official Hadoop setup guide first, then check the sections below for the 1.0-specific changes.
Standalone Operations
In Hadoop 1.0, all configuration files have been moved to the etc/hadoop directory.
The following example copies the configuration files from etc/hadoop to use as input, then finds and displays every match of the given regular expression. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
$ cat output/*
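As a rough sanity check, the result of the grep example can be approximated with standard Unix tools, since the job essentially counts how often each distinct match of the regular expression occurs. The sketch below is not part of Hadoop; the temporary directory and the sample-site.xml file are made up so that it runs without a Hadoop install. On a real setup, point it at the input directory created above.

```shell
# Hypothetical sample input so the sketch runs without a Hadoop install.
workdir=$(mktemp -d)
mkdir "$workdir/input"
cat > "$workdir/input/sample-site.xml" <<'EOF'
<name>dfs.replication</name>
<name>dfs.name.dir</name>
<description>dfs.replication is mentioned a second time here</description>
EOF

# Extract every match of the regex, then count each distinct match,
# most frequent first.
counts=$(grep -ohE 'dfs[a-z.]+' "$workdir"/input/*.xml | sort | uniq -c | sort -rn)
echo "$counts"

# Clean up the temporary sample data.
rm -rf "$workdir"
```

The counts should agree with the contents of the output directory, although the exact column layout of the Hadoop output may differ.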
Execution
Start the Hadoop daemons:
$ sbin/start-all.sh
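Before continuing, it can help to confirm that the daemons actually came up. A minimal sketch, assuming the JDK's jps tool is on the PATH and that a single-node Hadoop 1.x setup runs the usual five daemons (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker). The helper name missing_daemons and the sample process list are hypothetical.

```shell
# Reads jps-style lines ("<pid> <ClassName>") on stdin and prints the name
# of every expected Hadoop 1.x daemon that is NOT in the list.
missing_daemons() {
  running=$(awk '{print $2}')
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    echo "$running" | grep -qx "$d" || echo "$d"
  done
}

# Against a live node you would run:  jps | missing_daemons
# Demonstration with made-up jps output:
sample='4021 NameNode
4150 DataNode
4410 JobTracker
4555 Jps'
missing=$(printf '%s\n' "$sample" | missing_daemons)
echo "$missing"
```

Empty output means all expected daemons were found. If a daemon is listed as missing, its log file is the first place to look.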
Copy the input files into the distributed filesystem:
$ bin/hadoop fs -put etc/hadoop/ input
Run some of the examples provided:
$ bin/hadoop jar share/hadoop/hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
You should see output similar to this:
12/02/23 16:48:04 INFO mapred.FileInputFormat: Total input paths to process : 16
12/02/23 16:48:05 INFO mapred.JobClient: Running job: job_201202231031_0001
12/02/23 16:48:06 INFO mapred.JobClient: map 0% reduce 0%
12/02/23 16:48:19 INFO mapred.JobClient: map 12% reduce 0%
12/02/23 16:48:28 INFO mapred.JobClient: map 25% reduce 0%
12/02/23 16:48:34 INFO mapred.JobClient: map 25% reduce 4%
12/02/23 16:48:37 INFO mapred.JobClient: map 37% reduce 4%
12/02/23 16:48:40 INFO mapred.JobClient: map 37% reduce 8%
12/02/23 16:48:43 INFO mapred.JobClient: map 50% reduce 8%
12/02/23 16:48:50 INFO mapred.JobClient: map 56% reduce 12%
12/02/23 16:48:53 INFO mapred.JobClient: map 62% reduce 12%
12/02/23 16:48:56 INFO mapred.JobClient: map 68% reduce 18%
12/02/23 16:48:58 INFO mapred.JobClient: map 75% reduce 22%
12/02/23 16:49:01 INFO mapred.JobClient: map 81% reduce 22%
12/02/23 16:49:04 INFO mapred.JobClient: map 87% reduce 22%
12/02/23 16:49:07 INFO mapred.JobClient: map 93% reduce 27%
12/02/23 16:49:10 INFO mapred.JobClient: map 100% reduce 27%
12/02/23 16:49:13 INFO mapred.JobClient: map 100% reduce 29%
12/02/23 16:49:20 INFO mapred.JobClient: map 100% reduce 100%
12/02/23 16:49:25 INFO mapred.JobClient: Job complete: job_201202231031_0001
12/02/23 16:49:25 INFO mapred.JobClient: Counters: 30
12/02/23 16:49:25 INFO mapred.JobClient: Job Counters
12/02/23 16:49:25 INFO mapred.JobClient: Launched reduce tasks=1
12/02/23 16:49:25 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=99019
12/02/23 16:49:25 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/02/23 16:49:25 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/02/23 16:49:25 INFO mapred.JobClient: Launched map tasks=16
12/02/23 16:49:25 INFO mapred.JobClient: Data-local map tasks=16
12/02/23 16:49:25 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=60471
12/02/23 16:49:25 INFO mapred.JobClient: File Input Format Counters
12/02/23 16:49:25 INFO mapred.JobClient: Bytes Read=26852
12/02/23 16:49:25 INFO mapred.JobClient: File Output Format Counters
12/02/23 16:49:25 INFO mapred.JobClient: Bytes Written=180
12/02/23 16:49:25 INFO mapred.JobClient: FileSystemCounters
12/02/23 16:49:25 INFO mapred.JobClient: FILE_BYTES_READ=82
12/02/23 16:49:25 INFO mapred.JobClient: HDFS_BYTES_READ=28574
12/02/23 16:49:25 INFO mapred.JobClient: FILE_BYTES_WRITTEN=367327
12/02/23 16:49:25 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=180
12/02/23 16:49:25 INFO mapred.JobClient: Map-Reduce Framework
12/02/23 16:49:25 INFO mapred.JobClient: Map output materialized bytes=172
12/02/23 16:49:25 INFO mapred.JobClient: Map input records=758
12/02/23 16:49:25 INFO mapred.JobClient: Reduce shuffle bytes=166
12/02/23 16:49:25 INFO mapred.JobClient: Spilled Records=6
12/02/23 16:49:25 INFO mapred.JobClient: Map output bytes=70
12/02/23 16:49:25 INFO mapred.JobClient: Total committed heap usage (bytes)=2596864000
12/02/23 16:49:25 INFO mapred.JobClient: CPU time spent (ms)=12500
12/02/23 16:49:25 INFO mapred.JobClient: Map input bytes=26852
12/02/23 16:49:25 INFO mapred.JobClient: SPLIT_RAW_BYTES=1722
12/02/23 16:49:25 INFO mapred.JobClient: Combine input records=3
12/02/23 16:49:25 INFO mapred.JobClient: Reduce input records=3
12/02/23 16:49:25 INFO mapred.JobClient: Reduce input groups=3
12/02/23 16:49:25 INFO mapred.JobClient: Combine output records=3
12/02/23 16:49:25 INFO mapred.JobClient: Physical memory (bytes) snapshot=2788790272
12/02/23 16:49:25 INFO mapred.JobClient: Reduce output records=3
12/02/23 16:49:25 INFO mapred.JobClient: Virtual memory (bytes) snapshot=9619705856
12/02/23 16:49:25 INFO mapred.JobClient: Map output records=3
12/02/23 16:49:25 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/02/23 16:49:25 INFO mapred.FileInputFormat: Total input paths to process : 1
12/02/23 16:49:25 INFO mapred.JobClient: Running job: job_201202231031_0002
12/02/23 16:49:26 INFO mapred.JobClient: map 0% reduce 0%
12/02/23 16:49:41 INFO mapred.JobClient: map 100% reduce 0%
12/02/23 16:49:53 INFO mapred.JobClient: map 100% reduce 100%
12/02/23 16:49:58 INFO mapred.JobClient: Job complete: job_201202231031_0002
12/02/23 16:49:58 INFO mapred.JobClient: Counters: 30
12/02/23 16:49:58 INFO mapred.JobClient: Job Counters
12/02/23 16:49:58 INFO mapred.JobClient: Launched reduce tasks=1
12/02/23 16:49:58 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=13874
12/02/23 16:49:58 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/02/23 16:49:58 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/02/23 16:49:58 INFO mapred.JobClient: Launched map tasks=1
12/02/23 16:49:58 INFO mapred.JobClient: Data-local map tasks=1
12/02/23 16:49:58 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10575
12/02/23 16:49:58 INFO mapred.JobClient: File Input Format Counters
12/02/23 16:49:58 INFO mapred.JobClient: Bytes Read=180
12/02/23 16:49:58 INFO mapred.JobClient: File Output Format Counters
12/02/23 16:49:58 INFO mapred.JobClient: Bytes Written=52
12/02/23 16:49:58 INFO mapred.JobClient: FileSystemCounters
12/02/23 16:49:58 INFO mapred.JobClient: FILE_BYTES_READ=82
12/02/23 16:49:58 INFO mapred.JobClient: HDFS_BYTES_READ=296
12/02/23 16:49:58 INFO mapred.JobClient: FILE_BYTES_WRITTEN=42387
12/02/23 16:49:58 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=52
12/02/23 16:49:58 INFO mapred.JobClient: Map-Reduce Framework
12/02/23 16:49:58 INFO mapred.JobClient: Map output materialized bytes=82
12/02/23 16:49:58 INFO mapred.JobClient: Map input records=3
12/02/23 16:49:58 INFO mapred.JobClient: Reduce shuffle bytes=82
12/02/23 16:49:58 INFO mapred.JobClient: Spilled Records=6
12/02/23 16:49:58 INFO mapred.JobClient: Map output bytes=70
12/02/23 16:49:58 INFO mapred.JobClient: Total committed heap usage (bytes)=220725248
12/02/23 16:49:58 INFO mapred.JobClient: CPU time spent (ms)=1990
12/02/23 16:49:58 INFO mapred.JobClient: Map input bytes=94
12/02/23 16:49:58 INFO mapred.JobClient: SPLIT_RAW_BYTES=116
12/02/23 16:49:58 INFO mapred.JobClient: Combine input records=0
12/02/23 16:49:58 INFO mapred.JobClient: Reduce input records=3
12/02/23 16:49:58 INFO mapred.JobClient: Reduce input groups=1
12/02/23 16:49:58 INFO mapred.JobClient: Combine output records=0
12/02/23 16:49:58 INFO mapred.JobClient: Physical memory (bytes) snapshot=235311104
12/02/23 16:49:58 INFO mapred.JobClient: Reduce output records=3
12/02/23 16:49:58 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1171832832
12/02/23 16:49:58 INFO mapred.JobClient: Map output records=3
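Note that a MapReduce job refuses to start if its output directory already exists, so running the example a second time requires removing the old output first. The following is a minimal sketch of that rerun, assuming the same relative paths as above. The HADOOP variable and the run helper are illustrative only; the helper prints each command and executes it only when the launcher script is actually present.

```shell
# HADOOP points at the launcher script; defaults to the path used in this guide.
HADOOP=${HADOOP:-bin/hadoop}

# Print each command; execute it only if the launcher exists and is executable.
run() {
  echo "+ $*"
  if [ -x "$HADOOP" ]; then "$@"; fi
}

# Remove the previous job output (ignore the error if there is none), then rerun.
run "$HADOOP" fs -rmr output || true
run "$HADOOP" jar share/hadoop/hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
```

The fs -rmr command deletes the directory recursively from the distributed filesystem, so double-check the path before running it.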
Examine output.
Copy the output files from the distributed filesystem to the local filesystem and examine them:
$ bin/hadoop fs -get output output
$ cat output/*
or
View the output files on the distributed filesystem:
$ bin/hadoop fs -cat output/*
When you're done, stop the daemons with:
$ sbin/stop-all.sh