Hadoop Hands-on
From Gridkaschool
Revision as of 19:53, 25 August 2012
28.8.2012 – 13:30
Session 1
The Hadoop ecosystem: HDFS, MR, HUE, Sqoop, Hive, Pig, HBase, Flume, Oozie
What are CDH and Cloudera Manager?
Installation, startup, and basic configuration of a small cluster
Session 2
HDFS intro (NameNode, DataNode, Secondary NameNode)
How is data stored in HDFS?
Properties and configuration settings relevant for working efficiently with HDFS
HDFS commands
Session 3
Working with the web-based GUI
Running and tracking jobs
Java API and samples
Streaming API sample
Session 4
MapReduce details, Java API and Streaming (awk sample)
HDFS details, using the web-based GUI for deeper insights
Breaking a cluster and healing it
Session 5
Intro to Hive and Sqoop
Data import via Sqoop
Hive scripts
Session 6 (optional)
SerDe and UDF with Hive
Workflows with Oozie