Hadoop Hands-on
Revision as of 19:54, 25 August 2012
28.8.2012 – 13:30
Session A
The Hadoop ecosystem: HDFS, MapReduce (MR), HUE, Sqoop, Hive, Pig, HBase, Flume, Oozie
What is CDH and the Cloudera-Manager?
Installing, starting, and configuring a small cluster
Session B
HDFS intro (NameNode, DataNode, Secondary NameNode)
How is data stored in HDFS?
Properties and configurations relevant for working efficiently with HDFS
HDFS commands
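
As a rough illustration of how data is stored in HDFS: a file is split into fixed-size blocks, and each block is replicated across DataNodes. The sketch below estimates the footprint of a single file. The function name `hdfs_footprint` is hypothetical, and the 128 MiB block size and replication factor of 3 are assumed values; the actual settings depend on the cluster configuration.

```python
def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024**2, replication=3):
    """Estimate blocks and raw storage for one file (assumed block size and replication)."""
    # Ceiling division: a partially filled last block still counts as one block.
    blocks = -(-file_size_bytes // block_size_bytes)
    return {
        "blocks": blocks,
        # Every block is stored `replication` times across the cluster.
        "block_replicas": blocks * replication,
        # The last block only occupies its actual size, so raw usage
        # is the file size times the replication factor.
        "raw_bytes": file_size_bytes * replication,
    }


if __name__ == "__main__":
    one_gib = 1024**3
    print(hdfs_footprint(one_gib))
```

With these assumed values, a 1 GiB file occupies 8 blocks, 24 block replicas, and 3 GiB of raw disk space.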
Session C
Working with the web-based GUI
Running and tracking jobs
Java-API and samples
Streaming API sample
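
A minimal sketch of what a Streaming API sample can look like, written here in Python as a word count. This is not the sample used in the session. In Hadoop Streaming, the mapper and reducer are ordinary programs that read lines from stdin and write tab-separated key/value lines to stdout. The function names below are hypothetical, and `run_locally` only simulates the map, sort, and reduce phases in one process so that the logic can be tried without a cluster.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would write to stdout."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def record_key(record):
    """Return the key part (text before the first tab) of a streaming record."""
    return record.split("\t", 1)[0]


def reducer(sorted_records):
    """Sum the counts per key; the input must already be sorted by key."""
    for word, group in groupby(sorted_records, key=record_key):
        total = sum(int(record.split("\t", 1)[1]) for record in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Simulate map -> sort (shuffle) -> reduce in a single process."""
    return list(reducer(sorted(mapper(lines), key=record_key)))


if __name__ == "__main__":
    for record in run_locally(["Hive Pig Hive", "pig HBase"]):
        print(record)
```

In a real job, the mapper and reducer would live in separate scripts that read from stdin, and they would be passed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options.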
Session D
MapReduce details, Java API and Streaming (awk sample)
HDFS details, using the web-based GUI for deeper insights
Breaking a cluster and healing it
Session E
Intro to Hive and Sqoop
Data import via Sqoop
Hive scripts
Session F (optional)
Serialisation / deserialisation and user-defined functions with Hive
Workflows with Oozie