Hadoop Hands-on
From Gridkaschool
Revision as of 19:55, 25 August 2012
28.8.2012 – 13:30
Session A
The Hadoop ecosystem: HDFS, MR, HUE, Sqoop, Hive, Pig, HBase, Flume, Oozie
What are CDH and Cloudera Manager?
Installing, starting, and basic configuration of a small cluster
Session B
HDFS intro (NameNode, DataNode, Secondary NameNode)
How is data stored in HDFS?
Properties and configuration settings relevant for working efficiently with HDFS
HDFS commands
Session C
Working with the web-based GUI
Running and tracking jobs
Java API and samples
Streaming API sample
Session D
MapReduce details, Java API and Streaming (awk sample)
HDFS details, using the web-based GUI for deeper insights
Breaking a cluster and healing it
Session E
Intro to Hive and Sqoop
Data import via Sqoop
Hive scripts
Session F (optional)
Serialisation / deserialisation and user-defined functions with Hive
Workflows with Oozie