Hadoop Hands-on

From Gridkaschool

Revision as of 19:56, 25 August 2012 by Kamir1604 (talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Jump to navigation Jump to search

The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

28.8.2012 – 13:30

Session A

The hadoop ecosystem: HDFS, MR, HUE, Sqoop, Hive, Pig, HBase, Flume, Oozie
What is CDH and the Cloudera-Manager?
Installation, starting and basic configurations of a small cluster

Session B

HDFS intro (Name Node, Data Node, Secondary Name Node)
How is data stored in HDFS?
Properties and configurations, relevant for efficient working with HDFS.
HDFS commands

Session C

Working with the webbased-GUI
Running and tracking jobs
Java-API and samples
Streaming API sample

Session D

Map Reduce details, Java-API and Streaming (awk sample)
HDFS details, using the webbased-GUI for deeper insights
Breaking down a cluster and heal it

Session E

Intro to Hive and Sqoop
Dataimport via Sqoop
Hive scripts

Session F (optional)

Serialisation / Deserialisation and user defined functions with Hive
Workflows with oozie

Retrieved from "https://wiki.scc.kit.edu/gridkaschool/index.php?title=Hadoop_Hands-on&oldid=472"

Navigation menu