Hadoop Hands-on

From Gridkaschool

Revision as of 19:53, 25 August 2012

28.8.2012 – 13:30

Session 1

The Hadoop ecosystem: HDFS, MR, HUE, Sqoop, Hive, Pig, HBase, Flume, Oozie

What are CDH and Cloudera Manager?

Installing, starting, and basic configuration of a small cluster

Session 2

HDFS intro (NameNode, DataNode, Secondary NameNode)

How is data stored in HDFS?

Properties and configuration settings relevant for working efficiently with HDFS

HDFS commands
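
As a rough illustration of the storage question in this session: HDFS splits a file into fixed-size blocks and keeps several replicas of each block on different DataNodes. The sketch below only does the arithmetic. The function name `hdfs_footprint` is hypothetical, and the defaults (64 MB blocks, replication factor 3) are assumptions; the real values come from the cluster configuration (for example `dfs.replication`).

```python
def hdfs_footprint(file_size, block_size=64 * 1024**2, replication=3):
    """Return (number of blocks, raw bytes stored across the cluster).

    Assumed defaults: 64 MB blocks and 3 replicas; check your cluster.
    """
    # Integer ceiling division: a partially filled last block still counts.
    blocks = (file_size + block_size - 1) // block_size
    # The last block is not padded, so raw usage is just size x replicas.
    return blocks, file_size * replication


blocks, raw = hdfs_footprint(200 * 1024**2)
print(blocks, raw // 1024**2)  # 4 blocks, 600 MB of raw storage
```

The point of the example is that the block count drives NameNode metadata, while the replication factor drives raw disk usage.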

Session 3

Working with the web-based GUI

Running and tracking jobs

Java API and samples

Streaming API sample
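
To give an idea of what a Streaming job looks like, here is a minimal word-count sketch. It is written in Python rather than the awk used in the course, and it is not the course's own sample. It assumes the usual Streaming contract: the mapper and reducer read lines from standard input, write tab-separated key/value lines to standard output, and the reducer receives its input sorted by key. The function names and the `map`/`reduce` command-line argument are hypothetical conventions chosen for this example.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word; input must be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention: pass "map" or "reduce" as the only argument.
if __name__ == "__main__" and sys.argv[1:] in (["map"], ["reduce"]):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a hypothetical name such as `wc.py`, the script can be tried without a cluster via `cat input.txt | python wc.py map | sort | python wc.py reduce`; on a cluster the same two commands would be handed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options.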

Session 4

MapReduce details, Java API and Streaming (awk sample)

HDFS details, using the web-based GUI for deeper insight

Breaking a cluster and healing it

Session 5

Intro to Hive and Sqoop

Data import via Sqoop

Hive scripts

Session 6 (optional)

SerDe and UDF with Hive

Workflows with Oozie