Data Analysis in Python

Content

Python is a high-level, dynamic, object-oriented programming language. It is easy to learn, intuitive, well documented, very readable and extremely powerful. Python ships with an impressive standard library, following the so-called "batteries included" philosophy. Together with the large number of additional scientific packages such as NumPy, SciPy, pandas and matplotlib, this makes Python very well suited for data analysis.

It is also easy to integrate C, C++ or even Fortran code into Python, so computational bottlenecks can be optimized by moving them to a lower-level compiled language. Cython, a compiler for Python code, is one of the standard ways to turn Python code into fast compiled low-level extensions and to interface with existing C/C++ code.

This hands-on session introduces the pythonic way of programming and demonstrates the power of Python in data analysis.
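As a small illustration of what "pythonic" means here, the following sketch contrasts an index-based loop with a list comprehension and then uses only the standard library to summarize the data. The sample measurements and all variable names are invented for this example and are not taken from the course material.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample data: (detector name, measured value) pairs.
measurements = [("A", 1.0), ("B", 2.5), ("A", 3.0), ("C", 4.5), ("B", 0.5)]

# Non-idiomatic: loop over indices and look up each element by position.
values_loop = []
for i in range(len(measurements)):
    values_loop.append(measurements[i][1])

# Pythonic: iterate over the items directly and unpack each tuple.
values = [value for _, value in measurements]

# "Batteries included": the standard library already provides the
# arithmetic mean and a counter for hashable items.
average = mean(values)
counts = Counter(name for name, _ in measurements)

print(f"mean value: {average:.2f}")
print(f"measurements per detector: {dict(counts)}")
```

Both variants build the same list; the comprehension simply states the intent in a single line without any index bookkeeping.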

Required Programming Skills

  • Knowledge of the basic concepts of a programming language (e.g. for-loops, while-loops, if-else statements, ...)

Technical Requirements for the Course

You can use either the pre-installed nodes at GridKa or your private notebook; the latter will probably be more comfortable for you.

The pre-installed nodes at GridKa

You need:

  • an SSH client installed on your laptop (on Linux and Mac one is usually pre-installed; PuTTY is recommended for Windows users)
  • a web browser installed on your laptop
  • Further instructions will be given in the lecture itself
  • Windows users should familiarize themselves with how to set up port forwarding in their clients. For PuTTY, this is described in the following document by the University of Utrecht.

Your private notebook

You can download the Anaconda distribution from http://continuum.io/downloads and follow the installation instructions. In addition, you should have a git client installed.

Material

Lecture

Exercises

Solutions of the Exercises

  • The solutions to the exercises are available from the solutions branch of the git repository; alternatively, you can download a tarball.

Thank you very much for your attendance!